Cellular Network Failover

Most of my devices route internet traffic over cellular and use ethernet for Modbus TCP.

I have some devices with spotty cellular coverage where I’d like to route internet traffic over ethernet.

I don’t have control over these ethernet networks. If the ethernet network at a site changes and I lose internet access, I’d like to fail over to cellular. This would let me reconfigure a device without visiting the site.

It sounds like the recommended approach for this scenario is to configure routing for both network interfaces, and to use a failover script to change the routing metric if internet access via ethernet is lost.
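As I understand it, the kernel prefers the default route with the lowest metric, so the failover boils down to making the ethernet route's metric higher than the cellular one. Here is a minimal sketch for checking which interface currently wins, assuming the usual iproute2 output format from `ip -4 route show default`. `preferred_iface` is a hypothetical helper name, and `wwan0` and the addresses are made-up examples.

```shell
#!/bin/sh
# preferred_iface: read `ip -4 route show default` lines on stdin and print the
# interface whose default route has the lowest metric (a missing metric counts as 0).
# Hypothetical helper for illustration only.
preferred_iface() {
    awk '
        $1 == "default" {
            dev = ""; metric = 0
            for (i = 2; i < NF; i++) {
                if ($i == "dev")    dev = $(i + 1)
                if ($i == "metric") metric = $(i + 1) + 0
            }
            if (dev != "" && (best == "" || metric < best_metric)) {
                best = dev; best_metric = metric
            }
        }
        END { if (best != "") print best }
    '
}

# Example with made-up routes: eth0 has the lower metric, so this prints "eth0".
printf '%s\n' \
    'default via 192.168.1.1 dev eth0 proto dhcp src 192.168.1.50 metric 10' \
    'default via 10.64.64.64 dev wwan0 proto static metric 700' | preferred_iface
```

On a device it would be fed live data with `ip -4 route show default | preferred_iface`.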

Does anyone here have experience with this (or a different approach) that they’d be willing to share?

Are there any gotchas I should watch out for?

Note: I’m using systemd-networkd, and I’m trying to keep my images small so I can OTA update via cellular.

Cheers,
Greg

FWIW, here’s an example script from Gemini (I’d run it as a service with systemd).

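#!/bin/bash
# Ethernet-to-cellular failover sketch. Needs root (e.g. run from a systemd service).
# The metric is changed with ip(8); networkctl has no subcommand for setting it.
# Assumes eth0 has a single default route with a gateway. networkd may re-add
# its own route when the DHCP lease renews, which would undo the change.

TARGET="amazonaws.com"
INTERFACE="eth0"
CHECK_INTERVAL=5
MAX_FAILURES=3
FAILURE_COUNT=0

PRIMARY_METRIC=10
FAILOVER_METRIC=2000

# Replace the interface's default route with one that uses the given metric.
set_metric() {
    local gw
    gw=$(ip -4 route show default dev "$INTERFACE" | awk '/via/ {print $3; exit}')
    if [ -z "$gw" ]; then
        echo "No default gateway found on $INTERFACE; leaving routes alone."
        return 1
    fi
    ip -4 route del default dev "$INTERFACE"
    ip -4 route add default via "$gw" dev "$INTERFACE" metric "$1"
}

while true; do
    # Ping once, timeout after 2 seconds, specifically on eth0
    if ping -I "$INTERFACE" -c 1 -W 2 "$TARGET" > /dev/null 2>&1; then
        # Only touch the route if we had actually failed over
        if [ "$FAILURE_COUNT" -ge "$MAX_FAILURES" ]; then
            echo "Internet recovered on $INTERFACE. Restoring primary route."
            set_metric "$PRIMARY_METRIC"
        fi
        FAILURE_COUNT=0
    else
        FAILURE_COUNT=$((FAILURE_COUNT + 1))
        echo "Check failed on $INTERFACE ($FAILURE_COUNT/$MAX_FAILURES)"

        if [ "$FAILURE_COUNT" -eq "$MAX_FAILURES" ]; then
            echo "Internet dead on $INTERFACE. Switching to cellular."
            set_metric "$FAILOVER_METRIC"
        fi
    fi
    sleep "$CHECK_INTERVAL"
done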
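
One gotcha I can already see: the loop restores the primary route after a single good ping, so a flaky ethernet link could flap between the two routes. Here is a minimal sketch of separate failure and recovery thresholds. `next_state`, the state names, and `MIN_SUCCESSES` are my own placeholders, not anything from networkd.

```shell
#!/bin/sh
MAX_FAILURES=3      # consecutive failed checks before failing over
MIN_SUCCESSES=5     # consecutive good checks before switching back

# next_state STATE COUNT RESULT
#   STATE  is "primary" or "failover"
#   COUNT  is how many consecutive checks have contradicted STATE so far
#   RESULT is "ok" or "fail" for the latest check
# Prints the new "STATE COUNT" pair.
next_state() {
    state=$1 count=$2 result=$3
    case "$state:$result" in
        primary:fail)
            count=$((count + 1))
            if [ "$count" -ge "$MAX_FAILURES" ]; then state=failover; count=0; fi ;;
        failover:ok)
            count=$((count + 1))
            if [ "$count" -ge "$MIN_SUCCESSES" ]; then state=primary; count=0; fi ;;
        *)
            count=0 ;;   # result agrees with the current state: reset the streak
    esac
    echo "$state $count"
}

# Example: three failures in a row trip the failover; prints "failover 0".
s="primary 0"
for r in fail fail fail; do
    s=$(next_state $s "$r")   # $s is left unquoted on purpose so it splits in two
done
echo "$s"
```

The monitoring loop would call this after each ping and only change the route metric when the state actually flips.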