Mender Authentication Fails over Cellular Modem Interface (wwan0)

Hello!

I’ve been managing a small fleet of Debian devices using a basic Mender plan along with the “Configure” and “Troubleshoot” add-ons. The devices are field-deployed sensors using Raspberry Pi Zero 2W.

Recently, I’ve prepared a software update enabling support of LTE modems connected directly to the devices themselves, meaning that they now use a wwan0 network interface (a QMI modem managed by modemmanager) rather than a WiFi or Ethernet network interface (wlan0 or eth0). I’ve discovered an interesting issue - when wwan0 is the only network connection that’s online, the devices are not able to connect to Mender. The mender-configure logs report:

Jun 16 15:48:05 device1 mender[5373]: time="2024-06-16T15:48:05-07:00" level=info msg="Output (stderr) from command "/usr/share/mender/inventory/mender-inventory-mender-configure": An authentication token could not be obtained over DBus."
Jun 16 15:48:05 device1 mender[5373]: time="2024-06-16T15:48:05-07:00" level=warning msg="Inventory tool /usr/share/mender/inventory/mender-inventory-mender-configure wait failed: exit status 1"
Jun 16 15:48:07 device1 mender[5373]: time="2024-06-16T15:48:07-07:00" level=info msg="Device unauthorized; attempting reauthorization"

Here are some things I’ve tested and verified while troubleshooting:

  • My device identity script is not using a value that changes depending on the network interface or MAC address (it’s fetching the Raspberry Pi serial number)
  • The device can ping the configured nameservers (8.8.8.8, 8.8.4.4)
  • It can also reach internet locations such as mender.io
  • Other functionality requiring a healthy interface on wwan0 seems to be working normally
  • I can even see a response from the hosted mender server if I test using curl.
  • There are no new pending authorization requests in mender server
  • The token is valid and my device connects normally if not using the wwan0 interface

I initially suspected a system-time related issue, but my system time syncs properly with NTP internet time servers when the modem connection comes online. Restarting mender-client and mender-connect or waiting to start them until after the modem is already online has no effect.

I also tried upgrading from Mender 3.5.2 to Mender 4.0.2, which splits the tasks performed by mender-client into mender-authd and mender-updated. This has changed the error messages, but in essence the behavior seems to be the same.

Here are the new error messages produced by mender-authd:

Jun 17 12:29:44 device1 mender-auth[555]: record_id=1 severity=info time="2024-Jun-17 12:29:44.207472" name="Global" msg="Successfully loaded private key from /var/lib/mender/mender-agent.pem"
Jun 17 12:29:44 device1 mender-auth[555]: record_id=2 severity=info time="2024-Jun-17 12:29:44.261079" name="Global" msg="The authentication daemon is now ready to accept incoming authentication request"
Jun 17 12:29:45 device1 mender-auth[555]: record_id=3 severity=info time="2024-Jun-17 12:29:45.025369" name="Global" msg="Signing with: /var/lib/mender/mender-agent.pem"
Jun 17 12:29:51 device1 mender-auth[555]: record_id=4 severity=info time="2024-Jun-17 12:29:51.422003" name="Global" msg="Authentication error trying server 'https://hosted.mender.io/': No route to host: POST https://hosted.mender.io/api/devices/v1/authentication/auth_requests: "
Jun 17 12:29:51 device1 mender-auth[555]: record_id=5 severity=error time="2024-Jun-17 12:29:51.422293" name="Global" msg="Failed to fetch new token: Authentication error: No more servers to try for authentication"
Jun 17 12:29:51 device1 mender-auth[555]: record_id=6 severity=info time="2024-Jun-17 12:29:51.459794" name="Global" msg="Signing with: /var/lib/mender/mender-agent.pem"

And here are the error messages produced by mender-updated:

Jun 17 12:29:42 device1 systemd[1]: Started Mender OTA update service.
Jun 17 12:29:44 device1 mender-update[558]: record_id=1 severity=info time="2024-Jun-17 12:29:44.427786" name="Global" msg="Running Mender client 4.0.3"
Jun 17 12:29:44 device1 mender-update[558]: record_id=2 severity=info time="2024-Jun-17 12:29:44.430473" name="Global" msg="The update client daemon is now ready to handle incoming deployments"
Jun 17 12:29:44 device1 mender-update[648]: An authentication token could not be obtained over DBus.
Jun 17 12:29:44 device1 mender-update[558]: record_id=3 severity=error time="2024-Jun-17 12:29:44.727532" name="Global" msg="'/usr/share/mender/inventory/mender-inventory-mender-configure' failed: Process exited with status 1"
Jun 17 12:29:51 device1 mender-update[558]: record_id=4 severity=error time="2024-Jun-17 12:29:51.423967" name="Global" msg="Failed to set new authentication data on HTTP request"
Jun 17 12:29:51 device1 mender-update[558]: record_id=5 severity=error time="2024-Jun-17 12:29:51.424341" name="Global" msg="Request to push inventory data failed: Cannot submit API request"
Jun 17 12:29:51 device1 mender-update[558]: record_id=6 severity=error time="2024-Jun-17 12:29:51.424544" name="Global" msg="Failed to submit inventory: Unauthorized error: Cannot submit API request"
Jun 17 12:31:07 device1 mender-update[558]: record_id=7 severity=warning time="2024-Jun-17 12:31:07.431356" name="Global" msg="Timed-out waiting for a new token"
Jun 17 12:31:07 device1 mender-update[558]: record_id=8 severity=error time="2024-Jun-17 12:31:07.431742" name="Global" msg="Failed to obtain authentication credentials"
Jun 17 12:31:07 device1 mender-update[558]: record_id=9 severity=error time="2024-Jun-17 12:31:07.431960" name="Global" msg="Request to check new deployments failed: Timed-out waiting for a new token"
Jun 17 12:31:07 device1 mender-update[558]: record_id=10 severity=error time="2024-Jun-17 12:31:07.432147" name="Global" msg="Error while polling for deployment: Authentication error: Timed-out waiting for a new token"

This has been a head-scratcher for sure. I’d appreciate any tips!