Mender authorization fails after update

Hi,

I tested a recent deployment of the root filesystem (RFS) to a device in the lab and it succeeded (I still owe you the board integration for the Digi iMX6 module). However, when we try to update devices out in the field, the update itself appears to succeed, but the device then fails to communicate with the server. It gives up after numerous attempts and rolls back.

I see a bunch of these errors:

2025-03-05 13:37:48 +0000 UTC info: Device unauthorized; attempting reauthorization
2025-03-05 13:37:48 +0000 UTC info: Output (stderr) from command "/usr/share/mender/identity/mender-device-identity": using interface /sys/class/net/eth0
2025-03-05 13:37:48 +0000 UTC error: Failure occurred while executing authorization request: Method: Post, URL: https://hosted.mender.io/api/devices/v1/authentication/auth_requests
2025-03-05 13:37:48 +0000 UTC error: Failed to authorize with "https://hosted.mender.io": Unknown url.Error type: dial tcp: lookup hosted.mender.io on [::1]:53: read udp [::1]:41638->[::1]:53: read: connection refused
2025-03-05 13:37:48 +0000 UTC warning: Reauthorization failed with error: transient error: authorization request failed
2025-03-05 13:37:48 +0000 UTC error: Failed to report status: transient error: authorization request failed
2025-03-05 13:37:48 +0000 UTC error: error reporting update status: reporting status failed: transient error: authorization request failed
2025-03-05 13:37:48 +0000 UTC error: Failed to send status report to server: transient error: reporting status failed: transient error: authorization request failed

Eventually it gives up and rolls back:

2025-03-05 13:38:48 +0000 UTC error: transient error: Tried sending status report maximum number of times.
2025-03-05 13:38:48 +0000 UTC info: State transition: update-pre-commit-status-report-retry [ArtifactCommit_Enter] -> rollback [ArtifactRollback]
2025-03-05 13:38:48 +0000 UTC info: Performing rollback
2025-03-05 13:38:48 +0000 UTC info: Rolling back to the inactive partition (2).
2025-03-05 13:38:48 +0000 UTC info: State transition: rollback [ArtifactRollback] -> rollback-reboot [ArtifactRollbackReboot_Enter]

Can you help me figure out what’s going wrong?

Hi @mabembedded,

By default, an update only counts as successful if the Mender Client restarts and is able to contact the backend. So if the Client is not able to reach the server after the reboot, it will roll back.
The line

2025-03-05 13:37:48 +0000 UTC error: Failed to report status: transient error: authorization request failed

sounds strange. Maybe a long shot, but possibly there is a problem with the persistent data. The keys for authorizing with the backend are on /data by default, and if for whatever reason they cannot be accessed after a reboot, the behavior would look exactly like that. So my first advice would be to check that /data is mounted before the client starts, and to make sure the persistent data location is actually where you expect it.
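
A quick way to run that check on the device could look like the sketch below. It is only a sketch under assumptions: the function name is made up for illustration, and the key path /var/lib/mender/mender-agent.pem (usually a link into /data) is what I would expect on a default setup, so adjust both paths to your image.

```python
import os
from pathlib import Path


def check_persistent_data(mount_point="/data",
                          key_file="/var/lib/mender/mender-agent.pem"):
    """Return a list of problems found; an empty list means nothing obvious.

    Both default paths are assumptions about a typical Mender image.
    """
    problems = []

    # The persistent partition should be a real mount, not a plain directory
    # on the root filesystem.
    if not os.path.ismount(mount_point):
        problems.append(f"{mount_point} is not a mount point")

    # is_file() follows symlinks, so a dangling link into /data shows up here.
    key = Path(key_file)
    if not key.is_file():
        problems.append(f"{key_file} is missing or not a regular file")
    elif not os.access(key, os.R_OK):
        problems.append(f"{key_file} is not readable")
    elif key.stat().st_size == 0:
        problems.append(f"{key_file} is empty")

    return problems
```

Calling check_persistent_data() early in boot (before the client starts) and again once the system is up should show whether the keys only become available late.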

Greetz,
Josef

Ok, the issue ended up being specific to how we’re networked. Essentially, the WiFi network (which is what most users use in the field) has an Internet connection, but the Ethernet connection does not (it only has local network connectivity).

Prior to an update, the user configures the WiFi through the normal use case of the device (ultimately via NetworkManager/nmcli). However, since the connection information is stored in /etc/, it is lost when an update replaces the root filesystem. And since the Ethernet connection has an IP address but can’t reach the Mender backend, I imagine the client thinks that something is wrong and reverts.
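
For anyone hitting the same thing: the log above shows the lookup for hosted.mender.io being sent to a resolver on localhost and refused, so the failure happens before any authorization request reaches the server. A small check like the following separates "DNS is broken" from "DNS works but the backend is unreachable". It is a rough sketch, not part of Mender; check_backend is a name I made up, and the default host and port are simply taken from the log URL.

```python
import socket


def check_backend(host="hosted.mender.io", port=443, timeout=5.0):
    """Return "ok", or a short description of the first step that failed."""
    # Step 1: name resolution, which is the step that fails in the log above.
    try:
        candidates = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror as exc:
        return f"DNS lookup failed: {exc}"

    # Step 2: try a plain TCP connect to each resolved address in turn.
    last_error = None
    for family, socktype, proto, _canonname, sockaddr in candidates:
        try:
            with socket.socket(family, socktype, proto) as sock:
                sock.settimeout(timeout)
                sock.connect(sockaddr)
            return "ok"
        except OSError as exc:
            last_error = exc
    return f"TCP connect failed: {last_error}"
```

Running print(check_backend()) once with only Ethernet up and once with WiFi up should make the difference between the two links obvious. It only tests DNS and TCP, not TLS or the authorization itself.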

I’m working on moving NetworkManager settings to /data/.
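
What I have in mind is roughly the sketch below: copy the existing profiles to /data once and leave a symlink behind in /etc. Treat it as an untested illustration. migrate_profiles is a hypothetical helper, /etc/NetworkManager/system-connections is the usual NetworkManager profile directory, and the destination under /data is just a path I picked. Since the update replaces /etc, the symlink would also have to be part of the new image (or be recreated on first boot) for this to survive the next update.

```python
import os
import shutil
from pathlib import Path


def migrate_profiles(src="/etc/NetworkManager/system-connections",
                     dst="/data/NetworkManager/system-connections"):
    """Copy connection profiles to dst and replace src with a symlink to it.

    Returns the names of the profiles that were copied. Safe to run again:
    if src is already a symlink, nothing is changed.
    """
    src, dst = Path(src), Path(dst)
    dst.mkdir(parents=True, exist_ok=True)
    os.chmod(dst, 0o700)

    if src.is_symlink():
        return []  # already migrated

    copied = []
    if src.is_dir():
        for profile in sorted(src.iterdir()):
            target = dst / profile.name
            # Never overwrite a profile that already lives on /data.
            if profile.is_file() and not target.exists():
                shutil.copy2(profile, target)
                os.chmod(target, 0o600)  # profiles can contain WiFi secrets
                copied.append(profile.name)
        shutil.rmtree(src)

    src.parent.mkdir(parents=True, exist_ok=True)
    src.symlink_to(dst, target_is_directory=True)
    return copied
```

As far as I know, recent NetworkManager versions also have a path setting in the [keyfile] section of NetworkManager.conf, which might make the symlink unnecessary. I have not tried that yet.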