Ok, testing again after several changes to the Mender artifact and the deployment mostly works!
The system rebooted and the deployment was stuck on “rebooting” status. I had to manually approve the device. But then the deployment downloads the artifact again and reboots the device.
So there is some weird update loop. This is after creating an artifact with a unique name. The client and server have at times refused to move forward because the artifact was either already applied or uploaded.
That sounds very much like the device identity is not stable, e.g. the script /usr/share/mender/identity returns different values before and after reboot. By default it uses the first MAC address, but if that changes over time you need to replace it with a unique identifier such as a serial number or comparable.
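For illustration, here is a minimal sketch of an identity script that uses a serial number instead of the MAC address. As far as I know the client expects key=value lines on stdout from this script. The function name print_identity and the device-tree path in the usage comment are assumptions for the example, not something taken from this thread; use whatever stable identifier your board actually exposes.

```shell
#!/bin/sh
# Sketch only: print a stable identity as "key=value" on stdout.
# print_identity is a hypothetical helper name.
print_identity() {
    serial_file="$1"
    # Device-tree strings are NUL-terminated, so strip NULs and whitespace.
    serial="$(tr -d '\000[:space:]' < "$serial_file")"
    if [ -z "$serial" ]; then
        echo "no serial number found in $serial_file" >&2
        return 1
    fi
    echo "serial=$serial"
}

# In an installed script the last line could be (path is an assumption):
# print_identity /sys/firmware/devicetree/base/serial-number
```

Running the script by hand before and after a reboot and comparing the output is a quick way to confirm the identity really is stable.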
The MAC address and interface are stable. The address is provided to Linux from U-Boot.
It is weird to me that it is possible to go from rebooting state to downloading automatically.
It seems like this is more likely related to Mender not saving state between updates, given that the following is always logged:
Returning artifact name from /etc/mender/artifact_info file. This is a fallback, in case the information can not be retrieved from the database, and is only expected when an update has never been installed before.
And the fact that I have to manually approve the device after every update.
And if the identity was changing wouldn’t that mean that the deployment would be stuck on rebooting? The deployment is based on a static group with one device.
/etc/mender/artifact_info exists and contains device-rootfs, which is not the name of the actual Mender artifact I built.
Should I write the unique artifact name I pass to mender-artifact into /etc/mender/artifact_info? That doesn’t seem to match this example:
artifact_name=1.0 is always written to artifact_info and basically a git tag is passed to mender-artifact write rootfs-image --artifact-name .....
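A rough sketch of keeping the two names identical, in case it helps someone following along. The helper write_artifact_info and the variables TARGET_DIR and DEVICE_TYPE, as well as the rootfs.ext4 file name, are placeholders I made up for the example; only the artifact_name=... file format and the mender-artifact subcommand come from the posts above. This only shows how to make the names match; it is not a claim that matching names fixes the loop.

```shell
#!/bin/sh
# Sketch only: write the artifact name into the root filesystem tree so it
# matches the name later passed to mender-artifact.
# write_artifact_info is a hypothetical helper name.
write_artifact_info() {
    name="$1"        # e.g. a git hash or tag
    rootfs_dir="$2"  # root of the target filesystem tree being built
    mkdir -p "$rootfs_dir/etc/mender"
    printf 'artifact_name=%s\n' "$name" > "$rootfs_dir/etc/mender/artifact_info"
}

# Assumed usage in a build script (all variables are placeholders):
# NAME="$(git rev-parse --short HEAD)"
# write_artifact_info "$NAME" "$TARGET_DIR"
# mender-artifact write rootfs-image -t "$DEVICE_TYPE" -n "$NAME" \
#     -f rootfs.ext4 -o "$NAME.mender"
```

The file has to be written before the filesystem image is generated, otherwise the image still carries the old name.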
Ok, so now I have the same value (i.e. a git hash) passed to -n and in artifact_info on the RFS. It doesn’t go into a boot loop, but the deployment fails with the following:
failure: Device provided conflicting request data.
And the board receives a 409:
Error receiving scheduled update data: failed to check update info on the server.
And the debug logs:
Nov 22 16:13:15 buildroot mender: time="2022-11-22T16:13:15Z" level=debug msg="Request: \"\" \"\" \"https\" \"hosted.mender.io\" \"/api
Nov 22 16:13:15 buildroot mender: time="2022-11-22T16:13:15Z" level=debug msg="request not accepted by the server: (POST https://hosted.mender.io/api/devices/v2/deployments/device/deployments/next): Response code: 409"
This feels like some state that isn’t persisted between updates. The client seems to be checking for new updates, while the server is in the middle of an update.
Looks like I need /var/lib/mender to persist…
The cause was in fact that /var/lib/mender was not persisted. I highly recommend using the upstream service file rather than the one currently in Buildroot. If using systemd, add a data.mount unit and update the service file to include:
ExecStart=/usr/bin/mender --data "/data/mender" daemon
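For anyone hitting the same symptoms, a quick way to see where the Mender state directory actually lives is to ask df for its mount point. The sketch below is only a rough check under my own assumptions: the helper names mount_point_of and is_on_rootfs are made up, and it assumes mount points without spaces. If the directory sits on "/", it is replaced together with the root filesystem on every update. A tmpfs mount would also lose state, so a result other than "/" is not proof of persistence on its own.

```shell
#!/bin/sh
# Sketch only: report which mount point a directory lives on.
# mount_point_of and is_on_rootfs are hypothetical helper names.
mount_point_of() {
    # df -P prints a header line and then one line whose 6th field is
    # the mount point (assuming it contains no spaces).
    df -P "$1" | awk 'NR == 2 { print $6 }'
}

is_on_rootfs() {
    [ "$(mount_point_of "$1")" = "/" ]
}

# Assumed usage on the device:
# if is_on_rootfs /var/lib/mender; then
#     echo "Mender state is on the root filesystem and will not survive an update"
# fi
```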