Mender_boot_part and mender_boot_part_hex variables are not updating after successful FOTA

Dear community,

I’m using Yocto SUMO version meta-mender for our FOTA update.

We are observing a strange issue. i.e., even after successful FOTA, i mean partition is switching from old partition to new partition. But, the uboot environment variables “mender_boot_part” & “mender_boot_part_hex” are not getting updated after this. Due to this, partition is rolling back after another power cycle (reboot).

For example : Assume we are in partition-1(old software) and performed FOTA. After power cycle the partition value is updated to partition-2(new software). But the uboot env variable are like this : mender_boot_part = 1 & mender_boot_part_hex = 1. After another power cycle, due to mender variables are not getting updated, the partition rolls back to partition-1(old software).

Can you help here to find the reason why this uboot environment variable “mender_boot_part” & “mender_boot_part_hex” are not getting updated ?

Hello @vasujois,

Thanks for getting in touch! From a very superficial point of view, it sounds like writing to the u-boot environment fails. Can you please share the mender-client logs on such an update process in order to diagnose?
For better understanding, which board is this? Which integration steps are you following?

Greetz,
Josef

PS: on a side note, sumo is end of life since years by now, so you should definitely upgrade, if only for security reasons.

Please find the below mender logs

time=“2023-01-21T07:00:32Z” level=debug msg=“Configuration file does not exist: /var/lib/mender/mender.conf” module=config
time=“2023-01-21T07:00:32Z” level=debug msg=“Reading Mender configuration from file /etc/mender/mender.conf” module=config
time=“2023-01-21T07:00:32Z” level=info msg=“Loaded configuration file: /etc/mender/mender.conf” module=config
time=“2023-01-21T07:00:32Z” level=debug msg=“Merged configuration = &main.menderConfig{menderConfigFromFile:main.menderConfigFromFile{ClientProtocol:"", ArtifactVerifyKey:"", HttpsClient:struct { Certificate string; Key string; SkipVerify bool }{Certificate:"", Key:"", SkipVerify:false}, RootfsPartA:"/dev/mmcblk0p1", RootfsPartB:"/dev/mmcblk0p2", UpdatePollIntervalSeconds:5, InventoryPollIntervalSeconds:5, RetryPollIntervalSeconds:30, StateScriptTimeoutSeconds:0, StateScriptRetryTimeoutSeconds:0, StateScriptRetryIntervalSeconds:0, ModuleTimeoutSeconds:0, ServerCertificate:"/etc/mender/server.crt", ServerURL:"https://docker.mender.io", UpdateLogPath:"", TenantToken:"dummy", Servers:client.MenderServer{client.MenderServer{ServerURL:"https://docker.mender.io"}}}, ModulesPath:"/usr/share/mender/modules/v3", ModulesWorkPath:"/var/lib/mender/modules/v3", ArtifactInfoFile:"/etc/mender/artifact_info", ArtifactScriptsPath:"/var/lib/mender/scripts", RootfsScriptsPath:"/etc/mender/scripts"}” module=config
time=“2023-01-21T07:00:32Z” level=debug msg=“Have U-Boot variable: mender_check_saveenv_canary=1” module=bootenv
time=“2023-01-21T07:00:32Z” level=debug msg=“List of U-Boot variables:map[mender_check_saveenv_canary:1]” module=bootenv
time=“2023-01-21T07:00:33Z” level=debug msg=“Have U-Boot variable: mender_saveenv_canary=1” module=bootenv
time=“2023-01-21T07:00:33Z” level=debug msg=“List of U-Boot variables:map[mender_saveenv_canary:1]” module=bootenv
time=“2023-01-21T07:00:33Z” level=debug msg=“Have U-Boot variable: mender_boot_part=2” module=bootenv
time=“2023-01-21T07:00:33Z” level=debug msg=“List of U-Boot variables:map[mender_boot_part:2]” module=bootenv
time=“2023-01-21T07:00:33Z” level=debug msg=“Setting active partition from mount candidate: /dev/mmcblk0p2” module=partitions
time=“2023-01-21T07:00:33Z” level=info msg=“Mender running on partition: /dev/mmcblk0p2” module=main
time=“2023-01-21T07:00:33Z” level=debug msg=“I/O read error for entry mender-agent.pem: open /var/lib/mender/mender-agent.pem: no such file or directory” module=dirstore
time=“2023-01-21T07:00:33Z” level=debug msg=“private key does not exist” module=keystore
time=“2023-01-21T07:00:33Z” level=debug msg=“ModuleTimeoutSeconds not set. Defaulting to 14400 seconds” module=modules
time=“2023-01-21T07:00:33Z” level=debug msg=“Starting device update.” module=standalone
time=“2023-01-21T07:00:33Z” level=info msg=“Start updating from local image file: [/logs/app/media/bosch/mbs/data/fwmanager/downloads/2A_CMI_EDGE_OS_07_130_20_dev_820230111123707_ota.pkg]” module=standalone
time=“2023-01-21T07:00:33Z” level=debug msg=“Fetching update from file results: [&{0x4420082b90}], 50937344, ” module=standalone
time=“2023-01-21T07:00:33Z” level=debug msg=“Read data from device manifest file: device_type=imx8dx_ccu” module=device
time=“2023-01-21T07:00:33Z” level=debug msg=“Current manifest data: imx8dx_ccu” module=device
time=“2023-01-21T07:00:33Z” level=debug msg=“statescript: timeout for executing scripts is not defined; using default of 1h0m0s seconds” module=executor
time=“2023-01-21T07:00:33Z” level=info msg=“no public key was provided for authenticating the artifact” module=installer
time=“2023-01-21T07:00:33Z” level=debug msg=“checking if device [imx8dx_ccu] is on compatibile device list: [imx8dx_ccu]\n” module=installer
time=“2023-01-21T07:00:33Z” level=debug msg=“installer: successfully read artifact [name: 07.130.20; version: 3; compatible devices: [imx8dx_ccu]]” module=installer
time=“2023-01-21T07:00:33Z” level=debug msg=“Trying to install update of size: 260046848” module=“dual_rootfs_device”
time=“2023-01-21T07:00:33Z” level=debug msg=“Active partition: /dev/mmcblk0p2” module=partitions
time=“2023-01-21T07:00:33Z” level=debug msg=“Detected inactive partition /dev/mmcblk0p1, based on active partition /dev/mmcblk0p2” module=partitions
time=“2023-01-21T07:00:33Z” level=info msg=“native sector size of block device /dev/mmcblk0p1 is 512, we will write in chunks of 1048576” module=“dual_rootfs_device”
time=“2023-01-21T07:00:33Z” level=info msg=“opening device /dev/mmcblk0p1 for writing” module=“block_device”
time=“2023-01-21T07:00:33Z” level=info msg=“partition /dev/mmcblk0p1 size: 260046848” module=“block_device”
time=“2023-01-21T07:01:12Z” level=info msg=“wrote 260046848/260046848 bytes of update to device /dev/mmcblk0p1” module=“dual_rootfs_device”
time=“2023-01-21T07:01:12Z” level=debug msg=“statescript: timeout for executing scripts is not defined; using default of 1h0m0s seconds” module=executor
time=“2023-01-21T07:01:12Z” level=debug msg=“statescript: timeout for executing scripts is not defined; using default of 1h0m0s seconds” module=executor
time=“2023-01-21T07:01:12Z” level=debug msg=“Inactive partition: /dev/mmcblk0p1” module=partitions
time=“2023-01-21T07:01:12Z” level=debug msg=“Marking inactive partition (/dev/mmcblk0p1) as the new boot candidate.” module=“dual_rootfs_device”
time=“2023-01-21T07:01:12Z” level=info msg=“Enabling partition with new image installed to be a boot candidate: 1” module=“dual_rootfs_device”
time=“2023-01-21T07:01:12Z” level=debug msg=“Have U-Boot variable: mender_check_saveenv_canary=1” module=bootenv
time=“2023-01-21T07:01:12Z” level=debug msg=“List of U-Boot variables:map[mender_check_saveenv_canary:1]” module=bootenv
time=“2023-01-21T07:01:12Z” level=debug msg=“Have U-Boot variable: mender_saveenv_canary=1” module=bootenv
time=“2023-01-21T07:01:12Z” level=debug msg=“List of U-Boot variables:map[mender_saveenv_canary:1]” module=bootenv
time=“2023-01-21T07:01:13Z” level=debug msg=“Marking inactive partition as a boot candidate successful.” module=“dual_rootfs_device”
time=“2023-01-21T07:01:13Z” level=debug msg=“statescript: timeout for executing scripts is not defined; using default of 1h0m0s seconds” module=executor

I see you are using standalone mode. Did you run mender commit after booting into the new partition? This is what makes the change permanent. If you did, perhaps you could post the log from mender --log-level debug commit as well.

Yes, we are using standalone mode.

We observed a new behavior recently. It is like this… After performing FOTA and power cycle (reboot), the partition is switching to new partition. After booting from the new partition, we performed 30 reboots without performing mender commit. Here, the device is booted from the latest partition in 29 iterations out 30 iterations. But, in one iteration we observed the roll back which means the device booted with old partition. Why it is so ?

Does it not required commit after FOTA ?
If yes, then why there is no roll back observed in our test 29 times ? why only one time ?
What technically mender commit does ?
Under which logical conditions mender will update “mender_boot_part” & “mender_boot_part_hex” u-boot environment variables back to old partition values ?

Thank You.

Hi @vasujois,

I’m really sorry, but given the facts that you are on a

  • very outdated Yocto version (Sumo is unmaintained since 2019)
  • presumably outdated Mender client version - you didn’t name it, but the last one we had on Sumo is 2.2.x
  • using it in standalone mode with custom scripting around it
    means that we are dealing with a use case that is highly specific to you.

If you need help in moving forwards with either the client integration/migration in your project, or moving to a newer Yocto release respectively a managed solution, then please get in touch with us directly for professional services.

Thank you!
Josef

Have you got any state scripts installed that could be causing a race condition that may be being run, as I believe from 2.0.0 even in standalone mode the client will run them.

I myself have older products running sumo and mender client that doesn’t exhibit the problems you mention, however I have backported mender 2.5.0 into my sumo builds for those, so could be related to older client, or mender uboot integration code.

1 Like

we are having 2 sets of u-boot environmental variables in the rpmb area. So when we perform the fota one set gets updated from mender.
one observation is seen that after about 100 iteration. Though the mender has updated one of the set, fw_printenv command is referring the old set.
we are using fw_printenv to decide whether the partition has switched or not. Since fw_printenv is pointing to old set of RPMB, “mender_boot_part” and “mender_boot_part_hex” values remains the same. hence we are not going for commit.

  1. is this expected behavior of fw_printenv?
  2. can we use “mount” command instead of fw_printenv? for us “mount” command is giving the updated partition value every time.
  3. do you recommend any other way to get the updated partition value?