Manual install after rollback

bender · December 7, 2021, 8:00pm

For reasons outside Mender, one of our deployments to version 2 could not commit and rolled back to the previously active partition (version 1). The inactive partition seems to have version 2 intact (sha256sum reports the correct hash).
Is there a way to force the switch to version 2 without re-downloading the file? The remote machines are reached over an (expensive) satellite link, so we are trying to save resources.
Thanks

lramirez · December 7, 2021, 9:43pm

Hello @bender,

Do you have access to the machines? If so, I would switch the bootloader’s active partition using fw_setenv as described in step 5 and then reboot the device.

Does this approach work for you?

Regards,
Luis

bender · December 7, 2021, 10:13pm

That is the first step of the plan, but how do I get it synced up afterwards? I assume show-artifact would still return version 1’s artifact. Am I wrong?
By the way, version 1 has mender client 2.3.0-dirty and version 2 has 2.6.1.

lramirez · December 7, 2021, 10:55pm

Hello @bender,

Assuming everything is OK with the inactive partition and it contains exactly the data you expect, this procedure should work. As soon as you switch the active partition, the client will pick up the artifact name and the mender-client version from it, so the inventory will be updated with the correct information sooner or later, depending on the polling interval you defined. If you have a development board, I would recommend testing the partition switch there to check whether this flow gives you the expected result.

Regards,
Luis

bender · December 8, 2021, 10:22am

While booted from partition 2, we ran these:

root@host:~ # mount | head -n 1
/dev/mmcblk0p2 on / type ext4 (ro,relatime)

root@host:~ # sha256sum /dev/mmcblk0p3 
56731593b02978134b078bbee7927f660c9f5294c6e498031fb090d35d43d55c  /dev/mmcblk0p3

root@host:~ # fw_setenv mender_boot_part_hex 3
root@host:~ # fw_setenv mender_boot_part 3
root@host:~ # fw_setenv upgrade_available 1

p3’s hash is correct for version 2.
We then rebooted into partition 3 and ran these:

root@host:~ # mender show-artifact
1
root@host:~ # mender show-provides
artifact_group=
artifact_name=1
rootfs_image_checksum=e8cf8de6f6a56a037753570746264eb78e9b6f6f7a0bd0ecfa4341ad2155afd3

which indeed is the hash of version 1 (and obviously not 2).

Can I “force” the Mender client to reevaluate?

oleorhagen · December 8, 2021, 12:06pm

The Artifact name and Provides are stored in the database, and are as such persistent across reboots, updates and rollbacks.

If the database is deleted, the device will fall back to using the artifact_name file, which you should check is present and correct on the new partition.

Then you can go ahead and delete the database (on the test device of course), and see if this works out for you.

Your device provides should now be empty, and the artifact name should reflect the value in the artifact_name file.

bender · December 8, 2021, 12:08pm

Can this procedure cause any other issues? Will we be able to send a differential update afterwards?
Please provide the command to delete the database.

oleorhagen · December 8, 2021, 1:09pm

Yes, your device provides will be gone afterwards, so the database will have to be repopulated through a new update.

No, a differential update will not work afterwards, for the same reason as above.

The database can be found at /var/lib/mender/mender-store on the data partition.
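For a test device, the check-then-delete sequence could look roughly like the sketch below. It rests on assumptions: the artifact_name file is taken to be at /etc/mender/artifact_name, the store path is the one given above, and `reset_mender_store` is a hypothetical helper name, not a Mender command. It moves the store aside rather than deleting it, so the step can be undone.

```shell
#!/bin/sh
# Sketch only: move the Mender store aside so the client falls back to the
# artifact_name file. Try this on a test device first.

# Assumed default locations; both can be overridden from the environment.
STORE="${STORE:-/var/lib/mender/mender-store}"
ARTIFACT_NAME_FILE="${ARTIFACT_NAME_FILE:-/etc/mender/artifact_name}"

# reset_mender_store is a hypothetical helper, not part of Mender.
reset_mender_store() {
    # Refuse to continue if the fallback file is missing or empty.
    if [ ! -s "$ARTIFACT_NAME_FILE" ]; then
        echo "missing or empty: $ARTIFACT_NAME_FILE" >&2
        return 1
    fi
    echo "fallback file contains: $(cat "$ARTIFACT_NAME_FILE")"

    # Keep a backup instead of deleting outright.
    if [ -e "$STORE" ]; then
        mv "$STORE" "$STORE.bak"
    fi
}
```

Stop the Mender client service before calling `reset_mender_store` and start it again afterwards. The service name depends on the client version, so it is left out here.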

bender · December 8, 2021, 1:22pm

A non-differential update costs us the same as resending version 2 now, so this is also not an option. Is there any alternative? I assume not, but it is worth asking.

oleorhagen · December 8, 2021, 2:04pm

There is possibly a way around this, but it is HACKY, so please be careful here.

The database is an LMDB database; it is not big, and the values are stored as text.

You could try simply changing the text string for the image checksum there.

A better approach would probably be to use something like this to rewrite the db.

However, I have never tried this myself, nor do I know of anyone who has, so please be very careful: it subverts all the consistency built into Mender by default.

But on a local test device it would be interesting to see what you get out of it.
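Before changing anything, it may help to dump a copy of the store to text and look at what is in it. The following is a sketch under assumptions: the `mdb_dump` tool from the LMDB utilities is installed, mender-store is a single-file LMDB environment (hence `-n`), and `dump_store_copy` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: dump a COPY of the store to printable text for inspection.

# dump_store_copy is a hypothetical helper, not part of Mender or LMDB.
# Usage: dump_store_copy /path/to/mender-store /path/to/output.txt
dump_store_copy() {
    src="$1"
    out="$2"
    [ -f "$src" ] || { echo "no such file: $src" >&2; return 1; }
    command -v mdb_dump >/dev/null 2>&1 || { echo "mdb_dump not found" >&2; return 2; }

    # Work on a copy so the original store is never opened.
    work="$(mktemp -d)" || return 3
    cp "$src" "$work/mender-store"

    # -n: the environment is a single file, not a directory.
    # -p: print keys and values in printable form where possible.
    # Without -a or -s this dumps the main database only.
    mdb_dump -n -p "$work/mender-store" > "$out"
    status=$?
    rm -rf "$work"
    return "$status"
}
```

Working on a copy keeps the experiment away from the live store; any edited result would still have to be loaded back and tested on a local device first.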

bender · December 14, 2021, 7:27am

Still struggling to get LMDB to work and open the db.

bender · December 14, 2021, 8:59am

Does anyone have any more experience with LMDB? I’m on Ubuntu:

$ ls lab
mender-store
$ lmdb_stat lab/
Status of Main DB
  Tree depth: 0
  Branch pages: 0
  Leaf pages: 0
  Overflow pages: 0
  Entries: 0
