Problem with rootfs update with *.mender

Hi Drew and crew :slight_smile:

I think I am almost there with the mender integration in our custom hardware. The last step is still to go, though, as my rootfs update via mender keeps on failing. Here is the situation in detail:

  1. I managed to build my ubimg with the five volumes as intended (rootfs x2, uboot-env x2, data), flashed it into my NAND, and I am able to check the integration list, so I am really happy so far.
  2. When trying to do a rootfs update, it fails via server deployment[1] and via local update. As a first step, I can access the two ubifs volumes with no problems (ubi0_0 and ubi0_1) after booting. I read this topic and followed Mirzak’s steps for local installation, leading to:

And finally to a successful update:

Hooray! But after rebooting, the ubifs is corrupted:

and this leads to a stack trace and unfortunately not to a reboot:

stacktrace

The *.mender and the *.ubimg are results from the same build process via Yocto, extract from my local.conf:

localconf

My question is now: are there any further variables to set to get the *.mender artifact to work to flash the rootfs correctly?

Thanks!

[1] in fact there is another problem now and then with the server deployment which needs further investigation but is not related to this topic here

I don’t have any ideas at the moment. Perhaps @kacf can spot something.
Drew

After the mender install step, but before rebooting, are you able to mount the volume manually?

Thanks for answering @kacf . After updating and before booting, the effect is the same:

Can you give me some more information on how I can investigate the problem further? Overall there are several places where the problem could be. Is there a possibility to test the update process manually? For example, could I untar the *.mender file and replace the ubi0_1 volume on my own? Another question: when I define the MENDER_x variables in my local.conf file, these values are available to the bootloader and the kernel implementation, aren't they?

You can try to untar and dd it to the device manually, but I suspect that it won’t work because ubi devices are not regular block devices. But you can try some other things:

  1. Before running an update, can you run sha256sum /dev/ubi0_1? Compare this to the checksum reported by mender-artifact read ARTIFACT.mender.
  2. Can you mount the image on ubi0_1 before you run mender install? At compile time they are filled with the same content, so it should be possible to mount it.
  3. What about the size of the device? Does blockdev --getsize64 /dev/ubi0_1 work? Does it match what mender-artifact read reports?
  4. What does /etc/mender/mender.conf contain?
  5. What does /etc/fw_env.config contain?

Yes.

Update: You need to use sha256sum of course, not md5sum.
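The checksum comparison from item 1 could be scripted roughly as follows. This is only a sketch: the helper names `checksum_prefix` and `compare_checksum` are made up for illustration, and limiting the read to the image size with `head -c` is an assumption (a volume larger than the image would otherwise contribute extra bytes to the hash). The size and expected checksum would come from the output of mender-artifact read.

```shell
#!/bin/sh
# Print the SHA-256 of the first SIZE bytes of a device or file.
# Usage: checksum_prefix PATH SIZE
checksum_prefix() {
    head -c "$2" "$1" | sha256sum | cut -d ' ' -f 1
}

# Compare that checksum against an expected value.
# Usage: compare_checksum PATH SIZE EXPECTED_SHA256
compare_checksum() {
    actual="$(checksum_prefix "$1" "$2")"
    if [ "$actual" = "$3" ]; then
        echo "match"
    else
        echo "mismatch: got $actual"
        return 1
    fi
}

# Example call on the device (placeholders, not values from this thread):
# compare_checksum /dev/ubi0_1 <size-in-bytes> <sha256-from-artifact>
```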

Hi Kristian,
thanks for your reply.

  1. Regarding the sha256 checksum, I found that the ubi0_1 checksum and the *.mender checksum do not match. However, this compares the working version on my device with the mender artifact before installing the update. AFTER doing mender install, they do match (but the ubifs is corrupted).
  2. It is possible to mount the ubi0_1 volume before updating. The mender integration list also works well, and switching between the volumes with a reboot works too.
  3. I did not manage to get a valid output from the blockdev call. This is the case for any ubi volume. As far as I understand, the mender client can handle the volume correctly (see the matching checksum).

  4. My mender.conf (with cut token)
    menderconf
  5. My fw_env:
    fwenv

So according to the checksum comparison, writing to the UBI volume works as intended, but the mismatching checksum before the update is strange. Or are there any hardware-dependent things that must be accessible to the build process besides PEB/LEB? When flashing my *.ubimg file to the NAND, U-Boot takes care of that.

Appreciate your help!

Yes, this is indeed strange. If we look at the code which builds the ubimg, we can see that it uses the raw filesystem blob as input. It’s not rebuilding the filesystem for the artifact, and then for the ubimg. They are literally the same filesystem. So why is the checksum different? Is U-Boot changing it?

It might be helpful if you could dump the U-Boot flashed content on ubi0_1 somewhere and compare it to the one in the original ubifs. For example, use ubireader_display_info -v <FILE> on both.

Okay, I found out some more things. When flashing my NAND via U-Boot I have two different possibilities: either I use the nand write or the nand write.trimffs function (which is also handled here: clicky, more info: clicky). The first one leads to a successful boot, but I cannot reboot afterwards due to a ubi_io_read error with lots of ECC errors. BUT: the checksum of the ubi0_1 volume then matches the one from my host build. On the other hand, the second command leads to a perfectly fine reboot, but the checksum differs (as 0xff data at the end of pages will not be written to the NAND flash). As a result (and as far as I can see) I either have to fix my reboot issue with the normal nand write OR ensure that the kernel writes in a nand write.trimffs mode …

This is a very interesting finding, @buffo. I don’t think the client takes this into account. But it may not need to take it into account if the filesystem size is an exact multiple of the MENDER_UBI_LEB_SIZE (which it should be). Can you check what it is with:

bitbake -e core-image-minimal | grep ROOTFS_SIZE=

You’ll get several numbers, they might all be interesting.

Thanks for the information! Currently I am building with my own image, nevertheless I also checked the minimal-image:

As a result, the IMAGE_ROOTFS_SIZE must be a multiple of MENDER_UBI_LEB_SIZE (126976 in my case)?

And the file size inside the artifact, it matches this? (maybe you can just post the output from mender-artifact read)

Yes, the rootfs size matches this number, both for the direct build output and for the file inside the mender artifact, as 231,508 * 1,024 = 237,064,192.
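The arithmetic can be checked with a few lines of shell. The sketch below uses the two numbers quoted in the thread (231508 KiB and a LEB size of 126976 bytes); the function name `leb_check` is only illustrative.

```shell
#!/bin/sh
# Check whether a rootfs size (in KiB) is a whole number of LEBs.
# Usage: leb_check ROOTFS_KIB LEB_SIZE_BYTES
leb_check() {
    rootfs_bytes=$(($1 * 1024))
    lebs=$((rootfs_bytes / $2))
    remainder=$((rootfs_bytes % $2))
    echo "bytes=$rootfs_bytes lebs=$lebs remainder=$remainder"
}

leb_check 231508 126976
# prints: bytes=237064192 lebs=1867 remainder=0
```

With these values the image is exactly 1867 LEBs long, so the size is indeed a multiple of MENDER_UBI_LEB_SIZE.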

I played a little bit with the mtd-utils-ubifs tools on my device. I don’t know if this information is helpful, as I am not aware of the mender-client mechanics. What I did was:

  1. (Skipped attaching ubi_ctrl as it's already attached)
  2. Removed the erroneous ubi volume with ubirmvol /dev/ubi0 -n 1
  3. Checked available eraseblocks and their size. Please note that the number of available blocks differs by 1 from my expectation (1868 vs. 1867):
    eraseblocks
  4. Create new volume: ubimkvol /dev/ubi0 -N rootfsb_alt -s 237191168
  5. Mount it and untar the *.tar.gz to the volume.

That did the “light update” for me.
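The manual steps above could be collected into a small script along these lines. It is a sketch, not a tested procedure: the archive name and mount point in the example call are hypothetical, the final umount is an addition that the post does not mention, and by default the function only prints the commands (`DRY_RUN=1`) so it can be reviewed before anything on the NAND is touched.

```shell
#!/bin/sh
# With DRY_RUN=1 (the default here) commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Usage: light_update UBI_DEV VOL_ID VOL_NAME SIZE_BYTES ARCHIVE MOUNTPOINT
light_update() {
    ubi_name="${1##*/}"                       # /dev/ubi0 -> ubi0
    run ubirmvol "$1" -n "$2" &&              # remove the broken volume
    run ubinfo "$1" &&                        # show available eraseblocks
    run ubimkvol "$1" -N "$3" -s "$4" &&      # create the new volume
    run mkdir -p "$6" &&
    run mount -t ubifs "$ubi_name:$3" "$6" && # mounting formats an empty volume as ubifs
    run tar -xzf "$5" -C "$6" &&              # unpack the rootfs archive
    run umount "$6"
}

# Example with the values from the post (archive and mount point are assumed):
# light_update /dev/ubi0 1 rootfsb_alt 237191168 rootfs.tar.gz /mnt/rootfsb
```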

Hi @Alan,
as discussed I tried the following items:

Usage of the update module rootfs-v2:
First I created a .conf file at /etc/mender/rootfs-image-v2.conf with the following content:

#!/bin/sh

MENDER_ROOTFS_PART_A="/dev/ubi0_0"
MENDER_ROOTFS_PART_B="/dev/ubi0_1"

Second, I built the corresponding artifact, but when trying to install it I run into the following problem: I cannot cat to the inactive ubifs volume:

root@device:/mnt# mender install my-rootfs-update-1.0.mender 
INFO[0000] Loaded configuration file: /var/lib/mender/mender.conf 
INFO[0000] Loaded configuration file: /etc/mender/mender.conf 
WARN[0000] Could not resolve path link: ubi0_0 Attempting to continue 
WARN[0000] Could not resolve path link: ubi0_1 Attempting to continue 
INFO[0000] Mender running on partition: ubi0_0          
INFO[0000] Start updating from local image file: [my-rootfs-update-1.0.mender] 
Installing Artifact of size 49954816...
INFO[0000] No public key was provided for authenticating the artifact 
INFO[0000] Update module output: /dev/ubi0_1            
INFO[0000] Update module output: streams/swarco-image-scc-air-v2.ext4 
INFO[0000] Update module output: cat: write error: Operation not permitted 
ERRO[0000] Download failed: Payload: can not install Payload: image-device.ext4: Update module terminated abnormally: exit status 1 
ERRO[0000] Payload: can not install Payload: image-device.ext4: Update module terminated abnormally: exit status 1

This is the related snippet (I added some debug output):

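case "$1" in
    Download)
        file="$(cat stream-next)"
        # Debug output: target volume and name of the incoming stream
        echo "$passive"
        echo "$file"
        # This is the write that fails on the UBI volume
        cat "$file" > "$passive"
        if [ "$(cat stream-next)" != "" ]; then
            echo "More than one file in payload"
            exit 1
        fi
        ;;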

What am I doing wrong here? Do I have to set some permissions first?

Was there a successful manual update (by extracting the content of the mender artifact and flashing it to the inactive partition)?
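For reference, extracting the payload by hand could look roughly like the sketch below. It assumes the usual artifact layout, in which the *.mender file is a plain tar archive whose payload sits in data/0000.tar with a compression suffix, and a tar implementation that detects compression on its own (GNU tar does; BusyBox tar may need an explicit flag). The function name `extract_payload` is made up for this example.

```shell
#!/bin/sh
# Unpack the first payload of a Mender artifact into OUTDIR/payload.
# Usage: extract_payload ARTIFACT OUTDIR
extract_payload() {
    artifact="$1"
    outdir="$2"
    mkdir -p "$outdir/outer" "$outdir/payload" || return 1
    # Outer layer: an uncompressed tar archive.
    tar -xf "$artifact" -C "$outdir/outer" || return 1
    # Inner layer: the payload archive; its suffix depends on the compression used.
    for data in "$outdir"/outer/data/0000.tar*; do
        [ -e "$data" ] || { echo "no payload found" >&2; return 1; }
        tar -xf "$data" -C "$outdir/payload" || return 1
    done
    # List the extracted image file(s).
    ls "$outdir/payload"
}

# Example (hypothetical paths):
# extract_payload my-rootfs-update-1.0.mender /tmp/artifact
```

The listed file is the filesystem image that would then be written to the inactive volume.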

Trying to flash the ext4 file manually into the inactive partition ends with the same result as using the rootfs-v2 update module: I cannot write to ubi0_1 due to write permissions. When I mount it first and untar the archive there, it works fine.

Are you using ext4? On UBI only ubifs is supported, as far as I know. This could explain why it can’t be mounted, and why creating it manually and untarring on top of it works.

I think you cannot do direct writing to UBI devices like rootfs-image-v2 does. It was only ever made to work with regular block devices. According to this thread you can use this to flash the volume:

ubiupdatevol /dev/ubi1_0 UPDATEFILE

Okay, so I cannot use the rootfs-image-v2 update module to update my pre-made ubifs? As suggested, the ubiupdatevol method works with no problem, but the update via the mender client still fails.

Ok, I don’t know why that is. You can certainly adapt rootfs-image-v2 to work with ubiupdatevol though. Just look for whether the device matches the /dev/ubi* pattern, and if so, use ubiupdatevol instead of cat.
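A minimal sketch of that dispatch is shown here. The helper names `write_method` and `write_image` are hypothetical and not part of rootfs-image-v2. The pattern is slightly narrower than /dev/ubi* so that /dev/ubi0 and /dev/ubi_ctrl are not treated as volumes, and the image is assumed to be a regular file.

```shell
#!/bin/sh
# Pick a write method from the device path.
# Prints "ubiupdatevol" for UBI volume nodes such as /dev/ubi0_1, "cat" otherwise.
write_method() {
    case "$1" in
        /dev/ubi[0-9]*_[0-9]*) echo "ubiupdatevol" ;;
        *)                     echo "cat" ;;
    esac
}

# Write IMAGE (a regular file) to DEVICE with the chosen method.
# Usage: write_image IMAGE DEVICE
write_image() {
    image="$1"
    device="$2"
    if [ "$(write_method "$device")" = "ubiupdatevol" ]; then
        ubiupdatevol "$device" "$image"
    else
        cat "$image" > "$device"
    fi
}
```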

I suspect though, that ubiupdatevol will not work with a pipe, which is what is given to cat (the file variable in the rootfs-image-v2 script). So it may be better to make a new Update Module with the Download state removed, and rely on calling ubiupdatevol with a file instead, as per the file tree API. If you do make this work, it would be a worthwhile addition to Mender’s Update Modules.
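Such a module might start out like the sketch below. It relies on the Update Module convention that the script receives the state name as its first argument and the file tree path as its second, with the payload stored under files/ when the module does not handle Download itself. Everything else is an assumption made for illustration: the function name `ubi_module`, the fixed passive volume, and the overridable `UBIUPDATEVOL` variable. Selecting the inactive volume, switching the boot partition, reboot handling and rollback are all left out.

```shell
#!/bin/sh
# Sketch of an Update Module that flashes a ubifs image with ubiupdatevol.
# Assumptions: the passive volume is fixed, and the tool name can be overridden.
PASSIVE_VOLUME="${PASSIVE_VOLUME:-/dev/ubi0_1}"
UBIUPDATEVOL="${UBIUPDATEVOL:-ubiupdatevol}"

# Usage: ubi_module STATE FILES_DIR
ubi_module() {
    state="$1"
    files_dir="$2"
    case "$state" in
        ArtifactInstall)
            # No Download state: the payload is expected under FILES_DIR/files.
            count=0
            for f in "$files_dir"/files/*; do
                [ -f "$f" ] || continue
                count=$((count + 1))
                image="$f"
            done
            if [ "$count" -ne 1 ]; then
                echo "Expected exactly one file in payload, found $count" >&2
                return 1
            fi
            "$UBIUPDATEVOL" "$PASSIVE_VOLUME" "$image" || return 1
            ;;
        NeedsArtifactReboot)
            # A real rootfs module would need reboot handling here.
            echo "No"
            ;;
        SupportsRollback)
            echo "No"
            ;;
    esac
    return 0
}

# Installed as a module, the client's arguments would be passed straight through:
# ubi_module "$@"
```

Because ubiupdatevol is given a regular file here rather than a stream, the concern about pipes does not apply to this variant.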