Mender update with initramfs

Hello Mender community,

Currently I’m test driving Mender as a possible update solution for our distribution.
We have a compute-module which includes the imx6ull with some onboard flash memory (NAND).
The system uses u-boot, which loads our fitImage that includes the device tree and kernel.

I took inspiration from the community meta layer, which included some things that were required
to get it to run. I’m quite impressed by how quickly I got everything up and running. So far, so good, you
might think.

The real hurdle started today, when I was testing the system with an initramfs implementation, which
we need to support secure boot. I got the following message from the mender client:

mender[1165]: time="2024-03-14T15:33:30Z" level=error msg="Artifact install failed: Payload: can not install Payload: my-image.ubifs: Active root partition matches neither RootfsPartA nor RootfsPartB."

So I dove into the console and verified that all settings were correct, at least that is what I think ;-).

cat /data/mender/mender.conf 
{
    "RootfsPartA": "ubi0_0",
    "RootfsPartB": "ubi0_1"
}
cat /etc/mender/mender.conf
{
    "InventoryPollIntervalSeconds": 300,
    "RetryPollIntervalSeconds": 300,
    "ServerURL": "perhaps-less-secret",
    "TenantToken": "very-secret",
    "UpdatePollIntervalSeconds": 180
}

When I remove the initramfs from the fitImage and adjust the ubiargs, everything boots fine and
the device can be updated by means of Mender. If I reinsert the initramfs, I get the previously
mentioned error.

I believe it has something to do with the differences between the initramfs-enabled boot
process and the one without initramfs, but I can’t figure out why the mender-client fails. The mount
output seems OK to me (I only pasted what I think is needed).

/dev/ubi0_1 on / type ubifs (ro,relatime,assert=read-only,ubi=0,vol=1)
/dev/ubi0_2 on /data type ubifs (rw,relatime,assert=read-only,ubi=0,vol=2)
/data/overlay-etc/upper on /etc type overlay (rw,relatime,lowerdir=/etc,upperdir=/data/overlay-etc/upper,workdir=/data/overlay-etc/work)

I use the trial version of the Mender server, the mender-client is version 3.5.2 (runtime: go1.17.13). The
meta-mender repository is fixed to tag “kirkstone-v2023.12”.

Could anyone point me in the right direction for solving this issue?

Thank you in advance, with kind regards,

Jeffrey Simons

Software Engineer

  • Royal Boon Edam International B.V.

Hi @JSimons,

Judging by the error message and your mount output, this definitely sounds like it should work, with /dev/ubi0_1 as the matching root partition. Can you please post the kernel command line in effect (/proc/cmdline) for both cases? Maybe this gives a useful hint.

Greets,
Josef

Hi @TheYoctoJester,

Thank you for your response.
I had to swap hardware modules because I’m currently not working at home, so this device is still
pointing to ubi0_0 as the primary rootfs.

cat /proc/cmdline
user_debug=30 ubi.mtd=ubi root=/dev/ubi0_0 rw ubi.fm_autoconvert=1 console=tty1 console=ttymxc0,115200n8 consoleblank=0

I hope this gives more insight into my issue.

Jeffrey Simons

  • Royal Boon Edam International B.V.

I had a similar problem in the past. For me, when booting from initramfs the kernel command line changed to root=/dev/loop0 which was not any of the configured A or B partitions.

Hi @ruben ,

Thank you for your suggestion, I validated the cmdline between the initramfs and rootfs switch
but can’t find anything out of the ordinary. I stepped through the mounting process and all looks
OK to me up to the point where it does the switch_root.

Jeffrey

Hi @TheYoctoJester ,

I just had an epiphany ;-). I changed the file mender.conf in /data/mender.

cat mender.conf 
{
    "RootfsPartA": "/dev/ubi0_0",
    "RootfsPartB": "/dev/ubi0_1"
}

Notice the /dev/ that I have inserted before the ubi0_x. That gave me a different message, but at least
I think that I’m on to something (see also the log that I got from mender-client).

Mar 18 13:47:23 cg-6cebdf mender[6535]: time="2024-03-18T13:47:23Z" level=info msg="Mender running on partition: /dev/ubi0_0"
... removed some info
Mar 18 13:47:35 cg-6cebdf mender[6535]: time="2024-03-18T13:47:35Z" level=info msg="successfully received new authorization data from server <REMOVED_INFO>"
Mar 18 13:47:35 cg-6cebdf mender[6535]: time="2024-03-18T13:47:35Z" level=info msg="State transition: inventory-update [Sync] -> check-wait [Idle]"
Mar 18 13:47:35 cg-6cebdf mender[6535]: time="2024-03-18T13:47:35Z" level=info msg="State transition: check-wait [Idle] -> update-check [Sync]"
Mar 18 13:47:35 cg-6cebdf mender[6535]: time="2024-03-18T13:47:35Z" level=info msg="Validating the Update Info: <REMOVED_INFO>"
Mar 18 13:47:35 cg-6cebdf mender[6535]: time="2024-03-18T13:47:35Z" level=info msg="State transition: update-check [Sync] -> update-fetch [Download_Enter]"
Mar 18 13:47:35 cg-6cebdf mender[6535]: time="2024-03-18T13:47:35Z" level=info msg="Running Mender client version: 3.5.2"
Mar 18 13:47:36 cg-6cebdf mender[6535]: time="2024-03-18T13:47:36Z" level=info msg="State transition: update-fetch [Download_Enter] -> update-store [Download_Enter]"
... removed some info
Mar 18 13:47:36 cg-6cebdf mender[6535]: time="2024-03-18T13:47:36Z" level=info msg="Opening device \"/dev/ubi0_1\" for writing"
Mar 18 13:47:36 cg-6cebdf mender[6535]: time="2024-03-18T13:47:36Z" level=info msg="Native sector size of block device /dev/ubi0_1 is 126976 bytes. Mender will write in chunks of 2031616 bytes"
Mar 18 13:47:37 cg-6cebdf mender[6535]: time="2024-03-18T13:47:37Z" level=error msg="Failed to write 172687872 bytes to the new partition"
Mar 18 13:47:37 cg-6cebdf mender[6535]: time="2024-03-18T13:47:37Z" level=info msg="The optimized block-device writer wrote a total of 2 frames, where 2 frames did need to be rewritten (i.e., skipped)"
Mar 18 13:47:37 cg-6cebdf mender[6535]: time="2024-03-18T13:47:37Z" level=error msg="Artifact install failed: Payload: can not install Payload: my-image.ubifs: write /dev/ubi0_1: operation not permitted"

Can I safely assume that mender-client is a bit picky about how it is given the information regarding the
root partition, since that has slightly changed with the initramfs change? (Perhaps that was the
thing you referred to, @ruben?)

What is your advice on solving this issue?

  • I could replace the rootfs init script from Yocto to match the desired outcome by prepending /dev/.
    – or
  • I could modify an option in Mender during the build process.
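
To illustrate the suspected mismatch, here is a minimal sketch. It assumes that the client compares the active root device with RootfsPartA/B as plain strings, which I have not verified in the client source. The helper names normalize_root_dev and matches_a_or_b are made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: bring a root device name into the /dev/... form.
normalize_root_dev() {
    case "$1" in
        /dev/*) printf '%s\n' "$1" ;;
        *)      printf '/dev/%s\n' "$1" ;;
    esac
}

# Hypothetical helper: succeed if the active root matches partition A or B
# once all three names have been normalized.
# $1 = active root, $2 = RootfsPartA, $3 = RootfsPartB
matches_a_or_b() {
    active=$(normalize_root_dev "$1")
    part_a=$(normalize_root_dev "$2")
    part_b=$(normalize_root_dev "$3")
    [ "$active" = "$part_a" ] || [ "$active" = "$part_b" ]
}

# Plain string comparison: "/dev/ubi0_0" is not equal to "ubi0_0".
[ "/dev/ubi0_0" = "ubi0_0" ] || echo "raw names differ"
# After normalization the same pair matches.
matches_a_or_b /dev/ubi0_0 ubi0_0 ubi0_1 && echo "normalized names match"
```

If the comparison really is that strict, either side can be adapted: the configured names or the name that the boot process reports.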

Thank you in advance.

Jeffrey

Hi @TheYoctoJester ,

I built two setups, one running with initramfs and one without. The key difference that I can observe with
my limited knowledge is that the “Mender running on partition” log line differs between the two.

Unit with initramfs:
Mar 21 07:42:17 cg-6ce919 mender[1992]: time="2024-03-21T07:42:17Z" level=info msg="Mender running on partition: /dev/ubi0_0"
Unit without initramfs:
Mar 21 08:15:48 cg-6ced19 mender[594]: time="2024-03-21T08:15:48Z" level=info msg="Mender running on partition: ubi0_0"

Adjusting RootfsPartA/B in mender.conf most likely changes the path that the client tries to write to,
and writing to that location is then denied (not sure here).

Both units have the same kernel cmdline:

user_debug=30 ubi.mtd=ubi root=ubi0_0 rw rootfstype=ubifs ubi.fm_autoconvert=1 console=tty1 console=ttymxc0,115200n8 consoleblank=0
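
For comparing the two units, the root= value can be pulled out of the command line with a few lines of shell. This is only a sketch; root_from_cmdline is a made-up helper name, and it returns the first root= argument it finds.

```shell
#!/bin/sh
# Hypothetical helper: print the value of the first root= argument found in
# a kernel command line string. Fails if there is no root= argument.
root_from_cmdline() {
    for arg in $1; do
        case "$arg" in
            root=*) printf '%s\n' "${arg#root=}"; return 0 ;;
        esac
    done
    return 1
}

cmdline='user_debug=30 ubi.mtd=ubi root=ubi0_0 rw rootfstype=ubifs ubi.fm_autoconvert=1'
root_from_cmdline "$cmdline"    # prints: ubi0_0
```

On a running device the same helper could be fed with "$(cat /proc/cmdline)". Note that rootfstype=ubifs is not picked up, because only arguments starting with root= are matched.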

The only difference is the way they are mounted. The following has been modified in the rootfs script from initramfs-framework:

elif [ "`echo ${bootparam_root} | cut -c1-5`" != "/dev/" ] && \
		[ ! -e "$bootparam_root" ]; then
	debug "No exact path is supplied, prepending /dev/ to $bootparam_root"
	bootparam_root="/dev/${bootparam_root}"
fi
Debug from shell:
DEBUG: No e2fs compatible filesystem has been mounted, mounting ubi0_0...
DEBUG: No exact path is supplied, prepending /dev/ to ubi0_0
<some ubi mounting messages>
DEBUG: Loading module finish
DEBUG: Running finish_run
Switching root to '/rootfs'...
DEBUG: Moving basic mounts onto rootfs
DEBUG: Moving /dev, /proc and /sys onto rootfs...
PREINIT: Start

Perhaps this will give some insights.

Thank you,

Jeffrey

Hi @JSimons,

I gave it a cursory inspection, and indeed UBIFS requires special handling. That’s also why we have the two main Yocto classes mender-full and mender-full-ubi. :sweat_smile:
Yet as far as I can tell, none of the ready made setups that we have is initramfs-aware. So my advice would be to start with Mender Client 4.0, because there the Update Module for the root filesystem is factored out and can easily be modified. Have a look at mender/support/modules/rootfs-image at master · mendersoftware/mender · GitHub for starters, and you probably can adjust it to your integration straight away.

Greets,
Josef

Hi @TheYoctoJester ,

Thank you for the quick response.
I will take a look at Mender 4.0 and let you know whether it works.

Kind regards,

Jeffrey

Hi @TheYoctoJester ,

I got the 4.0.1 version of Mender integrated, and it appears that the message regarding the rootfs has
disappeared, which is good. It also seems to me that it wants to start downloading the new update.

Never mind what I wrote here, the update works!!!

After taking a close look at the rootfs install file, I saw mender-flash being included. I removed it from the
bin dir and boom, it worked! Now I only need to whisper to Yocto and meta-mender to not include the
mender-flash tool.

Thank you for your support @TheYoctoJester, I can now move on.

Have a nice weekend,

Jeffrey

Hi @JSimons,

Thanks for the report! Just so I understand correctly:

  • with Mender Client 4.0 the procedure works out of the box, except,
  • mender-flash needs to be removed…where? From /bin of the root filesystem?
    That definitely sounds strange.

Greets,
Josef

Hi @TheYoctoJester ,

Yes (somewhat out of the box). I had to remove mender-flash from some bindir (I just ran rm $(which mender-flash)), so I’m not sure from which directory I removed it :wink: .
Remember that I use ubifs, which needs the ubiupdatevol command, and there is a test for mender-flash
which prevents ubiupdatevol from running (see also line 171 of the rootfs-update script).
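
To make the ordering problem concrete, here is a rough sketch of a selection order that would keep UBI volumes on ubiupdatevol even when mender-flash is installed. This is not the actual logic of the rootfs-image Update Module; the helper names, the path pattern and the cat fallback are assumptions for illustration only.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the path looks like a UBI volume node (/dev/ubiX_Y).
is_ubi_volume_path() {
    case "$1" in
        /dev/ubi[0-9]*_[0-9]*) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical selection order: UBI volumes always go to ubiupdatevol,
# mender-flash is only considered for other devices.
# $1 = target device, $2 = "yes" if mender-flash is installed.
choose_writer() {
    if is_ubi_volume_path "$1"; then
        echo ubiupdatevol
    elif [ "$2" = yes ]; then
        echo mender-flash
    else
        echo cat
    fi
}

choose_writer /dev/ubi0_1 yes      # prints: ubiupdatevol
choose_writer /dev/mmcblk0p2 yes   # prints: mender-flash
```

With an order like this, removing mender-flash from the image would not be needed for the UBI case.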

I will take a closer look on Monday and I’m happy to provide more information if you want that.

Have a nice weekend.

Jeffrey

@JSimons ah right, the ubi thing. Yes, let’s try to dig into it on Monday. :+1:

Hello @TheYoctoJester ,

Sorry that it took me a bit longer than expected to get some more information regarding the update.

When an update is initiated from the Mender server instance, then mender-flash reports:

Mar 27 10:56:26 cg-6ced19 mender-update[1676]: record_id=9 severity=error time="2024-Mar-27 10:56:26.696076" name="Global" msg="Broken pipe: AsyncWrite failed"

It keeps failing even after executing a rollback followed by a reboot. It just does not want to write the
data to the flash device.
When I install the version that is already on the device, I get something like a verify pass that completes fine, which leads me to believe that it only fails when writing (reading is fine). Important: this has been done by
means of a USB flash drive that holds the Mender update file.

Installing artifact...
100%record_id=1 severity=info time="2024-Mar-27 11:13:05.619920" name="Global" msg="Update Module output (stdout): ================ STATISTICS ================" 
record_id=2 severity=info time="2024-Mar-27 11:13:05.622764" name="Global" msg="Update Module output (stdout): Blocks written: 0" 
record_id=3 severity=info time="2024-Mar-27 11:13:05.623276" name="Global" msg="Update Module output (stdout): Blocks omitted: 165" 
record_id=4 severity=info time="2024-Mar-27 11:13:05.623659" name="Global" msg="Update Module output (stdout): Bytes  written: 0" 
record_id=5 severity=info time="2024-Mar-27 11:13:05.623990" name="Global" msg="Update Module output (stdout): ============================================" 

I took a quick peek at mender-flash source and it appears to be ubi aware, so I assume that it should work.

Kind regards,

Jeffrey

PS. Sorry for the edited post, my finger went too quickly towards the enter key…

Hi @TheYoctoJester ,

I tiptoed into the C++ implementation (important: I’m no C++ expert) and printed some values to
stderr to validate what is happening. I think I have found a reason why mender-flash fails to
write.

I added this check to verify whether mender-flash detects a UBI device.

main.cpp @ line 111
	if (isUBI == 0) {
		std::cerr << "Not UBI!" << std::endl;
	} else {
		std::cerr << "Is UBI!" << std::endl;
	}

Reply:

record_id=6 severity=info time="2024-Mar-27 14:20:11.018863" name="Global" msg="Update Module output (stderr): Not UBI!"

So I dove a bit deeper into the source code and added another debug print.

platformfs.cpp @ line 229
std::cerr << "st_rdev: " << statbuf.st_rdev << UBIMajorDevNo << (S_ISCHR(statbuf.st_mode) && major(statbuf.st_rdev)) << "Printed.";

Reply:

record_id=1 severity=info time="2024-Mar-27 14:20:11.011341" name="Global" msg="Update Module output (stderr): st_rdev: {...}" 
record_id=2 severity=info time="2024-Mar-27 14:20:11.012118" name="Global" msg="Update Module output (stderr): 62722{...}" 
record_id=3 severity=info time="2024-Mar-27 14:20:11.012566" name="Global" msg="Update Module output (stderr): 250{...}" 
record_id=4 severity=info time="2024-Mar-27 14:20:11.012915" name="Global" msg="Update Module output (stderr): 1{...}" 
record_id=5 severity=info time="2024-Mar-27 14:20:11.013234" name="Global" msg="Update Module output (stderr): Printed.{...}" 

The values returned are quite different, so the comparison

return S_ISCHR(statbuf.st_mode) && major(statbuf.st_rdev) == UBIMajorDevNo;

will always be false, which leads to the wrong access method being used for the device.

If I force true to be returned then the implementation appears to be working, so I guess that the
IsUBIDevice mechanism does not work as intended (the path being passed to the method is correct
“/dev/ubi0_1”).
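
As a cross-check on the printed values: assuming that 62722 is the st_rdev value (the print order suggests so) and the usual Linux dev_t layout for small numbers (major in the bits above the low byte, minor in the low 8 bits), the number can be decoded with shell arithmetic. This is only a sketch; dev_major and dev_minor are made-up helper names.

```shell
#!/bin/sh
# Decode the low bits of a Linux dev_t as exposed in st_rdev.
# Assumption: major < 4096 and minor < 256, so the extended high bits are zero.
dev_major() { echo $(( ($1 >> 8) & 0xfff )); }
dev_minor() { echo $(( $1 & 0xff )); }

dev_major 62722   # prints: 245
dev_minor 62722   # prints: 2
```

Under these assumptions the device has major 245 and minor 2, while the value printed for UBIMajorDevNo is 250. The printed 1 only shows that the node is a character device with a non-zero major; it is not the result of the comparison itself.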

Could one of your experts take a look at this and verify the behaviour?

Thank you,

Jeffrey

Hi @JSimons,

Thanks a lot for the deep dive on this! Pinging @kacf for additional input.
EDIT: the long Easter weekend is right ahead, so please allow for some delay.

Greets,
Josef

Hi @TheYoctoJester ,

no worries, I’ll be having a short Easter break as well (back on the 8th of April).

I do have some additional input to share. I dove further into the issue together with a colleague,
and we found something that might be interesting. The major number that the mender-flash tool
uses to determine whether it is a UBI device (250 in your case) differs from ours, which is 245.

I printed the device list from the kernel, and it showed me the following (where ubi0 resides at major number 245):

cat /proc/devices 
Character devices:
  1 mem
  4 /dev/vc/0
  4 tty
  5 /dev/tty
  5 /dev/console
  5 /dev/ptmx
  7 vcs
 10 misc
 13 input
 29 fb
 81 video4linux
 89 i2c
 90 mtd
116 alsa
128 ptm
136 pts
153 spi
180 usb
189 usb_device
207 ttymxc
244 ttyGS
245 ubi0
246 pxp_device
247 hidraw
248 rpmb
249 watchdog
250 iio
251 ptp
252 pps
253 rtc
254 gpiochip

The kernel documentation states the following:
[image: excerpt from the kernel documentation, not preserved]

So I think that the assumption of 250 is not correct and that the value should be determined for the platform
it is running on. Do you agree, or did we make a wrong assumption?
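
One way to determine the value at runtime instead of hard-coding it would be to look it up in /proc/devices. The following is a minimal sketch of that idea and not how mender-flash does it; char_major_for is a made-up helper name. The character entries in the sample input come from the listing above, the block section is invented for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read /proc/devices-style text on stdin and print the
# major number registered under the given name in the character device
# section. Fails if the name is not found there.
char_major_for() {
    awk -v name="$1" '
        /^Character devices:/ { in_char = 1; next }
        /^Block devices:/     { in_char = 0 }
        in_char && $2 == name { print $1; found = 1; exit }
        END { exit found ? 0 : 1 }
    '
}

char_major_for ubi0 <<'EOF'
Character devices:
245 ubi0
250 iio

Block devices:
 31 mtdblock
EOF
# prints: 245
```

On the device itself the call would be char_major_for ubi0 < /proc/devices, and the result could then be compared with the major number of the target node.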

Enjoy the Easter holidays,

Kind regards,

Jeffrey

Great investigation, @JSimons! Looks like this is a bug, and I have reported it here. I will ask to get this prioritized too.
