Hang at boot "Waiting for root device ${mender_kernel_root}..."

I tried using mender-convert 3.0.2 with docker-mender-convert and the resulting image fails to boot. One of the last messages printed before it hangs is “Waiting for root device ${mender_kernel_root}…”

Is there some configuration missing or messed up? I would expect the log message to print the actual partition it was waiting for, not what looks like an unexpanded variable.

I’m going to try to rebuild the image with 3.0.1 (what I had used successfully in the past).

Same behavior using 3.0.1 (hangs at boot, same messages).

I have a 3.0.1-based image that I created on 2/27 that works and a 3.0.1-based image that I created yesterday (3/19) that doesn’t. When I boot from an SD card with the working image I see “Waiting for root device /dev/mmcblk0p2” and it continues booting, while the non-working SD card shows “Waiting for root device ${mender_kernel_root}” and hangs. If I mount the boot partitions of both SD cards on a different computer and compare them, 67 files differ, 262 are identical, and 2 files exist only on the non-working card (tpm-slb9673.dtbo and bcm2711-rpi-cm4-io.dtb). Grepping for mender_kernel_root finds it in both kernel7l.img and cmdline.txt, and those files are identical between the two cards. Grepping for mender_setup finds it in kernel7l.img and boot.scr, once again identical.
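
For anyone who wants to repeat this comparison, here is a minimal sketch of how it could be done. It assumes both boot partitions are mounted on another computer; the mount points /mnt/good-boot and /mnt/bad-boot and the function name compare_boot are made up for illustration.

```shell
#!/bin/sh
# Sketch: compare two mounted boot partitions.
# GOOD and BAD are hypothetical mount points; adjust them to your setup.
GOOD="${GOOD:-/mnt/good-boot}"
BAD="${BAD:-/mnt/bad-boot}"

compare_boot() {
    good="$1"; bad="$2"
    # List files that differ or exist on only one side.
    # diff exits with 1 when differences exist, which is expected here.
    diff -rq "$good" "$bad" || true
    # List the files (binary ones included) that mention the U-Boot variable.
    grep -rl 'mender_kernel_root' "$good" "$bad" || true
}

# Only run when both mount points actually exist.
if [ -d "$GOOD" ] && [ -d "$BAD" ]; then
    compare_boot "$GOOD" "$BAD"
fi
```

Lines starting with "Only in" point at files present on a single card, and "Files ... differ" lines mark the ones worth a closer look.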

Any other ideas of where to look?

Hi @chris_wd,

This sounds like an evaluation problem: ${mender_kernel_root} is a variable that U-Boot is supposed to expand to /dev/mmcblk0p2. So the first thing to try would probably be to halt the boot process early and dump the environment with printenv. If you post the resulting environments here, I can have a look too. For the sake of completeness, which base distribution are you using, and which versions?

Greetz,
Josef

I’m not sure how to halt the boot process early; the USB keyboard I’m using doesn’t seem to be recognized until after the boot process has started. I’ll look into a serial console.

The base OS is 2023-02-21-raspios-bullseye-armhf-lite.img.xz plus apt update and some packages being installed. The installation is on an RPi 4.

Yes, please attach a serial console to halt u-boot and start investigating there.

I’m using a serial console and am able to reliably halt booting when using my “good” image but haven’t been able to on my “bad” image.

On my “good” image I see:

U-Boot 2020.01-g83cf4883ec (Dec 08 2022 - 09:34:13 +0000)

DRAM:  1.9 GiB
RPI 4 Model B (0xb03111)
MMC:   mmcnr@7e300000: 1, mmc@7e340000: 0
Loading Environment from MMC... *** Warning - bad CRC, using default environment

In:    serial
Out:   serial
Err:   serial
Net:   Net Initialization Skipped
No ethernet found.

Then “Hit any key to stop autoboot” with a countdown from 2.

On my “bad” image I see none of the above; the first text after the “rainbow screen” is Booting Linux on physical CPU 0x0000000000 [0x410fd083].

I have a monitor connected as well, and in both cases I get some display as soon as it powers on, and that text is the same except for the size and hash of start4.elf and fixup4.dat.

(Screenshots of the “good” image and “bad” image boot screens were attached here.)

Is it possible that my “bad” image somehow has a different boot loader (different build, different version, etc.)? Is the boot loader built by mender-convert or pulled in through the Docker container?

I copied start4.elf and fixup4.dat from the card with the “good” image to the one with the “bad” image and it now boots successfully. I don’t know where these files come from in the mender image process, but there have been changes to the repo between my “good” image on 2/27 and my “bad” image on 3/19, as well as changes afterward: History for boot/start4.elf - raspberrypi/firmware · GitHub
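
A quick way to confirm that these two firmware files are what differs between the cards might look like the sketch below. It assumes both boot partitions are mounted at the hypothetical paths /mnt/good-boot and /mnt/bad-boot; the function name compare_firmware is illustrative only.

```shell
#!/bin/sh
# Sketch: report whether the firmware files differ between two boot partitions.
# The mount points are hypothetical defaults; adjust them to your setup.
GOOD="${GOOD:-/mnt/good-boot}"
BAD="${BAD:-/mnt/bad-boot}"

compare_firmware() {
    good="$1"; bad="$2"
    for f in start4.elf fixup4.dat; do
        # cmp -s is silent and returns 0 only when both files are byte-identical.
        # A missing file also yields a non-zero status.
        if cmp -s "$good/$f" "$bad/$f"; then
            echo "$f: identical"
        else
            echo "$f: differs or is missing"
        fi
    done
}

# Only run when both mount points actually exist.
if [ -d "$GOOD" ] && [ -d "$BAD" ]; then
    compare_firmware "$GOOD" "$BAD"
fi
```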

I’ll try starting over from scratch and hope that I just hit a bad artifact somewhere in the stack that has since been resolved.

TLDR - There’s an issue with raspberrypi-bootloader and/or raspberrypi-kernel version 1.20230317-1, possibly also with 1.20230306-1.

Booting my “good” image and doing apt-get upgrade shows it wants to upgrade the following packages:

linux-libc-dev raspberrypi-bootloader raspberrypi-kernel vcdbg

Checking the versions of raspberrypi-bootloader and raspberrypi-kernel on the “good” image, both are 1.20230106-1 with the latest versions being 1.20230317-1.
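
The version check can be scripted roughly as follows. This is only a sketch: it uses sort -V as an approximation of Debian version ordering, which should be adequate for these date-based version strings, and the function names is_newer and check_packages are made up for illustration. If your system reports the versions with an epoch prefix, KNOWN_GOOD would need to be adjusted to match.

```shell
#!/bin/sh
# Sketch: flag firmware package versions newer than the last known-good one.
KNOWN_GOOD="1.20230106-1"   # version found on the working image in this thread

# Returns 0 when version $1 sorts after version $2 (approximation via sort -V).
is_newer() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | tail -n 1)" = "$1" ]
}

check_packages() {
    for pkg in raspberrypi-bootloader raspberrypi-kernel; do
        # dpkg-query prints the installed version; treat failure as "not installed".
        ver=$(dpkg-query -W -f='${Version}' "$pkg" 2>/dev/null) || ver=""
        if [ -n "$ver" ] && is_newer "$ver" "$KNOWN_GOOD"; then
            echo "$pkg $ver is newer than $KNOWN_GOOD"
        else
            echo "$pkg ${ver:-not installed}"
        fi
    done
}

# Only run the check on systems that have dpkg.
if command -v dpkg-query >/dev/null 2>&1; then
    check_packages
fi
```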

Booting my “bad” image (after replacing start4.elf and fixup4.dat as described above) and doing apt-get upgrade shows everything is up to date.

To verify 1.20230106-1 vs 1.20230317-1 is the issue I did the following:

  • Start with a fresh 2023-02-21-raspios-bullseye-armhf-lite.img.xz
  • Run my OS setup scripts (includes an apt-get upgrade)
  • Remove card and run scripts to generate mender image
  • Burn the mender image to an SD card
  • Boot and verify “Waiting for root device ${mender_kernel_root}…”
  • Repeat steps from above except before apt-get upgrade run the following:
sudo apt-mark hold raspberrypi-bootloader raspberrypi-kernel
  • Verify the resulting mender image boots without issue
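
The hold-then-upgrade step from the list above could be wrapped in a small script like this sketch. The DRY_RUN switch and the helper names run and hold_and_upgrade are my own additions for illustration; by default the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: pin the Raspberry Pi firmware packages before upgrading.
# DRY_RUN=1 (the default here) only prints the commands; set DRY_RUN=0 on the
# device to actually execute them.
DRY_RUN="${DRY_RUN:-1}"
PACKAGES="raspberrypi-bootloader raspberrypi-kernel"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

hold_and_upgrade() {
    # $PACKAGES is intentionally unquoted so it splits into two arguments.
    run sudo apt-mark hold $PACKAGES
    run sudo apt-get update
    run sudo apt-get -y upgrade
}

hold_and_upgrade
```

Once a fixed package version is available, the hold can be released again with apt-mark unhold.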

Hi @chris_wd ,

Thanks a lot for the perfect description of your findings! I’ll see if we can find the reason behind the bootloader and kernel versions being modified.

Greetz,
Josef

You can fix it as follows:

On the Raspberry Pi, modify config.txt so that it contains: arm_64bit=0

See here:
Ref1: RPi 4B full upgrade od 32-bit OS suddenly changes the kernel architecture to aarch64 · Issue #5402 · raspberrypi/linux · GitHub
Ref2: Raspberry Pi Documentation - The config.txt file
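
A minimal sketch of applying the arm_64bit=0 setting from a script is shown below. The default path /boot/config.txt is an assumption and may differ on your image, and the function name set_arm_32bit is made up for illustration. When the setting is missing, the sketch appends it under an [all] filter so that it is not accidentally scoped to an earlier conditional section.

```shell
#!/bin/sh
# Sketch: make sure config.txt contains arm_64bit=0.
# CONFIG is an assumed path; point it at the config.txt of your boot partition.
CONFIG="${CONFIG:-/boot/config.txt}"

set_arm_32bit() {
    cfg="$1"
    if grep -q '^arm_64bit=' "$cfg"; then
        # Rewrite an existing setting in place (a .bak copy is kept).
        sed -i.bak 's/^arm_64bit=.*/arm_64bit=0/' "$cfg"
    else
        # No setting yet: append one under [all] so it applies to every model.
        printf '\n[all]\narm_64bit=0\n' >> "$cfg"
    fi
}

# Only touch the file when it exists and is writable.
if [ -w "$CONFIG" ]; then
    set_arm_32bit "$CONFIG"
fi
```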

Please update the official mender documentation accordingly. And/or better yet: please mention this in the readme here: mender-convert/README.md at master · mendersoftware/mender-convert · GitHub