I tried using mender-convert 3.0.2 with docker-mender-convert and the resulting image fails to boot. One of the last messages printed before it hangs is “Waiting for root device ${mender_kernel_root}…”
Is there some configuration missing or messed up? I would expect the log message to print the actual partition it was waiting for, not what looks like a variable.
I’m going to try to rebuild the image with 3.0.1 (what I had used successfully in the past).
same behavior using 3.0.1
(hangs at boot, same messages)
I have a 3.0.1-based image that I created on 2/27 that works and a 3.0.1-based image that I created yesterday (3/19) that doesn’t. When I boot from an SD card with the working image I see “Waiting for root device /dev/mmcblk0p2” and it continues booting, while the non-working SD card shows “Waiting for root device ${mender_kernel_root}” and hangs. If I mount the boot partition of both SD cards on a different computer and compare them, I get 67 files that differ, 262 that are identical, and 2 files that exist only on the non-working card (tpm-slb9673.dtbo and bcm2711-rpi-cm4-io.dtb). If I grep for mender_kernel_root, it shows up in both kernel7l.img and cmdline.txt, with those files being identical between the two cards. If I grep for mender_setup, it shows up in kernel7l.img and boot.scr, once again identical.
Any other ideas of where to look?
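For anyone wanting to repeat the comparison of the two boot partitions described above, here is a minimal sketch. The function name compare_trees and the mount points in the usage note are hypothetical; the script compares file contents byte for byte instead of trusting sizes or timestamps.

```python
import filecmp
import os


def list_files(root):
    """Collect every file below root as a path relative to root."""
    found = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            found.add(os.path.relpath(os.path.join(dirpath, name), root))
    return found


def compare_trees(left, right):
    """Return (differing, identical, only_left, only_right) as sorted lists."""
    left_files = list_files(left)
    right_files = list_files(right)
    differing, identical = [], []
    for rel in sorted(left_files & right_files):
        # shallow=False forces a content comparison, not just a stat() check.
        same = filecmp.cmp(
            os.path.join(left, rel), os.path.join(right, rel), shallow=False
        )
        (identical if same else differing).append(rel)
    return (
        differing,
        identical,
        sorted(left_files - right_files),
        sorted(right_files - left_files),
    )
```

Calling compare_trees("/mnt/good-boot", "/mnt/bad-boot") (both paths are placeholders for wherever the cards are mounted) returns four lists whose lengths correspond to the "differ", "identical", and "only in" counts quoted above.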
Hi @chris_wd,
This sounds like an evaluation problem: ${mender_kernel_root} is a variable that u-boot is supposed to expand to /dev/mmcblk0p2. So the first thing to try would probably be to halt the boot process early and dump the environment with printenv. If you put the resulting environments into a post here, I can have a look too. For the sake of completeness, which base distribution are you using, and which versions?
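As a quick way to spot this kind of evaluation problem in a captured boot log, a small check like the following may help. It is only a sketch: the function name unexpanded_variables is hypothetical, and the sample command lines in the usage note are illustrative, not taken from a real device. Note that it is meant for the command line the kernel actually received (for example a line copied from the serial log), not for cmdline.txt on the boot partition, where the variable reference is expected to be present.

```python
import re

# Matches ${name} references that survived into the final string.
PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def unexpanded_variables(cmdline):
    """Return the names of ${...} references still present in a command line."""
    return PLACEHOLDER.findall(cmdline)
```

For example, unexpanded_variables("root=${mender_kernel_root} rootwait") yields a list containing mender_kernel_root, while a command line with root=/dev/mmcblk0p2 yields an empty list.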
Greetz,
Josef
Not sure how to halt the boot process early; the USB keyboard I’m using doesn’t seem to be recognized until after the process has started. I’ll look into a serial console.
The base OS is 2023-02-21-raspios-bullseye-armhf-lite.img.xz plus apt update and some packages being installed. Installation is on an rpi4.
Yes, please attach a serial console to halt u-boot and start investigating there.
I’m using a serial console and am able to reliably halt booting when using my “good” image but haven’t been able to on my “bad” image.
On my “good” image I see:
U-Boot 2020.01-g83cf4883ec (Dec 08 2022 - 09:34:13 +0000)
DRAM: 1.9 GiB
RPI 4 Model B (0xb03111)
MMC: mmcnr@7e300000: 1, mmc@7e340000: 0
Loading Environment from MMC... *** Warning - bad CRC, using default environment
In: serial
Out: serial
Err: serial
Net: Net Initialization Skipped
No ethernet found.
Then “Hit any key to stop autoboot” with a countdown from 2.
On my “bad” image I see none of the above, with the first text being “Booting Linux on physical CPU 0x0000000000 [0x410fd083]” after the “rainbow screen”.
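The difference between the two serial logs (a U-Boot banner on the “good” image, none on the “bad” one) can be checked automatically on a captured log. The sketch below assumes the log has been saved as text; the function name uboot_ran_before_linux is hypothetical, and the banner pattern is modelled on the “U-Boot 2020.01-…” line quoted above.

```python
import re

# A U-Boot banner looks like "U-Boot 2020.01-g83cf4883ec (...)" at line start.
UBOOT_BANNER = re.compile(r"^U-Boot \d{4}\.\d{2}", re.MULTILINE)
# First message printed by the Linux kernel on ARM.
LINUX_START = re.compile(r"Booting Linux on physical CPU")


def uboot_ran_before_linux(serial_log):
    """True if a U-Boot banner appears before the first kernel message."""
    banner = UBOOT_BANNER.search(serial_log)
    if banner is None:
        return False
    kernel = LINUX_START.search(serial_log)
    return kernel is None or banner.start() < kernel.start()
```

A result of False for a log that does contain kernel messages suggests the firmware started the kernel directly and U-Boot never ran, which would leave its variables unexpanded.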
I have a monitor connected as well, and in both cases I get some display as soon as it powers on; that text is the same except for the size and hash of start4.elf and fixup4.dat.
“good” image:
“bad” image:
Is it possible that somehow my “bad” image has a different boot loader, different build, different version, etc.? Is the boot loader built by mender-convert or pulled in through the docker container?
I copied start4.elf and fixup4.dat from the card with the “good” image to the one with the “bad” image and it now boots successfully. I don’t know where these come from in the mender image process, but there have been changes to the repo between my “good” image on 2/27 and my “bad” image on 3/19, as well as changes afterward: History for boot/start4.elf - raspberrypi/firmware · GitHub
I’ll try starting over from scratch and hope that I just hit a bad artifact somewhere in the stack that has since been resolved.
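To confirm which firmware files differ between two cards before copying anything across, the size and hash of start4.elf and fixup4.dat can be compared directly. This is a rough sketch; the function names and the default file list are my own choices, and the directory arguments stand for wherever the two boot partitions are mounted.

```python
import hashlib
import os


def file_fingerprint(path):
    """Return (size_in_bytes, sha256_hex) for one file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return os.path.getsize(path), digest.hexdigest()


def firmware_mismatches(good_boot, bad_boot, names=("start4.elf", "fixup4.dat")):
    """Return the file names whose size or hash differs between two directories."""
    return [
        name
        for name in names
        if file_fingerprint(os.path.join(good_boot, name))
        != file_fingerprint(os.path.join(bad_boot, name))
    ]
```

An empty list means both cards carry the same firmware files; any name in the list is a candidate for the kind of swap described above.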
TLDR - There’s an issue with raspberrypi-bootloader and/or raspberrypi-kernel version 1.20230317-1, possibly also with 1.20230306-1.
Booting my “good” image and doing apt-get upgrade shows it wants to upgrade the following packages:
linux-libc-dev raspberrypi-bootloader raspberrypi-kernel vcdbg
Checking the versions of raspberrypi-bootloader and raspberrypi-kernel on the “good” image, both are 1.20230106-1, with the latest versions being 1.20230317-1.
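The version check above can be scripted on the device. The sketch below relies on the default output of dpkg-query -W, which prints one package name and version per line separated by a tab; the function names are hypothetical, and installed_versions only works on a Debian-based system where dpkg-query is available.

```python
import subprocess


def parse_versions(dpkg_output):
    """Parse 'name<TAB>version' lines as printed by `dpkg-query -W`."""
    versions = {}
    for line in dpkg_output.splitlines():
        if "\t" not in line:
            continue
        name, version = line.split("\t", 1)
        # Multi-arch packages may be listed as 'name:arch'; keep only the name.
        versions[name.split(":", 1)[0]] = version.strip()
    return versions


def installed_versions(packages):
    """Ask dpkg for the installed versions of the given packages."""
    result = subprocess.run(
        ["dpkg-query", "-W", *packages],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_versions(result.stdout)
```

Running installed_versions(["raspberrypi-bootloader", "raspberrypi-kernel"]) on each image gives a dictionary that can be compared against the 1.20230106-1 and 1.20230317-1 versions mentioned above.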
Booting my “bad” image (after replacing start4.elf and fixup4.dat as described above) and doing apt-get upgrade shows everything is up to date.
To verify that 1.20230106-1 vs 1.20230317-1 is the issue, I did the following:
- Start with a fresh 2023-02-21-raspios-bullseye-armhf-lite.img.xz
- Run my OS setup scripts (includes an apt-get upgrade)
- Remove card and run scripts to generate mender image
- Burn the mender image to an SD card
- Boot and verify “Waiting for root device ${mender_kernel_root}…”
- Repeat the steps above, except before apt-get upgrade run the following:
sudo apt-mark hold raspberrypi-bootloader raspberrypi-kernel
- Verify the resulting mender image boots without issue
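The workaround from the steps above (hold the two packages, then upgrade) could be folded into an OS setup script roughly like this. It is a sketch under assumptions: the function names are hypothetical, the apt-get update step and the -y flag are my additions for unattended use, and nothing is executed unless dry_run is set to False and the script runs as root.

```python
import subprocess

# Packages whose newer versions produced the non-booting image above.
HELD_PACKAGES = ["raspberrypi-bootloader", "raspberrypi-kernel"]


def upgrade_commands(held=HELD_PACKAGES):
    """Build the command sequence: hold the given packages, then upgrade."""
    return [
        ["apt-mark", "hold", *held],
        ["apt-get", "update"],  # assumption: refresh package lists first
        ["apt-get", "upgrade", "-y"],  # assumption: non-interactive upgrade
    ]


def run_upgrade(dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for command in upgrade_commands():
        print(" ".join(command))
        if not dry_run:
            subprocess.run(command, check=True)
```

Once a fixed package version is available, the hold can be released again with apt-mark unhold on the same two packages.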
Hi @chris_wd,
Thanks a lot for the perfect description of your findings! I’ll see if we can find the reason behind the bootloader and kernel versions being modified.
Greetz,
Josef