Trouble getting RPi4 to work

Thanks for this tutorial! I’m trying to get a basic build with Mender on a Pi 4. I originally tried using mender-convert, but I couldn’t get it working. Yocto seems like a better move overall anyway, so I’m trying my luck here instead.

However, I can’t seem to get the Yocto build to work. I get a .sdimg file that seems broken. When written to an SD card, it has all of the partitions but the “B” partition is empty. When I try to boot it, I get an endless boot loop.

Here are some quick details:

  • I followed the instructions above without modification, on a freshly cloned meta-mender-community repo.
  • I made no modifications nor did I add any additional layers.
  • I’ve tried a few variations after failing (e.g. using kas to build), but made no real progress.
  • I tried two different machines: my personal laptop running Pop_OS 22.04, and a fresh Azure VM running Ubuntu 22.04. Both give the same results.
  • Not sure if this is relevant, but the /boot file “cmdline.txt” has “root=${mender_kernel_root}”. Not sure if it failed to get replaced, or if this is part of how Mender does the dynamic switch between partitions?

I’m relatively new to Yocto and Mender both, and I really appreciate any pointers. If there’s something I can try, or if I missed something obvious, please let me know.

As a side note, should the kas/raspberrypi4-64.yml have “machine: raspberrypi4-64”? It’s missing the -64 if so. (I also tried building both; the 64bit version doesn’t boot loop, but it never fully boots either and has an empty partition too.)

Edit: Forgot to add, I successfully did a Yocto build without Mender from the meta-raspberrypi repo

Hi @nickfh7,

Thanks for the heads up! I’ve looked into it and can reproduce the booting bug, but don’t have the solution yet.

Concerning the other points:

  • the machine: raspberrypi4 line in kas/raspberrypi4-64.yml is indeed wrong, fixed already.
  • the secondary (inactive) partition is empty by default on a Yocto build to allow for more efficient image compression and faster writing. This is intended behavior.

I’ll ping you once the fix for booting is available.

Greetz,
Josef

Hey @TheYoctoJester,

Just checking to see if you’ve had any luck finding a fix. I found an issue with psplash-systemd.service, and disabling it in a bbappend allows the 64-bit version to boot. Any thoughts here? Could something else still be wrong? I’m not entirely sure why this happened, but I saw mention somewhere that it could be related to a frame buffer race condition.

Edit:
I’m now experiencing the issue where root will not log in via console (requires a password, but it doesn’t have one). I also see some errors related to the Mender service that grows the data filesystem. The partition grow service was successful (I verified that the partition grew) but the fs grow service has a few errors:

[    7.784148] EXT4-fs (mmcblk0p4): resizing filesystem from 131073 to 26929152 blocks
[    8.402889] EXT4-fs warning (device mmcblk0p4): verify_reserved_gdb:796: reserved GDT 59 missing grp 1 (8251)
[    8.413104] EXT4-fs warning (device mmcblk0p4): ext4_resize_fs:2193: error (-22) occurred during file system resize
[    8.423935] EXT4-fs (mmcblk0p4): resized filesystem to 131073

However the Mender client works and successfully attached to the server! There just seems to be some issue with the filesystem.

Hi @nickfh7,

On the variety of topics:

  • the boot bug on 32bit raspberrypi4 I have not identified yet. It’s a bit obscure because the exact same u-boot configuration works on kirkstone :frowning:
  • the root login being disabled, that is expected. As a security best practice, it is disabled by default on Yocto. You can get a passwordless one for testing purposes by adding this to your local.conf:
EXTRA_IMAGE_FEATURES:append = " debug-tweaks "
  • what is the bug with psplash? So it can be eventually fixed :slight_smile:

Greetz,
Josef

Hi @nickfh7,

To expand a bit: I just built core-image-weston for raspberrypi4-64, and while it has some rough edges (like psplash blocking boot for some time) and boots extremely slowly (multiple minutes!), it otherwise seems to be working fine.

Some additions I needed to local.conf:

LICENSE_FLAGS_ACCEPTED = "synaptics-killswitch"
DISTRO_FEATURES:append = " pam "

IMAGE_OVERHEAD_FACTOR = "1.0"

The resulting image, as said, is super slow given the microSD card and limited RAM, but those are essentially application and architecture problems, respectively. You should always have the UART ready to log in over serial and check what’s going on, but as far as I can tell, the problems in your case are at a higher level of the software stack.

Greetz,
Josef

Thanks again for the responses! I have a version of my system working based on meta-raspberrypi’s kas file, and I missed the debug-tweaks addition that enables root login. Going to the Mender version, I thought I was experiencing this issue. Looks like this isn’t the case though, so that’s good.

Debugging over UART is how I found the psplash issue, but I just got the hardware module to do this yesterday and haven’t had much time to dig in. Only thing I know about the issue is that I saw some lines like this (before disabling psplash):

[ TIME ] Timed out waiting for device /dev/mmcblk0p4.
[DEPEND] Dependency failed for /data.
[DEPEND] Dependency failed for Local File Systems.
[DEPEND] Dependency failed for Grow File System on /data.
[DEPEND] Dependency failed for File System Check on /dev/mmcblk0p4.
[ TIME ] Timed out waiting for device /sys/devices/platform/gpu/graphics/fb0.
[DEPEND] Dependency failed for Start psplash boot splash screen.
[DEPEND] Dependency failed for Start psplas…temd progress communication helper.
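To make the two separate timeouts easier to see, here is a minimal Python sketch that groups the console lines above by their status tag. The function name `summarize` and the variable names are only illustrative, and the log text is copied verbatim from the UART output, including the truncated psplash line.

```python
import re

# Console lines copied from the UART output above (truncation kept as-is).
CONSOLE_LOG = """\
[ TIME ] Timed out waiting for device /dev/mmcblk0p4.
[DEPEND] Dependency failed for /data.
[DEPEND] Dependency failed for Local File Systems.
[DEPEND] Dependency failed for Grow File System on /data.
[DEPEND] Dependency failed for File System Check on /dev/mmcblk0p4.
[ TIME ] Timed out waiting for device /sys/devices/platform/gpu/graphics/fb0.
[DEPEND] Dependency failed for Start psplash boot splash screen.
[DEPEND] Dependency failed for Start psplas…temd progress communication helper.
"""


def summarize(log: str) -> dict[str, list[str]]:
    """Group systemd console status lines by their [TAG] prefix."""
    summary: dict[str, list[str]] = {}
    for line in log.splitlines():
        match = re.match(r"\[\s*(\w+)\s*\]\s+(.*)", line)
        if match:
            tag, message = match.groups()
            summary.setdefault(tag, []).append(message)
    return summary


summary = summarize(CONSOLE_LOG)

# Devices systemd gave up waiting for.
timed_out = [
    message.removeprefix("Timed out waiting for device ").rstrip(".")
    for message in summary.get("TIME", [])
]

print("timed out devices:", timed_out)
print("failed dependencies:", len(summary.get("DEPEND", [])))
```

Read this way, the log shows two devices timing out: the data partition, followed by the /data related failures, and the framebuffer, followed by the psplash failures. Whether the two timeouts share a cause is not something the log alone shows.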

As for the /data growfs issue I’ve been having, I’ve been building with core-image-base, and it works and boots within about 20-30 seconds. I do “preload” data into the /data directory before mender, but I also verified that /data ends up as a mountpoint. I can use /data as normal.

Here is the UART output:

[  OK  ] Mounted /data.
[  OK  ] Mounted /uboot.
[  OK  ] Reached target Local File Systems.
         Starting Grow File System on /data...
[    7.517338] EXT4-fs (mmcblk0p4): resizing filesystem from 131073 to 26929152 blocks
         Starting Create Volatile Files[    7.535141] brcmfmac: brcmf_c_process_txcap_blob: no txcap_blob available (err=-2)
 and Directories[    7.543988] brcmfmac: brcmf_c_preinit_dcmds: Firmware: BCM4345/6 wl0: Aug 29 2023 01:47:08 version 7.45.265 (28bca26 CY) FWID 01-b677b91b
...
[    7.558030] EXT4-fs warning (device mmcblk0p4): verify_reserved_gdb:796: reserved GDT 3 missing grp 1 (8195)
[    7.568585] EXT4-fs warning (device mmcblk0p4): ext4_resize_fs:2193: error (-22) occurred during file system resize
[    7.579451] EXT4-fs (mmcblk0p4): resized filesystem to 131073
[  OK  ] Created slice Slice /system/bthelper.
         Starting Load/Save RF Kill Switch Status...
[FAILED] Failed to start Grow File System on /data.
See 'systemctl status mender-systemd-growfs-data.service

And here is the systemd status:

× mender-systemd-growfs-data.service - Grow File System on /data
     Loaded: loaded (/usr/lib/systemd/system/mender-systemd-growfs-data.service; static)
     Active: failed (Result: exit-code) since Thu 1970-01-01 00:00:07 UTC; 54 years 9 months ago
    Process: 315 ExecStart=/lib/systemd/systemd-growfs /data (code=exited, status=1/FAILURE)
   Main PID: 315 (code=exited, status=1/FAILURE)
        CPU: 48ms

Jan 01 00:00:07 localhost systemd[1]: Starting Grow File System on /data...
Jan 01 00:00:07 localhost systemd-growfs[315]: Failed to resize "/data" to 27575451648 bytes: Invalid argument
Jan 01 00:00:07 localhost systemd[1]: mender-systemd-growfs-data.service: Main process exited, code=exited, status=1/FAILURE
Jan 01 00:00:07 localhost systemd[1]: mender-systemd-growfs-data.service: Failed with result 'exit-code'.
Jan 01 00:00:07 localhost systemd[1]: Failed to start Grow File System on /data.

If I try to rerun /lib/systemd/systemd-growfs /data manually, it gives the same error. I’ll keep debugging, but is there anything I could be doing wrong to cause this?
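As a cross-check of the numbers above, here is a minimal Python sketch that derives the filesystem block size from the kernel log and the systemd-growfs error. It uses only values copied from the output above; the variable names are illustrative.

```python
# Values copied from the kernel log and the systemd-growfs error above.
current_blocks = 131073       # "resizing filesystem from 131073 ..."
target_blocks = 26929152      # "... to 26929152 blocks"
target_bytes = 27575451648    # 'Failed to resize "/data" to 27575451648 bytes'

# Requested size in bytes divided by requested size in blocks gives the
# block size the kernel is using for this ext4 filesystem.
block_size, remainder = divmod(target_bytes, target_blocks)

current_mib = current_blocks * block_size / 2**20
target_gib = target_bytes / 2**30

print(f"block size: {block_size} bytes (remainder {remainder})")
print(f"current filesystem: {current_mib:.3f} MiB")
print(f"requested size: {target_gib:.2f} GiB")
```

The division comes out to exactly 1024 bytes per block, so the existing /data filesystem is roughly 128 MiB with 1 KiB blocks, and the resize asks for a bit over 25 GiB. Whether that block size or the size of the jump has anything to do with the failed resize is an open question, not something these numbers prove.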

Hello @TheYoctoJester. I have a Yocto dunfell (3.1) project for the RPi 4B with Mender, and it works awesome. I decided to upgrade to scarthgap (5.0) and followed this manual (Raspberry Pi 4 Model B - Yocto 5.0 "scarthgap" and later) to create a core-image-minimal image with no customization at all. The image was created, but the device doesn’t boot:

U-Boot 2024.04 (Apr 02 2024 - 10:58:58 +0000)

DRAM:  948 MiB (effective 7.9 GiB)
RPI 4 Model B (0xd03114)
Core:  211 devices, 16 uclasses, devicetree: board
MMC:   mmcnr@7e300000: 1, mmc@7e340000: 0
Loading Environment from MMC... *** Warning - bad CRC, using default environment

In:    serial,usbkbd
Out:   serial,vidconsole
Err:   serial,vidconsole
Net:   eth0: ethernet@7d580000
PCIe BRCM: link up, 5.0 Gbps x1 (SSC)
PCI: Failed autoconfig bar 10
starting USB...
Bus xhci_pci:

Boot snippet from the previous dunfell-based image, which boots successfully:

U-Boot 2020.01 (Jan 06 2020 - 20:56:31 +0000)

DRAM:  3.9 GiB
RPI 4 Model B (0xd03114)
MMC:   mmcnr@7e300000: 1, emmc2@7e340000: 0
Loading Environment from MMC... OK

I noticed that the reported DRAM size differs. Could I ask why this happens, please? Thanks.

Hi @Vladimir,

Interesting catch concerning the DRAM size. I have also seen some weird behavior, and in my tests it seems to be limited to the 32bit version only. Can you confirm that this is the case here too? And give the 64bit builds a test spin?

Greetz,
Josef

I’ve created a raspberrypi4-64 image and got the following lines during the first boot:

U-Boot 2024.04 (Apr 02 2024 - 10:58:58 +0000)

DRAM:  948 MiB (effective 7.9 GiB)
RPI 4 Model B (0xd03114)
Core:  211 devices, 16 uclasses, devicetree: board
MMC:   mmcnr@7e300000: 1, mmc@7e340000: 0
Loading Environment from MMC... *** Warning - bad CRC, using default environment

In:    serial,usbkbd
Out:   serial,vidconsole
Err:   serial,vidconsole
Net:   eth0: ethernet@7d580000
PCIe BRCM: link up, 5.0 Gbps x1 (SSC)
starting USB...
Bus xhci_pci: Register 5000420 NbrPorts 5
Starting the controller
USB XHCI 1.00
scanning bus xhci_pci for devices... 2 USB Device(s) found
       scanning usb for storage devices... 0 Storage Device(s) found
Hit any key to stop autoboot:  0
Card did not respond to voltage select! : -110
** Booting bootflow 'mmc@7e340000.bootdev.part_1' with script
Working FDT set to 2eff2600
Saving Environment to MMC... Writing to MMC(0)... OK
switch to partitions #0, OK
mmc0 is current device
27271680 bytes read in 1361 ms (19.1 MiB/s)
Moving Image from 0x80000 to 0x200000, end=1d20000
## Flattened Device Tree blob at 2eff2600
   Booting using the fdt blob at 0x2eff2600
Working FDT set to 2eff2600
   Using Device Tree in place at 000000002eff2600, end 000000002f002ffc
Working FDT set to 2eff2600

Starting kernel ...

And the system boots normally; I get a login prompt. The CRC check passes on every subsequent boot. That makes me think the only difference is that the 64bit image is able to recover from the CRC-error state by itself and the 32bit image isn’t. The 64bit image writes the environment (Saving Environment to MMC... Writing to MMC(0)... OK) and fixes the CRC mismatch as a result, while the 32bit image hangs at Bus xhci_pci: forever.

Similar behaviour is observed with vanilla u-boot (without Mender support). It looks like a u-boot issue, not related to Mender.

Hi @Vladimir,

The initial CRC error is actually not an error; the wording is unfortunate. The meaning is “no environment found, using the default one”. That’s absolutely fine and intended: without Mender integration, the default environment is explicitly fine for operation, and with Mender integration, a defined environment is written on first start. So that one, even if it sounds scary, is perfectly normal and not related to the RPi4/32bit not booting. I still have to figure that out :frowning:
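For anyone comparing boot logs, the reading of the “Loading Environment” line described above can be sketched in Python. This is only an illustration: `describe_env_status` is a hypothetical helper, not part of U-Boot or Mender, and it only knows the two message variants quoted in this thread.

```python
def describe_env_status(uboot_log: str) -> str:
    """Interpret U-Boot's 'Loading Environment' line as described above."""
    for line in uboot_log.splitlines():
        if not line.startswith("Loading Environment from"):
            continue
        if "bad CRC, using default environment" in line:
            # Despite the wording, this means no stored environment was found.
            return "no stored environment found; default environment in use"
        if line.rstrip().endswith("OK"):
            return "stored environment loaded"
        return "unrecognized status"
    return "no environment line in log"


# Lines copied from the scarthgap and dunfell boot logs in this thread.
first_boot = "Loading Environment from MMC... *** Warning - bad CRC, using default environment"
later_boot = "Loading Environment from MMC... OK"

print(describe_env_status(first_boot))
print(describe_env_status(later_boot))
```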

Greetz,
Josef