GRUB not working after mender-convert

Hi everyone,

I tried to install a Debian image with Mender using mender-convert. I will explain exactly what I have and what I did.

Device

I have an IoT device with an Intel Celeron. Unfortunately, I can’t access the SD card because the device is encapsulated, but I can boot from a live USB and install everything from there. The device has 16 GB of internal storage, and lsblk reports the SD card as /dev/mmcblk0. After the installation, the partitions are named /dev/mmcblk0p1 and /dev/mmcblk0p2.

Installation of golden image

I downloaded the Debian DVD image (version 11.1) from the Debian website. Then I installed it on the device (from a live USB) without any trouble, creating only two partitions: a 512 MB partition for EFI and a 2.7 GB partition for /. After the installation I booted the device and connected a USB drive to extract an image of the SD card with the following command.

dd if=/dev/mmcblk0 of=/mnt/usb/debian_golden.img bs=4M oflag=sync status=progress

Where /mnt/usb/ is the directory where I mounted my USB drive.

mender-convert

After that I downloaded mender-convert and built the docker image.

cd mender-convert
./docker-build

I then created a configuration file under configs/ with the following content.

MENDER_STORAGE_DEVICE_BASE="/dev/mmcblk0p"
MENDER_DEVICE_TYPE="x86_64"
 
MENDER_STORAGE_TOTAL_SIZE_MB=14500
 
MENDER_BOOT_PART_SIZE_MB=512
MENDER_DATA_PART_SIZE_MB=8500
IMAGE_ROOTFS_SIZE=-1
 
MENDER_ADDON_CONNECT_INSTALL=y
MENDER_ADDON_CONFIGURE_INSTALL=y
 
MENDER_COPY_BOOT_GAP="n"
 
function platform_modify() {
    if [ ! -e work/rootfs/lib64 ]; then
        run_and_log_cmd "ln -s /lib work/rootfs/lib64"
    fi
}

The ‘p’ at the end of MENDER_STORAGE_DEVICE_BASE matches the naming of the partitions (/dev/mmcblk0p1, /dev/mmcblk0p2). I also used bootstrap-rootfs-overlay-production-server.sh, modified with my server’s certificate and URL, so that the client works (this is not relevant to the problem, as it has nothing to do with booting).
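As a rough sanity check on the sizes in this configuration, the sketch below estimates how much space is left for each of the two root filesystem partitions. It assumes that whatever is not used by the boot and data partitions is split evenly between the A and B partitions, and it ignores alignment and partition-table overhead, so the real figure will be somewhat smaller. The function name rootfs_part_size_mb is only illustrative and is not part of mender-convert.

```shell
# Rough estimate of the space left for each root filesystem partition (A and B).
# Assumption: the space not used by the boot and data partitions is split
# evenly between A and B; alignment and partition-table overhead are ignored.
rootfs_part_size_mb() {
    total_mb=$1
    boot_mb=$2
    data_mb=$3
    echo $(( (total_mb - boot_mb - data_mb) / 2 ))
}

# Values taken from the configuration above.
rootfs_part_size_mb 14500 512 8500   # prints 2744
```

Under these assumptions each root partition gets at most about 2744 MB, which is close to the 2.7 GB of the / partition in the golden image, so it may be worth checking that the root filesystem still fits.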

Then I executed the container with the following command.

MENDER_ARTIFACT_NAME=base-image-1 ./docker-mender-convert --disk-image input/debian_golden.img --config configs/nano-iot-config --overlay rootfs_overlay_production/

The output log can be observed here.

I want to draw attention to line 372 of the log, which shows the following:

2021-12-01 09:42:28 [INFO] [mender-convert-modify] Using root device A in mender.conf: /dev/mmcblk02
2021-12-01 09:42:28 [INFO] [mender-convert-modify] Using root device B in mender.conf: /dev/mmcblk03

This means that the ‘p’ I added to MENDER_STORAGE_DEVICE_BASE had no actual effect on the result.
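For reference, the kernel names partitions on MMC devices by appending ‘p’ and the partition number to the disk name, so /dev/mmcblk02 cannot be the second partition of /dev/mmcblk0. A minimal sketch of a check for this naming pattern follows; is_mmc_partition is a hypothetical helper, not something mender-convert provides.

```shell
# Return 0 if the argument looks like an MMC partition device node,
# i.e. /dev/mmcblk<disk>p<partition>, and 1 otherwise.
is_mmc_partition() {
    case "$1" in
        /dev/mmcblk[0-9]*p[0-9]*) ;;
        *) return 1 ;;
    esac
    # The glob above is loose, so also require that only digits appear
    # between "mmcblk" and "p", and after "p".
    rest=${1#/dev/mmcblk}
    disk=${rest%%p*}
    part=${rest#*p}
    case "$disk$part" in
        *[!0-9]*|'') return 1 ;;
    esac
    return 0
}

# Check the two spellings: the one from the log and the one lsblk reports.
for dev in /dev/mmcblk02 /dev/mmcblk0p2; do
    if is_mmc_partition "$dev"; then
        echo "$dev: looks like an MMC partition"
    else
        echo "$dev: unexpected name"
    fi
done
```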

Installation on the device

Nevertheless, I copied the resulting image, deploy/debian_golden-x86_64-mender.img, to a USB drive and prepared another USB drive as a Linux Mint live USB (in order to copy the image to the SD card). I then booted the IoT device from the Linux Mint USB and copied the image to the SD card with the following command.

dd if=/mnt/usb/debian_golden-x86_64-mender.img of=/dev/mmcblk0 oflag=sync bs=4M status=progress
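To rule out a bad write, the bytes on the SD card can be compared against the image before rebooting. This is only a sketch, assuming GNU stat and GNU cmp are available on the live system; verify_written is an illustrative name.

```shell
# Compare the first <image size> bytes of the target with the image file.
# Returns 0 when they are identical.
# Assumes GNU stat (-c %s) and GNU cmp (-n) are available.
verify_written() {
    image=$1
    target=$2
    size=$(stat -c %s "$image") || return 2
    cmp -n "$size" "$image" "$target"
}

# Example (run as root, with the SD card not mounted):
# verify_written /mnt/usb/debian_golden-x86_64-mender.img /dev/mmcblk0 && echo "write verified"
```

Only the first image-size bytes are compared because the SD card is larger than the image file.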

Once dd finished, I unmounted the devices and booted without any USB drive connected. The result was a GRUB terminal.

I tried to boot manually in many ways:

Using grub.cfg

set root=(hd0,gpt1)
set prefix=(hd0,gpt1)/efi/boot
configfile (hd0,gpt1)/efi/boot/grub.cfg

This resulted in an error saying that there is no hashsum function.

Manually booting the first root partition

set root=(hd0,gpt2)
set prefix=(hd0,gpt2)/boot
linux (hd0,gpt2)/boot/vmlinuz-XXXXX
initrd (hd0,gpt2)/boot/initrd-XXXXXX
insmod normal
boot

XXXXX stands for the rest of the vmlinuz and initrd filenames. I tried this both with linux and initrd and with linuxefi and initrdefi. Both gave the same result: the boot gets stuck at the initramfs.
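Note that the linux line above passes no root= argument. One possible cause of stopping at the initramfs prompt is that the initramfs does not know which partition to mount as /; this is an assumption, not something the log confirms. The sketch below shows a small helper for checking, from the initramfs shell, which root= value the kernel actually received (root_arg is a hypothetical name).

```shell
# Print the value of root= from a kernel command line string,
# or print nothing if no root= argument is present.
root_arg() {
    # $1 is deliberately unquoted so the command line is split into words.
    for word in $1; do
        case "$word" in
            root=*) printf '%s\n' "${word#root=}" ;;
        esac
    done
}

# At the initramfs prompt:
#   root_arg "$(cat /proc/cmdline)"   # root device passed to the kernel, if any
#   ls /dev/mmcblk*                   # device nodes that actually exist
```

If nothing is printed, appending something like root=/dev/mmcblk0p2 to the linux line in GRUB would be a reasonable next thing to try.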

Changing grub.cfg

I also tried changing the device name in /efi/boot/grub.cfg from /dev/mmcblk0 to /dev/mmcblk0p, which gave the same result. Then I removed all references to hashsum, again with the same result. Finally, I copied BOOTx64.CSV from /efi/debian/ to /efi/boot/ and changed its content to point to bootx64.efi. That didn’t work either.

I bet the error is in the mender-convert configuration, but if the conversion was successful, as the log shows, the kernel and initrd should be okay, and hence it should be possible to boot manually. Am I missing something?

Best,

Before trying to boot manually, can you check the contents of the bootargs variable? It may be that this is set incorrectly and it doesn’t find the root filesystem.

I’m curious: why do we keep seeing issues like this, with similar answers, but never an update to the documentation warning users of the possible pitfall? Additionally (and this is on posters), it’d be nice to see WHAT ENDED UP MAKING THINGS WORK! Or, if things didn’t work, a post about that. This community seemed very nice and helpful, until I started noticing that a lot of the questions could have been resolved with better documentation and/or users posting the result of the suggestions they received.

Disclaimer: I’ve got an ongoing issue and have been trying to find answers here, and sifting through the hub has been more frustrating than herding cats. If a) I resolve my issue, I’ll write it up; if b) the client dumps Mender or the hardware we’re currently using, I’ll write that up too.

I’d bet you are, but I also don’t think you had documentation telling you about the thing you’re missing. It seems that we’re supposed to be flying with a blindfold on half the time…