Error: grub-mender-grubenv:do_sign failed

Hi all,

I’m working on updating our old Yocto image from Honister all the way to Scarthgap.
My troubles are with getting meta-mender and meta-secure-core to play nice.

I have resolved numerous issues (and fixed several bugs in meta-secure-core in the process),
but mender’s grubenv is being particularly stubborn.

At first it would just not be built, so I had to manually place it in IMAGE_INSTALL.
That got me a situation where MENDER_BOOT_PART_MOUNT_LOCATION would not be set, leading to this gem of an error:

ERROR: mc:pa5:grub-mender-grubenv-1.3.0+giteeb7ebd9e6558cf6bbe661b4f2e4e45d52efa305-r0 do_sign:
Unable to sign /opt/yocto/build/tmp/work/corei7-64-poky-linux/grub-mender-grubenv/1.3.0+giteeb7ebd9e6558cf6bbe661b4f2e4e45d52efa305/image${MENDER_BOOT_PART_MOUNT_LOCATION}/grub-mender-grubenv/mender_grubenv1/lock

Sometimes, this also manifested like this:

ERROR: mc:pa5:grub-mender-grubenv-1.3.0+giteeb7ebd9e6558cf6bbe661b4f2e4e45d52efa305-r0 do_sign: Error executing a python function in exec_func_python() autogenerated:
The stack trace of python calls that resulted in this exception/failure was:
File: 'exec_func_python() autogenerated', lineno: 2, function: <module>
     0001:
 *** 0002:do_sign(d)
     0003:
File: '/opt/yocto/meta-mender/meta-mender-core/recipes-bsp/grub-mender-grubenv/grub-mender-grubenv.inc', lineno: 125, function: do_sign
     0121:    uks_bl_sign("%s%s/grub.cfg" % (d.getVar("D"), d.getVar("GRUB_CONF_LOCATION")), d)
     0122:    uks_bl_sign("%s%s/mender_grubenv1/lock" % (d.getVar("D"), d.getVar("GRUB_ENV_LOCATION")), d)
     0123:    uks_bl_sign("%s%s/mender_grubenv2/lock" % (d.getVar("D"), d.getVar("GRUB_ENV_LOCATION")), d)
     0124:    uks_bl_sign("%s%s/mender_grubenv1/lock.sha256sum" % (d.getVar("D"), d.getVar("GRUB_ENV_LOCATION")), d)
 *** 0125:    uks_bl_sign("%s%s/mender_grubenv2/lock.sha256sum" % (d.getVar("D"), d.getVar("GRUB_ENV_LOCATION")), d)
     0126:}
     0127:do_sign[prefuncs] += "${@bb.utils.contains('DISTRO_FEATURES', 'efi-secure-boot', 'check_deploy_keys', '', d)}"
     0128:do_sign[prefuncs] += "${@'check_boot_public_key' if d.getVar('GRUB_SIGN_VERIFY', True) == '1' else ''}"
     0129:
Exception: TypeError: can only concatenate str (not "NoneType") to str

Manually setting that variable via a .bbappend file got me to the signing step, where the grub.cfg succeeds, but the mender_grubenv1/lock file fails without further info:

NOTE: Signing /opt/yocto/build/tmp/work/corei7-64-poky-linux/grub-mender-grubenv/1.3.0+giteeb7ebd9e6558cf6bbe661b4f2e4e45d52efa305/image/boot/EFI/BOOT/grub.cfg with the key
  /opt/yocto/meta-safion/conf/distro/secure-boot/user-keys/uefi_sb_keys/DB.key ...
NOTE: Running cmd:
  LD_LIBRARY_PATH=/opt/yocto/build/tmp/work/corei7-64-poky-linux/grub-mender-grubenv/1.3.0+giteeb7ebd9e6558cf6bbe661b4f2e4e45d52efa305/recipe-sysroot-native/usr/lib:$LD_LIBRARY_PATH /opt/yocto/build/tmp/work/corei7-64-poky-linux/grub-mender-grubenv/1.3.0+giteeb7ebd9e6558cf6bbe661b4f2e4e45d52efa305/recipe-sysroot-native/usr/bin/selsign
  --key /opt/yocto/meta-safion/conf/distro/secure-boot/user-keys/uefi_sb_keys/DB.key
  --cert /opt/yocto/meta-safion/conf/distro/secure-boot/user-keys/uefi_sb_keys/DB.crt
  /opt/yocto/build/tmp/work/corei7-64-poky-linux/grub-mender-grubenv/1.3.0+giteeb7ebd9e6558cf6bbe661b4f2e4e45d52efa305/image/boot/EFI/BOOT/grub.cfg
NOTE: Signing /opt/yocto/build/tmp/work/corei7-64-poky-linux/grub-mender-grubenv/1.3.0+giteeb7ebd9e6558cf6bbe661b4f2e4e45d52efa305/image/boot/EFI/grub-mender-grubenv/mender_grubenv1/lock with the key
  /opt/yocto/meta-safion/conf/distro/secure-boot/user-keys/uefi_sb_keys/DB.key ...
NOTE: Running cmd: LD_LIBRARY_PATH=/opt/yocto/build/tmp/work/corei7-64-poky-linux/grub-mender-grubenv/1.3.0+giteeb7ebd9e6558cf6bbe661b4f2e4e45d52efa305/recipe-sysroot-native/usr/lib:$LD_LIBRARY_PATH
  /opt/yocto/build/tmp/work/corei7-64-poky-linux/grub-mender-grubenv/1.3.0+giteeb7ebd9e6558cf6bbe661b4f2e4e45d52efa305/recipe-sysroot-native/usr/bin/selsign
  --key /opt/yocto/meta-safion/conf/distro/secure-boot/user-keys/uefi_sb_keys/DB.key --cert /opt/yocto/meta-safion/conf/distro/secure-boot/user-keys/uefi_sb_keys/DB.crt /opt/yocto/build/tmp/work/corei7-64-poky-linux/grub-mender-grubenv/1.3.0+giteeb7ebd9e6558cf6bbe661b4f2e4e45d52efa305/image/boot/EFI/grub-mender-grubenv/mender_grubenv1/lock
ERROR: Unable to sign /opt/yocto/build/tmp/work/corei7-64-poky-linux/grub-mender-grubenv/1.3.0+giteeb7ebd9e6558cf6bbe661b4f2e4e45d52efa305/image/boot/EFI/grub-mender-grubenv/mender_grubenv1/lock

At this point, I have to assume I’m doing something fundamentally wrong, but I don’t see what.
Any help would be very much appreciated!

The aforementioned grub-mender-grubenv_%.bbappend file:

PACKAGECONFIG += "debug-log"
MENDER_BOOT_PART_MOUNT_LOCATION = "/boot/EFI"

My Mender config:

inherit mender-full
IMAGE_INSTALL:append = " mender-connect mender-configure mender-client docker"
ARTIFACTIMG_FSTYPE = "ext4"
MENDER_SERVER_URL = "https://hosted.mender.io"
MENDER_TENANT_TOKEN = ***
MENDER_UPDATE_POLL_INTERVAL_SECONDS = "1800"
MENDER_INVENTORY_POLL_INTERVAL_SECONDS = "28800"
MENDER_CONNECT_USER = "root"
MENDER_FEATURES_ENABLE:append = " mender-grub mender-image-uefi"
MENDER_FEATURES_DISABLE:append = " mender-uboot mender-image-sd mender-grow-fs-data"
MENDER_ARTIFACT_NAME = "sometest"
MENDER_STORAGE_DEVICE = "/dev/sda"
MENDER_STORAGE_TOTAL_SIZE_MB = "6000"
MENDER_KERNEL_PART_SIZE_MB = "256"

My secure boot config:

DISTRO_FEATURES_NATIVE:append = " tpm2 efi-secure-boot"
DISTRO_FEATURES:append = " tpm2 efi-secure-boot modsign"
EFI_PROVIDER = "grub-efi"
INITRAMFS_IMAGE = "secure-core-image-initramfs"
# meta-secure-core key configuration
SIGNING_MODEL := "user"
require /opt/yocto/meta-safion/conf/distro/secure-boot/user-keys/keys.conf
# Useful for debugging. Prints which files are being signed with which keys
USER_KEY_SHOW_VERBOSE = "true"
IMAGE_INSTALL:append = " packagegroup-core-boot kernel-initramfs dnf packagegroup-efi-secure-boot packagegroup-tpm2"
IMAGE_INSTALL:append = " kernel-image-bzimage"
GRUB_SIGN_VERIFY = "0"
UEFI_SB = "1"
UEFI_SELOADER = "1"
MOK_SB = "0"
PACKAGE_CLASSES = "package_rpm"
IMAGE_ROOTFS_SIZE = "16384"

Ok, after some more digging I might have narrowed this down:

  1. GRUB_ENV_LOCATION is composed of, in part, MENDER_BOOT_PART_MOUNT_LOCATION
  2. MENDER_BOOT_PART_MOUNT_LOCATION is set by meta-mender/meta-mender-core/classes/mender-setup-image.inc:28 to /boot/efi
  3. grub-mender-grubenv:do_compile and grub-mender-grubenv:do_install pass GRUB_CONF_BARE_LOCATION (equal to EFIDIR) and BOOT_DIR_LOCATION (equal to EFI_PREFIX) to oe_runmake, which in the end places the grub-mender-grubenv directory at ${EFI_PREFIX}${EFIDIR}. In this case, all that evaluates to /boot/EFI/BOOT (i.e., equal to GRUB_CONF_LOCATION and EFI_FILES_PATH), with the grubenv landing at /boot/grub-mender-grubenv
  4. The comments for BOOT_DIR_LOCATION state that it is “almost always equal to MENDER_BOOT_PART_MOUNT_LOCATION (which EFI_PREFIX is also equal to)”. In reality though, mender-setup-image sets this variable to /boot/efi (NOT /boot/EFI for some reason), and within the context of grub-mender-grubenv.inc it evaluates to null.
  5. Setting MENDER_BOOT_PART_MOUNT_LOCATION = "${EFI_PREFIX}" via a .bbappend file actually allows the build to complete and successfully sign the entire grubenv.

At this point, the discrepancies between the EFI path names, as well as the comments, lead me to believe that this is a regression.

However, after all that, the build still winds up failing when creating the rootfs:

Error: Transaction test error:
  file /boot/EFI/BOOT/grub.cfg conflicts between attempted installs of grub-mender-grubenv-1.3.0+giteeb7ebd9e6558cf6bbe661b4f2e4e45d52efa3050+eeb7ebd9e6-r0.corei7_64 and grub-bootconf-1.00-r0.congatec_tca5_64

I’m not sure if that is worth investigating much further until the other issues are resolved; I suspect they might be related (I have verified that my BSP is not at fault, despite the board name suffix in the error above).

@TheYoctoJester I’ll just go ahead and ping you here, I hope that’s ok. I suspect you’re the person to talk to about this, or at least know who is.

So, TL;DR:

  1. What should the correct value be for MENDER_BOOT_PART_MOUNT_LOCATION? /boot, /boot/efi or /boot/EFI?
  2. Any idea why that variable shows up as set by mender-setup-image.inc when traced with bitbake-getvar, but evaluates to null within grub-mender-grubenv.inc?
  3. The grubenv is being deployed to /boot/grub-mender-grubenv, but the grub config lands at /boot/EFI/BOOT/grub.cfg. Is that correct?
  4. Should I be looking into that collision on grub.cfg, or does that look like it’s related to the other issue?

Alright, so I did go ahead and debug that collision further. It turns out the PROVIDES and RPROVIDES weren’t being set.

Some injected debug outputs show MENDER_FEATURES and MENDER_FEATURES_ENABLE to be empty as far as the grub-mender-grubenv recipe is concerned, causing @mender_feature_is_enabled checks to return false. Printing them anywhere else, tracing with bitbake -e and checking with bitbake-getvar all show those variables to be set properly.
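
A minimal sketch of how such debug output might be injected, assuming a hypothetical grub-mender-grubenv_%.bbappend in your own layer. The anonymous Python function runs while the recipe is parsed, so it reports what this recipe's own datastore contains rather than what the image or the global configuration sees. bb.warn and d.getVar are standard BitBake calls; the list of variables is simply the three discussed in this thread.

```bitbake
# grub-mender-grubenv_%.bbappend (hypothetical debugging aid, remove after use)
python () {
    # Report the values as seen from inside this recipe at parse time.
    # d.getVar() returns None for a variable that is not set in this datastore.
    for var in ("MENDER_FEATURES", "MENDER_FEATURES_ENABLE", "MENDER_BOOT_PART_MOUNT_LOCATION"):
        bb.warn("%s sees %s = %r" % (d.getVar("PN"), var, d.getVar(var)))
}
```

If a value prints as None here while bitbake -e for the image reports it as set, the assignment probably lives somewhere this recipe never parses.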

That’s now three variables that grub-mender-grubenv sees as empty when they’re not.
When I manually set the PROVIDES and RPROVIDES via a .bbappend file, the build actually succeeds.

I have run bitbake -c cleanall against grub-mender-grubenv and anything else I could think of countless times, and nuked the entire temp directory several times.
I recreate the entire build environment for every run; the only things I carry over are the sstate cache and download dir. Either there is something wrong in some spot I somehow never clean out, or something is going weirdly wrong with this recipe.
Any ideas?

Hi @Adrian,

Thanks for the heads up! Unfortunately my experience with secure boot in this context is very limited. I’ll try to replicate this; off the top of my head, nothing comes to mind immediately.

Maybe @annalenamarx? I think you have used x86 with secure boot + Mender already successfully.

Greets,
Josef

Hi @adrian and @TheYoctoJester,
I’m still not sure about your actual approach, but I think it went down a hard-to-solve track right from the beginning.
Maybe you can use “Mender and efi-secure-boot on intel-corei7-64 - #12 by esscrb” as an inspiration; this is what I oriented myself by while implementing, adding the parts that were missing for me to get it on the road.

I think you mixed up several issues here, so it is hard to see where the root cause came in. I had to patch grub-mender-grubenv, too, but in another context (I needed to patch the grub entry in order to load the kernel from the correct, signed location), and I cannot remember any issues with getting it signed (working on kirkstone branches). If it is not crucial for your use case, I would suggest following the approach in the mentioned thread with no initramfs used, as I think Mender would also need some additional love to handle it properly. And I would not try to change MENDER_BOOT_PART_MOUNT_LOCATION; keep the Mender parts within the root partition and EFI separated on boot. Mender state scripts are a really good solution for updating the signed kernel.

I hope this helps.
Secure Boot and meta-secure-core are not that well documented, and you’re completely on your own with everything besides exactly the use case the original authors had in mind.

Best,
Anna

Hi @annalenamarx ,

thank you for your reply! My approach was just supposed to be meta-efi-secure-boot for secure boot and later on IMA, and mender for OTA updates (USB-only is, unfortunately, not an option for us).

As far as I know, the only non-standard thing I did on the secure boot side was to use SELoader without MOK Secure Boot, but that is the same as the approach you linked.

The initramfs was stated by meta-secure-core as required, but I’ll definitely drop it as per your recommendation and see where that gets me.

Regarding MENDER_BOOT_PART_MOUNT_LOCATION, I am not trying to change it, it is simply empty. The mender-setup-image class sets that variable, but within grub-efi and grub-mender-grubenv, it evaluates to null, causing it to not be substituted in shell code:
Unable to sign /opt/yocto/build/tmp/work/corei7-64-poky-linux/grub-mender-grubenv/1.3.0+giteeb7ebd9e6558cf6bbe661b4f2e4e45d52efa305/image${MENDER_BOOT_PART_MOUNT_LOCATION}

and python code to crash with a TypeError:
TypeError: can only concatenate str (not "NoneType") to str

When I inject some bbwarn statements, it also prints as empty.
I’m guessing something about my configuration causes BitBake’s variable expansion to go haywire, but I’m at a bit of a loss.

That’s why I experimented with manually “fixing” it by setting the variable to what I think it’s supposed to be, which at least got me a successful build and confirmed that there are no other issues lurking under the surface.

Hi @adrian ,

We have OTA running with this approach, including custom Mender state scripts that make sure the signed kernel gets updated in /boot/efi. Works fine! Even without initramfs :smiley:

I think MENDER_BOOT_PART_MOUNT_LOCATION is a question for @TheYoctoJester, I’m not aware of this issue with kirkstone, maybe it’s a bug with scarthgap.

Good luck!

Well, I finally got it.
Turns out my primary mistake was inheriting mender-full in my image rather than local.conf.
That led to MENDER_FEATURES being empty for anything other than my image recipe, and thus everything went haywire.

A second problem was Mender setting the wrong root device. To fix that, I had to move MENDER_STORAGE_DEVICE into local.conf as well. All other Mender configurations seem to apply just fine from the image recipe, including MENDER_STORAGE_TOTAL_SIZE_MB; just these two were causing issues.

I was trying to structure my code logically and isolate sections as much as possible to avoid bloating local.conf, but BitBake has other ideas, it seems.
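
For anyone landing here, a minimal sketch of the change described above. In a .conf file a class is inherited globally through the INHERIT variable rather than the inherit directive used in recipes. The storage device and feature lists are copied from the configuration posted earlier in this thread; moving the feature lists as well is an assumption, not something confirmed above.

```bitbake
# conf/local.conf (or any other global .conf file)

# Global counterpart of "inherit mender-full" in a recipe: the class is applied
# to every recipe, so grub-mender-grubenv gets to see MENDER_FEATURES as well.
INHERIT += "mender-full"

# Reported above as the second setting that had to become global.
MENDER_STORAGE_DEVICE = "/dev/sda"

# Assumption: the feature selection is evaluated by recipes other than the
# image too, so it is probably safest to keep it next to the INHERIT line.
MENDER_FEATURES_ENABLE:append = " mender-grub mender-image-uefi"
MENDER_FEATURES_DISABLE:append = " mender-uboot mender-image-sd mender-grow-fs-data"
```

The matching inherit mender-full line would then be removed from the image recipe.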

Thanks for your time @annalenamarx and @TheYoctoJester :slight_smile:

Hi @Adrian,

ah indeed, that’s a sometimes unexpected effect. Let me explain, there’s actually a good reason for it.

In a nutshell, the rules are:

  1. “everything in a conf file is global, and visible in the whole build”
  2. “everything not in a conf file, and specifically in a recipe, is local to that recipe only”

What does that mean? As an image is effectively just a recipe (note the .bb suffix), whatever you do in that file is not visible outside it. You can’t fine-tune systemd, mender, or anything like those in an image file. Why is that?

Because of reproducibility. BitBake aims at full reproducibility. Think a bit about it: you can also bitbake a single recipe on its own. If one recipe were allowed to affect another, for example your-image affecting mender, you would get two different mender packages depending on whether you invoke bitbake mender or bitbake your-image. This must be avoided, and hence, for any given state of metadata and configuration, each package is well defined, regardless of the specific bitbake invocation. And that’s why the rules are the way they are :slight_smile:
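
The two rules can be illustrated with a small sketch. All file and variable names below are hypothetical and only serve to show where an assignment is visible.

```bitbake
# --- conf/distro/example-distro.conf (hypothetical) ---
# Rule 1: assigned in a .conf file, so it is part of the global configuration
# and every recipe is parsed with this value.
EXAMPLE_UPDATE_SERVER = "https://updates.example.com"

# --- recipes-core/images/example-image.bb (hypothetical) ---
# Rule 2: assigned in a recipe, so only example-image itself sees this value.
inherit core-image
EXAMPLE_POLL_INTERVAL = "1800"

# --- recipes-example/example-client/example-client_1.0.bb (hypothetical) ---
# The global server URL expands as expected here. EXAMPLE_POLL_INTERVAL is
# unset in this recipe, so a ${EXAMPLE_POLL_INTERVAL} reference would be left
# unexpanded instead of picking up the value from example-image.
EXTRA_OEMAKE += "SERVER=${EXAMPLE_UPDATE_SERVER}"
```

An unexpanded reference of that kind is what the literal ${MENDER_BOOT_PART_MOUNT_LOCATION} in the first error message of this thread looks like.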

Hope that helps,
Josef

Hi @TheYoctoJester ,

thanks for that explainer. I have a strong philosophical problem with the way Yocto overuses local.conf, but that is something I’d have to take up with the Yocto mailing list.
I marked my last post as an answer for anyone running into this in the future.

However, I’m not quite out of the woods yet.
When booting, I get a strange error message:


After that, it just keeps booting normally.
That path looks like something got appended incorrectly, but I have trouble finding out whether it’s coming out of meta-mender, grub-mender-grubenv, meta-secure-core or SELoader.
Given that it still manages to boot, I’d be inclined to ignore it, but trying to apply any mender artifact yields another, and presumably related, error:

Update Module output (stderr): Mounted root does not match boot loader environment (/dev/sda3)!

I have found some people running into a similar error when manually partitioning images, but the fix there was always to just… not do that. Since I’m not, I’m at a loss once again.
Any insights would be appreciated!

Hi @Adrian,

hehe, no, it’s not local.conf. The key wording is “configuration file”, not “local.conf”. I completely agree, an empty local.conf is a good local.conf. Having anything in there that does not immediately relate to either selecting MACHINE and DISTRO or handling peculiarities of the build host's specific setup is a sign that you should clean up, and figure out which things actually belong to the MACHINE and which to the DISTRO.

Concerning the grub error message, no immediate idea though. A very first guess would be that meta-mender/meta-mender-core/recipes-bsp/grub/files/cfg (at commit e8b6b3e554392f6d0af3eb18667d93e658205250 of mendersoftware/meta-mender on GitHub) is somehow applied twice.

Greets,
Josef

I think I had the same issue; this is how I solved it: “Mender and efi-secure-boot on intel-corei7-64 - #13 by annalenamarx”

Hi @annalenamarx
Thanks for the suggestion!
Unfortunately, that patch didn’t work either :frowning:
With it, the machine fails to verify the kernel image and refuses to boot.

Going through the Mender client’s code and doing some testing, it seems the switch-over from the A to the B partition during reboot fails.
The .mender artifact I’m installing requests a reboot, but after the reboot the A partition is still mounted.

Well, I don’t know how I feel about this.
Somehow, the issue disappeared.

I don’t know why. It did coincide with a bunch of Yocto patches coming in from upstream, mainly regarding GNOME and various libraries, but I don’t know whether that’s related.

FWIW, I did not need to use @annalenamarx’s patch in the end, in case that tidbit helps anyone in the future.

Anyways, @annalenamarx and @TheYoctoJester :
Massive thanks to you both for your time!

Greetings
Martin
