Raspberry Pi Firmware

Greetings,

I was looking at this issue on the Pi GitHub page and it got me wondering what additional changes would be needed to eliminate the need for u-boot entirely. It looks like it would handle almost everything for the A/B boot as-is, but I suspect something like a special file to select between them (rather than parsing config.txt) and a way to check for failed boots (bootcount in u-boot, I think?) would be required additions. I don’t know how willing they would be to add extra stuff, but it seems it would be worth a try. Does anyone have any suggestions or advice? I’m happy to bring it up with the Pi devs if I can get some help on the actual requirements. Let me know what you think!

Thanks,
Trevor

Hi @stiltr. Thank you for bringing this up.

I think you covered the important parts.

We rely heavily on the bootcount feature in U-Boot to trigger rollbacks. It is also important to be able to trigger a rollback from user-space: if, for example, the Mender client is not able to connect to the Mender server, it will just reboot the device without running mender -commit, and the U-Boot bootcount will trigger a rollback.

The Mender client must also be able to parse and modify the configuration in user-space.

It would be preferred if there is built-in redundancy (similar to the U-Boot env), to be able to perform updates to the configuration file atomically and in a fail-safe way.

Hi @mirzak,

Thanks for the reply! Sorry mine was so delayed.

I just posted a comment on the GitHub issue asking about the possibility of adding these features to the Pi firmware. I’ll keep you posted on what I hear back. I’m worried that there might not be write support built into the firmware, which would make the boot count difficult to accomplish, but I’m totally guessing there.

Thanks again!

Thanks for initiating the discussion @stiltr.

No problem!

So it looks like a bootcount may be possible (limited to two bits). I would think this would be sufficient since the mender default is a bootlimit of 1.

They’re asking for an overview of how this would all work and what would be required to be added. I’ve outlined it to the best of my ability below and would appreciate any feedback you could give. I tried to make the flag names as generic as possible (ideally most of this could be used for users who just want a recovery partition to boot after bootlimit number of failed boots).

The root partition, kernel, etc. would be selected via the recently added os_prefix flag. Two folders would hold the necessary files for each root fs (the root partition being specified in the respective cmdline.txt files).

A 2-bit bootcount would be added to wherever it lives and a method to read and reset it from user-space would be created.
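To make the 2-bit limit concrete, here is a rough Python sketch of how such a counter could be read, incremented and reset. Where the counter actually lives is unknown at this point, so the register value, the bit position (BOOTCOUNT_SHIFT) and all function names are assumptions made up for illustration. The counter saturates at 3 rather than wrapping back to 0, so a repeatedly failing boot can never look like a fresh one.

```python
BOOTCOUNT_MASK = 0b11  # two bits, so the counter can hold 0..3
BOOTCOUNT_SHIFT = 0    # hypothetical bit position inside the register


def read_bootcount(register: int) -> int:
    """Extract the 2-bit bootcount from a register value."""
    return (register >> BOOTCOUNT_SHIFT) & BOOTCOUNT_MASK


def write_bootcount(register: int, count: int) -> int:
    """Return the register value with only the bootcount field replaced."""
    count = min(count, BOOTCOUNT_MASK)  # saturate instead of wrapping to 0
    cleared = register & ~(BOOTCOUNT_MASK << BOOTCOUNT_SHIFT)
    return cleared | (count << BOOTCOUNT_SHIFT)


def increment_bootcount(register: int) -> int:
    """Count one more boot attempt."""
    return write_bootcount(register, read_bootcount(register) + 1)


def reset_bootcount(register: int) -> int:
    """Clear the counter, as user-space would after a successful boot."""
    return write_bootcount(register, 0)
```

With a bootlimit of 1, only the values 0 and 1 are ever needed, so two bits leave some headroom.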

Three new optional flags would be added to config.txt: upgrade_available, bootlimit and recovery_os_prefix. (This is for the simplest and most generic setup.)

The os_prefix and recovery_os_prefix flags in config.txt would be managed from user-space by the mender client. bootlimit would be set to 1.

During boot, the firmware would check bootcount, bootlimit and upgrade_available to select the proper os_prefix. (Please forgive the horrible pseudo code and formatting…)
if (bootcount < bootlimit && upgrade_available == 0) {
    // Normal boot
    boot();
} else if (bootcount < bootlimit && upgrade_available == 1) {
    // Upgrade is pending, boot to it
    os_prefix = recovery_os_prefix;
    boot();
} else if (bootcount >= bootlimit && upgrade_available == 1) {
    // Upgrade failed, boot normal os
    boot();
} else if (bootcount >= bootlimit && upgrade_available == 0) {
    // Normal boot failed, fall back to recovery
    os_prefix = recovery_os_prefix;
    boot();
}

Once the OS has booted, Mender would check bootcount and upgrade_available and act accordingly. If the boot was a success, bootcount is set to 0. If this is the first successful boot after an update, clear upgrade_available and swap os_prefix and recovery_os_prefix. If the boot is a failure, don’t change anything and reboot.
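The user-space side described above could look roughly like this in Python. This is only a sketch of the proposal, not how the Mender client is actually implemented: the cfg dictionary stands in for whatever mechanism ends up exposing config.txt and the bootcount, and the function name and return strings are made up for illustration.

```python
def after_boot(cfg: dict, boot_ok: bool) -> str:
    """Update cfg in place after the OS has booted and return the action taken.

    cfg is a hypothetical stand-in for the firmware-visible settings:
    bootcount, upgrade_available, os_prefix and recovery_os_prefix.
    """
    if not boot_ok:
        # Failure: change nothing and reboot, so the firmware's
        # bootcount/bootlimit check can fall back on the next boot.
        return "reboot"

    cfg["bootcount"] = 0
    if cfg["upgrade_available"] == 1:
        # First successful boot after an update: make the new OS the default
        # and keep the previous one as the fallback.
        cfg["upgrade_available"] = 0
        cfg["os_prefix"], cfg["recovery_os_prefix"] = (
            cfg["recovery_os_prefix"],
            cfg["os_prefix"],
        )
        return "commit"
    return "none"
```

For example, calling after_boot on a state with upgrade_available set to 1 and boot_ok=True swaps the two prefixes and clears the flag, while boot_ok=False leaves the state exactly as it was.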

I think that about covers it, but it’s been a long day and I’m sure I missed something along the way. If you can sum this more elegantly, please feel free. Thanks for taking the time to look this over!

Hi @mirzak,

I forgot to tag you in the last reply, so I’m not sure if you’ve seen it. Do you have any input?

Thanks!

Apologies for the delay. I did see your write-up but have just been struggling to find time to look it over.

Overall I think you have covered the important bits.

This,

if (bootcount < bootlimit && upgrade_available == 0) {
    // Normal boot
    boot();
}

can probably be:

if (upgrade_available == 0) {
    // Normal boot
    boot();
}

as the Mender logic will only do a rollback/alternative boot when upgrade_available=1. The same would apply to this statement,

else if (bootcount >= bootlimit && upgrade_available == 0) {
    // Normal boot failed, fall back to recovery
    os_prefix = recovery_os_prefix;
    boot();
}

When using an A/B update strategy there is no guarantee that the “recovery OS” is a functional image. So it would simply be:

if (upgrade_available == 0) {
    // Normal boot
    boot();
} else if (bootcount < bootlimit && upgrade_available == 1) {
    // Upgrade is pending, boot to it
    os_prefix = recovery_os_prefix;
    boot();
} else if (bootcount >= bootlimit && upgrade_available == 1) {
    // Upgrade failed, boot normal os
    upgrade_available = 0;
    boot();
}
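To check the simplified logic, here it is as a small Python function that can actually be run. It is a sketch under a few assumptions: the firmware itself is assumed to increment bootcount on every attempt to boot a pending upgrade (as U-Boot does), the counter is assumed to saturate at 3 because it is only two bits wide, and the BootState class and select_os_prefix name are invented for this example.

```python
from dataclasses import dataclass

MAX_BOOTCOUNT = 3  # largest value a 2-bit counter can hold


@dataclass
class BootState:
    bootcount: int           # boot attempts since the last reset
    bootlimit: int           # proposed config.txt flag
    upgrade_available: int   # proposed config.txt flag (0 or 1)
    os_prefix: str           # existing config.txt option
    recovery_os_prefix: str  # proposed config.txt flag


def select_os_prefix(state: BootState) -> str:
    """Return the prefix to boot from, updating state as the firmware would."""
    if state.upgrade_available == 0:
        # Normal boot; bootcount is ignored entirely.
        return state.os_prefix
    if state.bootcount < state.bootlimit:
        # Upgrade is pending: count this attempt (assumption) and boot it.
        state.bootcount = min(state.bootcount + 1, MAX_BOOTCOUNT)
        return state.recovery_os_prefix
    # Upgrade failed: clear the flag and go back to the normal OS.
    state.upgrade_available = 0
    return state.os_prefix
```

With bootlimit=1, the first call after an update returns recovery_os_prefix and bumps bootcount to 1. If user-space never resets the counter, the next call clears upgrade_available and returns os_prefix again, which is the rollback.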

> So it looks like a bootcount may be possible (limited to two bits)

Do you know if this is persistent across a power cycle? I am trying to figure out what happens when you lose power.

In the U-Boot case, you store the U-Boot environment on persistent storage using an A/B strategy as well. This is required for it to be atomic and thus be able to handle a power loss while updating variables.

This means that one needs to have two copies of config.txt and cmdline.txt (normal os, and recovery os). Maybe that was your intention as well, but it was not clear to me from the text.

Thank you so much for following up on this!