Mender variables are updated when mender installation is not completed

Dear Community,

I’m using the Yocto sumo branch of meta-mender for our FOTA updates.

We are testing a negative scenario in which we abruptly power off the device while an OS FOTA installation is still in progress. After the device booted up, we observed that the mender_boot_part and mender_boot_part_hex variables had been updated to boot from the next partition, even though Mender had not finished writing the new partition’s contents when the power was cut.
Our question is: why were mender_boot_part and mender_boot_part_hex updated even though the new partition was not completely written? Please assist. Thanks.

Hello @achyuthr,

We have identified one bug in our sync code, which I have posted here. However, hitting this bug is quite rare. Could you post the line from your client log that contains “Native sector size of block device”?

Hello @kacf ,
Below is the line from the client log that contains “Native sector size”:

INFO[0000] native sector size of block device /dev/mmcblk0p3 is 512, we will write in chunks of 1048576 module=dual_rootfs_device

Now, coming to the scenario:
We powered off the device abruptly during the Mender installation phase, so the installation could have been, for example, 20% or 60% complete; it was not necessarily at the very end. Since the installation is only partially complete, we cannot boot from the new partition, as the kernel or rootfs packages may not be present.

So the question is:
Would mender_boot_part and mender_boot_part_hex be updated to boot from the new partition if the device is abruptly powered off when the Mender installation is only 20% or 60% complete?

Thanks for the log output. It’s pretty unlikely that you are affected by the mentioned bug, then.

Mender does not change mender_boot_part and mender_boot_part_hex until all of the download is both finished and synced to disk. Is it possible you have any state scripts, either in /etc/mender/scripts, or attached to the artifact, which could change the variables?
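
The ordering described above (write everything, sync to disk, and only then switch the boot variables) can be sketched roughly as follows. This is a minimal Python illustration, not Mender's actual code: the function name `install_then_switch`, the dict standing in for the bootloader environment, and the regular file standing in for the inactive partition are all assumptions made for the example.

```python
import os
import tempfile

CHUNK_SIZE = 1048576  # chunk size reported in the client log line above


def install_then_switch(image: bytes, target_path: str, env: dict, new_part: int) -> None:
    """Write the image, flush it to disk, and only then touch the boot variables."""
    with open(target_path, "wb") as target:
        for offset in range(0, len(image), CHUNK_SIZE):
            target.write(image[offset:offset + CHUNK_SIZE])
        target.flush()             # push Python's buffer to the OS
        os.fsync(target.fileno())  # ask the OS to push its cache to the device
    # Reached only if every write and the sync succeeded. An interruption
    # before this point leaves env untouched, so the old partition stays active.
    env["mender_boot_part"] = str(new_part)
    env["mender_boot_part_hex"] = format(new_part, "x")
    env["upgrade_available"] = "1"
    env["bootcount"] = "0"


if __name__ == "__main__":
    # Hypothetical starting environment: currently running from partition 2.
    env = {"mender_boot_part": "2", "mender_boot_part_hex": "2",
           "upgrade_available": "0", "bootcount": "1"}
    with tempfile.TemporaryDirectory() as tmp:
        image = b"\0" * (3 * CHUNK_SIZE)
        install_then_switch(image, os.path.join(tmp, "rootfs.img"), env, 3)
    print(env)
```

With this ordering, a power cut at 20% or 60% of the write would leave the variables pointing at the old partition, which is why the observed values are surprising.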

@kacf ,

We have not modified any state scripts in a way that would change these variables.
Thank you

Ok, then I will need more information, since I can’t immediately see what would cause the problem.

Tell me more about your setup:

  1. Are you using Mender in daemon mode or standalone mode?
  2. Can you post the following information:
    1. The output from this command, before you start the update:
      grub-mender-grubenv-print || fw_printenv; mount
      
    2. The commands you use to invoke the update, or the steps in the UI, if that’s what you use.
    3. After the deployment, the output from the same command again:
      grub-mender-grubenv-print || fw_printenv; mount
      
  3. Can you post the deployment log? In standalone mode, this is just the output from the command. In daemon mode, the log can be found at /var/lib/mender/deployments.0000.<UID>.
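
If it helps, the two environment dumps requested above can be compared with a small script. The following is a rough Python sketch, assuming each dump was saved as plain text containing `name=value` lines (the format printed by fw_printenv and by U-Boot's printenv). The helper names `parse_env` and `diff_env` and the sample values in the usage part are hypothetical.

```python
WATCHED = ("bootcount", "mender_boot_part", "mender_boot_part_hex", "upgrade_available")


def parse_env(text: str) -> dict:
    """Parse `name=value` lines; anything else (e.g. mount output) is skipped."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if "=" not in line:
            continue
        name, value = line.split("=", 1)
        if len(name.split()) != 1:
            continue  # variable names contain no whitespace, so this is not one
        env[name] = value
    return env


def diff_env(before: dict, after: dict, names=WATCHED) -> dict:
    """Return {name: (old, new)} for the watched variables whose value changed."""
    return {
        name: (before.get(name), after.get(name))
        for name in names
        if before.get(name) != after.get(name)
    }


if __name__ == "__main__":
    # Hypothetical dumps taken before and after a completed install.
    before = parse_env("bootcount=1\nmender_boot_part=2\n"
                       "mender_boot_part_hex=2\nupgrade_available=0\n")
    after = parse_env("bootcount=0\nmender_boot_part=3\n"
                      "mender_boot_part_hex=3\nupgrade_available=1\n")
    for name, (old, new) in diff_env(before, after).items():
        print(f"{name}: {old} -> {new}")
```

Only the four variables involved in partition switching are compared, so unrelated differences between dumps do not add noise.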

Hi @kacf ,

  1. We are using Mender in standalone mode.

    1. We did not capture the fw_printenv output before the update.
    2. The mender install command is below:
      mender -log-level debug -log-file -install
    3. Since the device was hung at U-Boot, we were able to get the “printenv” output from U-Boot:
      altbootcmd=run mender_altbootcmd; run bootcmd
      baudrate=115200
      board_name=CCU
      board_rev=iMX8DX
      boot=container0
      boot_version=0xD000
      bootargs=console=ttyLP0,115200 earlycon=lpuart32,0x5a060000,115200 rootwait rootfstype=ext4 ro
      bootcmd=run mender_setup; setenv bootargs root=${mender_kernel_root} ${bootargs}; if test "${fdt_addr_r}" != ""; then load ${mender_uboot_root} ${loadaddr_os_cntr} /boot/${os_container}; fi; auth_cntr ${loadaddr_os_cntr}; ${mender_boot_kernel_type} ${loadaddr} - ${fdt_addr_r}; run mender_try_to_recover
      bootcmd_mfg=run mfgtool_args;if iminfo ${initrd_addr}; then if test ${tee} = yes; then bootm ${tee_addr} ${initrd_addr} ${fdt_addr}; else booti ${loadaddr} ${initrd_addr} ${fdt_addr}; fi; else echo "Run fastboot ..."; fastboot 0; fi;
      bootcount=1
      bootdelay=1
      bootlimit=1
      commit_atf=1cb68fa
      commit_mkimage=dd023400
      commit_scfw=a5df0112
      commit_secofw=d1489a99
      ethaddr=00:01:02:03:04:05
      ethprime=eth0
      fastboot_dev=mmc0
      fdt_addr_r=0x83000000
      fdt_file=fsl-imx8dx-ccu.dtb
      fdt_high=0xffffffffffffffff
      fdtcontroladdr=bda8d030
      initrd_addr=0x83100000
      initrd_high=0xffffffffffffffff
      kboot=booti
      kernel=Image
      loadaddr=0x80280000
      loadaddr_m4=0x88000000
      loadaddr_m4_cntr=0x85000000
      loadaddr_os_cntr=0x86000000
      loadm4image_0=load mmc 0:6 ${loadaddr_m4_cntr} ${m4_0_image};auth_cntr ${loadaddr_m4_cntr}
      m4_0_image=m4_ccu_app_signed.bin
      m4boot_0=run loadm4image_0; dcache flush; bootaux ${loadaddr_m4} 0
      mender_altbootcmd=if test ${mender_boot_part} = 2; then setenv mender_boot_part 3; setenv mender_boot_part_hex 3; else setenv mender_boot_part 2; setenv mender_boot_part_hex 2; fi; setenv upgrade_available 0; saveenv; run mender_setup
      mender_boot_kernel_type=booti
      mender_boot_part=3
      mender_boot_part_hex=3
      mender_check_saveenv_canary=1
      mender_dtb_name=fsl-imx8dx-ccu.dtb
      mender_kernel_name=Image
      mender_pre_setup_commands=run m4boot_0;ahab_status
      mender_saveenv_canary=1
      mender_setup=if test "${mender_saveenv_canary}" != "1"; then setenv mender_saveenv_canary 1; saveenv; fi; if test "${mender_pre_setup_commands}" != ""; then run mender_pre_setup_commands; fi; setenv mender_kernel_root /dev/mmcblk0p${mender_boot_part}; if test ${mender_boot_part} = 2; then setenv mender_boot_part_name /dev/mmcblk0p2; else setenv mender_boot_part_name /dev/mmcblk0p3; fi; setenv mender_kernel_root_name ${mender_boot_part_name}; setenv mender_uboot_root mmc 0:${mender_boot_part_hex}; setenv mender_uboot_root_name ${mender_boot_part_name}; setenv expand_bootargs "setenv bootargs \"${bootargs}\""; run expand_bootargs; setenv expand_bootargs; if test "${mender_post_setup_commands}" != ""; then run mender_post_setup_commands; fi
      mender_try_to_recover=if test ${upgrade_available} = 1; then reset; fi
      mender_uboot_boot=mmc 0:1
      mender_uboot_dev=0
      mender_uboot_if=mmc
      mfgtool_args=setenv bootargs console=${console},${baudrate} rdinit=/linuxrc clk_ignore_unused
      os_container=os_cntr_signed.bin
      sec_boot=yes
      soc_type=imx8qxp
      stderr=serial@5a060000
      stdin=serial@5a060000
      stdout=serial@5a060000
      upgrade_available=0
  2. Since the device was hung at U-Boot, we could not get the deployment logs. We had to reflash the device, and hence we lost the logs.

I noticed that upgrade_available=0 exists in the output you posted. This is unexpected for an update which has not booted successfully yet.

The order is supposed to be:

  1. Install update.
  2. Bootloader environment is switched to:
    bootcount=0
    mender_boot_part=3
    upgrade_available=1
    
  3. Reboot.
  4. Bootloader switches environment to:
    bootcount=1
    mender_boot_part=3
    upgrade_available=1
    
  5. Boot proceeds.
    • If boot fails, bootloader switches environment to:
      bootcount=1
      mender_boot_part=2
      upgrade_available=0
      
    • If boot succeeds instead, mender commit switches the environment to:
      bootcount=1
      mender_boot_part=3
      upgrade_available=0
      

As you can see, the only scenario in which the environment ends up as it appears in your log is when Mender has successfully run from the new partition. Bootloader environment updates are atomic and checksummed, so I don’t think partial updates can explain this. This leads me to believe there is something wrong in the bootloader setup itself, but it’s difficult to be specific. You might want to inspect the environment at various points to make sure it is as expected.
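
To make the comparison concrete, the sequence listed above can be written down as a small table of expected states and matched against a captured environment. This is a minimal Python sketch, not part of Mender: the names `expected_states` and `matching_steps` are hypothetical, and the defaults assume, as the list does, an update from partition 2 to partition 3.

```python
KEYS = ("bootcount", "mender_boot_part", "upgrade_available")


def expected_states(old_part: int = 2, new_part: int = 3) -> dict:
    """Expected environment after each step of the sequence listed above."""
    installed = {"bootcount": "0", "mender_boot_part": str(new_part),
                 "upgrade_available": "1"}
    first_boot = {**installed, "bootcount": "1"}
    rolled_back = {**first_boot, "mender_boot_part": str(old_part),
                   "upgrade_available": "0"}
    committed = {**first_boot, "upgrade_available": "0"}
    return {
        "step 2: update installed": installed,
        "step 4: bootloader started new partition": first_boot,
        "step 5: boot failed, rolled back": rolled_back,
        "step 5: boot succeeded, committed": committed,
    }


def matching_steps(observed: dict, old_part: int = 2, new_part: int = 3) -> list:
    """Return the names of the steps whose expected state equals the observed one."""
    snapshot = {key: observed.get(key) for key in KEYS}
    return [name for name, state in expected_states(old_part, new_part).items()
            if state == snapshot]


if __name__ == "__main__":
    # The three relevant values from the printenv output posted above.
    observed = {"bootcount": "1", "mender_boot_part": "3", "upgrade_available": "0"}
    print(matching_steps(observed))  # ['step 5: boot succeeded, committed']
```

Running the same check on environment dumps taken at different points of a test update is one way to see where the actual sequence departs from the expected one.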

Sorry I can’t be of more help.