Different commit behavior of rootfs-image-v2 vs. rootfs-image

I noticed a difference in the behavior of “rootfs-image-v2” (https://github.com/mendersoftware/mender/blob/master/tests/rootfs-image-v2) when trying to commit after a failed upgrade, but only in standalone mode. Consider the following scenario:

  • Install the artifact manually in standalone mode
  • Trigger a reboot
  • The new update is unable to boot (any method can be used to simulate this, such as causing a kernel panic, or pulling the plug while the kernel is being loaded)
  • Boot again (the bootloader reverts to the old OS, reverting the environment variables)
  • From the old OS, attempt to commit in standalone mode

With an artifact of type “rootfs-image”, the commit fails: mender returns “2” and the log says there is nothing to commit. However, if you are using “rootfs-image-v2”, the commit succeeds (!) and there is no hint that something very bad occurred (!!!).

I looked at the code that runs when an artifact of type “rootfs-image” is committed (installer/dual_rootfs_device.go: CommitUpdate) and found an explanation for this difference: this function checks if the “upgrade_available” bootloader environment variable is “1”. If not, the code does not commit and returns ErrorNothingToCommit, which in turn causes the application to return an exit code of 2. The “rootfs-image-v2” module, on the other hand, performs no check and just sets “upgrade_available” to 0.

Now, while it is rather easy to patch the “rootfs-image-v2” module to perform the same check and “exit 2” if there is nothing to commit, I can’t actually get the mender CLI to return 2. Again, the answer is in the code (installer/modules.go: CommitUpdate): the precise exit code of the update module is not checked; the client only checks whether the command succeeded or failed. So “exit 1” or “exit 2” makes no difference.

cc @kacf (would you like a PR that fixes that module anyway, even if I can’t get the mender CLI to return 2?)

P.S.: The issue doesn’t occur in “managed” (OTA) mode, probably because “ArtifactVerifyReboot” does that same check and kills the update (it triggers a rollback, so commit is never called). “ArtifactVerifyReboot” is not called in standalone mode.

This is actually a bug in the client: the built-in rootfs-image module should return a regular failure, not ErrorNothingToCommit. The latter is reserved for when no Artifact installation is in progress. Consider the case where an Artifact has an ArtifactRollback state script: it will not execute unless you also call mender -rollback; the rollback from the bootloader alone is not enough. So you need a regular error to trigger this.

Fixes for both the Update Module and the wrongly returned ErrorNothingToCommit would be appreciated!

Nice investigation, btw! :slight_smile:

Here: https://github.com/mendersoftware/mender/pull/605
And here: https://github.com/mendersoftware/mender/pull/607