Delta updates checksum mismatch

Dear,

We are currently on a Mender Professional plan and are adding support for a new (community supported) device: a Jetson Xavier NX.

When trying to apply a delta update on this device, it fails. The delta update has the following properties:

  • Rootfs size: 6973030400
  • rootfs_image_checksum: 592a7f320564105dbe57e3e668139fc4b28e996904da5168e61f2b7ec8db865f

To verify this, I ran the following command, where /dev/mmcblk0p11 is the currently active partition and 13619200 is the number of blocks in the rootfs (i.e. 6973030400 / 512):

jetson-xavier-nx-nobi-revision-a:~# dd if=/dev/mmcblk0p11 bs=512 count=13619200 2>/dev/null | sha256sum
592a7f320564105dbe57e3e668139fc4b28e996904da5168e61f2b7ec8db865f  -
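The block count passed to dd above can be double-checked with a small sketch. `rootfs_blocks` is a hypothetical helper name, not part of Mender. It assumes the shell does 64-bit arithmetic, which is typical on 64-bit Linux, and it refuses sizes that are not a whole number of blocks, since dd would then read too few bytes.

```shell
#!/bin/sh
# Hypothetical helper (not part of Mender): convert a rootfs size in bytes
# into a dd block count. Returns non-zero if the size is not a whole number
# of blocks, because a truncated read would produce a different checksum.
rootfs_blocks() {
    size=$1
    bs=${2:-512}   # block size in bytes, defaults to 512
    if [ $((size % bs)) -ne 0 ]; then
        echo "size $size is not a multiple of $bs" >&2
        return 1
    fi
    echo $((size / bs))
}

# Size taken from the artifact properties quoted above.
rootfs_blocks 6973030400
```

For the size in this thread the helper should print 13619200, matching the count used in the dd command.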

As far as I can see, the checksum is thus correct. However, I get the following output in journalctl:

Feb 24 09:11:00 nobi-d5eab244-1e36-44e6-adc5-c54a73d42a8e mender[5039]: time="2021-02-24T09:11:00Z" level=info msg="Installer: authenticated digital signature of artifact"
Feb 24 09:11:07 nobi-d5eab244-1e36-44e6-adc5-c54a73d42a8e mender[5039]: time="2021-02-24T09:11:07Z" level=info msg="Update module output: xdelta3: target window checksum mismatch: XD3_INVALID_INPUT"
Feb 24 09:11:07 nobi-d5eab244-1e36-44e6-adc5-c54a73d42a8e mender[5039]: time="2021-02-24T09:11:07Z" level=info msg="Update module output: xdelta3: normally this indicates that the source file is incorrect"
Feb 24 09:11:07 nobi-d5eab244-1e36-44e6-adc5-c54a73d42a8e mender[5039]: time="2021-02-24T09:11:07Z" level=info msg="Update module output: xdelta3: please verify the source file with sha1sum or equivalent"
Feb 24 09:11:07 nobi-d5eab244-1e36-44e6-adc5-c54a73d42a8e mender[5039]: time="2021-02-24T09:11:07Z" level=info msg="Update module output: Failed to apply the delta, err: 1"
Feb 24 09:11:07 nobi-d5eab244-1e36-44e6-adc5-c54a73d42a8e mender[5039]: time="2021-02-24T09:11:07Z" level=error msg="Artifact install failed: Update module terminated abnormally: exit status 1"

After flashing the device, I first applied a full update since I know delta updates don’t work as a first update. Then I built and tried to apply the delta update. All of our images have a read-only root filesystem.

Is there anything else I can do to verify that the active rootfs is unchanged and to make the delta update apply properly? What method is used by the delta updater to verify this checksum?

Thanks in advance!
Niels Avonds

I have identified at least one regression in mender-binary-delta 1.2.0, related to integer overflow. Is this the one you are using?

If so, can you try to revert to version 1.1.1, and see if it works?

Thanks for your reply.

We are currently using 1.2.0. However, I cannot test with 1.1.1 because of this issue: https://tracker.mender.io/browse/MEN-4246

One last note: we are using the same version (1.2.0) on different hardware (Jetson TX2) and it works fine there.

Hey @nielsavonds, MEN-4246 is fixed in mender-binary-delta 1.1.1; only 1.1.0 has this problem.

I ran another test using mender-binary-delta 1.1.1, and it yields the same result. The test was as follows:

  • Flash the root filesystem. This build still contained mender-binary-delta 1.2.0 but I figured this wouldn’t matter
  • Do a full upgrade to a build that contains mender-binary-delta 1.1.1. Verified the version of mender-binary-delta by logging in and running:
    nobi-bc9df4eb-9c04-4f1c-8943-a4cd1f5241ee:/# /usr/share/mender/modules/v3/mender-binary-delta --version
    mender-binary-delta 1.1.1
  • Do a delta upgrade to a new build that is actually identical to the last one.

I’m still getting the checksum mismatch errors. Once again, I verified the checksum of the filesystem and it seems fine:
dd if=/dev/mmcblk0p11 bs=512 count=13619200 2>/dev/null | sha256sum

I’m wondering if the issue is in the “empty space” that’s present on /dev/mmcblk0p11? My question remains: how does mender-binary-delta verify the checksum? Does it only read the first bytes (according to rootfs size) of the filesystem, or does it read the entire filesystem?

Hey @nielsavonds,

The issue I mentioned previously, although a bug, doesn’t seem to be related to your issue.

I’ve made multiple attempts to reproduce this, but I have come up empty-handed. Is there any chance I could get my hands on the delta artifact? For the moment I do not need the source partition, so there should be limited risk of leaking any real content, since it is only a binary diff. I would like to look at some of the internal header data, which is not available through the mender-artifact command.

We can arrange it off-list if you desire. My email is kristian.amlie@northern.tech.

Thanks for the image @nielsavonds, with this I was able to rule out several potential problems. The internal headers match the expected size of 6973030400 exactly, so it should ignore the “empty space” that you asked about earlier, since this is outside of that range. And all other aspects of the headers seem correct as well. I think it’s pretty safe to conclude that the artifact is correctly generated, and the problem must be in the setup somewhere.

Unfortunately it also means that I still don’t know what causes the problem.

My question remains: how does mender-binary-delta verify the checksum? Does it only read the first bytes (according to rootfs size) of the filesystem, or does it read the entire filesystem?

It reads the number of bytes from the rootfs_file_size header, so 6973030400 bytes in your case.
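To mirror that behaviour by hand, a minimal sketch can hash only the first N bytes of a partition, where N is the rootfs size from the artifact. `checksum_prefix` is a hypothetical helper name. It uses `head -c` and `sha256sum`, and it is not how mender-binary-delta is implemented internally.

```shell
#!/bin/sh
# Hypothetical helper: print the SHA-256 of the first N bytes of a file or
# block device, ignoring anything stored beyond that point.
checksum_prefix() {
    src=$1      # file or block device to read
    nbytes=$2   # number of leading bytes to hash
    head -c "$nbytes" "$src" | sha256sum | cut -d' ' -f1
}

# Example invocation (device path and size taken from this thread):
#   checksum_prefix /dev/mmcblk0p11 6973030400
```

If the output equals rootfs_image_checksum, the first 6973030400 bytes of that partition match what the delta expects as its source, regardless of what the rest of the partition contains.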

I suspect that the problem lies elsewhere. Can you post these two pieces of information:

  • Contents of /etc/mender/mender.conf (remember to remove TenantToken if present, since it is a private piece of information).
  • The output of fw_printenv.

Hi Kristian,

Thanks for your response. This is the /etc/mender/mender.conf:

{
"ArtifactVerifyKey": "/etc/mender/artifact-verify-key.pem",
"InventoryPollIntervalSeconds": 28800,
"RetryPollIntervalSeconds": 300,
"ServerURL": "https://hosted.mender.io",
"StateScriptRetryTimeoutSeconds": 86400,
"TenantToken": "REDACTED",
"UpdatePollIntervalSeconds": 1800
}

Here is /data/mender/mender.conf:

{
    "RootfsPartA": "/dev/mmcblk0p1",
    "RootfsPartB": "/dev/mmcblk0p11"
}

Finally, fw_printenv only prints this:

mender_boot_part=0

This is because there is no U-boot bootloader in this setup. Instead, we use cboot and a fake fw_printenv script as provided by the meta-mender-tegra layer.

Figuring this is probably the issue, I used an overlayfs to edit the fake fw_printenv script to make it return “mender_boot_part=1” instead and this seems to have fixed it!
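A mismatch like this can be narrowed down by checking which of the configured partitions actually matches the checksum in the artifact. The sketch below assumes the delta is applied against whichever partition the updater believes is active. `find_matching_partition` is a hypothetical helper, not a Mender tool.

```shell
#!/bin/sh
# Hypothetical helper: given an expected SHA-256, a byte count and a list of
# candidate partitions, print the first candidate whose leading bytes hash
# to the expected value. Returns non-zero if none of them match.
find_matching_partition() {
    expected=$1
    nbytes=$2
    shift 2
    for dev in "$@"; do
        actual=$(head -c "$nbytes" "$dev" | sha256sum | cut -d' ' -f1)
        if [ "$actual" = "$expected" ]; then
            echo "$dev"
            return 0
        fi
    done
    return 1
}

# Example invocation with the values from this thread:
#   find_matching_partition \
#       592a7f320564105dbe57e3e668139fc4b28e996904da5168e61f2b7ec8db865f \
#       6973030400 /dev/mmcblk0p1 /dev/mmcblk0p11
```

If the partition printed here is not the one that the mender_boot_part value from fw_printenv selects, the delta is probably being applied against the wrong source.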

I will take this up further with the maintainers of the meta-mender-tegra layer to get a proper fix in (see "Delta upgrade issue when using cboot", Issue #10 in OE4T/meta-mender-community on GitHub).

Thank you very much for pointing me in the right direction!
Best regards,
Niels

I’m very happy to hear that you worked it out. Indeed it was a setup error, but mender-binary-delta should have produced better error messages to uncover this. I have made some changes to the partition detection code, so this should be easier to diagnose if it happens again. Thanks for the update!

Can this be related to my issue?