Race Condition verifying header.tar.gz in manifest with single-file update

I’m using Mender client v3.0 on a Raspberry Pi. This is only reproducible via the Mender daemon, not in standalone mode. The server poll time is 1 second, and the ext4 commit interval is 5s. I am using signed artifacts.

If I have a “single-file” artifact, I can deploy it with no problem. But if it has 4 state scripts, the deployment fails about 66% of the time with the following error:

time="2021-09-20T17:43:37+01:00" level=info msg="State transition: update-fetch [Download_Enter] -> update-store [Download_Enter]"
time="2021-09-20T17:43:38+01:00" level=info msg="Installer: authenticated digital signature of artifact"
time="2021-09-20T17:43:38+01:00" level=error msg="Fetching Artifact headers failed: installer: failed to read Artifact: readHeaderV3: handleHeaderReads: reader: reading header error: invalid checksum; expected: [c835d3bcf2d7bfc4da2ef563b73c8f6ef34093316de7ce9a3579c39de0eaf925]; actual: [30cbb07a7b719f1da4379a6e2c10e368a4567b15017bab9cd8ce473637043ca6]"
time="2021-09-20T17:43:38+01:00" level=info msg="State transition: update-store [Download_Enter] -> cleanup [Error]"
time="2021-09-20T17:43:38+01:00" level=info msg="State transition: cleanup [Error] -> update-status-report [none]"
time="2021-09-20T17:43:38+01:00" level=info msg="State transition: update-status-report [none] -> idle [Idle]"

The checksum it's complaining about is that of header.tar.gz. Sometimes this works, but there appears to be a race condition in which header.tar.gz does not match the checksum in the manifest.

More info: this only happens if the payload is small. In the example below, server.crt is 3k. If I use a larger payload, so that the update process spends significant time in the download state, it is not reproducible.

This is the script I use to generate the artifact, with mender-artifact v3.5. All the state script files contain just a single line: #!/bin/bash

set -e
ARTIFACT_NAME=aiden1
ARTIFACT_FILE=./${ARTIFACT_NAME}.mender
PAYLOAD=./payloadfiles/server.crt
SCRIPT=single-file-artifact-gen

curl https://raw.githubusercontent.com/mendersoftware/mender/master/support/modules-artifact-gen/$SCRIPT -o ./$SCRIPT
chmod +x ./$SCRIPT

./$SCRIPT \
  -n $ARTIFACT_NAME \
  -t "armhf" \
  -t "amd64" \
  -d "/opt/mydirectory" \
  -o $ARTIFACT_FILE \
  $PAYLOAD -- \
  --script "./statescripts/ArtifactInstall_Enter_01" \
  --script "./statescripts/ArtifactInstall_Leave_01" \
  --script "./statescripts/ArtifactCommit_Enter_01" \
  --script "./statescripts/ArtifactRollback_Leave_01"

mender-artifact sign -f --key ./signingkey/private.pem $ARTIFACT_FILE
echo "DONE"

I have done a little more instrumentation of the problem, as I do not have a development environment. When the update works, I can see in checksum.go:Read(p []byte) that the (err == io.EOF) condition is true. If there is no EOF, the update fails with an incorrect checksum (it's the same incorrect checksum every time).

So it looks like when it passes, it's reading 514 bytes + EOF (n), and when it fails, it's only reading 512 bytes and no EOF. So it seems as though reader.go:readHeader verifies the checksum, and fails, before the reader has even reached EOF. Could this be an edge case with the header.tar.gz size being approximately 512 bytes?

I have found the cause of this. The gzip reader for the headers file does not cause the whole file to be read from the menderTarReader. This means checksum.go does not “Read” all the bytes it needs to calculate the correct checksum. If I place the following code before the verification in reader.go:readHeader(), it fixes the problem (or moves the problem somewhere else :smile: ). I hope this helps and that you can find a more elegant solution to this problem.

	// GRANDFIELD: Make sure the tar reader is exhausted.
	buf := make([]byte, 32*1024)
	r.Read(buf);

	// Check if header checksum is correct.
	if cr, ok := r.(*artifact.Checksum); ok { .......................

Awesome find and research @grandfield!

See the ticket for status on the fix.

Hi guys, quick question: for this change to take effect in Mender client 3.1.0, should the change from the mender-artifact repo be merged into mendersoftware/mender/blob/master/vendor/github.com/mendersoftware/mender-artifact/areader/reader.go?

Apologies if you have a build process that merges these at build time, but I'm just curious. It seems like a stale clone of reader.go.

Thanks again for your help.

Looks like it is indeed old.

@oleorhagen, @lluiscampos: Do you know why Dependabot is not picking this up?

Good catch!

@kacf Dependabot cannot follow mender-artifact dependency updates because we are following “master” and not a given tag.

FWIW I will update it now.