So I was poking around with a few things to find vulnerabilities in deploying signed artifacts to devices - in particular, one that fills up the device and bricks it.
Realistically, if I have access to a genuine signed artifact and I create another that’s incorrectly signed but contains a highly compressed file (e.g. full of zeros), I can manipulate that artifact (as it’s just a tar) to place the manifest and signature of the genuine artifact inside it, so a device will accept it.
The scenario here is an attacker who has access to our infrastructure and server, but not to our signing system.
I know the Mender server checks the manifest when an artifact is uploaded to it (good!), but it may then be possible to interact with S3 directly to get a bad artifact onto the server - I haven’t verified that yet, but even if it isn’t possible because of some other checks, not everyone might be using a Mender server as the means to push down artifacts!
So in this case, the device would accept what it thinks is a genuine update, as the signature of the manifest is correct, but it has to extract everything before it can verify the contents against the checksums in the manifest.
Could this possibly fill up the device and brick it, or are there measures in place in the Mender client to stop extracting and revert?
We were thinking that the manifest could maybe include the extracted size (so that it would be signed), which could then be matched against the uncompressed size reported by the .gz of the 0000.tar, if possible.
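To make the idea concrete, here is a minimal sketch of that comparison, not anything the Mender client does today. A gzip file ends with a 4-byte little-endian ISIZE field holding the uncompressed size modulo 2^32. The function names and the `signed_size` parameter are hypothetical; the latter stands for a size value that would have to be added to the signed manifest.

```python
import struct

GZIP_MIN_LENGTH = 18  # 10-byte header + 8-byte trailer (CRC32 + ISIZE)


def gzip_reported_size(raw: bytes) -> int:
    """Return the ISIZE trailer field: uncompressed size modulo 2**32."""
    if len(raw) < GZIP_MIN_LENGTH or raw[:2] != b"\x1f\x8b":
        raise ValueError("not a gzip stream")
    # ISIZE is the last four bytes of the stream, little-endian.
    return struct.unpack("<I", raw[-4:])[0]


def matches_signed_size(raw: bytes, signed_size: int) -> bool:
    """Compare ISIZE with a (hypothetical) size taken from the signed manifest."""
    return gzip_reported_size(raw) == signed_size % 2**32
```

One caveat: ISIZE is written by whoever produced the .gz, so an attacker can set it to anything. A mismatch is a cheap early reason to reject, but a match proves nothing by itself; the extraction would still need to stop once the signed size has been written.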
Kristian is gone for a little while, so you guys are stuck with me
First, without going into the signing specifics:
If I understood this correctly, the fear is that someone can insert an update that writes past the end of the B partition, and thus bricks the device?
If so, this is not possible with our client, as we never write beyond the size of the B partition, no matter what is in the payload.
If my understanding of your question is completely off, I will give it another read.
Ah, my mistake - I forgot to mention I’m only referring to the update modules here, and not a multi-partition device.
The idea is that the malicious (tampered) artifact contains a large, highly compressed file, and the checksum of that file can only be verified once it has been fully extracted.
The extraction would then fill up the device to maximum capacity.
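For a sense of scale, a small sketch of how little space a file full of zeros takes once gzipped. The function name is made up for illustration.

```python
import gzip


def compressed_size_of_zeros(n_bytes: int) -> int:
    """Return the size in bytes of n_bytes of zeros after gzip compression."""
    return len(gzip.compress(bytes(n_bytes)))


def compression_ratio(n_bytes: int) -> float:
    """Uncompressed size divided by compressed size for an all-zero payload."""
    return n_bytes / compressed_size_of_zeros(n_bytes)
```

Calling `compression_ratio(10 * 1024 * 1024)` gives a ratio in the hundreds or more, so a payload that is small on the wire can expand to far more than the free space on the device.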
I’m sorry, I’m not too familiar with the module implementations, so @kacf will have to confirm when he gets back, but:
From looking at the code, it uses FIFOs and streams the payloads, so in general you would not be able to fill up the RAM of your device just by having a zip bomb in the Artifact.
So in general the extraction should not be a problem, it seems to me.
For the modules themselves, though, I am not sure, as they are free to buffer all the data from the input stream if they like.
I will defer the investigation here, and then @kacf will confirm next week if I’m wrong or I’m right
In the meantime, it’d be interesting if you could provide a PoC.
Hey @fergaloconnor, good that you are putting our software to the test! It’s important to verify these things.
The issue is known and documented in the Update Modules specification, but the issue about filling up disk space is not explicitly mentioned. The majority of Update Modules do not implement this state, but leave it to Mender, in which case it will dump the file on the data partition. And yes, you could fill up the disk with this method. But as far as I can tell, it is at most a temporary DoS attack vector, where the disk would fill up until the client got an error code, which would immediately trigger the cleanup path, which frees up the space again.
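The general pattern looks roughly like the sketch below. This is an illustration of "error triggers cleanup", not the Mender client's actual code; `store_payload` and its parameters are hypothetical names.

```python
import os
import tempfile


def store_payload(chunks, directory):
    """Write streamed chunks to a temp file; remove the file on any error."""
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            for chunk in chunks:
                f.write(chunk)  # raises OSError (ENOSPC) once the disk is full
    except Exception:
        # Cleanup path: the partial file is deleted, so the space is freed again.
        try:
            os.unlink(path)
        except FileNotFoundError:
            pass
        raise
    return path
```

With this shape, a full disk only lasts from the failing write until the `unlink`, which is why the effect is temporary.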
The worst attack I can imagine is if you have some control over the download source, and you know in advance approximately how much free space the device has. Then you could stream the file until the byte right before the disk fills up, and then stall the download there, which would leave the device in a filled up, or almost filled up, state for a long time. This would go through several rounds of timing out and retrying, which could take hours in some cases, and during this time the device would be filled with garbage. But it’s a very elaborate attack, which requires access to a great deal of the infrastructure the device connects to, and even if executed successfully, the effects would still be temporary and would not do any real damage.
One way that this could be solved is to have a size field somewhere in the header data, but this would have to be added quite carefully, in a way that cannot itself be subject to an “exploding zip” attack. After all, the header is extracted in the same way as the data. Maybe it could be combined with an upper limit on the header data; 10M or so should provide more than enough for the foreseeable future (if you need more you really should be using the data section).
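A rough sketch of what such an upper limit could look like when decompressing, assuming a gzip-compressed header. `HEADER_LIMIT`, `LimitExceeded` and `gunzip_with_limit` are hypothetical names, and the 10 MiB value is only the figure floated above, not an existing setting.

```python
import gzip

HEADER_LIMIT = 10 * 1024 * 1024  # hypothetical cap, from the "10M" suggestion


class LimitExceeded(Exception):
    """Raised when the decompressed output would exceed the allowed size."""


def gunzip_with_limit(src, dst, limit=HEADER_LIMIT, chunk_size=64 * 1024):
    """Stream-decompress src into dst, aborting before more than limit bytes are written."""
    written = 0
    with gzip.GzipFile(fileobj=src, mode="rb") as gz:
        while True:
            chunk = gz.read(chunk_size)  # at most chunk_size decompressed bytes
            if not chunk:
                break
            if written + len(chunk) > limit:
                raise LimitExceeded(f"output exceeds {limit} bytes")
            dst.write(chunk)
            written += len(chunk)
    return written
```

Because the check happens before each write, the output never grows past `limit`, no matter how well the input compresses. The same loop would work for the data section if `limit` came from a signed size field.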
OK - so the client will get triggered into the cleanup path. That’s great! So the vulnerability isn’t really there aside from your example of a worst attack.
In our case with S3, gaining control over the download source is near impossible, I would imagine. So this is more of a case-by-case issue, depending on the download source used by others.
And agreed, the size field would solve that issue, as extreme as the attack vector is!