When installing an artifact from the CLI, I get a progress bar, like so:
$ mender install /tmp/update_artifact
Installing Artifact of size 672247296...
/etc/mender/scripts/Download_Enter_01: Update status: '' -> 'INSTALLING'
.............................................................. - 100 %
Which would seem like it is writing the artifact to the “inactive partition”. In this example, the progress bar goes from 0% to 100% in around 70s.
What I don’t get is that after this, it takes another 45s to reach the point where it says stuff like:
ArtifactInstall_Enter_01: Change: 'INSTALLING' -> 'INSTALLING'
ArtifactInstall_Leave_01: Change: 'INSTALLING' -> 'INSTALLED'
Setting up /dev/mmcblk0p6 for use as /var filesystem for root /dev/mmcblk0p3
Use 'commit' to update, or 'rollback' to roll back the update.
At least one payload requested a reboot of the device it updated.
Which means that the progress bar is actually not giving the progress of the install process. Reading through the code, it seems like the progress bar is related to the Reader
on the artifact.
So here is my question: after the installation process has read the full artifact, what takes another 45s before the installation is complete? And would there be a way to reflect that in the progress bar? It feels weird to have a progress bar that only accounts for 2/3 of the time…
More specifically, the progress bar seems to be linked to the image Reader
here, and I see the following message appear around when the installation is actually finished:
Wrote %d/%d bytes to the inactive partition
Now StoreUpdate
takes a image io.Reader
as a parameter, and copies it “to the inactive partition”:
n, err := io.Copy(dev, image)
I am not aware of all the details between the “progress reader” and this “image reader”, but naively I would have expected it to be the same, such that the progress bar reflects that copy.
Am I completely off? And is the image copied twice, then, for some reason?
I had a closer look and, assuming that above, the “progress reader” is indeed the “image reader” (which would make total sense), then the next assumption is that io.Copy
reads the whole image (which makes the progress bar reach 100%), but then the kernel still needs another 40s to finish writing (to sync
, so to say).
So I am trying to see if I can make io.Copy
write in a synchronized way, so that the Reader
represents what’s actually written on disk (and not what’s buffered somewhere on the kernel side).
io.Copy
seems to be using out, err = os.OpenFile(device, flag, 0)
here.
My next question is therefore whether using a flag like O_SYNC
would help .
I tried to add flag |= os.O_SYNC
around here, but it does not seem to make a difference.
I would like to try flag |= os.O_SYNC | os.O_DIRECT
, but I get the error:
installer/block_device.go:116:28: undefined: os.O_DIRECT
I tried setting it by hardcoding 00040000 (as per this resource), i.e. with flag |= os.O_SYNC | 0o40000
, but while it compiles, it ends up giving an error when running mender install <artifact>
:
ERRO[0000] Download failed: Payload: can not install Payload: partition-root: Failed to write the update to the inactive partition: “/dev/mmcblk0p2”: Failed to open the device: “/dev/mmcblk0p2”: open /dev/mmcblk0p2: not a directory
Open to suggestions …
Finally found it: it’s just that the progress bar reports 100% when it reaches 99%, I believe because that ends up looking better in the logs.
In my opinion, that is confusing, and I would rather have the uglier log than have someone wonder what happens during those 40sec after the progress bar reported that it was finished: fix: remove confusing hack in progress bar logic by JonasVautherin · Pull Request #990 · mendersoftware/mender · GitHub
1 Like