Time for an update?

Hello

What is the typical time for a device to update in standalone mode with the files already on the local sd card ?

I use a Raspberry Pi 3B+ and 32GB image, that contains two rootfs of about 13GB. The mender artifact size is about 1GB. However, the update process takes about 30minutes. It seems a bit long.

Is it normal ? Or do you have any idea why it’s so long ?

Thank in advance

Hello @deku welcome to the community,

the artifact gets streamed inside the inactive partition, and depending on some factors (eg download speed client and server side & performance of the client pi) will determine how long an deployment will take.
But 30mins is about right for this case.

1 Like

Concerning performance: the image is already stored locally on the sd card. The SD card write speed on the Raspberry Pi 3B+ is about ~20MB/s. For a file of 2.5GB, the write time should be about 2-3 minutes. If we include the decompressing time, we should be about 5minutes.

What does take more time in the process ? Is the image verified byte per byte ?

Raspberry Pi 3B+ is about ~20MB/s

How did you come to this conclusion? Seems a bit high to me based on my experience with RPi3

that contains two rootfs of about 13GB

Is that 13GB per rootfs or total for both?

Which Mender client version are you using?

I used dd to do some write tests from /dev/urandom to a file on local storage. Using a Sandisk Extreme UHS-3 sd card.

13GB per rootfs, for the size. However, they are only filled about 2.5GB.
I use mender 2.2.0.

@deku I’m fairly certain that the entire 13GB will be written which is a large part of the apparent slowdown. I know there were some planned optimizations for write speed and maybe @kacf can comment further.

Drew

Ok, but then what do the number displayed during a deployment represent ?
number

In addition, the client seems to “hang” on certain line before it goes at full speed again. May that be a part of the problem ?

The SD Card is not malfunctioning and does not have bad blocks.

I don’t know the specifics but I assume the progress reported by Mender in standalone mode is real. Maybe buffering is causing issues? @kacf any ideas?

Do you know if maybe the client does some sort of verifications on the data that it copies ? I mean, during the copy/install process ?

There was a discussion below that may shed a little bit more light on your issue.

The write optimization is already in 2.2.0, but its performance depends on what was already on the memory card. If it’s very different from what is in the artifact, then it might even take longer than a straight “dumb” write. But in most cases it will be faster.

Does a second update take equally long?

Also please share a full update log (in text). This would contain some additional information, e.g how many blocks where written vs how many where skipped (this is one of the optimizations in 2.2.0).

That’s what I would like to do, but the client never seems to log anything even when using the -L and -l options. I use raspbian converted with mender-convert.

There is no difference between a failed and a sucessful upgrade, the client will never log anything

How are you trying to read the logs?

From the managed service you can get:

journalctl -a -u mender

Or if you run it in standalone mode, just run it with:

mender -debug -install ...

Though the output is fairly verbose even without -debug flag.

I run in standalone mode, sorry forgot to tell. This gave me a lot of output:

INFO[1759] The optimized block-device writer wrote a total of 13349 frames, where 3014 frames did need to be rewritten module=block_d
evice
INFO[1760] Wrote 13996392448/13996392448 bytes to the inactive partition module=dual_rootfs_device
100% 1020594 KiB
INFO[1760] Enabling partition with new image installed to be a boot candidate: 2 module=dual_rootfs_device
Use -commit to update, or -rollback to roll back the update.

Only 3014 frames were skipped.

By the way, shouldn’t the client log deployment info on a file in /var/lib/mender or /etc/mender ? During my tests the client won’t log anything.

It does log to above location when running in managed mode.

Only 3014 frames were skipped.

Yeah, so this particular optimization does not help much in this particular user-case. But this largely depends on what is there prior to writing. So if you update to the same image twice it should actually skip 100 % of the blocks (would only be true if the rootfs is read-only though so there will be some blocks written if it is not)

Does the client log elsewhere in standalone mode ?

Almost all blocks are rewritten even when upgrading twice. Tested it before

I find these type of things interesting :), so I ran some tests for reference.

Mender client: 2.2.0
SD Card: Samsung EVO MB-MP32GA 32GB U1 Class 10 MicroSDHC Memory Card

Disk on device:

pi@raspberrypi:~ $ sudo fdisk -l /dev/mmcblk0
Disk /dev/mmcblk0: 29.8 GiB, 32010928128 bytes, 62521344 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0xdc6d2ff0

Device         Boot    Start      End  Sectors  Size Id Type
/dev/mmcblk0p1 *       24576   548863   524288  256M  c W95 FAT32 (LBA)
/dev/mmcblk0p2        548864 26476543 25927680 12.4G 83 Linux
/dev/mmcblk0p3      26476544 52404223 25927680 12.4G 83 Linux
/dev/mmcblk0p4      52404224 54501375  2097152    1G 83 Linux

Actual payload content (88 % free space)

pi@raspberrypi:~$ df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/root        13G  1.3G   11G  12% /

Raw Performacne test:

# Clear I/O cache
echo 3 | sudo tee /proc/sys/vm/drop_caches

# Write 1 GB of random data
time sudo dd if=/dev/urandom of=/dev/mmcblk0p3 bs=1M count=1024 oflag=dsync
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 90.3429 s, 11.9 MB/s

Above should give is an good indication of the “raw” write performance of the SD
card and this is more or less how the Mender client writes the data. Mender client
of course needs to decompress the Artifact while writing which is CPU bound and
not I/O bound (well maybe memory I/O but this should not slow us down).

With the above information writing 12 GB of data in the best case should take: 18 minutes (90 seconds * 12)

Lets try it:

# Clear I/O cache
echo 3 | sudo tee /proc/sys/vm/drop_caches

pi@raspberrypi:~ $ time sudo mender -install 2020-02-05-raspbian-buster-lite-raspberrypi3-mender.mender
INFO[0000] Loaded configuration file: /var/lib/mender/mender.conf  module=config
INFO[0000] Loaded configuration file: /etc/mender/mender.conf  module=config
INFO[0000] Mender running on partition: /dev/mmcblk0p2   module=cli
INFO[0000] Start updating from local image file: [2020-02-05-raspbian-buster-lite-raspberrypi3-mender.mender]  module=standalone
Installing Artifact of size 465903616...
INFO[0000] no public key was provided for authenticating the artifact  module=installer
INFO[0000] opening device /dev/mmcblk0p3 for writing     module=block_device
INFO[0000] native sector size of block device /dev/mmcblk0p3 is 512, we will write in chunks of 1048576  module=block_device
................................   0% 1024 KiB
................................   0% 2048 KiB
................................   0% 3072 KiB
................................   0% 4096 KiB

< cut >

................................  99% 454656 KiB
..........INFO[1724] 0 bytes remaining to be written               module=limited_writer
INFO[1724] The optimized block-device writer wrote a total of 12661 frames, where 3072 frames did need to be rewritten  module=block_device
INFO[1724] Wrote 13274972160/13274972160 bytes to the inactive partition  module=dual_rootfs_device
                       100% 454984 KiB
INFO[1725] Enabling partition with new image installed to be a boot candidate: 3  module=dual_rootfs_device
Use -commit to update, or -rollback to roll back the update.
At least one payload requested a reboot of the device it updated.

real    28m45.767s
user    18m44.459s
sys 0m53.113s

Pretty much the same result that you are sharing so far, in both time and block written/skipped.

I can also confirm that I get similar results running the same update multiple times. Probably something in Raspbian that alters the filesystem image enough for the optimizations not to provide much effect here, e.g the Linux kernel argument fsck.repair=yes could be causing this.

What the time output also tells us is that almost 10 minutes are being wasted on waiting, because the user and sys times gives us the numbers in CPU time spent in user/kernel mode and they do not add up to the real time.

As I am reading the Mender Artifact in the local filesystem I suspected that it is actually “choking” the SDIO bus because we have three operations running on it:

  1. The .mender file is being read from the local file system
  2. The Mender client will read ahead blocks in the inactive partition
  3. The Mender client will write blocks to the inactive partition

So I tried moving my .mender file to a USB flash drive to avoid 1.

# Clear I/O cache
echo 3 | sudo tee /proc/sys/vm/drop_caches

pi@raspberrypi:~ $ time sudo mender -install /mnt/2020-02-05-raspbian-buster-lite-raspberrypi3-mender.mender 
INFO[0000] Loaded configuration file: /var/lib/mender/mender.conf 
WARN[0000] No server URL(s) specified in mender configuration. 
WARN[0000] Server entry 1 has no associated server URL. 
INFO[0000] Mender running on partition: /dev/mmcblk0p2  
INFO[0000] Start updating from local image file: [/mnt/2020-02-05-raspbian-buster-lite-raspberrypi3-mender.mender] 
Installing Artifact of size 465903616...
INFO[0000] No public key was provided for authenticating the artifact 
INFO[0000] Opening device "/dev/mmcblk0p3" for writing  
INFO[0000] Native sector size of block device /dev/mmcblk0p3 is 512 bytes. Mender will write in chunks of 1048576 bytes 
................................   0% 1024 KiB
................................   0% 2048 KiB

< CUT >

................................  99% 454656 KiB
..........INFO[1465] All bytes were successfully written to the new partition 
INFO[1465] The optimized block-device writer wrote a total of 12661 frames, where 566 frames did need to be rewritten (i.e., skipped) 
INFO[1466] Wrote 13274972160/13274972160 bytes to the inactive partition 
                       100% 454984 KiB
INFO[1466] Enabling partition with new image installed to be a boot candidate: 3 
Use -commit to update, or -rollback to roll back the update.
At least one payload requested a reboot of the device it updated.

real    24m26.881s
user    18m31.913s
sys 0m41.828s

Improvement but still ~6 minutes unaccounted for. Overhead of decompression? Overhead of golang? Overhead of reading and comparing blocks? Probably combination of all of them

So the short answer to you question (which was already said :)), what you are seeing is expected behavior of the 2.2.0 client in this environment (Raspbian)

2 Likes

Wow, now that’s what I call investment! This is pretty awesome.

Now the “issues” are clear. Thank you very much for sharing your results :smiley:

One last question… You used the flag “dsync” that means that no caching is involved in the process and that write operations are performed directly on the media. What does the mender client require this ?