Hi,
I have a strange problem, described below:
I have a platform where the dtb and kernel image are stored in an eMMC partition.
After flashing the partitions and booting the platform, the device tree and kernel image read times from Partition A are as follows (the same holds for Partition B):
67376 bytes read in 38 ms (1.7 MiB/s)
10629632 bytes read in 212 ms (47.8 MiB/s)
Flattened Device Tree blob at 83000000
Booting using the fdt blob at 0x83000000
Using Device Tree in place at 0000000083000000, end 000000008301372f
If I switch to Partition B using fw_setenv and reboot, the read times above stay the same. In other words, both partitions give the same read times for the device tree and kernel image.
Now, if I am booted into Partition A, run a mender -rootfs <URI> update, let it complete, and reboot, then Partition B shows a much slower read time for the kernel image, as given below:
67376 bytes read in 38 ms (1.7 MiB/s)
10629632 bytes read in 23059 ms (449.2 KiB/s)
Flattened Device Tree blob at 83000000
Booting using the fdt blob at 0x83000000
Using Device Tree in place at 0000000083000000, end 000000008301372f
But if I interrupt the update and reboot, then Partition A still shows the expected read time (i.e. 212 ms).
The same applies going from Partition B to Partition A. In other words, once both partitions have been updated via mender -rootfs, both take more time (around 24 seconds) to load the kernel, which was previously loading within 212 ms.
I am not changing anything in the boot environment, and in all cases the dtb and kernel image are inside /boot of the rootfs.
Has anyone else faced this issue? Is there a fix for this problem?
The simplest workaround is to switch your root filesystem to ext3 using the ARTIFACTIMG_FSTYPE variable. I have also tried some of the patches mentioned in the links above, which do seem to work.
I am also researching this right now to figure out the extent of the problem and what our recommendations will be.
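For reference, the workaround is a one-line change in the Yocto build. The exact file where you set it depends on your setup; local.conf is an assumption here, and a machine or distro configuration file works just as well:

```
# local.conf (or your machine/distro conf) — build the rootfs/artifact
# image as ext3 so U-Boot's slow ext4 extent code path is never hit
ARTIFACTIMG_FSTYPE = "ext3"
```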
Here is a summary of my test results on an i.MX8 with U-Boot 2017.03 (the one from the NXP BSP). In my case I used a uSD card, so I would expect the performance to be better on eMMC.
Stock U-Boot (u-boot-imx_2017.03):
u-boot=> ext4load mmc 1:2 $loadaddr /boot/${image}
23065088 bytes read in 79537 ms (282.2 KiB/s)
u-boot=> ext4load mmc 1:2 $loadaddr /boot/${image}
23065088 bytes read in 1403 ms (15.7 MiB/s)
With ext3 as the filesystem (ARTIFACTIMG_FSTYPE = "ext3"):
u-boot=> ext4load mmc 1:2 $loadaddr /boot/${image}
23065088 bytes read in 1938 ms (11.3 MiB/s)
With the extent feature disabled ("-O ^extent") on an ext4 image; note that 1:3 is the filesystem without the extent feature:
u-boot=> ext4load mmc 1:2 $loadaddr /boot/${image}
23065088 bytes read in 82223 ms (273.4 KiB/s)
u-boot=> ext4load mmc 1:3 $loadaddr /boot/${image}
23065088 bytes read in 1927 ms (11.4 MiB/s)
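The two filesystem variants above can be reproduced on any Linux host without a device, using plain image files (no root needed for mkfs.ext4 or dumpe2fs on a file). A sketch, assuming e2fsprogs is installed; the file names are arbitrary:

```shell
# Create two small ext4 images: one with defaults (extents on),
# one with the extent feature turned off, then compare feature lists.
dd if=/dev/zero of=with-extent.img bs=1M count=8 status=none
dd if=/dev/zero of=no-extent.img bs=1M count=8 status=none

mkfs.ext4 -q -F with-extent.img             # defaults: extent enabled
mkfs.ext4 -q -F -O '^extent' no-extent.img  # extent disabled

# The "Filesystem features:" line shows whether "extent" is present
dumpe2fs -h with-extent.img 2>/dev/null | grep '^Filesystem features'
dumpe2fs -h no-extent.img 2>/dev/null | grep '^Filesystem features'
```

The image listing `extent` in its feature line corresponds to the slow case in the pre-fix U-Boot measurements above; the one without it corresponds to the fast 1:3 case.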
The ext4 extent performance problem has been fixed upstream, though it seems to have just missed the 2019.04 release (by one day!), meaning it will not be included until the next release, which should be 2019.07.
Thank you @mirzak for providing the links and references. Yes, I am using an i.MX8 device.
But I would expect the slow read to always be present if the issue is related to aarch64 and ext4 in U-Boot. Why does it happen only when I update the rootfs? If I flash it using the MFG/UUU tools, the kernel read is fast with the same aarch64 and ext4!
Are you getting the slow read all the time, or only after the rootfs has been updated with an artifact?
Yeah, the mount output is not that interesting; what I am interested in is which ext4 features are enabled. I would recommend trying to install dumpe2fs.
Could you please let me know whether you are facing the same problem?
I do not see the same behavior where it is "fast" initially and gets slow after an update; for me it is always slow.
But then again, I am using the Yocto-built sdimg, and I suspect the MFG tool applies different settings, which is why I was interested in the dumpe2fs output.
It will be difficult to add dumpe2fs and try this now, because we have a release for which we need to reduce the rootfs from 250+ MB to 80 MB. I will try it after the release.
Meanwhile, I was able to reduce the delay by switching the root filesystem to ext3 using the ARTIFACTIMG_FSTYPE variable. I am getting the figures below:
Before update (when it is fast):
67025 bytes read in 38 ms (1.7 MiB/s)
10786824 bytes read in 216 ms (47.6 MiB/s)
After update (when it is slow):
67025 bytes read in 42 ms (1.5 MiB/s)
10786824 bytes read in 504 ms (20.4 MiB/s)
PS: I’m using NXP’s UUU tool on Linux (not MFG tool).
If the extent feature is listed under "Filesystem features", that is when the slowdown occurs. The extent feature is part of the defaults when you run mkfs.ext4, which is what Yocto does when it generates the filesystem image.
This also explains why it works better with ext3, where this feature is not supported.
From the man page of mkfs.ext4:
extent
Instead of using the indirect block scheme for storing the location of data blocks in an inode, use extents instead. This is a much more efficient encoding which speeds up filesystem access, especially for large files.
@mirzak, I think I have figured out why UUU loads the kernel faster.
UUU does the partitioning as follows:
First it partitions the eMMC using sfdisk with partition type 83 (Linux).
Then UUU runs mkfs.ext4 on the partition(s) as below:
FBK: ucmd mkfs.ext4 -F -j /dev/mmcblk0p1
Here, the -j option is the reason why the UUU-partitioned image loads the kernel faster compared to the Mender-updated one.
According to the man page of mkfs.ext4:
-j: Create the filesystem with an ext3 journal. If the -J option is not specified, the default journal parameters will be used to create an appropriately sized journal (given the size of the filesystem) stored within the filesystem. Note that you must be using a kernel which has ext3 support in order to actually make use of the journal.
At least it is somewhat consistent as we know already that it works just fine with ext3.
I will also give it a try with the -j flag to confirm on my side, but my guess here is that extents are not used when the journal is ext3-compatible, which is why there is no performance drop.
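One way to check that guess without a device is to build both images as files and compare the feature lists. Note the outcome depends on the mke2fs.conf in the environment where mkfs.ext4 runs (UUU executes the command inside its own ramdisk environment), so a host-side result may differ from what UUU actually produces; this is a sketch, assuming e2fsprogs on the host:

```shell
# Build one image the UUU way (-j) and one with plain defaults,
# then print both feature lists for comparison.
dd if=/dev/zero of=uuu-style.img bs=1M count=8 status=none
dd if=/dev/zero of=default.img bs=1M count=8 status=none

mkfs.ext4 -q -F -j uuu-style.img   # the command UUU runs (here on a file)
mkfs.ext4 -q -F default.img        # what Yocto's image generation effectively does

echo "uuu-style: $(dumpe2fs -h uuu-style.img 2>/dev/null | grep '^Filesystem features')"
echo "default:   $(dumpe2fs -h default.img 2>/dev/null | grep '^Filesystem features')"
```

If `extent` appears in one list but not the other, that would confirm the theory; if both lists match, the difference between the UUU and Mender images lies elsewhere (e.g. a different mkfs binary or mke2fs.conf in the UUU environment).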