Mender upload fails for big artifacts

I used to have to retry 3-4 times to upload a big artifact.
But since mid-to-late March it has become almost impossible.

time mender-cli artifacts --server https://hosted.mender.io upload ~/build/mender-convert/deploy/1.5.0-CS-2-x86_64.mender
Configuration file not found. Continuing.
4.03 GiB / 4.03 GiB [------------------------------------------------------------------------------------------------>] 100.00% 16.81 MiB p/s
Processing uploaded file. This may take around one minute.

4.03 GiB / 4.03 GiB [------------------------------------------------------------------------------------------------------>] 100.00% 0 B p/s
FAILURE: POST /artifacts request failed: Post "https://hosted.mender.io/api/management/v1/deployments/artifacts": EOF

real	14m7,417s
user	0m4,667s
sys	0m4,436s

Or I get this error:

FAILURE: artifact upload failed with status 400, reason: {"error":"reading artifact error: Payload: can not install Payload: rootfs.img: AccessDenied: Request has expired\n\tstatus code: 403, request id: 55V7TDB2FGXT8PCS, host id: ZVZZKU2/5Z5429d1lXCSxz7wvzTxMS533qKP8+lIQgJYngvnDgTCSKh/UqXmSUFluEgBuBr/rTw=","request_id":"83253c02-fc1f-4b6d-be39-80fc2c5e973f"}

The artifact is 4.03 GiB with gzip compression.
The rootfs is ~19 GB in size.

On Friday, 26 March, I had to try 20+ times.
On Monday, 29 March, more than 30 times.

I am using mender-cli 1.6.0, and I also tried from the web interface.
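
Since the upload currently needs many manual attempts, a small retry wrapper could at least automate them. This is only a sketch: the `retry` function is a hypothetical helper, not part of mender-cli, and it assumes mender-cli exits with a non-zero status when the upload fails. The attempt count and delay are arbitrary.

```shell
#!/bin/sh
# retry MAX DELAY CMD...: run CMD until it succeeds, at most MAX times,
# sleeping DELAY seconds between attempts. Hypothetical helper.
retry() {
    max=$1
    delay=$2
    shift 2
    attempt=1
    while :; do
        if "$@"; then
            echo "succeeded on attempt $attempt" >&2
            return 0
        fi
        if [ "$attempt" -ge "$max" ]; then
            echo "giving up after $attempt attempts" >&2
            return 1
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
}

# Example call (the artifact path is a placeholder):
# retry 30 60 mender-cli artifacts --server https://hosted.mender.io upload path/to/artifact.mender
```

This does not fix the underlying failure; it only saves re-running the command by hand.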

@tranchitella can you help?

And an additional error when uploading via the web interface:

Artifact couldn't be uploaded. Request failed with status code 502

So far we have managed to upload only once since the change in mid-to-late March.

@peter @tranchitella can you help?

Hello @asansano

Thank you for trying Mender.
When it succeeded, how long did it take? Was it also something like 14 minutes?
In the meantime I am trying to replicate the problem.

best regards,
peter

Hello @peter
Lately I have noticed that it works using the web interface. I had only one failure.
Over the next few days I will try using only the CLI.

What I noticed is that the upload itself takes 3-4 minutes.
Then I see the message “Processing uploaded file. This may take around one minute.”
After that message, it seems that if it doesn’t finish in 10 minutes, it fails.

My guess is that there is a 10-minute timeout.

Hi @peter

As @alin.alexandru said, I think we should focus on the CLI, since over the web it works within 1-2 attempts.
There might be some post-processing that breaks at a certain point over the CLI. We also needed several attempts over the CLI in the past, but it has become worse since last month.

best regards,
andreas

Hello @asansano @alin.alexandru

Thank you for the details.
We were unable to replicate the issue. Could you please share your artifact with me via support@mender.io (I mean with some kind of download link)?

best regards,
peter

Hello,

I will see if I can share the artifact.
But I ran two tests just now.

time mender-cli artifacts --server https://hosted.mender.io upload ~/build/mender-convert/deploy/1.6.0-RC6-x86_64.mender 
Configuration file not found. Continuing.
3.61 GiB / 3.61 GiB [------------------------------------------------------------------------------------------------->] 99.95% 16.06 MiB p/s
Processing uploaded file. This may take around one minute.

3.61 GiB / 3.61 GiB [------------------------------------------------------------------------------------------------------>] 100.00% 0 B p/s
upload successful

real	10m39,450s
user	0m10,252s
sys	0m11,893s

The one above worked.

time mender-cli artifacts --server https://hosted.mender.io upload ~/build/mender-convert/deploy/1.6.0-RC5-x86_64.mender
Configuration file not found. Continuing.
3.97 GiB / 3.97 GiB [------------------------------------------------------------------------------------------------->] 99.97% 16.53 MiB p/s
Processing uploaded file. This may take around one minute.

3.97 GiB / 3.97 GiB [------------------------------------------------------------------------------------------------------>] 100.00% 0 B p/s
FAILURE: POST /artifacts request failed: Post "https://hosted.mender.io/api/management/v1/deployments/artifacts": EOF

real	14m20,032s
user	0m12,439s
sys	0m13,774s

The second one failed.
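
To compare the runs in this thread against the 10-minute timeout guess, here is a rough back-of-the-envelope calculation. It assumes the rate shown by the progress bar is close to the average transfer rate, which may not be true, and `processing_seconds` is just a hypothetical helper name. The `real` times are converted to seconds with a decimal point.

```shell
#!/bin/sh
# processing_seconds SIZE_GIB RATE_MIB_S TOTAL_S
# Estimates the transfer time as size/rate and prints what is left of the
# wall-clock time, i.e. the time presumably spent after the transfer.
processing_seconds() {
    # LC_ALL=C so that awk reads and prints numbers with a decimal point.
    LC_ALL=C awk -v size="$1" -v rate="$2" -v total="$3" 'BEGIN {
        transfer = size * 1024 / rate
        rest = total - transfer
        printf "transfer %.0f s, remaining %.0f s (%.1f min)\n", transfer, rest, rest / 60
    }'
}

processing_seconds 4.03 16.81 847.417   # 1.5.0-CS-2, failed with EOF
processing_seconds 3.61 16.06 639.450   # 1.6.0-RC6, succeeded
processing_seconds 3.97 16.53 860.032   # 1.6.0-RC5, failed with EOF
```

Under that assumption the transfer takes about 4 minutes in each run. The two failed runs leave roughly 10 minutes after the transfer, and the successful one about 7 minutes, which would fit the timeout guess.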

The differences in the one that worked (RC6) compared to the one that failed (RC5) are:

  • Reduced MENDER_STORAGE_TOTAL_SIZE_MB from 40000 to 20000
  • Removed ~800 MB of data from the golden image

Thanks. Could the times be related to the network bandwidth? I mean: when you tested with the UI, did you use the same network link? Does the upload via the UI always take about 14 minutes as well?
Perhaps you could share how you created the artifact?

peter

I will check on the next builds.