Artifact uploads failing

I am setting up a test Mender server instance in AWS EKS.

All the pods have started okay and I can browse to the UI and log in. The issue is that when I try to upload an artifact, the upload gets stuck at 20% (of a ~46 MB artifact) and, after about a minute, I usually get a 504 error. I have tried increasing the AWS load balancer timeout, but the same thing happens; only the time until the 504 response increases. Very occasionally the progress bar moves past 20% after some delay, but the upload still fails to complete.

Each time I try to upload an artifact, an empty file is created in the artifact storage AWS S3 bucket. I have tried writing to the S3 bucket from an EC2 instance that is an EKS node, and the node can successfully write a file to S3.
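For reference, the write test from the node was essentially the following (the bucket name is a placeholder for my actual artifact bucket):

```shell
# Run from a shell on the EKS node; <artifact-bucket> is a placeholder
echo "s3 write test" > /tmp/mender-s3-test.txt
aws s3 cp /tmp/mender-s3-test.txt s3://<artifact-bucket>/mender-s3-test.txt
```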

I have tried using mender-cli to do the artifact upload, both over the internet and from behind the load balancer, and I get the same behaviour.

Looking at the pod logs, I can see the request arriving at the API gateway pod, but from what I can tell the deployments pod has no matching log entry. Other requests to the deployments pod are completing okay; for example, I can see requests for the releases list in its logs. Watching network traffic between the pods, there does appear to be a spike of activity towards the deployments pod when an artifact upload is attempted.

Has anyone had a similar issue with uploads failing and empty files being created in S3, or any suggestions on what may be causing this?

To help me debug further, is there a way to increase logging verbosity for all of the Mender pods, or are there other locations I should be checking for logs or status messages?


I have some more details since first posting.

When AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY are set as environment variables for S3 access, everything works: uploads from the browser are stored in S3.

When AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY are set blank so that the EC2 instance profile is used (as described in "Storage of the artifacts" in the Mender documentation), the upload fails, again with an empty file being created in S3.
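For context, this is roughly what I mean by setting the keys blank (assuming the usual Mender Helm chart layout where the S3 settings live under `global.s3`; the bucket and region values here are placeholders):

```yaml
# values.yaml fragment; bucket/region are placeholders for my actual settings
global:
  s3:
    AWS_BUCKET: "mender-artifact-storage"   # placeholder
    AWS_REGION: "eu-west-1"                 # placeholder
    AWS_ACCESS_KEY_ID: ""                   # blank so the instance profile is used
    AWS_SECRET_ACCESS_KEY: ""
```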

When AWS_SERVICE_ACCOUNT_NAME is set to a Kubernetes service account that maps to an AWS IAM role with S3 read/write permissions, the upload also fails, again creating an empty file in S3. In this case, inspecting the deployments pod with kubectl confirms the expected service account is being used.
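This is roughly how I checked which service account the deployments pod is running as (the namespace and label selector are guesses for a default install; adjust to your setup). With IAM roles for service accounts (IRSA), the EKS webhook should also inject AWS_ROLE_ARN and AWS_WEB_IDENTITY_TOKEN_FILE into the container environment:

```shell
# Which service account is the deployments pod actually using?
kubectl -n mender get pod -l app.kubernetes.io/name=deployments \
  -o jsonpath='{.items[0].spec.serviceAccountName}'

# With IRSA, these variables should be present in the container environment
kubectl -n mender exec deploy/deployments -- \
  env | grep -E 'AWS_ROLE_ARN|AWS_WEB_IDENTITY_TOKEN_FILE'
```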

The permissions are identical across the access-key user, the node role, and the service-account role. I have also tested the same service-account permissions and the same node role from an AWS CLI test pod, and that pod can upload a 50-55 MB file to S3. In both cases (Mender and the AWS CLI test pod), AWS logs report that the expected role is being used and show no obvious errors.
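The AWS CLI test pod was along these lines (the service account name and bucket are placeholders for my actual values):

```shell
# Throwaway pod running under the same service account, attempting a ~50 MB upload
kubectl -n mender run s3-test --rm -it --image=amazon/aws-cli \
  --overrides='{"spec":{"serviceAccountName":"<mender-deployments-sa>"}}' \
  --command -- sh -c \
  'dd if=/dev/zero of=/tmp/testfile bs=1M count=50 && \
   aws s3 cp /tmp/testfile s3://<artifact-bucket>/testfile'
```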

Should setting AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY blank to use the node's permissions work? Should setting AWS_SERVICE_ACCOUNT_NAME work? Are there any other settings needed to make either of these approaches work?

Is there a complete list of permissions Mender needs when running in AWS?

I forgot to mention before: this is with Mender 3.4.