Mender setup on ECS

I’m currently setting up Mender on AWS ECS:

  • Mender 2.3.0
  • using S3 as the storage backend, without storage-proxy and minio.

The entire setup seems to be working at this moment: I can add and decommission devices and run deployments, but I ran into a strange issue when trying to upload an artifact via the CLI:

FAILURE: artifact upload failed with status 504, reason: <html>
<head><title>504 Gateway Time-out</title></head>

The artifact looks fine to me, and it was generated with the latest version of mender-artifact (3.3.0):

Artifact file 'test-release-1.0.mender' validated successfully

API gateway log:

2020/04/27 12:01:34 [warn] 54#54: *3950 a client request body is buffered to a temporary file /usr/local/openresty/nginx/client_body_temp/0000000007 while sending to client, client: <client>, server: <server>, request: "POST /api/management/v1/deployments/artifacts HTTP/1.1", host: "<server>"

Related mender-deployments log entries:

time="2020-04-27T12:01:52Z" level=error msg="unexpected EOF" file=view.go func="view.(*RESTView).RenderError" line=53 request_id=<request_id> user_id=<user>

time="2020-04-27T12:01:52Z" level=info msg="400 539196549μs POST /api/management/v1/deployments/artifacts HTTP/1.0 - Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:75.0) Gecko/20100101 Firefox/75.0" file=middleware.go func="accesslog.(*AccessLogMiddleware).MiddlewareFunc.func1" line=60 request_id=<request_id> user_id=<user_id>

The test demo images upload just fine using both the CLI tool and the UI.

Questions:

  1. How can I check the artifact other than with mender-artifact validate?
  2. Is it OK to exclude storage-proxy and minio from the setup when using AWS S3 as storage?

Hello @imort, welcome to Mender Hub.

As far as I know, removing the storage-proxy and minio should be OK if you are using S3, although perhaps @tranchitella or @merlin know better.

Are you able to upload that artifact through the Web UI?

The artifact format is actually just a tarball with the payload and some metadata. You can use “tar” to extract the contents, but if the mender-artifact validate command succeeded, I’m not sure that will be helpful.
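For a quick look inside without extracting anything, here is a minimal sketch. It assumes the artifact is a plain tar archive as described above; list_artifact is a hypothetical helper name, not part of any Mender tool.

```shell
#!/bin/sh
# Hypothetical helper: list the members of a Mender artifact without extracting it.
# Assumes the artifact is a plain tar archive, as described above.
list_artifact() {
    tar -tf "$1"
}

# Usage: list_artifact test-release-1.0.mender
```

This only shows which files the archive contains; it does not check checksums or signatures the way mender-artifact validate does.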

Drew

@imort if you are uploading large files, please have a look here:

If you are using an AWS load balancer in front of Mender, edit its configuration using the AWS console and increase the idle timeout accordingly.
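The same change can probably be made from the command line instead of the console. This is only a sketch using the AWS CLI: the ARN and the 600-second value are placeholders rather than values from this thread, and set_alb_idle_timeout and DRY_RUN are hypothetical names.

```shell
#!/bin/sh
# Sketch: raise the idle timeout of an Application Load Balancer via the AWS CLI.
# The ARN and timeout passed in are placeholders chosen by the caller.
set_alb_idle_timeout() {
    alb_arn=$1
    seconds=$2
    # With DRY_RUN set to a non-empty value, the command is printed instead of run.
    ${DRY_RUN:+echo} aws elbv2 modify-load-balancer-attributes \
        --load-balancer-arn "$alb_arn" \
        --attributes "Key=idle_timeout.timeout_seconds,Value=$seconds"
}

# Usage: set_alb_idle_timeout "<your-alb-arn>" 600
```

Pick a timeout comfortably longer than the slowest upload you expect.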

Thank you for your kind response!
I’ve added a timeout to the API gateway nginx config and increased the ALB timeout, and artifact upload works now!

Can I post other details about the Mender setup on ECS here later, as they could be helpful for the community?

@imort sure, feel free to add any additional suggestions or details that can be useful to the community!

@tranchitella, regarding API gateway setup:

You’re using a reload-when-hosts-changed script, called from the container entrypoint, which reloads nginx if the IP address of any underlying service changes.

Since I’m using service discovery and a private DNS zone to identify services, I’ll have more than one IP address for each service name, so the script could reload nginx too often, even when the DNS server returns the same IP addresses in a different order.

Looks like there is a better solution available now:

nginx.conf:

location /ui {
    resolver @RESOLVER@ valid=30s;
    set $backend "http://@MENDER_GUI@:80";
    <skipped>
    proxy_pass $backend;
}

The resolver should be retrieved in entrypoint.sh and substituted into nginx.conf using sed, just like MENDER_GUI:

# Take the address from the first "nameserver" line only
resolver=$(awk '$1 == "nameserver" { print $2; exit }' /etc/resolv.conf)

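Because proxy_pass uses a variable, nginx resolves the MENDER_GUI hostname at request time through the configured resolver, and valid=30s makes it pick up any new IPs every 30 seconds.

A quick test shows that it works without reloading nginx when the GUI container is replaced, and that requests are distributed evenly over two mender-gui containers under the same hostname.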
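The entrypoint steps described above could look roughly like the sketch below. The function names and the template path are made up for illustration, and it assumes the first nameserver entry is an IPv4 address that can be used as the nginx resolver unchanged.

```shell
#!/bin/sh
# Sketch of the entrypoint steps; function names and paths are illustrative only.

# Print the address from the first "nameserver" line of a resolv.conf-style file.
first_nameserver() {
    awk '$1 == "nameserver" { print $2; exit }' "$1"
}

# Replace the @RESOLVER@ and @MENDER_GUI@ placeholders in an nginx.conf template
# and write the result to stdout.
render_nginx_conf() {
    template=$1
    resolv_conf=$2
    gui_host=$3
    resolver=$(first_nameserver "$resolv_conf")
    sed -e "s/@RESOLVER@/$resolver/g" -e "s/@MENDER_GUI@/$gui_host/g" "$template"
}

# Usage (hypothetical paths):
# render_nginx_conf nginx.conf.template /etc/resolv.conf mender-gui > nginx.conf
```

Taking only the first nameserver line keeps the sed substitution on a single line even when resolv.conf lists several servers.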

@imort

Thanks for sharing! It sounds like an interesting approach, indeed. What about launching a PR with such a change? :slight_smile:

Side note: are you running Mender in Kubernetes, Swarm, Virtual machines? I’m curious about your deployment setup as you mentioned service discovery and multiple instances of the microservices. We can bring this to private communication if you cannot share these details in public.

What about launching a PR with such a change?

I should ask my boss about that first, sorry :slight_smile:

Side note: are you running Mender in Kubernetes, Swarm, Virtual machines?

I’m going to run it on AWS ECS (Fargate), and it seems to be working fine at the testing stage. Data storage will be deployed separately, as ECS containers are stateless by design.