Requesting virtual-style URL for Hosted Mender artifacts on AWS S3

In this post, the URLs that the Mender client accesses were established to be:

  1. https://hosted.mender.io
  2. https://s3.amazonaws.com/hosted-mender-artifacts

As mentioned in that post, our company has a customer with restrictive network access policies using HTTP proxy servers. They have noted that s3.amazonaws.com resolves to a very large number of IP addresses, which is understandable given AWS S3’s huge footprint. This customer would like to open up the bare minimum of outbound IP address destinations, and they aren’t comfortable with the size of the IP address pool to which that URL resolves.

Are there any plans to switch to virtual-hosted style URLs, which put the bucket name in a subdomain, e.g., hosted-mender-artifacts.s3.amazonaws.com? It’s possible that approach would vastly reduce the IP address pool and would make our customer less worried. It also appears you may need to put this on your roadmap anyway, because AWS may decide to deprecate path-style URLs in the future.

Follow-up question: are there already undocumented paths that would point to the same S3 bucket, e.g., https://hosted-mender-artifacts.s3.us-east-1.amazonaws.com without any change on your end? I haven’t tested this out, but perhaps your team may already know some tricks that could help me help our end customer without any changes to your existing infrastructure. Thanks!
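To make the two URL styles concrete, here is a minimal sketch of the rewrite from path-style to virtual-hosted style. The helper name `to_virtual_hosted` is hypothetical, and whether S3 actually serves the hosted-mender-artifacts bucket at the resulting hostnames is untested here; the code only shows the string transformation.

```python
from urllib.parse import urlsplit, urlunsplit


def to_virtual_hosted(url, region=None):
    """Rewrite a path-style S3 URL into virtual-hosted style.

    Hypothetical helper for illustration; it does not check that the
    input host is really an S3 endpoint.
    """
    parts = urlsplit(url)
    # Path-style URLs carry the bucket as the first path segment: /bucket/key
    bucket, _, key = parts.path.lstrip("/").partition("/")
    if not bucket:
        raise ValueError("no bucket found in path-style URL")
    # Virtual-hosted style moves the bucket into the hostname,
    # optionally with an explicit region.
    if region:
        host = f"{bucket}.s3.{region}.amazonaws.com"
    else:
        host = f"{bucket}.s3.amazonaws.com"
    return urlunsplit((parts.scheme, host, "/" + key, parts.query, parts.fragment))


if __name__ == "__main__":
    print(to_virtual_hosted("https://s3.amazonaws.com/hosted-mender-artifacts"))
    print(to_virtual_hosted("https://s3.amazonaws.com/hosted-mender-artifacts",
                            region="us-east-1"))
```

Note that this only helps for unsigned requests; a pre-signed URL would have to be generated for the new hostname on the server side.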

Upon looking at your client implementation and API documentation, it appears that Artifact.Source.URI is fetched over an API at hosted.mender.io and that there is no client-side control over the artifact URI returned.

I suppose I could try to patch the client to rewrite the returned URI on the fly in order to test out alternate paths in the hope that AWS S3 will route them to the same bucket. However, the example URI below, taken from a tutorial on implementing a custom Mender client, looks like an AWS pre-signed URL, and it specifies that the host header is part of the AWS4 signature (see X-Amz-SignedHeaders=host in the parameters). So that approach is likely to result in AWS rejecting the request if I modify the host string.

"uri": "https://s3.amazonaws.com/hosted-mender-artifacts/5b48937d7e71f600014ab529/af0ccfb3-c410-40f8-9b30-649e5dd7878c?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Content-Sha256=UNSIGNED-PAYLOAD&X-Amz-Credential=AKIAQWI25QR6NDTJ7DLD%2F20191213%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20191213T204753Z&X-Amz-Expires=86400&X-Amz-SignedHeaders=host&response-content-type=application%2Fvnd.mender-artifact&X-Amz-Signature=3923b3238890ea98e51c6943805900570f306d1aaa858eedfd6b3ef1417dfd1b"
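As a quick way to check this for any returned URI, the following sketch parses the query string and reports which headers the signature covers. The function names are hypothetical, and the sample URL is an abbreviated stand-in for the one above with the signature and credential removed.

```python
from urllib.parse import parse_qs, urlsplit


def signed_headers(presigned_url):
    """Return the header names listed in X-Amz-SignedHeaders."""
    query = parse_qs(urlsplit(presigned_url).query)
    # SigV4 separates signed header names with semicolons, e.g. "host;range"
    values = query.get("X-Amz-SignedHeaders", [""])
    return [name for name in values[0].split(";") if name]


def host_is_signed(presigned_url):
    """True if the Host header is covered by the signature."""
    return "host" in signed_headers(presigned_url)


if __name__ == "__main__":
    # Abbreviated, illustrative URL; not a valid signed request.
    sample = ("https://s3.amazonaws.com/hosted-mender-artifacts/tenant/artifact"
              "?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Expires=86400"
              "&X-Amz-SignedHeaders=host")
    print(signed_headers(sample), host_is_signed(sample))
```

If `host_is_signed` returns True, changing the hostname on the client changes the Host header that S3 uses when it recomputes the signature, so the request should fail signature validation.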

It looks like the S3 client in your deployments microservice is aware of AWS’s planned deprecation of path-style URLs. It also appears that you’ve implemented per-tenant storage settings. Is there any way to change the URL for our tenant ID on hosted.mender.io?

Hello @nowls,

thank you for your interest in Mender.
Do I understand correctly that you want to achieve the following: run Mender with outgoing connections allowed only to the hosted.mender.io IP on port 443? The only problem here is: how does the client (device) get the artifacts? By default we get them, as you pointed out, from Amazon S3. How would you see it? Do you have some kind of per-tenant proxy in mind, to reduce the allowed IPs?
Just in passing, there is a good reason why s3.amazonaws.com resolves to more than one IP: it provides the HA and load balancing.

peter

Hi @peter,

Thanks for your response.

Do you have some kind of per-tenant proxy in mind, to reduce the allowed IPs?

Perhaps, but as a first step, I’d like to see whether virtual-hosted style URLs, i.e., using a subdomain for an S3 bucket instead of a path, are a possibility here. s3.amazonaws.com seems to be too monolithic for our customer’s IT requirements, as they are seeking to minimize the surface of outbound connections.

[Do you want to] be able to run Mender with only allowed outgoing connections to hosted.mender.io IP port 443?

Not necessarily. Having a separate URL for artifact download is not itself a problem.

there is a good reason why s3.amazonaws.com resolves to more than one IP: it provides the HA and load balancing.

Yes, I understand why they want to do this. However, as their blog post pointed out, even AWS itself wants more flexibility in addressing by using subdomains instead of paths under a monolithic s3.amazonaws.com domain. That being said, I understand that a subdomain like hosted-mender-artifacts.s3.amazonaws.com may still resolve to a pool of IP addresses for high availability and load balancing.
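One way to check how the address pools actually compare is to resolve both hostnames and look at the distinct IPs returned. The sketch below is only a starting point: the function names are hypothetical, and a single lookup returns just a snapshot, so the lookups would need to be repeated over time to estimate the full pool.

```python
import socket


def resolve_ips(hostname, port=443, resolver=socket.getaddrinfo):
    """Return the sorted, distinct IP addresses from one DNS lookup."""
    infos = resolver(hostname, port, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address for both IPv4 and IPv6.
    return sorted({info[4][0] for info in infos})


def compare(hostnames, resolver=socket.getaddrinfo):
    """Map each hostname to the IPs seen in a single lookup."""
    return {name: resolve_ips(name, resolver=resolver) for name in hostnames}


if __name__ == "__main__":
    hosts = ["s3.amazonaws.com", "hosted-mender-artifacts.s3.amazonaws.com"]
    for name, ips in compare(hosts).items():
        print(name, len(ips), ips)
```

The `resolver` parameter exists only so the lookup can be swapped out, for example to test the logic without network access.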