AWS S3 storage via proxy or interface endpoint

Hi All,

We are looking to utilise Mender to update our clients in the field. We have successfully set up a self-hosted Mender server and have integrated the Mender client onto our board.

All our server-side infrastructure is hosted on AWS. Most of it is hosted in private subnets for security purposes, and we connect to those servers using an AWS Site-to-Site VPN on top of our Mobile Provider’s LTE network (we have a private APN setup for our SIM card population).

When we initially tested Mender using public servers and a public SIM card, we had no issue connecting to our hosted (public test) server and completing updates. However, we have found that our production clients, which use an AWS Site-to-Site VPN, are not able to reach outside of the production AWS Virtual Private Cloud (VPC). As a result, our clients are failing to download the Mender update artifacts from our S3 bucket. From our logs we can see that the clients attempt to download the artifact from an IP within the public AWS address ranges for S3.

I have come up with a couple of possible solutions to this issue. I was wondering whether anyone could comment on their feasibility.

  1. AWS recently released a new service/construct, “Interface endpoints for S3 buckets”. However, upon trying to utilise an endpoint in our setup, I have found that the Mender server configuration does not seem to allow an endpoint-url to be specified (as far as I can see). It is rather easy to manually specify an endpoint-url using the AWS CLI or an SDK. Am I missing anything here? I couldn’t see any mention of endpoints in the Mender documentation, nor in the Mender source code (which admittedly I only briefly scanned).

  2. Alternatively, I believe we could utilise the containerised Minio service and use local block storage attached to the Mender server to host our artifacts. Is this correct?

  3. Finally, I figure it might be possible to access S3 storage using the Mender server as a proxy. I know there are settings for a storage-proxy in the prod.yml file, but I have been unable to use them to serve artifact files from S3 via the Mender server; the server always just passes a public S3 URL back to the client. So I’ve hit a dead end with this solution.

Is there anything I’m missing here? Hopefully someone can offer some assistance.

Cheers!

Hi @jonah_robinson welcome to Mender Hub.

You can configure Mender to use AWS S3 directly rather than using Minio, and part of the configuration is a specific URI. Does that not work for the interface endpoints? See here and here for more details.
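
For reference, the S3-direct settings go in the mender-deployments section of prod.yml. A minimal sketch with placeholder values (region, bucket and credentials would need to match your own account):

mender-deployments:
    environment:
        # IAM credentials with access to the artifact bucket (placeholders)
        DEPLOYMENTS_AWS_AUTH_KEY: <access-key-id>
        DEPLOYMENTS_AWS_AUTH_SECRET: <secret-access-key>
        # region, S3 endpoint URI and bucket holding the artifacts
        DEPLOYMENTS_AWS_REGION: eu-west-2
        DEPLOYMENTS_AWS_URI: https://s3-eu-west-2.amazonaws.com
        DEPLOYMENTS_AWS_BUCKET: <artifact-bucket-name>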

I’m a noob with AWS stuff so I may be completely off base; if the above doesn’t help, I can loop in others with more expertise.

Drew

Hi @drewmoseley

Cheers for your reply. I’ve seen those threads and they did help me get set up to use AWS S3 directly. In that sense my setup works fine!

It’s the production AWS security features I’m now attempting to navigate. In my setup that essentially means I’m not able to connect clients directly to public IP addresses, for example an S3 bucket, either over the public internet or via a customer gateway (with either, a public IP is ultimately handed to the client).

Does that not work for the interface endpoints?
Unfortunately, to my understanding, interface endpoints don’t extend or alter the bucket URI as I had originally hoped when I discovered the service. You instead specify the endpoint at a “higher level” than the URI, in the client making the request.

See the following examples from the docs for interface endpoints to get a feel for what I mean:

AWS CLI example:
aws s3 --endpoint-url https://bucket.vpce-1a2b3c4d-5e6f.s3.us-east-1.vpce.amazonaws.com ls s3://my-bucket/

Python SDK example:
import boto3
session = boto3.session.Session()
s3_client = session.client(service_name='s3', endpoint_url='https://bucket.vpce-1a2b3c4d-5e6f.s3.us-east-1.vpce.amazonaws.com')

So this leaves me in a tricky situation, whereby I either need to figure out whether the Mender server can incorporate an interface endpoint URL into the “Get Object” request to S3, or I need to find another solution for privately hosted artifact storage.

OK. You are definitely beyond my knowledge here. Perhaps @peter, @0lmi or @tranchitella can help further.

@jonah_robinson I suppose you already tried setting the endpoint URL in this config variable, right?

If yes, I fear we don’t support setting a custom endpoint via the configuration file, and the only way to accomplish that is to patch the deployments service to add an additional configuration option.
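
Purely as an illustration of what such a patch could expose in prod.yml (this variable does not exist today; the name is hypothetical):

mender-deployments:
    environment:
        # hypothetical, not supported today: an endpoint URL a patched
        # deployments service would pass to the AWS SDK instead of the
        # default public S3 endpoint
        DEPLOYMENTS_AWS_ENDPOINT_URL: https://bucket.vpce-1a2b3c4d-5e6f.s3.us-east-1.vpce.amazonaws.com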

Hi @tranchitella

Yes, I have tried that. I’ll post my server config for the sake of completeness (I’ve included placeholders for any personal details):

version: '2.1'
services:

    mender-workflows-server:
        command: server --automigrate

    mender-workflows-worker:
        command: worker --automigrate --excluded-workflows generate_artifact

    mender-create-artifact-worker:
        command: --automigrate

    mender-useradm:
        command: server --automigrate
        volumes:
            - ./production/keys-generated/keys/useradm/private.key:/etc/useradm/rsa/private.pem:ro
        logging:
            options:
                max-file: "10"
                max-size: "50m"

    mender-device-auth:
        command: server --automigrate
        volumes:
            - ./production/keys-generated/keys/deviceauth/private.key:/etc/deviceauth/rsa/private.pem:ro
        logging:
            options:
                max-file: "10"
                max-size: "50m"

    mender-inventory:
        command: server --automigrate
        logging:
            options:
                max-file: "10"
                max-size: "50m"

    mender-api-gateway:
        ports:
            # list of ports API gateway is made available on
            - "443:443"
        networks:
            - mender
        volumes:
            - ./production/keys-generated/certs/api-gateway/cert.crt:/var/www/mendersoftware/cert/cert.crt:ro
            - ./production/keys-generated/certs/api-gateway/private.key:/var/www/mendersoftware/cert/private.key:ro
        logging:
            options:
                max-file: "10"
                max-size: "50m"
        environment:
            ALLOWED_HOSTS: DOMAIN_NAME

    mender-deployments:
        command: server --automigrate
        volumes:
            - ./production/keys-generated/certs/storage-proxy/cert.crt:/etc/ssl/certs/storage-proxy.crt:ro
        environment:
            STORAGE_BACKEND_CERT: /etc/ssl/certs/storage-proxy.crt
            # access key, the same value as MINIO_ACCESS_KEY
            DEPLOYMENTS_AWS_AUTH_KEY: NEEDS_TO_BE_SET
            # secret, the same value as MINIO_SECRET_KEY
            DEPLOYMENTS_AWS_AUTH_SECRET: NEEDS_TO_BE_SET

            # deployments service uses signed URLs, hence it needs to access
            # storage-proxy using exactly the same name as devices will; if
            # devices will access storage using https://s3.acme.org:9000, then
            # set this to https://s3.acme.org:9000
            DEPLOYMENTS_AWS_REGION: eu-west-2
            DEPLOYMENTS_AWS_URI: https://s3-eu-west-2.amazonaws.com
            DEPLOYMENTS_AWS_BUCKET: mender-test-bucket
        logging:
            options:
                max-file: "10"
                max-size: "50m"

    mender-mongo:
        volumes:
            - mender-db:/data/db:rw

volumes:
    # mender artifacts storage
    mender-artifacts:
        external:
            # use external volume created manually
            name: mender-artifacts
    # mongo service database
    mender-db:
        external:
            # use external volume created manually
            name: mender-db

If yes, I fear we don’t support setting a custom endpoint via the configuration file, and the only way to accomplish that is to patch the deployments service to add an additional configuration option.

Agreed, I feel this is probably the case if I want to use an interface endpoint, which is a shame because it would solve my problem in such a nice way. I realise Mender is an open source project, so I could probably contribute such a feature if I could find the time. Is the process for suggesting features described clearly anywhere?

More importantly, now that I’ve made my issues clear, does anyone know of a solution to this problem? Would either of my other suggestions be a reasonable option?

  1. I believe we could utilise the containerised Minio service and use local block storage attached to the Mender server to host our artifacts. Is this correct?
  2. It might be possible to access S3 storage using the Mender server as a proxy. I know there are settings for a storage-proxy in the prod.yml file, but I have been unable to use them to serve artifact files from S3 via the Mender server; the server always just passes a public S3 URL back to the client. So I’ve hit a dead end with this solution.

Hello @jonah_robinson,

If you have the possibility to open a PR, that would be great. You can refer to this document containing the contribution guidelines for the Mender project:

Using minio locally would solve your issue, that’s correct.
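
As a rough sketch (based on the standard prod.yml template, so keys and values are placeholders you would need to adapt), the Minio service keeps the artifacts on a local Docker volume:

minio:
    environment:
        # access key and secret; the deployments service must use the same
        # values in DEPLOYMENTS_AWS_AUTH_KEY / DEPLOYMENTS_AWS_AUTH_SECRET
        MINIO_ACCESS_KEY: NEEDS_TO_BE_SET
        MINIO_SECRET_KEY: NEEDS_TO_BE_SET
    volumes:
        # store artifacts on the local mender-artifacts volume instead of S3
        - mender-artifacts:/export

The external mender-artifacts volume you already declare at the bottom of your compose file would then live on block storage attached to the server.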

Regarding point 2, the storage proxy is basically a reverse proxy for Minio. The deployments service will use the S3 APIs to obtain a pre-signed link and pass it to the device, so in any case you’ll end up using the S3 link directly. Thus, it doesn’t help you in this context.
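
For context, in the Minio-based template the pre-signed URL ends up pointing at the storage proxy only because DEPLOYMENTS_AWS_URI is set to the proxy’s device-facing name; roughly like this (the alias is a placeholder):

storage-proxy:
    ports:
        - "9000:9000"
    networks:
        mender:
            aliases:
                # the public name devices use to reach artifact storage
                - artifacts.example.com

mender-deployments:
    environment:
        # must be the same name devices use, so the pre-signed URL the
        # deployments service generates resolves to the storage proxy
        DEPLOYMENTS_AWS_URI: https://artifacts.example.com:9000

With DEPLOYMENTS_AWS_URI pointing at the real S3 endpoint instead, the signed URL handed to the device necessarily targets the public S3 hostname, which is why the proxy settings don’t change what the client sees.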

Thanks for this @tranchitella

Using minio locally would solve your issue, that’s correct.

I’ll explore this option for now. I do have concerns that serving artifacts from block storage attached to the server’s EC2 instance would put a much higher load on the server when deploying an update than simply handing out a link to the object in S3. Am I correct about this? Do you know of any threads or documentation pages from others using this solution?

I would be interested in figuring out the general resource load on the server once we roll out to production. Our population is expected to grow in large bursts, and I won’t be able to test load properly until we are dealing with production updates.

Whilst not using AWS, I have been running Minio with local block storage on a Google Cloud instance successfully for several years now without problems.

Cheers for this @dellgreen

Is there any chance you could make me privy to your population size and server resource allocation (fixed or scaling)? It would give me some peace of mind to know we are comparing apples to apples.

Unfortunately, due to Covid we are not at scale yet, but we are currently running 18 devices across beta and developer sites on a single Mender server instance.

Google Compute Engine:

  • Custom VM: 2 CPU, 5 GB RAM
  • 20 GB standard persistent disk (ext4) for boot/OS
  • 215 GB standard persistent disk (ZFS) for data

Currently all metrics show that we have a lot of headroom with this configuration, but obviously this will probably need to change depending on your scale.
