Host validation error on deployment step

Hi there,

I successfully connected a custom device to a self-hosted Mender server but cannot deploy an update.

I created the artifact with mender-artifact and uploaded it to the server. After that, I started a deployment, but the Mender client reports “Host validation error” when pulling the new update.

Here’s the full error message:

{"level":"error","message":"Update fetch failed: update fetch request failed: Get \"https://s3.docker.mender.io/mender-artifact-storage/ec1d9949-b5c2-4cd8-bb80-7c117cae3e2a?X-Amz-Algorithm=AWS4-HMAC-SHA256\u0026X-Amz-Credential=mender-deployments%2F20210503%2Fus-east-1%2Fs3%2Faws4_request\u0026X-Amz-Date=20210503T205359Z\u0026X-Amz-Expires=86400\u0026X-Amz-SignedHeaders=host\u0026response-content-type=application%2Fvnd.mender-artifact\u0026X-Amz-Signature=768fbfb9e80194494cf9283e9902e2fa1af83b9ec895761cb4e2f24dfe9b9abf\": Host validation error","timestamp":"2021-05-04T00:03:00+03:00"}

Any ideas?

It sounds like a mismatch with server certificates or some such. Can you share the exact steps you took to:

  1. Set up the server.
  2. Configure the client device to run Mender.
  3. Create the artifact.

Drew

Hi there,

  1. Exactly the steps given in the Mender production installation guide.
  2. As follows:

DEVICE_TYPE="DEVTYPE"
SERVER_URL="https://menderurl:4430"
sudo DEBIAN_FRONTEND=noninteractive dpkg -i mender-client_2.6.0-1_armhf.deb
sudo mender setup \
    --device-type $DEVICE_TYPE \
    --server-url $SERVER_URL \
    --server-cert="/etc/myapp/inc/ssl/certs/server.crt" \
    --retry-poll 30 \
    --update-poll 5 \
    --inventory-poll 5

  3. mender-artifact write module-image -t DEVTYPE -o FIRMWARE.mender -T FIRMWARE -n FIRMWARE -f FIRMWARE.fw

You mention it’s self-hosted; however, the storage URL for the artifacts, which is what the client is complaining about, points to the mender.io domain. Is this intentional?

TBH, that’s the point where I got confused.

I didn’t set that domain (I suppose you’re referring to s3.docker.mender.io) on the client device but instead set it when I was deploying the mender server. Here’s the configuration from Mender documentation:

API_GATEWAY_DOMAIN_NAME="menderurl" # replace with your server’s public domain name
STORAGE_PROXY_DOMAIN_NAME="s3.docker.mender.io" # change if you are using a different domain name than the default one

So, I believe the artifact URL is pushed from the server to the client when there’s a new deployment.
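
For illustration, here is a minimal sketch that pulls the host out of the download URL from the error above. The URL is shortened from the log (query string omitted), and `url_host` is a hypothetical helper name, not a Mender tool. Whatever host it prints is the name the client's TLS check compares against the server certificate.

```shell
#!/bin/sh
# url_host: hypothetical helper that prints the host part of a URL.
url_host() {
    rest="${1#*://}"              # drop the scheme
    rest="${rest%%/*}"            # drop the path and query string
    printf '%s\n' "${rest%%:*}"   # drop an optional port
}

# Shortened copy of the URL from the client log (query string omitted).
url="https://s3.docker.mender.io/mender-artifact-storage/ec1d9949-b5c2-4cd8-bb80-7c117cae3e2a"
artifact_host=$(url_host "$url")
echo "$artifact_host"
```

Note that this URL carries no explicit port, while the API URL configured on the client uses port 4430.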

I also registered “s3.docker.mender.io” and “menderurl” domains with the Mender Server IP in the hosts file of the client.

Plus, the client can resolve both domains without any issues.

Any points that I’m skipping?

That all sounds good so far. Does the server certificate contain both domains in it if you are using a single server with a single certificate?

At this point I would normally use the openssl s_client command-line tool to verify the entire certificate chain of trust for each domain.

https://docs.pingidentity.com/bundle/solution-guides/page/iqs1569423823079.html
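
A rough sketch of that check, to be run on the client device. `check_domain` is a hypothetical helper, not part of Mender; the certificate path is the one passed to --server-cert above, and the port for each name is an assumption that has to match what the client really uses (4430 for the API URL, and the default 443 for the storage URL in the log, which has no explicit port). The -servername option sends the domain as SNI so the proxy presents the certificate it would serve for that name.

```shell
#!/bin/sh
# check_domain: hypothetical helper, not part of Mender.
# Usage: check_domain <domain> <port> <ca-file>
check_domain() {
    domain="$1"; port="$2"; cafile="$3"
    # Print the verification result for the chain the server presents.
    openssl s_client -connect "$domain:$port" -servername "$domain" \
        -CAfile "$cafile" </dev/null 2>/dev/null | grep 'Verify return code'
    # Fetch the leaf certificate and report whether it covers the domain.
    openssl s_client -connect "$domain:$port" -servername "$domain" \
        </dev/null 2>/dev/null | openssl x509 -noout -checkhost "$domain"
}

# Example calls, using the names from this thread:
# check_domain menderurl 4430 /etc/myapp/inc/ssl/certs/server.crt
# check_domain s3.docker.mender.io 443 /etc/myapp/inc/ssl/certs/server.crt
```

A certificate that does not list the storage domain should show up in the second command's output as a hostname that does NOT match.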

Hmm, I’m a bit new to this SSL stuff, but as far as I understand it, only the Mender server domain is included in the server-side certificate, not s3.docker.mender.io.
[image]

So how do I recreate that certificate on the server side so that it also includes the minio domain?

– Edit –
Should I use the same domain name if the minio and Mender instances are served from the same host? If that reduces the complexity, I can reinstall the Mender server.

I couldn’t wait and gave it a try :]

I backed up the current SSL certificate folder and recreated the certificates using:

CERT_API_CN=mender.myurl.com.foo CERT_STORAGE_CN=mender.myurl.com.foo ../keygen

And also updated prod conf file:

ALLOWED_HOSTS: mender.myurl.com.foo
DEPLOYMENTS_AWS_URI: https://mender.myurl.com.foo

Then I copied the related certificates to the client, decommissioned it, and re-authorized it. However, now I’m getting another error on the client side:

time="2021-05-04T16:40:47+03:00" level=error msg="Update check failed: transient error: (request_id: ): Invalid response received from server server error message: failed to parse server response: json: cannot unmarshal object into Go struct field .error of type string"

P.S.
Client can successfully send inventory info.

On my first deployment, if I recall correctly, I kept things simple by creating a single certificate that had multiple domains in it and configuring the Mender server api-gateway and storage-proxy to use that same certificate.

That was a few years ago. In newer deployments I use separate certificates for the two, and I am also my own certificate authority for issuing certificates to my servers. This affords greater flexibility at the cost of complexity.

Are you currently using a self-signed certificate, then? If so, when you create it you should be able to add multiple “Subject Alternative Name” entries covering all the domains you use.
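
As a minimal sketch of that idea with plain openssl (this is not the Mender keygen script): the two domain names are placeholders, the files go to a temporary directory, and -addext needs OpenSSL 1.1.1 or newer.

```shell
#!/bin/sh
workdir=$(mktemp -d)

# Self-signed certificate with two Subject Alternative Names.
# Both names are placeholders for the API and storage domains.
openssl req -x509 -newkey rsa:2048 -nodes -days 365 \
    -keyout "$workdir/private.key" -out "$workdir/cert.crt" \
    -subj "/CN=mender.example.com" \
    -addext "subjectAltName=DNS:mender.example.com,DNS:s3.example.com" \
    2>/dev/null

# Ask openssl whether the certificate answers for each name.
api_check=$(openssl x509 -in "$workdir/cert.crt" -noout \
    -checkhost mender.example.com)
s3_check=$(openssl x509 -in "$workdir/cert.crt" -noout \
    -checkhost s3.example.com)
echo "$api_check"
echo "$s3_check"
```

Both checks should report a match; a name that is missing from the list is reported as one that does NOT match the certificate.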

Yes, I’m self-signing the certs (I suppose the keygen script self-signs them :)) and deploying them manually, since the system runs in a closed network.

Have you also updated the storage-proxy aliases section?

As for the client Go error, I wouldn’t move on to that problem until you can confirm that your Mender client device passes openssl s_client testing against the certificate trust chain of each of your domains.

Using s3.docker.mender.io does seem strange to me. In previous versions of the docs, that was specified as:

STORAGE_PROXY_DOMAIN_NAME="$API_GATEWAY_DOMAIN_NAME"

but in the 2.7 version it is specified literally:

STORAGE_PROXY_DOMAIN_NAME="s3.docker.mender.io"

I suspect we have an error in our automation setup for the docs. @oleorhagen, @mzedel, @kacf, can you guys comment on this?

Drew

Have you also updated the storage-proxy aliases section?

Yes, it’s updated.

As for the client Go error, I wouldn’t move on to that problem until you can confirm that your Mender client device passes openssl s_client testing against the certificate trust chain of each of your domains.

Is this the test result you’re looking for? (This is from the client side.)

And this is the cert I deployed to the client:
[image]

And this is a test curl command:

Looks OK so far. What about when you run openssl s_client against your storage proxy domain?

The API and storage domains are the same right now: mender.myurl.com.foo

Here’s the prod conf I’m using:

version: '2.1'
services:

    mender-workflows-server:
        command: server --automigrate

    mender-workflows-worker:
        command: worker --automigrate --excluded-workflows generate_artifact

    mender-create-artifact-worker:
        command: --automigrate

    mender-useradm:
        command: server --automigrate
        volumes:
            - ./production/keys-generated/keys/useradm/private.key:/etc/useradm/rsa/private.pem:ro
        logging:
            options:
                max-file: "10"
                max-size: "50m"

    mender-device-auth:
        command: server --automigrate
        volumes:
            - ./production/keys-generated/keys/deviceauth/private.key:/etc/deviceauth/rsa/private.pem:ro
        logging:
            options:
                max-file: "10"
                max-size: "50m"

    mender-inventory:
        command: server --automigrate
        logging:
            options:
                max-file: "10"
                max-size: "50m"

    mender-api-gateway:
        ports:
            # list of ports API gateway is made available on
            - "4430:443"
        networks:
            mender:
                aliases:
                    # mender-api-gateway is a proxy to storage
                    # and has to use exactly the same name as devices
                    # and the deployments service will;
                    #
                    # if devices and deployments will access storage
                    # using https://s3.acme.org:9000, then
                    # set this to https://s3.acme.org:9000
                    - https://mender.myurl.com.foo
        command:
            - --accesslog=true
            - --providers.file.filename=/config/tls.toml
            - --providers.docker=true
            - --providers.docker.exposedbydefault=false
            - --entrypoints.http.address=:80
            - --entrypoints.https.address=:443
            - --entryPoints.https.transport.respondingTimeouts.idleTimeout=7200
            - --entryPoints.https.transport.respondingTimeouts.readTimeout=7200
            - --entryPoints.https.transport.respondingTimeouts.writeTimeout=7200
            - --entrypoints.http.http.redirections.entryPoint.to=https
            - --entrypoints.http.http.redirections.entryPoint.scheme=https
        volumes:
            - ./tls.toml:/config/tls.toml
            - ./production/keys-generated/certs/api-gateway/cert.crt:/certs/cert.crt:ro
            - ./production/keys-generated/certs/api-gateway/private.key:/certs/private.key:ro
            - ./production/keys-generated/certs/storage-proxy/cert.crt:/certs/s3.docker.mender.io.crt
            - ./production/keys-generated/certs/storage-proxy/private.key:/certs/s3.docker.mender.io.key
        logging:
            options:
                max-file: "10"
                max-size: "50m"
        environment:
            ALLOWED_HOSTS: mender.myurl.com.foo

    mender-deployments:
        command: server --automigrate
        volumes:
            - ./production/keys-generated/certs/storage-proxy/cert.crt:/etc/ssl/certs/s3.docker.mender.io.crt:ro
        environment:
            STORAGE_BACKEND_CERT: /etc/ssl/certs/s3.docker.mender.io.crt
            # access key, the same value as MINIO_ACCESS_KEY
            DEPLOYMENTS_AWS_AUTH_KEY: mender-deployments
            # secret, the same value as MINIO_SECRET_KEY
            DEPLOYMENTS_AWS_AUTH_SECRET: Kaengi3iel8thoh2

            # deployments service uses signed URLs, hence it needs to access
            # storage-proxy using exactly the same name as devices will; if
            # devices will access storage using https://s3.acme.org:9000, then
            # set this to https://s3.acme.org:9000
            DEPLOYMENTS_AWS_URI: https://mender.myurl.com.foo
        logging:
            options:
                max-file: "10"
                max-size: "50m"

    minio:
        environment:
            # access key
            MINIO_ACCESS_KEY: mender-deployments
            # secret
            MINIO_SECRET_KEY: Kaengi3iel8thoh2
        volumes:
            # mounts a docker volume named `mender-artifacts` as /export directory
            - mender-artifacts:/export:rw

    mender-mongo:
        volumes:
            - mender-db:/data/db:rw

volumes:
    # mender artifacts storage
    mender-artifacts:
        external:
            # use external volume created manually
            name: mender-artifacts
    # mongo service database
    mender-db:
        external:
            # use external volume created manually
            name: mender-db

In the 2.7.x branch it looks like the mender-api-gateway and storage-proxy nodes have been merged into a single mender-api-gateway config. It’s not immediately obvious to me how this new config works, as I’m running a slightly older version of the server, which has a separate storage-proxy config node running on a different port. I’ll have to defer this to @drewmoseley, who probably knows why this config changed in 2.7.x and how it’s supposed to work now.

Hopefully @tranchitella can provide some insight here.

I just found that the mender-deployments service is also logging error messages.

mender-deployments_1 | time="2021-05-05T06:54:59Z" level=error msg="error reaching artifact storage service: SerializationError: failed to decode REST XML response\n\tstatus code: 200, request id: \ncaused by: XML syntax error on line 8: element closed by " file=response_helpers.go func=rest_utils.restErrWithLogMsg line=110 request_id=94914e75-a1b1-4a5b-8640-c335ed931174

I found another odd behavior. When I send the following request:

curl --cacert /path_to_cert/mender-server.crt https://mender.myurl.com.tr:4430/api/devices/v1/deployments/device/deployments/next

The server sometimes replies with a proper response (for fewer than 10% of requests):

{"error":"no authorization header","request_id":"e3516fea-7ee1-4064-bd9a-b3eea0f4aa19"}

But sometimes the response is broken: