Hi,
My company has been using the hosted Mender server, and it has been working fairly well for us. I was asked to look into hosting our own server on an Amazon EC2 instance. I'm going to go into a bit of background on what I've done, as I'm not terribly familiar with a lot of the associated software (Docker, Kubernetes, or Amazon's EC2 instances/security infrastructure), so it is likely I'm doing something stupid with my setup. I know the documentation says that the installation of the cluster and related infrastructure is the responsibility of the user, but I'm out of ideas on what I'm doing wrong and was hoping for some insight.
I'm trying to follow the instructions for the 3.6 Production Installation with Kubernetes. I created an Ubuntu EC2 instance, and it seems like I was able to install Helm, Kubernetes, and cert-manager without issue. I originally skipped the Minio installation because I thought it would be easier to connect the EC2 instance to an S3 bucket, and I got the impression that installing Minio was optional. When I try to use Helm to install Mender, the "mender-db-data-migration" job never completes. I suspected this was due to my connection to the S3 bucket, but I was able to use awscli to view and add items there without issue. I then tried to install Minio, which seemed to work, but I haven't been able to get the ingress working properly to expose the service. I didn't want to spend a lot of time there, as I thought using the Amazon bucket would be a simpler solution.
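For what it's worth, the checks I ran against the bucket were along these lines (the bucket name here is just a placeholder, not my real one):

aws s3 ls s3://my-mender-artifacts
aws s3 cp test.txt s3://my-mender-artifacts/test.txt
aws s3 rm s3://my-mender-artifacts/test.txt

All of these worked without any credential errors.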
One side issue of note is that I found I was running out of space on the EC2 instance. I created the instance using the minimal specs listed in the Mender documentation, and I'm seeing a lot of pods evicted and node disk pressure notifications. I tried extending my EC2 instance to 15 GB but I'm still running into the same issues. I'm not sure whether these issues are occurring because I used minimal specs when creating the EC2 instance or whether they're just a symptom of my overall installation struggles. Any help or insight on where to look to troubleshoot my issues would be appreciated!
Hi @Ben , Mender can run on any K8s setup, including Minikube, K3s, KinD, and also managed ones like Amazon EKS.
There are a lot of potential issues with a self-hosted setup on a single EC2 machine, so it's hard to troubleshoot your specific case. May I suggest a step-by-step approach? For example, start with a K3s setup on your local workstation with Minio, then move to EC2, then to S3, then work on the Ingress, and so on.
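Getting a bare K3s cluster up for that first step is just the standard K3s installer, nothing Mender-specific:

curl -sfL https://get.k3s.io | sh -
export KUBECONFIG=/etc/rancher/k3s/k3s.yaml
kubectl get nodes

Once the node reports Ready, you can layer cert-manager and the Mender Helm chart on top of it and iterate from there.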
Hi Rob,
I recognize that is a sensible approach, but I can't find a lot of tutorials that break the process down into smaller steps. I am running on K3s (as per the Mender documentation), and I think I was able to follow the tutorial successfully until I tried to run the helm upgrade --install command. I've tried to troubleshoot using the kubectl logs and kubectl describe commands, and they both seem to suggest an issue with my connection to my Mongo database. The helm install itself quits with an error for mender-db-data-migration. On the plus side, I was able to resolve the space issues (by adding more space to the EC2 instance; not exactly rocket science, but nice to have those errors vanish at least). I'm not sure what a reasonable next step will be, so I've been trying to spend as much time as I can working on other projects. I'm perfectly happy with the hosted Mender and hoping that my company will just decide to stick with that. I do appreciate the help though!
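For reference, the commands I've been poking around with look roughly like this (the job and pod names are just what I see on my cluster, so they may differ elsewhere):

kubectl get pods --all-namespaces
kubectl describe job mender-db-data-migration
kubectl logs job/mender-db-data-migration

The logs from that job are what point at the Mongo connection.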
Hi @Ben ,
what is the error from mender-db-data-migration? Also, could you please share the values file, without any secrets?
Thanks
Hi @robgio
This is the error I received from the mender-db-data-migration job:
Defaulted container "deployments-migration" out of: deployments-migration, device-auth-migration, deviceconfig-migration, deviceconnect-migration, inventory-migration, useradm-migration, workflows-server-migration, iot-manager-migration
time="2023-10-04T15:47:57Z" level=warning msg="'presign.secret' not configured. Generating a random secret." file=config.go func=config.Setup line=238
failed to connect to db: Error reaching mongo server: connection() error occurred during connection handshake: auth error: unable to authenticate using mechanism "SCRAM-SHA-256": (AuthenticationFailed) Authentication failed.
I think the values file refers to the yaml file that I used for the helm upgrade; sorry if I'm wrong there, but that is the file I've been manipulating most of the time.
I should mention that the mongodb URL and nats URL aren't filled in for this example. This is how I configured the yaml file for this attempt (I think that should set them to a default value), though I have also tried putting in "mongodb://mender-mongo:27017" and "mongodb://mongo-deployments:27017" as explicit values. The environment variables for the rootPassword and replicaSetKey were set immediately prior to attempting the helm install using export MONGODB_ROOT_PASSWORD=$(pwgen 32 1) and export MONGODB_REPLICA_SET_KEY=$(pwgen 32 1). I verified that the environment variables were being set; I got reasonable-seeming values when I echoed them myself.
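To illustrate, the Mongo-related part of my yaml file looks roughly like this (I'm paraphrasing from memory and from the docs template, so don't take the key names as verbatim):

global:
  mongodb:
    URL: ""
  nats:
    URL: ""
mongodb:
  enabled: true
  auth:
    enabled: true
    rootPassword: ${MONGODB_ROOT_PASSWORD}
    replicaSetKey: ${MONGODB_REPLICA_SET_KEY}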
Thanks for the review–let me know if I’m providing the right information here.
Hi @Ben ,
environment variables inside the values file won't work. In that example they are expanded within the cat command, so the resulting values file contains the actual content of the environment variables, not the ${...} references.
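For example, something along these lines (the filename is arbitrary), so that the shell substitutes the variables at the moment the file is written:

export MONGODB_ROOT_PASSWORD=$(pwgen 32 1)
export MONGODB_REPLICA_SET_KEY=$(pwgen 32 1)
cat > mender-values.yml <<EOF
mongodb:
  auth:
    rootPassword: ${MONGODB_ROOT_PASSWORD}
    replicaSetKey: ${MONGODB_REPLICA_SET_KEY}
EOF
grep rootPassword mender-values.yml    # should print the generated password, not the variable name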
You could verify the created secret with a command like this one:
kubectl get secret mongodb-common -o jsonpath='{.data.MONGO_URL}' | base64 -d
Also, you should check whether MongoDB itself is running fine.
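The pod name below is just a guess based on a default release name, so check the first command's output and adjust:

kubectl get pods | grep mongo
kubectl logs mender-mongodb-0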
I did not realize that; that is really helpful. I had just been setting the environment variables separately, before the cat command. I'll have to update my .yml file and see if that fixes anything for me.