Hello everyone,
We are trying to deploy Mender on Kubernetes, following the guidelines from 3.2: Production installation with Kubernetes | Mender documentation.
Everything works except for three Mender pods:
- workflows-server
- workflows-worker
- create-artifact-worker
All three are stuck in a CrashLoopBackOff.
❯ kubectl get pods --namespace application-peripherals
NAME                                      READY   STATUS             RESTARTS          AGE
api-gateway-756685fdc9-vj7rd              1/1     Running            0                 22h
cert-manager-54b9fc686-hbc4x              1/1     Running            0                 22h
cert-manager-cainjector-89487b959-8x9n6   1/1     Running            0                 22h
cert-manager-webhook-85f96c57dd-2nhvm     1/1     Running            0                 22h
create-artifact-worker-6676fd594-z7rqb    0/1     CrashLoopBackOff   247 (70s ago)     21h
deployments-88d4d87-66c2x                 0/1     Running            23 (20h ago)      22h
device-auth-77ffc8688c-nf7lt              0/1     Running            0                 22h
deviceconfig-7ccbfb857d-fk5pc             1/1     Running            0                 22h
deviceconnect-5468dd6c54-qw4mh            1/1     Running            0                 21h
gui-7b6988cb96-xwp9n                      1/1     Running            0                 22h
inventory-7454868b78-dmlm4                1/1     Running            0                 22h
iot-manager-5465779b4-w5zkg               1/1     Running            0                 22h
minio-operator-6c984995c9-lldss           1/1     Running            0                 22h
minio-operator-console-9d9cbbcc8-flbmf    1/1     Running            0                 22h
minio-ss-0-0                              1/1     Running            0                 22h
minio-ss-0-1                              1/1     Running            0                 22h
mongodb-0                                 1/1     Running            0                 22h
mongodb-arbiter-0                         1/1     Running            0                 22h
nats-0                                    3/3     Running            0                 22h
nats-box-67786894bd-hszrk                 1/1     Running            0                 22h
useradm-65db46c846-xjz59                  1/1     Running            0                 22h
workflows-server-db8fd468d-mb8w7          0/1     CrashLoopBackOff   254 (2m14s ago)   21h
workflows-worker-8657585498-7tcr2         0/1     CrashLoopBackOff   247 (71s ago)     21h
When we check the logs, we see the following:
create-artifact-worker-6676fd594-z7rqb
❯ kubectl logs create-artifact-worker-6676fd594-z7rqb --namespace application-peripherals
time="2022-02-02T09:39:11Z" level=info msg="migrating workflows" file=entry.go func="logrus.(*Entry).Infof" line=351
time="2022-02-02T09:39:11Z" level=info msg="migration to version 1.0.0 skipped" db=workflows file=entry.go func="logrus.(*Entry).Infof" line=351
time="2022-02-02T09:39:11Z" level=info msg="DB migrated to version 1.0.0" db=workflows file=entry.go func="logrus.(*Entry).Infof" line=351
2022/02/02 09:39:16 context deadline exceeded
workflows-server-db8fd468d-mb8w7
❯ kubectl logs workflows-server-db8fd468d-mb8w7 --namespace application-peripherals
time="2022-02-02T09:38:22Z" level=info msg="migrating workflows" file=entry.go func="logrus.(*Entry).Infof" line=351
time="2022-02-02T09:38:22Z" level=info msg="migration to version 1.0.0 skipped" db=workflows file=entry.go func="logrus.(*Entry).Infof" line=351
time="2022-02-02T09:38:22Z" level=info msg="DB migrated to version 1.0.0" db=workflows file=entry.go func="logrus.(*Entry).Infof" line=351
2022/02/02 09:38:27 context deadline exceeded
workflows-worker-8657585498-7tcr2
❯ kubectl logs workflows-worker-8657585498-7tcr2 --namespace application-peripherals
time="2022-02-02T09:39:16Z" level=info msg="migrating workflows" file=entry.go func="logrus.(*Entry).Infof" line=351
time="2022-02-02T09:39:16Z" level=info msg="migration to version 1.0.0 skipped" db=workflows file=entry.go func="logrus.(*Entry).Infof" line=351
time="2022-02-02T09:39:16Z" level=info msg="DB migrated to version 1.0.0" db=workflows file=entry.go func="logrus.(*Entry).Infof" line=351
2022/02/02 09:39:21 context deadline exceeded
The 'context deadline exceeded' message is Go's generic context-timeout error: presumably a call to one of the backing services (MongoDB, NATS, or the S3/MinIO storage) is not answered within its deadline, but the log doesn't say which one, which makes the error logs ambiguous.
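For reference, a sketch of the standard pod-inspection commands that should surface more detail than the container log alone, using the create-artifact-worker pod name from the listing above (the same applies to the two workflows pods):

❯ kubectl describe pod create-artifact-worker-6676fd594-z7rqb --namespace application-peripherals
❯ kubectl logs create-artifact-worker-6676fd594-z7rqb --namespace application-peripherals --previous
❯ kubectl get events --namespace application-peripherals --sort-by=.lastTimestamp

describe shows the container's last termination reason and exit code, --previous prints the log of the previous (crashed) container instance, and the event list often names a failing readiness/liveness probe.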
Deviations we have from the installation documentation (Production installation with Kubernetes | Mender documentation):
- We don't use AWS-hosted Kubernetes; we run a bare-metal cluster
- We use ingress-nginx as the reverse proxy and for TLS termination
- We use kube-flannel for networking
- We use MinIO for S3-compatible storage (deployed as described in the Mender documentation)
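Since our networking stack deviates from the documented setup, ruling out in-cluster DNS and connectivity problems seems worthwhile. A sketch using a throwaway debug pod; the netshoot image and the Service names (nats, mongodb) are assumptions based on the pod names above, so adjust them to the actual Services in the namespace:

❯ kubectl run net-test --rm -it --restart=Never --image=nicolaka/netshoot --namespace application-peripherals -- sh
# inside the debug pod: check DNS resolution and TCP reachability of the backends
nslookup nats.application-peripherals.svc.cluster.local
nc -zv nats.application-peripherals.svc.cluster.local 4222
nslookup mongodb.application-peripherals.svc.cluster.local
nc -zv mongodb.application-peripherals.svc.cluster.local 27017

If any of these hang or time out, the 'context deadline exceeded' errors would point at the network layer rather than at Mender itself.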
Questions:
- What can we do to debug the 'context deadline exceeded' errors?
- It isn't mentioned in the documentation: do we have to create the MinIO bucket ourselves, or does Mender take care of this? (See the bucket sketch after this list.)
- Is the nats://nats:4222 connection string mentioned in the documentation correct? Don't we have to use the internal DNS name of the NATS service, e.g. nats://nats-0.nats.application-peripherals.svc.cluster.local:4222? (See the DNS check after this list.)
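Regarding the bucket question: in case the bucket does have to exist up front, a sketch of creating it manually with the MinIO client; the alias, endpoint, credentials, and the bucket name mender-artifacts are placeholders and must match whatever is configured in the Helm values:

❯ mc alias set mender-minio http://minio.application-peripherals.svc.cluster.local:9000 <ACCESS_KEY> <SECRET_KEY>
❯ mc mb mender-minio/mender-artifacts
❯ mc ls mender-minio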
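Regarding the NATS connection string: a sketch to check which names actually resolve from inside the namespace, using one-off busybox pods. If the nats Service lives in the same namespace as the workflows pods, the short name from the documentation should resolve; the per-pod headless name should only be needed when addressing an individual pod directly:

❯ kubectl run dns-test --rm -it --restart=Never --image=busybox:1.36 --namespace application-peripherals -- nslookup nats
❯ kubectl run dns-test --rm -it --restart=Never --image=busybox:1.36 --namespace application-peripherals -- nslookup nats-0.nats.application-peripherals.svc.cluster.local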