Mender Enterprise

Hello,

I need some help. I’m trying to install Mender Enterprise using Helm, but I’m facing a lot of issues. The pods below are failing with these errors:

auditlogs
level=error byteswritten=208 clientip=10.244.0.129 error=“useradm service unhealthy: error checking useradm liveliness: Get “http://mender-useradm:8080/api/internal/v1/useradm/alive”: context canceled” file=middleware_gin.go func=accesslog.Middleware.func1 line=153 method=GET path=/api/internal/v1/auditlogs/health qs= request_id=c9f6cd69-b787-4f44-912c-59042b423d42 responsetime=1000295us status=503 ts=“2021-12-16T18:46:26Z” type=HTTP/1.1 useragent=kube-probe/1.21

create-artifact-worker
time=“2021-12-16T18:46:36Z” level=info msg=“migrating workflows” file=entry.go func=“logrus.(*Entry).Infof” line=351
time=“2021-12-16T18:46:36Z” level=info msg=“migration to version 1.0.0 skipped” db=workflows file=entry.go func=“logrus.(*Entry).Infof” line=351
time=“2021-12-16T18:46:36Z” level=info msg=“DB migrated to version 1.0.0” db=workflows file=entry.go func=“logrus.(*Entry).Infof” line=351
2021/12/16 18:46:46 nats: no responders available for request

deployments
time=“2021-12-16T18:45:38Z” level=info msg=“migration to version 1.2.6 skipped” db=deployment_service file=migrator_simple.go func=“migrate.(*SimpleMigrator).Apply” line=125
time=“2021-12-16T18:45:38Z” level=info msg=“migration to version 1.2.7 skipped” db=deployment_service file=migrator_simple.go func=“migrate.(*SimpleMigrator).Apply” line=125
time=“2021-12-16T18:45:38Z” level=info msg=“DB migrated to version 1.2.7” db=deployment_service file=migrator_simple.go func=“migrate.(*SimpleMigrator).Apply” line=140
RequestError: send request failed
caused by: Put “https:///mender-artifact-storage”: http: no Host in request URL

device-auth
level=error msg=“Workflows service unhealthy: Get “http://mender-workflows-server:8080/api/v1/health”: context canceled” file=response_helpers.go func=rest_utils.restErrWithLogMsg line=110 request_id=1ff967fc-d283-488a-88b9-a78b45796184

tenantadm
level=error msg=“Workflows not healthy: Get “http://mender-workflows-server:8080/api/v1/health”: context canceled” file=response_helpers.go func=rest_utils.restErrWithLogMsg line=110 request_id=a813cd9c-9062-425d-8e7e-8f8cab9a6b89

useradm
level=error msg=“Workflows service unhealthy: Get “http://mender-workflows-server:8080/api/v1/health”: context canceled” file=response_helpers.go func=rest_utils.restErrWithLogMsg line=110 request_id=55a6e083-184f-4cb0-9b9a-1da5e6d14e68

workflows-server
time=“2021-12-16T18:52:17Z” level=info msg=“migrating workflows” file=entry.go func=“logrus.(*Entry).Infof” line=351
time=“2021-12-16T18:52:17Z” level=info msg=“migration to version 1.0.0 skipped” db=workflows file=entry.go func=“logrus.(*Entry).Infof” line=351
time=“2021-12-16T18:52:17Z” level=info msg=“DB migrated to version 1.0.0” db=workflows file=entry.go func=“logrus.(*Entry).Infof” line=351
2021/12/16 18:52:27 nats: no responders available for request

workflows-worker
time=“2021-12-16T18:51:53Z” level=info msg=“migrating workflows” file=entry.go func=“logrus.(*Entry).Infof” line=351
time=“2021-12-16T18:51:53Z” level=info msg=“migration to version 1.0.0 skipped” db=workflows file=entry.go func=“logrus.(*Entry).Infof” line=351
time=“2021-12-16T18:51:53Z” level=info msg=“DB migrated to version 1.0.0” db=workflows file=entry.go func=“logrus.(*Entry).Infof” line=351
2021/12/16 18:52:03 nats: no responders available for request

Does anybody have any idea what is causing this?

Thanks

From the logs, it seems that your nats deployment is missing, or that the name of the service doesn’t match the configuration of Mender.

See: GitHub - mendersoftware/mender-helm: Mender Helm charts

By default, Mender will try to contact nats on nats://nats:4222.
Does the Kubernetes service nats exist?
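One way to answer that question is sketched below. It assumes that `kubectl` is pointed at the cluster and that NATS was installed in the `default` namespace; the `NAMESPACE` variable and the function names are illustrative and are not part of Mender or the NATS chart.

```shell
#!/bin/sh
# Assumption: NATS lives in the "default" namespace; override NAMESPACE otherwise.
NAMESPACE="${NAMESPACE:-default}"

# Print the fully qualified in-cluster URL for a service's client port.
# With "nats" and "default" this is the long form of nats://nats:4222,
# assuming the usual cluster domain cluster.local.
nats_url() {
  printf 'nats://%s.%s.svc.cluster.local:4222\n' "$1" "$2"
}

# Show the Service and its Endpoints. An Endpoints object without addresses
# means that no ready pod is backing the service.
check_nats_service() {
  kubectl get service nats --namespace "$NAMESPACE" &&
    kubectl get endpoints nats --namespace "$NAMESPACE"
}

# Usage:
#   check_nats_service
#   nats_url nats "$NAMESPACE"
```

If the service is missing, or lives in a different namespace than the Mender pods, the short name `nats` will not resolve from those pods.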

Thanks for your response. This is the nats service:
nats ClusterIP None 4222/TCP,6222/TCP,8222/TCP,7777/TCP,7422/TCP,7522/TCP 16h

Or in detail:
Name: nats
Namespace: default
Labels: app.kubernetes.io/instance=nats
app.kubernetes.io/managed-by=Helm
app.kubernetes.io/name=nats
app.kubernetes.io/version=2.3.1
helm.sh/chart=nats-0.8.2
Annotations: meta.helm.sh/release-name: nats
meta.helm.sh/release-namespace: default
Selector: app.kubernetes.io/instance=nats,app.kubernetes.io/name=nats
Type: ClusterIP
IP: None
Port: client 4222/TCP
TargetPort: 4222/TCP
Endpoints: 10.244.0.23:4222
Port: cluster 6222/TCP
TargetPort: 6222/TCP
Endpoints: 10.244.0.23:6222
Port: monitor 8222/TCP
TargetPort: 8222/TCP
Endpoints: 10.244.0.23:8222
Port: metrics 7777/TCP
TargetPort: 7777/TCP
Endpoints: 10.244.0.23:7777
Port: leafnodes 7422/TCP
TargetPort: 7422/TCP
Endpoints: 10.244.0.23:7422
Port: gateways 7522/TCP
TargetPort: 7522/TCP
Endpoints: 10.244.0.23:7522
Session Affinity: None
Events:

Deployment:
nats-box 1/1 1 1 16h

sts:
nats 1/1 16h

Nats pods:

nats-0 3/3 Running 0 16h
nats-box-56c684cb47-5jhmd 1/1 Running 0 16h

Can you share the logs of the nats-0 pod? Can you enter one of the Mender pods (e.g. workflows-server) and check if it can contact the nats service?
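Both checks could be run roughly as follows. This is a sketch that assumes `kubectl` access, the `default` namespace, a pod named `nats-0` with a container named `nats`, and that the cluster is allowed to pull the public `busybox` image. The pod name `nats-probe` and the function names are hypothetical, and the last usage line assumes that the NATS monitoring port 8222 is exposed by the service.

```shell
#!/bin/sh
NAMESPACE="${NAMESPACE:-default}"

# Run one command in a throwaway busybox pod that is deleted afterwards.
run_probe() {
  kubectl run nats-probe --namespace "$NAMESPACE" --rm -i --restart=Never \
    --image=busybox -- "$@"
}

# Print the log of the nats container inside the nats-0 pod.
nats_logs() {
  kubectl logs nats-0 --namespace "$NAMESPACE" -c nats
}

# Usage:
#   nats_logs
#   run_probe nslookup nats
#   run_probe wget -q -O - http://nats:8222/varz
```

A throwaway pod is useful when the Mender pods themselves are crash-looping and cannot be entered.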

When I try from another pod:

/ # nslookup nats
Server: 10.96.5.5
Address: 10.96.5.5:53

** server can’t find nats.cluster.local: NXDOMAIN

Name: nats.default.svc.cluster.local
Address: 10.244.0.23

Address 10.244.0.23 is the IP of the nats pod:
nats-0 3/3 Running 0 17h 10.244.0.23

logs nats-0 -c nats
[8] 2021/12/16 16:55:37.324053 [INF] Starting nats-server
[8] 2021/12/16 16:55:37.324107 [INF] Version: 2.3.1
[8] 2021/12/16 16:55:37.324112 [INF] Git: [907fef4]
[8] 2021/12/16 16:55:37.324117 [INF] Name: nats-0
[8] 2021/12/16 16:55:37.324122 [INF] ID: NCYIJ2CVZXOUPPY4RAJCBJ7LWVLXR3SWHRQ3M7ASARVS57ZBP3EGLQCS
[8] 2021/12/16 16:55:37.324130 [INF] Using configuration file: /etc/nats-config/nats.conf
[8] 2021/12/16 16:55:37.327283 [INF] Starting http monitor on 0.0.0.0:8222
[8] 2021/12/16 16:55:37.327400 [INF] Listening for client connections on 0.0.0.0:4222
[8] 2021/12/16 16:55:37.327959 [INF] Server is ready
[8] 2021/12/16 16:55:37.328018 [INF] Cluster name is mxVDWJxhEdLQ7WGVuaxYsB
[8] 2021/12/16 16:55:37.328023 [WRN] Cluster name was dynamically generated, consider setting one
[8] 2021/12/16 16:55:37.328081 [INF] Listening for route connections on 0.0.0.0:6222

logs nats-0 -c reloader
2021/12/16 16:55:37 Starting NATS Server Reloader v0.6.1
2021/12/16 16:55:37 Live, ready to kick pid 8 (live, from 8 spec) based on any of 1 files

logs nats-0 -c metrics
[34] 2021/12/16 16:55:37.691826 [INF] Prometheus exporter listening at http://0.0.0.0:7777/metrics

I can’t exec into the workflows-server pod because it is at 0/1 CrashLoopBackOff 181:

api-gateway-78774fccf7-ztg6f 1/1 Running 0 15h
auditlogs-d66bcf696-cx75f 0/1 Running 0 15h
azure-iot-manager-57485586cd-zdr2z 1/1 Running 0 15h
cert-manager-5d7f97b46d-wsns5 1/1 Running 0 16h
cert-manager-cainjector-69d885bf55-xtlbt 1/1 Running 0 16h
cert-manager-webhook-f697cc96d-mnvm2 1/1 Running 0 16h
create-artifact-worker-5767df7f9b-rqx74 0/1 CrashLoopBackOff 176 15h
deployments-86d55f4fb6-2cvm5 0/1 CrashLoopBackOff 181 15h
device-auth-76cdfb4f57-s998m 0/1 Running 0 15h
deviceconfig-5986746bf7-g42q7 1/1 Running 0 15h
deviceconnect-69f44658-lzqt4 1/1 Running 0 15h
devicemonitor-7dcdc7656d-bvpln 1/1 Running 0 15h
gui-5c57b7869c-9rscn 1/1 Running 0 15h
inventory-585dfbd7b6-kcxrv 1/1 Running 0 15h
minio-operator-7d6dbc7b58-57f62 1/1 Running 0 16h
minio-operator-console-7f9489b7c4-2bpgt 1/1 Running 0 16h
minio-ss-0-0 1/1 Running 0 15h
minio-ss-0-1 1/1 Running 0 15h
mongodb-0 1/1 Running 0 16h
mongodb-arbiter-0 1/1 Running 0 16h
nats-0 3/3 Running 0 16h
nats-box-56c684cb47-5jhmd 1/1 Running 0 16h
tenantadm-589dc668db-pb824 0/1 Running 0 15h
useradm-796859cf54-4qjxk 0/1 Running 0 15h
workflows-server-7b9476885f-wqhm6 0/1 CrashLoopBackOff 186 15h
workflows-worker-6f756b9f94-tkr4d 0/1 CrashLoopBackOff 176 15h
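Since a crash-looping container cannot be entered, its last output is usually the best evidence available. The sketch below lists the pods whose READY count is incomplete and prints the log of each one's previous container run. It assumes the default column layout of `kubectl get pods` (NAME, READY, STATUS, ...) and the `default` namespace; the function names are illustrative.

```shell
#!/bin/sh
NAMESPACE="${NAMESPACE:-default}"

# Read `kubectl get pods` output on stdin and print the name of every pod
# whose READY column (ready/total) shows fewer ready containers than total.
# Lines without a ready/total value, such as the header, are skipped.
not_ready_pods() {
  awk '$2 ~ /^[0-9]+\/[0-9]+$/ { split($2, r, "/"); if (r[1] != r[2]) print $1 }'
}

# For each such pod, print the tail of the previous (crashed) container log.
# Pods that have never restarted have no previous log, so fall back to the
# current one.
previous_logs() {
  kubectl get pods --namespace "$NAMESPACE" | not_ready_pods |
    while read -r pod; do
      echo "== $pod"
      kubectl logs "$pod" --namespace "$NAMESPACE" --previous --tail=20 ||
        kubectl logs "$pod" --namespace "$NAMESPACE" --tail=20
    done
}

# Usage:
#   previous_logs
```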

Thanks @tranchitella, maybe mender-helm/README.md at master · mendersoftware/mender-helm · GitHub should be checked. Following the official doc, Production installation with Kubernetes | Mender documentation, and making some small changes during the deployment solved the problems I was facing. Have a nice day :)

Hi @kubernetes, could you share what changes you made to solve the problem? I’m facing the same issue installing on Kubernetes by following the official installation guide. I’m getting nats: no responders available for request on:
workflows-worker
workflows-server
create-artifact-worker
deployments

thanks!

@kingman which version of Mender are you trying to install?

The latest, 3.2.

Hmm, I have installed 3.1. Anyway, check the nats logs, and also try to contact the nats service from one of the Mender pods. If you still have issues, please share the files and the commands you used to deploy with helm. I can share mine too if you want :)

Using the following command to install nats instead seems to have resolved it for me:
helm install nats nats/nats --version 0.8.2 --set "nats.image=nats:2.6.5-alpine" --set "nats.jetstream.enabled=true"

thanks!
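In the command above, the setting that most likely matters is `nats.jetstream.enabled=true`: the nats-0 log posted earlier in the thread contains no JetStream line, and a client that sends a JetStream request to a server without JetStream can fail with `no responders available for request`. A minimal sketch for confirming the setting after reinstalling follows. It assumes that nats-server mentions JetStream in its startup log when the feature is enabled, and that the pod is named `nats-0` in the `default` namespace; the function names are hypothetical.

```shell
#!/bin/sh
NAMESPACE="${NAMESPACE:-default}"

# Succeed if the nats-server log read from stdin mentions JetStream.
# Assumption: a server started with JetStream enabled logs that at startup.
log_mentions_jetstream() {
  grep -qi 'jetstream'
}

# Report whether the running nats-0 server looks JetStream-enabled.
check_jetstream() {
  if kubectl logs nats-0 --namespace "$NAMESPACE" -c nats | log_mentions_jetstream; then
    echo "JetStream appears to be enabled"
  else
    echo "no JetStream line in the nats-0 log"
  fi
}

# Usage:
#   check_jetstream
```

If the second message is printed, the Mender workflows pods will probably keep crash-looping until NATS is reinstalled with JetStream enabled.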