Mender Enterprise

kubernetes · December 16, 2021, 6:55pm

Hello,

I need some help. I’m trying to install Mender Enterprise using Helm, but I’m facing a lot of issues. The pods below are reporting errors:

auditlogs
level=error byteswritten=208 clientip=10.244.0.129 error=“useradm service unhealthy: error checking useradm liveliness: Get “http://mender-useradm:8080/api/internal/v1/useradm/alive”: context canceled” file=middleware_gin.go func=accesslog.Middleware.func1 line=153 method=GET path=/api/internal/v1/auditlogs/health qs= request_id=c9f6cd69-b787-4f44-912c-59042b423d42 responsetime=1000295us status=503 ts=“2021-12-16T18:46:26Z” type=HTTP/1.1 useragent=kube-probe/1.21

create-artifact-worker
time=“2021-12-16T18:46:36Z” level=info msg=“migrating workflows” file=entry.go func=“logrus.(*Entry).Infof” line=351
time=“2021-12-16T18:46:36Z” level=info msg=“migration to version 1.0.0 skipped” db=workflows file=entry.go func=“logrus.(*Entry).Infof” line=351
time=“2021-12-16T18:46:36Z” level=info msg=“DB migrated to version 1.0.0” db=workflows file=entry.go func=“logrus.(*Entry).Infof” line=351
2021/12/16 18:46:46 nats: no responders available for request

deployments
time=“2021-12-16T18:45:38Z” level=info msg=“migration to version 1.2.6 skipped” db=deployment_service file=migrator_simple.go func=“migrate.(*SimpleMigrator).Apply” line=125
time=“2021-12-16T18:45:38Z” level=info msg=“migration to version 1.2.7 skipped” db=deployment_service file=migrator_simple.go func=“migrate.(*SimpleMigrator).Apply” line=125
time=“2021-12-16T18:45:38Z” level=info msg=“DB migrated to version 1.2.7” db=deployment_service file=migrator_simple.go func=“migrate.(*SimpleMigrator).Apply” line=140
RequestError: send request failed
caused by: Put “https:///mender-artifact-storage”: http: no Host in request URL

device-auth
level=error msg=“Workflows service unhealthy: Get “http://mender-workflows-server:8080/api/v1/health”: context canceled” file=response_helpers.go func=rest_utils.restErrWithLogMsg line=110 request_id=1ff967fc-d283-488a-88b9-a78b45796184

tenantadm
level=error msg=“Workflows not healthy: Get “http://mender-workflows-server:8080/api/v1/health”: context canceled” file=response_helpers.go func=rest_utils.restErrWithLogMsg line=110 request_id=a813cd9c-9062-425d-8e7e-8f8cab9a6b89

useradm
level=error msg=“Workflows service unhealthy: Get “http://mender-workflows-server:8080/api/v1/health”: context canceled” file=response_helpers.go func=rest_utils.restErrWithLogMsg line=110 request_id=55a6e083-184f-4cb0-9b9a-1da5e6d14e68

workflows-server
time=“2021-12-16T18:52:17Z” level=info msg=“migrating workflows” file=entry.go func=“logrus.(*Entry).Infof” line=351
time=“2021-12-16T18:52:17Z” level=info msg=“migration to version 1.0.0 skipped” db=workflows file=entry.go func=“logrus.(*Entry).Infof” line=351
time=“2021-12-16T18:52:17Z” level=info msg=“DB migrated to version 1.0.0” db=workflows file=entry.go func=“logrus.(*Entry).Infof” line=351
2021/12/16 18:52:27 nats: no responders available for request

workflows-worker
time=“2021-12-16T18:51:53Z” level=info msg=“migrating workflows” file=entry.go func=“logrus.(*Entry).Infof” line=351
time=“2021-12-16T18:51:53Z” level=info msg=“migration to version 1.0.0 skipped” db=workflows file=entry.go func=“logrus.(*Entry).Infof” line=351
time=“2021-12-16T18:51:53Z” level=info msg=“DB migrated to version 1.0.0” db=workflows file=entry.go func=“logrus.(*Entry).Infof” line=351
2021/12/16 18:52:03 nats: no responders available for request

Does anybody have any idea what is causing this?

Thanks

tranchitella · December 17, 2021, 7:42am

It seems from the logs your nats deployment is not there, or the name of the service doesn’t match the configuration of Mender.

See: GitHub - mendersoftware/mender-helm: Mender Helm charts

By default, Mender will try to contact nats on nats://nats:4222.
Does the Kubernetes service nats exist?

kubernetes · December 17, 2021, 9:02am

Thanks for your response. This is the nats service:
nats ClusterIP None 4222/TCP,6222/TCP,8222/TCP,7777/TCP,7422/TCP,7522/TCP 16h

Or in detail:
Name: nats
Namespace: default
Labels: app.kubernetes.io/instance=nats
app.kubernetes.io/managed-by=Helm
app.kubernetes.io/name=nats
app.kubernetes.io/version=2.3.1
helm.sh/chart=nats-0.8.2
Annotations: meta.helm.sh/release-name: nats
meta.helm.sh/release-namespace: default
Selector: app.kubernetes.io/instance=nats,app.kubernetes.io/name=nats
Type: ClusterIP
IP: None
Port: client 4222/TCP
TargetPort: 4222/TCP
Endpoints: 10.244.0.23:4222
Port: cluster 6222/TCP
TargetPort: 6222/TCP
Endpoints: 10.244.0.23:6222
Port: monitor 8222/TCP
TargetPort: 8222/TCP
Endpoints: 10.244.0.23:8222
Port: metrics 7777/TCP
TargetPort: 7777/TCP
Endpoints: 10.244.0.23:7777
Port: leafnodes 7422/TCP
TargetPort: 7422/TCP
Endpoints: 10.244.0.23:7422
Port: gateways 7522/TCP
TargetPort: 7522/TCP
Endpoints: 10.244.0.23:7522
Session Affinity: None
Events:

Deployment:
nats-box 1/1 1 1 16h

sts:
nats 1/1 16h

Nats pods:

nats-0 3/3 Running 0 16h
nats-box-56c684cb47-5jhmd 1/1 Running 0 16h

tranchitella · December 17, 2021, 9:18am

Can you share the logs of the nats-0 pod? Can you enter one of the Mender pods (e.g. workflows-server) and check if it can contact the nats service?
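
A minimal sketch of that connectivity check, assuming Python is available in the pod you test from (the Mender images may not ship it, so a debug pod in the same namespace may be needed). The helper name `check_endpoint` is hypothetical; `nats` and `4222` are the defaults mentioned above. It resolves the service name and then attempts a plain TCP connection, so a DNS failure and a refused connection are reported separately.

```python
import socket


def check_endpoint(host: str, port: int, timeout: float = 3.0) -> dict:
    """Resolve `host`, then try a TCP connection to `host:port`."""
    result = {
        "host": host,
        "port": port,
        "addresses": [],
        "connected": False,
        "error": None,
    }
    try:
        # Step 1: DNS resolution (what nslookup would tell you).
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
        result["addresses"] = sorted({info[4][0] for info in infos})
    except socket.gaierror as exc:
        result["error"] = f"DNS lookup failed: {exc}"
        return result
    try:
        # Step 2: TCP handshake only; no NATS protocol traffic is sent.
        with socket.create_connection((host, port), timeout=timeout):
            result["connected"] = True
    except OSError as exc:
        result["error"] = f"TCP connect failed: {exc}"
    return result


# Example, run from a pod in the same namespace as the nats service:
# print(check_endpoint("nats", 4222))
```

If `connected` comes back true, the service name and port are reachable, and the problem is more likely in the NATS server configuration than in networking.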

kubernetes · December 17, 2021, 9:27am

When I try from another pod:

/ # nslookup nats
Server: 10.96.5.5
Address: 10.96.5.5:53

** server can’t find nats.cluster.local: NXDOMAIN

Name: nats.default.svc.cluster.local
Address: 10.244.0.23

The address 10.244.0.23 is the IP of the nats pod:
nats-0 3/3 Running 0 17h 10.244.0.23

logs nats-0 -c nats
[8] 2021/12/16 16:55:37.324053 [INF] Starting nats-server
[8] 2021/12/16 16:55:37.324107 [INF] Version: 2.3.1
[8] 2021/12/16 16:55:37.324112 [INF] Git: [907fef4]
[8] 2021/12/16 16:55:37.324117 [INF] Name: nats-0
[8] 2021/12/16 16:55:37.324122 [INF] ID: NCYIJ2CVZXOUPPY4RAJCBJ7LWVLXR3SWHRQ3M7ASARVS57ZBP3EGLQCS
[8] 2021/12/16 16:55:37.324130 [INF] Using configuration file: /etc/nats-config/nats.conf
[8] 2021/12/16 16:55:37.327283 [INF] Starting http monitor on 0.0.0.0:8222
[8] 2021/12/16 16:55:37.327400 [INF] Listening for client connections on 0.0.0.0:4222
[8] 2021/12/16 16:55:37.327959 [INF] Server is ready
[8] 2021/12/16 16:55:37.328018 [INF] Cluster name is mxVDWJxhEdLQ7WGVuaxYsB
[8] 2021/12/16 16:55:37.328023 [WRN] Cluster name was dynamically generated, consider setting one
[8] 2021/12/16 16:55:37.328081 [INF] Listening for route connections on 0.0.0.0:6222

logs nats-0 -c reloader
2021/12/16 16:55:37 Starting NATS Server Reloader v0.6.1
2021/12/16 16:55:37 Live, ready to kick pid 8 (live, from 8 spec) based on any of 1 files

logs nats-0 -c metrics
[34] 2021/12/16 16:55:37.691826 [INF] Prometheus exporter listening at http://0.0.0.0:7777/metrics

I can’t get a shell in the workflows-server pod because it is in CrashLoopBackOff (0/1 ready, 181 restarts).

api-gateway-78774fccf7-ztg6f 1/1 Running 0 15h
auditlogs-d66bcf696-cx75f 0/1 Running 0 15h
azure-iot-manager-57485586cd-zdr2z 1/1 Running 0 15h
cert-manager-5d7f97b46d-wsns5 1/1 Running 0 16h
cert-manager-cainjector-69d885bf55-xtlbt 1/1 Running 0 16h
cert-manager-webhook-f697cc96d-mnvm2 1/1 Running 0 16h
create-artifact-worker-5767df7f9b-rqx74 0/1 CrashLoopBackOff 176 15h
deployments-86d55f4fb6-2cvm5 0/1 CrashLoopBackOff 181 15h
device-auth-76cdfb4f57-s998m 0/1 Running 0 15h
deviceconfig-5986746bf7-g42q7 1/1 Running 0 15h
deviceconnect-69f44658-lzqt4 1/1 Running 0 15h
devicemonitor-7dcdc7656d-bvpln 1/1 Running 0 15h
gui-5c57b7869c-9rscn 1/1 Running 0 15h
inventory-585dfbd7b6-kcxrv 1/1 Running 0 15h
minio-operator-7d6dbc7b58-57f62 1/1 Running 0 16h
minio-operator-console-7f9489b7c4-2bpgt 1/1 Running 0 16h
minio-ss-0-0 1/1 Running 0 15h
minio-ss-0-1 1/1 Running 0 15h
mongodb-0 1/1 Running 0 16h
mongodb-arbiter-0 1/1 Running 0 16h
nats-0 3/3 Running 0 16h
nats-box-56c684cb47-5jhmd 1/1 Running 0 16h
tenantadm-589dc668db-pb824 0/1 Running 0 15h
useradm-796859cf54-4qjxk 0/1 Running 0 15h
workflows-server-7b9476885f-wqhm6 0/1 CrashLoopBackOff 186 15h
workflows-worker-6f756b9f94-tkr4d 0/1 CrashLoopBackOff 176 15h

kubernetes · December 17, 2021, 1:06pm

Thanks @tranchitella. Maybe the README (mender-helm/README.md at master · mendersoftware/mender-helm · GitHub) should be reviewed. Following the official documentation (Production installation with Kubernetes | Mender documentation) and making some small changes during the deployment solved the problems I was facing. Have a nice day.

kingman · February 10, 2022, 12:29pm

Hi @kubernetes, could you share what changes you made to solve the problem? I’m facing the same issue installing on Kubernetes by following the official installation guide. I’m getting “nats: no responders available for request” on:
workflows-worker
workflows-server
create-artifact-worker
deployments

thanks!

kubernetes · February 10, 2022, 3:29pm

@kingman which version of Mender are you trying to install?

kingman · February 10, 2022, 4:10pm

The latest, 3.2.

kubernetes · February 11, 2022, 7:59am

Hmm, I installed 3.1. Anyway, check the nats logs, and also try to contact the nats service from one of the Mender pods. If you still have issues, please share the files and the commands you used to deploy with Helm. I can share mine too if you want.

kingman · February 15, 2022, 11:43pm

Using the following command to install nats instead seems to have resolved it for me:
helm install nats nats/nats --version 0.8.2 --set "nats.image=nats:2.6.5-alpine" --set "nats.jetstream.enabled=true"

thanks!
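
The command above sets a newer NATS image and `nats.jetstream.enabled=true`. A plausible reading (an assumption, not confirmed in this thread) is that the workflows services depend on JetStream, and a NATS server without it leaves their requests unanswered, hence `no responders available`. One way to check a running server is to read the `INFO` line NATS sends as soon as a client connects; it includes a `jetstream` field when JetStream is on. A minimal Python sketch, where all helper names are hypothetical:

```python
import json
import socket


def parse_info_line(line: str) -> dict:
    """Parse the 'INFO {...}' line a NATS server sends on connect."""
    verb, _, payload = line.strip().partition(" ")
    if verb.upper() != "INFO":
        raise ValueError(f"expected an INFO line, got: {line!r}")
    return json.loads(payload)


def read_server_info(host: str, port: int = 4222, timeout: float = 3.0) -> dict:
    """Connect to a NATS server and return its parsed INFO payload."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        # NATS protocol lines end with CRLF; read exactly the first one.
        with sock.makefile("r", encoding="utf-8", newline="\r\n") as stream:
            return parse_info_line(stream.readline())


def jetstream_enabled(info: dict) -> bool:
    """Treat a missing 'jetstream' field as JetStream being off."""
    return bool(info.get("jetstream", False))


# Example, run from a pod in the same namespace as the nats service:
# info = read_server_info("nats")
# print(info.get("version"), jetstream_enabled(info))
```

If this reports `False` for the server your Mender pods talk to, reinstalling nats with JetStream enabled, as in the command above, is worth trying.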
