Hi,
I have been experiencing a new error in the last few weeks which I haven’t seen before.
Some information about my setup
- I have around 120 IoT devices connected to https://hosted.mender.io/
- I have an automated process that accepts them once they are in a
Pending
state - Using this automated process I decommission devices quite often, mainly for development purposes
The problem is that suddenly, when I decommission few devices, one of them is not getting back for hours, maybe days without a reason other than 401 Unauthorized 401
.
Notes regarding tests
- The token is 100% valid,
- This device was already connected to Mender and our other devices are configured exactly the same. At some point the device will recover and authenticate without any human intervention
Problem and logs
- Running dbus seems weird:
sudo dbus-send \
--system \
--print-reply \
--dest=io.mender.AuthenticationManager \
/io/mender/AuthenticationManager \
io.mender.Authentication1.GetJwtToken
method return time=1699578277.798716 sender=:1.8382 -> destination=:1.8601 serial=523 reply_serial=2
string ""
string ""
- When this happens, something is not working with Mender’s localhost:
sudo netstat -nlp | grep mender | wc -l
0
While on a working device it is:
sudo netstat -nlp | grep mender
tcp 0 0 127.0.0.1:44003 0.0.0.0:* LISTEN 1098026/mender
Logs
I have added DaemonLogLevel: "debug"
to /etc/mender/mender.conf. This seems to happen randomly when we decommission a device:
mender-client
Nov 09 16:29:41 48-b0-2d-eb-08-0f mender[4176123]: time="2023-11-09T16:29:41Z" level=info msg="Wait 30s before next inventory update attempt in 59.999998496s"
Nov 09 16:29:41 48-b0-2d-eb-08-0f mender[4176123]: time="2023-11-09T16:29:41Z" level=warning msg="Failed to refresh inventory: failed to submit inventory data: inventory submit failed: transient error: authorization request failed"
Nov 09 16:29:41 48-b0-2d-eb-08-0f mender[4176123]: time="2023-11-09T16:29:41Z" level=error msg="inventory submit failed: transient error: authorization request failed"
Nov 09 16:29:41 48-b0-2d-eb-08-0f mender[4176123]: time="2023-11-09T16:29:41Z" level=debug msg="Inventory update sent, response <nil>"
Nov 09 16:29:41 48-b0-2d-eb-08-0f mender[4176123]: time="2023-11-09T16:29:41Z" level=error msg="Failed to submit inventory data: transient error: authorization request failed"
Nov 09 16:29:41 48-b0-2d-eb-08-0f mender[4176123]: time="2023-11-09T16:29:41Z" level=warning msg="Reauthorization failed with error: transient error: authorization request failed"
Nov 09 16:29:41 48-b0-2d-eb-08-0f mender[4176123]: time="2023-11-09T16:29:41Z" level=error msg="Failed to authorize with \"[https://hosted.mender.io\":](https://hosted.mender.io/%22:) authentication request rejected server error message: dev auth: unauthorized"
Nov 09 16:29:41 48-b0-2d-eb-08-0f mender[4176123]: time="2023-11-09T16:29:41Z" level=debug msg="Got response: &{401 Unauthorized 401 HTTP/1.1 1 1 map[Connection:[keep-alive] Content-Length:[86] Content-Type:[application/json; charset=utf-8] Date:[Thu, 09 Nov 2023 16:29:41 GMT] Referrer-Policy:[no-referrer] Strict-Transport-Security:[max-age=31536000; includeSubDomains; preload] X-Content-Type-Options:[nosniff] X-Men-Requestid:[5c063b0d-9e1c-4ec7-9ffe-2bd2034a68bb] X-Xss-Protection:[1; mode=block]] 0x40000ca4a0 86 [] false false map[] 0x4000110300 <nil>}"
Nov 09 16:29:41 48-b0-2d-eb-08-0f mender[4176123]: time="2023-11-09T16:29:41Z" level=debug msg="Making an authorization request () to server [https://hosted.mender.io](https://hosted.mender.io/)"
Nov 09 16:29:41 48-b0-2d-eb-08-0f mender[4176123]: time="2023-11-09T16:29:41Z" level=debug msg="Authorization data: {{\"mac\":\"48:b0:2d:eb:08:0f\"} REDACTED\n-----END PUBLIC KEY-----\n}"
Nov 09 16:29:41 48-b0-2d-eb-08-0f mender[4176123]: time="2023-11-09T16:29:41Z" level=debug msg="Tenant token: REDACTED"
Nov 09 16:29:41 48-b0-2d-eb-08-0f mender[4176123]: time="2023-11-09T16:29:41Z" level=info msg="Output (stderr) from command \"/usr/share/mender/identity/mender-device-identity\": using interface /sys/class/net/eth0"
Nov 09 16:29:41 48-b0-2d-eb-08-0f mender[4176123]: time="2023-11-09T16:29:41Z" level=debug msg="received the FETCH_AUTH_TOKEN action"
Nov 09 16:29:41 48-b0-2d-eb-08-0f mender[4176123]: time="2023-11-09T16:29:41Z" level=info msg="Device unauthorized; attempting reauthorization"
Nov 09 16:29:41 48-b0-2d-eb-08-0f mender[4176123]: time="2023-11-09T16:29:41Z" level=debug msg="Current manifest data: default"
Nov 09 16:29:41 48-b0-2d-eb-08-0f mender[4176123]: time="2023-11-09T16:29:41Z" level=debug msg="device_type=default"
Nov 09 16:29:41 48-b0-2d-eb-08-0f mender[4176123]: time="2023-11-09T16:29:41Z" level=debug msg="Reading data from the device manifest file: /var/lib/mender/device_type"
Nov 09 16:29:41 48-b0-2d-eb-08-0f mender[4176123]: time="2023-11-09T16:29:41Z" level=warning msg="Inventory tool /usr/share/mender/inventory/mender-inventory-mender-configure wait failed: exit status 1"
mender-connect
sudo /usr/bin/mender-connect -dd daemon 00:47:18
INFO[2023-11-10T00:47:27Z] Loaded configuration file: /etc/mender/mender-connect.conf
WARN[2023-11-10T00:47:27Z] ShellArguments is empty, defaulting to [--login]
TRAC[2023-11-10T00:47:27Z] daemon Run starting
TRAC[2023-11-10T00:47:27Z] mender-connect connecting to dbus
TRAC[2023-11-10T00:47:27Z] mender-connect entering main loop.
TRAC[2023-11-10T00:47:27Z] messageLoop: starting
TRAC[2023-11-10T00:47:27Z] messageLoop: posting event: reconnect-req; waiting for response
TRAC[2023-11-10T00:47:27Z] needsReconnect: got event: reconnect-req
TRAC[2023-11-10T00:47:27Z] dbusEventLoop: daemon needs to reconnect
TRAC[2023-11-10T00:47:37Z] dbusEventLoop: WaitForJwtTokenStateChange [], err timeout waiting for signal JwtTokenStateChange
TRAC[2023-11-10T00:47:45Z] dbusEventLoop: WaitForJwtTokenStateChange [{s } {s }], err <nil>
TRAC[2023-11-10T00:47:56Z] dbusEventLoop: WaitForJwtTokenStateChange [], err timeout waiting for signal JwtTokenStateChange
TRAC[2023-11-10T00:48:07Z] dbusEventLoop: WaitForJwtTokenStateChange [], err timeout waiting for signal JwtTokenStateChange
TRAC[2023-11-10T00:48:15Z] dbusEventLoop: WaitForJwtTokenStateChange [{s } {s }], err <nil>
TRAC[2023-11-10T00:48:16Z] dbusEventLoop: WaitForJwtTokenStateChange [{s } {s }], err <nil>
TRAC[2023-11-10T00:48:27Z] dbusEventLoop: WaitForJwtTokenStateChange [], err timeout waiting for signal JwtTokenStateChange
TRAC[2023-11-10T00:48:38Z] dbusEventLoop: WaitForJwtTokenStateChange [], err timeout waiting for signal JwtTokenStateChange
TRAC[2023-11-10T00:48:46Z] dbusEventLoop: WaitForJwtTokenStateChange [{s } {s }], err <nil>
TRAC[2023-11-10T00:48:55Z] dbusEventLoop: WaitForJwtTokenStateChange [{s } {s }], err <nil>
TRAC[2023-11-10T00:48:56Z] dbusEventLoop: WaitForJwtTokenStateChange [{s } {s }], err <nil>
TRAC[2023-11-10T00:49:07Z] dbusEventLoop: WaitForJwtTokenStateChange [], err timeout waiting for signal JwtTokenStateChange
Files
cat /etc/mender/mender-connect.conf
{
"ShellCommand": "/bin/bash",
"User": "nobody"
}
cat /etc/mender/mender.conf
{
"DaemonLogLevel": "debug",
"HttpsClient": {},
"Security": {},
"Connectivity": {},
"DeviceTypeFile": "/var/lib/mender/device_type",
"DBus": {
"Enabled": true
},
"UpdateControlMapExpirationTimeSeconds": 90,
"UpdateControlMapBootExpirationTimeSeconds": 45,
"UpdatePollIntervalSeconds": 5,
"InventoryPollIntervalSeconds": 5,
"RetryPollIntervalSeconds": 30,
"TenantToken": "REDACTED",
"Servers": [
{
"ServerURL": "https://hosted.mender.io"
}
]
}
versions
mender-connect --version
mender-connect version 2.1.0 runtime: go1.17.13
mender --version
3.5.1 runtime: go1.17.13
Questions
- Is it possible that the device is a ghost on the hosted mender server? (Something like Recovering/Reauthorizing Decommissioned Device on Hosted Mender - #2 by mirzak)
- Is there anything else that I can investigate?
- Is there a way to fix this issue?
Thank you
Shaked