Cannot add device after decommissioning a device on hosted mender

Hi,

I have been experiencing a new error in the last few weeks which I haven’t seen before.

Some information about my setup

  • I have around 120 IoT devices connected to https://hosted.mender.io/
  • I have an automated process that accepts them once they are in a Pending state
  • Using this automated process I decommission devices quite often, mainly for development purposes

The problem is that suddenly, when I decommission few devices, one of them is not getting back for hours, maybe days without a reason other than 401 Unauthorized 401.

Notes regarding tests

  • The token is 100% valid,
  • This device was already connected to Mender and our other devices are configured exactly the same. At some point the device will recover and authenticate without any human intervention

Problem and logs

  • Running dbus seems weird:
sudo dbus-send \
    --system \
    --print-reply \
    --dest=io.mender.AuthenticationManager \
    /io/mender/AuthenticationManager \
    io.mender.Authentication1.GetJwtToken

method return time=1699578277.798716 sender=:1.8382 -> destination=:1.8601 serial=523 reply_serial=2
   string ""
   string ""
  • When this happens, something is not working with Mender’s localhost:
sudo netstat -nlp | grep mender | wc -l
0

While on a working device it is:

sudo netstat -nlp | grep mender

tcp        0      0 127.0.0.1:44003         0.0.0.0:*               LISTEN      1098026/mender

Logs

I have added DaemonLogLevel: "debug" to /etc/mender/mender.conf. This seems to happen randomly when we decommission a device:

mender-client
Nov 09 16:29:41 48-b0-2d-eb-08-0f mender[4176123]: time="2023-11-09T16:29:41Z" level=info msg="Wait 30s before next inventory update attempt in 59.999998496s"
Nov 09 16:29:41 48-b0-2d-eb-08-0f mender[4176123]: time="2023-11-09T16:29:41Z" level=warning msg="Failed to refresh inventory: failed to submit inventory data: inventory submit failed: transient error: authorization request failed"
Nov 09 16:29:41 48-b0-2d-eb-08-0f mender[4176123]: time="2023-11-09T16:29:41Z" level=error msg="inventory submit failed: transient error: authorization request failed"
Nov 09 16:29:41 48-b0-2d-eb-08-0f mender[4176123]: time="2023-11-09T16:29:41Z" level=debug msg="Inventory update sent, response <nil>"
Nov 09 16:29:41 48-b0-2d-eb-08-0f mender[4176123]: time="2023-11-09T16:29:41Z" level=error msg="Failed to submit inventory data: transient error: authorization request failed"
Nov 09 16:29:41 48-b0-2d-eb-08-0f mender[4176123]: time="2023-11-09T16:29:41Z" level=warning msg="Reauthorization failed with error: transient error: authorization request failed"
Nov 09 16:29:41 48-b0-2d-eb-08-0f mender[4176123]: time="2023-11-09T16:29:41Z" level=error msg="Failed to authorize with \"[https://hosted.mender.io\":](https://hosted.mender.io/%22:) authentication request rejected server error message: dev auth: unauthorized"
Nov 09 16:29:41 48-b0-2d-eb-08-0f mender[4176123]: time="2023-11-09T16:29:41Z" level=debug msg="Got response: &{401 Unauthorized 401 HTTP/1.1 1 1 map[Connection:[keep-alive] Content-Length:[86] Content-Type:[application/json; charset=utf-8] Date:[Thu, 09 Nov 2023 16:29:41 GMT] Referrer-Policy:[no-referrer] Strict-Transport-Security:[max-age=31536000; includeSubDomains; preload] X-Content-Type-Options:[nosniff] X-Men-Requestid:[5c063b0d-9e1c-4ec7-9ffe-2bd2034a68bb] X-Xss-Protection:[1; mode=block]] 0x40000ca4a0 86 [] false false map[] 0x4000110300 <nil>}"
Nov 09 16:29:41 48-b0-2d-eb-08-0f mender[4176123]: time="2023-11-09T16:29:41Z" level=debug msg="Making an authorization request () to server [https://hosted.mender.io](https://hosted.mender.io/)"
Nov 09 16:29:41 48-b0-2d-eb-08-0f mender[4176123]: time="2023-11-09T16:29:41Z" level=debug msg="Authorization data: {{\"mac\":\"48:b0:2d:eb:08:0f\"} REDACTED\n-----END PUBLIC KEY-----\n}"
Nov 09 16:29:41 48-b0-2d-eb-08-0f mender[4176123]: time="2023-11-09T16:29:41Z" level=debug msg="Tenant token: REDACTED"
Nov 09 16:29:41 48-b0-2d-eb-08-0f mender[4176123]: time="2023-11-09T16:29:41Z" level=info msg="Output (stderr) from command \"/usr/share/mender/identity/mender-device-identity\": using interface /sys/class/net/eth0"
Nov 09 16:29:41 48-b0-2d-eb-08-0f mender[4176123]: time="2023-11-09T16:29:41Z" level=debug msg="received the FETCH_AUTH_TOKEN action"
Nov 09 16:29:41 48-b0-2d-eb-08-0f mender[4176123]: time="2023-11-09T16:29:41Z" level=info msg="Device unauthorized; attempting reauthorization"
Nov 09 16:29:41 48-b0-2d-eb-08-0f mender[4176123]: time="2023-11-09T16:29:41Z" level=debug msg="Current manifest data: default"
Nov 09 16:29:41 48-b0-2d-eb-08-0f mender[4176123]: time="2023-11-09T16:29:41Z" level=debug msg="device_type=default"
Nov 09 16:29:41 48-b0-2d-eb-08-0f mender[4176123]: time="2023-11-09T16:29:41Z" level=debug msg="Reading data from the device manifest file: /var/lib/mender/device_type"
Nov 09 16:29:41 48-b0-2d-eb-08-0f mender[4176123]: time="2023-11-09T16:29:41Z" level=warning msg="Inventory tool /usr/share/mender/inventory/mender-inventory-mender-configure wait failed: exit status 1"
mender-connect
sudo /usr/bin/mender-connect -dd daemon                                                                                                    00:47:18
INFO[2023-11-10T00:47:27Z] Loaded configuration file: /etc/mender/mender-connect.conf
WARN[2023-11-10T00:47:27Z] ShellArguments is empty, defaulting to [--login]
TRAC[2023-11-10T00:47:27Z] daemon Run starting
TRAC[2023-11-10T00:47:27Z] mender-connect connecting to dbus
TRAC[2023-11-10T00:47:27Z] mender-connect entering main loop.
TRAC[2023-11-10T00:47:27Z] messageLoop: starting
TRAC[2023-11-10T00:47:27Z] messageLoop: posting event: reconnect-req; waiting for response
TRAC[2023-11-10T00:47:27Z] needsReconnect: got event: reconnect-req
TRAC[2023-11-10T00:47:27Z] dbusEventLoop: daemon needs to reconnect
TRAC[2023-11-10T00:47:37Z] dbusEventLoop: WaitForJwtTokenStateChange [], err timeout waiting for signal JwtTokenStateChange
TRAC[2023-11-10T00:47:45Z] dbusEventLoop: WaitForJwtTokenStateChange [{s } {s }], err <nil>
TRAC[2023-11-10T00:47:56Z] dbusEventLoop: WaitForJwtTokenStateChange [], err timeout waiting for signal JwtTokenStateChange
TRAC[2023-11-10T00:48:07Z] dbusEventLoop: WaitForJwtTokenStateChange [], err timeout waiting for signal JwtTokenStateChange
TRAC[2023-11-10T00:48:15Z] dbusEventLoop: WaitForJwtTokenStateChange [{s } {s }], err <nil>
TRAC[2023-11-10T00:48:16Z] dbusEventLoop: WaitForJwtTokenStateChange [{s } {s }], err <nil>
TRAC[2023-11-10T00:48:27Z] dbusEventLoop: WaitForJwtTokenStateChange [], err timeout waiting for signal JwtTokenStateChange
TRAC[2023-11-10T00:48:38Z] dbusEventLoop: WaitForJwtTokenStateChange [], err timeout waiting for signal JwtTokenStateChange
TRAC[2023-11-10T00:48:46Z] dbusEventLoop: WaitForJwtTokenStateChange [{s } {s }], err <nil>
TRAC[2023-11-10T00:48:55Z] dbusEventLoop: WaitForJwtTokenStateChange [{s } {s }], err <nil>
TRAC[2023-11-10T00:48:56Z] dbusEventLoop: WaitForJwtTokenStateChange [{s } {s }], err <nil>
TRAC[2023-11-10T00:49:07Z] dbusEventLoop: WaitForJwtTokenStateChange [], err timeout waiting for signal JwtTokenStateChange

Files

cat /etc/mender/mender-connect.conf
{
  "ShellCommand": "/bin/bash",
  "User": "nobody"
}
cat /etc/mender/mender.conf
{
    "DaemonLogLevel": "debug",
    "HttpsClient": {},
    "Security": {},
    "Connectivity": {},
    "DeviceTypeFile": "/var/lib/mender/device_type",
    "DBus": {
        "Enabled": true
    },
    "UpdateControlMapExpirationTimeSeconds": 90,
    "UpdateControlMapBootExpirationTimeSeconds": 45,
    "UpdatePollIntervalSeconds": 5,
    "InventoryPollIntervalSeconds": 5,
    "RetryPollIntervalSeconds": 30,
    "TenantToken": "REDACTED",
    "Servers": [
        {
            "ServerURL": "https://hosted.mender.io"
        }
    ]
}
versions
mender-connect --version
mender-connect version 2.1.0        runtime: go1.17.13

mender --version
3.5.1       runtime: go1.17.13

Questions

Thank you
Shaked

Hi @Shaked,

This seems to be related to a caching phenomenon that we are currently looking into. We’re working on it and will release a fix as soon as possible. In the meanwhile, no device should take longer than 24h to reappear.

Greetz,
Josef

1 Like

Hi Josef,

Thank you for the explanation.

Is there a place where I could get updates regarding these issues/fixes? Maybe subscribe to a Github/JIRA board?

Hi @TheYoctoJester

Any idea if this bug has been fixed?

I am also facing the same issue when i try to reconnect a decommissioned device to the mender.

The requested URL returned error: 401 Unauthorized

ERROR: Cannot get mender-monitor from https://downloads.customer.mender.io/content/hosted/mender-monitor/debian/1.3.0/mender-monitor_1.3.0-1+debian+bullseye_all.deb

Can you suggest any solution?

1 Like