Systemd mender-client.service restart option

Hi,

On our device we saw an error from mender-client (output of systemctl status mender-client):

● mender-client.service - Mender OTA update service
     Loaded: loaded (/lib/systemd/system/mender-client.service; enabled; vendor preset: enabled)
     Active: failed (Result: exit-code) since Tue 2022-11-22 11:07:41 UTC; 1 day 5h ago
    Process: 387 ExecStart=/usr/bin/mender daemon (code=exited, status=1/FAILURE)
   Main PID: 387 (code=exited, status=1/FAILURE)

Nov 22 11:05:36 hostname mender[387]: time="2022-11-22T11:05:36Z" level=info msg="State transition: idle [Idle] -> check-wait [Idle]"
Nov 22 11:06:33 hostname mender[387]: time="2022-11-22T11:06:33Z" level=error msg="couldn't dial to remote backend url \"wss://OUR_MENDER_URL/api/devices/v1/deviceconnect/connect\", err: websocket: bad handshake"
Nov 22 11:07:36 hostname mender[387]: time="2022-11-22T11:07:36Z" level=info msg="State transition: check-wait [Idle] -> update-check [Sync]"
Nov 22 11:07:36 hostname mender[387]: time="2022-11-22T11:07:36Z" level=info msg="Device unauthorized; attempting reauthorization"
Nov 22 11:07:36 hostname mender[387]: time="2022-11-22T11:07:36Z" level=info msg="Output (stderr) from command \"/usr/share/mender/identity/mender-device-identity\": using interface /sys/class/net/eth0"
Nov 22 11:07:36 hostname mender[387]: time="2022-11-22T11:07:36Z" level=info msg="successfully received new authorization data from server https://OUR_MENDER_URL"
Nov 22 11:07:36 hostname mender[387]: time="2022-11-22T11:07:36Z" level=info msg="Local proxy stopped"
Nov 22 11:07:41 hostname mender[387]: time="2022-11-22T11:07:41Z" level=fatal msg="Proxy Shutdown failed: context deadline exceeded\n"
Nov 22 11:07:41 hostname systemd[1]: mender-client.service: Main process exited, code=exited, status=1/FAILURE
Nov 22 11:07:41 hostname systemd[1]: mender-client.service: Failed with result 'exit-code'.

Looking at the service file, we saw that the Restart option is set to on-abort (reference: mender-client.service on the dunfell branch of mendersoftware/meta-mender on GitHub).

Question: is there a reason for on-abort? Why not use on-failure, perhaps combined with RestartSec=10s to avoid running into a tight restart loop? (The options are documented in the systemd.service man page.)
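
For illustration, what I have in mind is a drop-in along these lines. This is only a sketch: the file name restart.conf is my own choice, and I am assuming the unit name mender-client.service from the status output above. As far as I understand, on-abort only restarts the service after an uncaught signal, which would explain why the exit with status 1 above was not followed by a restart, whereas on-failure also covers non-zero exit codes.

```ini
# /etc/systemd/system/mender-client.service.d/restart.conf
# (hypothetical drop-in file name; the directory name must match the unit name)
[Service]
# Restart on non-zero exit codes, uncaught signals and timeouts
Restart=on-failure
# Wait 10 seconds between restart attempts to avoid a tight restart loop
RestartSec=10s
```

After adding the file, running systemctl daemon-reload followed by systemctl restart mender-client should make systemd pick up the override.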

Thanks and best regards
Ruben

Hi @ruben,

This has been recently addressed here, so my guess is that you are not running the latest release of the client?

Greetz,
Josef

Hi @TheYoctoJester,

We are using Mender client 3.4, built from meta-mender on dunfell (mendersoftware/meta-mender at b3a10de4a3e5332d91fdea6169db3b0c2eb3f3ee). I don’t think this patch is in there yet.

Just to clarify: as far as I understand, when building with Yocto, the systemd service file from meta-mender is used (link in my first post). Is this correct? If so, should the same fix be made there as well?

Cheers
Ruben

Hi @ruben,

It seems that on dunfell the version from meta-mender is used, whereas on kirkstone the version from the Mender client sources is applied. This is unfortunate in my opinion; I’ll try to clear things up next week. Thanks for reporting!

Greetz,
Josef

Hi @TheYoctoJester,

Any update on this?

Thanks
Ruben

Hi @ruben,

Yes and no:

  • the layer-supplied version is actually not used anywhere; it is just a remnant that has not been cleaned up in all branches yet
  • the file that comes with Mender releases 3.3.1 and 3.4.0 is the on-abort version.

So if you’d like the Restart=always version from the repo right now, you have to build the git version. Otherwise, it will come in automatically with the upcoming 3.5 release.

Greetz,
Josef