@kacf, thanks for a prompt response
Let me elaborate on the steps we tried to remedy the issue.
First, we tried to re-deploy the same artifact with a dir
module - it was rejected as it was already installed
. I am unsure why it was considered to be already installed
as current device state was 4.1.0-INCONSISTENT
while 4.1.0
is expected.
Then we tried to re-deploy the same dir
module artifact with a Force update
ticked - this successfully re-applied the artifact.
Then, to be 100% safe, we deployed current A/B artifact (again, with Force update
ticked as it was technically installing the same thing it had before the update that brought the device into an INCONSISTENT
state) to trigger a switch to a different partition. This successfully reverted a device back to a fully operational state but with the app being outdated (got 4.0.0
, latest 4.1.0
).
Then, I have created 2 new individual device deployments (one deployment for a “rescued” device, another deployment for a newly commissioned device just to see if it was a fluke) with the same release via the dir
module - both of them completed successfully now.
The only difference between a case now and before:
- Now it was 2 individual device deployments created. Both deployments succeeded.
- Before it was a deployment created for a device group, containing a single device. This deployment failed.
- Device that failed to apply/install a deployment was up for ~3-4 weeks before receiving a deployment. After a reboot, all the deployments for that device were succeeding so far.
Please find below the logs from the 2 deployments that succeeded:
Device deployment log 1
{"level":"info","message":"Running Mender client version: 3.5.2","timestamp":"2024-02-13T16:17:45Z"}
{"level":"info","message":"State transition: update-fetch [Download_Enter] -\u003e update-store [Download_Enter]","timestamp":"2024-02-13T16:17:45Z"}
{"level":"info","message":"Installer: authenticated digital signature of artifact","timestamp":"2024-02-13T16:17:46Z"}
{"level":"info","message":"State transition: update-store [Download_Enter] -\u003e update-after-store [Download_Leave]","timestamp":"2024-02-13T16:17:49Z"}
{"level":"info","message":"State transition: update-after-store [Download_Leave] -\u003e mender-update-control-refresh-maps [none]","timestamp":"2024-02-13T16:17:49Z"}
{"level":"info","message":"State transition: mender-update-control-refresh-maps [none] -\u003e mender-update-control [none]","timestamp":"2024-02-13T16:17:49Z"}
{"level":"info","message":"State transition: mender-update-control [none] -\u003e update-install [ArtifactInstall]","timestamp":"2024-02-13T16:17:49Z"}
{"level":"info","message":"Executing script: ArtifactInstall_Enter_00_retry_later","timestamp":"2024-02-13T16:17:49Z"}
{"level":"info","message":"Collected output (stderr) while running script /var/lib/mender/scripts/ArtifactInstall_Enter_00_retry_later\nrootfs-2024.01.1: Running ArtifactInstall_Enter_00_retry_later\nOutside of update window.\n<redacted> App shutdown signal file NOT found.\nPostponing to retry update install process later.\n\n---------- end of script output","timestamp":"2024-02-13T16:17:50Z"}
{"level":"info","message":"statescript: ArtifactInstall_Enter_00_retry_later requested a retry","timestamp":"2024-02-13T16:17:50Z"}
{"level":"info","message":"Collected output (stderr) while running script /var/lib/mender/scripts/ArtifactInstall_Enter_00_retry_later\nrootfs-2024.01.1: Running ArtifactInstall_Enter_00_retry_later\nOutside of update window.\n<redacted> App shutdown signal file NOT found.\nPostponing to retry update install process later.\n\n---------- end of script output","timestamp":"2024-02-13T16:18:51Z"}
{"level":"info","message":"statescript: ArtifactInstall_Enter_00_retry_later requested a retry","timestamp":"2024-02-13T16:18:51Z"}
{"level":"info","message":"Collected output (stderr) while running script /var/lib/mender/scripts/ArtifactInstall_Enter_00_retry_later\nrootfs-2024.01.1: Running ArtifactInstall_Enter_00_retry_later\nOutside of update window.\n<redacted> App shutdown signal file NOT found.\nPostponing to retry update install process later.\n\n---------- end of script output","timestamp":"2024-02-13T16:19:52Z"}
{"level":"info","message":"statescript: ArtifactInstall_Enter_00_retry_later requested a retry","timestamp":"2024-02-13T16:19:52Z"}
.
.
.
.
<Approx. 500 retry later requests truncated>
.
.
.
.
.
.
{"level":"info","message":"Collected output (stderr) while running script /var/lib/mender/scripts/ArtifactInstall_Enter_00_retry_later\nrootfs-2024.01.1: Running ArtifactInstall_Enter_00_retry_later\nOutside of update window.\n<redacted> App shutdown signal file NOT found.\nPostponing to retry update install process later.\n\n---------- end of script output","timestamp":"2024-02-14T01:58:55Z"}
{"level":"info","message":"statescript: ArtifactInstall_Enter_00_retry_later requested a retry","timestamp":"2024-02-14T01:58:55Z"}
{"level":"info","message":"Collected output (stderr) while running script /var/lib/mender/scripts/ArtifactInstall_Enter_00_retry_later\nrootfs-2024.01.1: Running ArtifactInstall_Enter_00_retry_later\nOutside of update window.\n<redacted> App shutdown signal file NOT found.\nPostponing to retry update install process later.\n\n---------- end of script output","timestamp":"2024-02-14T01:59:56Z"}
{"level":"info","message":"statescript: ArtifactInstall_Enter_00_retry_later requested a retry","timestamp":"2024-02-14T01:59:56Z"}
{"level":"info","message":"Output (stdout) from command \"/var/lib/mender/scripts/ArtifactInstall_Enter_00_retry_later\": <redacted>-<redacted>-app: signalled","timestamp":"2024-02-14T02:00:58Z"}
{"level":"info","message":"Collected output (stderr) while running script /var/lib/mender/scripts/ArtifactInstall_Enter_00_retry_later\nrootfs-2024.01.1: Running ArtifactInstall_Enter_00_retry_later\nUpdate window reached.\n<redacted> App shutdown signal file NOT found.\nSending SIGUSR1 to <redacted> <redacted> app.\nSleeping 15 seconds before checking for <redacted> <redacted> App status again within the same script execution.\n<redacted> App shutdown signal file found.\nRemoving <redacted> App termianted and termination pending file.\nContinuing update process.\n\n---------- end of script output","timestamp":"2024-02-14T02:01:13Z"}
{"level":"info","message":"State transition: update-install [ArtifactInstall] -\u003e mender-update-control-refresh-maps [none]","timestamp":"2024-02-14T02:01:16Z"}
{"level":"info","message":"State transition: mender-update-control-refresh-maps [none] -\u003e mender-update-control [none]","timestamp":"2024-02-14T02:01:16Z"}
{"level":"info","message":"State transition: mender-update-control [none] -\u003e update-commit [ArtifactCommit_Enter]","timestamp":"2024-02-14T02:01:16Z"}
{"level":"info","message":"Device unauthorized; attempting reauthorization","timestamp":"2024-02-14T02:01:16Z"}
{"level":"info","message":"Output (stderr) from command \"/usr/share/mender/identity/mender-device-identity\": using interface /sys/class/net/eth0","timestamp":"2024-02-14T02:01:16Z"}
{"level":"info","message":"successfully received new authorization data from server https://eu.hosted.mender.io","timestamp":"2024-02-14T02:01:16Z"}
{"level":"info","message":"shutting down websocket connections","timestamp":"2024-02-14T02:01:16Z"}
{"level":"info","message":"Local proxy stopped","timestamp":"2024-02-14T02:01:16Z"}
{"level":"warning","message":"error while sending close message: websocket: close sent","timestamp":"2024-02-14T02:01:16Z"}
{"level":"error","message":"Proxy Shutdown failed: context canceled\n","timestamp":"2024-02-14T02:01:16Z"}
{"level":"info","message":"Local proxy started","timestamp":"2024-02-14T02:01:16Z"}
{"level":"info","message":"Reauthorization successful","timestamp":"2024-02-14T02:01:16Z"}
{"level":"warning","message":"error while sending close message: websocket: close sent","timestamp":"2024-02-14T02:01:16Z"}
{"level":"info","message":"State transition: update-commit [ArtifactCommit_Enter] -\u003e update-after-first-commit [none]","timestamp":"2024-02-14T02:01:16Z"}
{"level":"info","message":"State transition: update-after-first-commit [none] -\u003e update-after-commit [ArtifactCommit_Leave]","timestamp":"2024-02-14T02:01:16Z"}
{"level":"info","message":"State transition: update-after-commit [ArtifactCommit_Leave] -\u003e cleanup [none]","timestamp":"2024-02-14T02:01:16Z"}
{"level":"info","message":"Executing script: ArtifactCommit_Leave_00_start_<redacted>_<redacted>_app","timestamp":"2024-02-14T02:01:16Z"}
{"level":"info","message":"Output (stdout) from command \"/var/lib/mender/scripts/ArtifactCommit_Leave_00_start_<redacted>_<redacted>_app\": <redacted>-<redacted>-app: started","timestamp":"2024-02-14T02:01:19Z"}
{"level":"info","message":"Collected output (stderr) while running script /var/lib/mender/scripts/ArtifactCommit_Leave_00_start_<redacted>_<redacted>_app\n<redacted>-<redacted>-app-4.1.0: Running ArtifactCommit_Leave_00_start_<redacted>_<redacted>_app\n\n---------- end of script output","timestamp":"2024-02-14T02:01:19Z"}
{"level":"info","message":"State transition: cleanup [none] -\u003e update-status-report [none]","timestamp":"2024-02-14T02:01:19Z"}
Device deployment log 2
{"level":"info","message":"Running Mender client version: 3.5.2","timestamp":"2024-02-13T16:12:53Z"}
{"level":"info","message":"State transition: update-fetch [Download_Enter] -\u003e update-store [Download_Enter]","timestamp":"2024-02-13T16:12:53Z"}
{"level":"info","message":"Installer: authenticated digital signature of artifact","timestamp":"2024-02-13T16:12:53Z"}
{"level":"info","message":"State transition: update-store [Download_Enter] -\u003e update-after-store [Download_Leave]","timestamp":"2024-02-13T16:12:57Z"}
{"level":"info","message":"State transition: update-after-store [Download_Leave] -\u003e mender-update-control-refresh-maps [none]","timestamp":"2024-02-13T16:12:57Z"}
{"level":"info","message":"State transition: mender-update-control-refresh-maps [none] -\u003e mender-update-control [none]","timestamp":"2024-02-13T16:12:57Z"}
{"level":"info","message":"State transition: mender-update-control [none] -\u003e update-install [ArtifactInstall]","timestamp":"2024-02-13T16:12:57Z"}
{"level":"info","message":"Executing script: ArtifactInstall_Enter_00_retry_later","timestamp":"2024-02-13T16:12:57Z"}
{"level":"info","message":"Collected output (stderr) while running script /var/lib/mender/scripts/ArtifactInstall_Enter_00_retry_later\nrootfs-2024.01.1: Running ArtifactInstall_Enter_00_retry_later\nOutside of update window.\n<redacted> App shutdown signal file NOT found.\nPostponing to retry update install process later.\n\n---------- end of script output","timestamp":"2024-02-13T16:12:58Z"}
{"level":"info","message":"statescript: ArtifactInstall_Enter_00_retry_later requested a retry","timestamp":"2024-02-13T16:12:58Z"}
{"level":"info","message":"Collected output (stderr) while running script /var/lib/mender/scripts/ArtifactInstall_Enter_00_retry_later\nrootfs-2024.01.1: Running ArtifactInstall_Enter_00_retry_later\nOutside of update window.\n<redacted> App shutdown signal file NOT found.\nPostponing to retry update install process later.\n\n---------- end of script output","timestamp":"2024-02-13T16:13:59Z"}
{"level":"info","message":"statescript: ArtifactInstall_Enter_00_retry_later requested a retry","timestamp":"2024-02-13T16:13:59Z"}
{"level":"info","message":"Collected output (stderr) while running script /var/lib/mender/scripts/ArtifactInstall_Enter_00_retry_later\nrootfs-2024.01.1: Running ArtifactInstall_Enter_00_retry_later\nOutside of update window.\n<redacted> App shutdown signal file NOT found.\nPostponing to retry update install process later.\n\n---------- end of script output","timestamp":"2024-02-13T16:15:00Z"}
{"level":"info","message":"statescript: ArtifactInstall_Enter_00_retry_later requested a retry","timestamp":"2024-02-13T16:15:00Z"}
.
.
.
.
<Approx. 500 retry later requests truncated>
.
.
.
.
.
.
{"level":"info","message":"Collected output (stderr) while running script /var/lib/mender/scripts/ArtifactInstall_Enter_00_retry_later\nrootfs-2024.01.1: Running ArtifactInstall_Enter_00_retry_later\nOutside of update window.\n<redacted> App shutdown signal file NOT found.\nPostponing to retry update install process later.\n\n---------- end of script output","timestamp":"2024-02-14T01:58:09Z"}
{"level":"info","message":"statescript: ArtifactInstall_Enter_00_retry_later requested a retry","timestamp":"2024-02-14T01:58:09Z"}
{"level":"info","message":"Collected output (stderr) while running script /var/lib/mender/scripts/ArtifactInstall_Enter_00_retry_later\nrootfs-2024.01.1: Running ArtifactInstall_Enter_00_retry_later\nOutside of update window.\n<redacted> App shutdown signal file NOT found.\nPostponing to retry update install process later.\n\n---------- end of script output","timestamp":"2024-02-14T01:59:10Z"}
{"level":"info","message":"statescript: ArtifactInstall_Enter_00_retry_later requested a retry","timestamp":"2024-02-14T01:59:10Z"}
{"level":"info","message":"Output (stdout) from command \"/var/lib/mender/scripts/ArtifactInstall_Enter_00_retry_later\": <redacted>-<redacted>-app: signalled","timestamp":"2024-02-14T02:00:12Z"}
{"level":"info","message":"Collected output (stderr) while running script /var/lib/mender/scripts/ArtifactInstall_Enter_00_retry_later\nrootfs-2024.01.1: Running ArtifactInstall_Enter_00_retry_later\nUpdate window reached.\n<redacted> App shutdown signal file NOT found.\nSending SIGUSR1 to <redacted> <redacted> app.\nSleeping 15 seconds before checking for <redacted> <redacted> App status again within the same script execution.\n<redacted> App shutdown signal file found.\nRemoving <redacted> App termianted and termination pending file.\nContinuing update process.\n\n---------- end of script output","timestamp":"2024-02-14T02:00:27Z"}
{"level":"info","message":"State transition: update-install [ArtifactInstall] -\u003e mender-update-control-refresh-maps [none]","timestamp":"2024-02-14T02:00:30Z"}
{"level":"info","message":"State transition: mender-update-control-refresh-maps [none] -\u003e mender-update-control [none]","timestamp":"2024-02-14T02:00:30Z"}
{"level":"info","message":"State transition: mender-update-control [none] -\u003e update-commit [ArtifactCommit_Enter]","timestamp":"2024-02-14T02:00:30Z"}
{"level":"info","message":"Device unauthorized; attempting reauthorization","timestamp":"2024-02-14T02:00:30Z"}
{"level":"info","message":"Output (stderr) from command \"/usr/share/mender/identity/mender-device-identity\": using interface /sys/class/net/eth0","timestamp":"2024-02-14T02:00:30Z"}
{"level":"info","message":"successfully received new authorization data from server https://eu.hosted.mender.io","timestamp":"2024-02-14T02:00:30Z"}
{"level":"info","message":"shutting down websocket connections","timestamp":"2024-02-14T02:00:30Z"}
{"level":"info","message":"Local proxy stopped","timestamp":"2024-02-14T02:00:30Z"}
{"level":"warning","message":"error while sending close message: websocket: close sent","timestamp":"2024-02-14T02:00:30Z"}
{"level":"error","message":"Proxy Shutdown failed: context canceled\n","timestamp":"2024-02-14T02:00:30Z"}
{"level":"warning","message":"error while sending close message: websocket: close sent","timestamp":"2024-02-14T02:00:30Z"}
{"level":"info","message":"Local proxy started","timestamp":"2024-02-14T02:00:30Z"}
{"level":"info","message":"Reauthorization successful","timestamp":"2024-02-14T02:00:30Z"}
{"level":"info","message":"State transition: update-commit [ArtifactCommit_Enter] -\u003e update-after-first-commit [none]","timestamp":"2024-02-14T02:00:30Z"}
{"level":"info","message":"State transition: update-after-first-commit [none] -\u003e update-after-commit [ArtifactCommit_Leave]","timestamp":"2024-02-14T02:00:30Z"}
{"level":"info","message":"State transition: update-after-commit [ArtifactCommit_Leave] -\u003e cleanup [none]","timestamp":"2024-02-14T02:00:30Z"}
{"level":"info","message":"Executing script: ArtifactCommit_Leave_00_start_<redacted>_<redacted>_app","timestamp":"2024-02-14T02:00:30Z"}
{"level":"info","message":"Output (stdout) from command \"/var/lib/mender/scripts/ArtifactCommit_Leave_00_start_<redacted>_<redacted>_app\": <redacted>-<redacted>-app: started","timestamp":"2024-02-14T02:00:33Z"}
{"level":"info","message":"Collected output (stderr) while running script /var/lib/mender/scripts/ArtifactCommit_Leave_00_start_<redacted>_<redacted>_app\n<redacted>-<redacted>-app-4.1.0: Running ArtifactCommit_Leave_00_start_<redacted>_<redacted>_app\n\n---------- end of script output","timestamp":"2024-02-14T02:00:33Z"}
{"level":"info","message":"State transition: cleanup [none] -\u003e update-status-report [none]","timestamp":"2024-02-14T02:00:33Z"}
It is concerning that the deployment failed for no apparent reason on our first deployment, first device before sending these devices into the field.