Unexpected rollback updating device

Hi all,

We’re getting a rollback too frequently for it to seem correct when updating devices. Given the time between the messages in the quoted log, I’d have to say I’m pretty sure the network is up, especially since it had no problem returning the failure log to the server. Does anybody have any ideas why this might be happening. We’re using the Mender 2.0.1 server and a 1.7.1 client. I’ve attached a log from our failed update.

Blockquote

2019-09-12 12:48:38 +0000 UTC info: Running Mender version 2.0.0
2019-09-12 12:48:38 +0000 UTC debug: handle update fetch state
2019-09-12 12:48:38 +0000 UTC debug: status reported, response 204 No Content
2019-09-12 12:48:38 +0000 UTC debug: Received fetch update response &{200 OK 200 HTTP/1.1 1 1 map[Server:[openresty/1.13.6.2] Last-Modified:[Thu, 12 Sep 2019 12:44:19 GMT] X-Xss-Protection:[1; mode=block] Strict-Transport-Security:[max-age=63072000; includeSubdomains; preload] Content-Length:[149346304] Accept-Ranges:[bytes] Etag:["7fd0983e60dff2c7d83dec84094bb85d"] Vary:[Origin] X-Content-Type-Options:[nosniff] Date:[Thu, 12 Sep 2019 12:48:38 GMT] Content-Security-Policy:[block-all-mixed-content] X-Amz-Request-Id:[15C3B1BC7A07C5E4] X-Frame-Options:[DENY] Content-Type:[application/vnd.mender-artifact] Connection:[keep-alive]] 0xc00033a820 149346304 [] false false map[] 0xc0004a6400 0xc0001f6630}+
2019-09-12 12:48:38 +0000 UTC info: State transition: update-fetch [Download_Enter] -> update-store [Download_Enter]
2019-09-12 12:48:38 +0000 UTC debug: handle update install state
2019-09-12 12:48:38 +0000 UTC debug: status reported, response 204 No Content
2019-09-12 12:48:38 +0000 UTC debug: Read data from device manifest file: device_type=intel-corei7-64
2019-09-12 12:48:38 +0000 UTC debug: Current manifest data: intel-corei7-64
2019-09-12 12:48:38 +0000 UTC info: Update Module path "/usr/share/mender/modules/v3" could not be opened (open /usr/share/mender/modules/v3: no such file or directory). Update modules will not be available
2019-09-12 12:48:38 +0000 UTC debug: checking if device [intel-corei7-64] is on compatibile device list: [intel-corei7-64]
2019-09-12 12:48:38 +0000 UTC debug: installer: processing script: ArtifactReboot_Enter_90_symlink-hostname
2019-09-12 12:48:38 +0000 UTC debug: installer: processing script: ArtifactReboot_Leave_00_fix-hostname
2019-09-12 12:48:38 +0000 UTC debug: installer: processing script: ArtifactReboot_Leave_10_flash-poweruc
2019-09-12 12:48:38 +0000 UTC debug: installer: processing script: ArtifactReboot_Leave_20_update-camera
2019-09-12 12:48:38 +0000 UTC debug: installer: processing script: ArtifactReboot_Leave_80_start-exec
2019-09-12 12:48:38 +0000 UTC debug: installer: successfully read artifact [name: test-image; version: 2; compatible devices: [intel-corei7-64]]
2019-09-12 12:48:38 +0000 UTC debug: Trying to install update of size: 4110417920
2019-09-12 12:48:38 +0000 UTC debug: Active partition: /dev/sda3
2019-09-12 12:48:38 +0000 UTC debug: Detected inactive partition /dev/sda2, based on active partition /dev/sda3
2019-09-12 12:48:38 +0000 UTC info: native sector size of block device /dev/sda2 is 512, we will write in chunks of 1048576
2019-09-12 12:48:38 +0000 UTC info: opening device /dev/sda2 for writing
2019-09-12 12:48:38 +0000 UTC info: partition /dev/sda2 size: 4110417920
2019-09-12 12:52:15 +0000 UTC info: wrote 4110417920/4110417920 bytes of update to device /dev/sda2
2019-09-12 12:52:16 +0000 UTC debug: status reported, response 204 No Content
2019-09-12 12:52:16 +0000 UTC info: State transition: update-store [Download_Enter] -> update-after-store [Download_Leave]
2019-09-12 12:52:16 +0000 UTC info: State transition: update-after-store [Download_Leave] -> update-install [ArtifactInstall]
2019-09-12 12:52:17 +0000 UTC debug: status reported, response 204 No Content
2019-09-12 12:52:17 +0000 UTC debug: Inactive partition: /dev/sda2
2019-09-12 12:52:17 +0000 UTC debug: Marking inactive partition (/dev/sda2) as the new boot candidate.
2019-09-12 12:52:17 +0000 UTC info: Enabling partition with new image installed to be a boot candidate: 2
2019-09-12 12:52:17 +0000 UTC debug: Have U-Boot variable: mender_check_saveenv_canary=
2019-09-12 12:52:17 +0000 UTC debug: List of U-Boot variables:map[mender_check_saveenv_canary:]
2019-09-12 12:52:17 +0000 UTC debug: Marking inactive partition as a boot candidate successful.
2019-09-12 12:52:17 +0000 UTC info: State transition: update-install [ArtifactInstall] -> reboot [ArtifactReboot_Enter]
2019-09-12 12:52:17 +0000 UTC debug: start executing script: ArtifactReboot_Enter_90_symlink-hostname
2019-09-12 12:52:17 +0000 UTC debug: status reported, response 204 No Content
2019-09-12 12:52:17 +0000 UTC debug: status reported, response 204 No Content
2019-09-12 12:52:17 +0000 UTC debug: handling reboot state
2019-09-12 12:52:17 +0000 UTC debug: status reported, response 204 No Content
2019-09-12 12:52:17 +0000 UTC info: rebooting device(s)
2019-09-12 12:52:17 +0000 UTC info: Mender rebooting from active partition: /dev/sda3
2019-09-12 12:52:38 +0000 UTC info: Running Mender version 2.0.0
2019-09-12 12:52:38 +0000 UTC info: Update Module path "/usr/share/mender/modules/v3" could not be opened (open /usr/share/mender/modules/v3: no such file or directory). Update modules will not be available
2019-09-12 12:52:38 +0000 UTC info: State transition: init [none] -> after-reboot [ArtifactReboot_Leave]
2019-09-12 12:52:38 +0000 UTC debug: Have U-Boot variable: mender_check_saveenv_canary=
2019-09-12 12:52:38 +0000 UTC debug: List of U-Boot variables:map[mender_check_saveenv_canary:]
2019-09-12 12:52:38 +0000 UTC debug: Have U-Boot variable: upgrade_available=1
2019-09-12 12:52:38 +0000 UTC debug: List of U-Boot variables:map[upgrade_available:1]
2019-09-12 12:52:38 +0000 UTC info: State transition: after-reboot [ArtifactReboot_Leave] -> after-reboot [ArtifactReboot_Leave]
2019-09-12 12:52:38 +0000 UTC debug: handling state after reboot
2019-09-12 12:52:38 +0000 UTC info: State transition: after-reboot [ArtifactReboot_Leave] -> update-commit [ArtifactCommit_Enter]
2019-09-12 12:52:38 +0000 UTC debug: start executing script: ArtifactReboot_Leave_00_fix-hostname
2019-09-12 12:52:38 +0000 UTC error: failed to report status: Put https://menderserver.no/api/devices/v1/deployments/device/deployments/21db410f-e38b-4b49-a783-1c0bd3a91633/status: dial tcp: lookup [menderserver.no](http://menderserver.no/) on [2001:4860:4860::8888]:53: dial udp [2001:4860:4860::8888]:53: connect: network is unreachable
2019-09-12 12:52:38 +0000 UTC error: statescript: can not send start status to server: reporting status failed: Put https://menderserver.no/api/devices/v1/deployments/device/deployments/21db410f-e38b-4b49-a783-1c0bd3a91633/status: dial tcp: lookup [menderserver.no](http://menderserver.no/) on [2001:4860:4860::8888]:53: dial udp [2001:4860:4860::8888]:53: connect: network is unreachable
2019-09-12 12:52:38 +0000 UTC debug: start executing script: ArtifactReboot_Leave_10_flash-poweruc
2019-09-12 12:52:38 +0000 UTC error: failed to report status: Put https://menderserver.no/api/devices/v1/deployments/device/deployments/21db410f-e38b-4b49-a783-1c0bd3a91633/status: dial tcp: lookup [menderserver.no](http://menderserver.no/) on [2001:4860:4860::8888]:53: dial udp [2001:4860:4860::8888]:53: connect: network is unreachable
2019-09-12 12:52:38 +0000 UTC error: statescript: can not send start status to server: reporting status failed: Put https://menderserver.no/api/devices/v1/deployments/device/deployments/21db410f-e38b-4b49-a783-1c0bd3a91633/status: dial tcp: lookup [menderserver.no](http://menderserver.no/) on [2001:4860:4860::8888]:53: dial udp [2001:4860:4860::8888]:53: connect: network is unreachable
2019-09-12 12:52:48 +0000 UTC debug: start executing script: ArtifactReboot_Leave_20_update-camera
2019-09-12 12:54:17 +0000 UTC error: failed to report status: Put https://menderserver.no/api/devices/v1/deployments/device/deployments/21db410f-e38b-4b49-a783-1c0bd3a91633/status: EOF
2019-09-12 12:54:17 +0000 UTC error: statescript: can not send start status to server: reporting status failed: Put https://menderserver.no/api/devices/v1/deployments/device/deployments/21db410f-e38b-4b49-a783-1c0bd3a91633/status: EOF
2019-09-12 12:55:21 +0000 UTC info: Running Mender version 2.0.0
2019-09-12 12:55:21 +0000 UTC error: Update was interrupted in state: after-reboot
2019-09-12 12:55:21 +0000 UTC info: Update Module path "/usr/share/mender/modules/v3" could not be opened (open /usr/share/mender/modules/v3: no such file or directory). Update modules will not be available
2019-09-12 12:55:21 +0000 UTC info: State transition: init [none] -> rollback [ArtifactRollback]
2019-09-12 12:55:21 +0000 UTC debug: transitioning to error state
2019-09-12 12:55:21 +0000 UTC info: performing rollback
2019-09-12 12:55:21 +0000 UTC debug: Have U-Boot variable: mender_check_saveenv_canary=
2019-09-12 12:55:21 +0000 UTC debug: List of U-Boot variables:map[mender_check_saveenv_canary:]
2019-09-12 12:55:21 +0000 UTC debug: Have U-Boot variable: upgrade_available=0
2019-09-12 12:55:21 +0000 UTC debug: List of U-Boot variables:map[upgrade_available:0]
2019-09-12 12:55:21 +0000 UTC debug: will try to rollback reboot the device
2019-09-12 12:55:21 +0000 UTC info: State transition: rollback [ArtifactRollback] -> rollback-reboot [ArtifactRollbackReboot_Enter]
2019-09-12 12:55:21 +0000 UTC info: rebooting device(s) after rollback
2019-09-12 12:55:21 +0000 UTC info: Mender rebooting from inactive partition: /dev/sda3
2019-09-12 12:57:10 +0000 UTC info: Running Mender version 2.0.0
2019-09-12 12:57:10 +0000 UTC info: Update was interrupted in state: rollback-reboot
2019-09-12 12:57:10 +0000 UTC info: Update Module path "/usr/share/mender/modules/v3" could not be opened (open /usr/share/mender/modules/v3: no such file or directory). Update modules will not be available
2019-09-12 12:57:10 +0000 UTC info: State transition: init [none] -> verify-rollback-reboot [ArtifactRollbackReboot_Leave]
2019-09-12 12:57:10 +0000 UTC debug: transitioning to error state
2019-09-12 12:57:10 +0000 UTC debug: Have U-Boot variable: mender_check_saveenv_canary=
2019-09-12 12:57:10 +0000 UTC debug: List of U-Boot variables:map[mender_check_saveenv_canary:]
2019-09-12 12:57:10 +0000 UTC debug: Have U-Boot variable: upgrade_available=0
2019-09-12 12:57:10 +0000 UTC debug: List of U-Boot variables:map[upgrade_available:0]
2019-09-12 12:57:10 +0000 UTC info: State transition: verify-rollback-reboot [ArtifactRollbackReboot_Leave] -> after-rollback-reboot [ArtifactRollbackReboot_Leave]
2019-09-12 12:57:10 +0000 UTC debug: handling state after rollback reboot
2019-09-12 12:57:10 +0000 UTC info: State transition: after-rollback-reboot [ArtifactRollbackReboot_Leave] -> update-error [ArtifactFailure]
2019-09-12 12:57:10 +0000 UTC debug: handle update error state
2019-09-12 12:57:10 +0000 UTC info: State transition: update-error [ArtifactFailure] -> cleanup [Error]
2019-09-12 12:57:10 +0000 UTC debug: Handling Cleanup state
2019-09-12 12:57:10 +0000 UTC info: State transition: cleanup [Error] -> update-status-report [none]
2019-09-12 12:57:10 +0000 UTC debug: handle update status report state
2019-09-12 12:57:10 +0000 UTC error: failed to report status: Put https://menderserver.no/api/devices/v1/deployments/device/deployments/21db410f-e38b-4b49-a783-1c0bd3a91633/status: dial tcp: lookup [menderserver.no](http://menderserver.no/) on [2001:4860:4860::8888]:53: dial udp [2001:4860:4860::8888]:53: connect: network is unreachable
2019-09-12 12:57:10 +0000 UTC error: error reporting update status: reporting status failed: Put https://menderserver.no/api/devices/v1/deployments/device/deployments/21db410f-e38b-4b49-a783-1c0bd3a91633/status: dial tcp: lookup [menderserver.no](http://menderserver.no/) on [2001:4860:4860::8888]:53: dial udp [2001:4860:4860::8888]:53: connect: network is unreachable
2019-09-12 12:57:10 +0000 UTC error: failed to send status to server: transient error: reporting status failed: Put https://menderserver.no/api/devices/v1/deployments/device/deployments/21db410f-e38b-4b49-a783-1c0bd3a91633/status: dial tcp: lookup [menderserver.no](http://menderserver.no/) on [2001:4860:4860::8888]:53: dial udp [2001:4860:4860::8888]:53: connect: network is unreachable
2019-09-12 12:57:10 +0000 UTC info: State transition: update-status-report [none] -> update-retry-report [none]
2019-09-12 13:02:10 +0000 UTC debug: wait complete
2019-09-12 13:02:10 +0000 UTC info: State transition: update-retry-report [none] -> update-status-report [none]
2019-09-12 13:02:10 +0000 UTC debug: handle update status report state
2019-09-12 13:02:10 +0000 UTC debug: status reported, response 204 No Content
2019-09-12 13:02:10 +0000 UTC debug: attempting to upload deployment logs for failed update

Hi @ckelloug2 from this line, it looks like something is not happy in your DNS resolution setup in this case. It may be instructive to add some debug into your ArtifactReboot state scripts to dump status of connman or whatever network manager you are using. Output of “ifconfig” or “ip addr” as well as the contents of /etc/resolv.conf would also be useful.

You will probably need to dump that info into the /data partition and then view it after the rollback from an ssh connection to the device.

Drew