I recently installed the Mender OS software, version 2.4.0, and have come across an issue where a deployment stays pending indefinitely because an old entry in MongoDB indicates that a previous update is still in progress. Further details are in the reproduction steps below:
- Deploy a new image to a device (either a Debian or full OS image works).
- Upon success, redeploy the same image to the device. At this point, the deployment automatically aborts, as expected, because the artifact is already installed on the device.
- Retry the deployment. Note that the device never leaves the pending state. In the logs, you should see something to this effect:
time="2020-08-28T15:03:04Z" level=info msg="New status: already-installed for device # deployment: #" device_id=5e67f1133081290001a599bd file=app.go func="app.(*Deployments).GetDeploymentForDeviceWithCurrent" line=926 plan=x request_id=#
This continues for that device until the culprit deployment is removed from the database. I looked into the deployment code myself, but couldn’t tell where this was happening (I’m very new to
This may be occurring in other scenarios as well, but this was the first one in which I noticed it and could reproduce it.
Another person has reproduced this as well. I should note that we also made sure the devices were online, and even forced them to check for an update from their command lines.
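In case it helps anyone trying to confirm the same symptom, here is a rough Python sketch for spotting affected devices in the deployments service logs. It assumes the logs are key=value lines shaped like the one quoted in the steps above. The function names (`parse_logfmt`, `already_installed_counts`) are my own, not part of Mender. It only reads log text and does not touch MongoDB.

```python
import shlex
from collections import Counter

# Sample line copied from the report above; real logs may differ in detail.
SAMPLE = (
    'time="2020-08-28T15:03:04Z" level=info '
    'msg="New status: already-installed for device # deployment: #" '
    'device_id=5e67f1133081290001a599bd file=app.go '
    'func="app.(*Deployments).GetDeploymentForDeviceWithCurrent" '
    'line=926 plan=x request_id=#'
)


def parse_logfmt(line):
    """Split a key=value log line into a dict, honoring double quotes."""
    fields = {}
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if sep:  # ignore tokens that are not key=value pairs
            fields[key] = value
    return fields


def already_installed_counts(lines):
    """Count 'already-installed' status messages per device_id."""
    counts = Counter()
    for line in lines:
        try:
            fields = parse_logfmt(line)
        except ValueError:
            continue  # skip lines with unbalanced quotes
        if "already-installed" in fields.get("msg", "") and "device_id" in fields:
            counts[fields["device_id"]] += 1
    return counts


if __name__ == "__main__":
    for device_id, n in already_installed_counts([SAMPLE]).items():
        print(device_id, n)
```

A device whose count keeps growing across polling intervals while its deployment stays pending is a likely candidate for the stale entry described here.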