Mender 2.4.0 Deployment Stuck Pending Bug

Hello,
I recently installed the Mender OS software, version 2.4.0, and have come across an issue where a deployment stays pending indefinitely because an old entry in MongoDB indicates that a previous update is still in progress. Further details are in the reproduction steps below:

  1. Deploy a new image to a device (either a Debian or full OS image works).
  2. Upon success, redeploy the same image to the device. At this point, the deployment will automatically abort, as expected, because the artifact is already installed on this device.
  3. Retry the deployment. Note that the device never leaves the pending state. In the logs, you should see something to this effect:
    time="2020-08-28T15:03:04Z" level=info msg="New status: already-installed for device # deployment: #" device_id=5e67f1133081290001a599bd file=app.go func="app.(*Deployments).GetDeploymentForDeviceWithCurrent" line=926 plan=x request_id=#
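
For anyone checking their own server logs for the same symptom, here is a minimal sketch that splits a log line like the one above into fields and flags the `already-installed` status. It assumes the line is in `key=value` form with optional double quotes, as in the sample; the function names `parse_logfmt` and `is_already_installed` are made up for this illustration and are not part of Mender.

```python
import shlex


def parse_logfmt(line: str) -> dict:
    """Split a key=value log line (values optionally double-quoted) into a dict.

    shlex.split raises ValueError if a quote is left unclosed.
    """
    fields = {}
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if sep:  # ignore tokens that contain no '='
            fields[key] = value
    return fields


def is_already_installed(line: str) -> bool:
    """True if the line reports the 'already-installed' status seen in this thread."""
    msg = parse_logfmt(line).get("msg", "")
    return msg.startswith("New status: already-installed")


# Sample line copied from the report above.
sample = (
    'time="2020-08-28T15:03:04Z" level=info '
    'msg="New status: already-installed for device # deployment: #" '
    'device_id=5e67f1133081290001a599bd file=app.go '
    'func="app.(*Deployments).GetDeploymentForDeviceWithCurrent" '
    'line=926 plan=x request_id=#'
)

fields = parse_logfmt(sample)
print(fields["device_id"], is_already_installed(sample))
```

Grouping matching lines by `device_id` should show which devices keep hitting this status on every poll.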

This continues for that device until the culprit deployment is removed from the database. I looked into the deployment code a bit myself, but couldn’t tell where this was happening (I’m very new to Go).

This may be occurring in other instances as well, but this was the first scenario I noticed it for and could reproduce.

I’ve had another person replicate this as well, and I should note that we also ensured the devices were online, and even forced them to check for an update via their command lines.

Thanks!
Liz

Thanks @lizziemac for the report.

@peter and @tranchitella are probably the people with the best insight into this part of the code, so I will defer to them to comment.

Hi @peter @tranchitella, are you able to reproduce this on 2.7? I would like to update my OS version from 2.4.0, but I want to verify that this is no longer happening first. Let me know if you need more information.
Liz

Also, the db command I use to clean up is:

db.devices.remove({ status: { $in: ["pending", "installing", "rebooting", "downloading"] }, deviceid: "<insert-device-id>" })
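
If you would rather script the cleanup and check what it will match before deleting anything, the same filter can be sketched in Python. This is only an illustration under stated assumptions: the field names `status` and `deviceid` and the status list are copied from the command above, `collection` is assumed to be a pymongo `Collection` for the `devices` collection (how you connect and which database it lives in is not covered here), and the helper names are hypothetical.

```python
# Statuses treated as "still in progress", copied from the shell command above.
ACTIVE_STATUSES = ["pending", "installing", "rebooting", "downloading"]


def stuck_entries_filter(device_id: str) -> dict:
    """Build the same query document as the shell command, for one device."""
    return {"status": {"$in": ACTIVE_STATUSES}, "deviceid": device_id}


def clean_stuck_entries(collection, device_id: str, dry_run: bool = True) -> int:
    """Count (dry run) or delete the matching entries; returns how many matched.

    `collection` is assumed to behave like a pymongo Collection, i.e. to
    provide count_documents(filter) and delete_many(filter).
    """
    query = stuck_entries_filter(device_id)
    if dry_run:
        return collection.count_documents(query)
    return collection.delete_many(query).deleted_count


# Show the filter for the device ID that appears in the log line earlier in this thread.
print(stuck_entries_filter("5e67f1133081290001a599bd"))
```

Running with `dry_run=True` first and comparing the count against what you expect is a cheap safeguard, since the delete cannot be undone without a backup.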

@lizziemac I’m not aware of this specific issue in recent Mender versions (2.6 or 2.7).

Hello Liz,

I would recommend upgrading.

peter

I’ll be able to verify after our second update window in two months, but I’m pretty sure that the solution will end up being the one described here: "High Rate of Debian Failures", post #10 by kacf.