Units not starting update unless we decommission and re-accept them on the console

Hi,

On some of our units, we start an update but the unit never finds it or starts downloading it.
We checked the journal logs for the mender client on one of these units: it runs through the inventory check and then goes straight to idle instead of downloading the update.

If we decommission and re-accept the unit and then restart it, we can get the update to go through, but this is not a great long-term solution.

Is there other logging available to see what could be causing this issue?

Thanks,

Kevin

Hello,

This just happened again. Details:
Original Deployment: 92f67566-8354-4956-909a-0284b9fe7878
Original Device ID: 28ecd029-38ce-4bc4-854a-741495b6dd63

After decommissioning and re-accepting the device, the update went through successfully.
New Deployment ID: 511537b4-ce00-4d5e-86f7-3deada3763e2
New Device ID: a86299c6-aa35-4362-b034-c5150f5df9ee
Note this is the same physical device in both attempted updates.

What can be done to prevent this from happening?

Hello @mbheidebr and @kevlan,

We reviewed your data on Hosted Mender and finally understood the problem.

First, the deployments service will try to apply the deployments to the devices by creation date, starting from the oldest unapplied deployment to the most recent one.

In your setup, however, we see that you target a particular device with a scheduled deployment that will start in the future. Meanwhile, you target the same device with a new deployment, created after the scheduled one. You expect the second deployment to be applied to the device, but this won’t happen until the device is done with the scheduled deployment (which only starts at its scheduled time), either because it updates successfully or because you abort the deployment. We also see that you aborted the scheduled deployment, but only after the device was decommissioned on February 25th.
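
To make the ordering concrete, here is a small Python sketch of the behaviour described above. It is a simplified model and not the actual deployments service code; the `Deployment` class, the `next_deployment` function, and the dates are all made up for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class Deployment:
    name: str
    created: datetime       # when the deployment was created
    start: datetime         # when it is scheduled to start
    finished: bool = False  # True once the device completed it or it was aborted


def next_deployment(deployments: List[Deployment], now: datetime) -> Optional[Deployment]:
    """Return the deployment a device would be offered at `now`, or None.

    Simplified model: only the oldest unfinished deployment is considered.
    """
    pending = [d for d in deployments if not d.finished]
    if not pending:
        return None
    oldest = min(pending, key=lambda d: d.created)
    # The oldest unfinished deployment blocks everything created after it,
    # even while its own start time is still in the future.
    return oldest if oldest.start <= now else None


# Arbitrary example dates: a nightly deployment created in the morning to
# start at 23:00, and a second deployment created later that same morning.
nightly = Deployment("nightly", datetime(2024, 1, 1, 8, 0), datetime(2024, 1, 1, 23, 0))
manual = Deployment("manual", datetime(2024, 1, 1, 10, 0), datetime(2024, 1, 1, 10, 0))
noon = datetime(2024, 1, 1, 12, 0)

# At noon the device is offered nothing: the nightly deployment is older,
# but it has not started yet.
blocked = next_deployment([nightly, manual], noon)

# Once the nightly deployment is aborted, the second deployment is offered.
nightly.finished = True
unblocked = next_deployment([nightly, manual], noon)
```

In this model `blocked` is `None` and `unblocked` is the `manual` deployment, which matches the symptom of the client going straight back to idle after the inventory check.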

Can you please confirm this is the case?

Hi @tranchitella,

This explains what is going on. We have a nightly deployment set up for all of our production units. It schedules deployments the morning before to start later that night. Then the next morning the nightly deployment is aborted.

I will adjust the creation of the nightly deployments to be at the time we want the deployment to start and will see if this solves the issue we were seeing.
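
A rough Python sketch of that change, assuming the job that creates the nightly deployment simply waits until the desired start time before creating it. The `seconds_until` helper and the 23:00 start time are made up for illustration, and the actual call that creates the deployment is left out.

```python
from datetime import datetime, time, timedelta


def seconds_until(start_at: time, now: datetime) -> float:
    """Seconds from `now` until the next occurrence of the clock time `start_at`."""
    target = datetime.combine(now.date(), start_at)
    if target <= now:
        # Today's start time has already passed, so wait for tomorrow's.
        target += timedelta(days=1)
    return (target - now).total_seconds()


# Example: the job runs at 08:00 and the deployment should start at 23:00.
delay = seconds_until(time(23, 0), datetime(2024, 1, 1, 8, 0))
# The job would sleep for `delay` seconds and only then create the deployment,
# so no future-dated deployment sits in the queue during the day.
```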

Matt
