State scripts for tricky update scheduling logic

Hi everyone,

I’d like to get some input on how to implement some potentially tricky update timing/scheduling logic.

Here’s the device behavior I am going for:

  • updates are downloaded to the device automatically
  • installation is scheduled for an appropriate day and time (either automatically by our software or via user input) - could be days/weeks in the future
  • there’s also an “Update Now” button that runs the update when it’s clicked
  • there’s also a button to disable automatic updates; updates would then only happen when the aforementioned “Update Now” button is clicked

Here is what I am thinking for the implementation:

Set Mender config variables to something like:

"StateScriptRetryIntervalSeconds": 30,
"StateScriptRetryTimeoutSeconds" : 604800

Have a Download_Leave script that triggers my custom scheduler (either software-only or including a prompt for user input). This script keeps returning 21 (retry-later) until the scheduled time arrives, and then returns 0 to proceed with the update.

The above StateScriptRetryTimeoutSeconds would allow scheduling the update for up to a week into the future.
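As a quick sanity check on those two values, here is the arithmetic as a small sketch. It is plain Python and nothing Mender-specific; the run count is only approximate, since it ignores how long each script invocation takes.

```python
from datetime import timedelta

RETRY_INTERVAL_S = 30      # StateScriptRetryIntervalSeconds from the config above
RETRY_TIMEOUT_S = 604800   # StateScriptRetryTimeoutSeconds from the config above

# Length of the scheduling window that the timeout allows.
window = timedelta(seconds=RETRY_TIMEOUT_S)

# Rough upper bound on how many times the script is re-run in that window.
max_script_runs = RETRY_TIMEOUT_S // RETRY_INTERVAL_S

print(window.days)       # 7
print(max_script_runs)   # 20160
```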

If the user clicks “update now” at any time prior to the scheduled time, the Download_Leave script would just return 0 the next time it runs and the update would happen.
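Roughly, the decision logic of such a Download_Leave script could look like the sketch below. The file paths, the timestamp format, and the flag-file mechanism for “Update Now” are all hypothetical choices of mine, not anything Mender provides; only the exit codes 0 and 21 come from the description above.

```python
import os
import time

# Hypothetical files written by the custom scheduler / UI, not by Mender.
SCHEDULE_FILE = "/var/lib/myapp/update_at"      # holds a Unix timestamp
UPDATE_NOW_FLAG = "/var/lib/myapp/update_now"   # created when "Update Now" is clicked

EXIT_PROCEED = 0        # continue with the update
EXIT_RETRY_LATER = 21   # ask the client to run this script again later


def decide_exit_code(now, scheduled_at, update_now_requested):
    """Return 0 once the update may proceed, otherwise 21 (retry later)."""
    if update_now_requested:
        return EXIT_PROCEED
    if scheduled_at is not None and now >= scheduled_at:
        return EXIT_PROCEED
    return EXIT_RETRY_LATER


def read_scheduled_time(path):
    """Read the scheduled Unix timestamp, or None if nothing is scheduled yet."""
    try:
        with open(path) as f:
            return float(f.read().strip())
    except (OSError, ValueError):
        return None


def main():
    """Combine the current time, the schedule file, and the flag file."""
    return decide_exit_code(
        now=time.time(),
        scheduled_at=read_scheduled_time(SCHEDULE_FILE),
        update_now_requested=os.path.exists(UPDATE_NOW_FLAG),
    )

# An installed state script would finish with: import sys; sys.exit(main())
```

Keeping the decision in a pure function (`decide_exit_code`) makes it easy to test the scheduling rules without touching the device or the Mender client.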

So far so good, I think - though if you see any issues with that, please let me know.

How could I deal with a scenario where either no suitable time is found, or the user disables automatic updates?

In this scenario, either the user chooses to disable automatic updates, or no suitable time to run the update is found before StateScriptRetryTimeoutSeconds is up.

The update thus gets marked as failed.

At this point, how could I implement an “Update Now” button on the client?

It seems like I should tell the Mender client to go to the ArtifactInstall state when the “Update Now” button is clicked at this point (again, this command should originate on the client rather than on the server or the Mender dashboard). Is there a way to do that?

It looks like I could get the download manually via the Device / Deployments API, but that I’d kind of be bypassing the state machine by doing that.

Is there a good way for me to implement this functionality?

Thanks!

-Krystof

Hi @krystof, this is indeed a tricky scenario. Anything you do to allow an unbounded delay in installation will eventually result in a timeout on the server side, and I don’t think there is an easy way around that.

In the 2.4 release we added automatic deployment retries, which can be used to catch these failures and retry. However, that may just push the issue further out in time, since we don’t support infinite retries.

The best approach may be to implement custom scripting over the API to handle this.

@kacf do you have any further ideas?

Drew

Thanks @drewmoseley

Here’s how I’m thinking the scenario where the automated timeout expires and then the user clicks the “Update Now” button could be implemented:

  1. User clicks “Update Now”
  2. The device hits the Mender Deployments API with a request like “create a new deployment with the latest artifact for this specific device”
  3. Normal Mender client state flow happens and the device updates
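Step 2 might look something like the sketch below, which only builds the request and does not send it. The endpoint path and the JSON fields (`name`, `artifact_name`, `devices`) reflect my understanding of the Mender Management API and should be checked against the API docs for the server version in use; the server URL, token, device ID, artifact name, and deployment naming scheme are placeholders.

```python
import json
import urllib.request

# Assumed Management API path for creating a deployment; verify against the
# API documentation for your Mender server version.
DEPLOYMENTS_PATH = "/api/management/v1/deployments/deployments"


def build_deployment_request(server_url, token, device_id, artifact_name):
    """Build (but do not send) a request deploying one artifact to one device."""
    payload = {
        "name": f"update-now-{device_id}",   # hypothetical naming scheme
        "artifact_name": artifact_name,
        "devices": [device_id],
    }
    return urllib.request.Request(
        url=server_url.rstrip("/") + DEPLOYMENTS_PATH,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values, for illustration only.
req = build_deployment_request(
    "https://mender.example.com", "PLACEHOLDER_TOKEN", "device-123", "release-2.0"
)
print(req.get_method(), req.full_url)
```

Sending it would be a matter of passing `req` to `urllib.request.urlopen` from whichever component holds the Mender credentials.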

Does that make sense to you - to use the Deployments API for this?

Also, is there a way to get the current state or more details from the client so that I can display a reasonable download/progress bar or something similar?

Yes, that could work. The risk is that you need to have the Mender UI password on the client if it is going to be logging in to create deployments. Could you isolate that code somewhere in your server infrastructure?

At present there is no client status for progress bars, although it is under discussion. See https://tracker.mender.io/browse/MEN-3563 for the details.

Drew

Yeah, I would definitely add a layer of indirection where the devices would only talk to our own API (which they need to do anyway), and our server would then talk to the Mender server API.

Thanks!