System Software Version after using mender-convert

Hey guys,

First just wanted to thank everyone that contributed to this product!

I was wondering if anyone could help with a question regarding the System Software Version (as visible in the GUI of the server) which is under the rootfs-partition update module (namespace?).

When creating a new golden device dump using mender-convert, how does one include the System Software Version, as well as an Application Software Version, so that it is available in the .img file (to be used to commission new client devices)?

So far I know how to pass this information along from mender-convert to mender-artifact using the available hook, but these versions are then only available for the artifact file (file which will then be deployed to existing clients).

So for existing clients there is no problem. For newly commissioned devices, however, there is nothing to say what the “starting” point is. I am asking because my assumption is that it would make sense for subsequent directory/single-file deployments to have a “depends”, but I fail to see how I can make a new deployment depend on a specific version if the client does not really “have” one.

Thanks in advance

Hi @robertalexa,

Thanks for getting in touch! Concerning software dependencies, there is a quite sophisticated and powerful mechanism available. It is described in the documentation here.

In order to leverage it, you can pass the relevant flags to mender-artifact through mender-convert, and to a substantial extent also modify the Artifacts directly.

Greetz,
Josef

Hi @TheYoctoJester. I appreciate your reply.

Just wanted to mention that we are already doing that and you might have misread my message?

As per above, the correct software versions are passed and presented as part of the artifact generated as a side effect of running mender-convert, and that is great for existing devices in the field.

However, if we commission a new device from the .img generated via the same process, the software version is not preserved (as far as we can tell) and not presented in the GUI. As a result, it is not clear what a newly commissioned device is running until it gets its next update from a future artifact.

Hope that makes sense?

Hey @TheYoctoJester

Don’t mean to be pestering you directly, so accept my apologies if I’ve bothered you!

Let me rephrase my “challenge” and maybe that helps you help me with our implementation. It may just be a case of me misunderstanding the operational flow (and unfortunately I have found some of the information in the docs to be contradictory).

  1. Is mender-convert meant to be used once - the very first time we create a mender ready image - or every time we need to update apt packages or do a full system update? The information on Convert a Mender Debian image | Mender documentation seems to be slightly at odds with Create an Artifact with system snapshot | Mender documentation

  2. If mender-convert is meant to be used every time a full on image needs to be generated, this is where my initial question comes from. How can you pass along the relevant version? The mender-convert script command does not have any passable arguments (this may be a PR suggestion so people don’t overwrite the original file from your repo and instead just pass things along natively). In our case, we have added 2 extra ENV variables, which we then pass along into mender-artifact. (ROOTFS_SOFTWARE_VERSION, APPLICATION_SOFTWARE_VERSION)

docker run \
  --rm \
  -v "$INPUT_DIRECTORY":/mender-convert/input \
  -v "$LOGS_DIRECTORY":/mender-convert/logs \
  -v "$DEPLOY_DIRECTORY":/mender-convert/deploy \
  --privileged=true \
  --cap-add=SYS_MODULE \
  -v /dev:/dev \
  -v /lib/modules:/lib/modules:ro \
  --env MENDER_ARTIFACT_NAME="${MENDER_ARTIFACT_NAME}" \
  --env MENDER_CONVERT_LOG_FILE=logs/"${LOG_FILE}" \
  --env MENDER_CONVERT_VERSION="${GIT_PROVIDED_TAG_NAME}" \
  --env ROOTFS_SOFTWARE_VERSION="${ROOTFS_SOFTWARE_VERSION}" \
  --env APPLICATION_SOFTWARE_VERSION="${APPLICATION_SOFTWARE_VERSION}" \
  "$IMAGE_NAME" "$@"

When doing so, we have also hooked into mender_create_artifact (Customization | Mender documentation).

Simplified version:

mender_create_artifact() {
  local -r device_type="${1}"
  local -r artifact_name="${2}"

  mender_artifact_name=${device_type}-${artifact_name}.mender
  mender_artifact_path=deploy/${mender_artifact_name}
  log_info "Running custom implementation of the 'mender_create_artifact' hook"

  log_info "Writing Mender artifact to: ${mender_artifact_path}"
  log_info "This can take up to 20 minutes depending on which compression method is used"
  run_and_log_cmd "mender-artifact --compression ${MENDER_ARTIFACT_COMPRESSION} \
      write rootfs-image \
      --file work/rootfs.img \
      --output-path ${mender_artifact_path} \
      --artifact-name ${artifact_name} \
      --device-type ${device_type} \
      --software-version ${ROOTFS_SOFTWARE_VERSION} \
      --provides rootfs-image.application.version:${APPLICATION_SOFTWARE_VERSION} \
      --script input/fd/state_scripts/ArtifactInstall_Enter_00_retain_user_data"
}

As mentioned, the above retains information about the version on the newly created artifact, but not on the image that is meant to be used to commission new devices.

  3. If mender-convert is NOT meant to be used every time, and instead mender-artifact should be used (as presented in the above mentioned article about full system snapshot), then I am even more confused. The article refers to a golden device, but since the golden device does not contain the mender client, how would that actually work? Also, what about any custom overlays, certain files in certain places, scripts?

  4. The docs say that sudo apt update/upgrade should not be done on a device containing the mender-client, which reinforces my thinking at point 3 that the “golden device” is actually “barebone”.

  5. (This is a LE) - looking at the release notes and code of mender-convert 4.0.0 and mender-client 3.5.0, it seems that a bootstrap-artifact has been implemented, in what I assume to be a “placeholder” artifact that would inform the server of certain pieces of info? I suspect that this could be used to pass along the system software version as well as what application version it provides? However, this is not documented at all in the current version of the docs. If this is the real solution, then there is no need for passable arguments; in that case, the bootstrap artifact can contain that information and be updated manually for every system deployment and version controlled appropriately?

  6. (This is a LE) - are we even meant to use these recent versions since Mender Server 3.5 was withdrawn? Is the dependency actually from the server downwards? e.g. looking at the JSON https://docs.mender.io/releases/versions.json and matching the version of the server with the corresponding sub-systems?

Put more simply, I guess my confusion just stems from potentially contradictory information (or at least what I interpret as contradictory). As a side note, what is definitely lacking from the docs is a “real-world use case” type of example/definition.

Things like:

  • start from scratch
  • put your application
  • add your overlays
  • when to convert
  • what to convert
  • what about when the time comes for an application change
  • what about when the time comes for an OS upgrade

Any direction would be greatly appreciated here!

Hi @robertalexa

No problem, I don’t feel pestered. I am just traveling and attending an event this week and therefore do not have the capacity to carefully work through the post. So it is perfectly possible that I did not read carefully enough in the first place. Apologies!

Maybe @kacf or @lluiscampos can chime in here?

Greets
Josef

I was just about to politely ask Kristian and Louis for some help as an offtopic question attached to one of my open PRs, but I appreciate you pinging them.

Offtopic: good luck with your event!

Hi @robertalexa ,

Let me clarify one implementation detail that will shed some light on some of the questions you are having.

The System Software Version and the Application Software Version are stored in the local database of each device. The database is located on the data partition. It is initialized at first boot from a special “bootstrap Artifact” when such an Artifact is found; otherwise it is initialized with placeholder values.

This is why the version information travels with the metadata of an Artifact but not with a mender-convert generated img: the database is individual to each device and is not “recreated” or prepared in any form when mender-convert creates a new img.
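One way to see what the local database holds on a freshly flashed device is to list the provides the client reports. The sketch below assumes the client CLI has a `show-provides` command that prints one `key=value` pair per line (the command name may differ between client versions). `provide_value` is a hypothetical helper for illustration, not part of Mender.

```shell
# provide_value KEY
# Reads "key=value" lines on stdin and prints the value stored under KEY.
# Hypothetical helper; the key=value input format is an assumption.
provide_value() {
    awk -v key="$1" '
        {
            # Split on the first "=" only, so values may contain "=" themselves.
            sep = index($0, "=")
            if (sep > 0 && substr($0, 1, sep - 1) == key) {
                print substr($0, sep + 1)
            }
        }
    '
}
```

On a device this could be used as `mender show-provides | provide_value rootfs-image.version`, and likewise for the `rootfs-image.application.version` key used in the hook earlier in the thread.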

However, you can utilize the mentioned bootstrap Artifact mechanism to inject such version information (or any other user-defined keys, really) in your newly flashed devices. This is the part of mender-convert-package that generates the bootstrap Artifact:

You can write a platform hook that generates your custom bootstrap Artifact and installs it in the same location. That should result in what you are asking for: the devices will have the desired versions on first boot.
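A rough sketch of what such a hook could look like, reusing the two extra variables from the docker command earlier in the thread. Several things here are assumptions rather than confirmed behavior: that `mender-artifact write bootstrap-artifact` accepts the flags shown, that `MENDER_ARTIFACT_NAME` and `MENDER_DEVICE_TYPE` are set when the hook runs, and that `rootfs-image.version` is the key behind the System Software Version. `custom_bootstrap_artifact` and `BOOTSTRAP_ARTIFACT_PATH` are hypothetical names; the path is left without a default on purpose and should be taken from mender-convert-package.

```shell
# Hypothetical platform package hook for a mender-convert configuration file.
# Assumptions, none confirmed in this thread:
#   - `mender-artifact write bootstrap-artifact` accepts the flags used below
#   - MENDER_ARTIFACT_NAME and MENDER_DEVICE_TYPE are set when the hook runs
#   - BOOTSTRAP_ARTIFACT_PATH is the location mender-convert-package uses
# log_info and run_and_log_cmd are the mender-convert helpers already used in
# the mender_create_artifact hook earlier in the thread.
custom_bootstrap_artifact() {
    # No default on purpose: look the location up in mender-convert-package.
    : "${BOOTSTRAP_ARTIFACT_PATH:?set to the stock bootstrap Artifact location}"

    log_info "Writing custom bootstrap Artifact to ${BOOTSTRAP_ARTIFACT_PATH}"
    run_and_log_cmd "mender-artifact write bootstrap-artifact \
        --artifact-name ${MENDER_ARTIFACT_NAME} \
        --device-type ${MENDER_DEVICE_TYPE} \
        --provides rootfs-image.version:${ROOTFS_SOFTWARE_VERSION} \
        --provides rootfs-image.application.version:${APPLICATION_SOFTWARE_VERSION} \
        --output-path ${BOOTSTRAP_ARTIFACT_PATH}"
}
```

The function would be registered by adding its name to PLATFORM_PACKAGE_HOOKS in the configuration file. Before relying on it, it is worth checking in mender-convert-package whether the hook runs after the stock bootstrap Artifact is written, and which provides the stock one carries, so that the replacement does not drop any of them.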

Otherwise, if you have a suggestion on how we could better support this use case out of the box, I’d be happy to review more code from you :slight_smile:

Lluís

Hi @lluiscampos

Thank you kindly for the reply. That basically touches on my original point 5.

This doesn’t seem to be documented anywhere as of now? (correct me if I am wrong) And as such it raises a few more questions in my mind before I can 100% say that I understand the issue at hand.

  1. If I create a PLATFORM_MODIFY_HOOKS that creates a bootstrap artifact I have the following questions:
  2. If I create a PLATFORM_PACKAGE_HOOKS with the idea to overwrite the bootstrap artifact that was natively created:
  3. If I create my own custom file for mender-convert-package I can control this however I want, but then I will fall out of sync with your codebase

Thanks

Not documented. We considered it an implementation detail and not something that a user might tinker with. I shared it with you because, after all, it is open source and you can exploit it as you wish :slight_smile:

As a side note: the main motivation of this bootstrap Artifact was to support delta Artifacts (Enterprise feature) from first boot, nothing else…

I was thinking about a platform package hook overriding or possibly modifying (*) the bootstrap Artifact. It is packaged into the final image here.

(*) I am not sure we can use mender-artifact modify on a bootstrap Artifact; I have not done that before.

Yes, the hooks are meant to be the entry point for user modifications.

Lluís

PS: this thread might go silent for a while because starting tomorrow I am going on vacation for a few weeks :grin:

@robertalexa I found that there is some documentation of the feature here.

Hi @lluiscampos hope you had a good time off!

Been struggling to find time (commercially) to continue with our integration hence the radio silence.

I think the above information, combined with our playground has put us roughly on track.

I do have a question, if you would be so kind as to help:

Our hardware does not have any user inputs (screens, buttons, etc.), but is engaged in processes that must be shut down gracefully to avoid data loss. So our approach would be to handle any pending updates on a nightly basis at 2 AM.

Config:
UpdateControlMapExpirationTimeSeconds - 48 hours (long time to avoid expiration)
UpdateControlMapBootExpirationTimeSeconds - 600 (default)
UpdateControlMapPollIntervalSeconds - 60 seconds

State script
ArtifactInstall_Leave
- dbus update control map - uuid FIXED_VALUE_UUID, priority 0, ArtifactCommit_Enter - pause
- SIGNAL to application that an update is due
ArtifactCommit_Leave (on just directory updates)
- supervisor restart main_app.py

Application main thread
When SIGNAL is received, start graceful app halting.
When ready, send dbus update control map uuid FIXED_VALUE_UUID, priority 0, ArtifactCommit_Enter - continue

My hope is that by using the same UUID and the same priority, the update control map will be updated in place. I am loosely basing my assumption on the code here https://github.com/mendersoftware/mender/blob/e7ac5bc79c078c6af69ac36955677f886a734c5d/app/updatemanager.go#L141

When another deployment happens, the cycle repeats.

Would you be so kind as to sanity check this approach? I would really appreciate some clarity around this.

PS: For the sake of clarity, our application is a python app kept alive using supervisor. The choice of letting ArtifactInstall happen naturally was on purpose, as this will make the “deployment” time shorter (from starting the graceful stop to resuming) for both A/B and Directory updates. Given that the python app is run via supervisor, its “old” version will continue to run in memory. The supervisor will restart the task on ArtifactCommit_Leave.

Regards
Rob

Hello @robertalexa. I had a lovely vacation, thanks!

Based on your description, it seems like the use of update control maps is not because a (human) user needs to interact but basically that you need some good time for a graceful (non-human) shutdown of a given application.

I suggest then dropping the use of update control maps altogether and just relying on the “retry later” mechanism of state scripts. See here; it is designed exactly for this use case.

Then the high-level idea is to have one single state script (you decide in which state of the state machine). The script checks whether the application is running; if so, it signals the application to shut down and returns “retry later” to Mender. On one of the successive calls the application will have finished, and the script can then just return “ok” to Mender.
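A minimal sketch of such a state script, under stated assumptions. Instead of checking whether the process is running, it uses two flag files as a stand-in for the signalling: the script drops a request file and the application is assumed to create a ready file once it has halted gracefully. Both paths, the function name and the file name pattern in the guard are hypothetical; exit code 21 is the “retry later” code and 0 is “ok”.

```shell
#!/bin/sh
# Sketch of a "retry later" state script, for example installed as
# ArtifactInstall_Enter_00_wait_for_app. The flag paths are hypothetical:
# the application is assumed to watch REQUEST_FLAG and to create READY_FLAG
# once it has halted gracefully.
REQUEST_FLAG="${REQUEST_FLAG:-/run/main_app/update-requested}"
READY_FLAG="${READY_FLAG:-/run/main_app/ready-for-update}"
RETRY_LATER=21

check_ready_for_update() {
    if [ -e "$READY_FLAG" ]; then
        # Application reported that it is safe to continue.
        return 0
    fi
    # Ask the application to halt; repeating this on every retry is harmless.
    mkdir -p "$(dirname "$REQUEST_FLAG")" && : > "$REQUEST_FLAG"
    return "$RETRY_LATER"
}

# Only act as an entry point when run under a state script file name,
# so the function above can also be sourced and exercised on its own.
case "$0" in
    */ArtifactInstall_Enter_*)
        check_ready_for_update
        exit $?
        ;;
esac
```

Keeping the decision in one small function makes it easy to swap the flag files for a real check, such as asking supervisor for the program status, without touching the exit code handling.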

Does this make sense? Why do you think that update control maps are a better fit?

Lluís

Hey guys,

Thanks for sharing your knowledge and providing hints on the best direction to go in terms of Mender integration. @robertalexa is busy with another project at the moment, so I would like to step in and resolve some design concerns encountered in the meantime.

@lluiscampos thanks for the proposed approach. To reiterate - a state script will return 21 (“retry later”) until all the requirements are satisfied. Then, when all the requirements are satisfied, the state script returns 0, which allows the Mender update to continue.

The “retry later” approach seems much more straightforward than update control map manipulation. Seems like we overlooked such a neat approach to the issue :sweat_smile:.

However, there is a specific scenario that might trigger the system to misbehave - a sudden power loss while a state script is in the “retry later” state. Let me provide an example for clarification:

  • Update is received - Download state completed successfully, deployment enters ArtifactInstall_Enter state
  • During ArtifactInstall_Enter a state script is run to check whether the requirements are satisfied - not all requirements are satisfied yet, so it returns “retry later”
  • The state script is scheduled to be rerun in X minutes
  • Device loses power (i.e. customer disconnects the power plug)
  • Device gets connected back to the power outlet - as per the documentation, the Artifact enters an error state, which results in failure of the entire deployment

Ideally, after a power loss in such a scenario, the update should still be applied under the same deployment.

Is it possible to apply a deployment that was delayed via a “retry later” state script after a power loss?