Update when user interacts

Hi all,

For our new product, we use ElectronJS to build an application in combination with the UP Squared. This application is a kind-of POS system. These devices are going to stores all around the world and will be used by people all day. So, updating whenever there is an update is not what we want and have in mind, because then the customer can’t use his/her system anymore when the containers are restarting. And when the ElectronJS container is restarting, the whole application quits and starts again. Again, not what we want.

In earlier products, we didn’t use containers and made an update system that downloads a package, opens it and installs what needs to be installed. Using containers is much more flexible and more reliable. But in BalenaOS, whenever I push an update, every device installs that update. I know about the lock files and the Balena Supervisor API, but I was wondering if anyone has any suggestions to tackle two problems.

So, TL;DR, 2 questions:

  • What’s the best way to put my application in charge of when all containers can update? (For example, when a user clicks a button and/or when it’s between 00:00 and 03:00.)
  • Is it possible to push an update that only certain devices can download until the update is “published”? So like a fleet of beta-devices and when the update is tested properly, it can be published to all devices.

Thanks in advance!

I am aware of three potential solutions:

1- Staged releases
Use/adapt the “staged releases” scripts found in this repo: https://github.com/balena-io-projects/staged-releases

  • “Rolling release” is the feature by which all devices update ASAP when you push an update to your app. The disable-rolling-release-on-fleet.sh disables that.
  • To trigger an update to a specific device when that device’s user “clicks on a button”, I guess the set-device-to-a-release.sh script could be used/adapted. To be honest I’m not sure if the device itself can run that code, but if not then I suppose the device could contact a server of yours, and the server could run the script.
  • Is it possible to push an update that only certain devices can download until the update is “published”?
    Here enters the update-test-group.sh script.

2- Update strategies
An alternative approach that may be suitable in some cases is to change the supervisor’s “update strategy” as described in this page: https://www.balena.io/docs/learn/deploy/release-strategy/update-strategies/

The strategies are “download-then-kill”, “kill-then-download”, “delete-then-download” and “hand-over”. The “hand-over” strategy is the one potentially useful here, as it could deliver zero downtime if the device has sufficient resources and the app’s logic can be modified to accommodate it.

3- Application update locks (“lock files”)
You’ve mentioned you are already aware of this solution, but I wanted to add it to this list for the benefit of anyone else reading this post. Your app can create a lockfile that prevents the supervisor from killing and restarting the app, but updates will still be downloaded to be applied once the lockfile is removed. https://www.balena.io/docs/learn/deploy/release-strategy/update-locking/
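For readers who have not used update locks, here is a minimal shell sketch of taking and releasing one. It assumes the lock path /tmp/balena/updates.lock that comes up later in this thread; the function names and the LOCK_FILE variable are hypothetical, and the documentation page linked above remains the reference for the recommended tooling.

```shell
# Hypothetical helpers; LOCK_FILE defaults to the path used in this thread.
LOCK_FILE="${LOCK_FILE:-/tmp/balena/updates.lock}"

# Create the lock file atomically; returns non-zero if it already exists.
take_update_lock() {
  mkdir -p "$(dirname "$LOCK_FILE")" || return 1
  # noclobber (set -C) makes the redirection fail when the file exists.
  # The subshell keeps the option change local to this one command.
  ( set -C; : > "$LOCK_FILE" ) 2>/dev/null
}

# Remove the lock so that the supervisor may apply a pending update.
release_update_lock() {
  rm -f "$LOCK_FILE"
}
```

The app would call take_update_lock at startup and release_update_lock once the user agrees to update.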

Do you think your use case is covered by one or a combination of these options?

Regards,
Paulo


Hi @pdcastro,

Thanks for the quick and detailed response. I will look into all the potential solutions you’ve mentioned!
The staged releases look like something I will use for the groups, so I can create testing/beta/production groups.

About the update locks, if I understand correctly, the containers will be downloaded but only installed when the update locks are removed? Because that would be a possibility in combination with the staged releases. Downloading the update and only installing it when something is triggered sounds like something that would work and be fast. The only question then is, is it possible to check if there are new containers “waiting to be installed”? Because then I can give the user feedback that new containers will be installed.

Thanks in advance!

@bversluijs, a colleague has pointed out that the update locks are created under /tmp and are automatically removed on reboot, so consider that the app might end up being updated if the end user reboots or power cycles it. This may be a good or a bad thing for your use case.
Your understanding matches the documentation: the containers will be downloaded but only installed when the update locks are removed. I think you’d need to know how long it takes for the update to be triggered after the lock is removed – it might be OK to wait 5 seconds, but maybe not 10 minutes. I’ll have to carry out some investigation and come back to you on this point, and also on whether / how your app could tell that updates are waiting to be installed. (If it’s not currently possible, it sounds like a useful feature request to be implemented.)

Regards,
Paulo

Hi @pdcastro,

Thanks for your explanation. Currently it’s no issue that the newest updates get installed when the system reboots. However, we would like to have this process under our control, so that if we decide in the future that updates should only be installed when the user interacts, we can implement that. Otherwise, when the device is booting and no kiosk is starting, the user thinks something is wrong, so that may be an issue.

Regarding how long it takes before an update is installed or triggered to install: knowing that would be a great feature, as would something like a “force install now” option. I can show the user in the Electron app what the device is currently doing, so it would be useful to show when the device is really installing the updates, and before that whether an update is available, either locally (already downloaded) or remotely (not downloaded yet, but available for download).

Thanks in advance!

@bversluijs Supervisor maintainer here, hopefully I can answer your questions.

The update locks themselves haven’t really aged well, and we plan to overhaul them fairly soon, although we’re still in the architecture phase of that.

For your above questions, the length of time until a release is installed after a lock has been released is not usually determinable, but you can force this by calling the /v1/update endpoint on the supervisor API. This tells the supervisor to trigger the update process, pretty much instantly.
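A call to that endpoint could look like the sketch below. It assumes the service carries the io.balena.features.supervisor-api label so that BALENA_SUPERVISOR_ADDRESS and BALENA_SUPERVISOR_API_KEY are set (older supervisors expose the same values under RESIN_ names); the function names are hypothetical.

```shell
# Build the URL for a supervisor endpoint path such as /v1/update.
supervisor_url() {
  printf '%s%s?apikey=%s' \
    "$BALENA_SUPERVISOR_ADDRESS" "$1" "$BALENA_SUPERVISOR_API_KEY"
}

# Ask the supervisor to check for and apply its target release now.
# "force": false means an existing update lock is still respected.
trigger_update() {
  curl -sS -X POST \
    --header "Content-Type:application/json" \
    --data '{"force": false}' \
    "$(supervisor_url /v1/update)"
}
```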

As for detecting when the update is ready to apply, again using the supervisor API, you can check for a downloaded status on the services. The endpoint to call is

curl --header "Content-type:application/json" "$RESIN_SUPERVISOR_ADDRESS/v2/applications/state?apikey=$RESIN_SUPERVISOR_API_KEY"

which will return an object similar to:

{
  "supervisortest": {
    "appId": 1011165,
    "commit": "42a5d01723cac00538fb10df5ea0a671",
    "services": {
      "main": {
        "status": "Downloaded",
        "releaseId": 688486,
        "downloadProgress": null
      }
    }
  }
}
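To turn that response into something an app can act on, one option is to list the services whose status is “Downloaded”. This is only a sketch: it assumes jq is available in the container, uses the BALENA_ variable names (the RESIN_ names above are the older equivalents), and the function names are hypothetical.

```shell
# Read a /v2/applications/state response on stdin and print the names of
# the services whose status is "Downloaded", one per line.
downloaded_services() {
  jq -r '.[].services | to_entries[]
         | select(.value.status == "Downloaded") | .key'
}

# Fetch the state from the supervisor and list the downloaded services.
fetch_downloaded_services() {
  curl -sS "$BALENA_SUPERVISOR_ADDRESS/v2/applications/state?apikey=$BALENA_SUPERVISOR_API_KEY" \
    | downloaded_services
}
```

An empty result means nothing is waiting; a non-empty one could drive an “update ready to install” notice in the UI.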

Let me know if this helps you!

@CameronDiver, thanks for the explanation! I’ll let you know if everything works as expected and if I’m missing any features.

Regarding the update locks, you’re not planning to remove them completely? Because my plan was to always have the update lock set, so the containers don’t update unless a user / the custom system asks the OS to. I think this is the correct way to reach my goal?

And since I’m already asking some questions about updating, what’s the best way to restart another container when a container is updated? The ElectronJS app browses to the NodeJS app (going to http://nginx_container/, and nginx serves the NodeJS/ExpressJS server via a proxy), but when the NodeJS container is updated, the ElectronJS container has to restart or refresh in some way.

I have 2 options in mind:

  • Restart the whole ElectronJS container when the NodeJS container is restarted (I don’t know what the best way is to do this yet)
  • Refresh the ElectronJS browser when it detects that the NodeJS container is offline (Checking connection of socket or have a healthcheck ping to check the NodeJS container version, or both)

And today I had a download loop of an image. I changed the image from node:8-alpine to node:8-slim, because I have a library that requires glibc and doesn’t work in alpine (which is a real bummer because of image size). The image only downloaded successfully after I rebooted the device. But of course this is not how the update process should behave. It didn’t say it failed to download, but just restarted the download at 0% and ran again up to 50%. I’ve tried it with delta updates on and off, but both failed and restarted the download at around 50%. Do I have to create a new topic for this? Or is this topic just fine?

Anyway, thanks for the response and thinking along for a solution. It makes using Balena for this and future projects a lot better!

Certainly not - we are thinking about how to make them more resilient and helpful, whilst maintaining full backwards compatibility - quite the undertaking :)

I would always recommend using staged releases instead of update locks for long lived locks, but different situations call for different methods - so if this works for you then I’d say go for it.

For your other question, we do have update strategies which aim to cover cases like this, you can find out more here: Fleet update strategy - Balena Documentation

For your last point, if the device is not in this state anymore, I couldn’t really say what had happened. If the device gets into this state, you could make another topic and share the logs with me and I might be able to see what’s going on.

We have a very similar need, but haven’t found an easy way of reaching our goal.
We would like a notification to pop up on each device: “software update available”, and it is up to the end user to accept when (or if) he wants to upgrade. Much like your smartphone works.

Our best approach so far seems to be to set up a dedicated server, “upgrade-manager”, that works as a middle man between devices and the balena backend.

A device may then periodically contact the upgrade-manager and say:

Hey I am device “XYZ”, are there any updates available for me?
The upgrade-manager will have all information from the balena-cloud.
The upgrade-manager can have some rules that based on device tags and release tags, will determine which releases should be offered for the device “XYZ”.
When the user accepts an upgrade, the upgrade-manager takes care of requesting the upgrade from the balena-backend.

In principle this functionality can live on each device, but this would require that each device has highly privileged access to the balena-cloud, which might not be a risk you want to take.

We have not implemented this yet, but plan to do so unless we discover a better way.

@krix the method that you describe would be the one that I would recommend - regarding the permissions, you can use the device api key for this purpose (using the io.balena.features.balena-api label, documentation here: https://www.balena.io/docs/reference/supervisor/docker-compose/#labels). You can assign this label to a single service as well, to reduce the reach that other services have.

Balena SDK may also be helpful, as it provides a lot of helper methods for doing this kind of thing.

I assume you’re using pinned releases to do this? The device api key has the permissions to do this.

Your use case sounds interesting, I’d love to hear more about it!

@bversluijs @pdcastro
I haven’t been able to use the combination of “disabling rolling releases” and “update-locks”.
Could it be that these don’t go together?
If I point a device to a specific release it seems to update regardless of a lockfile “/tmp/balena/updates.lock” being present…

What do you mean by “disabling rolling releases”? Using pinned versions + update lockfiles should stop the device from updating to a new pinned version until the lockfile is removed, unless you pass the force option when pinning the version.

@brownjohnf, I think @krix meant the disable-rolling-release-on-fleet.sh script I mentioned in the 2nd comment in this thread. By the way, @brownjohnf, when you say “pinned versions”, is it the same functionality as the pin-devices-running-release.sh script?

I was now looking at the implementation of those shell scripts and they are basically making a few queries to the balena API. By the way, for efficiency and flexibility, app developers should probably make the API queries directly in their programming language of choice, rather than running the shell scripts in child processes.

@CameronDiver wrote that he “would always recommend using staged releases instead of update locks for long lived locks,” and I’m pondering if there is a sequence of steps that minimise the need for update lock files (/tmp/balena/updates.lock), while delivering everything the users are asking in this thread.

Generalising a bit, I understand that the needs are:

  1. A balena user app (based on end user input) controls when/if the app container(s) get updated. (An app update pushed to the cloud does not automatically trigger update on the device.)
  2. The app needs a way to know whether an update is available, to notify the device’s end user.
  3. The app needs a way to trigger immediate download of an update, if one is available. But this should not automatically cause the update to be applied, once the download finishes.
  4. The app needs a way to know whether an update download has finished.
  5. The app needs a way to apply the update immediately, after download has finished.

It looks like the disable-rolling-release-on-fleet.sh and set-device-to-a-release.sh scripts deliver on the first requirement. They make these API calls:

curl -X PATCH "https://api.$BASE_URL/v4/application($APP_ID)" -H "Authorization: Bearer $authToken" -H "Content-Type: application/json" --data-binary '{"should_track_latest_release":false}'

curl -X PATCH "https://api.$BASE_URL/v4/device($DEVICE_ID)" -H "Authorization: Bearer $authToken" -H "Content-Type: application/json" --data-binary '{"should_be_running__release":'$RELEASE_ID'}'

These calls set the should_track_latest_release field in the Application object and the should_be_running__release field in the Device object:
https://www.balena.io/docs/reference/api/resources/application/
https://www.balena.io/docs/reference/api/resources/device/

For the app to know whether an update is available, I’m thinking that the app could query the Release object:
https://www.balena.io/docs/reference/api/resources/release/
The Release object has fields like created_at and several others that the app can compare with the release it is currently running. I assume this is sufficient for the app to decide that an update is available.
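One way to sketch that check in shell: ask the API for the newest successful release and compare its commit with the commit the device is running (for example the commit field of the supervisor state shown earlier). This is untested and rests on assumptions: API_URL, APP_ID and AUTH_TOKEN are placeholders, the use of $orderby and $top is my assumption about the API's OData support, jq must be installed, and the function names are hypothetical.

```shell
# Fetch the most recent successful release of the application.
# URL style follows the get-release-id.sh query quoted in this thread.
fetch_latest_release() {
  curl -sS "$API_URL/v4/release?\$select=id,commit,created_at&\$filter=belongs_to__application%20eq%20$APP_ID%20and%20status%20eq%20'success'&\$orderby=created_at%20desc&\$top=1" \
    -H "Authorization: Bearer $AUTH_TOKEN"
}

# Read a release query response on stdin; print the first entry's commit,
# or nothing when the list is empty.
first_release_commit() {
  jq -r '.d[0].commit // empty'
}

# Succeed when a newest commit is known and differs from the running one.
# $1 = commit the device is running, $2 = newest commit from the API.
update_available() {
  [ -n "$2" ] && [ "$1" != "$2" ]
}
```

Usage would be along the lines of: update_available "$RUNNING_COMMIT" "$(fetch_latest_release | first_release_commit)" && echo "update available", where RUNNING_COMMIT is whatever the app has recorded as its current commit.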

By the way, note that commit and release ID can be translated. Given a release ID, the Release object can be queried for the commit hash. Given a commit hash, the get-release-id.sh script shows how to get a release ID:

curl "https://api.$BASE_URL/v4/release?\$select=id,commit&\$filter=belongs_to__application%20eq%20$APP_ID%20and%20commit%20eq%20'$COMMIT_HASH'%20and%20status%20eq%20'success'" -H "Authorization: Bearer $authToken" | jq '.d[0].id'

For the 3rd requirement, to trigger download of an update, the app could change the value of the Device.should_be_running__release field to the desired release, and then use the supervisor /v1/update API to trigger the download as @CameronDiver pointed out. But before that, if the app does not want the update to be automatically applied when download finishes, the app should create the /tmp/balena/updates.lock update lock file – but only for the duration of the download.

For the 4th requirement, again as @CameronDiver pointed out in his answer, the /v2/applications/state supervisor API could be used, to check for a status value of “Downloaded”.

Finally, for the 5th requirement, the app would delete the update lock file and make another call to the supervisor /v1/update API. This step is not needed if the lock file was not created in the first place.
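Put together, requirements 3 to 5 could be sketched roughly as follows. Like the steps themselves, this is untested: API_URL, AUTH_TOKEN, DEVICE_ID and POLL_SECONDS are placeholders, all function names are hypothetical, jq is assumed to be installed, and whether /v1/update starts the download while the lock is held is exactly the assumption that needs validating.

```shell
LOCK_FILE="${LOCK_FILE:-/tmp/balena/updates.lock}"

take_lock()    { mkdir -p "$(dirname "$LOCK_FILE")" && touch "$LOCK_FILE"; }
release_lock() { rm -f "$LOCK_FILE"; }

# Pin this device to release $1 (same PATCH as set-device-to-a-release.sh).
pin_device() {
  curl -sS -X PATCH "$API_URL/v4/device($DEVICE_ID)" \
    -H "Authorization: Bearer $AUTH_TOKEN" \
    -H "Content-Type: application/json" \
    --data-binary "{\"should_be_running__release\":$1}"
}

# Ask the supervisor to act on its target state now, respecting the lock.
trigger_update() {
  curl -sS -X POST -H "Content-Type: application/json" \
    --data '{"force": false}' \
    "$BALENA_SUPERVISOR_ADDRESS/v1/update?apikey=$BALENA_SUPERVISOR_API_KEY"
}

# Requirement 4: poll until some service reports "Downloaded".
wait_downloaded() {
  until curl -sS "$BALENA_SUPERVISOR_ADDRESS/v2/applications/state?apikey=$BALENA_SUPERVISOR_API_KEY" \
      | jq -e '[.[].services[].status] | any(. == "Downloaded")' >/dev/null; do
    sleep "${POLL_SECONDS:-5}"
  done
}

# Requirement 3: download release $1 without applying it.
download_release() {
  take_lock && pin_device "$1" && trigger_update && wait_downloaded
}

# Requirement 5: apply the downloaded release.
apply_release() {
  release_lock && trigger_update
}
```

The app would call download_release with a release ID when the user accepts the update, and apply_release once the user confirms the restart.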

Disclaimer: I have not tested these steps! This post is just some thinking and needs validation. If you can confirm or reject some of the steps/assumptions, please share.

@krix, if the lock file seems to be ignored, check perhaps if the Enable Lock Override option was selected in the web dashboard, device summary screen, or if the BALENA_SUPERVISOR_OVERRIDE_LOCK configuration variable was set to 1 in the device configuration screen, as described at the end of the update lock documentation page: https://www.balena.io/docs/learn/deploy/release-strategy/update-locking/

My lock file problem (sorry I did not mean to hijack this thread) is resolved now. I had created the lockfile /tmp/balena/updates.lock.lock and not /tmp/balena/updates.lock. However it does not really do what I expected: if I restart through /v1/restart the lockfile is preserved (which is ok), but the application won’t start with the old not-updated application as I would expect. The balena OS stays in “booting” until the lockfile is removed.

Hi @krix,

Just to confirm, if you restart the application via /v1/restart with a pinned release and a lockfile in place, the latest release is deployed when the application restarts? Or am I misunderstanding something about what you are seeing?

Thanks!

Sorry, let me be more accurate:

  • create lockfile
  • change pinned release to something else
  • await download of new release
  • call end point /v1/restart
  • The restart is refused, reply: Updates are locked: EEXIST: file already exists

The same is seen for /v1/reboot and /v1/shutdown.
I was hoping that it would be possible to restart the application and/or the complete device and remain on the existing release. But that is apparently not possible.

Reboot and shutdown actions are also blocked if there’s a lockfile set. Either the application needs to remove the lockfile, or you have to set a lockfile override (e.g. from the dashboard). Alternatively, you can call the supervisor endpoint with the force value set to true; see e.g. the documentation at: https://www.balena.io/docs/reference/supervisor/supervisor-api/#post-v1-reboot
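As a sketch, a forced reboot from inside a container could look like this. It assumes the supervisor API variables are available to the service (via the io.balena.features.supervisor-api label); the function names are hypothetical.

```shell
# Build the JSON body for a supervisor action; $1 is "true" or "false".
force_body() {
  printf '{"force": %s}' "$1"
}

# Reboot the device even though an update lock is held.
force_reboot() {
  curl -sS -X POST \
    --header "Content-Type:application/json" \
    --data "$(force_body true)" \
    "$BALENA_SUPERVISOR_ADDRESS/v1/reboot?apikey=$BALENA_SUPERVISOR_API_KEY"
}
```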

I’m just checking whether the “restart” endpoint (application restart) has the force option too, it should have, but then we’ll have to add it to the docs as well (not listed on that, but listed on reboot, shutdown…)

Let us know if you had a chance to try this out!

Try it out? Not sure which test you are referring to. I did test that removing the lockfile or using the lockfile override enables the reboot/shutdown/restart actions. This works and is not a problem. I was just saying that the way it works is too limited for me. I cannot really use the lockfile as a means for the user to decide when to apply the update, because if he for some reason needs to restart or power cycle the device, the upgrade is enforced.

@krix meaning calling the API with the force: true enabled, which is a one-time override of the lockfile.

But yeah, that extra information you mention adds to the conversation. I don’t think lockfiles are the way to go for you there, because lock files are meant to block updates only temporarily.

I guess there’s no current model to “download, but do not update” in balena. The closest is pinned releases, for which there’s an example repo, of how to use the API calls:

In this case, I believe it would be something like:

  • the device pinned to a release
  • enable some user interaction that would enable the user to change the pinned release
  • this then would result in an application download and update

Some downsides:

  • the download happens after the repinning, thus the change is less immediate
  • you likely will need an external service you create that would get the application releases that are available, and notify the device, and also hold the API key that can repin a device to a new release (the device cannot do it itself). A small external server could do such a thing. We don’t have an example of that (yet), but this question pops up frequently enough to revisit it.
  • also note, that this doesn’t use the locking feature at all

Does this above process make sense? Would it be closer to what you imagine? (except the download being ready on the device)

Hi @krix, just a ping to see if you’ve managed to try out our suggestions yet? :)