Supervisor /v1/device API missing commit if update pending.

Hi,

We’re using the supervisor API /v1/device request to query the current release hash and download status (if applicable), but have found that if update_pending == True, the response no longer contains the commit field. The documentation implies that it will always be present so we expected it to continue reflecting the current release (not target release) if an update was pending but held off because of an update lock. We certainly didn’t expect it to disappear!

Is this expected behavior?

Thanks,

Adam

Hello, I had a quick look at the supervisor source code and, from what I could understand, we seem to remove the commit field when an update is in progress. However, I am not sure exactly why this is done, so I have reached out to the supervisor team to understand it better. I will let you know as soon as I hear back from them.

Thanks, @nazrhom. I’ve been doing some more testing and it seems like sometimes I get the following response, even when nothing has been updated:

{'api_port': 48484, 'ip_address': '192.168.0.140', 'os_version': 'balenaOS 2.50.1+rev1', 'supervisor_version': '11.4.10', 'update_pending': True, 'update_failed': True, 'update_downloaded': False, 'status': 'Installed', 'download_progress': None}

For the last hour or so I’ve been changing some stuff in the shared volume and restarting services/rebooting the device. I haven’t done any release builds, changed the device’s pinned release, or anything like that at all. Sometimes, but not every time, when I power cycle the device the supervisor comes back with the above status indicating that an update is pending. Restarting the service does not change the status. If I power cycle the device again, on the other hand, it seems to clear this condition maybe half or more of the time.

It seems like I can replicate this condition if I first restart the service manually at least once and let it come up to steady state before power cycling the device, though it’s hard to tell if there’s a real correlation there.

There’s nothing I can see in the host OS journal that would explain why it’s saying there’s an update pending. Any ideas?

Quick update: just rebooted again and this time my first query came back with update_pending == True as above, and then a second query about 1 second later came back with false but still no commit field:

{'api_port': 48484, 'ip_address': '192.168.0.140', 'os_version': 'balenaOS 2.50.1+rev1', 'supervisor_version': '11.4.10', 'update_pending': False, 'update_failed': False, 'update_downloaded': False, 'status': 'Idle', 'download_progress': None}

All queries after that continued to come back like that - update pending false, no commit field - though one of them got a connection refused error instead. Not sure why it’s different this time. It seems to be non-deterministic.
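For anyone hitting the same thing, here is a minimal sketch of a client that tolerates both quirks described above: the missing commit key and the occasional connection refused error. It assumes the supervisor address and API key are available to the container as the BALENA_SUPERVISOR_ADDRESS and BALENA_SUPERVISOR_API_KEY environment variables. The helper names parse_device_state and fetch_device_state are made up for illustration and are not part of any balena library.

```python
import json
import os
import time
import urllib.error
import urllib.request


def parse_device_state(payload):
    """Pull out the fields we care about; 'commit' may be absent."""
    return {
        "commit": payload.get("commit"),  # None when the supervisor omits it
        "update_pending": bool(payload.get("update_pending", False)),
        "status": payload.get("status"),
    }


def fetch_device_state(retries=3, delay=1.0):
    """Query /v1/device, retrying if the supervisor is not accepting connections yet."""
    # Assumption: both variables are set in the container's environment.
    address = os.environ["BALENA_SUPERVISOR_ADDRESS"]
    api_key = os.environ["BALENA_SUPERVISOR_API_KEY"]
    url = f"{address}/v1/device?apikey={api_key}"
    last_error = None
    for _ in range(retries):
        try:
            with urllib.request.urlopen(url, timeout=5) as resp:
                return parse_device_state(json.load(resp))
        except (urllib.error.URLError, ConnectionError) as err:
            last_error = err
            time.sleep(delay)
    raise RuntimeError(f"supervisor unreachable: {last_error}")
```

Using payload.get("commit") instead of payload["commit"] means a response like the ones above yields None rather than raising KeyError.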

Hi Adam,

Thanks for the update and the additional information. We’re still tracking this down with our engineers to get a read on the behavior you’re seeing.

John

Hi Adam,

Thanks for your feedback and your investigation.

The device supervisor will set update_pending = true while it’s applying the target state reported by the cloud. On start, the supervisor does not know yet if it has to perform any updates so it sets update_pending until it has verified it is at the correct state, therefore the behavior you have observed is what is expected. As you have also observed, the change in this value correlates with the value reported by the commit field in most cases but not always.

However, I tend to agree with you that the commit should be available whenever there is a release installed on the device, independently of the update status. I have created an issue to track this.

We’ll update you when we have news on the issue.

Cheers
Felipe

Ok, that makes sense. Thanks, Felipe.

You might want to update the documentation so it’s clear when update_pending will be set. As written, it implies that it will only be true if there is an update to the current software, not also just always at boot.

Personally though, I think that behavior isn’t really great. Ideally, update pending is something the software should be able to use to put a notification in front of a user whenever an actual update was pending. With this boot up corner case, they’d potentially get an update notice when none was available, which would be quite confusing.
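One possible client-side workaround, sketched here purely as a heuristic: only treat update_pending as a real update once it has stayed true for several consecutive polls, so the brief true-then-false flip at boot does not reach the UI. The class name PendingDebouncer and the default of 5 polls are arbitrary choices for illustration. Note that this cannot help in the case reported earlier in the thread where the flag stayed true until the next power cycle.

```python
class PendingDebouncer:
    """Report an update only after update_pending has been true for N polls in a row."""

    def __init__(self, required_polls=5):
        self.required_polls = required_polls
        self._streak = 0  # consecutive polls with update_pending == True

    def observe(self, update_pending):
        """Feed one poll result; return True once the streak is long enough."""
        self._streak = self._streak + 1 if update_pending else 0
        return self._streak >= self.required_polls
```

A caller would invoke observe() once per /v1/device poll and show the notice only when it returns True.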

Hey Adam, I just created https://github.com/balena-io/balena-supervisor/pull/1507 to improve the wording of this value. I don’t think we’ll change the behavior because, to me, checking or determining whether there is an update matches the description of pending. I’ll bring it up with some more people though.

I could go either way on that. Technically “pending” means awaiting a decision, and in theory that decision could be “is there an update available or not?” In that sense, updating the documentation would definitely help.

It would still be really useful from our end though to have a signal that is “update is available,” i.e., true only if there is a new software update ready to be installed (ideally with the hash of what the update release is rather than just a bool). That way we could notify users in a UI and have an interactive “click to install” type of thing. With the current update pending behavior, the UI might show a notice when no update was actually available, at least temporarily until the supervisor set it back to false. We wouldn’t want to confuse users with that.

The hash would be nice so we could use it to query for release tags. We use tags to store software version numbers since displaying a UUID is not particularly user-friendly. Our build script tags each release automatically via the Python SDK.

Adam,

Thanks for the feedback, I know the docs have been updated around this so that hopefully clears up the confusion.

It would still be really useful from our end though to have a signal that is “update is available,”

So there is only going to be a timeframe that this is useful if you have pinned your device version to X and then pushed a new release Y; your device will not push past the pinned version, so it might make sense to advertise to the user that there is version Y. However I don’t see this as the remit of the Supervisor as there could be many reasons why you pinned it in the first place. If the device isn’t pinned then it should already be doing the update automatically and so there isn’t a need to inform the user.

Kind regards.

Rich

Hey Rich,

Yeah, that’s fair. Right now we have most of our customer devices tracking the application, and using the update lock to prevent the supervisor from resetting stuff when they’re actively navigating. That’s working for the moment, but we’re just thinking ahead design-wise. In hindsight, I wonder if as you said, the supervisor is actually not really the right way to get this information anyway.

Downloading updates automatically but holding them off via update lock may work for some users. However, we want to have the option not to do that, since the automatic download could have a big impact on bandwidth and our container deltas are often quite big. Most of our users are connected via cell, and when they’re navigating that definitely needs to take priority over downloading a release. Right now pinning appears to be the only way to prevent that. I don’t believe there’s a way to tell the supervisor to check for updates but not download them. It’s all or nothing, no?

We also don’t want to change software versions unexpectedly if they’re actively testing something, particularly if the change includes something that may not be backwards compatible. Don’t want to have any repeatability issues.

From https://forums.balena.io/t/update-when-user-interacts/, our plan was to use pinning to resolve this. Basically, instead of tracking the application, devices would be pinned to some release. The software would check if there was a new release available by seeing what the application was pinned to vs what the device is currently running. We tag releases with a version number since the release hashes don’t have any notion of semver, so our software would have to query the release tags via the Python SDK (and balena-api permissions).
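The comparison step of that plan could look roughly like the sketch below. It deliberately leaves out how the data is fetched: it assumes the caller already has the running commit (from /v1/device), the application’s target commit (from the balena API), and a dict mapping commit hashes to version strings built from the release tags. The function names and the dict layout are hypothetical.

```python
def update_available(running_commit, target_commit):
    """True only when both commits are known and they differ."""
    if not running_commit or not target_commit:
        # Covers the case where /v1/device omitted the commit field.
        return False
    return running_commit != target_commit


def display_version(commit, version_by_commit):
    """Prefer the tagged version number; fall back to a short hash."""
    if commit is None:
        return "unknown"
    return version_by_commit.get(commit, commit[:7])
```

Returning False when either commit is unknown keeps the UI from showing an update notice based on incomplete information.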

Does that sound about right?

Thanks,

Adam

Hi Adam, sorry, Rich’s last message was sent as an internal comment by mistake, so it didn’t make it to you. I have pasted it below:

Right now pinning appears to be the only way to prevent that.

The device-release pinning is the way we want users to control which release is running, the update-lock is (as you say) only to be used to prevent an update in the middle of a critical operation.

Your mechanism to read the release tags is a good one, and where I would go with it :ok_hand:

It’s interesting to read how users are trying to solve these problems, and our goal is to reduce the friction you face, so I will make a point to highlight this to product :+1:

Hi Shaun,

No worries. Glad to have some confirmation that we’re thinking along the right lines.

Is there any way to get the current application release through the supervisor if the device is pinned to a different release, or does that require API access? Not sure it matters too much since to query the tags you still need API access, mostly just curious.

Hi Adam,

You can get the current release the device is running either through the API or through the Supervisor API, which provides the current application commit.

Here are some related resources


The documentation for the /v1/device commit field says “Hash of the current commit of the application that is running.” In practice though, it’s actually the hash that the device is actively running, including if it’s pinned, as opposed to the hash that the application is set to, which the device in question may not be running right now. We’re using /v1/device to display the device’s release on our UI, and I just checked against a device I know is pinned, so I’m fairly confident that that is correct.

I was curious if there is a way to query the hash that the entire application is configured for in the dashboard, i.e., the default release if a device is not pinned. You can definitely do that via the API, but I was just wondering if it was possible to do through the supervisor without having to enable balena-api for the container as well.

Hi again Adam,

You’re right, I misunderstood your question. The commit shown by the /v1/device supervisor endpoint is the current commit of the application. There is no way to query the application releases through the Supervisor API, only through the API.

The goal of the Supervisor API is to communicate with a specific device, while the general API is to gather information about the general fleet, including applications, releases, devices, etc.

Let me know if that answers your question.
Felipe

Yep, that makes sense and is what I thought. I just figured I’d check. We will need API access to query the release tags anyway so no real problem there.

Cheers,

Adam

Great, let us know if you have any more questions!

Just to confirm: I believe the original issue is still not resolved, correct? commit still disappears whenever update_pending is true (not sure if it also disappears under any other circumstances). You mentioned you created a GitHub ticket to track that one. Do you have a link for that?

Yes, you’re correct, we are still discussing that. Here is the ticket: https://github.com/balena-io/balena-supervisor/issues/1504
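Until that issue is resolved, one way a client could cope with the disappearing field is to remember the last commit it saw and keep reporting it while the field is absent. This is only a sketch: the CommitTracker name is made up, the value is held in memory only (so it is empty again after a restart unless you persist it yourself), and it may be stale while a real update is being applied.

```python
class CommitTracker:
    """Remember the most recent commit reported by /v1/device."""

    def __init__(self):
        self.last_known = None

    def update(self, payload):
        """Feed one /v1/device response dict; return the best-known commit."""
        commit = payload.get("commit")
        if commit:
            self.last_known = commit
        return self.last_known
```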