That is strange. A bad configuration is possible, but then I’d be surprised to see downloads sometimes working and sometimes not. Flakiness usually points to something else, in many cases flaky Ethernet/Wi-Fi.
Where have you deployed openBalena?
Which device are you trying, and with which balenaOS version?
Does this happen with different devices/OS versions, or just one?
We probably need more logging. Is this log from the device side? What does the openBalena side log look like when it works/doesn’t work?
So it sounds like the backend is instructing the Supervisor to change releases to use your new image, but either:
the image hasn’t been pushed yet, so it’s missing;
or the auth token the device is using isn’t valid for that resource, so it appears missing.
Given that the auth tokens wouldn’t change at all, I think it has to be the first scenario: the instruction to change images is happening faster than the registry can complete the upload. This is a very unusual scenario though, and not something I have personally come across during my testing with openBalena.
I will try to find some time to replicate this (I have a Pi3B though, not an OrangePi) and confirm the theory, but thanks for pointing out this potentially annoying behaviour.
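To make the first scenario concrete, here is a minimal sketch of waiting for an image to appear before acting on it. This is not how the Supervisor or openBalena actually behaves; `wait_for_image` and its parameters are hypothetical names made up for illustration. It assumes the caller supplies a function returning the HTTP status of a manifest check (for example, a request to the registry's `/v2/<name>/manifests/<reference>` endpoint), where 404 means the image is not there yet.

```python
import time
from typing import Callable


def wait_for_image(
    manifest_status: Callable[[], int],
    attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until the manifest check returns 200, doubling the delay after each 404.

    Hypothetical helper: `manifest_status` is supplied by the caller and
    returns the HTTP status code of a registry manifest lookup.
    """
    for attempt in range(attempts):
        status = manifest_status()
        if status == 200:
            return True  # the image is available
        if status != 404:
            # Anything else (e.g. 401) is a different problem, such as auth.
            raise RuntimeError(f"unexpected registry status: {status}")
        if attempt < attempts - 1:
            sleep(base_delay * 2 ** attempt)  # exponential backoff
    return False  # still missing after all attempts


# Simulated registry: the upload completes just before the third check.
responses = iter([404, 404, 200])
delays = []
ok = wait_for_image(lambda: next(responses), sleep=delays.append)
print(ok, delays)  # True [1.0, 2.0]
```

If the race theory is right, a check like this would succeed after a short wait; if it keeps returning 404, the upload timing is probably not the cause.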
We recently switched to a different update strategy. Before: download then kill; now: kill then download. The new strategy seems to work much more smoothly for some reason, and we have not seen the 404 error since. Not sure why. As for your theories, I am not sure why the download would then work when you simply unplug and replug your system.
@torben, Rich has not yet had a chance to investigate further, but he is still going to try to take a look at the problems you encountered. Meanwhile, it sounds like you found a solution that works for you?
Before: download then kill; now: kill then download. The new strategy seems to work much smoother for some reason and we have not seen the 404 error anymore. Not sure why.
Well, this sounds potentially compatible with @richbayliss’ theory that “the instruction to change images is happening faster than the registry can complete the upload.” If it is a timing issue, some sort of race condition where a few seconds make a difference, then I imagine that the “kill then download” strategy may give the device a few extra seconds before it attempts the download (it takes a bit of time to kill the containers before the download is attempted). (I am just reasoning without having actually tested or measured anything…)
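The timing argument can be written down as a toy model. Everything here is an assumption for illustration: the function `download_gets_404` is hypothetical, and the numbers (a 5-second container kill, an upload finishing 3 seconds after the instruction) are invented, not measured.

```python
def download_gets_404(
    strategy: str,
    upload_done_at: float,
    instruction_at: float = 0.0,
    kill_time: float = 5.0,
) -> bool:
    """Return True if the download would start before the upload has finished.

    Toy model only: times are in seconds and are made-up values.
    """
    if strategy == "download-then-kill":
        # The download is attempted as soon as the instruction arrives.
        download_starts = instruction_at
    elif strategy == "kill-then-download":
        # The containers are killed first, which delays the download.
        download_starts = instruction_at + kill_time
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return download_starts < upload_done_at


for strategy in ("download-then-kill", "kill-then-download"):
    print(strategy, download_gets_404(strategy, upload_done_at=3.0))
# download-then-kill True
# kill-then-download False
```

Under these assumed numbers, only the “download then kill” ordering hits the missing image, which would match what was reported; real timings would need to be measured to confirm it.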
While we investigate, let us know if this issue is “blocking” your app development or if you have additional findings or thoughts that might help resolve it. Thanks for reporting it!