Hello @sgserg, after speaking with my colleagues, we have a hypothesis about what is happening.
If the base image has changed, the delta will need enough disk space to store all the layers of the base image that have been updated. If this is the case, try to allow delete-then-download update (at the expense of downtime and bandwidth).
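To make the disk-space condition above concrete, here is a minimal sketch of a pre-update check. It assumes you already know the sizes of the layers that changed; the function name `delta_fits`, the 10% safety margin, the example sizes, and the `/` path are illustrative assumptions, not part of balena-engine or the supervisor.

```python
import shutil


def delta_fits(changed_layer_bytes, free_bytes, margin=0.10):
    """Return True if the updated layers fit in the free space.

    changed_layer_bytes: sizes (in bytes) of the layers that changed.
    free_bytes: free space (in bytes) on the partition holding the images.
    margin: fraction of the free space kept in reserve (an assumption).
    """
    required = sum(changed_layer_bytes)
    usable = free_bytes * (1.0 - margin)
    return required <= usable


if __name__ == "__main__":
    # Hypothetical sizes for the base-image layers that were updated.
    changed = [120_000_000, 45_000_000, 8_000_000]
    # "/" is a placeholder; point this at the partition that stores images.
    free = shutil.disk_usage("/").free
    if delta_fits(changed, free):
        print("The updated layers should fit.")
    else:
        print("Not enough space; consider delete-then-download.")
```

If the check fails, that is the situation where delete-then-download is the fallback described above.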
Sure, glad to take a look. Are you seeing the same error messages, such as “Delta still processing remotely. Will retry…”? If you can attach or send us the device diagnostics, that may help.
@mpous @alanb128 Sorry to bump this after over a year, but we just ran into this same issue, and I was wondering if there is a workaround other than the “delete then download” method, which would result in downtime.
We recently pushed an update to our fleet which inadvertently included an updated balenalib/raspberrypi4-64-debian:bookworm image, and now we are having devices run out of storage as they try to apply the delta updates. I’m assuming it is related to this comment above:
If the base image has changed, the delta will need enough disk space to store all the layers of the base image that have been updated. If this is the case, try to allow delete-then-download update (at the expense of downtime and bandwidth).
If we kill and delete the old containers and let them redownload from scratch, the devices download the images fine - which I guess is in essence the same as applying a “delete then download” strategy… but I’m hoping there might be a better way?
And separately, mostly out of curiosity, but also because I’d like to see if I can help - what is the reason that balena-engine is not able to handle deltas appropriately when base images change? It seems to work great otherwise…
@mpous we are running openBalena with a delta server that uses balena-engine to create the deltas. While I know that potentially introduces a number of other variables, I believe this is the issue we are seeing.
@mpous it is our own open source delta server, which is based on ‘balena-engine’ as detailed here and here.
Could I ask it a different way - is the issue noted above still present in balena cloud? (i.e. if base images change, does that require devices to download new base images plus changed layers)? And if so, is there some kind of fundamental reason why this can’t be handled differently / can I help resolve it?
It feels like a fairly significant issue, because people who use delta updates rely on them being small, and I suspect most won’t appreciate that you need to leave 2x your fleet size in storage headroom on your device just in case base images change; otherwise the update will get stuck. This matters especially because the base images do seem to change from time to time: the Debian one I noted above was changed just two weeks ago but retained the same tag, so even if you were “pinned” to that image, it would have changed and would necessitate the storage headroom.
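To illustrate where the 2x figure comes from, here is a rough sketch of the arithmetic. It uses a simplified model based only on the explanation quoted earlier in this thread (any layer whose digest is not already on the device must be stored in full alongside the old image); it is not how balena-engine actually accounts for space, and the digests and sizes are made up.

```python
def layers_to_store(old_layers, new_layers):
    """Return the new image's layers that are absent from the old image.

    Both arguments map a layer digest to its size in bytes.
    """
    return {d: s for d, s in new_layers.items() if d not in old_layers}


def peak_usage(old_layers, new_layers):
    """Estimate peak storage: the old image plus every layer not yet present."""
    extra = layers_to_store(old_layers, new_layers)
    return sum(old_layers.values()) + sum(extra.values())


# Hypothetical digests and sizes (in MB), for illustration only.
old = {"sha256:base-v1": 300, "sha256:app-v1": 50}

# Only the application layer changed: a small amount of extra space.
app_only = {"sha256:base-v1": 300, "sha256:app-v2": 50}
print(peak_usage(old, app_only))      # 400

# The base image changed under the same tag: roughly 2x the image size.
base_changed = {"sha256:base-v2": 300, "sha256:app-v2": 50}
print(peak_usage(old, base_changed))  # 700
```

Under this model, a changed base image pushes the peak from 400 MB to 700 MB for a 350 MB image, which is the headroom problem described above.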