High host OS CPU after some pushes on random boxes

I'm currently testing some Raspberry Pis (3 x RPi3 and 1 x RPi4), as we want to move everything over to Balena.

Everything is fine except for a strange issue: if I push a code or Dockerfile update (even a small one to a single service), the CPU load becomes very high. It moves from circa 10% to 50% usage.


The picture shows the base CPU before the push. The increase is after the push, and the sudden drop is from rebooting the board.

I have stopped all of the containers and then restarted them but the base idle CPU is still higher than before the push.

I have also run the following in the host OS:
systemctl stop resin-supervisor
balena stop $(balena ps -aq)
systemctl stop balena
systemctl start balena
systemctl start resin-supervisor

The result is the same.
If I actually reboot the device, it's immediately fixed.

This doesn't happen on every box when pushes are done, and sometimes it won't happen at all, but it occurs quite regularly, every couple of pushes.

All the boxes are running the same code
The Balena Host OS is untouched

Any help would be appreciated, as this could be a deal breaker for us.
Jeremy

Hi @jezzab,

Welcome to the forum!

This is quite strange.
It's understandable that there is some CPU spike while the supervisor pulls in the new image, stops the container, and starts the new application.

But then the CPU load should go back to a nominal baseline (if the application didn’t add new load).

Bizarre. What is using the CPU? Is it obvious via the top utility?

We can have a look via support as well.
If you make a simple change and end up with 1-2 devices that manifest the bug and 1-2 devices that don’t, you can grant support access to the whole application and we can have a look.

Ideally, we’d like a nice, small Dockerfile for a repeatable test case.
Also, which OS version/image are you using? The latest 64-bit ones?

Regards
ZubairLK

When it happens, top shows /sbin/init at the top of the list, with much higher CPU usage on the boxes with issues.

RPi3s are running balenaOS 2.46.1+rev1
RPi4 is running balenaOS 2.46.1+rev3

Hi. That is indeed weird. If you enable support access and send us the UUID of a device exhibiting this behavior we can take a look. I believe /sbin/init is a symlink to systemd, so it’s hard to pin the exact cause without at least looking at the logs.
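
In the meantime, one way to confirm that PID 1 itself is burning the CPU is to sample its tick counters from /proc on an affected box and on a healthy one. This is only a rough sketch, assuming a Linux host shell with /proc mounted; the function names cpu_ticks and pid_cpu_delta are made up for illustration and are not part of balenaOS.

```shell
#!/bin/sh
# Print the total CPU ticks (utime + stime) a process has consumed so far.
cpu_ticks() {
    stat=$(cat "/proc/$1/stat") || return 1
    # Drop everything up to the last ") " so a command name containing
    # spaces cannot shift the field positions.
    rest=${stat##*) }
    # "rest" now starts at field 3 (state), so utime and stime
    # (fields 14 and 15 of the full line) are words 12 and 13 here.
    set -- $rest
    echo $(( ${12} + ${13} ))
}

# Print how many CPU ticks a PID used over an interval (default 5 seconds).
pid_cpu_delta() {
    pid=$1
    interval=${2:-5}
    before=$(cpu_ticks "$pid") || return 1
    sleep "$interval"
    after=$(cpu_ticks "$pid") || return 1
    echo $(( after - before ))
}
```

Running pid_cpu_delta 1 10 on both kinds of device gives two numbers to compare; a much larger value on the affected box would point at systemd itself rather than at one of the containers.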

Support access has been enabled and the UUID is 2d38c1256d85a0340c4dc62acfbb9305

The above box is exhibiting the behaviour (2 are and 2 are not since the last push). The only way to get it out of this state is to perform a full reboot.

Thanks
Jeremy

Do you happen to have deltas enabled?

No, we don’t, but should we? After reading about it, it looks very useful.

Hi, you could enable delta updates, as they are indeed an awesome feature and recommended for production deployments (and will be enabled by default for all new apps soon). I’m unsure from the message above whether delta updates were being suggested as a fix or as a potential source of the problem, but either way, enabling them and seeing whether the problem persists would eliminate them as a variable.
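
For reference, deltas are controlled by a configuration variable, which can be set from the dashboard or from the CLI. A minimal sketch, assuming the balena CLI is installed and logged in; myApp is a placeholder application name, and the exact flags may differ between CLI versions:

```shell
# Assumption: "myApp" is a placeholder; substitute your own application name.
# Setting RESIN_SUPERVISOR_DELTA to 1 asks the supervisor to fetch updates as deltas.
balena env add RESIN_SUPERVISOR_DELTA 1 --application myApp
```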