Hi,
It looks like containerd on your device stopped working. In newer versions of balenaOS, we enhanced our health checks to detect this kind of failure automatically (see https://github.com/balena-os/meta-balena/issues/1391).
I will try restarting the balena engine service on your device, if you don’t mind.
The balena engine service has been restarted, and the device seems to be functioning normally - it applied the target release successfully from what I can see.
I would suggest upgrading to the latest available version of balenaOS. As I mentioned, the health checks there are improved, and such failures should be handled automatically. Please note, though, that this device runs a development build of the OS, so you will need to upgrade it manually by reflashing the device - we currently don’t support over-the-air updates for development builds.
If it’s a production device, we strongly recommend switching to a production build.
Also, please note that you have container_name set on one of the services in your docker-compose file. It’s not supported by the supervisor running on the device and is simply ignored. It’s OK to have it there - I just want to make sure you understand it has no effect.
I am running balenaOS 2.47.0+rev1, which is the latest available for an RPi 3B+ and rosetta@home. I have 3 devices, all identical and all on the same OS. One of them keeps failing with these messages:
Killing service ‘ui sha256:677ac37d9eeb9b74025d272bc8c756e85a0fd543642ff7cb4b86062b2e1589cc’
23.09.20 20:52:35 (+0200) Failed to kill service ‘ui sha256:677ac37d9eeb9b74025d272bc8c756e85a0fd543642ff7cb4b86062b2e1589cc’ due to '(HTTP code 409) unexpected - You cannot remove a running container 93d21d5e6187458187171b803962ac9b4bdc7bab2d079afb9c1f2cecf5b6c2a8. Stop the container before attempting removal or force remove ’
23.09.20 20:52:36 (+0200) Killing service ‘ui sha256:677ac37d9eeb9b74025d272bc8c756e85a0fd543642ff7cb4b86062b2e1589cc’
Hi @gratefulfrog and @shawaj, unfortunately we have observed this type of error when the container refuses to exit. This can happen for different reasons: for example, the container has a hold on a kernel resource, or it is using a hardware device that is malfunctioning. Sometimes the output of dmesg shows some information about possible causes of the malfunction.
Trying to kill the container directly using balena kill <container> will generally fail too, because the container itself is in a state where it cannot be killed. There have been reports that running balena rm --force <container> can work, and restarting the engine using systemctl restart balena can help too, but it may depend on the root cause.
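As a rough sketch, the two workarounds above could be chained on the host OS like this. The function name force_remove_or_restart and the ENGINE variable are made up for illustration; only balena rm --force and systemctl restart balena come from the reports mentioned above, and neither is guaranteed to work if the root cause is a stuck kernel resource.

```shell
#!/bin/sh
# Hypothetical helper: try a forced removal first, then fall back to an
# engine restart. ENGINE is an illustrative override; it defaults to the
# balena engine CLI available on the host OS.
ENGINE="${ENGINE:-balena}"

force_remove_or_restart() {
    container="$1"
    # First attempt: force-remove the stuck container.
    if "$ENGINE" rm --force "$container"; then
        echo "removed $container"
        return 0
    fi
    # Fallback: restart the engine service. Note that this interrupts
    # every container on the device, not just the stuck one.
    echo "force remove failed; restarting engine" >&2
    systemctl restart balena
}
```

You would call it with the ID from the error message, e.g. force_remove_or_restart 93d21d5e6187. Trying the forced removal first keeps the other services running if it succeeds; the engine restart is the heavier option.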
If you have a device with a supervisor that is showing this message right now, please enable support access and let us know so we can take a look, try to pinpoint the underlying cause and run some tests.