Balena Devices Supervisor Dies but Still on the Network

ko7eraven · May 17, 2021, 7:12pm

We are seeing some odd behavior from the Balena supervisor on a RPi CM3 system. The system keeps dropping offline; it comes back after a reboot, but eventually drops off again. This happens even if there is no running container on the system (i.e. the application has been stopped through the dashboard). The device can still be pinged using its external IP address, but it seems like the supervisor has died. Is there any circumstance in which this would happen?

Thanks,

Kevin

anujdeshpande · May 18, 2021, 6:16am

That is indeed strange!

Can you help us debug this? Could you:

Share the device diagnostics.
Get the supervisor logs from the device using journalctl.
If the device is rebooting (for example, if your local network ssh connection keeps disconnecting), enable persistent logging and share the boot logs, again using journalctl.
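As a rough sketch of the journalctl steps above, run from a host OS terminal on the device. The unit name balena-supervisor is an assumption (older balenaOS releases name the unit resin-supervisor), and the function names and the JOURNALCTL and SUPERVISOR_UNIT variables are hypothetical helpers, not part of any balena tooling:

```shell
#!/bin/sh
# Assumption: the supervisor runs as the systemd unit "balena-supervisor".
# On older balenaOS releases, try SUPERVISOR_UNIT=resin-supervisor instead.
JOURNALCTL="${JOURNALCTL:-journalctl}"
SUPERVISOR_UNIT="${SUPERVISOR_UNIT:-balena-supervisor}"

# Print the supervisor's journal; extra journalctl flags are passed through.
supervisor_logs() {
    "$JOURNALCTL" --no-pager -u "$SUPERVISOR_UNIT" "$@"
}

# List the boots the journal knows about (more than one entry is only
# expected when persistent logging is enabled).
list_boots() {
    "$JOURNALCTL" --no-pager --list-boots
}

# Print the full journal of the previous boot (-b -1).
previous_boot_logs() {
    "$JOURNALCTL" --no-pager -b -1 "$@"
}
```

For example, `supervisor_logs > /tmp/supervisor.log` saves the supervisor log to a file that can be attached to a reply, and `previous_boot_logs -u NetworkManager` narrows the previous boot's journal to a single unit.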

These things will help us figure out what’s going on with the device.

ko7eraven · May 18, 2021, 10:28am

Thanks. This device is currently offline; I will get it restarted and share the information you have requested. Persistent logging is enabled.

Kevin

ko7eraven · May 27, 2021, 12:28pm

OK, I finally got some log information about this problem. It looks like NetworkManager is getting into a state where it can’t connect. This is from journalctl -u NetworkManager. It hangs without connecting (for hours), but works after a reboot. The wifi environment is fine; many other units are connected at the same time. Is this some kind of race condition?
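For anyone wanting to collect the same information, here is a minimal sketch of how the NetworkManager log for a given boot and a snapshot of the live connection state might be gathered. It assumes nmcli is available on the host OS and that persistent logging is enabled for earlier boots; the function names and the JOURNALCTL and NMCLI variables are hypothetical helpers:

```shell
#!/bin/sh
JOURNALCTL="${JOURNALCTL:-journalctl}"
NMCLI="${NMCLI:-nmcli}"

# NetworkManager journal for one boot: 0 is the current boot, -1 the previous.
nm_boot_log() {
    "$JOURNALCTL" --no-pager -u NetworkManager -b "${1:-0}"
}

# Snapshot of NetworkManager's overall state and per-device state,
# best taken while the device is stuck and before rebooting it.
nm_state() {
    "$NMCLI" general status
    "$NMCLI" device status
}
```

Running `nm_state` while the device is hung, and `nm_boot_log -1` after the reboot that fixes it, gives two views of the same failure.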

toochevere · June 2, 2021, 10:30pm

Hey Kevin. That’s interesting. I think we need to look at the device diagnostics to get a better picture. Can you post the health check info on the device, and then run diagnostics and post the results? (I think we need to consider the possibility of a hardware issue.)

Lizzieepton · June 9, 2021, 1:37pm

Hi Kevin, just checking in to see if you have managed to take a look at those device diagnostics for us, so we can get a better look at what’s going on here?
