On a few rare occasions I have seen containers unable to reach each other over the default bridge network. Some of those occurrences coincided with heat issues and network interface problems on hot days, related to the Thermal Operating Range of the BalenaFin.
The most recent time (during a heat event), some of the containers were assigned the same IP address by the internal DNS resolver. This is the log I could obtain (container names anonymized):
I have experienced this problem again, on two different devices. I rebooted one of them; the other is still showing the error if you want to take a look (a4f1619a6a677b24ba125ba734605423).
Hi @fcovatti, thank you for the follow-up and for providing support access. This is an unusual issue, and we have started investigating it. I see some other users reporting this issue on Docker’s repository as well:
I have raised this issue with my teammates who maintain balenaEngine. They might look further into your device for debugging purposes. We will post an update as soon as we have any news to share.
Hi, the balena engine was updated to version 19.03.13 in BalenaOS v2.50. It would be interesting to see whether the issue persists on the newer engine version. Also, any debugging or fix would need to happen on the most recent engine.
Could you please try to update your hostOS version to the latest on a test device and see if the problem still occurs?
This older Moby thread might offer some insight. The bottom line was to run rm -rf /var/lib/docker/network/files on the hostOS to reset the network database. The issue was related to unclean stops of Docker (balena-engine), which led to the device “losing track” of the assigned IP addresses stored in that database.
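Before wiping the network database, it may be worth confirming that the device really shows the duplicate-address symptom described earlier in the thread. The following is only a minimal sketch, not an official balena tool: it assumes you have saved the JSON printed by balena-engine network inspect bridge (same format as docker network inspect) and that each entry under "Containers" carries the usual "Name" and "IPv4Address" fields. The function name find_duplicate_ips is hypothetical.

```python
import json
from collections import defaultdict


def find_duplicate_ips(inspect_json):
    """Return {ip: [container names]} for each IPv4 address used more than once.

    `inspect_json` is the raw JSON text from `network inspect`, which is
    a list of network objects (assumption: standard Docker output layout).
    """
    networks = json.loads(inspect_json)
    names_by_ip = defaultdict(list)
    for network in networks:
        # "Containers" maps container ID -> endpoint details; it can be
        # missing or null when nothing is attached to the network.
        for endpoint in (network.get("Containers") or {}).values():
            # Addresses are reported in CIDR form, e.g. "172.17.0.2/16".
            ip = endpoint.get("IPv4Address", "").split("/")[0]
            if ip:
                names_by_ip[ip].append(endpoint.get("Name", "<unnamed>"))
    # Keep only the addresses claimed by more than one container.
    return {ip: sorted(names) for ip, names in names_by_ip.items() if len(names) > 1}
```

The hostOS may not ship a Python interpreter, so one option is to redirect the inspect output to a file, copy it to a workstation, and call find_duplicate_ips(open("bridge.json").read()) there (bridge.json being a hypothetical file name). An empty result means no two containers on that network share an IPv4 address.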