Conflicting IP addresses for containers

On a few rare occasions I have seen containers unable to reach each other over the default bridge network. Some of those occasions coincided with heat issues on hot days, when we also had problems with the network interface related to the thermal operating range of the balenaFin.

The latest time (during a heat event), some of the containers were assigned the same IP address by the internal DNS resolver. This is the output I could obtain (container names anonymized):

root@ef4b243:~# balena ps -q | xargs -n 1 balena inspect --format '{{ .Name }} {{range .NetworkSettings.Networks}} {{.IPAddress}}{{end}}' | sed 's#^/##';
a____2409535_1428871 172.18.0.8
b____2409534_1428871 172.18.0.3
c____2409532_1428871 172.18.0.9
d____2409540_1428871 172.18.0.10
e____2409533_1428871 172.18.0.7
f____2409538_1428871 172.18.0.3
g____2409546_1428871 172.18.0.4
h____2409544_1428871
i_____2409547_1428871
j_____2409536_1428871 172.18.0.6
k____2409531_1428871 172.18.0.5
l____2409542_1428871 172.18.0.2
resin_supervisor

This prevented container a from reaching container b.

After restarting container f, it worked. No errors have occurred since, and I have not yet been able to reproduce the problem.
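In case it is useful, the duplicates can be surfaced directly from that inspect output by dropping the names and piping the addresses through sort and uniq -d (a quick, untested variation of the command above):

balena ps -q | xargs -n 1 balena inspect --format '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' | grep . | sort | uniq -d

Any address it prints is assigned to more than one container.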

The containers that experienced the conflict use the default bridge network (no network_mode configuration).

Is this a known issue? Is there a configuration that could prevent these conflicts from happening?
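For example, would pinning per-service addresses on a user-defined network in docker-compose.yml be a supported way to avoid this? A rough sketch of what I mean (I have not verified that the balena Supervisor honors ipv4_address; the network name, subnet and service names here are made up):

version: '2.1'

networks:
  fixed-net:
    driver: bridge
    ipam:
      config:
        - subnet: 172.30.0.0/24

services:
  service-a:
    build: ./service-a
    networks:
      fixed-net:
        ipv4_address: 172.30.0.10
  service-b:
    build: ./service-b
    networks:
      fixed-net:
        ipv4_address: 172.30.0.11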

After the restart, container f was assigned a new, different IP address.

Hello, did this IP conflict occur after an update?

No update was performed.
I have not been able to reproduce the error yet; if it happens again I will update this topic.

Hi, when you are able to reproduce this again, could you notify us so we can take a look at the device, please?

I have experienced this problem again on two different devices. I rebooted one of them; the other one still has the error present if you want to take a look (a4f1619a6a677b24ba125ba734605423).

This is the output I could obtain (container names anonymized):

root@a4f1619:~# balena ps -q | xargs -n 1 balena inspect --format '{{ .Name }} {{range .NetworkSettings.Networks}} {{.IPAddress}}{{end}}' | sed 's#^/##';
a_2409535_1428871 172.18.0.8
b 10.114.101.2
c_2409533_1428871 172.18.0.6
d_2409532_1428871 172.18.0.8
e_2409534_1428871 172.18.0.7
f_2409544_1428871
g_2409546_1428871 172.18.0.3
h_2409540_1428871
i_2409531_1428871 172.18.0.4
j_2409547_1428871
k_2409538_1428871 172.18.0.3
l_2409542_1428871 172.18.0.5
m_2409536_1428871 172.18.0.2
n
root@a4f1619:~#

Hi @fcovatti, thank you for the follow-up and for providing support access. This is an unusual issue, and we have started investigating it. I see some other users reporting this issue on Docker's repository as well.

I have raised this issue with my teammates maintaining balenaEngine. They might look further into your device for debugging purposes. We will post an update as soon as we have any news to share.

Cheers…

Hi, balenaEngine was updated to version 19.03.13 in balenaOS v2.50. It would be interesting to see whether the issue persists on the newer engine version; any debugging or fix would also need to happen against the most recent engine.
Could you please update the hostOS to the latest version on a test device and see if the problem still occurs?
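For reference, you can confirm which engine version a device is actually running from the hostOS shell (assuming the balena-engine binary is available there, as it is on recent balenaOS releases):

balena-engine version

The Version fields in the output should show 19.03.13 or later once the hostOS update has been applied.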

We have tried updating the OS (balenaOS 2.60.1+rev1) and the issue still occurs.

Hi,

Can you enable support so we can take a look? We presume you haven’t added any specific network (nmcli) configurations, correct?

John

Access granted.

Thanks. Can you provide the device UUID?

ef4b243a1d5072822b1cfe9bc382ed64

Hi,

There is an older Moby thread that might offer some insight. The bottom line was to run rm -rf /var/lib/docker/network/files on the hostOS to reset the network database. The issue was related to unclean stops of Docker (balena-engine), which led to the device "losing track" of the assigned IP addresses stored in its SQLite database.
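If you do want to try that on an affected device, the engine should be stopped first. Roughly something like this (a sketch only, assuming the engine's data root is /var/lib/docker and its service unit is balena.service on your OS version):

# stop the engine so nothing is writing to the network database
systemctl stop balena
# remove the stored network/endpoint state; it is recreated on start
rm -rf /var/lib/docker/network/files
# start the engine again; the supervisor should bring the containers back up
systemctl start balena

Note that this clears the engine's record of assigned addresses, so containers may come back with different IPs.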

John