websocket disconnect between containers

I am taking over a project for a client and am just getting familiar with BalenaOS and the customer’s application. I do have some embedded Linux experience and I feel pretty comfortable with the HW, OS, and Application. However, I am seeing an issue that I think the community will be able to help me solve faster than I can by myself.

We are currently running a multi-container Node.js UI application. The UI front-end container communicates through a websocket with the backend. When the device is connected to the internet, the application seems to run fine. When the device is disconnected from the internet, the websocket client detects a ping timeout and attempts to reconnect to the server in the backend container every second. The reconnection does not succeed until the internet is reconnected. When the internet is reconnected, the issue resolves itself almost immediately.

I assume this TCP connection between containers should not be affected if the device is disconnected from the internet. Am I wrong here? Can apps run in BalenaOS without an internet connection?
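
A minimal sketch of how the disconnect reason and the reconnection attempts could be logged on the client, assuming the client is socket.io-client v4 (the library reported later in this thread). The `attachDiagnostics` helper and its `log` callback are hypothetical names used only for illustration; `socket` is assumed to be the object returned by `io()`.

```javascript
// Sketch only: log why a Socket.IO v4 client disconnects and how reconnection proceeds.
// `socket` is assumed to come from io() in socket.io-client v4; `log` is a hypothetical callback.
function attachDiagnostics(socket, log = console.log) {
  socket.on("connect", () => log("connected"));
  // `reason` is a string such as "ping timeout" or "transport close"
  socket.on("disconnect", (reason) => log(`disconnected: ${reason}`));
  socket.on("connect_error", (err) => log(`connect_error: ${err.message}`));
  // In v4 the reconnection events are emitted by the underlying Manager (socket.io)
  socket.io.on("reconnect_attempt", (attempt) => log(`reconnect attempt ${attempt}`));
  socket.io.on("reconnect_error", (err) => log(`reconnect_error: ${err.message}`));
  socket.io.on("reconnect", (attempt) => log(`reconnected after ${attempt} attempt(s)`));
}
```

Comparing these log lines with and without the network cable plugged in may show whether the reconnect attempts fail at the transport level or are never really sent.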

@embeddeddan Hello and welcome to the balena forums!

Are you using local IP addresses for communication between the two services? Or are you relying on a service provided by your router that is not installed on the device (e.g. DHCP, DNS)?

Let us know more details to try to help you.

@mpous, thanks! The work you guys have done with balenaOS is great and I am developing a deeper understanding of embedded Linux just by reading the various documentation available.

Regarding your questions about my particular application, I will try to provide some additional info. Note: I ran into some issues here trying to embed multiple images in the post.

The IP address assigned to the device by the customer’s network is not the same as the IP addresses used by the containers. The device IP is 192.168.0.41 and the IPs of the API and APP containers are 172.17.0.2 and 172.17.0.5.

When I run a find command from the Host OS terminal for dhcp, I see the info in the attached images

I don’t know what it all means but it does appear that there is a DHCP client and server available with the presence of udhcpc and udhcpd, correct?

When I run the same command in the api container, nothing is printed.

When I run the command in the app this is what I see.

I also ran the find command looking for anything related to dns.
In the HostOS I see nothing.

In the api I see this…

In the app this…

Here is a snippet of the docker-compose.yml to see how the ports are being exposed for each individual container.

This is how I think it probably should work. Correct me where I am wrong.

  1. HostOS / Balena Engine assigns IP addresses to the docker containers using the dhcp server program udhcpd
  2. Individual containers expose the specified ports. For our api container it is 172.17.0.2:3000. Note: I do not see any port exposed for the app container.
  3. The websocket client in the app container connects to the websocket server in the api container through port 3000.
  4. A connection is now established that is independent of the physical device being connected to the internet…
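
Steps 3 and 4 can be checked independently of the websocket library with a plain TCP probe run from inside the app container. This is a rough sketch using only Node's built-in `net` module; `probeTcp` is a hypothetical helper name, and the host and port passed to it are whatever the app normally uses.

```javascript
const net = require("node:net");

// Resolves (never rejects) with { ok, detail } describing whether a TCP
// connection to host:port could be opened within timeoutMs.
function probeTcp(host, port, timeoutMs = 2000) {
  return new Promise((resolve) => {
    const socket = net.connect({ host, port });
    let settled = false;
    const finish = (ok, detail) => {
      if (settled) return; // report only the first outcome
      settled = true;
      socket.destroy();
      resolve({ ok, detail });
    };
    socket.setTimeout(timeoutMs);
    socket.on("connect", () => finish(true, "connected"));
    socket.on("timeout", () => finish(false, "timeout"));
    socket.on("error", (err) => finish(false, err.code || err.message));
  });
}
```

For example, `probeTcp("api", 3000).then(console.log)` could be run once with the cable plugged in and once without. If both runs report `ok: true`, the container-to-container TCP path itself is probably not what depends on the internet connection.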

I will be interested in hearing your thoughts. I also appreciate your help.

Sorry for the multiple responses, but I could not embed more than one image per response because I am new to the forum.

Thanks,
Dan

Hi Dan,

Because you don’t have a network_mode specified in your compose file, your containers are using the Docker bridge network (by default I believe it uses the CIDR range 172.17.0.0/16). If you want to use the host’s network for any reason (the 192.168.0.0/24 range you mentioned earlier), you can add network_mode: host to any of the containers; they will then share the host’s network stack and listen on the host’s ports directly, including its loopback interface. This is more of an FYI and not totally relevant to your original question.

What is the specific mechanism you are using to connect the WS client to the WS server? Are you connecting to the IP directly like wss://172.17.0.2:3000? If so, you might want to either specify localhost or use the service name directly like wss://api:3000 in your client.
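
As an illustration of connecting by service name, here is a small sketch that builds the URL from configuration instead of a hard-coded container IP. It assumes a Socket.IO v4 client (the library reported later in this thread); `API_HOST` and `API_PORT` are assumed environment variable names, and `apiUrl` and `connectToApi` are hypothetical helpers.

```javascript
// Build the API URL from the compose service name rather than a container IP.
// API_HOST and API_PORT are assumed environment variable names.
function apiUrl(env = process.env) {
  const host = env.API_HOST || "api";
  const port = env.API_PORT || "3000";
  return `http://${host}:${port}`;
}

// `io` is assumed to be the factory exported by socket.io-client v4:
//   const { io } = require("socket.io-client");
function connectToApi(io, env = process.env) {
  return io(apiUrl(env), {
    reconnectionDelay: 1000, // first retry after about one second
    reconnectionDelayMax: 5000, // upper bound for the delay between retries
  });
}
```

Passing `io` in as an argument keeps the sketch free of a hard dependency; in the real client it would simply be called as `connectToApi(io)`.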

This link should provide you with a much better explanation about docker networking than I can!

Keep us updated!

I am another developer working with Dan on this project.

@nucleardreamer the mechanism we have been using to connect the websockets is

http://localhost:3000

We will try using:

http://api:3000

edit:

I just double-checked. We have been using http://api:3000

Hi @cowens - welcome to the forums as well!

What library(s) and version are you guys using for websockets? (ws, socketio, etc)

Did you try switching the project to use network_mode: host to see if this is possibly something related to internal DNS? It’s definitely strange behavior.

Last question as well: is it possible to have a link to your git repo so we can look at it or test ourselves?

We are using,
socket.io-client v4.1.3
socket.io v4.1.3

We have not tried host mode yet but may try it in the future as a temporary fix. From my understanding, host mode would not be the long-term solution, as none of our services require access from the outside world.

I cannot comment on if we can provide access to the repo.

Hey @cowens ,

A couple things to try out, that may point towards a possible reason for the WS disconnect.

  • Changing your API_HOST to 127.0.0.1
  • Changing your API_HOST to 0.0.0.0 - this is not great long term (security wise), it is essentially binding to any network interface at all, which may expose it unintentionally to other places.
  • Adding hostname of 0.0.0.0 or 127.0.0.1 to the http server listen method, which should be the second parameter like this: httpServer.listen(api.port, '0.0.0.0')
  • Running network_mode: host like mentioned above, to rule out the docker bridge network
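
The listen-address suggestion in the list above could look roughly like this, using Node's built-in `http` module. `startApi` and `onReady` are hypothetical names; in the real application the Socket.IO server would be attached to `httpServer` as it is today.

```javascript
const http = require("node:http");

// Start an HTTP server bound to an explicit address.
// "0.0.0.0" listens on every IPv4 interface of the container;
// "127.0.0.1" accepts connections only from inside the same network namespace.
function startApi(port, host, onReady) {
  const httpServer = http.createServer((req, res) => {
    res.writeHead(200, { "Content-Type": "text/plain" });
    res.end("ok\n");
  });
  // The callback fires once the server is actually listening
  httpServer.listen(port, host, () => onReady(httpServer.address()));
  return httpServer;
}
```

Note that on the default bridge network each container has its own loopback interface, so a server bound to 127.0.0.1 inside the api container is not reachable from the app container. Also, 0.0.0.0 is a bind address rather than a destination, so the client would still connect to the service name.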

When I am able to have a little more time, I will try and reproduce this behavior here as well. Let us know if anything works / changes!

Thanks for the suggestions we will give them a try!

We tried changing API_HOST to 127.0.0.1 while simultaneously changing to httpServer.listen(api.port, '127.0.0.1'), and the app service was unable to connect to the api service.

We did the same thing as above but with 0.0.0.0, with the same results, app service unable to connect to the api service.

Thanks for trying out the options and getting back. Have you tried the network_mode: host options suggested by my colleague above?

Also, are you able to make a curl request to your API via the exposed port 3000? One alternative might be to get the IP by hostname first (for API in this case), and then use the resolved IP to connect to the websocket service. Let us know if we get any closer with any of the changes here.
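
A minimal sketch of the resolve-first idea, using Node's built-in `dns` module. `resolveApiUrl` is a hypothetical helper; the `lookup` parameter defaults to `dns.promises.lookup` and exists only so the function can be exercised without a network.

```javascript
const dns = require("node:dns");

// Resolve a service name once and build a URL from the resulting address.
async function resolveApiUrl(serviceName, port, lookup = dns.promises.lookup) {
  const { address, family } = await lookup(serviceName);
  // IPv6 literals must be wrapped in brackets inside a URL
  const host = family === 6 ? `[${address}]` : address;
  return `http://${host}:${port}`;
}
```

Calling `resolveApiUrl("api", 3000)` while the device is offline would also show whether name resolution inside the container still works without internet access, which is one of the things this thread is trying to rule out.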

Regards,
N

Hello @cowens, let us know whether the suggestions from my colleagues worked or whether you are using another solution. Thanks!

@nitish, yes when the API gets disconnected from the APP we are able to ssh into the APP container and make a curl request to the API container on port 3000.

curl "http://172.17.0.5:3000/socket.io/?EIO=4&transport=polling"
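
For reference, a successful response to that polling request is an Engine.IO v4 "open" packet: the character 0 followed by a JSON object that includes the session id and the ping settings. The sketch below parses such a body; `parseOpenPacket` is a hypothetical helper and the sample values in the usage note are made up for illustration.

```javascript
// Parse the body returned by GET /socket.io/?EIO=4&transport=polling.
// In Engine.IO protocol v4 the first packet is an "open" packet:
// the character "0" followed by a JSON object.
function parseOpenPacket(body) {
  if (typeof body !== "string" || body[0] !== "0") {
    throw new Error("not an Engine.IO open packet");
  }
  const { sid, pingInterval, pingTimeout, upgrades } = JSON.parse(body.slice(1));
  return { sid, pingInterval, pingTimeout, upgrades };
}
```

For example, `parseOpenPacket('0{"sid":"abc","upgrades":["websocket"],"pingInterval":25000,"pingTimeout":20000}')` returns the ping settings the server announces, which can be compared against the timing of the ping timeouts seen on the client.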

Sorry for the late reply. This is a project where we cannot immediately test changes; sometimes there is a 2-3 day turnaround between making a change and being able to test it.

When the app service is disconnected from the api service we were able to curl from the app container to the api container on port 3000.

When the app service is disconnected from the api service and we reboot the app container it was able to reconnect to the api container – even with the network cable unplugged.

We tried disabling the is-online internet connectivity function. The app service disconnected from the api service but it appeared the app service was still receiving updates from the api service via the websocket.

I am currently trying to implement the switch to host mode but am unable to get basic connectivity between the containers to work. I can certainly look this up myself but when we use host mode can we still use http://api:3000 to connect to the containers?

Do you guys know if it is possible to reallocate resources to specific services? For example, maybe give our app container more RAM + CPU time?