Hi,
First of all, I am not a network expert.
In our future data acquisition environment we will use thousands of devices running balenaOS. The balenaOS devices communicate with the internet (balena dashboard) via a forward proxy (nadoo/glider) that runs in a Docker container on a balenaOS device.
So, for example, we will have 5000 devices (Siemens IPC) running balenaOS that reach the internet via an edge device (Siemens IPC, balenaOS with the forward proxy in a Docker container).
My questions are:
- How many permanent/temporary connections does each device (balenaOS) open to the internet?
- What could be the limit? If the limit is the number of available ports on the edge device, I need more edge devices; if the limit is the CPU load, I need a more powerful edge device.
- I have to find out whether one edge device is enough or whether I need more edge devices.
Thank you!
Hi @dev4iotattgw
Welcome to the forums.
I am forwarding your questions to the relevant team. We will get back to you.
Thanks
For anyone else following this, I wanted to share some information that was provided by our support agents:
Notes on the connections devices make
- There will be one permanent connection to the VPN, one to the logging backend, and there are also short-lived requests.
- One request every 15 minutes to the state endpoint. This interval can be configured to be longer. Note that if a new release is made available, the device will be informed through the VPN and this request will be made sooner.
- One long-lived request to the log endpoint (as long as there are logs to be sent). Limiting the logs produced by your apps would dramatically reduce the length of this request.
- One permanent connection to the VPN (this can be turned off, but you would lose some functionality).
- Some NTP connections to calibrate the clock.
- Any connections your apps make.
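As a rough illustration of what the list above means for the proxy, here is a minimal Python sketch that multiplies the per-device figures by the fleet size. The names `DeviceProfile` and `estimate_proxy_load` are made up for this example, the number of app connections is an assumption you would replace with your own value, and NTP and other short-lived traffic is ignored.

```python
from dataclasses import dataclass


@dataclass
class DeviceProfile:
    # One VPN connection plus one log stream, as described in the notes above.
    permanent_connections: int = 2
    # One state request every 15 minutes (the default mentioned above).
    state_poll_interval_s: int = 15 * 60
    # Assumption: long-lived connections opened by your own services.
    app_connections: int = 0


def estimate_proxy_load(devices: int, profile: DeviceProfile) -> dict:
    """Estimate steady-state load on a single forward proxy."""
    long_lived = devices * (profile.permanent_connections + profile.app_connections)
    return {
        # Each proxied connection has a device-facing and an internet-facing side.
        "inbound_connections": long_lived,
        "outbound_connections": long_lived,
        # Average rate of short-lived state requests across the fleet.
        "state_requests_per_second": devices / profile.state_poll_interval_s,
    }


if __name__ == "__main__":
    print(estimate_proxy_load(5000, DeviceProfile()))
```

With the defaults this gives about 10000 long-lived connections on each side of the proxy and an average of fewer than six state requests per second, before any application traffic is added.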
Incoming connections
Incoming connections from the 5000 devices should not be an issue, since they will all connect to a single listening port. Quite possibly some limits, such as RLIMIT_NOFILE, will have to be raised for the proxy process, though. Those types of settings are usually exposed as configuration parameters of the proxy itself.
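To show what RLIMIT_NOFILE is, here is a small Python sketch that reads the open-file limit of the current process and raises the soft limit up to the hard limit. It only demonstrates the mechanism on a Unix-like system; the proxy itself would normally get its limit from its own configuration or from the container runtime (for example Docker's `--ulimit nofile=` option), and the helper name `raise_nofile_soft_limit` is made up for this example.

```python
import resource


def raise_nofile_soft_limit(wanted: int) -> tuple:
    """Raise the soft RLIMIT_NOFILE towards `wanted`, never above the hard limit.

    Returns the (soft, hard) limits in effect afterwards.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # An unprivileged process may only raise its soft limit up to the hard limit.
    target = wanted if hard == resource.RLIM_INFINITY else min(wanted, hard)
    # Never lower an existing limit, and leave an unlimited soft limit alone.
    if soft != resource.RLIM_INFINITY and target > soft:
        resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    return resource.getrlimit(resource.RLIMIT_NOFILE)


if __name__ == "__main__":
    print("limits before:", resource.getrlimit(resource.RLIMIT_NOFILE))
    print("limits after: ", raise_nofile_soft_limit(4096))
```

Every open socket counts against this limit, so a proxy holding connections for thousands of devices needs a value comfortably above twice the number of proxied connections.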
Outbound connections
For outbound connections the number of available ports can be an issue, but that can be overcome by assigning multiple static IP addresses to the outbound interface. Here is an example for Squid that defines a set of listening ports that map to a set of outgoing IP addresses: "Configuring Squid Proxy with Multiple IP Addresses" (Life of a webmaster).
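The port limit can be put into numbers with a short Python sketch. It assumes the Linux default ephemeral port range of 32768 to 60999 and the worst case in which every outbound connection goes to the same destination address and port, so each one needs its own source port. The function name `source_ips_needed` and the 50% headroom factor are assumptions made for this example.

```python
import math

# Assumption: Linux default net.ipv4.ip_local_port_range (32768 to 60999).
EPHEMERAL_PORTS = 60999 - 32768 + 1


def source_ips_needed(connections_to_one_destination: int, headroom: float = 0.5) -> int:
    """Estimate how many outbound source IPs are needed.

    `headroom` is the fraction of the ephemeral range we are willing to fill,
    leaving room for connections lingering in TIME_WAIT and for bursts.
    """
    usable_ports = int(EPHEMERAL_PORTS * headroom)
    return max(1, math.ceil(connections_to_one_destination / usable_ports))


if __name__ == "__main__":
    for connections in (10_000, 50_000):
        print(connections, "connections ->", source_ips_needed(connections), "source IP(s)")
```

Under these assumptions, roughly 10000 outbound connections still fit on one source address, while a much chattier fleet would need additional addresses.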
Logging
For logging, you can simply decrease the amount of log output your application produces. By default only a few supervisor messages are logged by the OS, so the majority of what is logged to the dashboard comes from the application itself.
A further note on New Releases
The most challenging part will be when a new release becomes available. At that point all devices will start downloading it through the balenaEngine/Docker service running on the host OS. That will put pressure on the proxy device, as it will have to download a separate copy of the release from our registry for each device and at the same time deliver it to all 5000 devices.
If the proxy server running on the proxy device is a caching proxy such as Squid, this will eliminate the need for the proxy device to download the release 5000 times (simultaneously). I give Squid as an example only, as there could be better alternatives nowadays.
The opposite problem, delivering a release from the proxy to all the connected balena devices, can be eased by updating the devices in batches. Pinning devices to a specific release can be used for this purpose: once a release is available, you pin the first batch of devices to it, wait a specific amount of time, and then proceed with the next batch. You may do that programmatically through our API. Here are a number of links that can be useful for this:
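As a rough illustration of the batched rollout described above, here is a minimal Python sketch. It does not call the balena API: `pin_device` is a hypothetical callback that you would implement with whatever API or SDK call you use to pin a device to a release, and the batch size and pause are placeholder values.

```python
import time
from typing import Callable, Iterator, List, Sequence


def batches(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items` with at most `size` elements each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def rollout(
    device_ids: Sequence[str],
    release: str,
    pin_device: Callable[[str, str], None],  # hypothetical: pins one device to a release
    batch_size: int = 250,                   # assumption: tune to what the proxy can serve
    pause_s: float = 600.0,                  # assumption: time for one batch to finish downloading
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Pin devices to `release` batch by batch; returns the number of batches."""
    count = 0
    for batch in batches(device_ids, batch_size):
        if count:
            sleep(pause_s)  # wait between batches, not before the first one
        for device_id in batch:
            pin_device(device_id, release)
        count += 1
    return count


if __name__ == "__main__":
    ids = [f"device-{i}" for i in range(5)]
    rollout(ids, "release-1", lambda d, r: print("pin", d, "->", r),
            batch_size=2, pause_s=0.0)
```

Passing `sleep` as a parameter keeps the function easy to test, and a fixed pause could be replaced by a check that the previous batch has finished updating.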
To conclude
We do not see a problem with one proxy device handling 5000 balena devices. The two challenges we can think of are release downloads and logging.
Hello JonJRich,
Thank you for your reply! I will start doing some tests with the recommended forward proxy and check the logging of our services. As the mills grind slowly in my work environment, I will close this product support request and come back to you if more questions come up!
Kind Regards,
Patrick