We have customers who are concerned about the high data usage of our system. After taking a closer look, we found a discrepancy between the actual data usage and what is expected according to this doc: Reduce bandwidth usage - Balena Documentation.
In our system’s normal state, with the default Balena settings, our data usage is about 4468 MB/Month.
This data usage goes down to about 3811 MB/Month with Balena logging disabled.
When stopping our service entirely and keeping Balena logs disabled, the data usage is still 2142 MB/Month.
Following the information here: Reduce bandwidth usage - Balena Documentation and setting these values:
BALENA_SUPERVISOR_CONNECTIVITY_CHECK = 0
BALENA_SUPERVISOR_LOG_CONTROL = 0
BALENA_SUPERVISOR_VPN_CONTROL = 0
BALENA_SUPERVISOR_POLL_INTERVAL = 86400000
The data usage gets down to ~1499 MB/Month
Now, if we were to set BALENA_SUPERVISOR_HARDWARE_METRICS = 0 (we are running v12.7.0, so we didn’t test this value), according to the doc this should save us at most 168 MB/Month, which would bring our data usage down to 1331 MB/Month.
According to the doc setting these supervisor values should bring the data usage down to 1.3 MB/Month.
What else could be causing the system to be using ~1499 MB/Month?
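To make the breakdown explicit, the figures above can be laid side by side. This is only a sketch of the arithmetic: the labels are shorthand for the test conditions described in this post, and the 168 MB/Month figure is the maximum saving quoted from the doc, not something we measured.

```python
# Monthly usage figures (MB/Month) reported above, in the order they were measured.
measurements = [
    ("default Balena settings, our service running", 4468),
    ("Balena logging disabled", 3811),
    ("our service stopped, logging disabled", 2142),
    ("four supervisor variables set as listed", 1499),
]

def savings(steps):
    """Return (label, MB saved versus the previous step) for each step after the first."""
    return [(label, prev - cur) for (_, prev), (label, cur) in zip(steps, steps[1:])]

for label, saved in savings(measurements):
    print(f"{label}: saves {saved} MB/Month")

# Projection only: assumes hardware metrics save the documented maximum of 168 MB/Month.
projected = measurements[-1][1] - 168
print(f"projected with hardware metrics off: {projected} MB/Month")
```

Even with the projected saving applied, the remainder is about three orders of magnitude above the 1.3 MB/Month the doc quotes.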
Thanks @sophiahaoui for reaching out. To understand your query better, can I ask you a couple of clarifying questions:
Are the four config variables (BALENA_SUPERVISOR_CONNECTIVITY_CHECK, BALENA_SUPERVISOR_LOG_CONTROL, BALENA_SUPERVISOR_VPN_CONTROL, BALENA_SUPERVISOR_POLL_INTERVAL) the ones you have set, and is the usage you are experiencing ~1.499 MB/month or ~1499 MB/month?
What is the OS version you have running along with v12.7 of supervisor? Also, any reason for not upgrading them to the latest version?
Can you check on the services running on the device other than balena ones, maybe some metrics still being batched out which are not related to balena?
I am also checking with our team to see if we have any other metrics/logging capture enabled other than ones highlighted with balena-supervisor.
With these supervisor variable settings:
BALENA_SUPERVISOR_CONNECTIVITY_CHECK = 0
BALENA_SUPERVISOR_LOG_CONTROL = 0
BALENA_SUPERVISOR_VPN_CONTROL = 0
BALENA_SUPERVISOR_POLL_INTERVAL = 86400000
We are experiencing 1499 MB/month, not 1.499
The Host OS version is balenaOS 2.80.3+rev1
We made sure to stop the services running on the device, so only balena is running.
Hey Sophia,
That (1499MB) is an unusually high number. Here are a few questions that may help us narrow down. You may not have answers to all of them right away, but please help with whatever information you have (or can confidently guess).
Do you see a similar amount of usage on multiple/all your balena devices (assuming you have more than one!)?
Can you share how you measured the traffic and where on the network it was measured?
Can you share what fraction of the total usage is upload (from the device) and what fraction is download (to the device)?
Do you have any data as to what endpoint(s) all that traffic was being sent to?
Is the traffic sent in short large bursts or trickles continuously all day?
Yes we are seeing a similar amount of usage on all of our devices.
For the measurements with the SUPERVISOR config variables disabled, we would let the device run without the VPN connection, then after a few days we would enable the VPN and check the eth0 data through the Host OS.
On average 2/3s of the data was RX bytes and the rest was TX bytes.
No, unfortunately we could not see where the data was coming from or going to.
From some tests we ran while the VPN was enabled it appeared to be more trickling continuously all day, but once the VPN was disabled we could not check the data regularly.
Thanks for getting back to me and let me know if you have any other questions or specific test you’d like us to run from our end.
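For reference, the eth0 check described above could be scripted roughly as follows. This is a minimal sketch, assuming the standard Linux sysfs counters under /sys/class/net are readable from wherever the script runs (for example a container that shares the host network, which is an assumption and not something tested in this thread). The function names are made up for illustration, and the counters are cumulative since boot.

```python
from pathlib import Path

def read_counters(iface="eth0", root="/sys/class/net"):
    """Read the cumulative RX/TX byte counters for one interface from sysfs."""
    stats = Path(root) / iface / "statistics"
    rx = int((stats / "rx_bytes").read_text().strip())
    tx = int((stats / "tx_bytes").read_text().strip())
    return rx, tx

def rx_fraction(rx, tx):
    """Share of total traffic that was received (download); 0.0 if no traffic."""
    total = rx + tx
    return rx / total if total else 0.0

# Example usage on a device (left commented out so the sketch has no side effects):
# rx, tx = read_counters("eth0")
# print(f"RX {rx} bytes, TX {tx} bytes, RX share {rx_fraction(rx, tx):.0%}")
```

Sampling these counters at intervals and logging the differences locally would also show whether the traffic is bursty or a steady trickle, without needing the VPN up during the test.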
We recently did some data usage tests on balenaOS 2.80.3+rev1 and also found much higher bandwidth usage than expected, with approximately 2/3s RX bytes.
We upgraded the supervisor from 12.7.0 to 12.10.1 (keeping the balenaOS version the same) so that we could disable the metrics reporting (I think v12.8.x onwards supports this) and found our usage dropped significantly. I think we saved about 32MB per day…(!)
So quick update, after testing with supervisor version 12.8.0 and setting:
BALENA_SUPERVISOR_CONNECTIVITY_CHECK = 0
BALENA_SUPERVISOR_LOG_CONTROL = 0
BALENA_SUPERVISOR_POLL_INTERVAL = 86400000
BALENA_SUPERVISOR_VPN_CONTROL = 0
BALENA_SUPERVISOR_HARDWARE_METRICS = 0
As well as disabling our device’s services so it is only Balena running at the moment.
We were able to get our data usage down to about 1000MB per month. That did take out a good amount of data usage; however, it is still not the 1.3MB that is expected once all the metrics are disabled.
Currently I am testing on balenaOS 2.80.3+rev1 and Supervisor 12.8.0
Let me know if there are any other device configurations I should test out.
We only tested with the v12.10.1 supervisor, not v12.8.0, but I did spot in the changelog for Balena Supervisor that v12.8.3 included a fix to prevent a recursive loop when reporting current state (balena-supervisor/CHANGELOG.md at master · balena-os/balena-supervisor · GitHub), so it might be worth trying a newer supervisor…
We ended up consuming 4MB per day (~120MB per month) with the following settings (note the different poll interval):
You mentioned the need for a “low bandwidth” mode. This is something we are currently discussing internally, though it may be some time before this is released. We’ll keep you in the loop though!
Are you now testing with Supervisor v12.8.3+? As @st-mono mentions, there are some current state reporting improvements which may reduce your data usage. Thanks, let us know!
We have a lot on our plate though – it may be useful to check the Supervisor repo’s meta manager to see what we’re prioritizing at the moment (and feel free to ping us in GitHub issues too!)
I am one of balenaOS maintainers and want to shed some light on the OS bandwidth consumption.
Let’s start by saying that the numbers that appear in Reduce bandwidth usage - Balena Documentation are so outdated that it makes little sense to use them for anything other than setting an improvement objective. In hindsight, providing numbers for something that changes with every release was a mistake.
Those numbers are below what you are reporting, but also outdated.
We discussed this internally and decided we needed to introduce bandwidth usage as an OS constraint, and fail validation if the consumption is above a given threshold.
We are bringing in a lot of new units to our Balena fleets soon but this unexplained high data usage is becoming more of a concern and will not be a realistic solution for many of these sites.
Can we get some more insight into what is causing this, and into any ways we can lower the bandwidth?
I just tested overnight again with these config params
"RESIN_SUPERVISOR_LOG_CONTROL": "false",
"BALENA_SUPERVISOR_HARDWARE_METRICS": "false",
"RESIN_SUPERVISOR_CONNECTIVITY_CHECK": "false",
"RESIN_SUPERVISOR_POLL_INTERVAL": "18000000",
"RESIN_SUPERVISOR_VPN_CONTROL": "false"
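Note that the poll interval here differs from the 86400000 used in the earlier tests. A quick conversion, assuming the value is in milliseconds as the balena documentation describes (the helper names are just for illustration):

```python
MS_PER_HOUR = 3_600_000

def ms_to_hours(interval_ms):
    """Convert a poll interval in milliseconds to hours."""
    return interval_ms / MS_PER_HOUR

def polls_per_month(interval_ms, days=30):
    """Number of whole polls that fit in a month of the given length."""
    return days * 24 * MS_PER_HOUR // interval_ms

for value in (86_400_000, 18_000_000):
    print(f"{value} ms = {ms_to_hours(value)} h, "
          f"{polls_per_month(value)} polls per 30 days")
```

So this run polled every 5 hours, where the earlier runs polled once every 24 hours.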
I have put our container in IDLE so it is not running anything
Checked data usage this morning:
uptime: 15:15
eth0 data usage: RX bytes:17094120 (16.3 MiB) TX bytes:5748947 (5.4 MiB)
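Extrapolating that overnight reading to a month can be sketched like this. It assumes the uptime of 15:15 means 15 hours 15 minutes, that the eth0 counters cover the whole uptime, that traffic continues at the same average rate, and a 30-day month.

```python
def monthly_mib(rx_bytes, tx_bytes, uptime_hours, days_per_month=30):
    """Scale the traffic seen during uptime_hours to a month, in MiB."""
    total_bytes = rx_bytes + tx_bytes
    bytes_per_hour = total_bytes / uptime_hours
    return bytes_per_hour * 24 * days_per_month / (1024 * 1024)

# Assumption: "15:15" is hours:minutes.
uptime_hours = 15 + 15 / 60
estimate = monthly_mib(17094120, 5748947, uptime_hours)
print(f"estimated usage: {estimate:.0f} MiB/month")
```

Under those assumptions the estimate comes out at roughly 1 GiB per month, which is in line with the ~1000MB per month figure from the earlier test.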
I have granted this device support access: balena dashboard
The current config on the unit has the VPN enabled and RESIN_SUPERVISOR_POLL_INTERVAL back to its default. Feel free to reset any of these params to test with the device.
Hi again Sophia, the bandwidth constraint for the OS is still not being actively worked on. It’s a matter of juggling different priorities. I will ask our customer success team to contact you and study the business case to see where it stands with regard to other priorities.
I believe that this conversation has shifted to private chat support, so I will close this ticket. If there is still something that you would like to discuss here in the forums, please feel free to reply and the ticket will automatically re-open.