API Rate limits and workarounds

Currently we have around two dozen devices, and we are polling each of them via the SDK/API about every 5 seconds.

We’re experiencing a lot of request timeouts and error messages saying there are too many requests. We can’t build reliably anymore.

I was wondering if there is any detailed information on the performance of the API, how it rate limits and when. Would it be a matter of signing up for a different plan to get better performance?

Thanks a lot for the help!

Hi there,
We do perform some rate-limiting for our API endpoints, but the specific limits are not documented, and we are still investigating what the best values would be.
May I ask why you need to poll each one of them every 5 seconds? I’m asking because if we can better understand the use case, maybe we can help you find another solution. May I also ask if, in your case, making a single request for multiple devices would still be fine?

Hi @JSReds - Thanks for the suggestion. I’m currently working on an implementation that calls the getAll* versions of the functions instead of multiple get functions.
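
In case it helps others reading along, here is a minimal sketch of the “single request for multiple devices” idea using the Python SDK (method names in the other SDKs differ, and the printed fields are only illustrative):

```python
from balena import Balena

balena = Balena()
balena.auth.login_with_token("<API_TOKEN>")  # placeholder token

# One request for every device visible to this token,
# instead of one GET per device every few seconds.
devices = balena.models.device.get_all()

# Index the response locally and read per-device state from memory.
by_uuid = {d["uuid"]: d for d in devices}
for uuid, device in by_uuid.items():
    print(uuid, device.get("device_name"), device.get("is_online"))
```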

We are observing the same problem. Our automated tests automatically deploy releases and test configurations by updating and checking environment variables for multiple devices. This usually works well when executed from a developer’s computer, but we are getting the “too many requests” error when executing the tests from the GitLab CI. If the limits are not documented, can you provide some indication of how to get more reliable results?

Hey Andreas,

It would be faster to fetch the environment variables of all of the devices in one query to reduce the number of API calls. I am assuming you are getting the “too many requests” issue on every test run; can you be certain the tests are only run once? Can you approximate how many API calls you might be making? The rate limits are quite high, so you shouldn’t be hitting them for normal use cases.

We are using “balena deploy” via the balena CLI to update two applications with about 10 services each. I assume the command-line tool will produce API calls. Next, we are using the Python SDK and the get_all versions to obtain the environment variables, for example balena.models.environment_variables.device_service_environment_variable.get_all, which I expect is one API call. The test breaks when we create/update environment variables. We do one call per variable, in total about 50 over two applications and 6 devices. We are seeing the errors after about 40 calls. My understanding is there is no API call to set multiple variables?
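
Since there doesn’t seem to be a batch call, one workaround is to pace the per-variable requests and retry when the API answers “Too Many Requests”. A rough sketch, assuming the SDK surfaces the error as balena.exceptions.RequestError; the set_variable stub and pending_variables dict are placeholders for our own code, not SDK methods:

```python
import time

from balena import exceptions


def call_with_backoff(fn, *args, retries=5, base_delay=1.0, **kwargs):
    """Call fn, backing off exponentially on 'Too Many Requests' responses."""
    for attempt in range(retries):
        try:
            return fn(*args, **kwargs)
        except exceptions.RequestError as err:
            if "Too Many Requests" not in str(err) or attempt == retries - 1:
                raise
            time.sleep(base_delay * (2 ** attempt))  # 1 s, 2 s, 4 s, ...


def set_variable(name, value):
    # Stub standing in for the real per-variable create/update SDK call.
    print(f"would set {name}={value}")


# Placeholder data: in practice this would come from the test configuration.
pending_variables = {"MY_VAR_1": "value1", "MY_VAR_2": "value2"}

for name, value in pending_variables.items():
    call_with_backoff(set_variable, name, value)
    time.sleep(0.5)  # fixed pause so ~50 writes don't arrive as one burst
```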

The problem disappeared for us.

Hi

This is strange. Can you let us know if you start seeing this again?

It would help if you could share some logs and, if possible, a script to reproduce this issue. That would make it easier to recreate on our end.

Yes, I will let you know if we see it again. I found that we actually do a lot more than 40 API calls. We are doing a balena deploy and then querying the devices every second to check whether they are on the right commit and the supervisor is idle. In API calls this looks like:

2021-03-16 11:56:27,651 [DEBUG] caddeploy.device.CADAMI-Server: [WAIT] - Not on right commit or supervisor not idle.
2021-03-16 11:56:28,657 [DEBUG] urllib3.connectionpool: Starting new HTTPS connection (1): api.balena-cloud.com:443
2021-03-16 11:56:28,738 [DEBUG] urllib3.connectionpool: https://api.balena-cloud.com:443 "GET /v5/device?$filter=uuid%20eq%20'dDEVICE_ID' HTTP/1.1" 200 None
2021-03-16 11:56:28,744 [DEBUG] urllib3.connectionpool: Starting new HTTPS connection (1): api.balena-cloud.com:443
2021-03-16 11:56:28,827 [DEBUG] urllib3.connectionpool: https://api.balena-cloud.com:443 "GET /v5/device?$filter=uuid%20eq%20'DEVICE_ID' HTTP/1.1" 200 None
2021-03-16 11:56:28,834 [DEBUG] urllib3.connectionpool: Starting new HTTPS connection (1): api.balena-cloud.com:443
2021-03-16 11:56:28,909 [DEBUG] urllib3.connectionpool: https://api.balena-cloud.com:443 "GET /v5/application?$filter=id%20eq%20'APP_IDb' HTTP/1.1" 200 633
2021-03-16 11:56:28,916 [DEBUG] urllib3.connectionpool: Starting new HTTPS connection (1): api.balena-cloud.com:443
2021-03-16 11:56:28,999 [DEBUG] urllib3.connectionpool: https://api.balena-cloud.com:443 "GET /v5/application?$filter=app_name%20eq%20'OTAcast-server-test' HTTP/1.1" 200 633
2021-03-16 11:56:29,008 [DEBUG] urllib3.connectionpool: Starting new HTTPS connection (1): api.balena-cloud.com:443
2021-03-16 11:56:29,448 [DEBUG] urllib3.connectionpool: [https://api.balena-cloud.com:443](https://api.balena-cloud.com/) "POST /supervisor/v1/device HTTP/1.1" 200 370

Traceback when the error happens:

Traceback (most recent call last):
  File "/cadedge_tools/caddeploy_pkg/caddeploy/utils.py", line 66, in run_parallel
    return_values.append(task.result())
  File "/usr/local/lib/python3.9/concurrent/futures/_base.py", line 433, in result
    return self.__get_result()
  File "/usr/local/lib/python3.9/concurrent/futures/_base.py", line 389, in __get_result
    raise self._exception
  File "/usr/local/lib/python3.9/concurrent/futures/thread.py", line 52, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/cadedge_tools/caddeploy_pkg/caddeploy/utils.py", line 23, in wrapper
    return func(obj, *args, **kwargs)
  File "/cadedge_tools/caddeploy_pkg/caddeploy/device.py", line 150, in wait_for_release
    while retries > 0 and not (self.balena.is_on_right_commit() and self.balena.is_idle()):
  File "/cadedge_tools/caddeploy_pkg/caddeploy/balena_device_adapter.py", line 153, in is_on_right_commit
    app_commit = self.balena.models.application.get(self.application_name)["commit"]
  File "/usr/local/lib/python3.9/site-packages/balena/models/application.py", line 146, in get
    apps = self.base_request.request(
  File "/usr/local/lib/python3.9/site-packages/balena/base_request.py", line 197, in request
    raise exceptions.RequestError(response._content)
balena.exceptions.RequestError: b'Too Many Requests'
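
One thing that stands out in the loop above is that the application record (and with it the target commit) is re-fetched on every iteration for every device, even though it only changes when a new release is pushed. A rough sketch of looking it up once per deploy and then polling each device at a backed-off interval; the commit field name may differ between API versions and the supervisor-idle check is omitted, so treat this as an illustration only:

```python
import time

from balena import Balena

balena = Balena()
balena.auth.login_with_token("<API_TOKEN>")  # placeholder token


def wait_for_release(uuid, application_name, timeout=600):
    # Fetch the target commit once per deploy instead of once per poll iteration.
    target_commit = balena.models.application.get(application_name)["commit"]

    delay, waited = 2, 0  # start at 2 s and back off, instead of a fixed 1 s poll
    while waited < timeout:
        device = balena.models.device.get(uuid)  # one GET per iteration
        # Commit field name may vary by API version; adjust to what your SDK returns.
        if device.get("is_on__commit") == target_commit:
            return True
        time.sleep(delay)
        waited += delay
        delay = min(delay * 2, 30)  # cap the polling interval at 30 s
    return False
```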

“We do perform some rate-limiting for our API endpoints, but the specific limits are not documented, and we are still investigating what the best values would be.”

Any updates on this?

Hi. Long story short, the rate limiting mechanisms are not publicly described in detail because they are subject to change. Ideally, balena aims to provide generous limits to allow reasonable workflows without disruptions. If you’re hitting the limits, it might be an opportunity to optimize the approach (e.g. call less often or fetch more information in a single call).

As always, the team is here to assist, even though it’s on a best-effort basis in the public forums. Hope that makes sense.

Well, I think anything can always change. It wouldn’t hurt to just know the current rate limit settings.

FYI, the use case on our side is that we need to integrate Prometheus metric data with our internal Prometheus infrastructure, so this API is needed to discover the endpoints on all the balena devices. It’s a really low cost in terms of calling the API; I was just curious about the rate limit numbers.
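
For what it’s worth, that discovery pattern can usually be served by one periodic get_all call written out as a Prometheus file_sd target list. A rough sketch; the exporter port, the choice of address field, and the output path are assumptions about a particular setup:

```python
import json
import time

from balena import Balena

balena = Balena()
balena.auth.login_with_token("<API_TOKEN>")  # placeholder token

METRICS_PORT = 9100  # assumed exporter port on each device
TARGETS_FILE = "/etc/prometheus/balena_targets.json"  # assumed file_sd path

while True:
    devices = balena.models.device.get_all()  # a single API call per refresh
    targets = []
    for d in devices:
        # ip_address can hold several space-separated addresses; take the first.
        address = (d.get("ip_address") or "").split(" ")[0]
        if d.get("is_online") and address:
            targets.append({
                "targets": [f"{address}:{METRICS_PORT}"],
                "labels": {"balena_uuid": d["uuid"]},
            })
    with open(TARGETS_FILE, "w") as f:
        json.dump(targets, f)  # Prometheus file_sd format
    time.sleep(300)  # refresh every five minutes
```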