Balena API poll interval - jitter?

BALENA_SUPERVISOR_POLL_INTERVAL seems to have a jitter applied to it, as per the below PR…

However, it seems this jitter is only 60000 ms, i.e. 60 seconds.

Is this correct? And is there any way to override this setting with an environment variable?

It seems a bit silly to have a jitter of 60 seconds, with a poll interval of 24 hours.

It seems to make more sense to have a jitter of exactly the same parameter as the poll interval, so the API requests get spread out over the full period rather than all hitting the API at the same time.

But really, the best option is probably to make it configurable, IMO.

Hi Aaron,

Thanks for your question.

The API jitter delay was added as a way to ease the load on networks that have many balena devices. The balenaCloud backend has other mechanisms to deal with scaling and updates from thousands of devices, and the supervisor jitter actually interferes with those mechanisms.

Making the jitter proportional to the poll interval would make it much harder to predict load, as devices could perform a target state poll anywhere between 24 and 48 hours after the previous one (in the case of a 24-hour interval), which would actually be worse for the overall service of our platform.
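To make the 24-to-48-hour figure concrete, here is a minimal sketch. It assumes the delay before each poll is the poll interval plus a uniformly random jitter between zero and a maximum; that model is implied by the numbers above but is not taken from the supervisor source, and the function names are hypothetical.

```python
import random

HOUR_MS = 60 * 60 * 1000
DAY_MS = 24 * HOUR_MS


def next_poll_delay_ms(poll_interval_ms, max_jitter_ms, rng=random):
    """Assumed model: fixed interval plus uniform jitter in [0, max_jitter_ms]."""
    return poll_interval_ms + rng.uniform(0, max_jitter_ms)


def delay_bounds_hours(poll_interval_ms, max_jitter_ms):
    """Smallest and largest possible gap between two polls, in hours."""
    shortest = poll_interval_ms / HOUR_MS
    longest = (poll_interval_ms + max_jitter_ms) / HOUR_MS
    return shortest, longest


# Fixed 60 s jitter: the gap stays within one minute of 24 hours.
fixed = delay_bounds_hours(DAY_MS, 60_000)

# Jitter equal to the interval: the gap ranges from 24 to 48 hours.
proportional = delay_bounds_hours(DAY_MS, DAY_MS)
```

Under this model, a jitter equal to the interval doubles the worst-case time a device can go without polling, which is the unpredictability referred to above.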

I hope this answers your question.

@pipex I don’t understand how that would make it harder to predict load. Unless I’m missing something, that seems completely counterintuitive. Surely it is no harder to predict the load than with any other jitter?

In the 24-hour poll interval example, the load at scale would be randomly distributed over a 24-hour period, smoothing it out considerably. This might lead to a higher baseline, but it would also remove the huge spikes in load, which is surely beneficial for the vast majority of use cases.

Obviously, in theory they could all update at exactly the same time, but the probability of that at scale is so small that it’s not really relevant.
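The smoothing argument can be checked with a toy simulation. This is only a sketch under stated assumptions: every device starts its timer at the same moment (for example after a fleet-wide restart), each device polls once after the interval plus a uniform random jitter, and load is measured as the busiest one-minute bucket. None of this is taken from the supervisor code, and the function name is hypothetical.

```python
import random
from collections import Counter

DAY_S = 24 * 60 * 60


def peak_polls_per_minute(devices, poll_interval_s, max_jitter_s, seed=0):
    """Return the number of polls landing in the busiest one-minute bucket."""
    rng = random.Random(seed)  # seeded so repeated runs give the same result
    buckets = Counter()
    for _ in range(devices):
        # Assumed model: interval plus uniform jitter in [0, max_jitter_s].
        poll_time_s = poll_interval_s + rng.uniform(0, max_jitter_s)
        buckets[int(poll_time_s // 60)] += 1
    return max(buckets.values())


if __name__ == "__main__":
    fleet = 10_000
    # 60 s jitter: almost the whole fleet lands in the same minute.
    print("60 s jitter:", peak_polls_per_minute(fleet, DAY_S, 60))
    # 24 h jitter: polls are spread across roughly 1440 one-minute buckets.
    print("24 h jitter:", peak_polls_per_minute(fleet, DAY_S, DAY_S))
```

In this model the 60-second jitter leaves essentially the entire fleet in one minute, while the 24-hour jitter keeps the busiest minute to a small fraction of the fleet. It says nothing about how either setting interacts with backend-side scaling mechanisms.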

I understand that balenaCloud might not necessarily fit this mould, as you have a variety of customers doing a variety of things on different update pipelines. But for a single openBalena instance with a single application, for example, the fixed jitter just creates a massive and unnecessary infrastructure overhead from having to deal with these spikes.

However, in any case, why not have an env variable BALENA_SUPERVISOR_POLL_JITTER so you can configure the jitter?

This seems to be the best solution all around?
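As a rough illustration of what such a setting could look like, the sketch below reads a jitter value in milliseconds from the environment and falls back to the 60-second default discussed above. BALENA_SUPERVISOR_POLL_JITTER is only the name proposed in this post, not a setting the thread confirms exists, and the helper is hypothetical; it is written in Python for illustration rather than in the supervisor's own language.

```python
import os

# 60 s default, matching the fixed jitter discussed in this thread.
DEFAULT_POLL_JITTER_MS = 60_000

# Hypothetical variable name proposed in the post above.
JITTER_ENV_VAR = "BALENA_SUPERVISOR_POLL_JITTER"


def read_poll_jitter_ms(env=None):
    """Return the configured jitter in ms, or the default if unset or invalid."""
    env = os.environ if env is None else env
    raw = env.get(JITTER_ENV_VAR)
    if raw is None:
        return DEFAULT_POLL_JITTER_MS
    try:
        value = int(raw)
    except ValueError:
        # Non-numeric values are ignored rather than crashing the poll loop.
        return DEFAULT_POLL_JITTER_MS
    # Negative jitter makes no sense, so fall back to the default.
    return value if value >= 0 else DEFAULT_POLL_JITTER_MS
```

With this shape, `read_poll_jitter_ms({"BALENA_SUPERVISOR_POLL_JITTER": "86400000"})` would give a 24-hour jitter for a single-application fleet, while deployments that set nothing keep the current behaviour.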