Data collection by balena-managed devices

Hello everybody!

I’m currently looking into balena as a device management framework for one of our projects.
First of all, great work! I very much like your idea, what you are aiming at, and the amount of effort and value you put into your documentation.

In the course of my research I came across this page about the data that is collected and sent to Mixpanel for balena-managed devices:
https://www.balena.io/docs/learn/more/collected-data/

The site suggests that it is possible to configure balena in a way that prohibits the collection of such data - quote: “… Data is submitted to Mixpanel from balena-managed devices (if allowed) …”
Unfortunately I wasn’t able to find any details on how to proceed when trying to disable the aforementioned data collection.

Do you have any hints or links you could provide me with to accomplish this task?

Hey,

Thanks for reaching out, and thank you for the compliments; we really appreciate them :+1:

I am not aware of a way to stop a device from reporting analytics data; it now goes via our API rather than directly to Mixpanel. Is there some data you consider sensitive that you would not like to go to Mixpanel? We use the data purely for service performance/behaviour tracking and not to log any of your personal application logs/data etc.

Hi,

Thanks a lot for the quick reply!

To be honest at the start it was more of a general question as I’m trying to wrap my mind around the framework - triggered by the addition of “(if allowed)” to the sentence quoted above.
If there is no way to disable the reporting of such data, it may be worth considering removing the addition quoted above, so that others don’t end up searching their way around the documentation as I did :slight_smile:

Thinking further, there may be some data that could be regarded as sensitive, depending on the individual project / use case: e.g. any clear name such as the Application name (if provided with sufficient context information), or the device IP and the data that may be extracted from it.

But as I said, what’s sensitive surely depends on what is regarded as sensitive in your individual use case.
So thanks for having this list out there; it helps to know which points to think about and which not.

Best regards, happy holidays, and keep up the good work!

Hi Cristian,
We currently do not offer a way to prevent devices from sending analytics to our backend.
Indeed, the “(if allowed)” on that documentation page is inaccurate and confusing, as you mentioned.
I’ve opened an issue to remove that.
Thanks for pointing this out.

Kind regards,
Thodoris

Hi Cristian,
Just wanted to let you know that the documentation page was updated. Thank you for your feedback and happy holidays!

For reference, link to docs:
https://www.balena.io/docs/learn/more/collected-data/#device-data-collected-by-the-supervisor

Hi,

thanks a lot for the quick reaction here.

A follow-up question for my understanding: as the analytics data goes via the API at the moment, I assume it is transmitted both when using balenaCloud and when using openBalena as a backend?
Or is there any difference between balenaCloud and openBalena here?

Best regards,
Cristian

Hello Cristian,

Yes, there is a difference: only devices on balenaCloud are (as the docs say) balena-managed; devices on openBalena are not. So if someone is using openBalena as a backend to manage their devices, no analytics data would be transmitted to us.

Kind Regards,
Marios

Hi Marios,

thanks for the quick reply.
I assumed that “managed” referred to whether a balena supervisor was running on a device or not, as mentioned e.g. here: https://www.balena.io/docs/reference/OS/overview/2.x/#standalone-balenaos

If that’s not the case, it’s good to hear that it is possible to “opt out” of the data collection by running an openBalena instance.

Best regards,
Cristian