Install Netdata on balena device (Raspberry Pi 3); Issues & alternatives

Hey everyone,

I have been fiddling with balena for quite some time, trying to figure out its boundaries. The first experiment was a multi-tenancy scheme leveraging balena as the orchestrating platform, as described in this forum post.

My latest endeavour has been installing netdata, an amazingly good monitoring system, on balena, as illustrated in these posts:


I created a new forum post so we can aggregate all ideas and suggestions into a nice post for future reference.

The problem is that netdata needs special mounts to get the full picture of the host:

...
    security_opt:
      - apparmor:unconfined
    volumes:
      - /etc/passwd:/host/etc/passwd:ro
      - /etc/group:/host/etc/group:ro
      - /proc:/host/proc:ro
      - /sys:/host/sys:ro
...

I installed netdata without those mounts to get a better view of what it can do. There are tons of metrics and a system-wide overview, but it lacks the range and detail one would want on a production-critical device.

Any ideas how to proceed?

I have thought about using a modified version of balenaOS in which the mounts would be enabled through labels, as is currently possible for the firmware and kernel.

Another possible approach could be monkey-patching something with developer-os and SSHing into the host (unlikely to deliver the performance needed for 1 metric/s), or using dbus (though I am not sure how that works).

Thank you for your time, can’t wait to hear your thoughts and ideas!

Are these:

      - /etc/passwd:/host/etc/passwd:ro
      - /etc/group:/host/etc/group:ro
      - /proc:/host/proc:ro
      - /sys:/host/sys

all the mounts netdata needs, or will there be others?

hey @telphan,

According to their docker installation manual, this is the entire docker-compose:

version: '3'
services:
  netdata:
    image: netdata/netdata
    hostname: example.com # set to fqdn of host
    ports:
      - 19999:19999
    cap_add:
      - SYS_PTRACE
    security_opt:
      - apparmor:unconfined
    volumes:
      - /etc/passwd:/host/etc/passwd:ro
      - /etc/group:/host/etc/group:ro
      - /proc:/host/proc:ro
      - /sys:/host/sys:ro

Moreover, netdata needs the Docker socket (or a Docker socket proxy) to resolve container names. I suspect it is possible to reconfigure the socket path in netdata to point to the balena-engine socket.
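
For anyone reading later, a minimal sketch of one way this might be done without touching the supervisor. It assumes the supervisor label io.balena.features.balena-socket (not mentioned in this thread) is available on your supervisor version and exposes the balena-engine socket to the container. Whether netdata then picks that socket up for container-name resolution is untested here and is also an assumption.

```yaml
version: '2.1'
services:
  netdata:
    image: netdata/netdata
    ports:
      - 19999:19999
    labels:
      # Assumption: this label asks the supervisor to bind mount the
      # balena-engine socket into the container. Verify the label name
      # and the resulting socket path against your supervisor version.
      io.balena.features.balena-socket: '1'
```

If netdata does not find the socket on its own, its socket path would still need to be pointed at wherever the supervisor mounts it.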

Hey everyone (or just me?),

From a little research that I did, enabling custom volumes is (probably) possible through the balena supervisor.

For example, in utils.ts on GitHub we see that the call service.config.volumes.push('/run/dbus:/host/run/dbus'); adds the corresponding volume.

  1. Is it safe to presume that using a custom version of the supervisor will be able to load an arbitrary number of volumes, including the ones we need to run netdata?

  2. If so, would it be possible to build a demo by following the commands in the supervisor's README.md?

As always, thanks for your time!

cheers :slight_smile:

Hi there @odyslam,

Thanks for asking about netdata! As you may have already surmised from me adding balena to the netdata known apps, proper support for netdata specifically is something we are actively working on. I have committed a few things upstream to make this easier, and I am resurrecting a brief demo docker-compose.yml for running netdata on-device.

With that said, you are certainly able to modify the supervisor if you like, but that will obviously make it harder to maintain and support. The limitations for mounts come from the supervisor itself (as you have already discovered), not necessarily balenaEngine. As such, you can also launch a fully-featured netdata daemon directly from the host OS shell, though again there are limitations to that approach in that the container lifecycle will suffer due to no supervisor management.

Hey @xginn8,

If you need help, I would love to aid in any way with the integration!

A couple of questions though:

  1. Why run netdata from hostOS if you can launch a user application container with augmented rights that will load all the necessary volumes?
  2. How can one install netdata directly into the hostOS?
  3. Have you personally tried the other monitoring solutions that balena has blogged about (datadog, prometheus)? Do they get by without special container privileges (like access to volumes) because they gather fewer metrics and offer a limited view of the system, or because they are integrated into balena in a different way?

In any case, I suspect that there is an official strategy since you are already investigating it.

I am more interested in working towards a production-grade integration of the platforms (i.e. proper support of netdata) than in locally monkey-patching a supervisor and running netdata.

Thank you so much for your time, it is very much appreciated.

Hi again @odyslam,

Thanks for your offer! I will answer each question inline:

  1. Why run netdata from hostOS if you can launch a user application container with augmented rights that will load all the necessary volumes?

I didn’t mean to imply that you should be running netdata in this way, just that it’s possible to work around the self-imposed limitations in the supervisor by launching your service outside of the supervisor. To be clear, that’s very much a hacky solution and why we are working towards native netdata support! If you like, privileged: true will get you quite far with accessing host OS data.
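
For reference, a minimal sketch of what the privileged: true route could look like in a docker-compose.yml. The service name, image, and port are taken from the netdata compose file quoted earlier in this thread. The compose version and the exact set of metrics you get this way are assumptions, so treat this as a starting point rather than a tested setup.

```yaml
version: '2.1'
services:
  netdata:
    image: netdata/netdata
    # Gives the container broad access to host devices and kernel
    # interfaces. It does not add the /host/proc and /host/sys bind
    # mounts from the upstream netdata compose file.
    privileged: true
    ports:
      - 19999:19999
```

The trade-off is that a privileged container is far less isolated from the host, which is worth weighing on a production device.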

  2. How can one install netdata directly into the hostOS?

If you are so inclined you can build a custom OS, but I think it’s a good bit easier to simply launch a container from the host OS with all the goodies you need.

  3. Have you personally tried the other monitoring solutions that balena has blogged about (datadog, prometheus)? Do they get by without special container privileges (like access to volumes) because they gather fewer metrics and offer a limited view of the system, or because they are integrated into balena in a different way?

I have tried some of them (I authored this post earlier this year, using Prometheus: https://www.balena.io/blog/monitoring-the-edge-with-prometheus-pt-1/). I have not tried Datadog specifically. These monitoring systems are subject to the same limitations as netdata, since primarily they use the same filesystem (/proc and /sys from the host OS). They are also using privileged: true to access the host OS in a more complete fashion (see https://github.com/balena-io-playground/balena-datadog/blob/master/docker-compose.yml#L10).

Hi @xginn8

Any update on netdata becoming one of the known apps, or on proper support from balena? I’m about to embark on something similar to @odyslam but am quite new to everything. I’ll still be giving it a go nonetheless!

Hi @ibex, unfortunately there isn’t an ETA yet, as we have had to refocus our efforts on other projects for the time being. The one thing I would mention is that in supervisor 10.8.0 it’s now possible to specify a label to enable mounting of /proc into your netdata container, which should provide much better metrics, if memory serves.
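
A minimal sketch of what that could look like, assuming the label in question is io.balena.features.procfs (the name is not given in this thread, so please check it against the supervisor documentation). The remaining keys are copied from the netdata compose file quoted above. Where the supervisor places the host /proc inside the container is not stated here, so netdata may need to be pointed at it.

```yaml
version: '2.1'
services:
  netdata:
    image: netdata/netdata
    ports:
      - 19999:19999
    cap_add:
      - SYS_PTRACE
    security_opt:
      - apparmor:unconfined
    labels:
      # Assumed label name: asks the supervisor (10.8.0 or later, per
      # the post above) to mount the host /proc into this container.
      io.balena.features.procfs: '1'
```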
