Balena logging aggregator/shipper

Hi all, I’m working on standardizing our logging setup around some best practices and would love some input. So far I’ve built custom app-level logging in Python that ships logs directly to Datadog (using daiquiri), but I can already see the complexity that per-app log configurations will cause as new services are introduced. This approach also doesn’t address log management for third-party apps, which have their own logging methods.

So in the interest of simplicity, rather than configuring logs one by one, I’d like to have all balena, container, and app logs (first party or third party) go to stdout and use Fluent Bit or similar to handle the complex work of aggregating, parsing, and shipping. The upside is that it also makes switching between or combining vendors like Datadog, Elastic, Grafana/Loki, CloudWatch, etc. much, much easier. In some quick tests I easily shipped logs to Datadog, New Relic, and S3 all at the same time.

I’ve been testing:

  • Fluent Bit: Seems like a fantastic fit. Well established, based on its older brother Fluentd, and tailored to be small and lightweight, which is perfect for embedded.
  • Vector: Also excellent. Very powerful, with a fast-growing community, but it’s a bit of a resource hog compared to Fluent Bit and the docs are a little confusing. Upside: it handles metrics quite well too.

I’ll document my progress in this thread if anyone else wants to give input.


Here’s a Fluent Bit example that simply generates some dummy log data and outputs it to the terminal and New Relic. It plays nicely with environment variables too, so in balena I set an NR_API_KEY variable and it just works.

docker-compose.yml

fluent-bit:
    build: fluent-bit
    privileged: true
    restart: always
    network_mode: host
    volumes:
        - 'persistent-data:/persistent-data'
    labels: 
        io.balena.features.balena-socket: '1'
        io.balena.features.journal-logs: '1'
    ports:
        - '2020:2020'

Dockerfile

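FROM fluent/fluent-bit:1.8

# Absolute destination so the file lands where CMD expects it
COPY fluent-bit.conf /fluent-bit/etc/fluent-bit.conf

# Plugins_File in [SERVICE] points at /fluent-bit/plugins/plugins.conf
COPY plugins.conf /fluent-bit/plugins/plugins.conf

# Use ADD to get the remote New Relic plugin, rename it, and copy it to the plugins folder
# (arm64 build; pick the release asset that matches your device architecture)

ADD https://github.com/newrelic/newrelic-fluent-bit-output/releases/download/v1.7.0/out_newrelic-linux-arm64-1.7.0.so /fluent-bit/plugins/out_newrelic.so

# Exec form, so the command does not depend on a shell in the base image
CMD ["/fluent-bit/bin/fluent-bit", "-c", "/fluent-bit/etc/fluent-bit.conf"]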

plugins.conf

[PLUGINS]
    Path    /fluent-bit/plugins/out_newrelic.so

fluent-bit.conf

# More info: https://kevcodez.de/posts/2019-08-10-fluent-bit-docker-logging-driver-elasticsearch/

##########################
# Configure Service
##########################
[SERVICE]
# This is the main configuration block for Fluent Bit.
# Use the HTTP server to get Prometheus metrics for Fluent Bit itself,
# e.g. curl -s http://127.0.0.1:2020/api/v1/metrics/prometheus

    Plugins_File    /fluent-bit/plugins/plugins.conf
    HTTP_Server     On
    # HTTP_Listen     0.0.0.0 # May cause a problem with balena https://www.balena.io/docs/learn/develop/runtime/#using-dns-resolvers-in-your-container
    HTTP_Listen     127.0.0.1
    HTTP_PORT       2020
    log_level       debug

##########################
# Configure Inputs
# https://docs.fluentbit.io/manual/pipeline/inputs
##########################
[INPUT]
    Name        dummy
    Tag         dummy.log


##########################
# Configure Parsers
# https://docs.fluentbit.io/manual/pipeline/parsers
##########################


##########################
# Configure Filters
# https://docs.fluentbit.io/manual/pipeline/filters
##########################


##########################
# Configure Outputs
# https://docs.fluentbit.io/manual/pipeline/outputs
##########################
[OUTPUT]
    Name        stdout
    Match       **

[OUTPUT]
    # Don't forget to install New Relic's plugin.
    # More info: https://docs.newrelic.com/docs/logs/enable-log-management-new-relic/enable-log-monitoring-new-relic/fluent-bit-plugin-log-forwarding/#fluentbit-plugin
    Name        newrelic
    Match       *
    licenseKey  ${NR_API_KEY}
    endpoint    https://log-api.newrelic.com/log/v1

# [OUTPUT]
#     # Unlike New Relic, no extra plugin is required.
#     Name          datadog
#     Match         *
#     Host          http-intake.logs.datadoghq.com
#     TLS           on
#     compress      gzip
#     apikey        ${DD_API_KEY}
#     dd_service    test
#     dd_source     dummy
#     dd_tags       environment:test
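
Since Fluent Bit expands environment variables in its config (as with NR_API_KEY above), one possible use for the empty Filters section is stamping every record with the device it came from. The fragment below is an untested sketch, not part of the setup above: it uses the record_modifier filter and assumes the BALENA_DEVICE_UUID and BALENA_APP_NAME variables that balena injects into containers. The device_uuid and app_name key names are made up for illustration.

```conf
[FILTER]
    # Append two keys to every record, whichever input produced it
    Name    record_modifier
    Match   *
    Record  device_uuid ${BALENA_DEVICE_UUID}
    Record  app_name ${BALENA_APP_NAME}
```

With a fleet of devices shipping to the same vendor account, a key like this would give you something to group or filter on regardless of which output is enabled.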

One thing I haven’t figured out yet is how best to use Docker’s logging driver. In theory you could set each service in your docker-compose.yml to use fluentd as its logging driver and have Fluent Bit automatically capture and handle the logs.

For example, in docker-compose.yml:

my-service:
    build: my-service
    privileged: true
    restart: always
    network_mode: host
    logging:
        driver: fluentd
        options:
            tag: my-service
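
On the collector side, Docker’s fluentd driver speaks the Forward protocol and by default sends to localhost:24224, so Fluent Bit would need a forward input listening there. A minimal sketch, assuming the engine honors the logging key and that both containers use host networking as in the examples above:

```conf
[INPUT]
    # Receives records sent by the fluentd logging driver
    Name    forward
    Listen  127.0.0.1
    Port    24224
```

Records would arrive tagged my-service (the tag option above), so an [OUTPUT] section with Match my-service could route that service separately from everything else.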

Thanks @barryjump for the context here. If I understand this correctly, you are looking to see if we can use the logging keyword for compose directly with balena-engine.

Looking over our balena-engine code and CLI, I think drop-in usage should be possible without an issue (ref: plugins_logging.md in the balena-os/balena-engine-cli repository on GitHub; our logging base is the same as Docker’s). Let us know how it goes with Fluent Bit.

Regards,
N

Thanks @nitish, that’s good news. Yeah, that’s roughly the goal.

Here’s a simple diagram of what I’m hoping to do. Yellow boxes are individual containers.


Thanks @barryjump for sharing the overall view (“big picture”). This looks great and sounds like a great blog post once done!

This is exactly what I’m hoping to achieve. I hadn’t even heard of Fluent Bit until this post, so thanks! I will try it out. Have you made any progress?

Update: Looks like Fluent Bit isn’t readily available for armv6, which we still need to support. Either I need to build from source or stick to Telegraf.

@john_gronska you might also want to look into https://vector.dev/


Thanks, I hadn’t heard of that either. Any thoughts about Vector vs Fluent Bit?

(Unfortunately for us, looks like Vector also doesn’t support armv6. At some point we will be able to move away from that requirement and potentially adopt Fluent Bit or Vector.)

In any case, I was able to get Telegraf to read container logs in balena using its docker_log input and, e.g., stream them to CloudWatch Logs. I just need this bit in my docker-compose.yml:

    labels:
      io.balena.features.balena-socket: '1'

Whoops, sorry, I didn’t realize Vector didn’t support armv6.
I like Vector, lots of promise. Very flexible and backed by Datadog, which I absolutely love. However, it is definitely less mature than Fluentd/Fluent Bit.

If I had to choose for a prod deployment, I’d likely stick with Fluent Bit. A year from now? Perhaps not.

I just wanted to update this forum post to point out that the logging field of docker-compose files is not supported by balena. We followed this thread and other advice to try to set up logging by having our containers forward logs to fluentd, as @barryjump’s diagram above showed, but this unfortunately does not seem possible.

Feb 07 19:55:31 0300f70 balena-supervisor[3117]: [warn]    Ignoring unsupported or unknown compose fields: logging

This field is listed as unsupported, but we unfortunately missed it (see “docker-compose.yml fields” in the balena documentation).

@nitish is there a straightforward way to use the Tail feature from Fluent Bit without any additional configuration? The relevant docs are the Tail input page of the Fluent Bit Official Manual (3.0).

Ideally, we would run a single container responsible for scraping logs from /var/log/containers/*.log.
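
Whether /var/log/containers exists on a balena device is not established in this thread (that path is the Kubernetes convention). One alternative that may be worth trying, sketched here as an untested assumption: the fluent-bit service earlier in this thread already carries the io.balena.features.journal-logs label, which mounts the host journal into the container, so Fluent Bit’s systemd input could read from there instead of tailing files. This assumes the engine writes container output to journald. The journal.* tag and the DB file name are arbitrary choices.

```conf
[INPUT]
    Name            systemd
    Tag             journal.*
    # Assumption: journal is mounted here; if the device keeps a volatile
    # journal, /run/log/journal may be the directory to point at instead
    Path            /var/log/journal
    # Remember the read position across restarts (persistent volume from the compose file)
    DB              /persistent-data/flb_journal.db
    # Start from new entries instead of replaying the whole journal
    Read_From_Tail  On
```

If container output does land in the journal, the CONTAINER_NAME field added by the journald logging driver should identify which service each record came from.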