Notifications for production errors?

xginn8 · May 1, 2019, 4:57pm

Hi again @mpark,

If you have the development bandwidth, it is always good practice to set up monitoring for your services and devices. Some common stacks include the Telegraf/TICK stack, Prometheus, and Datadog. I hope to publish an updated Prometheus guide soon, so stay tuned for that as well. A log forwarding service can also be useful if you have a robust logging setup, though I find pattern matching in logs to be a little brittle for catching arbitrary errors in production.

Additionally, there are some things you can do on-device to make your application more resilient to failure. We always recommend configuring a HEALTHCHECK in your Dockerfile, and making sure you have tested some common failure cases for your app.
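As a rough illustration of the HEALTHCHECK idea, here is a minimal sketch of a probe script that a health check command could run. It assumes your application exposes an HTTP health endpoint; the file name `healthcheck.py`, the URL, and the `is_healthy` helper are all hypothetical and only there for illustration, so adapt them to whatever "healthy" means for your app.

```python
#!/usr/bin/env python3
"""Hypothetical healthcheck.py: probe an HTTP endpoint and report via exit code."""
import http.client
import sys
import urllib.error
import urllib.request


def is_healthy(url: str, timeout: float = 3.0) -> bool:
    """Return True only if the URL answers with HTTP 200 within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # Covers refused connections, timeouts, DNS failures,
        # 4xx/5xx responses (raised as HTTPError), and malformed URLs.
        return False


# Only probe when exactly one argument (the URL) is passed on the command line.
if __name__ == "__main__" and len(sys.argv) == 2:
    # Docker interprets exit code 0 as healthy and 1 as unhealthy.
    sys.exit(0 if is_healthy(sys.argv[1]) else 1)
```

In a Dockerfile this could be wired up with something like `HEALTHCHECK --interval=30s --timeout=5s --retries=3 CMD python3 /usr/src/app/healthcheck.py http://127.0.0.1:8080/health`, where the path, port, and endpoint are assumptions about your image. Note that the sketch does nothing (and so exits 0) if the URL argument is missing, so double-check the command line when you set it up.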

Again, we are working on many of these problems now internally, so please let us know what issues you run into or what would make your life as a fleet owner easier!
