Currently when a user device is creating logs, we only store a limited amount of them due to cost and storage requirements. Explore a way to allow for a paid upgrade of storage allowance.
We’d really appreciate a starting point here of just being able to configure the amount of disk space journald is allowed to use when writing persistent logs - we have devices with a reasonably large amount of robust storage so would be happy to assign anything up to a few gigabytes of storage to be able to diagnose why a device went offline once we regain access to it.
The current 32MB limit is a pretty extreme constraint, particularly when you’re running software that can be pretty noisy when connectivity is lost - exactly the time we need persisted logs. A recent benchmarking exercise showed that a device that had lost comms could retain 19 hours’ worth of logs in that 32MB.
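To put the 19-hour figure in context, here is a rough back-of-the-envelope sketch of how retention might scale with a larger limit. It assumes the log rate stays constant and that retention scales linearly with the size limit, neither of which is guaranteed in practice; the function name and the example limits are purely illustrative.

```python
def retention_hours(limit_mb: float,
                    observed_limit_mb: float = 32.0,
                    observed_hours: float = 19.0) -> float:
    """Estimate how many hours of logs fit in limit_mb.

    Assumption: the device keeps logging at the same average rate as in
    the benchmark (32MB filled in 19 hours) and retention scales linearly.
    """
    rate_mb_per_hour = observed_limit_mb / observed_hours
    return limit_mb / rate_mb_per_hour


if __name__ == "__main__":
    # Hypothetical limits, chosen only to illustrate the scaling.
    for limit_mb in (32, 256, 1024, 2048):
        hours = retention_hours(limit_mb)
        print(f"{limit_mb:>5} MB -> {hours:.0f} hours ({hours / 24:.1f} days)")
```

Under those assumptions, a 1024MB limit would hold roughly 608 hours (about 25 days) of logs at the benchmarked rate.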
Alex Gonzalez: Hi Jon,
The feature in question refers to increasing the balenaCloud log retention, whereas it seems you are referring to on-device persistent logging.
I can offer some insight into the discussions that we have had around increasing the persistent logs on device.
First, it’s important to understand that we want to avoid fragmenting balenaOS with unneeded configurations so that:
- We test exactly the same system internally that our customers are using
- We don’t increase support overhead having to ask and compare configurations
- We don’t add the overhead of maintaining/documenting and testing different configurations
What we usually do instead is come up with a sensible default that works for most of our customers. This simplifies maintaining balenaOS for more than 100 different device types.
With persistent logging there is also a misconception - it is not designed to be permanently on as it will wear out media. That in itself is controversial because some devices can just swap storage disks, while others need to be replaced when the storage media fails. balenaOS tries to be conservative and avoid writes as much as possible.
So, we offer persistent logging as a way to debug sporadic problems. Customers are advised to forward logs to the cloud (here is where this feature request comes in as currently we have a small retention) and not to keep them on device, and to reduce the amount of logging that apps generate as it can also become a problem for the engine and supervisor if their logging is excessive.
Your use case is that there are devices that disconnect from the network, and by the time they are recovered the persistent logs have already rotated. It’s a valid point, and I think we could discuss increasing the default from 32MB, but we would not make it configurable, and we would not make it so big that customers decide to just leave it on as an alternative to cloud logging.
As a side note, if you are chasing a specific problem you could also manually increase the log size by modifying the root filesystem on a few devices - contact us via support channels if this is something you would like to follow up on.