I am experiencing unexplained reboots of my device (BeagleBone Green Wifi). I log any reboots initiated by the NodeJS application running on the device, but these reboots leave no explanation in the application logs. This makes me think the reboot was caused by something external to the application.
I’d like to check the syslog but can’t seem to find it on the system. When using journalctl, I am only able to see the BusyBox klog, not the syslog. Is the syslog being written to at all by resinOS? Or am I looking in the wrong place completely?
I really just need some way to access logs that describe the behavior of the system at the time of unexpected reboot. Any help is greatly appreciated!
Hey, when you say a reboot, how are you detecting that? It could be that your application is being restarted because an updated build was pushed, env vars were changed, etc. To your application that could look like a reboot even though the system itself never rebooted.
As for seeing the full system log, that should be visible by accessing the host OS via the web terminal and then using journalctl.
Reboot = the application has restarted and logged that it is starting up again. It is possible that it is not a full system reboot - however updates and env variables are not being changed on these devices. A cause has not been determined yet and so I want to check the system logs to see if I can find anything. My hunch is that there is either a memory leak causing all system RAM to be taken up, or something is causing balena to restart the container.
journalctl appears to only output information from the current boot, since when I run journalctl --list-boots I only ever see a single boot. If I need to evaluate the logs from a previous boot what is the best approach?
If you check the device logs in the resin.io dashboard, it should say when your application is restarted by the supervisor, or if the supervisor reboots the device (i.e. when you hit “reboot device” in the dashboard), so that might also help to narrow down the cause.
For logging, by default we store the journald logs in memory with rotation, as this reduces a lot of the wear on the SD card and increases its lifespan/minimises potential corruption issues. However, you can enable persistent logging by adding a "persistentLogging": true key to /mnt/boot/config.json and rebooting (https://docs.resin.io/reference/OS/overview/2.x/#dev-vs-prod-images).
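In case it helps, here is a minimal sketch of making that config.json change without hand-editing the JSON. It assumes Python is available wherever you edit the file (for example with the SD card mounted on another machine, since the host OS may not ship Python). The function name enable_persistent_logging is just something made up for illustration; only the key name and the default path come from the post above.

```python
import json
from pathlib import Path


def enable_persistent_logging(config_path="/mnt/boot/config.json"):
    """Set "persistentLogging": true in the config file, keeping all other keys.

    Hypothetical helper for illustration; pass a different path if the boot
    partition is mounted somewhere else.
    """
    path = Path(config_path)
    config = json.loads(path.read_text())  # fails loudly if the file is not valid JSON
    config["persistentLogging"] = True     # serialised as the JSON literal true
    path.write_text(json.dumps(config))
    return config


# Example (not run automatically, since it rewrites the file):
# enable_persistent_logging("/mnt/boot/config.json")
```

Reading and re-serialising the whole file avoids a stray comma or quote leaving config.json unparseable. The device still needs a reboot afterwards for the setting to take effect, as described above.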
I think I’ve figured out a good way to grab the relevant logs without enabling persistentLogging. Thanks so much for your help. Y’all have the best community I’ve ever seen!
Is there any more information / tips / advice for examining the contents of these journals/logs?
We’re having a “once a week” device failure where it locks up and becomes unresponsive. Our application logs aren’t helping us diagnose the problem, so we’re hoping that by turning on persistent logging we will gain more insight. I’m guessing that reading up on journalctl and .journal would probably also be useful.
You are correct in that journalctl / journald are the key here. Everything is logged via journald and can be inspected from the host OS’s shell, via some incantation like journalctl -fu balena (this command translates to “follow the log of the unit called balena”). The journal can be manipulated and transformed to output whatever you need, so reading up on the various flags will be worthwhile. I believe the .journal files are actually binary encoded, so you should never need to interact with them directly.
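To make the flag combinations a bit more concrete, here is a rough sketch that builds a journalctl command line from a few common filters. The helper names journalctl_args and read_journal are made up for illustration. The flags themselves (--unit, --boot, --priority, --no-pager) are standard journalctl options. It assumes Python is available where you run it, which may not be the case on the host OS, and it leaves out -f because a follow never returns.

```python
import subprocess


def journalctl_args(unit=None, boot=None, priority=None):
    """Build a journalctl argument list (hypothetical helper, for illustration)."""
    args = ["journalctl", "--no-pager"]      # --no-pager: print everything and exit
    if unit is not None:
        args.append(f"--unit={unit}")        # only entries from this systemd unit
    if boot is not None:
        args.append(f"--boot={boot}")        # 0 = current boot, -1 = previous boot
    if priority is not None:
        args.append(f"--priority={priority}")  # e.g. "err" = errors and worse
    return args


def read_journal(**filters):
    """Run journalctl with the given filters and return its output as text."""
    result = subprocess.run(
        journalctl_args(**filters), capture_output=True, text=True, check=True
    )
    return result.stdout


# Example: errors logged by the balena unit during the previous boot.
# print(read_journal(unit="balena", boot=-1, priority="err"))
```

Note that --boot=-1 only finds anything if the journal from the previous boot was kept, so it needs persistent logging turned on. With the default in-memory journal, journalctl --list-boots will only ever show the current boot.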
In the meantime, could you explain more about the pathology you are seeing on your devices? What device type / supervisor version / OS version are these crashes happening with?