debugging machines going down

hey guys, we've been debugging this and implemented an automatic reboot via the supervisor, but the NUCs keep going down.

  • we tested rebooting the routers when about 15-20 NUCs were down, thinking they would come back online if it was a router problem. they didn’t.
  • here’s a screenshot with the last output I can get from one of these NUCs. they are connected via wifi, and the wifi just seems to go down. the automated reboot doesn’t always bring them back online either, so I’m at a bit of a loss here.

any tips on how to debug this further or get closer to a solution would be appreciated.

@katmai I’m sorry to hear you’re still struggling with this. I know you’ve checked journalctl in the past after enabling persistent logging, but can you tell us specifically what you see when running journalctl -u NetworkManager from the host OS? It would be helpful to see those specific logs if you have them, as well as any insights you’ve since gained from Datadog.
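
In case it helps, here is a rough sketch of how those logs could be gathered into a single file to attach here. It is only a sketch under a few assumptions that are not confirmed in this thread: that the unit is named NetworkManager (systemd unit names are case-sensitive), that persistent logging is still enabled so the previous boot is available, and that the helper name collect_nm_logs and the output path are hypothetical placeholders.

```shell
#!/bin/sh
# Sketch only: collect network-related logs from the host OS into one file.
# Assumptions (not from the thread): the systemd unit is named
# "NetworkManager", persistent logging is enabled, and the function name
# and default output path are hypothetical placeholders.
collect_nm_logs() {
    out="${1:-/tmp/nm-debug.log}"

    if ! command -v journalctl >/dev/null 2>&1; then
        echo "journalctl not found on this system" >&2
        return 1
    fi

    {
        echo "=== boots known to the journal ==="
        journalctl --list-boots --no-pager

        echo "=== NetworkManager, previous boot ==="
        # -b -1 selects the boot before the current one, i.e. the boot
        # during which the device dropped off before it was restarted.
        journalctl -u NetworkManager -b -1 --no-pager

        echo "=== kernel messages, previous boot ==="
        # -k limits output to kernel messages, which is where wifi
        # driver or firmware errors would be logged.
        journalctl -k -b -1 --no-pager
    } > "$out" 2>&1

    echo "wrote $out"
}
```

Running collect_nm_logs /tmp/nm-debug.log on an affected NUC once it is reachable again should capture what happened on the boot where the wifi dropped. If the previous-boot sections come back empty or with an error, that would suggest the journal is not actually being persisted across reboots, which is worth knowing in itself.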