Reboot loop on Google Coral Dev Board

I’ve been attempting to add a bunch of Coral dev boards to my existing k3s cluster/fleet.

I had to build and load a large number of kernel modules, but did eventually get it to work.
However, once I did, all of the dev boards in the fleet started boot looping and could no longer be configured from the dashboard. Even a power cycle doesn’t fix it; I had to reflash a device and remove it from the fleet to recover.

With all of that said, how can I debug why this is happening? I’m presuming that it occurs due to some interaction with the kernel modules, but since the device reboots immediately, I can’t get any logs.

Hello @msherman, could you please confirm which application you are using?

I tried to create a minimal application to reproduce the issue here: ChameleonCloud/chi-edge-coral on GitHub (a repo for building and testing kmods for chi@edge and k3s).

I believe the issue is happening when the kernel modules are loaded with insmod, but I haven’t been able to verify yet, as I’m waiting for physical access to one of the boards to re-flash it.
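One way the insmod theory could be checked, as a rough sketch rather than anything that exists in the repo: load the modules one at a time and force a breadcrumb to disk before each load, so that after an unexpected reboot the last "loading" line with no matching "loaded" line points at the suspect module. The function names, the log path, and the module file names below are all hypothetical, and this assumes the log lives on storage that survives a reboot.

```python
import os
import subprocess
from typing import Callable, Iterable, List, Optional


def insmod(path: str) -> None:
    """Load one module with insmod; raises CalledProcessError on failure."""
    subprocess.run(["insmod", path], check=True)


def load_with_breadcrumbs(
    modules: Iterable[str],
    log_path: str,
    loader: Callable[[str], None] = insmod,
) -> List[str]:
    """Load modules in order, syncing a log line to disk before and after each."""
    loaded: List[str] = []
    with open(log_path, "a") as log:
        for mod in modules:
            # Written and fsync'ed *before* the load, so it survives a crash.
            log.write(f"loading {mod}\n")
            log.flush()
            os.fsync(log.fileno())
            loader(mod)
            log.write(f"loaded {mod}\n")
            log.flush()
            os.fsync(log.fileno())
            loaded.append(mod)
    return loaded


def last_attempted(log_path: str) -> Optional[str]:
    """Return the module whose load started but never finished, if any."""
    with open(log_path) as log:
        lines = [line.strip() for line in log if line.strip()]
    if lines and lines[-1].startswith("loading "):
        return lines[-1][len("loading "):]
    return None
```

On a board this would be called with the real .ko paths in dependency order, and last_attempted() would be run against the same log after the next boot (or after pulling the storage and reading it elsewhere). If the log always ends with a "loaded" line, that would suggest the reboot is triggered later, not by insmod itself.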

It’s also possible that it’s occurring when the k3s kubelet starts up and begins using some of the kmods, in which case this repo won’t be able to reproduce the issue by itself (when not joined to a cluster with the Calico CNI configured in IPIP mode).

Mostly I’m not sure how to debug any of this since the OS doesn’t print logs to an attached monitor, and it reboots before sending any logs remotely.