I attempted to update the OS on my Raspberry Pis yesterday. Two of them updated happily; the third seems to have got stuck on the “running supervisor update” stage of the process, as it hasn’t progressed even ~20 hours later. The device is https://dashboard.resin.io/devices/63369c9d6d143b2567b2118b58e3f294/summary and I’ve enabled support access.
We’re not able to access the host OS of the device, but if you’re able to physically restart the device, it should come back up with the new OS installed and then update the Supervisor.
Could you let us know if that doesn’t happen? We’ll then take another look.
Hi, please just power cycle it once. The supervisor has to recover after the reboot so that it can pull the update and check in properly with the resin backend. The status will then be updated; in the meantime it will still show the last status message, as you mentioned.
Are you manually rebooting it at the moment? If so, there is no need to do that any further: the device should be loading the new OS and just needs to sort itself out. We are taking a look.
Hi @pjb304, we see the device is continuously rebooting. Any idea why that might happen? As far as we can tell, the device is starting the correct, new OS, but after less than 1 minute it seems to hard reboot. So far we don’t see any obvious reason for that to happen, so any context would be helpful in troubleshooting this.
The device in question is powered using PoE and is sat on the roof of a building. As far as I’m aware the power supply to it is stable. As far as I know it was fine until I applied the update. If there’s nothing obvious causing it then it sounds like I need to go up (thankfully not too much of a headache to do) and see if I can observe anything that’s causing the reboots. If I don’t see anything obvious then I guess I’ll swap out the SD card and see if that improves matters.
Hi, it all sounds okay, thanks for the extra context!
We managed to log in, and once we stopped the application container the device stopped rebooting. We checked the device’s status, and it seems to have recovered fine. During testing we had to remove the application image from the device, which is why it is now downloading once again. (To be clear, in a normal update the application does not need to be redownloaded.)
As far as I can tell from the log, it stopped midway through the supervisor update, so it looks like the device hung in some way. That’s not something we’ve seen before. The reboot cycle seems to be connected to the application somehow, as the cycle stopped once the application was stopped.
In general we’d not expect such issues to happen in the future.
Both comments are very interesting. The device is a LoRaWAN gateway, so the application interacts with hardware on top. If for some reason the supervisor update stops the hardware being mapped through, the application might crash out.
Thanks for pointing out the undervoltages. It looks like I’ll need to dig into this in more detail.
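For anyone else chasing undervoltage on a Raspberry Pi, here is a rough sketch of how the firmware's throttling status can be decoded. It assumes the `vcgencmd` tool is available on the device (it may not be on every OS image) and that `vcgencmd get_throttled` prints a line such as `throttled=0x50005`. The function names and the sample value are mine, purely for illustration; they are not taken from this device's logs.

```python
import re
from typing import List

# Bit meanings as documented for `vcgencmd get_throttled` on Raspberry Pi firmware.
# Bits 0-3 describe the current state; bits 16-19 record events since boot.
THROTTLED_FLAGS = {
    0: "under-voltage detected",
    1: "ARM frequency capped",
    2: "currently throttled",
    3: "soft temperature limit active",
    16: "under-voltage has occurred",
    17: "ARM frequency capping has occurred",
    18: "throttling has occurred",
    19: "soft temperature limit has occurred",
}


def parse_throttled(output: str) -> int:
    """Extract the hex bitmask from text like 'throttled=0x50005'."""
    match = re.search(r"throttled=(0x[0-9a-fA-F]+)", output)
    if match is None:
        raise ValueError("unexpected get_throttled output: %r" % output)
    return int(match.group(1), 16)


def decode_throttled(value: int) -> List[str]:
    """Return the labels of all flags set in the bitmask, lowest bit first."""
    return [
        label
        for bit, label in sorted(THROTTLED_FLAGS.items())
        if value & (1 << bit)
    ]


if __name__ == "__main__":
    # Hypothetical sample value; on a real device, paste in the actual output.
    sample = "throttled=0x50005"
    for flag in decode_throttled(parse_throttled(sample)):
        print(flag)
```

A value of `0x0` decodes to an empty list, meaning no undervoltage or throttling has been recorded since boot. If the bit 16 flag shows up, the supply has dipped at some point since the last boot, which would be worth checking on a PoE-powered device.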