BalenaCloud device resin_supervisor constantly logs 'Error from the API: 401'

Hello,
I have a few devices set up in my account that all run on the same application. I recently set up 3rd-party logging to Papertrail, and through that I can see the resin_supervisor logs.

The supervisor on one of my devices has been throwing a 401 error every 10 seconds for some time now. It didn’t seem to affect functionality at first, but eventually the device failed (or ignored?) a Supervisor update. Since then it has started exhibiting other weird behavior, including not showing one of my application containers in the dashboard “Services” list, even though that container is actively logging in the dashboard log window. The device is also stuck on an older release of my application code.

The error in the logs is the only indication of a possible problem. My application containers are all running and logging fine with no issues (aside from them being an older version than expected). My other two devices are not exhibiting this issue.

The error in the logs is:

f7402c2  [error]   Error from the API: 401
f7402c2  [error]   Non-200 response from the API! Status code: 401 - message: Error
f7402c2  [error]         at /usr/src/app/dist/app.js:22:554770
f7402c2  [error]       at runMicrotasks (<anonymous>)
f7402c2  [error]       at processTicksAndRejections (internal/process/task_queues.js:97:5)
f7402c2  [error]       at async /usr/src/app/dist/app.js:22:554078
f7402c2  [error]       at async /usr/src/app/dist/app.js:22:555600
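
For anyone wanting to quantify how often this occurs in an exported log (for example, a Papertrail download), here is a minimal sketch that counts the "Error from the API" lines per device prefix and status code. It assumes the log lines look exactly like the excerpt above; the function name `count_api_errors` and the regular expression are illustrative, not part of any balena tooling.

```python
import re
from collections import Counter

# Matches lines such as "f7402c2  [error]   Error from the API: 401".
# The layout is assumed from the excerpt above; adjust if your export differs.
API_ERROR = re.compile(
    r"^(?P<device>[0-9a-f]+)\s+\[error\]\s+Error from the API: (?P<status>\d{3})\s*$"
)

def count_api_errors(lines):
    """Return a Counter keyed by (device prefix, HTTP status)."""
    counts = Counter()
    for line in lines:
        match = API_ERROR.match(line.strip())
        if match:
            counts[(match.group("device"), int(match.group("status")))] += 1
    return counts

sample = [
    "f7402c2  [error]   Error from the API: 401",
    "f7402c2  [error]   Non-200 response from the API! Status code: 401 - message: Error",
    "f7402c2  [error]         at /usr/src/app/dist/app.js:22:554770",
]
print(count_api_errors(sample))
```

Only the first line of each error group is counted, so the stack-trace lines do not inflate the total.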

I have tried restarting containers, rebooting, and manually pulling the power plug and letting it sit offline for a bit. This is a development/test machine and is not important in the short term. However, as I get closer to deployment, I’m worried about this happening in production and what options I would have to solve it in that situation.

Thanks,
Tommy

Hey there Tommy, it sounds like something networking-related could be getting in the way of the supervisor reporting its state to the API. The data sent in these updates is ultimately what is presented in the dashboard, so if these requests are failing, it makes sense that you’re seeing strange or incomplete data there.

Could there be any kind of firewall or proxy between your device and balenaCloud that may prevent these requests from reaching our API? You can read more about the network requirements here: Network Setup on balenaOS 2.x - Balena Documentation

Let us know if this helps; if not, it might help if you could enable support access and share the device UUID here so that we can take a look at the device ourselves.

Hi @chrisys, thanks for your reply. I don’t have any firewalls or proxies - my network is pretty vanilla outside of PiHole (which I’ve checked, and it isn’t blocking any DNS for the device). I have another device on the local network and it functions fine (granted, that one is in local mode, but Papertrail captures its logs as well). Additionally, I used to run 2 non-local-mode devices and have performed many Supervisor, OS, and app updates without issues before this. The VPN is connected, logs flow to Balena normally, and I can SSH in from the Balena dashboard. All of this leads me to believe it’s likely not a networking problem (plus, the error is 401, which is usually authentication-related, I think?).

I have granted support access for a week, and the UUID is f7402c253897c86a51dbcae4e617d2ab. Thanks for looking into this!
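
The reasoning above (a 401 means the request reached the server and was rejected, whereas a network problem would fail before any HTTP status came back) can be sketched in a few lines. This is only an illustration using Python's standard library; `classify_probe` is a hypothetical helper, and which URL to probe is left to the reader since no endpoint is named in this thread.

```python
import urllib.error
import urllib.request

def classify_probe(url, opener=urllib.request.urlopen, timeout=10):
    """Say whether a request failed on the network or was rejected by the server."""
    try:
        with opener(url, timeout=timeout) as resp:
            return f"reachable (HTTP {resp.status})"
    except urllib.error.HTTPError as err:
        # An HTTP status came back, so the network path to the server works.
        if err.code in (401, 403):
            return f"reachable, but rejected credentials (HTTP {err.code})"
        return f"reachable, server returned HTTP {err.code}"
    except (urllib.error.URLError, OSError) as err:
        # DNS failure, refused connection, timeout, blocked by a firewall, etc.
        return f"unreachable: {err}"
```

The `opener` parameter exists only so the function can be exercised without network access; in normal use it can be left at its default.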

Hey Tommy,

Was the device with issues flashed with the same image as the other 2 devices? If not, is it in the same application?

I’ve had 3 devices in the same location:

  1. This problematic device (Pi CM4). It was flashed with development image balena-cloud-[app]-raspberrypicm4-ioboard-2.71.3+rev5-dev-v12.3.5.img.

  2. My main development device (Pi 4, most often in local mode). I believe that device used balena-cloud-[app]-raspberrypi4-64-2.58.6+rev1-dev-v11.14.0.img.

  3. A test device (Pi 3). This ran a development image at first, then later a production image. It is not currently in service.

The images were downloaded at different times (so they were different versions), but all for the same app (and I have not switched apps at all).

All devices have been through multiple successful OS and Supervisor updates.

In case it’s useful information, I just set up my test Pi 3 with balena-cloud-[app]-raspberrypi3-64-2.65.0+rev1-dev-v12.2.11.img right next to the CM4 that is throwing errors. It connected to Balena fine, downloaded the latest app release, and started up without issue. I waited a few minutes and then performed a supervisor update to 12.5.10, still all good. It has not logged any API errors thus far.

I forgot about this for a bit and let the support access lapse - I’ve re-granted it for a week.

Hi @chrisys and @danthegoodman1, I’ve re-enabled support access for another week. Is this something you still want to look into, or should I just re-flash the image and see if it goes away?

Hi,
Thanks for re-enabling support access to the device.
Unless this is blocking you, we would like to have some more time investigating the state of this device.

Kind regards,
Thodoris

@thgreasi no worries, it’s not blocking me - I’ll keep support access enabled. Thank you!