supervisor busy.. Cannot reach supervisor

I am getting this error.

What would cause the supervisor to be busy? I had it working, and then it stopped working.

Hi, can you please provide more context? Can you show the journal logs for the supervisor with journalctl --no-pager -a -u resin-supervisor on the HostOS?

Sorry you’re having trouble.

I am having a very hard time trying to put the balenaFin to sleep with the co-processor. Sometimes it works, sometimes it doesn’t.

It stops working after I push an update to my code. The balenaFin restarts, and then when I try to put it to sleep it gives the supervisor busy error and can’t go to sleep anymore until I flash another version of the finabler.


Still the same problem with the finabler.

This is the most important feature that I need working for my project.

Hi,

If the Supervisor’s GET /v1/device endpoint is returning anything other than “Idle” for its status field in the response, “supervisor busy…” will be logged. I see that the request to the Supervisor’s API is actually getting rejected for other reasons here though. To debug, could you try running curl -X GET -H "Content-Type:application/json" "$BALENA_SUPERVISOR_ADDRESS/v1/device?apikey=$BALENA_SUPERVISOR_API_KEY" from your finabler container to see if the container can access the Supervisor API? This will let us know if the error is in the finabler or the network itself. You can use balena exec -it finabler /bin/bash to access its terminal.

Regards,
Christina
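
For anyone running this check from a Python service rather than a shell, the same request can be sketched roughly as below. This is only an illustration, not the finabler's actual code: the names classify_device_state and check_supervisor are made up for this example, and it assumes BALENA_SUPERVISOR_ADDRESS and BALENA_SUPERVISOR_API_KEY are set in the container, as in the curl command above. It separates "cannot reach the Supervisor" (a network or HTTP failure) from "Supervisor busy" (a reachable Supervisor whose status is not Idle).

```python
import json
import os
import urllib.error
import urllib.request


def classify_device_state(body: str) -> str:
    """Return 'idle' if the /v1/device JSON reports status 'Idle', else 'busy'."""
    data = json.loads(body)
    status = data.get("status") if isinstance(data, dict) else None
    return "idle" if status == "Idle" else "busy"


def check_supervisor(timeout: float = 5.0) -> str:
    """Query GET /v1/device and return 'idle', 'busy', or 'unreachable'."""
    # Both variables are assumed to be present in the container environment.
    address = os.environ["BALENA_SUPERVISOR_ADDRESS"]
    api_key = os.environ["BALENA_SUPERVISOR_API_KEY"]
    url = f"{address}/v1/device?apikey={api_key}"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return classify_device_state(resp.read().decode("utf-8"))
    except (urllib.error.URLError, OSError, ValueError):
        # Network failure, HTTP error status, or a body that is not valid JSON.
        return "unreachable"
```

Calling check_supervisor() inside the container returns "unreachable" when the request itself fails, which points at the network or the API key, and "busy" when the Supervisor answers with any status other than Idle.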

@cywang117
I was able to get into the container over SSH and ran the request you recommended, and I got a good response. I was also able to reboot the balenaFin from the finabler container over SSH with the command

curl -X POST --header "Content-Type:application/json" "http://127.0.0.1:48484/v1/reboot?apikey=$BALENA_SUPERVISOR_API_KEY"

any other thoughts?

Jorge,

Can you let me know what supervisor version you are running? I’ll try to recreate this issue.

Phil

@phil-d-wilson

12.8.10

If you get it to work, please push some updates to the sample code in your other container (the one that sends the signal), or message me and I can transfer the device to you. It was after many updates that the finabler stopped working and started giving the supervisor busy error.

Jorge,

I can’t recreate here. This is the reply from my curl:

root@d6e683b:/usr/src/app# curl -X GET -H "Content-Type:application/json" "$BALENA_SUPERVISOR_ADDRESS/v1/device?apikey=$BALENA_SUPERVISOR_API_KEY"
{"api_port":48484,"ip_address":"192.168.86.58","os_version":"balenaOS 2.58.3+rev1","mac_address":"B8:27:EB:B0:DC:AB 48:A4:93:03:0C:26 48:A4:93:03:0B:26","supervisor_version":"12.8.10","update_pending":false,"update_failed":false,"update_downloaded":false,"commit":"ab77c473fea5568f102f18f16c7e99e3","status":"Idle","download_progress":null}

Can you please post me your reply from the same curl command?

This is what I have, @phil-d-wilson

Jorge,

Ah, so your device is “stuck” in the “Downloaded” state. The finabler won’t do anything unless the device is in the “idle” state. Please could I have support access on the device, and I’ll take a look.

Phil
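
The gating described here (act only once the status is Idle) could look roughly like the following in Python. It is a sketch under stated assumptions, not the finabler's implementation: wait_until_idle and the get_status callable are hypothetical names, and get_status stands for whatever function returns the status string from GET /v1/device. The timeout matters, because a device stuck in "Downloaded" would otherwise keep the loop waiting forever.

```python
import time
from typing import Callable


def wait_until_idle(
    get_status: Callable[[], str],
    timeout_s: float = 60.0,
    interval_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll get_status() until it returns 'Idle' or timeout_s elapses.

    Returns True if the device reported Idle, False on timeout.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if get_status() == "Idle":
            return True
        if time.monotonic() >= deadline:
            # Still busy after the allowed time; give up instead of looping forever.
            return False
        sleep(interval_s)
```

If this returns False, the device never reached Idle, which is the situation in this thread. At that point the Supervisor logs are more useful than retrying the sleep request.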

Jorge,

Up front: I’ve resolved the problem for you. What I don’t yet know is why the device got into the problem state.

What I found was that there were references to old versions of the finabler block in the local database that the supervisor uses to track state. I removed these old references, and the curl command started returning “idle” for the state, rather than “downloaded”. This allowed the finabler to do its job.
This is not an issue with the finabler - it is working as designed: not trying to sleep or flash the device if it is in the process of updating. This is an issue with the supervisor which we are looking into. I will update this ticket with a way to track this issue, FYI.

Your device is now sleeping soundly. :slight_smile:

Phil

Thank you @phil-d-wilson.

Glad you got it solved and rule out the root cause…

Hi Jorge, we have a GitHub issue to track this: “Incorrect release commit” (Issue #1579, balena-os/balena-supervisor on GitHub). I’ve linked it to this ticket so you should be alerted when the issue is closed.

Hi @phil-d-wilson

Well, I broke it again. I pushed an update to my Python code container and it broke the supervisor. It is again “stuck” in the “Downloaded” state.

By the way, since your fix, it worked great.

thanks

Hi @jorgesea,

Device dfbe2a2ddcde3e58ff7b0ebd12464673 should be back in its normal state. This is definitely a case of the issue described in “Incorrect release commit” (Issue #1579, balena-os/balena-supervisor on GitHub). The Supervisor currently has a bug that we are looking into, where the device’s Supervisor database contains outdated image entries, which results in the device entering an invalid state, as you saw here with status: Downloading. I’ve cleaned up some entries in the device database and you should be able to use the sleep endpoint now:

root@dfbe2a2:~# balena exec -it finabler_3797341_1854038 /bin/bash
root@dfbe2a2:/usr/src/app# curl -X GET -H "Content-Type:application/json" "$BALENA_SUPERVISOR_ADDRESS/v1/device?apikey=$BALENA_SUPERVISOR_API_KEY" | jq .
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   340  100   340    0     0   6415      0 --:--:-- --:--:-- --:--:--  6415
{
  "api_port": 48484,
  "ip_address": "192.168.86.99",
  "os_version": "balenaOS 2.80.3+rev1",
  "mac_address": "B8:27:EB:8D:51:E8 48:A4:93:02:BC:90 48:A4:93:02:BB:90",
  "supervisor_version": "12.8.10",
  "update_pending": false,
  "update_failed": false,
  "update_downloaded": false,
  "commit": "391ba0abd058f8d6f538f238ff411155",
  "status": "Idle",
  "download_progress": null
}

Could I ask what your patterns for pushing application updates are? I haven’t seen this Supervisor issue occur this frequently, so this would help us pinpoint the problem. Thanks!

Regards,
Christina

@cywang117

Thanks Christina.

I am on Windows. I push the update via the balena CLI, just a simple
balena push

Is there any other information you want me to check on my computer?

I bet if I push another update to the Python container, I might break it again.

Hi Jorge,

Thanks for the information. I’m currently working on this issue and I’m getting close to a fix. The data shared by my colleagues lines up with my understanding of the cause. I think you are probably making changes to your tanklevel service but not to the finabler service. While this is perfectly fine, under some conditions it seems to prevent the supervisor from applying the target state. If you get stuck in the same state again, I would appreciate it if you could share the supervisor logs from that moment, particularly since you have been able to reproduce this reliably (you can get them using journalctl -u resin-supervisor -a --no-pager). If you are in a hurry and need to push changes without hitting this issue, make sure you also change something in the finabler container (a comment or a version number is enough). In any case, I’m sorry about the trouble this is causing, and we’ll hopefully have a fix soon.

@pipex

Hi, thanks for the help.

So I did this: I updated the tank level container and added a console.log line to the finabler, but my Python script crashed because of an error in the tank level code.

Then I updated only the tank level container, and that put the supervisor back into the bugged state, hehe.

Then I updated both, and the supervisor still has the bug. Here is the journal:

balena multiple.txt (78.5 KB)