We have had several instances where a device comes up in the dashboard with a target build that differs from the reported current build and with no application listed. Diagnostics shows “unmatched local and remote supervisor”. The dashboard shows supervisor 10.8.0, but 9.14.0 is what is running locally. We have seen this happen in two different scenarios: (1) moving a device from one application to another, and (2) updating a system by burning a new SD card with the same UUID but a different host OS (2.32 -> 2.48).
In this latest incident, I see:
root@51e8aa2:~# balena ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
944ac6e74434 balena/rpi-supervisor:v9.14.0 "./entry.sh" 2 weeks ago Up 7 minutes (health: starting) resin_supervisor
4acaecc51458 47d9342dad3d "/usr/bin/entry.sh /…" 3 weeks ago Up 19 hours solmon_2423264_1433257
root@51e8aa2:~# balena images
REPOSITORY TAG IMAGE ID CREATED SIZE
registry2.balena-cloud.com/v2/ddf04b3f5d8ca26cd61796853192e796 <none> 47d9342dad3d 2 months ago 340MB
balena/rpi-supervisor v10.8.0 cc42f1e9307e 7 months ago 60.6MB
balena-healthcheck-image latest cfdb1bf11e4c 8 months ago 8.95kB
balena/rpi-supervisor v9.14.0 1764acf9f2c9 17 months ago 56.4MB
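To make the mismatch explicit, here is a minimal sketch that pulls the relevant tags out of the two listings above. The helper names are hypothetical, and it assumes the supervisor container is named resin_supervisor and that the supervisor repository name ends in "-supervisor", as in the output shown.

```shell
#!/bin/sh
# Hypothetical helpers; they only parse text in the format shown above.

# Print the tag of the image backing the resin_supervisor container,
# given `balena ps` output on stdin.
running_supervisor_tag() {
    awk '$NF == "resin_supervisor" { n = split($2, p, ":"); print p[n] }'
}

# Print the tag of every locally stored supervisor image,
# given `balena images` output on stdin.
local_supervisor_tags() {
    awk '$1 ~ /-supervisor$/ { print $2 }'
}

# Only query the engine when the balena CLI is actually available.
if command -v balena >/dev/null 2>&1; then
    echo "running: $(balena ps | running_supervisor_tag)"
    echo "local:   $(balena images | local_supervisor_tags | tr '\n' ' ')"
fi
```

On this device the running tag would be v9.14.0 while both v10.8.0 and v9.14.0 are stored locally, so the container is still backed by the old image even though the new one is present.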
The proposed solution in the forum topic "Can't update supervisor" shows that I can remove the old supervisor image (9.14.0) and trigger an update, and I have used that process previously to correct the situation when there wasn’t an image for 10.8.0 already downloaded.
Now, when I follow the instructions in that forum topic (set the new tag, run the script to set the correct remote tag, then run update-resin-supervisor), I see the expected output:
Sep 10 18:34:24 51e8aa2 balenad[880]: time="2020-09-10T18:34:24.261728224Z" level=info msg="shim balena-engine-containerd-shim started" address=/containerd-shim/moby/093375c525e3511c34c4c2589c02743c93ba76b44cd153c8bec3e88297541d63/shim.sock debug=false pid=1635
Supervisor configuration found from API.
Getting image id...
Supervisor balena/rpi-supervisor:v10.8.0 already downloaded.
When I then start the supervisor using systemctl start resin-supervisor, I get into a loop with the following errors:
root@51e8aa2:/lib/systemd/system# systemctl start resin-supervisor
root@51e8aa2:/lib/systemd/system# Sep 10 18:35:27 51e8aa2 resin-supervisor[1827]: Error response from daemon: No such container: resin_supervisor
Sep 10 18:35:27 51e8aa2 resin-supervisor[1832]: active
Sep 10 18:35:32 51e8aa2 resin-supervisor[1833]: Error: No such object: balena/rpi-supervisor:v9.14.0
Sep 10 18:35:34 51e8aa2 resin-supervisor[1833]: Error: No such object: resin_supervisor
Sep 10 18:35:36 51e8aa2 resin-supervisor[1893]: Error response from daemon: No such container: resin_supervisor
Sep 10 18:35:36 51e8aa2 sh[1892]: Getting image name and tag...
Sep 10 18:35:38 51e8aa2 sh[1892]: Supervisor configuration found from API.
Sep 10 18:35:38 51e8aa2 sh[1892]: Getting image id...
Sep 10 18:35:39 51e8aa2 sh[1892]: Supervisor balena/rpi-supervisor:v10.8.0 already downloaded.
Sep 10 18:35:47 51e8aa2 resin-supervisor[1942]: Error response from daemon: No such container: resin_supervisor
Sep 10 18:35:47 51e8aa2 resin-supervisor[1947]: active
Sep 10 18:35:51 51e8aa2 resin-supervisor[1948]: Error: No such object: balena/rpi-supervisor:v9.14.0
Sep 10 18:35:52 51e8aa2 resin-supervisor[1948]: Error: No such object: resin_supervisor
Sep 10 18:35:55 51e8aa2 resin-supervisor[1985]: Error response from daemon: No such container: resin_supervisor
Sep 10 18:35:56 51e8aa2 sh[1984]: Getting image name and tag...
Sep 10 18:35:58 51e8aa2 sh[1984]: Supervisor configuration found from API.
Sep 10 18:35:58 51e8aa2 sh[1984]: Getting image id...
Sep 10 18:35:59 51e8aa2 sh[1984]: Supervisor balena/rpi-supervisor:v10.8.0 already downloaded.
Sep 10 18:36:06 51e8aa2 resin-supervisor[2032]: Error response from daemon: No such container: resin_supervisor
Sep 10 18:36:06 51e8aa2 resin-supervisor[2038]: active
Sep 10 18:36:11 51e8aa2 resin-supervisor[2039]: Error: No such object: balena/rpi-supervisor:v9.14.0
I suppose I can fix this by removing the existing image completely and forcing the download, but in some of our installations a 60MB download is problematic.
My questions are:
- What process leads to this mismatch, and how can we prevent it from happening (especially in the field, where connections are weak)?
- Why does the resin-supervisor systemd unit think that the supervisor version should be 9.14.0 when the update script recognizes that it should be 10.8.0? How can I remedy this mismatch short of the brute-force method of deleting the existing image completely and forcing a redownload, which presumably triggers something that restores a consistent view of the version?
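One check that might narrow down the second question, as a sketch only: my guess is that the start script takes its tag from a local config file rather than from the API. The path /etc/resin-supervisor/supervisor.conf and the variable name SUPERVISOR_TAG below are assumptions that the logs above do not confirm, and the helper names are hypothetical.

```shell
#!/bin/sh
# ASSUMPTION: the config path and the SUPERVISOR_TAG variable name are
# guesses; verify both on the device before relying on this.
CONF="${CONF:-/etc/resin-supervisor/supervisor.conf}"

# Print the value of SUPERVISOR_TAG from a KEY=VALUE config file.
configured_supervisor_tag() {
    sed -n 's/^SUPERVISOR_TAG=//p' "$1" | tr -d '"' | tail -n 1
}

# Succeed if the given tag exists among the supervisor images listed in
# `balena images` output on stdin.
tag_is_local() {
    awk -v want="$1" '$1 ~ /-supervisor$/ && $2 == want { found = 1 }
                      END { exit !found }'
}

# Only run the check when the file and the balena CLI are both present.
if [ -r "$CONF" ] && command -v balena >/dev/null 2>&1; then
    tag=$(configured_supervisor_tag "$CONF")
    if balena images | tag_is_local "$tag"; then
        echo "configured tag $tag is present locally"
    else
        echo "configured tag $tag has no local image"
    fi
fi
```

If this reported v9.14.0 with no local image, that would be consistent with the "No such object: balena/rpi-supervisor:v9.14.0" errors in the loop above, but I have not confirmed that this is where the unit gets its version.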
Thanks