Balena device corrupt after upgrading supervisor to version 13.1.0

Hi,

After upgrading the Balena supervisor of my NUC deployment to 13.1.0, my devices no longer connect to the platform.

Context:

  • Previous supervisor = 13.0.3
  • BalenaOS = 2.89.15

journalctl errors:

Apr 12 21:39:06 nuc balena-supervisor[152645]: Error response from daemon: No such container: resin_supervisor
Apr 12 21:39:06 nuc balena-supervisor[152654]: balena_supervisor
Apr 12 21:39:06 nuc balena-supervisor[152662]: active
Apr 12 21:39:06 nuc balena-supervisor[152663]: Container config has not changed
Apr 12 21:39:06 nuc balenad[1152]: time="2022-04-12T21:39:06.970267763Z" level=info msg="shim balena-engine-containerd-shim started" address=/containerd-shim/424185e9333>
Apr 12 21:39:07 nuc balena-supervisor[152724]: **find: /mnt/root/tmp/balena-supervisor/services: No such file or directory**
Apr 12 21:39:07 nuc 24474a8c8f68[998]: **find: /mnt/root/tmp/balena-supervisor/services: No such file or directory**
Apr 12 21:39:07 nuc balenad[1152]: time="2022-04-12T21:39:07.200487688Z" level=info msg="shim reaped" id=24474a8c8f6818507bbdc48b89b8e9c22f0f614b5610c07a15e6d56ca1963795
Apr 12 21:39:07 nuc balenad[998]: time="2022-04-12T21:39:07.209901531Z" level=info msg="ignoring event" module=libcontainerd namespace=moby topic=/tasks/delete type="*ev>
Apr 12 21:39:07 nuc systemd[1]: balena-supervisor.service: **Main process exited, code=exited, status=1/FAILURE**
Apr 12 21:39:07 nuc systemd[1]: **balena-supervisor.service: Failed with result 'exit-code'.**
Apr 12 21:39:07 nuc balenad[1152]: time="2022-04-12T21:39:07.353491896Z" level=info msg="shim balena-engine-containerd-shim started" address=/containerd-shim/424185e9333>
Apr 12 21:39:07 nuc 24474a8c8f68[998]: find: /mnt/root/tmp/balena-supervisor/services: No such file or directory
Apr 12 21:39:07 nuc balenad[1152]: time="2022-04-12T21:39:07.631528069Z" level=info msg="shim reaped" id=24474a8c8f6818507bbdc48b89b8e9c22f0f614b5610c07a15e6d56ca1963795
Apr 12 21:39:07 nuc balenad[998]: time="2022-04-12T21:39:07.640858254Z" level=info msg="ignoring event" module=libcontainerd namespace=moby topic=/tasks/delete type="*ev>
Apr 12 21:39:07 nuc balenad[1152]: time="2022-04-12T21:39:07.882736137Z" level=info msg="shim balena-engine-containerd-shim started" address=/containerd-shim/424185e9333>

Any advice on how I can quickly fix this issue?

Thanks,
Nico.

Hi, thanks for reporting this. This is indeed a problem with v13.1.0 that is now being addressed (see pull request #1928, "Do not fail lockfile cleanup if files do not exist" by pipex, in balena-os/balena-supervisor on GitHub). At the moment, self-serve supervisor releases are published without undergoing integration testing with the OS. We are also looking to address this so that they go through our OS automation tests before becoming available.
I have linked the GitHub issue to this ticket, so you will be updated once it's addressed.
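
For anyone curious about the failure mode: the `find: ... No such file or directory` lines in the log above suggest a cleanup step that exits with an error when the directory it scans does not exist. Below is a minimal Python sketch of the general "missing directory means nothing to clean" pattern. It is only an illustration, not the actual change in the pull request. The function name `cleanup_lockfiles` and the `*.lock` file pattern are assumptions made for the example.

```python
from pathlib import Path


def cleanup_lockfiles(root: Path) -> int:
    """Delete *.lock files under root and return how many were found.

    Hypothetical helper: a missing root directory is treated as
    "nothing to clean" instead of an error.
    """
    if not root.is_dir():
        # The directory was never created, so there are no lockfiles.
        return 0

    found = 0
    # Materialize the listing first so we do not delete while iterating.
    for lockfile in list(root.rglob("*.lock")):
        if lockfile.is_file():
            # missing_ok (Python 3.8+) tolerates the file vanishing
            # between the listing and the deletion.
            lockfile.unlink(missing_ok=True)
            found += 1
    return found
```

Calling `cleanup_lockfiles(Path("/some/dir/that/does/not/exist"))` returns 0 rather than raising, so a startup script built on it would keep going instead of exiting with a failure status.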

Hi again, thanks for reporting this, and sorry for the trouble. We have just released supervisor v13.1.1 that solves the issue. Let us know how this works for you.

Thanks for the swift action!

My devices recovered after upgrading the supervisor.