I upgraded the supervisor and now I'm not able to SSH into the containers. Is there a way to revert the supervisor upgrade?
Hello @vedyilmaz
Can you tell us more about what's happening?
Are you sure the containers are running properly?
Thanks for your response. The device was running supervisor version 11.x and I upgraded it to the latest one. After that I lost SSH access to the containers, both through balena-cli and in the balena terminal window. I purged the device to re-deploy the fleet, but no success so far. I can still access the host. Maybe reverting the supervisor to the previous version would help, if that's possible.
root@3f2326c:~# systemctl status resin-supervisor
● resin-supervisor.service - Balena supervisor
Loaded: loaded (/lib/systemd/system/resin-supervisor.service; enabled; vendor preset: enabled)
Active: active (running) since Fri 2022-06-24 14:28:33 UTC; 8min ago
Process: 563 ExecStartPre=/usr/bin/balena stop resin_supervisor (code=exited, status=0/SUCCESS)
Process: 769 ExecStartPre=/bin/systemctl is-active balena.service (code=exited, status=0/SUCCESS)
Main PID: 770 (start-resin-sup)
Tasks: 11 (limit: 4709)
Memory: 10.6M
CGroup: /docker/67fe366d02e8fc880cfbb6ffdd6f7a8677ef61bbda3c4ae8405b736a9097a8b1/system.slice/resin-supervisor.service
├─770 /bin/sh /usr/bin/start-resin-supervisor
├─772 /proc/self/exe --healthcheck /usr/lib/resin-supervisor/resin-supervisor-healthcheck --pid 770
└─841 balena start --attach resin_supervisor
Jun 24 14:37:12 3f2326c resin-supervisor[770]: [error] Unable to get architecture: Error: ENOENT: no such file or directory, open '/mnt/root/mnt/boot/device-type.json'
Jun 24 14:37:12 3f2326c resin-supervisor[770]: [error] Unable to get device type: Error: ENOENT: no such file or directory, open '/mnt/root/mnt/boot/device-type.json'
Jun 24 14:37:12 3f2326c resin-supervisor[770]: [error] Unable to get device type: Error: ENOENT: no such file or directory, open '/mnt/root/mnt/boot/device-type.json'
Jun 24 14:37:12 3f2326c resin-supervisor[770]: [error] Unable to get architecture: Error: ENOENT: no such file or directory, open '/mnt/root/mnt/boot/device-type.json'
Jun 24 14:37:12 3f2326c resin-supervisor[770]: [warn] Could not initialise splash image backend Error: ENOENT: no such file or directory, scandir '/mnt/root/mnt/boot/splash'
Jun 24 14:37:12 3f2326c resin-supervisor[770]: [warn] Failed to read splash image: Error: ENOENT: no such file or directory, open '/mnt/root/mnt/boot/splash/balena-logo-default.png'
Jun 24 14:37:12 3f2326c resin-supervisor[770]: [error] Scheduling another update attempt in 512000ms due to failure: InternalInconsistencyError: Unknown device architecture unknown. Could not find matching supervisor metadata.
Jun 24 14:37:12 3f2326c resin-supervisor[770]: [error] at Object.exports.getSupervisorMetadata.memoizee.promise (/usr/src/app/dist/app.js:10:2687348)
Jun 24 14:37:12 3f2326c resin-supervisor[770]: [error] Device state apply error InternalInconsistencyError: Unknown device architecture unknown. Could not find matching supervisor metadata.
Jun 24 14:37:12 3f2326c resin-supervisor[770]: [error] at Object.exports.getSupervisorMetadata.memoizee.promise (/usr/src/app/dist/app.js:10:2687348)
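Most of the errors above come down to files the supervisor cannot open. As a rough sketch for pulling the distinct missing paths out of a log like this, the helper below filters the ENOENT lines. The name `missing_paths` is made up for illustration, and the `/mnt/root` prefix is assumed to be the host root as seen from inside the supervisor container.

```shell
#!/bin/sh
# Hypothetical helper (not part of balenaOS): read supervisor log lines on
# stdin and print each distinct path that failed with ENOENT.
missing_paths() {
    # Keep only ENOENT lines and capture the single-quoted path at the end.
    sed -n "s/.*ENOENT.* '\([^']*\)'.*/\1/p" | sort -u
}

# Usage on the host OS (journalctl output is piped in):
#   journalctl -u resin-supervisor --no-pager | missing_paths
```

On the log shown here this would list device-type.json and the two splash paths, all under /mnt/root/mnt/boot.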
Hi @vedyilmaz, I suspect some corruption may have happened in the file /mnt/boot/device-type.json. It is strange that this is preventing you from SSHing into the containers, as that feature does not depend on the supervisor.
Could you provide the following information for me?
- What is your OS version?
- What is the previous supervisor version?
- What is the newer supervisor version you are upgrading to?
- What is the result of running the command cat /mnt/boot/device-type.json on the host OS terminal?
- What is the output of the command balena ps?
Thank you
Hi Felipe,
It's running balenaOS 2.50.1+rev1. The device type is Intel NUC. There is no /mnt/boot/device-type.json file, and it seems supervisor 14.0.6 is complaining that it cannot find it. The previous supervisor version was 11.4.10 and I would like to revert to it. I tried changing the image to the 11.4.10 image in /etc/resin-supervisor/supervisor.conf (registry2.balena-cloud.com/v2/e48a6f881298587edde22b62cc500295), but it reverts to the 14.0.6 image later on.
root@3f2326c:~# balena ps
CONTAINER ID   IMAGE                                                                   COMMAND                  CREATED        STATUS                  PORTS   NAMES
1855ea0afe18   registry2.balena-cloud.com/v2/5335fb2a14f2871a5be88f234500311f:latest   "/usr/src/app/entry.…"   11 hours ago   Up 11 hours (healthy)           resin_supervisor
4fec985a2eb5   b90cda6b81c8                                                            "/usr/bin/entry.sh b…"   20 hours ago   Up 11 hours                     desigo_4291255_1996334
Where does it keep the supervisor version? Maybe changing it back to the old version would fix this.
I was able to downgrade the supervisor by doing the following:
- replaced the image in /etc/resin-supervisor/supervisor.conf with the target supervisor version image
- replaced the image in /tmp/update-supervisor.conf with the target supervisor version image
- removed all the containers: balena rm container-id
- systemctl stop resin-supervisor
- rm -v /mnt/data/resin-data/resin-supervisor/database.sqlite
- systemctl start resin-supervisor
All the containers were added back and supervisor version 11.4.10 is now running.
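For anyone wanting to script the steps above, here is a minimal sketch. It assumes the image is stored under a SUPERVISOR_IMAGE key in both conf files, which the thread does not confirm, and the function names `set_conf_value` and `downgrade_supervisor` are made up for illustration.

```shell
#!/bin/sh
# Sketch of the steps listed above. Assumptions: the image is stored under a
# SUPERVISOR_IMAGE key in both conf files, and the helper names are made up.

# Set KEY=VALUE in an existing conf file, appending the key if it is absent.
# VALUE must not contain '|' or '&', which are special in the sed expression.
set_conf_value() {
    file="$1"; key="$2"; value="$3"
    [ -f "$file" ] || { echo "no such file: $file" >&2; return 1; }
    if grep -q "^${key}=" "$file"; then
        tmp=$(mktemp) || return 1
        # Write back with cat so the original file (and any bind mount) is kept.
        sed "s|^${key}=.*|${key}=${value}|" "$file" > "$tmp" \
            && cat "$tmp" > "$file"
        status=$?
        rm -f "$tmp"
        return "$status"
    fi
    printf '%s=%s\n' "$key" "$value" >> "$file"
}

# Run on the host OS as root; nothing is executed until this is called.
downgrade_supervisor() {
    image="$1"
    set_conf_value /etc/resin-supervisor/supervisor.conf SUPERVISOR_IMAGE "$image" || return 1
    set_conf_value /tmp/update-supervisor.conf SUPERVISOR_IMAGE "$image" || return 1
    # The thread used "balena rm container-id" per container; -f is added here
    # so that running containers are removed too.
    balena rm -f $(balena ps -aq)
    systemctl stop resin-supervisor
    rm -v /mnt/data/resin-data/resin-supervisor/database.sqlite
    systemctl start resin-supervisor
}
```

It would be called with the target image, for example `downgrade_supervisor registry2.balena-cloud.com/v2/e48a6f881298587edde22b62cc500295` for the 11.4.10 image mentioned earlier. Deleting database.sqlite discards the supervisor's locally stored state, so this is probably best kept as a last resort.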
Hi @vedyilmaz,
Sorry for the delayed response. Glad you were able to find a way to downgrade the supervisor. I am very intrigued by how you got this error in the first place. You point out that /mnt/boot/device-type.json does not exist on your device, but on a freshly booted image for balenaOS 2.50.1+rev1 the file is there. Did you update this device from a previous balenaOS version at some point?
If you are interested in retrying the supervisor update, one thing you could do is recreate the file and then retry the update. I think the commands I share below should be enough. Please let us know if this works for you.
root@4076c29:~# cat << EOF | jq . > /tmp/device-type.json
> { "slug": "intel-nuc", "arch": "amd64" }
> EOF
root@4076c29:~# cat /tmp/device-type.json
{
"slug": "intel-nuc",
"arch": "amd64"
}
root@4076c29:~# cp /tmp/device-type.json /mnt/boot/
Hi Felipe,
Thanks for your response. The issue is that the device is of type Intel NUC running a qemu image, and on this type of installation the file is located in another directory. I needed to copy it from that location to /mnt/boot/device-type.json and all is good now. This looks like a bug. Thanks.
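The copy step described here could be scripted roughly as follows. The thread does not say where the qemu image keeps the file, so the source path is left as an argument, and `ensure_device_type` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: make sure device-type.json exists at the path the
# supervisor expects, copying it from a caller-supplied source if missing.
ensure_device_type() {
    src="$1"                                  # actual location: not stated in the thread
    dst="${2:-/mnt/boot/device-type.json}"    # path named in the supervisor errors

    if [ -s "$dst" ]; then
        echo "already present: $dst"
        return 0
    fi
    if [ ! -s "$src" ]; then
        echo "source not found or empty: $src" >&2
        return 1
    fi
    cp "$src" "$dst" && echo "restored: $dst"
}
```

The function leaves an existing, non-empty file alone, so it should be safe to run more than once.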
Oh, so you are using a qemu image? This does sound like a bug in that balenaOS qemu version (I haven't been able to test this yet though). However, we are phasing out qemu images in favor of generic x86/aarch64 images. I would recommend switching to one of those newer images if possible.
Glad that copying the file fixed the issue for the moment.
Good to know that you're phasing out qemu images. We'll probably test and switch to the generic x86/aarch64 images. Thanks.