Hi,
We upgraded to the latest openBalena (3.x) in a bit of a rush (IT did not back up the volumes, so there is probably no way to revert to 2.x).
We have 2 ongoing issues…
We see instability in the system (the API crashes).
Here are the logs from HAProxy:
[WARNING] 319/132651 (17) : Server backend_vpn/balena_vpn_1 is UP, reason: Layer4 check passed, check duration: 0ms. 1 active and 0 backup servers online. 0 sessions requeued, 0 total in queue.
[WARNING] 319/132653 (17) : Server backend_registry/balena_registry_1 is UP, reason: Layer4 check passed, check duration: 0ms. 1 active and 0 backup servers online. 0 sessions requeued, 0 total in queue.
[WARNING] 319/132654 (17) : Server backend_s3/balena_s3_1 is UP, reason: Layer4 check passed, check duration: 0ms. 1 active and 0 backup servers online. 0 sessions requeued, 0 total in queue.
[WARNING] 319/132721 (17) : Server backend_api/balena_api_1 is UP, reason: Layer4 check passed, check duration: 0ms. 1 active and 0 backup servers online. 0 sessions requeued, 0 total in queue.
[WARNING] 319/132738 (17) : Server vpn-tunnel/balena_vpn is UP, reason: Layer4 check passed, check duration: 0ms. 1 active and 0 backup servers online. 0 sessions requeued, 0 total in queue.
[WARNING] 320/015000 (17) : Server backend_api/balena_api_1 is DOWN, reason: Layer4 connection problem, info: "Connection refused", check duration: 0ms. 0 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
[ALERT] 320/015000 (17) : backend 'backend_api' has no server available!
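To see only the health-check transitions in a longer HAProxy log, a small filter along these lines may help. It is a sketch: the function name `haproxy_transitions` is made up for illustration, and it assumes every line has the same shape as the excerpt above. The second field looks like day-of-year/HHMMSS, which is an assumption based on the values shown.

```shell
# Reduce HAProxy health-check lines to "timestamp backend/server state".
# Reads the files given as arguments, or stdin when called without arguments.
# Lines that do not start with "... : Server <name> is <STATE>," are skipped.
haproxy_transitions() {
    awk '$5 == "Server" && $7 == "is" {
        state = $8
        sub(/,$/, "", state)   # drop the trailing comma after UP/DOWN
        print $2, $6, state
    }' "$@"
}
```

For example, `docker logs <haproxy-container> 2>&1 | haproxy_transitions` would print one short line per state change; `<haproxy-container>` is a placeholder for whatever name `docker ps` shows on your host.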
We now get a 503 error, after everything had been running fine overnight.
We need to debug this very urgently.
The new devices show up, but what are our options for the old devices? One does not report at all, and the others are deployed and running.
Also, is there a way to deploy an SSH key to these devices in order to get SSH access?
Make sure you update to the latest openBalena version – it is v3.1.1 now – which fixes a couple of initial issues. I’d then check the API service logs by SSH’ing into the container and querying journald.
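Checking the API logs could look roughly like this. It is a sketch under assumptions: the helper name `api_journal` is made up, the container name `balena_api_1` is taken from the HAProxy output earlier in the thread and may differ on your host (check `docker ps`), and the container is assumed to run journald with `journalctl` available.

```shell
# Show the most recent journald entries from inside a container.
# $1: container name or ID (as listed by `docker ps`)
# $2: number of lines to show (defaults to 200)
api_journal() {
    docker exec "$1" journalctl --no-pager -n "${2:-200}"
}
```

Usage would be something like `api_journal balena_api_1 500`. Running `docker logs --tail 200 balena_api_1` from the host is also worth a look, since it shows whatever the container wrote to stdout and stderr.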
Hi,
The update was done yesterday, so I believe that makes it 3.1.
On the other hand, today I fixed the issue just by restarting the Docker container:
docker restart 2f5d63b9a213
The logs on the API instance are almost empty.
Also, can you provide info on updating existing nodes, especially if we need to allow SSH access? The nodes are not accessible on site, so how do we insert the SSH keys? Is there any cool balena-cli command for this?
If you flashed a production image then you’re out of luck unfortunately and you’ll have to get physical access to the devices.
There should be some logs for debugging issues like the API failing to start up. Do you mean there was nothing like that in the logs? Even in production mode the API should still log errors and so on, just much less than in development mode. Normally that should be enough, which is why we don’t use a DEBUG variable from the compose file.
The logs should certainly not be empty. You can try restarting the services and see if that fixes the issue. If it still persists, you can open an issue with some more info here or here, and we can take it from there.
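If restarting only helps for a while, recording the container's state before each restart gives you something concrete to attach to an issue. A minimal sketch, assuming plain Docker on the host; the helper name `container_state` is made up for illustration, and the container is passed in by name or ID.

```shell
# Print status, last exit code, OOM-kill flag and restart count for a container.
# $1: container name or ID (as listed by `docker ps`)
container_state() {
    docker inspect --format \
        'status={{.State.Status}} exit={{.State.ExitCode}} oom={{.State.OOMKilled}} restarts={{.RestartCount}}' \
        "$1"
}
```

For example, run `container_state 2f5d63b9a213` and then `docker restart 2f5d63b9a213`. Note that this only describes the container itself: a process that died inside a container that kept running would not show up here, so the logs are still needed.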