Scanning development OS shows production variant (Cannot push locally)

I flashed an SD card with the development OS variant for the Jetson Xavier NX devkit (balena-cloud-FLEETNAME-jetson-xavier-nx-devkit-2.82.11+rev8-dev-v12.9.3.img).

Scanning the network, it shows the production variant:

Scanning for local balenaOS devices... Reporting scan results
- 
  host:       9c3687a.local
  address:   192.168.0.121
  osVariant: production

The device's /mnt/boot/config.json shows "developmentMode": "true".
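
For anyone who wants to check the same flag, here is a minimal sketch for reading it out of config.json. It assumes Python 3 and that the file contents have been copied off the boot partition; the helper name development_mode is made up for illustration. It accepts both a JSON boolean and the string form quoted above, since I am not sure which one the OS writes.

```python
import json


def development_mode(config_text: str) -> bool:
    """Return True if the given config.json text has developmentMode switched on."""
    config = json.loads(config_text)
    value = config.get("developmentMode", False)
    # Accept a JSON boolean as well as the string form quoted in this thread.
    if isinstance(value, str):
        return value.strip().lower() == "true"
    return bool(value)


if __name__ == "__main__":
    # Hypothetical sample; on the device the text would come from /mnt/boot/config.json.
    sample = '{"developmentMode": "true"}'
    print(development_mode(sample))
```

A missing key is treated as "off" here, which is an assumption of this sketch rather than documented balenaOS behaviour.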

This leads to an issue that might be related: with local mode enabled, it is not possible to push locally.

balena device local-mode 9c3687ad88bfab6ae30ac121f1709b7a --enable   

Local mode on device 9c3687ad88bfab6ae30ac121f1709b7a is now ENABLED.

balena push  9c3687a.local                                                                   
Retrying "Supervisor API (GET http://192.168.0.121:48484/ping)" after 2.0s (1 of 5) due to: Error: connect EHOSTUNREACH 192.168.0.121:48484
Retrying "Supervisor API (GET http://192.168.0.121:48484/ping)" after 4.0s (2 of 5) due to: Error: connect EHOSTUNREACH 192.168.0.121:48484
Retrying "Supervisor API (GET http://192.168.0.121:48484/ping)" after 4.0s (3 of 5) due to: Error: connect EHOSTUNREACH 192.168.0.121:48484
Retrying "Supervisor API (GET http://192.168.0.121:48484/ping)" after 4.0s (4 of 5) due to: Error: connect EHOSTUNREACH 192.168.0.121:48484
Retrying "Supervisor API (GET http://192.168.0.121:48484/ping)" after 4.0s (5 of 5) due to: Error: connect EHOSTUNREACH 192.168.0.121:48484
Could not communicate with device supervisor at address 192.168.0.121:48484.
Device may not have local mode enabled. Check with:
  balena device local-mode <device-uuid>

For a reason I cannot explain, after a few reboots the scan now shows the development variant and I can use local mode!?!

I'm having the same issue on a Raspberry Pi 4, booting from an SSD connected to USB 3.0.

@le13francois @luandro

Yeah I had the same issue with RPi 3 and RPi 4 as shown here: Cannot push to Balena device that appears to be functional: Supervisor API ECONNREFUSED - #35 by klutchell
Usually it did not work until after one reboot, and then it worked. @klutchell, it seems you can reopen the bug, and its scope should be wider this time: Jetson Nano, RPi 3 and RPi 4.

Thanks for cross-linking the threads @nmaas87, I’ve linked the same issue in our database so we can track occurrences!

@le13francois could you confirm your OS release version and we will attempt to reproduce internally?

Could you also include the journal logs when it’s in that state after first boot? They can be collected with journalctl -u balena-supervisor -u os-config-json -a --no-pager.

Could you also provide the version of balena CLI you are using? Does the behaviour change if you update to the latest?

@klutchell

This seems to occur randomly, and when it does there’s no way to ssh into the device to get journal logs. The device IP responds to ping, and the device appears as Online (VPN only) on the balena dashboard, but no logs appear there either. After a few reboots everything goes back to normal.

I’m using balenaOS 2.94.4 on a Raspberry Pi 4, booting from an SSD connected to USB 3, with supervisor 12.11.36. I didn’t even bother testing another version of the balena CLI because the issue is clearly not there.

This is a really, really annoying issue.

@le13francois @luandro Does this only happen with development images, or production as well? Did this only start happening with recent OS releases? Have you tried to ssh with ssh -p 22222 balenacloud_username@balena.local or ssh -p 22222 root@balena.local?

It sounds like the supervisor is not running on your device, or the API is otherwise unreachable. I would like to narrow down the conditions in which it occurs, as we haven’t seen this in our internal testing.
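
One way to narrow it down: a plain TCP probe of the two ports mentioned so far, 48484 for the supervisor API and 22222 for SSH, shows whether anything is listening at all. Below is a minimal sketch, not an official balena tool; the names port_open and probe are made up for illustration, and it assumes Python 3 on your workstation.

```python
import socket

# Ports taken from this thread: supervisor API (48484) and host OS SSH (22222).
DEFAULT_PORTS = {"supervisor API": 48484, "SSH": 22222}


def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, timeouts and DNS failures.
        return False


def probe(host: str, ports: dict = None) -> dict:
    """Map each named port to True (accepting connections) or False."""
    ports = DEFAULT_PORTS if ports is None else ports
    return {name: port_open(host, port) for name, port in ports.items()}
```

Calling probe("192.168.0.121"), with the address from the scan output, would return a dict mapping each name to True or False. Note that this sketch does not distinguish a refused connection from an unreachable host; it only tells you whether the port accepts connections.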

Yes, I’ve tried ssh -p 22222, even with -vvv, and the log shows connection reset by peer. I haven’t noticed this behavior before; it seems to be recent.

Seems like the device randomly boots without running anything, not even the supervisor. Could it be related to booting through USB? Doesn’t seem like it though. Why does rebooting a few times fix it? Weird…

UPDATE: I updated the device to the latest balenaOS 2.95.8 and supervisor 13.0.0, but that didn’t solve the problem. After the OS update reboot, the device is back to osVariant: production. A few reboots later and it’s again showing development and starting services as it should.

I’m testing with two Pi 4s. The one which boots from SD card always works fine, the problem really seems to be related to booting from the SSD connected to USB. Any ideas why this could be happening?

Just to add: I also had the issue with RPi 3-64 and RPi 4-64 with different kinds of SD cards (old, slow ones and new, very speedy ones)

I’ve changed the SATA USB adapter and the problem never occurred again… I’m almost certain that was the cause.

Thanks for sharing your solution @luandro! I’ll update our attached patterns to reflect that this may occur when booting from external storage devices.

@le13francois we haven’t heard from you in a while, are you still experiencing this issue?

@nmaas87 I would be happy to continue debugging your issues back on the original thread if you have time to continue troubleshooting. There is of course no rush since I know you changed your work shifts. We can chip away at this problem at your convenience!

Just drop me a message over there if you’d like to get into it again!