Local SSH does not work without internet access?

I manage a fleet of cell-connected balena devices, and we have had multiple cases where devices lose their cell connection and local troubleshooting is required.

I am trying to put together a troubleshooting process for when our technicians show up on site, but I am struggling to find a way to log into the device locally when internet is not available.

I have been able to connect to the devices over a direct Ethernet connection using this guide when the Balena device has internet access, but when attempting to do so without internet access on the Balena device it appears to be unable to validate the private key.

Is this intentional?
If not, what am I doing wrong?
Is there a way to synchronize the keys for offline use, or is some other method available for local troubleshooting without an internet connection?

Hello @aferm, thanks for your message, and apologies for the experience.

Do you know the status of the devices when they lose cellular connectivity? If the device is offline, you can’t access it using balena.

Does your application use NetworkManager (nmcli) to check whether connectivity is available and reconnect if it is not?

On the other hand, if you can connect another device on the same network (via Ethernet), you can check these instructions.
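For illustration, a connectivity check of that kind could look roughly like the sketch below. It assumes `nmcli` is installed and can reach the host's NetworkManager from wherever the code runs, and the connection name `cellular` is a hypothetical placeholder, not something from this thread.

```python
import subprocess

# `nmcli networking connectivity` reports one of:
# none, portal, limited, full, unknown. Only "full" is treated as connected here.
CONNECTED_STATES = {"full"}


def needs_reconnect(state: str) -> bool:
    """Return True when the reported connectivity state is anything but 'full'."""
    return state.strip().lower() not in CONNECTED_STATES


def current_connectivity(timeout_s: float = 30.0) -> str:
    """Ask NetworkManager to re-check connectivity and return its state string."""
    result = subprocess.run(
        ["nmcli", "networking", "connectivity", "check"],
        capture_output=True, text=True, timeout=timeout_s, check=True,
    )
    return result.stdout.strip()


def reconnect(connection_name: str = "cellular") -> None:
    """Bring a connection profile up again ('cellular' is a hypothetical name)."""
    subprocess.run(["nmcli", "connection", "up", connection_name], check=True)
```

A watchdog loop would then call `reconnect()` whenever `needs_reconnect(current_connectivity())` is true, and log each attempt so there is something to inspect later.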

Let us know if that works!

The devices show “Offline” on the dashboard, which is why we were trying the local ssh option. We do have a service that manages the cell connection, but for some reason it is failing to re-establish the connection in this situation.

We need to be able to gain access to the device in this situation to inspect the state.

The provided instructions for using a gateway device do not work, as the ethernet is not connected when the device originally goes offline, so we don’t have a local IP address to provide.

When connecting a laptop or another balena device to the offline device, I am able to find the link-local address using avahi, but the device refuses my private key. The same ssh command using the .local mDNS hostname succeeds when the device’s cell modem is connected.

UPDATE: Supplying the link-local IPv6 address does not work with the gateway device instructions.

More context on the situation.

Our devices are deployed in remote areas with their cell modem being their only active network connection. They are also configured to receive a DHCP IP address over Ethernet for our provisioning process.

We have seen multiple times where a device will go offline, and our service that manages the cell connection never succeeds in reconnecting. Personnel on site have noted that messages on our display are changing, indicating that our software is running. When they cycle power to the device it is immediately able to connect to the cell network, but there are no logs from the period when the device was offline even though we have enabled persistent logging.

Since all state from the time in question appears to be lost when resetting the device, we need a way to log into the device on site without resetting it to inspect the state and determine the root cause.

Ideally our technicians would have a simple way to plug in on-site with either a laptop or another balena device to give me or our other engineers remote access to the device.

Since I can’t reproduce the exact problem in our office, I am trying to approximate it by booting one of our devices without a sim card and connecting directly over ethernet as would be the case in the field.

Without having tried it myself, it seems like you have to set up some kind of DHCP server on the laptop of the engineer trying to connect. I know macOS has a built-in feature for “sharing” an internet connection which does this for you, and I’m sure Windows and Linux have a similar function. This way, the balena device will get an IP address from the engineer’s laptop. Retrieving this IP address is somewhat more difficult, but I use the command arp -a for this (on a Mac).
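As a rough sketch of the `arp -a` step, the snippet below pulls IP and MAC pairs out of the command's text output so a technician can spot the newly attached device. The line format it expects is the common macOS/Linux one (`? (ip) at mac ... on iface`); the interface name and addresses in the usage note are made-up examples.

```python
import re
from typing import List, Optional, Tuple

# Matches lines such as:
#   ? (192.168.2.2) at b8:27:eb:12:34:56 on bridge100 ifscope [bridge]
#   ? (192.168.1.1) at aa:bb:cc:dd:ee:ff [ether] on eth0
# Entries without a resolved MAC ("(incomplete)") do not match and are skipped.
ARP_LINE = re.compile(
    r"\((?P<ip>\d{1,3}(?:\.\d{1,3}){3})\) at (?P<mac>[0-9A-Fa-f:]+)\b.*?\bon (?P<iface>\S+)"
)


def parse_arp_table(arp_output: str, interface: Optional[str] = None) -> List[Tuple[str, str]]:
    """Return (ip, mac) pairs from `arp -a` output, optionally for one interface."""
    entries = []
    for line in arp_output.splitlines():
        match = ARP_LINE.search(line)
        if match is None:
            continue
        if interface is not None and match.group("iface") != interface:
            continue
        entries.append((match.group("ip"), match.group("mac")))
    return entries
```

For example, passing the output of `subprocess.run(["arp", "-a"], capture_output=True, text=True).stdout` together with the name of the shared interface narrows the list to devices on the direct Ethernet link.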

But keep in mind that the fact that the balena device won’t reconnect to the cellular network might indicate that the networking service on the device has failed. If that’s the case and the Ethernet solution doesn’t work, you can try rebooting it and see if it works then. As I read your topic, this seems to be the case, given that SSH “only works with internet access”. You could also remove the cellular modem entirely from a development device and try to connect to it directly over Ethernet, just to make sure that method works when the device is offline.

That’s just my take on how I would debug your problem, as far as you can.

Hi there,
What I would ask now:

  1. Is the device in production mode? (It sounds like it probably is, if it’s difficult to SSH in.)
  2. If so, local SSH using a standalone ssh client (ssh <IP> -p 22222) won’t work without the device having internet/balenaCloud access, unless you add a public SSH key to the device’s config.json and have the corresponding private key on the laptop your engineer will be using.
  3. Alternatively, you could try using the balena CLI to SSH into the device from the laptop. I believe that, given you are using an authenticated balena CLI (done via balena login), you should be able to SSH into your target device using balena ssh <IP> (this requires CLI v13.3.0 or later, and balenaOS v2.44.0 or later).

The CLI option shouldn’t require adding SSH keys to the device’s config.json and the engineer’s laptop. If you have already tried this, let me know and we can investigate further.
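To make option 2 a bit more concrete, here is a minimal sketch of adding a public key to a config.json document. It assumes the keys live in an `sshKeys` array under the `os` object, which is how I understand balenaOS reads them; please verify that against the balenaOS documentation for your OS version. The key string in the usage note is a placeholder.

```python
import json


def add_ssh_key(config: dict, public_key: str) -> dict:
    """Return a copy of a config.json dict with public_key added to os.sshKeys.

    Assumption: balenaOS reads custom keys from config["os"]["sshKeys"].
    The input dict is left unmodified, and an already-present key is not duplicated.
    """
    updated = json.loads(json.dumps(config))  # cheap deep copy of JSON data
    keys = updated.setdefault("os", {}).setdefault("sshKeys", [])
    if public_key not in keys:
        keys.append(public_key)
    return updated
```

Typical use would be to load config.json from the image or boot partition with `json.load`, call `add_ssh_key(config, "ssh-ed25519 AAAA... tech@laptop")`, and write the result back with `json.dump`. As far as I know, keys added this way are used with the root user, i.e. ssh root@<IP> -p 22222.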

  1. Yes, all of our devices run in production mode.
  2. This works, and even supports mDNS name resolution with IPv6 link-local addresses. This is the behavior I expected with the keys added on the balena dashboard. Is there a way to manage this set of keys from the dashboard? This is only usable if we can manage them centrally to add and remove authorized keys, rather than building them into our image before flashing and having to log into each device to update them if they change. It would also be acceptable, but not ideal, if we could manage these keys from our application.
  3. I haven’t tried this, but would prefer a solution that doesn’t require internet access on either device. While it is likely an on-site technician or engineer has internet access, it isn’t guaranteed because some of our installations require directional cell antennas mounted 10-15 feet off the ground.

Hi there,

Thanks for clarifying. In that case, unfortunately, I believe there is no “built in” way to centrally manage these SSH keys from the dashboard. The best I can recommend right now is to write a script that SSHes into your devices and writes the keys to each device’s config.json any time you want to change them.
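A script along those lines might look something like the sketch below. Treat it as a starting point under several assumptions rather than a tested tool: it assumes config.json sits at /mnt/boot/config.json on the host OS, that keys go in `os.sshKeys`, that the device is reachable as root on port 22222 with a key it already trusts, and that the device may need a reboot before the new keys take effect.

```python
import json
import subprocess
from typing import Callable, List

CONFIG_PATH = "/mnt/boot/config.json"  # assumed config.json location on the host OS


def ssh_argv(host: str, remote_command: str, port: int = 22222) -> List[str]:
    """Build the ssh command line for running one command on the device host OS."""
    return ["ssh", "-p", str(port), f"root@{host}", remote_command]


def run_ssh(host: str, remote_command: str, stdin_text: str = "") -> str:
    """Run a command over ssh, feeding stdin_text to it, and return its stdout."""
    result = subprocess.run(
        ssh_argv(host, remote_command),
        input=stdin_text, capture_output=True, text=True, check=True,
    )
    return result.stdout


def with_authorized_keys(config: dict, keys: List[str]) -> dict:
    """Return a copy of config with os.sshKeys replaced by the given key list."""
    updated = json.loads(json.dumps(config))
    updated.setdefault("os", {})["sshKeys"] = list(keys)
    return updated


def sync_keys(host: str, keys: List[str],
              run: Callable[[str, str, str], str] = run_ssh) -> None:
    """Read config.json from the device, replace its key list, and write it back."""
    current = json.loads(run(host, f"cat {CONFIG_PATH}", ""))
    payload = json.dumps(with_authorized_keys(current, keys))
    # Write to a temporary file first so a dropped connection cannot truncate config.json.
    run(host, f"cat > {CONFIG_PATH}.tmp && mv {CONFIG_PATH}.tmp {CONFIG_PATH}", payload)
```

Calling `sync_keys(host, keys)` for each device in the fleet replaces the whole key list, so the list should always include the key the script itself connects with; otherwise the next run would be locked out.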