SSH to service in local mode fails when using local address

Hello,

I am in the process of upgrading my fleet from the old resin/rpi-raspbian:jessie base image to the newer balenalib/rpi-raspbian:buster.
To try this without nuking the entire fleet, I have set up a device in local mode and so far migration seems to go pretty well.
However, I encountered a rather odd issue.

When using balena ssh <uuid>.local <service> I get the following error.

sh: -c: line 0: unexpected EOF while looking for matching `"'
sh: -c: line 1: syntax error: unexpected end of file

What surprises me is that both balena ssh <uuid>.local and balena ssh <uuid> <service> do work.
The result is the same when I replace <uuid>.local with the local ip address.

Any ideas what might be going on here?

Hi, I’ve just tried doing this with a couple of my own devices and don’t seem to be able to reproduce the problem: entering a container works fine for me.

Would you give more information about your setup?

  • What CLI version are you using (balena -v) and on what OS?
  • What balenaOS version is running on the device?
  • How is your service image organized? It’s possible to make an image that will not have a shell inside, so ssh’ing there will not be possible. However, the error you get in this case should be different.

Thanks.

Also, is your device in local mode or not?

Hello,
Sorry for omitting such important information.

My development host is Windows 7 (I know…), balena cli is version 11.23.0.
The device is running Balena 2.47.0-rev1, dev build and it is indeed in local mode.

I don’t believe the Dockerfile used for my image does anything really special.

FROM balenalib/rpi-raspbian:buster

# Enable udev
ENV UDEV on

# Setup directories we will be using
RUN mkdir -p /key

# Update repos
RUN apt-get -q update && \
apt-get -q -y install wget && \
wget -O raspberrypi.gpg.key http://archive.raspberrypi.org/debian/raspberrypi.gpg.key && \
cp -a raspberrypi.gpg.key /key/ && \
apt-key add /key/raspberrypi.gpg.key

# Install required packages

RUN apt-get -q -y install vim screen curl openssh-client iproute2 && \
apt-get clean

RUN apt-get -q -y install libusb-1.0-0-dev libudev-dev usbutils usb-modeswitch usb-modeswitch-data udev && \
apt-get clean

RUN apt-get -q -y install dbus
RUN apt-get -q -y install openjdk-11-jdk && \
apt-get clean

# Clean up repos
RUN apt-get clean && rm -rf /var/lib/apt/lists/*

# Required for hid4java
# libudev0 is not available, link to the newer version
RUN ln -s /lib/arm-linux-gnueabihf/libudev.so.1 /lib/arm-linux-gnueabihf/libudev.so.0

COPY . /usr/src/app
RUN chmod +x /usr/src/app/check_modem.sh
RUN chmod +x /usr/src/app/run.sh

# Main process
CMD ["/usr/src/app/run.sh"]

Like I said, I can connect to the HostOS with the local address, but need the remote lookup to access the service.
Likewise I can run a shell from the HostOS using balena exec -it main_1_1 /bin/sh without any issues.

Yes, I agree the image should be fine, so I think it must be an issue with the CLI running on Windows 7.
Would you also provide a full output of the command below?

DEBUG=1 balena ssh <uuid>.local <service>

First what happens when I connect to HostOS with local address

$ balena ssh da3cefd.local
[debug] original argv0="D:\Program Files\balena-cli\client\bin\..\bin\node.exe" argv=[D:\Program Files\balena-cli\client\bin\node.exe,D:\Program Files\balena-cli\client\bin\run,ssh,da3cefd.local] length=4
Last login: Thu Mar 26 13:55:36 2020 from <ip>
root@da3cefd:~#

Now the service using local address.

$ balena ssh da3cefd.local main
[debug] original argv0="D:\Program Files\balena-cli\client\bin\..\bin\node.exe" argv=[D:\Program Files\balena-cli\client\bin\node.exe,D:\Program Files\balena-cli\client\bin\run,ssh,da3cefd.local,main] length=5
sh: -c: line 0: unexpected EOF while looking for matching `"'
sh: -c: line 1: syntax error: unexpected end of file
Connection to da3cefd.local closed.

And finally the service using remote.

$ balena ssh da3cefd main
[debug] original argv0="D:\Program Files\balena-cli\client\bin\..\bin\node.exe" argv=[D:\Program Files\balena-cli\client\bin\node.exe,D:\Program Files\balena-cli\client\bin\run,ssh,da3cefd,main] length=5
[Debug] Fetching application by name da3cefd (string)
[Debug] Application not found
[Debug] Fetching device by UUID da3cefd (string)
root@da3cefd:/#

@TJvV, thank you for reporting this issue. I was able to reproduce the error with an MSYS2 bash shell on Windows 10. Were you also using a bash shell on Windows 7? Perhaps a bash provided by “git for windows”, or MinGW / Msys 1, or Msys 2 or some variation.

I have a fix in mind, which I will attempt shortly. I think what’s happening is that the CLI is getting shell escaping wrong at: https://github.com/balena-io/balena-cli/blob/v11.30.6/lib/utils/device/ssh.ts#L102-L127 I will attempt to get the CLI process to invoke ssh directly, rather than via a host OS shell.
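To make the difference between the two approaches concrete, here is a minimal Node.js sketch. It is not the CLI's actual code: the helper names buildShellCommand, buildSshArgs and runSsh are hypothetical, and the ssh options are reduced to a few for brevity. It only illustrates why passing an argument array straight to ssh avoids a second round of quote parsing by whatever local shell happens to be in use.

```javascript
// Hypothetical sketch: not the balena CLI implementation.
const { spawn } = require("child_process");

// Fragile: a single command string that a local shell (cmd.exe, Git Bash, ...)
// has to re-parse, so nested quotes depend on that shell's quoting rules.
function buildShellCommand(host, remoteCommand) {
  return `ssh -t -p 22222 root@${host} "${remoteCommand}"`;
}

// Sturdier: one array element per argument, handed to ssh without a local shell.
function buildSshArgs(host, remoteCommand) {
  return ["-t", "-p", "22222", `root@${host}`, remoteCommand];
}

// Spawns ssh directly (shell: false is the default for spawn).
// Defined for illustration only; it is not called in this sketch.
function runSsh(host, remoteCommand) {
  return spawn("ssh", buildSshArgs(host, remoteCommand), { stdio: "inherit" });
}

const remote =
  'sh -c "if [ -e /bin/bash ]; then exec /bin/bash; else exec /bin/sh; fi"';

// In the string form, the inner double quotes collide with the outer ones.
console.log(buildShellCommand("da3cefd.local", remote));
// In the array form, the remote command stays one intact argument.
console.log(buildSshArgs("da3cefd.local", remote));
```

With the array form, the only shell that ever interprets the remote command is the one on the device, which is presumably why the behaviour stops depending on the local shell.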

Meanwhile, as a workaround, I suggest running the balena ssh command on a native Windows “Command Prompt” (cmd.exe) or PowerShell, which I have tested with success. I have also had success using bash through the Windows Subsystem for Linux (WSL 1), selecting Ubuntu in the Microsoft app store, and using a balena CLI release for Linux (standalone zip package). Oh, but I think WSL is only available for Windows 10.

@pdcastro I am indeed running this from my git bash (MinGW64, OpenSSH_7.1p2, OpenSSL 1.0.2g).

I just tried a few other methods.
Cygwin (OpenSSH_for_Windows_8.1p1, LibreSSL 2.9.2) has the same issue.
Windows cmd and Powershell (OpenSSH_for_Windows_8.1p1, LibreSSL 2.9.2) both work fine.

@TJvV, thank you for the extra information. We’re working on the fix for the issue and will let you know when the fix is released.

In the meantime, please run balena ssh either on a native Windows “Command Prompt” (cmd.exe) or PowerShell with a CLI release for Windows, or in a bash shell of the Windows Subsystem for Linux (WSL 1) (e.g. selecting Ubuntu in the Microsoft app store) with a balena CLI release for Linux (standalone zip package), as my colleague mentioned above.

@TJvV, this issue should be fixed in CLI release 11.30.8 (releases). I have successfully tested balena ssh to a service container in a local mode device on a Git Bash shell (and also MSYS2). Let us know if you are still facing issues, and thank you again for reporting it.

@pdcastro This is a step in the right direction, but there is still something odd.

While it does connect, I don’t see the shell the way I’m used to.
It’s kind of like I’m getting a partial shell, where commands do appear to work, but some parsing breaks.

First the login, using <uuid>.local <service>

DEBUG=1 balena ssh da3cefd.local main
[debug] original argv0="D:\Program Files\balena-cli\client\bin\..\bin\node.exe" argv=[D:\Program Files\balena-cli\client\bin\node.exe,D:\Program Files\balena-cli\client\bin\run,ssh,da3cefd.local,main] length=5
[debug] [C:\Program Files\Git\usr\bin\ssh.EXE, -t, -p, 22222, -o, LogLevel=ERROR, -o, StrictHostKeyChecking=no, -o, UserKnownHostsFile=/dev/null, root@da3cefd.local, (if [ -f /usr/bin/balena ]; then echo "balena"; else echo "docker"; fi) exec -i c725f7bbb024dcf519f3f656198d6e3618680b23f5999241de199989ec8342bb /bin/sh -c "if [ -e /bin/bash ]; then exec /bin/bash; else exec /bin/sh; fi"]

Try a few commands

echo $SHELL
/bin/bash
uname -a
Linux da3cefd 4.19.71 #1 SMP Fri Jan 31 09:51:26 UTC 2020 armv7l GNU/Linux

Try a clear screen (Ctrl+L)

^L
/bin/bash: line 3: $'\f': command not found

The same commands using remote address

root@da3cefd:/# echo $SHELL
/bin/bash
root@da3cefd:/# uname -a
Linux da3cefd 4.19.71 #1 SMP Fri Jan 31 09:51:26 UTC 2020 armv7l GNU/Linux

Clear screen

root@da3cefd:/#
root@da3cefd:/#

Just like the clear screen, the bash autocomplete (<tab><tab>) also doesn’t work.

I also noticed I can actually quit the ssh using ctrl+c, which is something that doesn’t work that way with the remote address.

^CConnection to da3cefd.local closed.
ssh failed with exit code 255:
[C:\Program Files\Git\usr\bin\ssh.EXE, -t, -p, 22222, -o, LogLevel=ERROR, -o, StrictHostKeyChecking=no, -o, UserKnownHostsFile=/dev/null, root@da3cefd.local, (if [ -f /usr/bin/balena ]; then echo "balena"; else echo "docker"; fi) exec -i c725f7bbb024dcf519f3f656198d6e3618680b23f5999241de199989ec8342bb /bin/sh -c "if [ -e /bin/bash ]; then exec /bin/bash; else exec /bin/sh; fi"]
Error: ssh failed with exit code 255:
[C:\Program Files\Git\usr\bin\ssh.EXE, -t, -p, 22222, -o, LogLevel=ERROR, -o, StrictHostKeyChecking=no, -o, UserKnownHostsFile=/dev/null, root@da3cefd.local, (if [ -f /usr/bin/balena ]; then echo "balena"; else echo "docker"; fi) exec -i c725f7bbb024dcf519f3f656198d6e3618680b23f5999241de199989ec8342bb /bin/sh -c "if [ -e /bin/bash ]; then exec /bin/bash; else exec /bin/sh; fi"]
at whichSpawn (D:\Program Files\balena-cli\client\build\utils\helpers.js:215:15)
From previous event:
at runCommand (D:\Program Files\balena-cli\client\build\app-capitano.js:177:14)
at Object.exports.run (D:\Program Files\balena-cli\client\build\app-capitano.js:189:39)
at routeCliFramework (D:\Program Files\balena-cli\client\build\preparser.js:39:79)
at process._tickCallback (internal/process/next_tick.js:68:7)
at Function.Module.runMain (internal/modules/cjs/loader.js:834:11)
at startup (internal/bootstrap/node.js:283:19)
at bootstrapNodeJSCore (internal/bootstrap/node.js:623:3)

If you need help, don’t hesitate in contacting our support forums at
https://forums.balena.io

For CLI bug reports or feature requests, have a look at the GitHub issues or
create a new one at: https://github.com/balena-io/balena-cli/issues/

I hope this info helps

@TJvV, thank you for the additional information, which has indeed helped. There is now a new CLI release, 11.30.10, that introduces a -t option to bypass the TTY autodetection that was introduced in 11.30.8. Could you try updating the CLI to version 11.30.10 and using the -t option? The command line would be:

DEBUG=1 balena ssh da3cefd.local main -t

I expect it will make a difference because I noticed the following “substring” in your debug output:

... exec -i c725f7b...

I had expected that “substring” to be:

... exec -i -t c725f7b...

The CLI code that autodetects TTY is at: https://github.com/balena-io/balena-cli/blob/v11.30.10/lib/utils/device/ssh.ts#L106-L108

If you can, let me know whether using the -t option indeed makes a difference, because I was not able to reproduce the error on Windows 10, macOS and Linux, including several Windows shells: Git Bash, MSYS2, WSL bash, Command Prompt and PowerShell. However, I was able to simulate the behaviour you described by modifying the TTY autodetection code to pretend that stdin is not a TTY.

If -t “solves” the problem for you, the next step would be to try to understand the reason why TTY autodetection fails in your environment. It could be something specific to Windows 7, or it could be something else. I could revert the code to remove TTY autodetection and always force TTY allocation, but this is also problematic because it prevents executing remote commands by piping them to balena ssh, for example with Git Bash:

echo "ls -la; exit;" | balena ssh da3cefd.local main

The above should execute "ls -la" on the device/service and print the output on your development host, but it only works if stdin is autodetected as not being a TTY. Before CLI 11.30.8, the piped command execution was not working for ssh to service containers on local devices,* and 11.30.8 fixed it at least on Windows 10, macOS and Linux. Piped commands to the host OS (rather than to services) have worked for a long time, and the feature was mentioned in balena’s January newsletter:

Which is why we would have to think twice before dropping TTY autodetection altogether, if autodetection is confirmed to be the reason for the issue you described.
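The trade-off described above can be sketched in a few lines of Node.js. This is only an illustration under stated assumptions, not the code at the link above: execFlags is a hypothetical helper, forceTty stands in for the new -t option, and stdinIsTty stands in for the autodetection result.

```javascript
// Hypothetical sketch of the TTY decision; not the CLI's actual code.
const tty = require("tty");

// forceTty models the -t option; stdinIsTty models the autodetection result.
// Returns the flags that would follow "exec" in the remote command.
function execFlags(forceTty, stdinIsTty) {
  const flags = ["-i"];
  if (forceTty || stdinIsTty) {
    flags.push("-t"); // allocate a TTY for an interactive session
  }
  return flags;
}

// fd 0 is stdin; this is false when input is piped into the process.
const detected = tty.isatty(0);
console.log("stdin is a TTY:", detected);
console.log("autodetected flags:", execFlags(false, detected).join(" "));
console.log("flags with -t:     ", execFlags(true, detected).join(" "));
```

When autodetection wrongly reports that stdin is not a TTY, the sketch yields only -i, which matches the "exec -i" substring in the debug output; forcing -t restores "exec -i -t", while piped input without -t still gets the plain -i form.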

* Note (to save you the trouble of reporting it!): piped commands to service containers on remote devices (given the device UUID and service name, and going through the balenaCloud gateway) do not currently work because of an issue in the balenaCloud gateway. I have raised it internally to the backend team.

Thanks again for testing it and helping us fix it.

Also, adding to my previous message, what is the output of the following commands on your Git Bash shell?

$ tty
/dev/cons0

$ tty -s; echo $?
0

$ node -e 'console.log(require("tty").isatty(0))'
true

The output above is what I get with Git Bash on Windows 10.

The -t flag does indeed help and piping (without flag obviously) does work.

Regarding your piping example; I’m not sure if it’s supposed to fail, but I had to change the cat command to echo to not get a “file does not exist” error.

I don’t see anything weird in the tty detection in my git bash

$ tty
/dev/pty0

$ tty -s; echo $?
0

$ node -e 'console.log(require("tty").isatty(0))'
true

It seems like the problem lies in different versions of node.

$ node --version
v10.15.3

$ /d/Program\ Files/balena-cli/client/bin/node.exe --version
v10.19.0

When I try the same command using the supplied version:

$ /d/Program\ Files/balena-cli/client/bin/node.exe -e 'console.log(require("tty").isatty(0))'
false

One more fun fact I just noticed with this version of balena-cli: I cannot login anymore.
I received a prompt saying my session expired, so I tried to log in and received this beautiful message.

$ DEBUG=1 balena login
[debug] original argv0="D:\Program Files\balena-cli\client\bin\..\bin\node.exe" argv=[D:\Program Files\balena-cli\client\bin\node.exe,D:\Program Files\balena-cli\client\bin\run,login] length=3

[…Balena logo which won’t format nicely…]

Logging in to balena-cloud.com
Prompts can not be meaningfully rendered in non-TTY environments
Error: Prompts can not be meaningfully rendered in non-TTY environments
at PromptUI.fetchAnswer (D:\Program Files\balena-cli\client\node_modules\inquirer\lib\ui\prompt.js:87:27)
at MergeMapSubscriber._tryNext (D:\Program Files\balena-cli\client\node_modules\rxjs\internal\operators\mergeMap.js:69:27)
at MergeMapSubscriber._next (D:\Program Files\balena-cli\client\node_modules\rxjs\internal\operators\mergeMap.js:59:18)
at MergeMapSubscriber.Subscriber.next (D:\Program Files\balena-cli\client\node_modules\rxjs\internal\Subscriber.js:66:18)
at MergeMapSubscriber.notifyNext (D:\Program Files\balena-cli\client\node_modules\rxjs\internal\operators\mergeMap.js:95:26)
at InnerSubscriber._next (D:\Program Files\balena-cli\client\node_modules\rxjs\internal\InnerSubscriber.js:28:21)
at InnerSubscriber.Subscriber.next (D:\Program Files\balena-cli\client\node_modules\rxjs\internal\Subscriber.js:66:18)
at Observable._subscribe (D:\Program Files\balena-cli\client\node_modules\rxjs\internal\util\subscribeToArray.js:5:20)
at Observable._trySubscribe (D:\Program Files\balena-cli\client\node_modules\rxjs\internal\Observable.js:44:25)
at Observable.subscribe (D:\Program Files\balena-cli\client\node_modules\rxjs\internal\Observable.js:30:22)
at Object.subscribeToResult (D:\Program Files\balena-cli\client\node_modules\rxjs\internal\util\subscribeToResult.js:12:23)
at MergeMapSubscriber._innerSub (D:\Program Files\balena-cli\client\node_modules\rxjs\internal\operators\mergeMap.js:82:53)
at MergeMapSubscriber._tryNext (D:\Program Files\balena-cli\client\node_modules\rxjs\internal\operators\mergeMap.js:76:14)
at MergeMapSubscriber._next (D:\Program Files\balena-cli\client\node_modules\rxjs\internal\operators\mergeMap.js:59:18)
at MergeMapSubscriber.Subscriber.next (D:\Program Files\balena-cli\client\node_modules\rxjs\internal\Subscriber.js:66:18)
at MergeMapSubscriber.notifyNext (D:\Program Files\balena-cli\client\node_modules\rxjs\internal\operators\mergeMap.js:95:26)
at D:\Program Files\balena-cli\client\node_modules\resin-cli-form\build\form.js:94:20
at runCallback (timers.js:705:18)
at tryOnImmediate (timers.js:676:5)
at processImmediate (timers.js:658:5)
at process.topLevelDomainCallback (domain.js:126:23)
at Object.exports.run (D:\Program Files\balena-cli\client\node_modules\resin-cli-form\build\form.js:71:18)
at Object.exports.ask (D:\Program Files\balena-cli\client\node_modules\resin-cli-form\build\form.js:126:18)
at Object.askLoginType (D:\Program Files\balena-cli\client\build\utils\patterns.js:78:22)
at doLogin (D:\Program Files\balena-cli\client\build\actions\auth.js:93:46)
at Command.action (D:\Program Files\balena-cli\client\build\actions\auth.js:106:15)
From previous event:
at runCommand (D:\Program Files\balena-cli\client\build\app-capitano.js:177:14)
at Object.exports.run (D:\Program Files\balena-cli\client\build\app-capitano.js:189:39)
at routeCliFramework (D:\Program Files\balena-cli\client\build\preparser.js:39:79)
at process._tickCallback (internal/process/next_tick.js:68:7)
at Function.Module.runMain (internal/modules/cjs/loader.js:834:11)
at startup (internal/bootstrap/node.js:283:19)
at bootstrapNodeJSCore (internal/bootstrap/node.js:623:3)

If you need help, don’t hesitate in contacting our support forums at
https://forums.balena.io

For CLI bug reports or feature requests, have a look at the GitHub issues or
create a new one at: https://github.com/balena-io/balena-cli/issues/

Again this seems to rely on the TTY detection.

Hi, you can log in with a token using balena login -t <token> as a workaround until we fix the issue.
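As a rough illustration of why the token workaround sidesteps the error, here is a small Node.js sketch. It is not how the CLI is implemented: chooseLoginMethod and the EXAMPLE_TOKEN environment variable are hypothetical names used only for this example. The idea is that a token path never needs an interactive prompt, so it does not depend on TTY detection at all.

```javascript
// Hypothetical sketch: pick a login path without assuming stdin is a TTY.
// Not the balena CLI's actual logic; chooseLoginMethod is an illustrative name.
function chooseLoginMethod({ stdinIsTty, token }) {
  if (token) {
    return "token"; // non-interactive, works even when no TTY is detected
  }
  if (stdinIsTty) {
    return "prompt"; // safe to render an interactive prompt
  }
  return "unavailable"; // no token and no TTY: a prompt could not be rendered
}

const chosen = chooseLoginMethod({
  stdinIsTty: Boolean(process.stdin.isTTY),
  token: process.env.EXAMPLE_TOKEN, // hypothetical variable, for illustration only
});
console.log("login method:", chosen);
```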

Just did a quick sanity check after downgrading back to 11.23.0, and the problem seems more involved with the invocation of node rather than the actual version.

$ which node
/d/Program Files/nodejs/node

$ node -e 'console.log(require("tty").isatty(0))'
true

$ /d/Program\ Files/nodejs/node -e 'console.log(require("tty").isatty(0))'
false

@TJvV, as you were already concluding, the issue you had with balena login (“Prompts can not be meaningfully rendered in non-TTY environments”) and the “partial shell” with balena ssh are related to your Git Bash shell environment on Windows 7, as opposed to changes in CLI versions 11.30.8 or 11.30.10. I’ve got some additional data points:

Git Bash on Windows 10:

$ which node && node --version
/c/Program Files/nodejs/node
v12.16.1

$ node -e 'console.log(require("tty").isatty(0))'
true

$ /c/Program\ Files/nodejs/node -e 'console.log(require("tty").isatty(0))'
true

Git Bash on Windows 7:

$ which node && node --version
/c/Program Files/nodejs/node
v12.16.1

$ node -e 'console.log(require("tty").isatty(0))'
true

$ /c/Program\ Files/nodejs/node -e 'console.log(require("tty").isatty(0))'
false

$ winpty /c/Program\ Files/nodejs/node -e 'console.log(require("tty").isatty(0))'
true

Note how the last invocation, preceded with winpty, changed the output of isatty. See more info on winpty at: https://stackoverflow.com/questions/48199794/winpty-and-git-bash
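For anyone who wants to repeat the comparison across shells, the one-liner can be extended into a small script that reports all three standard streams. This is just a diagnostic sketch: describeStreams is a hypothetical helper name, and it relies only on Node's built-in tty.isatty.

```javascript
// Diagnostic sketch: report whether stdin, stdout and stderr are TTYs.
const tty = require("tty");

// The isatty parameter is injectable so the helper can be checked without a
// real terminal; by default it uses Node's tty.isatty.
function describeStreams(isatty = tty.isatty) {
  return ["stdin", "stdout", "stderr"].map((name, fd) => ({
    name,
    fd,
    isTTY: isatty(fd),
  }));
}

for (const stream of describeStreams()) {
  console.log(`${stream.name} (fd ${stream.fd}): isTTY=${stream.isTTY}`);
}
```

Saved under a name of your choice and run once as plain node, once via the full path to node, and once prefixed with winpty, it should make it easy to see which invocation loses the TTY.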