Tunneling socket could not be established (Node.js SDK)

Hi,

I was trying to create some endpoints on my own server that talk to the openBalena API via the Node.js SDK. Gathering data via balena.models.device.get(<uuidOrId>) seems to work fine. This is (probably) because most of it comes from the database itself.

However, when I try to reboot a device via the SDK with balena.models.device.reboot(<uuidOrId>, { force: false }), I get the following error:

tunneling socket could not be established, cause=connect ECONNREFUSED 178.62.251.244:3128

The IP address is unknown to me, so that’s probably the reason why it’s failing. But I don’t really know how to debug this, because how did it get that IP address in the first place? My VPN is running on another IP address, and the API and Registry don’t run on that IP address either.
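
One way to start debugging is to pull the address out of the error message and check whether that port accepts TCP connections at all. This is a minimal sketch, assuming the message always has the shape shown above; parseTunnelError and checkPort are hypothetical helpers, not part of the balena SDK.

```javascript
const net = require('node:net');

// Hypothetical helper: extracts the proxy address from a message shaped like
// "tunneling socket could not be established, cause=connect ECONNREFUSED <ip>:<port>".
// Returns null when the message does not match that shape.
function parseTunnelError(message) {
  const match = /cause=connect (\w+) ([0-9a-fA-F.:]+):(\d+)\s*$/.exec(message);
  if (!match) return null;
  return { code: match[1], host: match[2], port: Number(match[3]) };
}

// Hypothetical helper: resolves to true if a TCP connection to host:port
// succeeds within timeoutMs, and to false on refusal, error, or timeout.
function checkPort(host, port, timeoutMs = 3000) {
  return new Promise((resolve) => {
    const socket = net.connect({ host, port });
    const finish = (reachable) => {
      socket.destroy();
      resolve(reachable);
    };
    socket.setTimeout(timeoutMs);
    socket.once('connect', () => finish(true));
    socket.once('timeout', () => finish(false));
    socket.once('error', () => finish(false));
  });
}

const target = parseTunnelError(
  'tunneling socket could not be established, cause=connect ECONNREFUSED 178.62.251.244:3128'
);
console.log(target);

// Usage (performs a real connection attempt):
// checkPort(target.host, target.port).then((ok) => console.log('reachable:', ok));
```

Running checkPort from the machine that hosts the SDK code shows whether the refusal comes from a closed port or from something in between.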


On further investigation, that’s the IP address of the server the VPN is currently running on. However, for scaling purposes, not everything runs on the same server: the VPN runs on its own server, and the API on its own server. In front of the API and the VPN there are load balancers. It should probably try to connect to the IP address of the load balancer instead of the server itself. Is this even possible at this time?

Thanks in advance!

I’ve checked the open-balena-api code, and it seems like it’s using the is_managed_by__service_instance.ip_address to determine the IP of the server. This seems to be okay, but because the VPN servers are behind a Load Balancer in Kubernetes, the direct port is not opened, only via the Load Balancer. I’m also using Kubernetes, so opening port 3128 directly is not really possible.

I’ve seen that I can change this using VPN_CONNECT_PROXY_PORT, but this probably means I’d have to reflash our devices?

And if I’m using 2 or more VPN servers (via Kubernetes), how does this work? As said, the VPN servers are behind a load balancer, so vpn.<domain> points to a load balancer. So device A can be connected to VPN #1, while device B is connected to VPN #2. If this happens, is it still possible to use balena tunnel and also use the balena Node.js SDK? And if not, how can I scale my VPNs?

Thanks!

Hi Bart,

So the IP address the API holds is the IP address of the VPN server which the device is connected to; this much I think you’ve worked out for yourself. When a device connects to the VPN, a script is fired in the VPN container to update the API with the information. The API uses the IP of the client making the request, so the IP of the VPN instance. If for some reason the device reconnects to the VPN but lands on a different VPN instance, and the API isn’t updated before an API lookup is made, then the VPN instance will also try to connect through other VPN instances, by looking up the VPN IP in the API too. The idea here is to make a system which can always route to the device.
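
The bookkeeping described above can be pictured with a toy model. This is only a sketch of the idea, not the open-balena-api implementation: the ServiceRegistry class, its method names, and the example addresses are hypothetical, and port 3128 is simply the one from the error message earlier in the thread.

```javascript
// Toy model: the API remembers, per device, the IP of the VPN instance
// that reported the connection, and later uses it as the proxy target.
class ServiceRegistry {
  constructor(proxyPort = 3128) {
    this.proxyPort = proxyPort;
    this.instanceIpByDevice = new Map();
  }

  // Called when a VPN instance reports that a device connected. The recorded
  // IP is the source IP of that request, i.e. the VPN instance itself,
  // not the address of any load balancer in front of it.
  reportConnect(deviceUuid, requestSourceIp) {
    this.instanceIpByDevice.set(deviceUuid, requestSourceIp);
  }

  // Returns the proxy endpoint used to reach the device, or null if unknown.
  proxyFor(deviceUuid) {
    const ip = this.instanceIpByDevice.get(deviceUuid);
    return ip ? { host: ip, port: this.proxyPort } : null;
  }
}

const registry = new ServiceRegistry();
registry.reportConnect('device-a', '192.0.2.10'); // connected via VPN #1
registry.reportConnect('device-b', '192.0.2.20'); // connected via VPN #2
registry.reportConnect('device-a', '192.0.2.20'); // device A reconnects via VPN #2

console.log(registry.proxyFor('device-a'));
console.log(registry.proxyFor('device-b'));
```

In this model the caller always dials the instance IP directly, which is why that IP and port have to be reachable from wherever the API or SDK call is proxied.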

I think you need to look at how the VPN instances route to the API, and give each VPN instance a public address. I don’t think it was considered to have your VPN instances behind a balancer with a single IP address on the frontend.

Hope this helps.

Hi Rich,

Thanks for your explanation. It confirms my thoughts.

So to solve this problem, I have to expose port 3128 of each specific VPN server to the public instead of going via the load balancer? I have to check if I can do that, because I run everything via Kubernetes, which is not really designed to open ports on a specific server and route traffic to it.

And do I need to expose port 443 as well for the specific server(s)?

I’ll post my findings here. Another solution is probably to separate the VPN from the Kubernetes cluster, but I like the way it scales (if it works with the VPN).


I’m curious, how do you guys scale the VPN for Balena? Because you’ll need a Load Balancer to distribute the load between VPN servers, right?

Bart,

We use multiple VPN instances each with a public IP address in AWS. You can see this by running dig on the VPN hostname.

~ » dig vpn.balena-cloud.com

; <<>> DiG 9.16.1-Ubuntu <<>> vpn.balena-cloud.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 31730
;; flags: qr rd ra; QUERY: 1, ANSWER: 4, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 65494
;; QUESTION SECTION:
;vpn.balena-cloud.com.		IN	A

;; ANSWER SECTION:
vpn.balena-cloud.com.	66	IN	CNAME	ab62c62b8a0004e8cbc98d804db4adee-ab269b2a66b510a2.elb.us-east-1.amazonaws.com.
ab62c62b8a0004e8cbc98d804db4adee-ab269b2a66b510a2.elb.us-east-1.amazonaws.com. 43 IN A 3.227.28.93
ab62c62b8a0004e8cbc98d804db4adee-ab269b2a66b510a2.elb.us-east-1.amazonaws.com. 43 IN A 35.169.89.252
ab62c62b8a0004e8cbc98d804db4adee-ab269b2a66b510a2.elb.us-east-1.amazonaws.com. 43 IN A 35.169.76.143

;; Query time: 0 msec
;; SERVER: 127.0.0.53#53(127.0.0.53)
;; WHEN: Thu Dec 31 12:59:55 GMT 2020
;; MSG SIZE  rcvd: 185

The balenaOS device will connect to one of the 3 instance IP addresses.
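
The same lookup can be done from Node.js with the built-in dns module. This is a minimal sketch; listVpnAddresses is a hypothetical helper, and its resolver parameter exists only so the function can be exercised without network access.

```javascript
const dns = require('node:dns');

// Hypothetical helper: returns the IPv4 addresses (A records) behind a hostname.
// By default it uses Node's promise-based resolver; any object with a
// compatible resolve4(hostname) method can be passed in instead.
async function listVpnAddresses(hostname, resolver = dns.promises) {
  const addresses = await resolver.resolve4(hostname);
  return addresses;
}

// Usage (needs network access):
// listVpnAddresses('vpn.balena-cloud.com').then((ips) => console.log(ips));
```

Each address returned is a separate endpoint the device may end up connected through.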

And do I need to expose port 443 as well for the specific server(s)?

I don’t think so.

Hi Rich,

Thanks for your explanation! I hadn’t thought about using dig, and I didn’t know AWS worked that way with multiple A records. I should take a look at that, but I’m using DO. I’ll see if I can figure something out!