registryEndpoint and vpnEndpoint in config.json seem to be ignored?

Setting registryEndpoint and vpnEndpoint in config.json seems to have no effect. I’ve set these in the boot partition’s config.json for both a 3.0.0 and a 3.1.12+rev1 NUC image, and at first boot (and subsequent boots), I get:

/mnt/state/root-overlay/etc/balena-supervisor/supervisor.conf contains:
SUPERVISOR_IMAGE=registry2.balena-cloud.com/v2/f383818305f117d7daa4a9fe30fb3d7b

/mnt/state/root-overlay/etc/openvpn/openvpn.conf contains:
remote cloudlink.balena-cloud.com 443

How do I get balenaOS images to actually use these two config.json settings?
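For anyone wanting to confirm the same mismatch on their own device, here is a minimal Python sketch that compares the hosts requested in config.json with the hosts that ended up in supervisor.conf and openvpn.conf. The function names are made up for illustration, and it assumes the file contents have already been read into strings from the paths quoted above.

```python
import json
from urllib.parse import urlsplit


def host_of(value: str) -> str:
    """Return the hostname of a URL or of a bare 'host/path' string."""
    if "://" not in value:
        value = "//" + value  # lets urlsplit treat the first segment as the host
    return urlsplit(value).hostname or ""


def supervisor_registry_host(supervisor_conf: str) -> str:
    """Host part of the SUPERVISOR_IMAGE= line, or '' if the line is absent."""
    for line in supervisor_conf.splitlines():
        key, _, value = line.partition("=")
        if key.strip() == "SUPERVISOR_IMAGE":
            return host_of(value.strip())
    return ""


def openvpn_remote_host(openvpn_conf: str) -> str:
    """Host from the first 'remote <host> <port>' line, or '' if absent."""
    for line in openvpn_conf.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "remote":
            return parts[1]
    return ""


def compare(config_json: str, supervisor_conf: str, openvpn_conf: str) -> dict:
    """Map each setting to a (requested host, effective host) pair."""
    cfg = json.loads(config_json)
    return {
        "registry": (host_of(cfg.get("registryEndpoint", "")),
                     supervisor_registry_host(supervisor_conf)),
        "vpn": (host_of(cfg.get("vpnEndpoint", "")),
                openvpn_remote_host(openvpn_conf)),
    }
```

A pair whose two hosts differ means the config.json value was not applied to that file.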

Hello @scscsc thanks for your message.

Could you please explain what you are trying to achieve here?

Thanks

@mpous - We have a customer that does not allow any direct communications to AWS (long story), and thus we attempted to put a Cloudflare worker proxy + Spectrum in front of the AWS backed balena-cloud hostnames, but it turns out that balenaOS doesn’t actually use the registryEndpoint or vpnEndpoint. Rather they come from the response in https://api.balena-cloud.com/os/v1/config … and since we were proxying that response without modifying it, balenaOS was still talking to the original hostnames. We made it a bit further by having our CF worker proxy modify the v1/config responses in-flight… but didn’t want to mess with swapping vpn certs and whatnot. We are now exploring open-balena - however, it looks like balena-supervisor seems to always come from registry2.balena-cloud.com … which is causing us a lot of headaches.
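To illustrate the in-flight rewriting idea in general terms, here is a rough Python sketch (not the actual Cloudflare worker code). It treats the proxied body as opaque text and swaps upstream hostnames for masked ones. The replacement hostnames are placeholders, and nothing here assumes a particular shape for the /os/v1/config response.

```python
def rewrite_hosts(body: str, host_map: dict[str, str]) -> str:
    """Replace every occurrence of each upstream hostname with its masked name."""
    # Longest names first, so a short name can never clobber part of a longer one.
    for upstream in sorted(host_map, key=len, reverse=True):
        body = body.replace(upstream, host_map[upstream])
    return body


# Hypothetical mapping: the right-hand names are placeholders, not real hosts.
HOST_MAP = {
    "registry2.balena-cloud.com": "registry2.proxy.example.test",
    "cloudlink.balena-cloud.com": "cloudlink.proxy.example.test",
}
```

Plain substring replacement is the simplest option, but it is blind to context and would also touch certificates or signatures that embed the hostname, which is part of why this approach gets fragile quickly.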

Also, the redsocks proxy stuff won’t work here, because the customer has their own outbound traffic proxy we are required to use - meaning we can’t have redsocks point to that and then have another proxy beyond it. Additionally, the VPN never goes over the redsocks proxy, which is the next headache we have to deal with for this deployment.

If balena-cloud ever offers custom hostnames as a paid feature, we’d be the first in line to pay for it.

Thanks @scscsc for the clarification.

I understand that you tried to connect from behind a proxy using the redsocks example in "Network Setup on balenaOS - Balena Documentation", right? Could you please confirm which proxy your customer is using?

Hi, following up on your answer: configuring the proxy settings in the hostOS using redsocks is the solution our customers use to deploy in situations like the one you describe. There are a couple of statements above that would benefit from clarification:

meaning we can’t have redsocks point to that and then have another proxy beyond it

Could you please extend the above statement? Why is configuring the hostOS to use the outbound proxy a problem?

the VPN never goes over the redsocks proxy,

Once the hostOS is configured, all TCP traffic is routed via the proxy. It can also be configured so that UDP DNS requests are proxied through it. If that’s not working or there is something particular about this setup we would like to know so we can investigate what the problem is exactly.

@alexgg -

Once the hostOS is configured, all TCP traffic is routed via the proxy. It can also be configured so that UDP DNS requests are proxied through it. If that’s not working or there is something particular about this setup we would like to know so we can investigate what the problem is exactly.

The first thing the README of balena-labs-projects/proxy-tunnel on GitHub (which is linked from the Network setup page) says is:

redirect all TCP traffic (except for VPN)

On balenaOS it appears that redsocks only proxies traffic for the bridge interface, which does not include the resin-vpn interface. Maybe I’m missing some special config I need to include VPN and DNS traffic over the proxy?

Could you please extend the above statement? Why is configuring the hostOS to use the outbound proxy a problem?

  • Easy: enterprise --> balena-cloud - in this case, we can simply use the proxy to mask balena-cloud hostnames
  • Hard: enterprise --redsocks--> required enterprise-specific-forward traffic proxy --> balena-cloud - some enterprises will ONLY allow outbound traffic through their own forward/outbound proxy, where they often do MITM inspection of traffic and very tight filtering on destination hostnames/IPs/etc.

In the Easy case, we can simply introduce our own proxy server to mask balena-cloud hostnames: enterprise --redsocks--> our proxy --> balena-cloud. In the Hard case, there is no way to do enterprise --redsocks--> required enterprise-specific-forward traffic proxy --> our-proxy --> balena-cloud, which means the enterprise-specific forward traffic proxy still sees that balena-cloud’s S3 and registry domains are backed by AWS and blocks the traffic.

Using cloudflare to front those domains would fix this, but since balena-cloud doesn’t offer custom hostnames per tenant, we went down the path of building our own reverse-proxy so that it is transparent to the enterprise forward proxy, but that requires that we make changes to device config in-flight back to the gateway, which in turn opens up a can of worms that isn’t worth the headache (for example, there is a /v2/config proposal, which would immediately knock all of our devices offline).

Hi, thanks for the link to the proxy-tunnel documentation - I am asking internally to understand why that is the case.

I have done a simple test to confirm that VPN traffic is also proxied using my own computer as proxy:

  1. On my workstation, I run the glider proxy:
docker run --rm -it --net host nadoo/glider -verbose -listen :8123
  2. I configure a balenaOS device to use it by adding the following /mnt/boot/system-proxy/redsocks.conf and rebooting:
base {
log_debug = off;
log_info = on;
log = stderr;
daemon = off;
redirector = iptables;
}

redsocks {
type = socks5;
ip = ${SERVER_IP};
port = 8123;
local_ip = 127.0.0.1;
local_port = 12345;
}

Where SERVER_IP is the IP address of the workstation running the proxy.
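If you want to generate that file rather than edit it by hand, a small Python sketch along these lines fills in the ${SERVER_IP} placeholder. The function name and the PROXY_PORT parameter are additions for illustration; the IP check is also an addition, based on the description of SERVER_IP as an IP address.

```python
import ipaddress
from string import Template

# Same settings as the redsocks.conf shown above, with the port made a parameter.
REDSOCKS_TEMPLATE = Template("""\
base {
log_debug = off;
log_info = on;
log = stderr;
daemon = off;
redirector = iptables;
}

redsocks {
type = socks5;
ip = ${SERVER_IP};
port = ${PROXY_PORT};
local_ip = 127.0.0.1;
local_port = 12345;
}
""")


def render_redsocks_conf(server_ip: str, proxy_port: int = 8123) -> str:
    """Return redsocks.conf text pointing at the given SOCKS5 proxy."""
    ipaddress.ip_address(server_ip)  # raises ValueError if this is not an IP address
    return REDSOCKS_TEMPLATE.substitute(SERVER_IP=server_ip, PROXY_PORT=proxy_port)
```

For example, `render_redsocks_conf("192.0.2.10")` returns the text to write to /mnt/boot/system-proxy/redsocks.conf, where 192.0.2.10 stands in for the workstation's address.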

The device reboots and I can see it appear in the dashboard. If I then stop the glider proxy on my computer, the VPN connection is lost. You can check the openvpn logs on the device to see how the device is trying to connect via the proxy.

Thanks for the extended explanation - a good understanding of the use case will add weight to the custom domains per tenant feature request.

So, the problem is not that balenaOS cannot be configured to use the enterprise proxy, but that AWS backed domains are not allowed by this enterprise proxy, and using a different proxy is not an option.

One more thing, about the transparent reverse proxy, when you say:

 there is a /v2/config proposal, which would immediately knock all of our devices offline

Are you referring to the supervisor API? The reason to have versioned endpoints afaik is to introduce breaking changes with backwards compatibility, so /v1/config would still work.

A final note, I have added a feature request to our public roadmap, please upvote and leave any further explanatory comment that you think is missing.

Thanks for the VPN example. We’ll dig into this to see if we can confirm.

@alexgg / @mpous - Even with us moving fully to a self-hosted open-balena instance (at *.edge.docker.localhost), and config.json fully pointed to it:

{
    "applicationId": 1,
    "userId": 3,
    "apiKey": "...omitted...",
    "deviceType": "generic-amd64",
    "appUpdatePollInterval": 60000,
    "listenPort": 48484,
    "vpnPort": 443,
    "apiEndpoint": "https://api.edge.docker.localhost",
    "vpnEndpoint": "cloudlink.edge.docker.localhost",
    "registryEndpoint": "registry.edge.docker.localhost",
    "deltaEndpoint": "https://delta.edge.docker.localhost",
    "mixpanelToken": "unused",
    "logsEndpoint": "https://logs.edge.docker.localhost",
    "balenaRootCA": "...omitted..."
}

The /mnt/state/root-overlay/etc/balena-supervisor/supervisor.conf still defaults to:

SUPERVISOR_IMAGE=registry2.balena-cloud.com/v2/50dd4f24898632338c0d4f7016f39c03
SUPERVISOR_VERSION=v14.12.0
LED_FILE=/dev/null

and the resulting balena-supervisor config.v2.json file ends up with:

      "DOCKER_ROOT=/mnt/root/var/lib/docker",
      "DOCKER_SOCKET=/var/run/balena-engine.sock",
      "BOOT_MOUNTPOINT=/mnt/boot",
      "MIXPANEL_TOKEN=unused",
      "LED_FILE=/dev/null",
      "DELTA_ENDPOINT=https://delta.edge.docker.localhost",
      "LISTEN_PORT=48484",
      "NODE_EXTRA_CA_CERTS=",
      "SUPERVISOR_IMAGE=registry2.balena-cloud.com/v2/50dd4f24898632338c0d4f7016f39c03",
      "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
      "VERSION=master",
      "SUPERVISOR_CONTAINER_ID=aa2672052c1ebc0e6c296cc4d13922302638c1dad8c2b467b4232bef8c24c122"

How do I get balenaOS to obey the registryEndpoint from config.json instead of always using registry2.balena-cloud.com?
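To spot leftovers like this quickly, here is a small Python sketch that takes environment entries in the KEY=VALUE form shown in the config.v2.json excerpt above and lists the ones whose value still mentions a given domain. The helper names are made up, and it assumes the entries have already been extracted into a list of strings.

```python
def env_to_dict(env: list[str]) -> dict[str, str]:
    """Turn ['KEY=VALUE', ...] into a dict, splitting on the first '=' only."""
    result = {}
    for entry in env:
        key, _, value = entry.partition("=")
        result[key] = value
    return result


def entries_mentioning(env: list[str], needle: str = "balena-cloud.com") -> dict[str, str]:
    """Return only the variables whose value contains the given substring."""
    return {k: v for k, v in env_to_dict(env).items() if needle in v}
```

Run against the excerpt above, only SUPERVISOR_IMAGE should come back, which matches the observation that the other endpoints already point at the open-balena instance.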

Looks like it is hardcoded at build time into supervisor.conf by https://github.com/balena-os/meta-balena/blob/master/meta-balena-common/recipes-containers/balena-supervisor/balena-supervisor/update-balena-supervisor and there doesn’t seem to be anything in entry.sh that makes any attempt to reference config.json before fetching the supervisor container.

The openvpn.conf file is populated correctly from the open-balena instance with the right hostname, so only the hardcoded supervisor.conf URL remains an issue.

Is https://github.com/balena-os/meta-balena/blob/master/meta-balena-common/recipes-support/balena-config-vars/balena-config-vars/balena-config-vars the right place to check config.json for a registryEndpoint and update supervisor.conf?

@alexgg / @mpous - any thoughts on this?

Hi, the supervisor.conf file includes the details about the preloaded supervisor image - the supervisor is deployed to the balena-cloud.com registry, so the image points at it. It will also be updated when the supervisor updates. You can think of the balena-cloud.com registry as a public registry that can be used by the device to pull images, even if it’s managed by openBalena, kind of like dockerhub.

What’s the problem you have with that being the case? If you configure the proxy feature in the device, these pulls will also be proxied.

@scscsc,
Since it’s been a bit, I wanted to ping you directly to make sure you saw AlexG’s response above. Hopefully the notification will grab your attention. :)

Sorry, totally missed this…

Thanks @alexgg and @the-real-kenna!

While I appreciate that the balena team manages/updates supervisor on balena-cloud, some of our customers flat out refuse to allow any traffic to balena-cloud.com. For this reason we went entirely self-hosted open-balena … but still have an awkward link back to balena-cloud for the supervisor image.

To clearly answer your question of “What’s the problem you have with that being the case? If you configure the proxy feature in the device, these pulls will also be proxied.” - this particular customer happens to be a major retailer that competes with Amazon, and requires that all outbound traffic go through an HTTP CONNECT proxy where individual hostnames must be whitelisted. registry2.balena-cloud.com is a CNAME that resolves to a handful of amazonaws.com records - and thus will never be approved by this customer.
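For reference, this is roughly how one could check what a hostname resolves through, sketched in Python. `socket.gethostbyname_ex` is a standard-library call that returns a canonical name, a list of aliases, and a list of addresses; which names it reports depends on the local resolver, so treat the output as indicative only. The helper names are made up.

```python
import socket


def aws_backed(names: list[str]) -> list[str]:
    """Return the names that sit under amazonaws.com (a trailing dot is tolerated)."""
    return [n for n in names if n.rstrip(".").lower().endswith(".amazonaws.com")]


def resolved_names(hostname: str) -> list[str]:
    """Canonical name plus aliases as reported by the local resolver (needs network)."""
    canonical, aliases, _addresses = socket.gethostbyname_ex(hostname)
    return [canonical, *aliases]
```

Something like `aws_backed(resolved_names("registry2.balena-cloud.com"))` would then list any amazonaws.com names the resolver reports for that host.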

It would be fantastic if we could add open-balena-registry-proxy to our open-balena stack and then configure open-balena-api to return a supervisorRegistryEndpointProxy option when balenaOS calls https://api.my-open-balena.com/os/v1/config. balenaOS can then just change the hostname in SUPERVISOR_IMAGE, but leave the rest of the URL intact.

I’d even be fine throwing our own CNAME in front of registry2.balena-cloud.com that uses Cloudflare proxying, but that requires the api server be capable of returning our custom CNAME and balenaOS actually applying it to SUPERVISOR_IMAGE and supervisor.conf.
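To make the request concrete, here is a Python sketch of the transformation being asked for: swap only the registry host in SUPERVISOR_IMAGE and leave the rest of the reference untouched. This is an illustration of the proposal, not something balenaOS does today and not a supported procedure; the function names are made up.

```python
def swap_registry_host(image: str, new_host: str) -> str:
    """Replace the registry host of 'host/path' and keep the path as-is."""
    _old_host, sep, rest = image.partition("/")
    if not sep:
        raise ValueError(f"no registry host in image reference: {image!r}")
    return f"{new_host}/{rest}"


def rewrite_supervisor_conf(conf: str, new_host: str) -> str:
    """Return supervisor.conf text with only the SUPERVISOR_IMAGE host changed."""
    lines = []
    for line in conf.splitlines():
        key, sep, value = line.partition("=")
        if sep and key == "SUPERVISOR_IMAGE":
            line = f"{key}={swap_registry_host(value, new_host)}"
        lines.append(line)
    return "\n".join(lines) + "\n"
```

With the supervisor.conf quoted earlier and `registry.edge.docker.localhost` as the new host, the first line would become `SUPERVISOR_IMAGE=registry.edge.docker.localhost/v2/50dd4f24898632338c0d4f7016f39c03`, with the other lines unchanged.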

Hello, while it is technically possible to populate your local openBalena registry instance(s) with Supervisor images from our public registry using balena CLI/SDK, it is quite verbose/involved. We do offer balenaMachine, which is a private copy of balenaCloud and which can run within your customer’s network(s). That codebase does import Supervisor images locally, so that could be a possible solution…

We went down the balena-machine route with the balena team and it didn’t work for our use case for quite a few reasons.

Mirroring images from an existing container registry is pretty straightforward; we just need balenaOS to allow us to override the SUPERVISOR_IMAGE URL or base hostname.