Supervisor fails to resolve DNS on v4, v5 in offline/air-gapped setup using open-balena

We deploy open-balena to an air-gapped network where the router resolves all the required balena domains (e.g. api.aivero.lan) and advertises itself as the DNS server via DHCP.

We configure RaspberryPi3 images of balenaOS v2.80.3 with balena os configure, and these devices connect nicely.

We also tried adding a dnsServers: "null" entry to config.json to disable the automatic injection of 8.8.8.8 into the list of DNS servers. In certain cases, having 8.8.8.8 in the list caused a timeout while waiting for a response from that server, which is unreachable in our air-gapped network.
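
For anyone wanting to reproduce the config.json change, here is a minimal sketch of how the entry could be added programmatically. The helper name set_dns_servers is hypothetical, and the path in the usage note (/mnt/boot/config.json) is an assumption about where config.json is mounted on the device; adjust it if you edit the file on the image before flashing instead.

```python
import json


def set_dns_servers(config_path, value="null"):
    """Set the dnsServers key in a balenaOS config.json and return the result.

    Hypothetical helper for illustration. The value is the *string* "null",
    as used in the post above, not a JSON null.
    """
    with open(config_path) as f:
        config = json.load(f)

    # Keep every existing key untouched and only add/overwrite dnsServers.
    config["dnsServers"] = value

    with open(config_path, "w") as f:
        json.dump(config, f, indent=2)
    return config
```

On a device this might be called as set_dns_servers("/mnt/boot/config.json"), followed by a reboot so the OS picks up the change.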

These older images lack the fixed/updated HQ camera sensor-mode 5, though, so we need a newer version.

However, the newest versions, v5.0.8 and v2.115.18+rev2, do not connect to open-balena. The supervisor fails with getaddrinfo EAI_AGAIN api.aivero.lan:

root@9dc1123:~# balena ps
CONTAINER ID   IMAGE                                                            COMMAND                  CREATED          STATUS                             PORTS     NAMES
c699ff174f56   registry2.balena-cloud.com/v2/c5636e5430e2762232e60e19e79c773f   "/usr/src/app/entry.…"   49 seconds ago   Up 41 seconds (health: starting)             balena_supervisor
root@9dc1123:~# balena logs c699ff174f56 -f
INFO: Found device /dev/mmcblk0p1 on current boot device mmcblk0, using as mount for '(resin|balena)-boot'.
INFO: Found device /dev/mmcblk0p5 on current boot device mmcblk0, using as mount for '(resin|balena)-state'.
INFO: Found device /dev/mmcblk0p6 on current boot device mmcblk0, using as mount for '(resin|balena)-data'.
find: /mnt/root/tmp/balena-supervisor/services: No such file or directory
[info]    Supervisor v15.0.4 starting up...
[info]    Setting host to discoverable
[debug]   Starting systemd unit: avahi-daemon.service
[debug]   Starting systemd unit: avahi-daemon.socket
[debug]   Starting logging infrastructure
[info]    Starting firewall
[warn]    Invalid firewall mode: . Reverting to state: off
[info]    Applying firewall mode: off
[success] Firewall mode applied
[debug]   Starting api binder
[debug]   Performing database cleanup for container log timestamps
[info]    Previous engine snapshot was not stored. Skipping cleanup.
[debug]   Handling of local mode switch is completed
(node:1) [DEP0005] DeprecationWarning: Buffer() is deprecated due to security and usability issues. Please use the Buffer.alloc(), Buffer.allocUnsafe(), or Buffer.from() methods instead.
(Use `node --trace-deprecation ...` to show where the warning was created)
[info]    API Binder bound to: https://api.aivero.lan/v6/
[event]   Event: Supervisor start {}
[info]    Starting API server
[info]    Supervisor API successfully started on port 48484
[debug]   Ensuring device is provisioned
[debug]   Connectivity check enabled: true
[debug]   Starting periodic check for IP addresses
[event]   Event: Device bootstrap {}
[info]    Waiting for connectivity...
[info]    VPN connection is not active.
[info]    New device detected. Provisioning...
[success] Initialised splash image backend
[info]    Reporting initial state, supervisor version and API info
[info]    Attempting to load any preloaded applications
[error]   LogBackend: unexpected error: Error: getaddrinfo EAI_AGAIN api.aivero.lan
[error]         at GetAddrInfoReqWrap.onlookupall [as oncomplete] (node:dns:119:26)
[event]   Event: Device bootstrap failed, retrying {"delay":30000,"error":{"cause":{},"isOperational":true,"errno":-3001,"code":"EAI_AGAIN","syscall":"getaddrinfo","hostname":"api.aivero.lan"}}
^C
root@9dc1123:~# ^C
root@9dc1123:~# ping api.aivero.lan
PING api.aivero.lan (192.168.88.243): 56 data bytes
64 bytes from 192.168.88.243: seq=0 ttl=64 time=1.528 ms
64 bytes from 192.168.88.243: seq=1 ttl=64 time=1.777 ms
^C

In the hostOS we can nslookup, ping or curl api.aivero.lan just fine.

Inside the supervisor container, nslookup resolves it to the correct IP but reports it as a non-authoritative answer.
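
Since nslookup queries the DNS server directly while the failing call in the log is getaddrinfo, it may help to test through getaddrinfo as well. Below is a small sketch that does this from Python, whose socket.getaddrinfo wraps the same libc call. It assumes a Python 3 interpreter is available wherever you run it (the supervisor container may not ship one), and the function name check_resolution and the default port 443 are hypothetical choices for illustration.

```python
import socket


def check_resolution(hostname, port=443, resolver=socket.getaddrinfo):
    """Resolve hostname through getaddrinfo and return (ok, detail).

    The resolver parameter exists only so the lookup can be stubbed out;
    by default the real socket.getaddrinfo is used.
    """
    try:
        results = resolver(hostname, port, proto=socket.IPPROTO_TCP)
    except socket.gaierror as err:
        # EAI_AGAIN is the "temporary failure in name resolution" code,
        # the same error class the supervisor log reports.
        if err.errno == socket.EAI_AGAIN:
            return False, "EAI_AGAIN (temporary failure in name resolution)"
        return False, f"gaierror {err.errno}: {err.strerror}"

    # Each result is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] holds the IP address.
    addresses = sorted({info[4][0] for info in results})
    return True, ", ".join(addresses)
```

Running check_resolution("api.aivero.lan") on the hostOS and in a container on the same device would show whether the getaddrinfo path itself behaves differently in the two environments, independent of what nslookup reports.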


@acostach any insights here? Thank you :slight_smile:

We have a temporary workaround:

The latest balenaOS version for RaspberryPi3 that has the HQ camera fix AND connects correctly is v2.94.4.

For the RaspberryPi4, version v2.88.4+rev0 has both the HQ fix AND connects correctly.


The question is how we can get the new 5.x.x versions fixed so that both RaspberryPi3 and RaspberryPi4 connect correctly.