Issues with DBus in generic AARCH64 image

Hello,

We’re trying to build an application for an AARCH64 device. Because we cannot flash balenaOS onto the device, we decided to run balenaOS in Docker, using the generic AARCH64 option available in balenaCloud.
I think the setup is correct, as all our containers are automatically deployed to balenaOS and started. However, one of our containers fails to run. That container uses DBus, and its logs show the following error:

(node:33) UnhandledPromiseRejectionWarning: Error initializing network manager: Error: Failed to connect to socket /host/run/dbus/system_bus_socket: No such file or directory
(Use node --trace-warnings ... to show where the warning was created)
(node:33) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). To terminate the node process on unhandled promise rejection, use the CLI flag --unhandled-rejections=strict (see Command-line options | Node.js v15.9.0 Documentation). (rejection id: 2)
(node:33) [DEP0018] DeprecationWarning: Unhandled promise rejections are deprecated. In the future, promise rejections that are not handled will terminate the Node.js process with a non-zero exit code.

As you can see, the container can’t access the dbus system bus socket on the host container, even though it’s there. We found that this behavior occurs when the dbus label is missing from the docker-compose file, but we did add this label. This is part of our docker-compose file:

    environment:
      - DBUS_SYSTEM_BUS_ADDRESS=unix:path=/host/run/dbus/system_bus_socket
    labels:
      io.balena.features.dbus: '1'
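
For anyone debugging the same error, here is a minimal sketch of how the path behind that address could be checked from inside the container. The helper names `unix_socket_path` and `describe_socket` are made up for illustration, and the sketch assumes a single `unix:path=...` address (it only looks at the first entry if several are separated by `;`).

```python
import os
import stat


def unix_socket_path(address: str) -> str:
    """Extract the filesystem path from a D-Bus address such as
    'unix:path=/host/run/dbus/system_bus_socket' (hypothetical helper)."""
    first = address.split(";")[0]  # only inspect the first listed address
    transport, _, params = first.partition(":")
    if transport != "unix":
        raise ValueError(f"not a unix transport: {address!r}")
    for item in params.split(","):
        key, _, value = item.partition("=")
        if key == "path":
            return value
    raise ValueError(f"no path= key in {address!r}")


def describe_socket(path: str) -> str:
    """Report whether path is missing, a socket, or something else."""
    try:
        mode = os.stat(path).st_mode
    except (FileNotFoundError, NotADirectoryError):
        return "missing"
    return "socket" if stat.S_ISSOCK(mode) else "not a socket"


if __name__ == "__main__":
    # Fall back to the address from the compose file if the variable is unset.
    address = os.environ.get(
        "DBUS_SYSTEM_BUS_ADDRESS",
        "unix:path=/host/run/dbus/system_bus_socket",
    )
    path = unix_socket_path(address)
    print(path, "->", describe_socket(path))
```

Run inside the failing container, this prints the resolved path followed by `missing`, `socket`, or `not a socket`, which helps separate a missing bind mount from a permissions or protocol problem.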

I’m not sure how to continue to get this container working in Balena. Do you have any idea what we’re doing wrong or what we have to do to fix this issue?
Thanks in advance for the help.

Hi,

Can you please share with us your docker-compose file so we can have a look?

The docker-compose file:

version: "2.1" # Currently the maximum version that is supported by Balena, see Multiple containers - Balena Documentation

volumes:
  prometheus-data:
  grafana-data:
  meter2mqtt-sqlite:
  meter2mqtt-secrets:
  auth-sqlite:
  auth-private:
  auth-public:

services:
  prometheus:
    build:
      context: .
      dockerfile: dockerfiles/prometheus
    volumes:
      - prometheus-data:/prometheus
    restart: "unless-stopped"
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.path=/prometheus'
      - '--storage.tsdb.retention.time=92d' # Should be 3 months, but months are not available. Make it 92d (92 days) instead.

  node-exporter:
    image: prom/node-exporter:latest
    restart: "unless-stopped"

  grafana:
    build:
      context: .
      dockerfile: dockerfiles/grafana
    depends_on:
      - prometheus
    volumes:
      - grafana-data:/var/lib/grafana
    environment:
      - GF_USERS_ALLOW_SIGN_UP=false
      - GF_SECURITY_ALLOW_EMBEDDING=true
      - GF_USERS_DEFAULT_THEME=light
      - GF_DASHBOARDS_DEFAULT_HOME_DASHBOARD_PATH=/etc/grafana/provisioning/dashboards-json/skdbyte_dashboard.json
      - GF_AUTH_LOGIN_COOKIE_NAME=skdbyte_refresh_token
      - SKDBYTE_PROMETHEUS_URL=http://localhost:9090
      - GF_SECURITY_ADMIN_USER=Admin
      - GF_SECURITY_ADMIN_PASSWORD=admin
      - SKDBYTE_GRAFANA_URL=http://localhost:3000
    restart: "unless-stopped"

  meter2mqtt:
    build:
      context: .
      dockerfile: dockerfiles/meter2mqtt
    working_dir: /home/node/app
    environment:
      - NODE_ENV=production
      - NODE_PATH=/home/node/app/node_modules
      - TYPEORM_CONNECTION=sqlite
      - TYPEORM_DATABASE=/home/sqlite/meter2mqtt.sqlite
      - TYPEORM_LOGGING=false
      - TYPEORM_ENTITIES=/home/node/dist/Entity/*.js
      - TYPEORM_MIGRATIONS=/home/node/dist/Migrations/*.js
      - TYPEORM_MIGRATIONS_TABLE_NAME=migrations
      - SKDBYTE_SUPERVISE_INTERVAL=2000
      - SKDBYTE_MQTT_ADDRESS=mqtt://mqtt
      - SKDBYTE_DEFAULT_METER_POLL_INTERVAL=3600000
      - SKDBYTE_MAX_SLAVES_WITHOUT_LICENSE=10
      - SKDBYTE_PROMETHEUS_URL=http://localhost:9090
    ports:
      - "8002:8002"
    volumes:
      - meter2mqtt-sqlite:/home/sqlite
      - meter2mqtt-secrets:/home/secrets/public
    depends_on:
      - mqtt
    restart: "unless-stopped"
    command: "npm run start"

  mqtt:
    image: "eclipse-mosquitto:1.6.13"
    networks:
      - skdbyte
    restart: "unless-stopped"

  mqtt2prometheus:
    build:
      context: .
      dockerfile: dockerfiles/mqtt2prometheus
    working_dir: /home/node/app
    ports:
      - "8003:8003" # Must be bound to in order for the prometheus scraper to access it.
    environment:
      - NODE_ENV=production
      - NODE_PATH=/home/node/app/node_modules
      - SKDBYTE_LICENSED=1
    depends_on:
      - mqtt
    restart: "unless-stopped"
    command: "npm run start"

  admin:
    build:
      context: .
      dockerfile: dockerfiles/admin
    working_dir: /home/node/app
    environment:
      - NODE_ENV=production
      - NODE_PATH=/home/node/app/node_modules
      - REACT_APP_PORT_AUTH=8001
      - REACT_APP_PORT_CONFIG=8001
      - REACT_APP_PORT_METER=8002
      - REACT_APP_PORT_GRAFANA=3000
    ports:
      - "8000:3000"
    depends_on:
      - auth
    restart: "unless-stopped"
    command: "./node_modules/serve/bin/serve.js -s build -l 3000"

  auth:
    build:
      context: .
      dockerfile: dockerfiles/auth
    working_dir: /home/node/app
    environment:
      - NODE_ENV=production
      - NODE_PATH=/home/node/app/node_modules
      - TYPEORM_CONNECTION=sqlite
      - TYPEORM_DATABASE=/home/sqlite/auth.sqlite
      - TYPEORM_LOGGING=false
      - TYPEORM_ENTITIES=/home/node/dist/Entity/*.js
      - TYPEORM_MIGRATIONS=/home/node/dist/Migrations/*.js
      - TYPEORM_MIGRATIONS_TABLE_NAME=migrations
      - DISPLAY=:0
      - DBUS_SYSTEM_BUS_ADDRESS=unix:path=/host/run/dbus/system_bus_socket
      - JWT_TOKEN_TTL=5 minutes
      - JWT_REFRESH_TOKEN_NAME=skdbyte_refresh_token
      - GF_SECURITY_ADMIN_USER=Admin
      - GF_SECURITY_ADMIN_PASSWORD=admin
      - SKDBYTE_GRAFANA_URL=http://localhost:3000
      - SKDBYTE_LICENSED=1
    labels:
      io.balena.features.dbus: '1'
    volumes:
      - auth-sqlite:/home/sqlite
      - auth-private:/home/secrets/private
      - auth-public:/home/secrets/public
    restart: "unless-stopped"
    command: "npm run start:prod"
    privileged: true

We use this script to start the balenaos container: balenaos-in-container/balenaos-in-container.sh at master · balena-os/balenaos-in-container · GitHub

We removed the DNS part on line 162, as it was causing DNS issues in the container.

We start the container with the following command:
./balenaos-in-container.sh --image resin/resinos:2.12.4_rev1.dev-generic-aarch64 --id test -c /config.json --detach --extra-args "-p 3000:3000 -p 8000:8000 -p 8001:8001 -p 8002:8002 -p 8003:8003"

The auth container is the one with the dbus issues btw.

Do you have any update on this? Just to be sure that this post isn’t forgotten.

This is quite interesting; I’m not sure I’ve ever seen dbus being accessed from a service on balenaOS in a container like this. Just to clarify, would you like the auth service to access the dbus of the hostOS or of balenaOS?

We would like to access the dbus of the hostOS. We need access to the WLAN and other network cards. It would also be a solution if this were possible with the dbus of balenaOS while it’s running in a container.

Hi Thijs,

Kyle suggested adding a bind mount to your container as an extra argument to docker, i.e.,

-b /run/dbus/system_bus_socket:/run/dbus/system_bus_socket

I would also suggest adding the argument --network host, as this will give you complete access to the device’s network, and you also won’t have to expose every port.

If you don’t want balenaOS to share the network with the host os, I think you can just use the argument

--cap-add NET_ADMIN

Although, there might be some more privileges you need to add (see here).

First I would go with sharing the network just for debugging purposes at least.

I’ve tried adding that bind mount (I assume you meant -v instead of -b; that’s what I tried at least), but this doesn’t seem to work. I’ve played around a bit and figured out that any bind mount I create in /run on balenaOS isn’t actually there. For example, if I bind mount /run/dbus/system_bus_socket:/run/testfolder/system_bus_socket, there won’t be a folder named testfolder in /run. However, if I check the volumes of the container, it does say the bind mount is there. I’m not sure why this is happening, but the only explanation I could come up with is that the /run folder gets overridden during container creation. If I make a bind mount like this: /run/dbus/system_bus_socket:/root/dbus/system_bus_socket, the file is there. However, I can’t configure balena to look for the dbus socket in any location other than /run/dbus/system_bus_socket.

I’ve tried adding --network host, but this didn’t help. The only noticeable change is that I couldn’t access the device via SSH anymore (it lost its IP; I managed to fix this by restarting the device).

If I use the --cap-add NET_ADMIN argument I don’t have this issue, but the auth container is still complaining that it can’t find /host/run/dbus/system_bus_socket (which is there in the balenaOS container, but I don’t think it’s the one from the hostOS).

Ah yes, I forgot that /run is remounted as tmpfs in balenaOS at runtime, so we can’t put mounts there for now. Let me try some things and I’ll get back to you.
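
To illustrate why a bind mount under /run disappears once a tmpfs is mounted over /run, here is a small sketch that works on /proc/mounts-style text. The function names and the sample mount table are hypothetical, and it assumes the table lists entries in the order they were mounted, so that a later mount hides earlier mounts at or below its mount point.

```python
# Hypothetical mount table: two bind mounts created first, then /run remounted as tmpfs.
SAMPLE_MOUNTS = """\
overlay / overlay rw 0 0
tmpfs /run/dbus/system_bus_socket tmpfs rw 0 0
tmpfs /root/dbus/system_bus_socket tmpfs rw 0 0
tmpfs /run tmpfs rw,nosuid,nodev 0 0
"""


def parse_mounts(text):
    """Parse /proc/mounts-style text into (mount_point, fstype) pairs, in order."""
    mounts = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            mounts.append((fields[1], fields[2]))
    return mounts


def visible_mounts(mounts):
    """Drop mounts that a later mount at the same or a parent path has hidden."""
    visible = []
    for mount_point, fstype in mounts:
        prefix = mount_point.rstrip("/") + "/"
        visible = [
            (mp, fs) for mp, fs in visible
            if not (mp == mount_point or mp.startswith(prefix))
        ]
        visible.append((mount_point, fstype))
    return visible


def covering_mount(path, mounts):
    """Return the visible (mount_point, fstype) that path resolves into."""
    best = None
    for mount_point, fstype in visible_mounts(mounts):
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            if best is None or len(mount_point) > len(best[0]):
                best = (mount_point, fstype)
    return best


if __name__ == "__main__":
    mounts = parse_mounts(SAMPLE_MOUNTS)
    for path in ("/run/dbus/system_bus_socket", "/root/dbus/system_bus_socket"):
        print(path, "->", covering_mount(path, mounts))
```

With the sample table, the path under /run resolves into the later tmpfs at /run rather than into the bind mount, while the path under /root still resolves to its own bind mount. On a real device, the contents of /proc/mounts could be passed in instead of the sample.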

So I think the best way to achieve this is to configure the host OS dbus to listen on a local network socket (in addition to the default unix file socket). This can be done via system.conf (the system bus configuration) in either /etc/dbus-1 or /usr/share/dbus-1, but that is a little out of scope for support, so I’ll leave that part to you.

Then, if you are in host networking mode, you can simply set DBUS_SYSTEM_BUS_ADDRESS=tcp:host=localhost,port=12434, or if you are in bridge networking, you can dynamically determine the address of the host by looking at the assigned container bridge address and subnet. You can also remove io.balena.features.dbus, as it isn’t serving any purpose.
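
As a rough sketch of the bridge-networking case, the snippet below builds the TCP address from the container’s default gateway. Treating the default gateway as the host’s bridge address is an assumption on my part, as are the helper names and the sample routing table; the port 12434 is just the example value from above. It also assumes a little-endian machine, which is how /proc/net/route encodes addresses on typical AARCH64 and x86 systems.

```python
import socket
import struct

# Hypothetical /proc/net/route contents for a container on a 172.17.0.0/16 bridge.
SAMPLE_ROUTES = """\
Iface Destination Gateway Flags RefCnt Use Metric Mask MTU Window IRTT
eth0 00000000 010011AC 0003 0 0 0 00000000 0 0 0
eth0 000011AC 00000000 0001 0 0 0 0000FFFF 0 0 0
"""


def default_gateway(route_table):
    """Return the default gateway as a dotted quad, or None if there is none."""
    for line in route_table.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4:
            continue
        destination, gateway, flags = fields[1], fields[2], int(fields[3], 16)
        # Default route: destination 0.0.0.0 with the RTF_GATEWAY flag (0x2) set.
        if destination == "00000000" and flags & 0x2:
            # The hex value is stored little-endian, so repack it before formatting.
            return socket.inet_ntoa(struct.pack("<L", int(gateway, 16)))
    return None


def tcp_bus_address(host, port):
    """Format a D-Bus TCP address for DBUS_SYSTEM_BUS_ADDRESS."""
    return f"tcp:host={host},port={port}"


if __name__ == "__main__":
    gateway = default_gateway(SAMPLE_ROUTES)
    print(tcp_bus_address(gateway, 12434))
```

Inside a real container, the text of /proc/net/route would replace SAMPLE_ROUTES, and the resulting string could be exported as DBUS_SYSTEM_BUS_ADDRESS before the service starts.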

Specifically it is this line that is the issue in your case, combined with the fact that balenaOS mounts /run as tmpfs.

Another workaround I dismissed was re-purposing another feature label, like io.balena.features.firmware, and sneaking the hostOS socket mount into /lib/firmware, but I can’t recommend that method and it’s definitely way out of scope for support. The TCP socket workaround is the route I would take, depending on how many hosts I needed to configure.

Accessing the dbus via TCP didn’t work completely, as we didn’t seem to have the rights to make some of the necessary calls. But re-purposing the firmware label so that we could access the dbus via the socket did the trick for us. Thanks for your help!