I’m trying to use a Hailo 8L chip in balena. In the Dockerfile for my container I am building the kernel module and downloading the Hailo device firmware. In the entrypoint I load this kernel module and copy the device firmware to /lib/firmware.
Everything seems to work well. The kernel module is loaded and the firmware is copied onto the device. In the host OS I see a /dev/hailo0 entry as expected, however I don’t see this in the container. If I restart the container, I do get a /dev/hailo0 in the container.
The container has privileged: true and the Dockerfile sets UDEV=1, though I am not sure if that last one is necessary. The base image is balenalib/generic-aarch64:bookworm. I also tried balenalib/iot-gate-imx8plus-debian:bookworm (I am running this on a Compulab IOT-Gate device), but this didn’t fix my issue. The host is running balenaOS 6.3.12+rev3.
As far as I know, only privileged: true and io.balena.features.firmware: '1' should be necessary. The rest were experiments to see if they made any difference, but they didn’t appear to.
To summarize:
If I reboot my device, the entrypoint of my container correctly loads the kernel module and device firmware.
The device correctly shows up in the host OS.
The device does not show up in the container.
If I restart the container (balena restart <container_id>), the device does show in the container.
Hi @hgaiser1, what I believe happens is that the container only gets a snapshot of the devices available in the hostOS at container startup time - so when /dev/hailo0 appears in the hostOS after the driver loads, the copy of /dev inside the container is not updated.
If you only need hotplug support, you can make your container replace the tmpfs at /dev with a devtmpfs by running a privileged (or at least CAP_SYS_ADMIN) container and running the following in its startup script:
#!/bin/sh
# Replace the container's snapshot /dev with a live devtmpfs so device
# nodes created in the hostOS after startup become visible in the container.
newdev='/tmp/dev'
mkdir -p "$newdev"
mount -t devtmpfs none "$newdev"
# Preserve the special mounts the container runtime set up inside /dev
mount --move /dev/console "$newdev/console"
mount --move /dev/mqueue "$newdev/mqueue"
mount --move /dev/pts "$newdev/pts"
mount --move /dev/shm "$newdev/shm"
# Swap the new devtmpfs into place
umount /dev
mount --move "$newdev" /dev
# Point /dev/ptmx at the pts instance so pseudo-terminals keep working
ln -sf /dev/pts/ptmx /dev/ptmx
Be aware that enabling hotplugging of devices can be a security risk: the kernel might happily enable HID devices such as keyboards, which can be used to exploit the system. I would explore having a one-off service that loads the driver and then does not run again, and then have a lower-privileged container start after it, gated on a dependency such as an inotify event.
Thanks for your response. I think you’re right that it only gets a snapshot of /dev. Why does this work for USB devices like keyboards and mice? They get added after the container starts too. Is it because they are advertised through udev, for which there is support in the balena images?
I like the idea of having one container set up the kernel module and firmware, and another service run the actual application. Would depends_on in docker-compose.yaml not work? (Meaning the container that runs my application would depend on the container that loads the kernel module.)
I suppose that won’t work because those dependencies are considered met the moment the container starts (so before it has finished loading the kernel module). I will look into the inotify event like you suggested, though if you have a reference to go by, that would be appreciated :).
Why does this work for USB devices like keyboard and mouse?
If you are using balenalib images with UDEV=1, I would expect udev to take care of those. You could probably also make udev work for your use case, but the balenalib images have other problems.
Would a depends_on in docker-compose.yaml not work?
I suspect depends_on will start the dependent container right after the first one starts, without waiting for the actual device node to be ready, leading to race conditions.
I will look into the inotify event like you said, though if you have some reference to go by that would be appreciated
Basically you install inotify-tools and you add something like this to your entry script:
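For example, something along these lines (a sketch assuming `inotifywait` from inotify-tools; the directory and flag-file name are placeholders, and the demo at the bottom simulates the other container creating the flag):

```shell
#!/bin/sh
# Hypothetical wait loop: block until another container creates a flag file
# on a shared volume, using inotifywait instead of busy-polling.
wait_for_flag() {
    dir="$1"; file="$2"
    mkdir -p "$dir"
    while [ ! -e "$dir/$file" ]; do
        # -t 5 wakes up periodically in case the create event was missed
        inotifywait -qq -t 5 -e create "$dir" 2>/dev/null || true
    done
}

# Demo only: simulate the loader container creating the flag after a second
demo_dir=$(mktemp -d)
( sleep 1; touch "$demo_dir/hailo-ready" ) &
wait_for_flag "$demo_dir" hailo-ready
echo "flag present, starting application"
```

In a real deployment the loader container would `touch` the flag file on the shared volume after the kernel module is loaded, and the application container would call the wait loop at the top of its entrypoint.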
That will make the container wait until the file is created inside the watched directory. It may work directly for /dev, but if not, you can always have a shared volume where one container creates a flag file to signal the other.
I would expect the inotify approach to require a shared volume (with a “flag” file), since the device node is created in the hostOS, not inside the container, when the kernel module gets loaded. An inotify watch on the container’s /dev would wait forever.
Would it work (and does it seem reasonable) if the application container checks in its entrypoint whether the device node exists, and exits if it doesn’t? The container restart policy should then kick in, and it will keep restarting until the device node exists, correct? I could add a sleep before exiting to make it restart less frequently.
The restart mechanics seem to work fine for me. Let me know if there is anything wrong with this approach; otherwise I think that will be the cleanest way forward.
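The restart-policy check described above could be sketched like this (the device path, retry delay, and application command are assumptions, not part of the thread):

```shell
#!/bin/sh
# Hypothetical entrypoint guard: if the device node is absent, exit non-zero
# so the service's restart policy re-runs the container until it appears.
require_device() {
    if [ ! -e "$1" ]; then
        echo "$1 not present yet, exiting so the restart policy retries" >&2
        sleep "${RETRY_DELAY:-5}"   # throttle how often the container restarts
        return 1
    fi
}

# In the real entrypoint this would be followed by exec'ing the application:
# require_device /dev/hailo0 || exit 1
# exec /app/run   # hypothetical application command
```

This relies on the service having restart: always (or similar) in docker-compose.yaml, so a non-zero exit simply schedules another attempt.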