Rebooting device causes devices to disappear

Context

Every morning my devices reboot in order to reestablish their cellular connection.

(This may be an issue in itself. Perhaps I should fix the cell connection problem rather than rebooting, but I’ll still need to reboot occasionally when all other steps to restore the cell connection have failed, so this thread is still relevant.)

The issue began when I upgraded from balenaOS 2.31.5+rev1 to balenaOS 2.45.1+rev1. I’m working on getting my application to work on higher OS versions so downgrading is not a great option.

Problem

When the device reboots, the first container that it starts doesn’t have access to the /dev/video0 device. I thought the device simply needed more time to appear, but even after I added a wait loop that does nothing except check whether /dev/video0 is available, the loop hangs indefinitely.
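
A wait loop of the kind described might look like the following sketch. The helper name wait_for_device and the timeout argument are hypothetical and are not taken from the original startup script; the timeout is added here only so the loop cannot spin forever.

```shell
#!/bin/sh
# Poll for a device node until it appears or a timeout (in seconds) expires.
# wait_for_device is a hypothetical helper, not part of balenaOS.
wait_for_device() {
    device=$1
    timeout=${2:-60}   # assumed default of 60 seconds
    elapsed=0
    until [ -e "$device" ]; do
        if [ "$elapsed" -ge "$timeout" ]; then
            echo "Device $device not found after ${timeout}s" >&2
            return 1
        fi
        echo "Device $device not found"
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}

# Example: wait_for_device /dev/video0 60 || exit 1
```

In the situation reported here, the device node never appears inside the first container, so a loop like this would keep printing the "not found" message until the timeout is reached.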

Restarting the container solves the problem. In other words, this is only an issue for the first container that’s started right after reboot.

Replicating the Problem

All you need to do to see the problem in action is reboot this test device:
https://dashboard.balena-cloud.com/devices/b9e8b35bb8569a31a4b225b41764f5fa/summary

I have granted support access to it. You should expect to see:

Device /dev/video0 not found
Device /dev/video0 not found
Device /dev/video0 not found
...
Device /dev/video0 not found
Device /dev/video0 not found

Non-solutions

A potential solution is:

if "/dev/video0" is connected:
  run startup script
else:
  kill and restart container

I don’t want to do this because:

  1. It’s not the right way to solve the problem: this isn’t happening on older OS versions, and I shouldn’t need to restart a container to get it to connect to a device that’s already there.
  2. I am trying to leave my docker-compose.yml set to restart: "no" so that I get emails when devices aren’t working and can ssh into the container to debug it with its (broken) state preserved.

Hi @cnr,

I’ve just restarted the device to see if I can replicate it. From what I can tell the container is started before the USB bus has finished enumerating the devices. After reboot I was able to shell in and watch it, and the video devices aren’t present until at least 30 seconds after the container has started.

I’ll check with our balenaOS team and see if I can get any insight into what’s happening.

Cheers,
James.

Thank you!

Hi again @cnr,

I’ve had a response from the team. Their suggestion is that, rather than delaying your service until the USB device is available, you start the service with UDEV enabled so that you can detect the camera being “connected”. Here’s a thread discussing a very similar case: Docker container cannot access dynamically plugged USB devices

Hi,

Thank you for this information.

I do not use a balenalib base container, so the UDEV=1 solution will be insufficient for me.

Should I add balenalib’s entry.sh to my own entry script so that my container picks up dynamically plugged devices?

Hi Connor,
that might be worth a try. You will also have to look at the Dockerfile for the OS you are using to see which dependencies to install. I just looked at this one: https://github.com/balena-io-library/base-images/blob/28844485a91b1408ffc550faa3b59e64809bc453/balena-base-images/amd64/ubuntu/bionic/build/Dockerfile . I would think at least udev would have to be installed in the container, but you might need other packages.

Gotcha, I’ll start iterating on that.

In the meantime, can you give me a high-level explanation of why this happens? Why is it that the OS doesn’t wait for devices to be loaded before starting the first container, but by the time a second container comes up it has no trouble passing in the device?

For one thing, as James mentioned above, the device shows up very late, and USB devices can show up at any time because they can be plugged in dynamically. I do not know if it makes sense, or is even possible, to wait for USB device enumeration to complete before starting containers.
Looking at a privileged container in ‘normal’ (non-balena) Docker, I can see that changes to the device file system (like plugging in a new USB disk) generally do not update the /dev folder in a running container. I guess you need UDEV running in the container for that to happen.

This didn’t work. Do you have any hints for getting hotplugged devices to show up in my container?

Hello @cnr ,

Did you try enabling UDEV and it didn’t work for you? Could you explain a bit more what you tried and which step failed?

Hi!

As I said here, I have to implement this manually.

@cnr have you tried running the container in privileged mode and adding UDEV=1 ? That would be the first step to see if video0 gets mounted while the container is running.

Cheers,
Nico.

Maybe I’m confused. The UDEV=1 option is for docker containers based on the balenalib base images, right?

Hi @cnr,

Yes, setting UDEV=1 only works for balenalib-based containers because of their configuration. Is there a reason you are unable to use a balenalib base image? You’re right that it is possible to replicate the udev behaviour of our base images.

Cheers,
James.

Yeah, I had to avoid the balenalib base images for my build because I needed to support some Nvidia-specific stuff. I’ve had the image for a while now, and switching to a balenalib base image won’t be the solution I employ for this fix.

Could you please share any steps and resources you have for replicating the udev behavior of the balenalib base images?

I’ll ask someone on the base images team to jump in here and see if they can provide you with some pointers.

Great thank you!

Would it be helpful if I started a new thread so that

  1. The question can be more relevant to the topic at hand
  2. More people can find the thread more easily?

Hello @cnr

You need to replicate this in your container entrypoint: https://github.com/balena-io-library/base-images/blob/28844485a91b1408ffc550faa3b59e64809bc453/balena-base-images/amd64/ubuntu/bionic/build/entry.sh#L36-L42

  • start udevd or systemd-udevd depending on which one you have available;
  • run udevadm trigger

You can check whether udevd is working by running udevadm monitor and plugging or unplugging some USB devices.
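
The two steps above could be sketched in a custom entrypoint roughly as follows. This is a minimal sketch, not a copy of the balenalib entry.sh: it assumes the container runs privileged and has the udev package installed. The function names find_udev_daemon and start_udev, and the candidate daemon paths, are assumptions to adapt to your own image.

```shell
#!/bin/sh
# Sketch: start a udev daemon inside the container, then replay device events.

# Print the first executable among the given candidate paths.
# Returns 1 if none of them exists (hypothetical helper).
find_udev_daemon() {
    for candidate in "$@"; do
        if [ -x "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Start whichever udev daemon is available, then trigger events so that
# devices that are already attached get their nodes created.
start_udev() {
    daemon=$(find_udev_daemon \
        "$(command -v udevd)" \
        "$(command -v systemd-udevd)" \
        /lib/systemd/systemd-udevd \
        /usr/lib/systemd/systemd-udevd) || {
        echo "No udev daemon found; is udev installed in the image?" >&2
        return 1
    }
    "$daemon" --daemon    # run the daemon in the background
    udevadm trigger       # replay events for devices already present
}

# In an entrypoint you would then call, for example:
#   start_udev && exec "$@"
```

Once this is in place, udevadm monitor run inside the container should print events when a USB device is plugged in or removed.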

Great! This link also would have helped me