Hack to share external storage across services in multi-container configuration with vanilla host OS

Here’s a hack that lets me mount and share external storage devices across containers without having to modify the BalenaOS image and without requiring the containers to run in privileged mode. Posting here both to share it with the community and to ask for feedback in case there are hidden gotchas I haven’t found.

Let’s say there are three containers involved. The first container is disk-manager and it’s the thing that detects and mounts the storage device. The other two are service1 and service2 and they use the storage device.

We’ll assume the storage device shows up as /dev/sda1 when it’s connected.

disk-manager

The service definition in docker-compose.yml will need some specific settings:

privileged: true
environment:
  DBUS_SYSTEM_BUS_ADDRESS: 'unix:path=/host/run/dbus/system_bus_socket'
labels:
  io.balena.features.dbus: '1'

Make sure you can send D-Bus commands from disk-manager. There are D-Bus libraries for various languages but for purposes of this example, let’s assume we’re using a Debian or Ubuntu base image and we want to do everything from the command line. Run apt install systemd to get the various systemd command-line tools.
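A quick way to sanity-check the D-Bus connection from inside disk-manager (just an illustrative smoke test; busctl ships with the systemd package installed above and honors the DBUS_SYSTEM_BUS_ADDRESS variable):

busctl --system list | grep org.freedesktop.systemd1

If that prints a line for org.freedesktop.systemd1, the container can talk to the host's systemd over D-Bus.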

The service should wait for the storage device to appear. Take your pick of techniques here: you could use udev (in which case add UDEV: 1 to the environment stanza), or mount /dev as a devtmpfs filesystem and watch for devices, or explicitly expose the specific device in a devices clause in the service configuration.
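For example, the simplest possible version of that wait is a polling loop in the start script (the device name is hardcoded here purely for illustration; a udev-based approach is more robust):

while [ ! -b /dev/sda1 ]; do
  echo "Waiting for storage device..."
  sleep 5
done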

When the device is detected, do any necessary application-specific setup. You will want to make at least one subdirectory on the storage device for services to use. For purposes of the example, we’ll assume there is a single services subdirectory that is shared by service1 and service2. This will come into play later.

mount /dev/sda1 /mnt
mkdir /mnt/services

Once a device is attached and properly initialized, the hack begins.

The crux of it: We’re going to ask the host’s systemd to mount the storage device in a location that can be bind-mounted into containers. Balena doesn’t allow arbitrary bind mounts of host OS directories, but it does support mounting a few specific directories: devices, processes, and journald logfiles. Devices and processes are special directories we can’t modify, but the journald log directories are just ordinary directories.

We’ll be using a command called systemd-run to ask the host OS to run commands for us. That command tries to detect whether systemd is running, and will fail by default when you run it in a container. But its detection is pretty easy to fool: just create a directory /run/systemd/system.
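In shell that's a one-liner in the disk-manager start script (the directory just has to exist inside the container; it doesn't need any contents):

mkdir -p /run/systemd/system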

Now we create the mount point and mount the storage device in the host OS.

# Ask the host's systemd to run these as transient units; --wait blocks until each one finishes
systemd-run --wait mkdir /run/log/journal/demo-storage
systemd-run --wait mount /dev/sda1 /run/log/journal/demo-storage

service1 and service2

Run both of these with

labels:
  io.balena.features.journal-logs: '1'

In their startup logic, check for the existence of the subdirectory the disk manager created, which should appear underneath /run/log/journal/demo-storage. DO NOT CREATE THE SUBDIRECTORY HERE, JUST CHECK IF IT EXISTS. (If you create it yourself, the check will pass even when no device is mounted, and your data will end up in the host’s tmpfs instead of on the external storage.) Exit if it doesn’t exist. For example, in shell, using the services subdirectory mentioned above:

if [ ! -d /run/log/journal/demo-storage/services ]; then
  echo "Storage directory not found" 1>&2
  # Sleep a bit to cut down on log spew
  sleep 10
  exit 1
fi

And you’re good to go. Both containers can access /run/log/journal/demo-storage/services, and anything written there lands on the external storage device.

Note that you need to make the services exit and get restarted if the data subdirectory doesn’t exist. If you just sleep in a loop waiting for the subdirectory to appear, it never will because the storage device wasn’t mounted on the host OS at the time the container was started.

But wait, won’t log rotation delete all my files?

The system automatically trims journald logfiles periodically, but it won’t touch your external storage for two reasons:

  1. The journal system requires that logs be in a subdirectory whose name is a valid machine ID, which is a 32-character hexadecimal string. The name of the mount point (demo-storage in the example) won’t be detected as a log directory.
  2. Even if, for some reason, you give the mount point a 32-character hexadecimal name, journald’s log trimming code doesn’t recursively walk directory trees. It only considers journal files directly underneath /run/log/journal/<machine-id>, so it will never visit any files in the deeper subdirectory where the services are actually writing their data.

Refinements

The walkthrough above was stripped down to make it easier to follow. In a production setup, you might consider cleaning it up a tad, for example:

  • Don’t hardwire /run/log/journal in the services. Instead, pass in an environment variable to tell the services where to store their data (see the sketch after this list). This is good practice in general, especially if you’re going to be running the containers outside of BalenaOS for testing; in that case you can run them with a regular Docker bind mount and point them at the mount location.
  • Have the disk manager use the Balena supervisor API to explicitly start the other services when the device is available, so they don’t have to keep failing and getting restarted (there’s a sketch of this after the list too). This also gives you a little more control over what happens if the user removes the storage device.
  • If you’re using the systemd-run command rather than a D-Bus client library, don’t install the entirety of systemd into your container, just the bits you actually need.
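For the first point, the startup check in service1 and service2 might look something like this (STORAGE_DIR is a name I made up for the example; defaulting it to the hack's path means the compose file only has to override it when running elsewhere):

STORAGE_DIR="${STORAGE_DIR:-/run/log/journal/demo-storage/services}"
if [ ! -d "$STORAGE_DIR" ]; then
  echo "Storage directory $STORAGE_DIR not found" 1>&2
  sleep 10
  exit 1
fi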
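For the second point, the disk manager can call the supervisor's start-service endpoint once the mount is in place. Something along these lines, assuming disk-manager also carries the io.balena.features.supervisor-api label so the BALENA_SUPERVISOR_* variables (and BALENA_APP_ID) are injected:

curl -X POST \
  -H "Content-Type: application/json" \
  -d '{"serviceName": "service1"}' \
  "$BALENA_SUPERVISOR_ADDRESS/v2/applications/$BALENA_APP_ID/start-service?apikey=$BALENA_SUPERVISOR_API_KEY"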

Bonus: alternate approach

What I was doing before I figured out this hack was to run each container in privileged mode with UDEV=1. Each service’s start script would look for the storage device in /dev and mount it into the container’s filesystem before dropping privileges and launching the actual service.
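The start script for that approach was roughly shaped like this (user name, mount point, and service binary below are placeholders, and setpriv is just one way to drop privileges):

#!/bin/sh
# Wait for the device, mount it inside this privileged container,
# then drop privileges and exec the real service.
while [ ! -b /dev/sda1 ]; do sleep 5; done
mkdir -p /data
mount /dev/sda1 /data
exec setpriv --reuid=appuser --regid=appuser --init-groups /usr/local/bin/my-service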

This worked okay, but it means the containers need to include this device-mounting code, and thus work differently on BalenaOS than they do in other environments where you’d use a normal Docker bind mount. And of course it also means that everything runs with elevated privileges, which is not great security practice.

Hi @sgrimm,

First of all, congratulations on finding this hack, I think it’s pretty clever and a great showcase of your understanding of balenaOS :slight_smile:
I even spent some time in the past trying to share a mounted volume across containers and was not able to do it (hack or not); for reference, here is my attempt: GitHub - balena-io-examples/balena-storage: Sample project to showcase storage mounting on balenaOS.

Next I want to point out that we are currently working on having this functionality built into the balena-supervisor; here is the relevant GitHub issue: The supervisor should automount removable storage and provide it to specified containers · Issue #1532 · balena-os/balena-supervisor · GitHub. I can’t provide an ETA, but you can track progress on GitHub (and I’m as excited as you are for this one!)

Lastly, I don’t want to be a spoilsport, but please exercise caution when using this hack and do not rely on it to securely store sensitive data. I will ask our OS team for additional input but I can tell you this is certainly not intended behaviour and hence not something you should rely on long term. As you say there might be stuff that’s been overlooked that can cause trouble down the road.

In any case, I’ll share this internally and get back to you.

I will happily throw this away once there’s an official way to do it!

There’s one aspect of my approach that is maybe going to be hard to cover with the automount solution, though for brevity’s sake I didn’t get into it in my post. The disk manager has an opportunity to initialize new storage devices before any of the other containers see them. In my real implementation of this, if I detect that the storage device is new, I run mkfs to create a filesystem and then I create a skeleton directory tree with the appropriate owners and permissions so the non-root-privileged services can write their files. (This is for a non-consumer application where people aren’t expecting existing data on a storage device to be preserved.)
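Concretely, that initialization step is along these lines (a sketch: here "new" just means the device has no recognizable filesystem signature, and the filesystem type, label, and uid/gid are specific to my application):

# No recognizable filesystem signature? Treat the device as new and format it.
if ! blkid /dev/sda1 >/dev/null 2>&1; then
  mkfs.ext4 -L app-storage /dev/sda1
fi
mount /dev/sda1 /mnt
# Skeleton tree owned by the non-root uid/gid the services run as.
mkdir -p /mnt/services
chown 1000:1000 /mnt/services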

If the other services just mounted the device immediately as suggested by that GitHub issue, they would see whatever random FAT32 filesystem was on the device by default, and things would break. More importantly, though, the fact that they’d mounted it would probably cause the underlying device to be marked as in use, meaning I’d be unable to blow it away and make the correct filesystem on it.

That said, hassle-free automounting is probably the right solution for most people who want this.

If the other services just mounted the device immediately as suggested by that GitHub issue, they would see whatever random FAT32 filesystem was on the device by default

I wouldn’t want the Supervisor to assume what filesystem the device should have, so if it is not formatted correctly then the user must fix it. We could provide warning logs when a device is plugged in with a filesystem that won’t work.

the fact that they’d mounted it would probably cause the underlying device to be marked as in use, meaning I’d be unable to blow it away and make the correct filesystem on it.

This point is really interesting: if someone wanted to make a device that wipes and formats USB sticks, and what you’re saying is true, that wouldn’t be possible. Thanks for bringing this up; I’ll note it in the GitHub issue (if you comment on the issue as well, we can interact there). I began working on this feature but didn’t make much progress before having to move on to something else.

Have there been any advances on sharing external storage across services? I’m really tempted to make use of this hack, but afraid of losing important data later on.

Hi. There’s no explicit ETA I can share. Your best bet is to keep an eye out for any new activity in the GitHub issue(s) listed above.

This may be worth taking a look at as an alternative: Using NFS Server to share external storage between containers