Doing some serious stuff with Resin.io (or at least trying)

After a few experiments, I’ve tried to install a legacy application that uses MariaDB as its database.

During the installation, the database is initialized with some data in /var/opt/my-app (so far I have no control over this directory). How can I tell Resin Docker I want this directory to be persisted ?

I’ve tried to create a symbolic link (ln -s /data/var/opt/my-app /var/opt/my-app), but /data is completely reinitialized at the first run of the container because it’s a mounted host directory, not a named volume (if it were a named volume, what I write at build time would be merged into the resulting volume).

Hi Tristan,

resinOS will only persist the data directory, so I’d recommend checking the MariaDB configuration, since there’s surely a way to make it store its data in a directory of your choice.

Hi Juan,

I’m sad you are not bringing anything new in your answer. Have you read carefully what my problem is ?

The fact that resinOS will only persist the data directory, is not my problem. My problem is it’s not a named volume, it’s a mounted host directory volume, why can’t something change about this ? This way, data initialization made during image building would not be deleted.

Secondly : I’m installing a legacy application which is a binary Debian package I have no control over. This Debian package does some data initialization during the image build. How to handle this ? If I configure MariaDB to store its data in /data after installation, it wouldn’t change anything, since the data initialization would have already occurred and would be lost.

Are you saying there is no way to handle this situation with Resin.io ?

Hi, we’ll look at the difference between the named volume and how things are done now.

In the meantime, wouldn’t setting things up in your start script work? Something like:

  1. If there’s no folder for the persistent data on /data, create it and move the data there
  2. Once the relevant folder exists on /data, set up the symlink

This could go in the start script that gets run at the start of your application, for example (just a demo):

#!/bin/bash

###
# Persistent - volatile setup
###
TARGET_DIR="/data/var/opt/my-app"
VOLATILE_DIR="/var/opt/my-app"
if [ ! -d "${TARGET_DIR}" ]; then
    echo "Permanent folder does not exist, creating and copying data from volatile..."
    mkdir -p "$TARGET_DIR"
    # "/." copies the folder contents including hidden files; -a preserves ownership and permissions
    cp -a "$VOLATILE_DIR"/. "$TARGET_DIR"/
else
    echo "Permanent folder exists..."
fi
# Replace the volatile folder with a symlink to the permanent one
rm -rf "$VOLATILE_DIR"
ln -s "$TARGET_DIR" "$VOLATILE_DIR"
###

# Your mariadb start, etc
...

By the way, I’m not totally sure what you mean by “/data is just fully reinitialized”; the word “reinitialized” is not completely clear to me. What happens is that /data in your app container is overlaid by the permanent storage directory on the host, and the overlay masks anything that was put there during your build.

Does this help?

Also, there are packages which do not work properly in containers, and there are situations when indeed the right answer is “try fixing it upstream”. The corollary being that “otherwise you will always be working around issues in your own Dockerfile or deployment”. If it’s an important enough package for you, its source can usually be found, but that’s just my hypothesis, since I don’t know exactly what you are using. Just some food for thought.

Ok, thanks for looking at it !

The difference is pretty clear in Docker docs :

“Volumes are initialized when a container is created. If the container’s parent image contains data at the specified mount point, that existing data is copied into the new volume upon volume initialization. (Note that this does not apply when mounting a host directory.)”

source : https://docs.docker.com/engine/tutorials/dockervolumes/

What I’m trying to do is simply to initialize some persistent data at image build time. If you are saying I could do this at first container start instead of image build time, you may be right, but here I’m trying to install a legacy app that has been in use in my company (edf.fr) for 3 years and over which I really don’t have much control. It’s only the first step to get Resin.io accepted and take a fresh new start with a cleaner way to install apps.

About your last comment, to be honest I already know this storage place will not be the only problem, because when I do a full install inside the container with no volumes, no symlinks, I still meet a few errors, but I’m working on it.

At the moment your best option is an initialization step in your start script, similar to the one mentioned above, which seems to match what you originally requested. Whether or not the app is under your control, the application image is, and so is the start script, so you will need to adjust those to the legacy app (the /opt/... linking to /data/... is already one kind of workaround that you’d need to include in your deployment). Have you had a chance to try this one out?

I’ve filed an issue to bring this question up, and will keep you posted. /data usage has a lot of corner cases, so I cannot comment at the moment on whether or not it’s a workable approach in this particular case.

@Tristan107 A further follow-up: I’ve been checking this out, and I see a different volume behaviour from the one you describe. The documentation you linked seems to support that, and on closer inspection your own quote says something different from how it was interpreted:

“Volumes are initialized when a container is created. If the container’s parent image contains data at the specified mount point, that existing data is copied into the new volume upon volume initialization. (Note that this does not apply when mounting a host directory.)”

This applies only within the container, and within the images/layers that make up the container. There the data is inherited. This is, however, not the case for the part that had the emphasis.

Further down the same page, in the part about running the container with host volumes:

This command mounts the host directory, /src/webapp, into the container at /webapp. If the path /webapp already exists inside the container’s image, the /src/webapp mount overlays but does not remove the pre-existing content.

Thus when I was trying it out, I could only get the same behaviour as it exists already: whatever is in the /data directory, is overlaid by the external data when the container is run with a volume pointing there. So my guess is that the pattern you request is not really available in Docker in general.

If you have a chance, you can try these out with Docker. I’m happy to discuss if you see any other behaviour, or if you can recreate a suitable behaviour with your local Docker (not on a resin device, for simplicity’s sake), so that we can better see what you have in mind. Thank you!

I’ve read everything you’ve written, but I still don’t know what your point is.

When you mount a host directory, there is indeed this “overlay” mechanism, which behaves exactly as if the content of the container directory were deleted and replaced with the content of the mounted host directory. This occurs when you specify your own host directory. That’s not what we want here.

Now, if you let Docker manage where it stores volume data, it does so in /var/lib/docker/volumes/my-volume-name/_data/ (/var/lib/docker being Docker’s default data directory) if you have given the volume a name, or in /var/lib/docker/volumes/generated-volume-id/_data/ if it’s an anonymous volume.
In these last 2 cases (named and anonymous volumes), a merge mechanism occurs between the image directory and the corresponding volume directory when the container is created (the data existing in the image directory is copied to the corresponding volume at container creation).
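
To make the three cases above easier to tell apart, here is a small illustrative sketch (not Docker's own parsing code) that classifies the argument given to "docker run -v". The function name mount_kind is hypothetical, and the rule is deliberately simplified: a source starting with "/" is treated as a host directory, any other source as a volume name, and a lone container path as an anonymous volume. Relative host paths and extra options such as ":ro" are not handled.

```shell
#!/bin/bash
# Hypothetical helper, simplified on purpose: classify a "docker run -v" argument.
#   bind      -> host directory: the image content at the mount point is hidden
#   named     -> named volume: image content is copied in when the volume is new and empty
#   anonymous -> anonymous volume: same copy behaviour as a named volume
mount_kind() {
    local spec="$1"
    case "$spec" in
        /*:*) echo "bind" ;;       # e.g. /mnt/data_123:/data
        *:*)  echo "named" ;;      # e.g. data_123:/data
        *)    echo "anonymous" ;;  # e.g. /data
    esac
}
```

For example, mount_kind data_123:/data prints "named", while mount_kind /mnt/data_123:/data prints "bind", which is the case that behaves like /data on a resin device in this discussion.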

It would help us get on the same page if you could give us a basic demo to illustrate what you are trying to do in your last paragraph. For example, a minimal Dockerfile and the command line call you use to start the resulting container (that we can run ourselves), showing the behaviour you would like to see on a resin device. Would that be possible?

Sure :

FROM resin/valid_image

RUN touch /data/my_precious_data.txt

I don’t want “my_precious_data.txt” to be “deleted” when the container starts (or overlaid by the volume content); I want it to be copied to the volume, as Docker makes possible.

I’m sorry I wasn’t clear, I meant a working example of the actual behaviour of what you are trying to achieve…

This is what I’ve seen so far:

Step 1. create a volume, with a name that mimics a bit the multi-app nature of resin devices:

$ docker volume create data_123
data_123

Step 2. use this Dockerfile that mimics your setup:

FROM resin/intel-nuc-alpine

# Create a file at buildtime
RUN mkdir -p /data && \
    touch /data/buildtime_$(date +"%Y%m%d_%H%M%S") && \
    ls -la /data

# Create a file at runtime
CMD touch /data/runtime_$(date +"%Y%m%d_%H%M%S") &&\
    ls -la /data

and build it:

$ docker build . -t volumetest --no-cache            
Sending build context to Docker daemon  4.608kB
Step 1/3 : FROM resin/intel-nuc-alpine
 ---> 6d809cef67ac
Step 2/3 : RUN mkdir -p /data &&     touch /data/buildtime_$(date +"%Y%m%d_%H%M%S") &&     ls -la /data
 ---> Running in 7e4d457ecacb
total 8
drwxr-xr-x    2 root     root          4096 Jul 12 16:35 .
drwxr-xr-x    1 root     root          4096 Jul 12 16:35 ..
-rw-r--r--    1 root     root             0 Jul 12 16:35 buildtime_20170712_163549
 ---> 7f32d82166dc
Removing intermediate container 7e4d457ecacb
Step 3/3 : CMD touch /data/runtime_$(date +"%Y%m%d_%H%M%S") &&    ls -la /data
 ---> Running in 5148ab535b56
 ---> 10e067ecb815
Removing intermediate container 5148ab535b56
Successfully built 10e067ecb815
Successfully tagged volumetest:latest

Step 3 is to run this container, pointing to the volume:

$ docker run -ti  --rm -v data_123:/data  volumetest 
starting version 3.2.2
total 8
drwxr-xr-x    2 root     root          4096 Jul 12 16:35 .
drwxr-xr-x    1 root     root          4096 Jul 12 16:35 ..
-rw-r--r--    1 root     root             0 Jul 12 16:35 buildtime_20170712_163549
-rw-r--r--    1 root     root             0 Jul 12 16:35 runtime_20170712_163557

Thus the build file has been copied, and the runtime file is created there.

Then running it again:

$ docker run -ti  --rm -v data_123:/data  volumetest 
starting version 3.2.2
total 8
drwxr-xr-x    2 root     root          4096 Jul 12 16:35 .
drwxr-xr-x    1 root     root          4096 Jul 12 16:35 ..
-rw-r--r--    1 root     root             0 Jul 12 16:35 buildtime_20170712_163549
-rw-r--r--    1 root     root             0 Jul 12 16:35 runtime_20170712_163557
-rw-r--r--    1 root     root             0 Jul 12 16:35 runtime_20170712_163559

Thus the runtime file from last time was kept (the changes are permanent) and the new runtime created file is there too…

Step 4. Then if the image is rebuilt:

$ docker build . -t volumetest --no-cache            
Sending build context to Docker daemon  4.608kB
Step 1/3 : FROM resin/intel-nuc-alpine
 ---> 6d809cef67ac
Step 2/3 : RUN mkdir -p /data &&     touch /data/buildtime_$(date +"%Y%m%d_%H%M%S") &&     ls -la /data
 ---> Running in 975cf5d2617f
total 8
drwxr-xr-x    2 root     root          4096 Jul 12 16:39 .
drwxr-xr-x    1 root     root          4096 Jul 12 16:39 ..
-rw-r--r--    1 root     root             0 Jul 12 16:39 buildtime_20170712_163905
 ---> 95c026b69c6f
Removing intermediate container 975cf5d2617f
Step 3/3 : CMD touch /data/runtime_$(date +"%Y%m%d_%H%M%S") &&    ls -la /data
 ---> Running in 2817e73cc8e0
 ---> 5204caa3fab8
Removing intermediate container 2817e73cc8e0
Successfully built 5204caa3fab8
Successfully tagged volumetest:latest

and run again, the new buildtime file is not actually copied in there: the very first buildtime file remains, and the new runtime file does get created:

$ docker run -ti  --rm -v data_123:/data  volumetest 
starting version 3.2.2
total 8
drwxr-xr-x    2 root     root          4096 Jul 12 16:39 .
drwxr-xr-x    1 root     root          4096 Jul 12 16:39 ..
-rw-r--r--    1 root     root             0 Jul 12 16:35 buildtime_20170712_163549
-rw-r--r--    1 root     root             0 Jul 12 16:35 runtime_20170712_163557
-rw-r--r--    1 root     root             0 Jul 12 16:35 runtime_20170712_163559
-rw-r--r--    1 root     root             0 Jul 12 16:39 runtime_20170712_163911

So is this the behaviour that you are looking for?

Sure, you have illustrated well how Docker works nowadays (it has evolved with time). The very last test looks like a Docker bug because it’s not really what is documented, but after some searching : the copy mechanism does not occur when [the volume already exists and is not empty] (this “not empty” condition appears in many discussions on Docker forums : https://github.com/moby/moby/issues/18670)

In a real use case, if the initialization data needs to change, you need to remove the volume anyway, or clear the data inside, or use another volume name. Clearing data is already allowed by Resin.io as far as I know.

Also, you haven’t shown the result of using “-v /mnt/data_123:/data” from the start : it wouldn’t create or copy anything, and you would end up with an empty “/data” directory, which is clearly not a desired behaviour.

So in just one sentence : yes, everything Docker does when you use a named volume is the result of years of discussions on Docker forums and looks like a good, flexible behaviour.
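
As an illustration of the “copy only when the volume is empty” rule discussed above, here is a minimal sketch of how a start script could imitate it when /data is a mounted host directory. The function name seed_if_empty and the paths in the usage note are assumptions for illustration, not part of resin.io or Docker.

```shell
#!/bin/bash
# Hypothetical helper: copy the content of SEED_DIR into DATA_DIR only when
# DATA_DIR is missing or empty, imitating what Docker does for a new volume.
seed_if_empty() {
    local seed_dir="$1" data_dir="$2"
    mkdir -p "$data_dir"
    # "ls -A" also lists hidden entries; empty output means the directory is empty
    if [ -z "$(ls -A "$data_dir")" ]; then
        # "/." copies the directory contents, including hidden files
        cp -a "$seed_dir"/. "$data_dir"/
        echo "seeded"
    else
        echo "kept"
    fi
}
```

This only works if the build-time data lives somewhere in the image other than /data, since /data itself is masked by the mount. A start script could then call, for instance, seed_if_empty /var/opt/my-app /data/var/opt/my-app before starting MariaDB. As with a named volume, data already present is never overwritten, so changed initialization data still requires clearing /data first.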

Whether or not what Docker does in general is a “good” behaviour on devices (instead of on a server) is an open question, and that’s why we are trying to clarify things, so we can discuss how any possible change affects the system as a whole. I’m bringing this up internally, and will update here if there’s anything new.

Ok, nice. I think you’ve chosen Docker to benefit from all its features, so using named volumes just offers more possibilities without removing any.

You are opposing “devices” and “servers” on multiple occasions (like when talking about security updates) but I don’t see these things so different in usage, only the hardware is a bit specific and constrained, the good practices should remain the same on both “devices” and “servers” imo. “devices” are often used as gateways, so yeah more server-side than desktop-side.

Hi, we’ve discussed it internally, and the current status is:

  • resin.io’s persistent /data predates Docker’s corresponding volume behaviour, so we are not using that because originally it wasn’t available.
  • work is ongoing to bring multicontainer support to resin.io devices, and that will likely address this, as the methods of storing data are also part of the development. Any changes to the current behaviour will likely come with the multicontainer release at the earliest.

Hope that will help, and in the meantime there are likely workarounds for your use case using a data moving logic that can be included in a suitable start script, as mentioned above.

Ok, thanks. I’m looking forward to this multicontainer support, which is indeed a condition to use Docker for “serious” things (and docker in docker doesn’t look sexy at all).

Any ETA ? like this autumn ? this winter ?

Hi,

we are still really interested in an estimated release date for multi-container support here.

Is it more like 1 month, 6 months or 1 year ?

It’s currently in the works, but we don’t have any hard ETA for it (it’s a big change with a lot of moving parts). We are aiming for the autumn, and will keep everyone posted here in the forums.

Ok, thanks, I’ll mark this thread as solved now.