Is it possible to cache a "base image" for my project and use it during the build?

I know that balena can deploy delta updates of Docker images to my devices, but for at least one of my containers the build time is very long and the resulting container is large, so computing the delta image takes a long time.

I know that one option is to use local development and build on my local device. But is it possible to cache a “base image” for my project at the point where I know nothing is going to change, and use that as a starting point during the build?

Hi @jason10, I'm not sure I follow exactly. For building, the build system should use the previous build as the cache base, so it shouldn’t rebuild your service if nothing changed. Unfortunately I think it will still try to run the delta against the two versions, and we could probably improve that considerably. Is that the problem you are describing?

No, the problem is that there is a big section of my Dockerfile that installs CUDA, TensorFlow, Tomcat, Java, and other tools that don’t change, followed by the much shorter Node.js and .war file installation.

Perhaps I could break it into two containers: one that is the unchanging runtime, and a second that contains the Node.js and .war file and, when launched, copies the new code to a shared volume…
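If I went that route, the compose file might look roughly like this (service and volume names are invented for illustration, and this assumes both services can share a named volume):

```yaml
version: "2"
services:
  # Heavy, rarely-changing runtime: CUDA, Java, Tomcat, etc.
  runtime:
    build: ./runtime
    volumes:
      - app-code:/shared/app
  # Small, frequently-changing service: copies fresh code into the
  # shared volume on start, then hands off to the runtime container.
  code:
    build: ./code
    volumes:
      - app-code:/shared/app
volumes:
  app-code:
```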

Does that help? What I am seeing is a full rebuild of all containers, but accelerated delta updates. And for a 4GB image, the acceleration is significant, yet the process is still painfully long.


The builder should cache previous layers that haven’t changed during a build, so if you have a Dockerfile that has something like this:

FROM <base>
RUN <install CUDA, Java, etc..>
COPY <myApp> <appPath>
RUN <buildApp>
CMD [""]

Then as long as none of the dependencies in the RUN <install CUDA, Java, etc..> step have changed, that layer should be cached. If you’re not seeing that, it would be really useful for us to see the portions of the Dockerfile that aren’t caching (but should be).

Splitting this into a multistage build might also help you, depending on the circumstances.
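As a sketch of what that could look like (stage names and paths here are illustrative, following the placeholder style above), a multistage build keeps the heavy toolchain in a builder stage and copies only the build output into the final image:

```dockerfile
# Build stage: heavy toolchain, cached as long as these steps don't change
FROM <base> AS builder
RUN <install build deps, Java, etc..>
COPY <myApp> <appPath>
RUN <buildApp>

# Runtime stage: copy only the build output, keeping the final image small
FROM <base>
COPY --from=builder <appPath>/build <appPath>
CMD [""]
```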

Best regards, Heds

The caching and delta updates definitely appear to be working: if I make a change to a different container, then the delta download of the 4GB container is not required. However, the build time is the same: too long.

Maybe if I combine some of the RUN statements into a bash script, the overall container image size will decrease.
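Combining related RUN steps into a single one also lets the install and its cleanup happen in the same layer, which is usually where the size savings come from. A sketch, with an example package rather than my actual install list:

```dockerfile
# One layer: install and clean up in the same RUN, so the apt cache
# never gets committed into an intermediate layer
RUN apt-get update && \
    apt-get install -y --no-install-recommends openjdk-8-jdk && \
    rm -rf /var/lib/apt/lists/*
```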

I ran an experiment where I removed the CUDA and Java portions. The build time lowered and the container image size decreased.

What is the easiest way to share the dockerfile.template with you?

Hi, you can share it in a GitHub gist or attach it as a file on this discussion, if the portion you can/want to share doesn’t contain sensitive information.

Here you go

Just made an update to the dockerfile.template:

  • switched to a balena base image with Python 3.6.8
  • moved the Java and CUDA installation to a bash file
  • added TensorFlow installation
  • same URL as above

The build time is 25 minutes. Most of this I want to cache; only a small part of the repository changes compared to the installation of OpenJDK, CUDA, and TensorFlow.

Hey Jason,

The Balena Builder should cache layers that you don’t invalidate. If you don’t modify the commands at the top of the Dockerfile, then those should be cached out of the box. Is that not the case for you? If that’s a problem, then you might find ways to re-organize the Dockerfile so that steps that are unlikely to change get moved to the top, and eventually into base images of their own that extend the ones we provide.
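As a sketch of that shape (the Node.js steps are just an example of a frequently-changing tail): stable steps first, volatile steps last.

```dockerfile
FROM <base>
# Rarely changes: should stay cached build after build
RUN <install CUDA, OpenJDK, tensorflow>
# Changes occasionally: dependency manifest only
COPY package.json ./
RUN npm install
# Changes on almost every push: keep it last
COPY . ./
CMD ["npm", "start"]
```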

Hey Juan,

I have reorganized my commands, putting the most likely to change stuff at the end. During development, when I’m not sure if I need to change the Balena base image, I of course put the risky stuff at the top so I have to wait as little as possible for it to fail.

What I’m seeing is a full build of all containers every time. Truthfully, I haven’t timed the pushes to balena, but it certainly appears that every command is executed, and the longer commands take correspondingly longer.

For example, consider:

I know that this is for building a balena base image, but imagine that I need the JDK and Python 3.6.8 and CUDA in the same container. (I would love to refactor the container, but that’s not an option at this time.)

If I copy and paste those JDK container commands into my mega-Dockerfile.template, are you telling me that your builder should zip through the JDK build commands the second time I do a git push to the balena remote, if and only if:

  • None of the commands before the JDK commands has changed
  • The previous builds were successful

Is that correct?

Hi Jason,

Yes, that is correct: each step/layer in your Dockerfile should be cached as long as nothing before it has changed.
If you want to test it, you can run balena push <appname> consecutively and you should observe using cache log entries in the build logs before each cached step on the follow-up builds.

How can that work if the command is something like a curl command fetching data? Is it cached only if the command is exactly the same? Or only if the data fetched is exactly the same?

I don’t think I have ever seen using cache

If the command hasn’t changed, then it shouldn’t be run on subsequent builds unless a previous layer was invalidated. Docker’s best practices guide should give you a good idea of what should and shouldn’t be done in order to utilize layer caching correctly. Can you run balena push twice and see if, and up to which point, caching is used? As I said, you should see using cache with a green background amongst the build steps.
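On the curl question specifically: for RUN steps, plain Docker caching keys on the command text and the parent layer, not on what the command downloads, so a curl that fetches changing data will reuse a stale cached layer. A common workaround, assuming you want to force a refresh on demand (the ARG name and URL here are arbitrary examples):

```dockerfile
# Changing CACHEBUST (e.g. --build-arg CACHEBUST=$(date +%s)) invalidates
# this RUN step and everything after it; leave it alone to reuse the cache
ARG CACHEBUST=1
RUN curl -fsSL https://example.com/data.tar.gz -o /tmp/data.tar.gz
```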

Stevche, Juan, Lorenzo, Heds, Shaun, (@sradevski, @jviotti, @thundron, @hedss, @shaunmulligan)

Here are two logs from “git push balena …” where using cache appears in four of the five services but only partially for the fifth service, which requires 22 minutes to build.

The only change was adding

#a comment

to the end of Server/Dockerfile.template – for the service barnserv

First we start with the log from git push balena:

Then we do the commands at the top of this gist, and follow with the log from git push balena:

If you search for [barnserv] Using cache you’ll find that the barnserv service used the cache up to step 11/44 and then executed every step from there. Is this expected? Am I exceeding a limit in the Docker or balena builder?

Ah, I think I understand. The 10th command is COPY . ./, which copies the Dockerfile.template; since that file has changed, none of the following commands can use the cache.

Here is another discussion of a Dockerfile with the same problem.
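One fix, assuming the later build steps don’t need the Dockerfile itself, is to add it to .dockerignore, or to COPY narrower paths so that only real code changes invalidate the cache. Something like this (the file names are just an example from a Node.js layout):

```dockerfile
# Instead of COPY . ./, copy only what the next steps actually use,
# so editing Dockerfile.template doesn't bust the cache here
COPY package.json package-lock.json ./
RUN npm install
COPY src/ ./src/
```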

Hi Jason, that makes sense. Thanks for sharing the solution with us …