balena-cli is not using the cache on fresh machine

Langhalsdino · February 9, 2020, 9:38pm

Hi,

I want to share builds with other people on my team, so I set up a Docker registry that stores the build images. Sadly, when other people pull the image, the cache does not seem to be used.
Since our builds take ~3h on a fast workstation (CUDA, OpenCV, TensorFlow, …), caching is very useful.

How to reproduce the behaviour

1. Push a build image to your registry:
   docker tag myapp_main registry.yourdomain.com/bla
   docker push registry.yourdomain.com/bla:latest
2. Prune all local images:
   docker system prune --all
3. Pull the image and restore its local tag:
   docker pull registry.yourdomain.com/bla:latest
   docker tag registry.yourdomain.com/bla:latest myapp_main:latest
4. Do a new build without changing the Dockerfile or its resources.
-> The cache is not used.
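
One way to make "cache is not used" measurable is to count the reused steps in the build log. This is a minimal sketch, assuming the classic Docker builder's log format, where each reused layer prints a "---> Using cache" line (BuildKit reports "CACHED" instead). The function name count_cache_hits is hypothetical.

```shell
# Count how many build steps were served from the cache.
# Reads a build log on stdin and prints the number of cached steps.
count_cache_hits() {
  # grep -c prints the number of matching lines; it exits with status 1
  # when nothing matches, so "|| true" keeps this usable under "set -e".
  grep -c -- '---> Using cache' || true
}

# Usage (assumes a Dockerfile in the current directory):
#   docker build . 2>&1 | count_cache_hits
```

A result of 0 after step 4 confirms the behaviour described above. A non-zero count means at least part of the cache was picked up.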

Do you have any idea how to tag the images in order to get caching to work?

afitzek · February 10, 2020, 11:35am

Hi,
Can you provide us with more details on how you build the image in the first place and when the cache is used? Also, is this an emulated build? In that case we start a Docker daemon inside QEMU to perform cross-architecture builds.

Langhalsdino · February 10, 2020, 4:37pm

I build the Docker image cross-platform, so QEMU is used to build it.

Build and deploy in one stage:
balena deploy myapp --logs --dockerfile Dockerfile --source . --emulated --build

Is there a way to get the cached images into the ~/.balena/bin folder, since it seems the QEMU builds are stored there? Or am I completely off?

shaunmulligan · February 12, 2020, 8:29am

Hi Frederic, the built images are stored in your local Docker image store once they are built. But I think the CLI most likely won't, by default, use all the on-disk images that exist while building. Docker used to do this automatically, but due to some cache-poisoning exploits they disabled automatic use of all images, and the engine now only uses images that were built by the engine on your host. So one needs to pass a special flag that tells the build to use all the available images, or a list of images, as cache. I suspect the CLI deploy command is not using that flag when building, but it should be possible to add.
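
For plain Docker, the flag in question is docker build --cache-from. The following is a minimal sketch of how a pulled image can be named as a cache source, not what the balena CLI does internally. The function name build_with_cache and the DOCKER override variable are hypothetical, and whether the cached image must have been built in a particular way (for example with BuildKit) is left open here.

```shell
# Docker binary to use; override with DOCKER=echo for a dry run.
DOCKER="${DOCKER:-docker}"

# Pull an image and name it explicitly as a cache source for the next build.
build_with_cache() {
  image="$1"          # e.g. registry.yourdomain.com/bla:latest
  context="${2:-.}"   # build context directory, defaults to the current one

  # A failed pull (e.g. the very first build) should not abort the build.
  "$DOCKER" pull "$image" || echo "pull failed; building without remote cache" >&2

  # --cache-from tells the engine it may reuse layers from the pulled image.
  "$DOCKER" build --cache-from "$image" -t "$image" "$context"
}

# Usage:
#   build_with_cache registry.yourdomain.com/bla:latest .
```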

Langhalsdino · February 12, 2020, 8:45am

@shaunmulligan Can you point me to what I should look into in order to modify the balena-cli to do so?

shaunmulligan · February 12, 2020, 8:55am

I think it would be a matter of adding it to https://github.com/balena-io/balena-cli/blob/master/lib/actions/build.coffee . We would need to try adding the --cache-from option to the build (more details on that option here: https://docs.docker.com/engine/reference/commandline/build/#specifying-external-cache-sources ). It seems the images would also need to be built using the BuildKit backend of Docker, but I am unsure about that. I have asked the CLI maintainers to weigh in on this as well, as they would be able to give more specific details on how to add that option.

Langhalsdino · February 12, 2020, 9:12am

The --cache-from option is what I looked into, too. But I just could not figure out where I could hardcode it into the balena-cli for a quick fix, and later do a pull request with a better implementation. I am no expert in CoffeeScript and the project seems quite complex to understand. Therefore I am looking for a file you can point me to that shows how to set the build options with dockerode or similar, so I can work my way from there.

shaunmulligan · February 12, 2020, 9:19am

Yeah, I have been trying to find where the options are set as well, but have had no luck, and I also have little to zero CoffeeScript experience. So I think we need to wait until the CLI team comes online; they will be able to point us in the right direction super fast.

Langhalsdino · February 12, 2020, 9:19am

Cool, thank you for your help. Looking forward to hearing from the CLI team.

Langhalsdino · February 12, 2020, 9:37am

Just as information for the CLI team: the build argument needs to be passed through dockerode, since it is using the Docker API (docker_build).
The cachefrom option is defined as "string: JSON array of images used for build cache resolution."
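
Since the API expects a JSON array rather than a plain image name, the value has to be encoded before it is handed over. A minimal sketch of that encoding follows; the function name cachefrom_json is hypothetical, and it assumes image references contain no quotes or backslashes (which holds for valid image names), so no JSON escaping is done.

```shell
# Turn a list of image references into the JSON array string that the
# "cachefrom" build parameter is described as expecting.
cachefrom_json() {
  cf_list=""
  for cf_image in "$@"; do
    # Append a comma before every element except the first.
    cf_list="${cf_list:+$cf_list,}\"$cf_image\""
  done
  printf '[%s]\n' "$cf_list"
}

# Usage:
#   cachefrom_json registry.yourdomain.com/bla:latest
```

The resulting string is what would be supplied as the cachefrom value of the build request, URL-encoded if it is sent as a query parameter.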

pdcastro · February 12, 2020, 11:49am

Therefore i am looking for file you can point me to, that describes how to set the build options with dockerode or similar, so i can work my way from there.

@Langhalsdino, I’ve created a draft PR, not yet tested and thus maybe not working, with the likely changes to add a --cachefrom option to the balena build and deploy commands: Add '--cachefrom' option to balena build / deploy commands · Issue #1616 · balena-io/balena-cli · GitHub

If you get to test it, your feedback will be very welcome!

Langhalsdino · February 12, 2020, 11:51am

I will try it out in the next few hours, since this feature is very important to us.
We spend about 8h a day building images, and this feature can cut that down to ~1h a day.

Langhalsdino · February 12, 2020, 7:57pm

Yes, I tried it and it does not seem to work. My current guess is that the Dockerfile gets treated similarly to how a docker-compose file gets treated, so we might need to implement something like this.

More of the details can be found here

Langhalsdino · February 12, 2020, 10:30pm

I do not know why, but it has now started working with both docker-compose files and regular Dockerfiles, with no change at all.

Thanks for the PR

Just as a recap for others who might be interested:

Clone the repository and build the app

git clone https://github.com/balena-io/balena-cli.git
cd balena-cli
git checkout 1616-deploy-cachefrom
npm install
npm run build:fast
export MY_PATH_TO_BALENA=$(pwd)

build your app with caching

cd your/balena/project/path
docker pull registry.gitlab.com/myapp
${MY_PATH_TO_BALENA}/bin/balena deploy my-app --logs --dockerfile Dockerfile --source . --emulated --build-arg BUILDKIT_INLINE_CACHE=1 --cache-from registry.gitlab.com/myapp -t registry.gitlab.com/myapp

push new build to registry

docker push registry.gitlab.com/myapp
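
The pull, deploy and push steps of this recap can be wrapped into one function so that every team member runs them the same way. This is only a sketch under assumptions: the function name deploy_with_shared_cache and the DOCKER and BALENA override variables are hypothetical, BALENA must point at a CLI build that has the --cache-from option, and the build argument from the command above is left out for brevity.

```shell
# Binaries to use; override with DOCKER=echo BALENA=echo for a dry run.
DOCKER="${DOCKER:-docker}"
BALENA="${BALENA:-balena}"

# Pull the shared image, deploy with it as cache source, push the new build.
deploy_with_shared_cache() {
  app="$1"     # balena application name, e.g. my-app
  image="$2"   # shared registry image, e.g. registry.gitlab.com/myapp

  # On the very first build there is nothing to pull yet; carry on anyway.
  "$DOCKER" pull "$image" || echo "no cached image yet; building from scratch" >&2

  # Same flags as in the recap, minus the build argument.
  "$BALENA" deploy "$app" --logs --dockerfile Dockerfile --source . \
    --emulated --cache-from "$image" -t "$image" || return 1

  # Publish the fresh build so the next machine can reuse its layers.
  "$DOCKER" push "$image"
}

# Usage (run from your balena project directory):
#   deploy_with_shared_cache my-app registry.gitlab.com/myapp
```

Returning early when the deploy fails avoids pushing a stale image over a good one in the registry.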

samothx · February 12, 2020, 10:37pm

Thanks for posting your results, I will inform the maintainer that the modification works for you.

pdcastro · February 14, 2020, 3:35pm

@Langhalsdino, thank you for the feature suggestion and for helping with the testing. The --cache-from option for the balena build and balena deploy commands was released in CLI version 11.26.0.
For the benefit of others reading this thread, it implements the same feature documented for the docker build --cache-from option under “Specifying external cache sources”.
