preloaded images too large

Hi,

We are preloading an image with a multicontainer application.
When the application is loaded from balenaCloud after registering, or in local mode, the disk usage of the data partition is roughly 6GB.

However, when we preload an image with the same application using the CLI, the resulting image is roughly 20GB. As this is a Jetson Nano-based device with only 16GB of storage, this is obviously not going to work…
I suspect this is because the different containers that make up the application use the same base layers. Could it be that the preloading process loads all the base layers into the image for every container, instead of just once?

–Robin

Hi Robin, this sounds very strange. As far as I know, the preloader essentially mounts the .img's data partition and runs a regular docker pull from our balenaCloud registry into that partition, so I would expect exactly the same layer sharing as when the device pulls the images itself. Could you give us the exact command you are running, along with some code or at least the number and rough sizes of the containers you are preloading, so we can try to create a reproduction on our side?

Hi Shaun!

The command used was

balena preload ffaiv1_orange-noproxy_ffai-basic.img --app ffai-basic

Actual size of the resulting preloaded image is 19.9 GB.

This was executed on an x86 Ubuntu 18.04 machine.
The target balena device is a Jetson Nano-based device.
Is it necessary to run the preload command on the same platform as the target platform?

Image sizes are:

root@210f126:~# balena images
REPOSITORY                                                       TAG                      IMAGE ID            CREATED             SIZE
registry2.balena-cloud.com/v2/14cd12eb3ea62e77a52990c0ced7147a   delta-a9b860d8b7edea1b   41418ff49efe        2 hours ago         5.82GB
registry2.balena-cloud.com/v2/e9d090e0c7588508697af5d233d3a763   delta-99163faea7a6aff8   fa3bfbdd2112        4 hours ago         150MB
registry2.balena-cloud.com/v2/eb83f9d5298313bcbbbe6c1567adc00f   <none>                   bbf4c10e6ed8        12 hours ago        1.01GB
registry2.balena-cloud.com/v2/3395c79f226fccdf897d3f51a157ba50   <none>                   02950c2d1e52        30 hours ago        176MB
registry2.balena-cloud.com/v2/64a6f987844f2a63ce17d69cc33d351a   <none>                   c0cf6845ae11        4 days ago          5.82GB
balena/aarch64-supervisor                                        v11.14.0                 25d6abae14f0        2 weeks ago         72.3MB
balena-healthcheck-image                                         latest                   a29f45ccde2a        8 months ago        9.14kB

Actual used storage on target device:

root@210f126:~# df -h
Filesystem                      Size  Used Avail Use% Mounted on
devtmpfs                        1.8G     0  1.8G   0% /dev
tmpfs                           2.0G  8.0K  2.0G   1% /tmp
/dev/mmcblk0p13                 461M  389M   45M  90% /mnt/sysroot/active
/dev/disk/by-state/resin-state   19M  240K   17M   2% /mnt/state
overlay                         461M  389M   45M  90% /
tmpfs                           2.0G     0  2.0G   0% /dev/shm
tmpfs                           2.0G   38M  1.9G   2% /run
tmpfs                           2.0G     0  2.0G   0% /sys/fs/cgroup
/dev/mmcblk0p12                  80M  1.9M   78M   3% /mnt/boot
tmpfs                           2.0G   28K  2.0G   1% /var/volatile
/dev/mmcblk0p16                  14G  6.8G  6.0G  54% /mnt/data
/dev/mmcblk0p14                 461M  2.3M  431M   1% /mnt/sysroot/inactive

Perhaps also interesting to note: the ffaiv1_orange-noproxy_ffai-basic.img image referenced in my previous post is basically a balena-cloud-jn30b-nano-2.56.0+rev1-v11.14.0 image with an LTE system-connection added, a different boot splash image, and a config.json with SSH keys and an API key preloaded.

Thanks Robin, this should be enough detail for our CLI team to look into it. Just to answer your question:

This was executed on an x86 Ubuntu 18.04 machine. The target balena device is a Jetson Nano-based device. Is it necessary to run the preload command on the same platform as the target platform?

No, it should not matter at all. Preload was designed to be run on x86 laptops and machines during manufacture, so that definitely isn’t the cause. What is weird is that I was half expecting the raw Docker image sizes to add up to roughly 20GB, but the total is not close to that. One thing worth asking: is there a chance that you ran preload twice on the same image file, even if it failed at one point and you re-ran it? It could be that this somehow causes the images to accumulate.
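
For reference, the comparison above can be reproduced with a few lines of Python. This is only a rough sketch: the sizes are copied by hand from the `balena images` listing earlier in the thread, the helper names (`parse_size`, `total_gb`) are made up for illustration, and it assumes the SIZE column uses decimal units (1 GB = 10^9 bytes).

```python
# Sizes copied by hand from the `balena images` listing earlier in this thread.
# SIZE is reported per image, so any layer shared between two images is
# counted once for each image in this sum (an upper bound on unique data).
IMAGE_SIZES = ["5.82GB", "150MB", "1.01GB", "176MB", "5.82GB", "72.3MB", "9.14kB"]

# Assumption: decimal units, as printed in the listing.
UNITS = {"kB": 1e3, "MB": 1e6, "GB": 1e9}


def parse_size(text: str) -> float:
    """Convert a size string such as '5.82GB' to a number of bytes."""
    for suffix, factor in UNITS.items():
        if text.endswith(suffix):
            return float(text[: -len(suffix)]) * factor
    raise ValueError(f"unrecognised size: {text!r}")


def total_gb(sizes) -> float:
    """Sum a list of size strings and return the total in GB."""
    return sum(parse_size(s) for s in sizes) / 1e9


if __name__ == "__main__":
    print(f"sum of image sizes: {total_gb(IMAGE_SIZES):.2f} GB")
```

Even with shared layers counted once per image, the sum comes to roughly 13 GB. That is above the 6.8G that df reports as used on /mnt/data, which is what layer sharing would predict, but still well short of the 19.9 GB preloaded file.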

To be absolutely sure, I’ve gone ahead and gone through the process again.
The result seems the same:

robin@ubuntu:~/balena/ffai$ ls -hal
total 26G
-rw-r--r-- 1 robin robin 1.3G Sep 22 07:53 balena-cloud-jn30b-nano-2.56.0+rev1-v11.14.0-cm-orange.img

robin@ubuntu:~/balena/ffai$ cp balena-cloud-jn30b-nano-2.56.0+rev1-v11.14.0-cm-orange.img ffaiv1_orange-noproxy_ffai-basic_preloaded-v2.img 

robin@ubuntu:~/balena/ffai$ balena preload ffaiv1_orange-noproxy_ffai-basic_preloaded-v2.img --app ffai-basic --debug
[debug] new argv=[/home/robin/.nvm/versions/node/v12.18.4/bin/node,/home/robin/.nvm/versions/node/v12.18.4/bin/balena,preload,ffaiv1_orange-noproxy_ffai-basic_preloaded-v2.img,--app,ffai-basic] length=6
Building Docker preloader image. [===                     ] 12%
Step 1/7 : FROM docker:17.12.0-ce-dind
Building Docker preloader image. [======                  ] 25%
Step 2/7 : RUN apk update && apk add --no-cache python3 parted btrfs-progs util-linux sfdisk file coreutils sgdisk
 ---> Using cache
Building Docker preloader image. [=========               ] 37%
Step 3/7 : COPY ./requirements.txt /tmp/
 ---> Using cache
Building Docker preloader image. [============            ] 50%
Step 4/7 : RUN pip3 install -r /tmp/requirements.txt
 ---> Using cache
Building Docker preloader image. [===============         ] 62%
Step 5/7 : COPY ./src /usr/src/app
 ---> Using cache
Building Docker preloader image. [==================      ] 75%
Step 6/7 : WORKDIR /usr/src/app
 ---> Using cache
Building Docker preloader image. [=====================   ] 87%
Step 7/7 : CMD ["python3", "/usr/src/app/preload.py"]
 ---> Using cache
 ---> 6cc2e4174f90
Successfully built 6cc2e4174f90
Building Docker preloader image. [========================] 100%
| Checking that the image is a writable file
| Finding a free tcp port and getting balena settings
| Checking if the image is an edison zip archive
- Creating preloader container
/ Starting preloader container
\ Fetching application ffai-basic
| Reading image informationWaiting for Docker to start...
\ Reading image informationDocker started
- Reading image information
? Select a release current
\ Fetching application 1748846
| Resizing partitions and waiting for dockerd to startLeaving splash image alone
- Resizing partitions and waiting for dockerd to startExpanding partition n°16 of /img/balena.img
| Resizing partitions and waiting for dockerd to startResizing ext4 filesystem of partition n°16 of /img/balena.img using /dev/loop21
| Resizing partitions and waiting for dockerd to startFile system OK
- Resizing partitions and waiting for dockerd to startWaiting for Docker to start...
\ Resizing partitions and waiting for dockerd to startDocker started

Pulling 5 images [========================] 100%
/ Cleaning up temporary files

robin@ubuntu:~/balena/ffai$ ls -hal
total 26G
-rw-r--r-- 1 robin robin 1.3G Sep 22 07:53 balena-cloud-jn30b-nano-2.56.0+rev1-v11.14.0-cm-orange.img
-rw-r--r-- 1 robin robin  19G Sep 22 11:08 ffaiv1_orange-noproxy_ffai-basic_preloaded.img
-rw-r--r-- 1 robin robin  19G Sep 22 14:10 ffaiv1_orange-noproxy_ffai-basic_preloaded-v2.img
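
A side note on reading the listing above: `ls -l` reports a file's apparent size, which can be larger than the space actually allocated on disk if the file is sparse. Whether that applies to these preloaded images is not established in this thread. The sketch below is one way to check on a Linux host. The function name is hypothetical, the path in the usage comment is just the file from the listing, and it relies on `st_blocks` being counted in 512-byte units, as the Python documentation describes for Unix.

```python
import os


def apparent_and_allocated(path: str) -> tuple[int, int]:
    """Return (apparent size, allocated size) of a file, both in bytes.

    Apparent size is what `ls -l` shows; allocated size is what the file
    occupies on disk. Unix only: `st_blocks` is not available on Windows.
    """
    st = os.stat(path)
    return st.st_size, st.st_blocks * 512


if __name__ == "__main__":
    import sys

    # Usage: python3 check_size.py ffaiv1_orange-noproxy_ffai-basic_preloaded-v2.img
    for image_path in sys.argv[1:]:
        apparent, allocated = apparent_and_allocated(image_path)
        print(f"{image_path}: apparent {apparent / 1e9:.2f} GB, "
              f"allocated {allocated / 1e9:.2f} GB")
```

If the two numbers are close, the file really does hold that much data. If the allocated size is much smaller, the file is sparse, although a tool that writes the image to a device may still need the full apparent size.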

Great, thanks for going through all of that again, Robin. This definitely looks like a bug, or at least some very undesirable behaviour, in the CLI. I have asked the CLI team to look deeper into this, and I have created a CLI issue to track it here: https://github.com/balena-io/balena-cli/issues/2043. If you can add your CLI, OS, and laptop OS versions etc. to that ticket, it would aid the investigation.