Analyze image sizes and code

Hi,
What is the best way to analyze container sizes?
We are using a multi-service app and have created quite a few services, but every push has become really slow, even for a minor change.
I want to understand which containers take up too much space. Are they sharing layers? (Some of our containers use a Node base image and some use a Python one.)

Hi, when you push to a device, at the end of the build process there is info on the size of the containers that are generated. Is that information useful for you?

Yes, it is useful, thanks.
But it makes no sense to me. How can every service be 0.5 GB? Don't the services share base images?
Every service takes about 500 MB, which is way too much.

We also have some documentation on this. Have you seen https://www.balena.io/docs/learn/deploy/build-optimization/ and https://www.balena.io/docs/learn/more/masterclasses/services-masterclass/#6-multi-stage-builds ?

Thanks for the fast responses. I'm trying a multi-stage build and will report back.
One other thing I'm really struggling with is installing the grpcio tools.
It takes a really long time to install those packages for some, and I use them in every service.
This is how it looks (I build it using balena push; when I build it with Docker on my PC, it works fine):

[debug] handling message: {“message”:"\u001b[34m[main]\u001b[39m Running setup.py install for grpcio: still running…"}
[main] Running setup.py install for grpcio: still running…
[debug] handling message: {“message”:"\u001b[36m[Info]\u001b[39m Still Working…"}
[Info] Still Working…
[debug] handling message: {“message”:"\u001b[34m[main]\u001b[39m Running setup.py install for grpcio: still running…"}
[main] Running setup.py install for grpcio: still running…
[debug] handling message: {“message”:"\u001b[36m[Info]\u001b[39m Still Working…"}
[Info] Still Working…
[debug] handling message: {“message”:"\u001b[34m[main]\u001b[39m Running setup.py install for grpcio: still running…"}
[main] Running setup.py install for grpcio: still running…
[debug] handling message: {“message”:"\u001b[36m[Info]\u001b[39m Still Working…"}
[Info] Still Working…
[debug] handling message: {“message”:"\u001b[34m[main]\u001b[39m Running setup.py install for grpcio: still running…"}
[main] Running setup.py install for grpcio: still running…
[debug] handling message: {“message”:"\u001b[36m[Info]\u001b[39m Still Working…"}
[Info] Still Working…
[debug] handling message: {“message”:"\u001b[34m[main]\u001b[39m Running setup.py install for grpcio: still running…"}
[main] Running setup.py install for grpcio: still running…

Hi there,

Personally, I'm not a Python user, but this appears to be a commonly reported issue on the gRPC repo. People are suggesting several workarounds there, but I'm not certain exactly what the issue is. I'll ask the team here if anyone has any advice.

Cheers,
James.

As a quick follow-up, we did a little more investigation and it appears that the package contains native code. On common platforms this code has been prebuilt to speed up installation, but when those prebuilt binaries are unavailable, it has to be compiled from source. I would suggest doing this install on a separate line of your Dockerfile, before anything that is likely to bust the cache, if possible.

Good luck!
James.

Thanks for the answer. I tried several workarounds, but none of them worked.

A follow-up question: since every service I use depends on grpcio, is it possible to create a base image that all the containers start from?

Hi,
Creating a base image for this does sound like a good idea.

You should be able to achieve that with the balena build CLI command. It lets you build the image on your local machine and then push it to a registry, so you can refer to it from your app's Dockerfile.
https://www.balena.io/docs/reference/balena-cli/#build-source

Let us know if that works for you.

I will try balena build, but I feel I am missing something very basic about how balenaOS behaves with Docker.
I assumed that in a multi-service app the base image is downloaded only once for all the services (assuming they use the same base image from the balena images), and that the size of each service is just the size of the things we install in its environment.
Is that true?

Hi,

I assumed that in a multi-service app the base image is downloaded only once for all the services (assuming they use the same base image from the balena images), and that the size of each service is just the size of the things we install in its environment.

Yes, you are correct: the base image layers are downloaded once if services share the same base image. But the size of an image is usually calculated and displayed taking into account all the layers it refers to, including the base image layers.
So, if you have 2 services that share the same 3 GB base image and each adds 2 GB of service-specific layers, each service image will be calculated and displayed as 5 GB. However, when you download these 2 images to the device, the real disk space taken is not 10 GB but 7 GB.
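
To make the arithmetic concrete, here is a minimal Python sketch of the example above. The layer names and the dictionary layout are hypothetical and only model the idea; this is not how the engine stores layers internally. Each image reports the sum of all layers it references, while the disk only stores each unique layer once.

```python
# Hypothetical layer names and sizes in GB, mirroring the example above.
layer_sizes_gb = {
    "base": 3,       # layers of the shared base image
    "service-a": 2,  # layers added by the first service
    "service-b": 2,  # layers added by the second service
}

# Each image is modelled as the list of layers it references.
images = {
    "service-a": ["base", "service-a"],
    "service-b": ["base", "service-b"],
}


def reported_size(image_name):
    """Size shown for one image: every layer it references is counted."""
    return sum(layer_sizes_gb[layer] for layer in images[image_name])


def disk_usage():
    """Space used on disk: each unique layer is stored only once."""
    unique_layers = {layer for layers in images.values() for layer in layers}
    return sum(layer_sizes_gb[layer] for layer in unique_layers)


for name in images:
    print(f"{name}: reported {reported_size(name)} GB")
print(f"sum of reported sizes: {sum(reported_size(n) for n in images)} GB")
print(f"actual disk usage: {disk_usage()} GB")
```

The gap between the last two numbers is the size of the shared base layers, counted once per extra service that reuses them.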

Hope this helps.

It helps a lot, actually.
And how can I see, in the host OS, the size of all the containers together?

@mellerdaniel you should be able to see the image sizes if you are in the host OS and run `balena images`, but it does not show you what is shared; it just shows the raw size of each image. To see the cumulative total of all the space used on disk, you would need to look at the output of `df -h` and find the line whose "Mounted on" column says `/mnt/data`.