balena push vs build and deploy

I have a multi-service, multi-stage build for Pi Zero devices based on Alpine images. I typically develop and optimize the image specs/Dockerfile by building with balena build -a my-app-name, which builds on my Mac using my local Docker instance, but I do the actual build and application deployment with balena push my-app-name, which builds on the servers and then deploys to the application group. So far this seems to have worked, but I’m a bit unclear on a few things:

  • My understanding is that when I build using build, Docker will automatically cross-compile based on the architecture that comes from the application profile. Is that correct?
  • The container size is different depending on whether I build with my local Docker or in the cloud (the latter is smaller). Is this because the local build automatically includes emulation libraries?
  • Is doing a balena deploy --build equivalent to doing a push? There seems to be more flexibility with build, for example by providing build arguments that can control the build.

If you’re using balena build to build your images then you should be using balena deploy to deploy the built artifacts. balena push bundles your directory and sends it for building (again) on our cloud builders and deploys in one go. The equivalent to push is indeed balena deploy --build. Both build and deploy --build will cross-compile if needed, but push will most likely use the ARM builders, which means there’s no cross-compilation.
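For reference, a minimal sketch of the three workflows side by side (using the app name from above):

# Build locally with your Docker daemon, then deploy the built artifacts:
balena build -a my-app-name
balena deploy my-app-name

# Build locally and deploy in a single step (the locally built equivalent of push):
balena deploy my-app-name --build

# Bundle the project, build on the cloud builders, and deploy in one go:
balena push my-app-name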

Thank you. That’s clear. What would account for the differences in image size? 328MB when I build locally and 315MB when I build using push? It makes me worry a bit that the two are not in fact equivalent.

Hi, I’m not currently sure what accounts for the difference in size and will investigate, but in the interim I wanted to point you at this Masterclass that has a lot more information/examples on the build/push/deploy commands. https://github.com/balena-io/balena-cli-masterclass#5-building-and-deploying-an-application-without-the-builder

Hi @0xff
Could you clarify where those image size numbers are from?
Do the two releases show different image sizes in the release summary page in the balenaCloud dashboard?

This could be related to previously cached layers in the cloud builder or in your CLI build, while some of the dependencies have been updated.
Could you try running a balena deploy with the additional --nocache argument and comparing the results, while also doing a git push balena master:balena-nocache?
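Concretely, that would look something like this, assuming the app is named my-app-name and the balena git remote is already set up:

# CLI build/deploy with the layer cache bypassed:
balena deploy my-app-name --build --nocache

# Cloud build with the cache bypassed, by pushing to a branch name ending in -nocache:
git push balena master:balena-nocache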

If they still differ, then this might be related to the fact that our CLI still uses 1024 to convert MB to KB in some places, while our builder fetches the image size as reported by the docker daemon, which uses 1000 for conversions.
We do have a CLI issue tracking this conversion problem. Let me point you to that for reference:

Kind regards,
Thodoris

I don’t know how I missed the cli-masterclass doc but it’s a great resource. Thanks for pointing it out.

Ah, I think you are correct that the issue comes down to how the sizes are reported. I just built using the standard balena build -a command and saw that the image size for this service was reported as:

[Build] solmon Image size: 314.15 MB

Evidently I was looking at the sizes via docker image ls, where I see:

balena_solmon latest b90c8f90f78b 3 seconds ago 329MB

The difference of about 4.8% is exactly the ratio between (1024*1024) = 1,048,576 and 1,000,000, confirming your suspicion.
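As a quick sanity check (a sketch; bc just does the unit conversion):

# The CLI's 314.15 "MB" are 2^20-byte units; re-expressed with 10^6 bytes per MB:
echo "scale=2; 314.15 * 1048576 / 1000000" | bc
# ~329.41, which docker image ls rounds to the 329MB shown above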

But here’s another issue, related to balena deploy app_name vs balena push app_name. When I use the former, I get the following message:

 solmon  ok/usr/src/app/node_modules/bindings/bindings.js:121
 solmon          throw e;
 solmon          ^
 solmon  
 solmon  Error: Error relocating /usr/src/app/node_modules/@serialport/bindings/build/Release/bindings.node: _ZN2v820EscapableHandleScope6EscapeEPPNS_8internal6ObjectE: symbol not found
 solmon      at Object.Module._extensions..node (internal/modules/cjs/loader.js:1025:18)
 solmon      at Module.load (internal/modules/cjs/loader.js:815:32)
 solmon      at Function.Module._load (internal/modules/cjs/loader.js:727:14)
 solmon      at Module.require (internal/modules/cjs/loader.js:852:19)
 solmon      at require (internal/modules/cjs/helpers.js:74:18)
 solmon      at bindings (/usr/src/app/node_modules/bindings/bindings.js:112:48)
 solmon      at Object.<anonymous> (/usr/src/app/node_modules/@serialport/bindings/lib/linux.js:2:36)
 solmon      at Module._compile (internal/modules/cjs/loader.js:959:30)
 solmon      at Object.Module._extensions..js (internal/modules/cjs/loader.js:995:10)
 solmon      at Module.load (internal/modules/cjs/loader.js:815:32)

when the container starts. Not a problem when I use the balena push app_name model.

I suspect this is caused by my limited understanding of how the specification of the architecture affects local builds.

Hi there,

Just to confirm: you’re seeing a build failure when doing balena build/deploy but not while doing balena push? Is it possible that your project depends on a tagged container that may have changed (and your workstation has cached older versions)? (e.g. FROM balenalib/%%BALENA_MACHINE_NAME%%-node:latest) If so, it may be worth modifying your Dockerfile to point to a specific version rather than latest or stable. You can confirm this by manually running docker pull of the base images to see if there are new versions.
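As a rough sketch of that check (the image name here is only illustrative, assuming the Pi Zero's raspberry-pi machine name; substitute the base images your Dockerfile actually references):

# Pull the base image and check when it was last published, to rule out a stale
# local copy of a moving tag such as :latest
docker pull balenalib/raspberry-pi-node:latest
docker image inspect balenalib/raspberry-pi-node:latest --format '{{.Created}}'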

Thanks,
James.

Sorry, not a build failure, a run failure when the container spins up.

Oh. I presume your Dockerfile copies the full contents of the node_modules directory into the final stage container without modification?

Yes - I do the build in my build image and then copy all the node_modules into my run image - both are based on Alpine images.

My best guess is that there is a runtime library required by serialport that is not present in the run image. This seems suspicious given that you said it builds fine on your local workstation. Can you try pulling the latest versions of the base images you’re using from the registry to verify that they’re the same?

Thanks,
James.

My concern is that when it builds on my local workstation it uses local node bindings to map the serialport JavaScript code to the appropriate C code, instead of the ones for the Pi. Is that possible?

I’ve been building using the push model and it’s fine, except I would like to use build arguments which the push model doesn’t support.
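For example, something along these lines is what I have in mind (hypothetical variable name; this assumes a CLI version whose build/deploy commands support the --buildArg option and a Dockerfile that declares a matching ARG):

# FEATURE_FLAGS is a made-up name for illustration; it would need "ARG FEATURE_FLAGS"
# in the Dockerfile to have any effect
balena build -a my-app-name --buildArg FEATURE_FLAGS=debug
balena deploy my-app-name --build --buildArg FEATURE_FLAGS=debug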

Hi

balena build does indeed cross-compile if provided with the --emulated flag, for example when building for the Pi on an amd64 machine.

Can you please double-check that you’re adding the --emulated flag to balena build? e.g. balena build -a <app> --emulated

Regards

federico

Sorry about the late response. Using the --emulated flag doesn’t seem to make any difference. I still get the same binding errors.

Hi @0xff,

So given you’re copying the node_modules directory in its entirety, I have a suspicion that this comes down to the way balena push deals with the build, and that maybe you’ve got a node-gyp-built binding in the node_modules directory that’s for the wrong architecture.

Is there a build directory inside node_modules? When you originally created the directory, did you just carry out an npm install from the relevant source directory (the one with package.json in it) on your development machine, or is it populated as part of the initial Dockerfile (it would be useful to see the relevant portions of your Dockerfile here)? If the latter, then it should work as long as you’re not switching between base image architectures (for example, npm installing using a 64-bit Arm image and then copying the modules into a 32-bit Arm image).
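One thing worth ruling out either way, sketched below on the assumption that node_modules also exists on your development machine: a later unqualified COPY in the Dockerfile can overwrite the in-container install with host-built (x86) bindings, which would produce exactly this kind of relocation error.

# Keep any host-built node_modules out of the build context so a "COPY . ./" in the
# Dockerfile cannot overwrite the modules installed inside the container
cat >> .dockerignore <<'EOF'
node_modules
EOF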

Best regards,

Heds

Here’s the entire Dockerfile.template:

FROM balenalib/%%BALENA_MACHINE_NAME%%-alpine-python:3.6-build as build

WORKDIR /usr/src/app

COPY /lib/requirements3.txt /lib/requirements3.txt
RUN pip3 install -r /lib/requirements3.txt

RUN apk upgrade --no-cache \
  && apk add --no-cache nodejs npm

# Install the node tools
COPY package.json package.json
RUN npm install && npm cache clean --force && rm -rf /tmp/*

# Install
FROM balenalib/%%BALENA_MACHINE_NAME%%-alpine-python:3.6-run
WORKDIR /usr/src/app

ENV UDEV=1
ENV DBUS_SYSTEM_BUS_ADDRESS=unix:path=/host/run/dbus/system_bus_socket
ENV INITSYSTEM=off

RUN apk upgrade --no-cache \
  && apk add --no-cache nodejs npm iputils modemmanager networkmanager curl

COPY --from=build /usr/local/lib/python3.6/site-packages /usr/local/lib/python3.6/site-packages
COPY --from=build /usr/src/app/node_modules node_modules

COPY . ./
RUN cp bashRC.txt /root/.bashrc
RUN /bin/mkdir -p /data/config/ /data/logs && touch /data/config/ssid.txt
RUN chmod 700 connectCheck.sh

CMD ["/bin/bash","-c",". cmd.sh"]

Have you tried this with a single-stage build instead?

We started with single-stage builds, but the image was excessively large. We cut the size to a third by using the build base image to build the various modules and the run base image for the actual deployed image.