Local build issues with ARM64

Hi,

I’m starting out with Balena and am having trouble building my application locally for an architecture different from my build host (targeting a Raspberry Pi 5, thus arm64). I’m using a multi-container setup, thus with docker-compose.yml.

I have a few Docker Hub images (like postgres), which work fine with balena push. But now I want to add something custom, so I want to build locally using balena build. I have written a Dockerfile for the application in a pretty common two-stage way: a build stage, then a prod stage with just the statically-linked binary (it’s a Rust app).

My issue is: the build does not complete!

For some weird reason, the build of my custom container stops after the first RUN statement, and the build process throws `failed to get destination image "sha256:f7a280d9484d4df4d582c01fc83d933a79b288ce9567d7661068ebce7d6c2348": image with reference sha256:f7a280d9484d4df4d582c01fc83d933a79b288ce9567d7661068ebce7d6c2348 was found but its platform (linux/amd64) does not match the specified platform (linux/arm64/v8)`.

I tried to sprinkle --platform=linux/arm64/v8 on my FROM statements, but to no avail.

The docker-compose looks like the following:

version: '2'
services:
  postgres:
    image: postgres:18
    environment:
      POSTGRES_DB: ...
      ...

  # other services with docker hub images

  app:
    build: ./app

The Dockerfile looks like the following:

FROM rust:1.91.1-slim AS build
RUN dpkg --add-architecture arm64 \
 && apt-get update \
 && apt-get install -y --no-install-recommends musl-dev:arm64 \
 && rustup target add aarch64-unknown-linux-musl
COPY src /src
RUN cd src && cargo build --release --target aarch64-unknown-linux-musl

FROM scratch
COPY --from=build /src/target/aarch64-unknown-linux-musl/release/app /app
CMD ["/app"]

I’m running balena-cli version 22.4.16 on Arch Linux with Docker 28.5.2.

Any idea?

Many thanks!

After many (many) troubleshooting steps, I was able to finish the build. For anyone stumbling on this topic:

  • If your Docker install is old (i.e. you’ve been upgrading your Linux for years), make sure you use the “new” containerd image store. Otherwise, you won’t be able to store images that are not of your host arch.
  • Don’t try to use --platform with buildkit variables or even hardcoded values: the balena CLI will trip on anything other than a template Dockerfile (Dockerfile.template), and it puts the --platform parameters the opposite way you would with buildkit.
    So instead of my FROM --platform=linux/amd64 rust:1.91-slim followed by FROM scratch, do FROM rust:1.91-slim followed by FROM --platform=%%BALENA_ARCH%% scratch (see the sketch after this list).
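For concreteness, here is a minimal sketch of the resulting Dockerfile.template, assuming the containerd image store is already enabled on the host ("features": { "containerd-snapshotter": true } in /etc/docker/daemon.json) and that the rest of the file stays as posted above:

# Dockerfile.template — the build stage runs natively on the build host; only the
# final stage is pinned to the device architecture via balena's %%BALENA_ARCH%%
# template variable.
FROM rust:1.91-slim AS build
RUN dpkg --add-architecture arm64 \
 && apt-get update \
 && apt-get install -y --no-install-recommends musl-dev:arm64 \
 && rustup target add aarch64-unknown-linux-musl
COPY src /src
RUN cd src && cargo build --release --target aarch64-unknown-linux-musl

FROM --platform=%%BALENA_ARCH%% scratch
COPY --from=build /src/target/aarch64-unknown-linux-musl/release/app /app
CMD ["/app"]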

HOWEVER

I cannot deploy the image to my fleet.

balena deploy fails at the very last step:

[Info]    No "docker-compose.yml" file found at "/home/tuetuopay/dev/balena-test-app"
[Info]    Creating default composition with source: "/home/tuetuopay/dev/balena-test-app"
[Info]    Everything is up to date (use --build to force a rebuild)
[Info]    Creating release...
[Info]    Pushing images to registry...
[Info]    Saving release...
[Error]   Deploy failed
Unable to extract image digest (content hash) from image upload progress stream for image:
registry2.balena-cloud.com/v2/1aec1e262ef8c964238850c925f35bec:latest

On the dashboard, the release is just marked as failed.

For the love of god, how is “building and deploying a container for a foreign arch” not something that is trivial for an IoT management platform that, by its very nature, will manage mostly foreign architectures?

Sorry for venting. But I’ve wasted already way too much time on this.

If anyone has an idea, I’m all ears. This is definitely me doing something wrong because I can’t imagine such a trivial operation being broken on this platform. But right now I can’t see why.

I completely failed to make it work and basically gave up.

For anyone finding this topic: I had to completely drop the idea of multi-stage Dockerfiles with mixed architecture on Balena. The only two reliable ways I found to build aarch64 images are:

  • have the built binaries somehow in your build dir and have the Dockerfile be a collection of COPY statements
  • build the image separately using the regular docker buildx tools, then have balena build a single-line Dockerfile that’s just a FROM your-prebuilt-image (see the sketch after this list)
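A sketch of that second workaround, with a placeholder registry and image name (registry.example.com/my-app is hypothetical):

# 1. Build and push the aarch64 image with the regular Docker tooling:
docker buildx build --platform linux/arm64 \
    -t registry.example.com/my-app:latest --push ./app

# 2. The Dockerfile that the balena CLI then builds is a single line:
#      FROM registry.example.com/my-app:latest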

So yeah, I hope this wart will be fixed some day in the balena CLI :slight_smile:

Hey there @Tuetuopay , sorry to hear you had so much trouble getting started.

On a platform like Arch Linux, you would need an emulation layer installed in order to build for a non-native architecture. This is handled automatically by Docker Desktop on Windows and macOS, but on Linux you would need to do something like register QEMU binfmt handlers on the host.
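For example, one common way is the multi-arch binfmt installer image from the Docker/buildx documentation:

# Register a QEMU handler for arm64 binaries on the build host:
docker run --privileged --rm tonistiigi/binfmt --install arm64

# Quick check that arm64 containers now run on the host (should print "aarch64"):
docker run --rm --platform linux/arm64 alpine uname -m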

But an even easier path that we usually recommend is to just use our free, native cloud builders. If you use balena push instead of balena build, our dedicated ARM builders will handle everything natively.
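For example (the fleet slug is a placeholder):

balena push myorg/my-fleet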

Hope this helps, cheers.

Thanks for sharing your experience! Sounds frustrating, but your workaround with prebuilt images or buildx is really helpful for anyone facing the same issue.

This seems like an unfortunate limitation, as the cross-compilation paint has been dry for years and buildx + containerd storage offers a streamlined experience. However, I am well aware Balena has been around for years too, with a lot of technical baggage, and actually pioneered Docker on multiple architectures, so I’m definitely not throwing you guys under the bus. Reconciling years of divergence is not an easy problem to deal with.

With buildx + containerd storage, a lot of the custom balena build pipelines could be dropped.

Yup, that’s the happy path and I fully agree it’s much easier. Yet, having a single pipeline with a standard Dockerfile to build the image both locally (to e.g. test it on local devices not running BalenaOS), and in the cloud (for production) would be a terrific experience!

My dream would be images built by our own CI/CD pipelines and pushed straight to devices without the Balena cloud builders (source code stays in our GitLab, as do some images), so we know the images we build and test locally are the same as those deployed on devices.

Also, it’s the only supported way to use OpenBalena :slight_smile:

Glad it helped! Having your own Docker registry for Balena to directly pull from is another option, albeit more involved. I tried omitting the build part in the docker-compose: the CLI will try to forcefully pull the image, so a locally tagged image is not an option.

I actually need to revisit this topic, as I’ve dug deeper into the “Unable to extract image digest” error, and it turns out it’s a trivially fixable issue, see this GitHub thread.

Thanks for the feedback @Tuetuopay !

Do buildkit (buildx) and containerd storage work on your Arch host to build for ARM? If so, you can perform a local build with your native Docker clients and just use the balena CLI to deploy to openBalena. We recommend using docker compose to build so the images are tagged in a way that the balena deploy command expects.
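A sketch of that flow, assuming buildkit and containerd storage are working on the host (the fleet name is a placeholder):

# Build and tag all services with the native Docker tooling:
docker compose build

# Deploy the already-built images and create a release (no rebuild by the balena CLI):
balena deploy myFleet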

The balena CLI doesn’t support buildkit, as it requires a separate gRPC connection that dockerode only started supporting in mid-2024, and we haven’t had an urgent need for it. Our on-device balena-engine also doesn’t support buildkit, to reduce the binary size.

Yes, the builds work fine with buildx + containerd storage, and with the right tags they are no different from containers produced by balena build. My current workflow is to build the images, push them to our own internal registry, then use balena build with a docker-compose that has no build steps, only references to those images. And you’re right, the main interest in using balena build is to get the correct tags for balena deploy to find afterwards.
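For reference, a sketch of the compose file used in that workflow — the registry and image names are placeholders for images built and pushed by our own CI:

version: '2'
services:
  postgres:
    image: postgres:18

  # other services pulled from Docker Hub

  app:
    image: registry.example.com/my-app:latest   # prebuilt aarch64 image from our own registry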

Gotcha about buildx, I did not know it had a completely different protocol than the old build. Perhaps some documentation would help!

Anyways, thanks for the details :slight_smile: Wish you the best of luck!