I have successfully developed, released, updated, and managed my project on a test device using an openBalena architecture based on 2 VMs (a client with balena-cli and a server with openBalena).
However, the production environment has no internet access, so I created another pair of VMs, installed balena-cli, openBalena, and the necessary software (such as Docker), and then removed internet access.
I re-flashed the device with the image containing references to the production environment, and I can see it online in the production environment. However, I'm having trouble deploying the application.
Initially, I tried simply building on the dev balena-cli machine with balena build --fleet <slug>, then running docker save -o project.tar <docker_image>, transferring the archive to the internet-isolated environment, running docker import project.tar, and finally balena deploy <other-slug> <new_generated_docker_image_uuid>, but I get this error:
[Error] Deploy failed
Get "https://registry2.DOMAIN.local/v2/": tls: failed to verify certificate: x509: certificate signed by unknown authority
Maybe it's due to the certificates differing between openBalena dev and openBalena prod: the two environments use two different self-signed certificates.
All the documentation I'm reading assumes either a client that can reach multiple servers, or a client that deploys to a device with limited connectivity, where the project image is preloaded during installation and only control and data reception are handled over the network.
My case is different: both the production client and server are internet-isolated. I hope there's a way other than using docker save and docker import, but I can't find it.
The core issue was likely that I hadn't added the self-signed certificate to Docker's trusted certificate store. I don't recall ever doing this during the initial setup, which is probably exactly why it was missing.
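For anyone else hitting this: Docker trusts a private registry's self-signed CA once the certificate is placed in a per-registry directory, and balena-cli (a Node.js app) needs the CA as well. A minimal sketch, assuming the registry hostname from the error above and a ca.crt in the current directory:

```bash
# Make the Docker daemon trust the openBalena registry's self-signed CA.
sudo mkdir -p /etc/docker/certs.d/registry2.DOMAIN.local
sudo cp ca.crt /etc/docker/certs.d/registry2.DOMAIN.local/ca.crt

# balena-cli reads extra CAs via Node's standard mechanism.
export NODE_EXTRA_CA_CERTS="$PWD/ca.crt"
```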
Initially, I suspected the problem stemmed from how I’d generated the self-signed certificate (thinking it didn’t support multiple domains). So, I regenerated the certificate for the first server and then moved it to the second. It was at that point I realized the issue persisted, and ultimately, I resolved it as described above.
Since I’m still not entirely sure if the root cause was solely the missing certificate in Docker’s trust store or if it also related to the certificate generation process, I’ll document how I created the multi-domain certificate. I had a tough time finding clear information online and only succeeded after multiple attempts.
Creating a Multi-Domain Self-Signed Certificate
First, create a SAN (Subject Alternative Name) file as follows:
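A minimal sketch of such a config; the CN and the SAN entries below are placeholders standing in for my two environments' real domains:

```bash
# san.cnf -- openssl config with Subject Alternative Names for both environments
cat > san.cnf <<'EOF'
[req]
distinguished_name = req_distinguished_name
x509_extensions    = v3_req
prompt             = no

[req_distinguished_name]
CN = DOMAIN.local

[v3_req]
keyUsage         = keyEncipherment, dataEncipherment
extendedKeyUsage = serverAuth
subjectAltName   = @alt_names

[alt_names]
DNS.1 = *.dev.DOMAIN.local
DNS.2 = *.prod.DOMAIN.local
DNS.3 = dev.DOMAIN.local
DNS.4 = prod.DOMAIN.local
EOF

# Generate a self-signed certificate covering all the SANs above.
openssl req -x509 -nodes -newkey rsa:4096 -days 365 \
  -keyout server.key -out server.crt -config san.cnf -extensions v3_req
```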
Then, proceed with the standard installation procedure on both servers. They will now share the same certificate despite using different domains. In my specific case, these were subdomains, but the process works identically for completely unrelated domains.
Anyway, I haven't been able to fully resolve the problem, as I wrote in this other thread. The remaining error is:
Jun 10 10:44:23 89a1389 balena-supervisor[3973]: [event] Event: Image downloaded {"image":{"name":"registry2.new-domain.local/v2/5cd9eb7b19e96f6658726f26fe30f14c@sha256:6878587e36cfd28bc337c2fe9a19eff1ef9d9ee823cf81d351910a8b70695738","appId":1,"appUuid":"1f5dce674dbb47f49f987a60365a885e","serviceId":1,"serviceName":"main","imageId":11,"releaseId":11,"commit":"c09a84612c324396c2cb981f7beaf57d"}}
Jun 10 10:44:23 89a1389 balena-supervisor[3973]: [event] Event: Take update locks {"appId":"1","force":false,"services":["main"]}
Jun 10 10:44:24 89a1389 balena-supervisor[3973]: [event] Event: Service install {"service":{"appId":1,"serviceId":1,"serviceName":"main","commit":"c09a84612c324396c2cb981f7beaf57d","releaseId":11}}
Jun 10 10:44:24 89a1389 balena-supervisor[3973]: [error] Scheduling another update attempt in 600000ms due to failure: Error: Failed to apply state transition steps. (HTTP code 400) bad parameter - No command specified Steps:["start"]
Thanks for your quick reply and the clarification!
You’re absolutely right to ask why I was manually managing the PKI. The truth is, I went through all that hassle because I was really struggling to understand the root cause of the issue. As I mentioned in my post, I initially suspected it might just be about making the self-signed certificate trusted by Docker, but I wasn’t entirely sure, and finding clear information was tough.
I’ll proceed with your suggested steps to reset the PKI state by removing the certs and pki volumes, updating the DNS_TLD environment variable, and restarting the composition. I appreciate you pointing me in the right direction!
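For the record, here's a sketch of those steps as I understand them; the compose project name (openbalena) and the exact volume names are assumptions, so I'll confirm them with docker volume ls first:

```bash
# Stop the composition and drop the PKI-related volumes so certs regenerate.
docker compose down
docker volume ls                                   # confirm the actual volume names
docker volume rm openbalena_certs openbalena_pki   # assumed names, adjust as needed

# Update DNS_TLD in the environment/config, then bring everything back up;
# openBalena should regenerate its PKI state on the next start.
docker compose up -d
```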
Just to confirm: once I've regenerated everything as you described, will it then be sufficient to simply run the balena join command from the client machine and follow the wizard to inject the new configurations?
The core issue wasn’t Balena itself, but my misuse of Docker commands.
I was using docker import to load images, which flattens them and strips vital metadata (like CMD, ENTRYPOINT, ENV). This resulted in images that didn’t behave correctly.
The fix was to use docker load instead. This command correctly restores images, preserving all original metadata and layers. Once I switched to docker load, everything worked as expected.
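A quick way to confirm which situation you're in is to inspect the image's runtime config; an image brought in via docker import typically shows empty values here, which lines up with the "No command specified" error above:

```bash
# Empty Cmd/Entrypoint after `docker import` explains "No command specified".
docker inspect -f '{{.Config.Cmd}} {{.Config.Entrypoint}}' <image-id>
```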
Summary
Summary of the process for deploying a Docker image from an internet-connected build server to an internet-isolated device (a consolidated command sketch follows the steps below):
On the Internet-Connected (Build) Server:
Identify the image: docker image list
Use this to find the <image-id> of the image you want to copy
Save the image: docker save -o /my-path/my-image.tar <image-id>
This command saves the complete Docker image (including all layers and metadata) into a .tar archive at the specified path.
On the Internet-Isolated (Target) Server:
(First, transfer the my-image.tar file to this server.)
Load the image: docker load < my-image.tar
This command correctly loads the image from the .tar archive, preserving all its original metadata, making it fully functional.
Tag the loaded image (Optional): docker tag <id-image-loaded> <repository>:<version-tag>
This step assigns a memorable name and tag to the loaded image. It's optional but recommended: if you deploy the image directly by its ID, Docker might prune (delete) the untagged image from its catalog after the deployment, whereas tagging ensures it persists.
Deploy with Balena: balena deploy <slug> <id-image-loaded>
Replace <slug> with your Balena application slug and <id-image-loaded> with the actual ID of the image you just loaded (or its new <repository>:<version-tag> if you tagged it).
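Putting it all together, here is a consolidated sketch of the whole sequence (the paths, the my-project:1.0 image name, and the fleet slug are examples):

```bash
# --- On the internet-connected build server ---
balena build --fleet <slug>                       # build the release locally
docker image list                                 # find the <image-id>
docker save -o /my-path/my-image.tar <image-id>   # preserves layers + metadata

# --- Transfer my-image.tar to the isolated environment (USB drive, LAN copy, ...) ---

# --- On the internet-isolated target server ---
docker load < my-image.tar                        # NOT `docker import`
docker tag <id-image-loaded> my-project:1.0       # optional; prevents pruning
balena deploy <slug> my-project:1.0               # deploy to the isolated openBalena
```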