Hi,
I’m having two similar issues that I think are linked, and one of them is blocking me. Here is the background on my setup.
Context
I recently installed a fresh OpenBalena 3.6.0 on a DigitalOcean Droplet running Ubuntu 22.04.
I’m in the process of testing everything before migrating my fleet to this new OpenBalena.
My fleet is currently managed by an old OpenBalena 1.3.0, which runs on a separate Droplet on Ubuntu 18.04.
With the new OpenBalena, I am using balena-cli 13.6.1.
With the old OpenBalena, I used to run balena-cli 11.31.26.
With the new OpenBalena, I am using BalenaOS versions such as 2.94.4 and 2.83.21+rev1.
With the old OpenBalena, I was using BalenaOS versions such as 2.32.0+rev1, 2.46.1+rev1 and 2.48.0+rev1.
With the new OpenBalena, I use the following base image in Dockerfile.template:
FROM balenalib/%%BALENA_MACHINE_NAME%%-debian-node:14-bullseye-run
With the old OpenBalena, I used the following base image in Dockerfile.template:
FROM balenalib/%%BALENA_MACHINE_NAME%%-node:14-buster-run
I am using the Raspberry Pi Zero W (raspberry-pi / armv6hf) as well as the Raspberry Pi Zero 2 W (raspberrypi0-2w-64 / aarch64).
As you can see, the Raspberry Pi Zero 2 W cannot run on my old infrastructure, because its minimum BalenaOS version is above the maximum BalenaOS version of my old OpenBalena instance.
This is the entire reason for upgrading: being able to use both the old Raspberry Pi and the new one.
I have an application that uses Node.js. That application uses the better-sqlite3 npm module.
That module requires a binary file to be compiled for the armv6hf arch, and in theory has a prebuilt binary for the aarch64 arch, which npm/yarn can pull automatically.
This used to work perfectly on my old OpenBalena. However, it does not work on the new OpenBalena, and it fails in 2 different ways depending on whether the device is a Raspberry Pi Zero W or a Raspberry Pi Zero 2 W.
Error with Raspberry Pi Zero 2 W
This is the first error I encountered.
When using balena deploy <myfleet> --emulated --build, I get a warning at each build step that reads:
[Build] main ---> [Warning] The requested image's platform (linux/arm64/v8) does not match the detected host platform (linux/amd64) and no specific platform was requested
This warning never appeared with my previous OpenBalena. I am indeed building on a linux/amd64 host for linux/arm64/v8. What bothers me is that the warning suggests no platform was requested. I looked around for ways to explicitly specify the platform in Docker, but nothing made that warning go away.
So I rolled with the warning. My build was successful.
However, once deployed, when the Node.js script first requires the better-sqlite3 module, a fatal error occurs.
[Logs] [2022-7-1 2:30:03] [main] /usr/src/app/node_modules/bindings/bindings.js:121
[Logs] [2022-7-1 2:30:03] [main] throw e;
[Logs] [2022-7-1 2:30:03] [main] ^
[Logs] [2022-7-1 2:30:03] [main]
[Logs] [2022-7-1 2:30:03] [main] Error: /usr/src/app/node_modules/better-sqlite3/build/Release/better_sqlite3.node: cannot open shared object file: No such file or directory
[Logs] [2022-7-1 2:30:03] [main] at Object.Module._extensions..node (internal/modules/cjs/loader.js:1144:18)
[Logs] [2022-7-1 2:30:04] [main] at Module.load (internal/modules/cjs/loader.js:950:32)
[Logs] [2022-7-1 2:30:04] [main] at Function.Module._load (internal/modules/cjs/loader.js:790:12)
[Logs] [2022-7-1 2:30:04] [main] at Module.require (internal/modules/cjs/loader.js:974:19)
[Logs] [2022-7-1 2:30:04] [main] at require (internal/modules/cjs/helpers.js:93:18)
[Logs] [2022-7-1 2:30:04] [main] at bindings (/usr/src/app/node_modules/bindings/bindings.js:112:48)
[Logs] [2022-7-1 2:30:04] [main] at new Database (/usr/src/app/node_modules/better-sqlite3/lib/database.js:48:64)
[Logs] [2022-7-1 2:30:04] [main] at Socket.<anonymous> (/usr/src/app/<myscript>.js:18:14)
[Logs] [2022-7-1 2:30:04] [main] at Object.onceWrapper (events.js:520:26)
[Logs] [2022-7-1 2:30:04] [main] at Socket.emit (events.js:400:28) {
[Logs] [2022-7-1 2:30:04] [main] code: 'ERR_DLOPEN_FAILED'
[Logs] [2022-7-1 2:30:04] [main] }
This means the binding binary (in theory downloaded as a prebuilt) is nowhere to be found.
I say in theory because the build step where yarn installs dependencies was very fast, whereas it used to take ~15min on my machine when it needed to build better-sqlite3. But since this error says the file is not there, I am unsure whether it was actually downloaded.
I logged in with SSH, and the file actually was there.
I manually ran yarn to install dependencies again through SSH, and after that, the software ran fine.
So at that point I thought “alright, I’ll just move the yarn call inside my init script, instead of inside the Dockerfile, that way the dependencies are downloaded on the device directly”.
And that worked.
Then came time to deploy the same software on the Raspberry Pi Zero W…
Error with Raspberry Pi Zero W
Since this device needs to build the better-sqlite3 module from scratch every time (no prebuilt binaries), it was impractical to run yarn in the init script. This would mean a >40min build time on first boot as well as on every update from the registry.
So I decided to create a second Dockerfile specifically for the Raspberry Pi Zero W, which would install yarn dependencies in the Dockerfile, like I used to in my old OpenBalena.
Again, during the build, I get the warning (this time the platform is linux/arm/v6, which is correct).
[Build] main ---> [Warning] The requested image's platform (linux/arm/v6) does not match the detected host platform (linux/amd64) and no specific platform was requested
But here comes the final issue, which I cannot work around. Building better-sqlite3 in the Docker build, like I used to with my previous OpenBalena, results in an error once deployed, oddly similar to the one mentioned above, yet not exactly the same. Here it is:
[Logs] [2022-7-1 2:30:03] [main] /usr/src/app/node_modules/bindings/bindings.js:121
[Logs] [2022-7-1 2:30:03] [main] throw e;
[Logs] [2022-7-1 2:30:03] [main] ^
[Logs] [2022-7-1 2:30:03] [main]
[Logs] [2022-7-1 2:30:03] [main] Error: /usr/src/app/node_modules/better-sqlite3/build/Release/better_sqlite3.node: wrong ELF class: ELFCLASS64
[Logs] [2022-7-1 2:30:03] [main] at Object.Module._extensions..node (internal/modules/cjs/loader.js:1144:18)
[Logs] [2022-7-1 2:30:04] [main] at Module.load (internal/modules/cjs/loader.js:950:32)
[Logs] [2022-7-1 2:30:04] [main] at Function.Module._load (internal/modules/cjs/loader.js:790:12)
[Logs] [2022-7-1 2:30:04] [main] at Module.require (internal/modules/cjs/loader.js:974:19)
[Logs] [2022-7-1 2:30:04] [main] at require (internal/modules/cjs/helpers.js:93:18)
[Logs] [2022-7-1 2:30:04] [main] at bindings (/usr/src/app/node_modules/bindings/bindings.js:112:48)
[Logs] [2022-7-1 2:30:04] [main] at new Database (/usr/src/app/node_modules/better-sqlite3/lib/database.js:48:64)
[Logs] [2022-7-1 2:30:04] [main] at Socket.<anonymous> (/usr/src/app/<myscript>.js:18:14)
[Logs] [2022-7-1 2:30:04] [main] at Object.onceWrapper (events.js:520:26)
[Logs] [2022-7-1 2:30:04] [main] at Socket.emit (events.js:400:28) {
[Logs] [2022-7-1 2:30:04] [main] code: 'ERR_DLOPEN_FAILED'
[Logs] [2022-7-1 2:30:04] [main] }
Conclusion
As you can see, the error for the Raspberry Pi Zero W is wrong ELF class. As if it was built for amd64.
Considering this, and considering the warning, I strongly suspect that the balena deploy command does
- something different from the old one (balena-cli 13.6.1 vs. 11.31.26).
- something wrong with architectures (since the node module is either not compiled for the correct arch, or absent).
I would very much appreciate the Balena Team helping me debug this. I’ve been on this “migration” for a few nights now, and I keep hitting roadblocks. I managed to overcome all of them so far, but this one I cannot get around. Compiling this module on every boot/update on every device is not feasible.
I want to get the same experience as my old OpenBalena instance.
Thanks for reading. I am at your disposal for extra information.
Tim