Most services won’t start on RPi Zero W

Hi,

I’m trying to run my synthesizer project on an RPi Zero W (see Make a MIDI Synthesizer out of a Raspberry 🎶 - #2 by dtischler).

But when the device boots, only one or two services start, and the two (or three) others keep restarting without ever coming up. I can’t find anything in the logs that helps.

I’ve tried running the docker image manually and it does not seem to work, but I don’t know why either. For example, when I run balena run --rm alpine date, it hangs for about 40 to 50 seconds and then stops without displaying anything.

I ran the Diagnostics health checks from the dashboard; only check_service_restarts failed:

Some services are restarting unexpectedly: (service: /autoconnect_3967432_1900041 restart count: 41)

The “Device diagnostic” reported a lot of errors, but as far as I can tell they are mostly related to the containers restarting all the time.

The “Supervisor state” also reported something strange. Despite the device being on the latest release, it says:

{
  "api_port": 48484,
  "ip_address": "192.168.0.105",
  "os_version": "balenaOS 2.80.8+rev2",
  "mac_address": "B8:27:EB:D2:B5:D5",
  "supervisor_version": "12.8.8",
  "update_pending": true,
  "update_failed": true,
  "update_downloaded": false,
  "status": "Idle",
  "download_progress": null
}

but maybe that’s only because not all services were started

I don’t know where I should look next :confused:

any ideas ?

PS: I’ve opened support access for the device 90d187c547256b612ed77ae59fd014ca; alternatively, anyone can try by deploying the MIDI Synthesizer project on another device. It works well on the RPi 2, 3 and 4. I only have issues with the RPi Zero W.

thx!

Hi, let me take a look at your device

Hi again, this is a strange issue. I can confirm that I’m seeing the autoconnect service restart and that running the alpine image seems to be broken; however, other containers, the supervisor and the hello-world container run without issue. Would you mind if I removed the alpine image and re-downloaded it, to make sure that is not a red herring? (It may be that the image got corrupted on download.)

Have you experienced this issue on other devices or just this one? I’m also going to run your app on my Pi Zero to see if I can replicate the problem.

I just removed the alpine image (balena image rm alpine) and it was re-downloaded on the next run, but I got the same issue.

Something I forgot to mention: when I reboot the device, it’s not always the same service that manages to start…

I have not tested this on another Pi Zero because this is the only one I have (it’s brand new, I ordered it just for this). The SD card is not new though, so I’ll try with a new one to make sure that’s not the issue here.

Yes, please try with the new SD card. I’m also testing the MIDI Synthesizer project on my device to see if I observe similar behavior.

It’s not working any better with a new SD card :frowning:

(I’ve opened support access to this device too: 0ce105adad558022ad09d508812fa9e3)

The other one is offline because it’s the same device, just with a new SD card.

Hi, just wanted to let you know I was able to reproduce the problem on my Pi Zero device, including the alpine image issue. I’m still not sure if both problems are related, but the same test with an ubuntu image succeeds.

I’m stepping away from the computer now but I’ll try to keep looking at this during the week. My next step will probably be to run the app in local mode and test the services one by one to see if the issue recurs. If you are willing, you can try the same thing on your device.

Please keep us updated if you make any progress.

Hi again, thanks again for reporting this, this has been a fun investigation.

I think I have an answer for your problem. It looks like the alpine image issue and your problem running the services are related after all.

We think this is related to balena-os/balena-engine issue #269, “ARMv6 machine pulls v7 image from manifest list”, which has been fixed in upstream moby but has not yet been merged into balena-engine.

When you push a multi-arch image to the builder, with a docker-compose file like the one below:

version: "2"

services:
  autoconnect:
    restart: always
    image: multi-arch-image

The builder will use the --platform docker flag to pull the image, in order to make sure that the image for the proper arch makes it to your device. The problem is that docker pull --platform will try to use the closest arch available (or whatever is on Docker Hub if the image is single-arch).

This is generally not a problem, since trying to run an image for the wrong architecture makes docker report an exec format error. However, your images (e.g. texthtml/midi-synthesizer-autoconnect) have a build for the arm/v7 architecture but not for arm/v6. Since the two architectures are so similar, the engine just fails silently, leading to the issue you are experiencing.

This is the same thing you are observing when running balena run --rm alpine date: due to the bug linked above, the engine pulls the v7 image, which causes the behavior you reported.
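As a rough illustration of the mismatch described above, the following Python sketch contrasts a lenient platform matcher, which accepts any entry with the same architecture, with a stricter one that only accepts ARM variants the host CPU can execute. This is not the engine's or the builder's actual code; the function names, the manifest entries and the digests are hypothetical placeholders.

```python
from typing import Optional

# Hypothetical manifest list for an image published for arm/v7 and arm64 only.
MANIFESTS = [
    {"architecture": "arm", "variant": "v7", "digest": "sha256:placeholder-armv7"},
    {"architecture": "arm64", "variant": "v8", "digest": "sha256:placeholder-arm64"},
]


def variant_number(variant: str) -> int:
    """Turn a variant string such as 'v6' into the integer 6."""
    return int(variant.lstrip("v"))


def pick_lenient(manifests: list, arch: str, variant: str) -> Optional[dict]:
    """Return the first entry with the same architecture, ignoring the variant.

    This mimics the "closest arch available" behaviour: an arm/v6 host
    is handed an arm/v7 image when no arm/v6 build exists.
    """
    for entry in manifests:
        if entry["architecture"] == arch:
            return entry
    return None


def pick_strict(manifests: list, arch: str, variant: str) -> Optional[dict]:
    """Return the newest entry whose variant is not newer than the host's.

    An arm/v7 CPU can run arm/v6 code, but not the other way round, so only
    entries with a variant number <= the host's are considered runnable.
    """
    host = variant_number(variant)
    runnable = [
        entry
        for entry in manifests
        if entry["architecture"] == arch
        and variant_number(entry["variant"]) <= host
    ]
    return max(runnable, key=lambda e: variant_number(e["variant"]), default=None)


if __name__ == "__main__":
    # Simulate a Pi Zero W (arm/v6) asking for this image.
    print("lenient:", pick_lenient(MANIFESTS, "arm", "v6"))
    print("strict: ", pick_strict(MANIFESTS, "arm", "v6"))
```

With an image that only ships arm/v7 and arm64 builds, the lenient matcher hands an ARMv6 host the v7 entry, while the strict one returns nothing, which would surface the missing arm/v6 build at pull time rather than at run time.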

The solution in your case is to make sure you have an arm/v6 image for your Docker Hub images. We are also working to improve multi-arch support in balena-engine so you’ll be able to create multi-arch images directly in our builders.

Please let us know if this solves the problem for you. Thanks again!

Thank you @pipex !

This is the same thing you are observing when running balena run --rm alpine date: due to the bug linked above, the engine pulls the v7 image, which causes the behavior you reported.

Indeed, I managed to get alpine working by forcing docker to fetch it by its digest:

root@0ce105a:~# balena run --rm alpine@sha256:18c29393a090ba5cde8a5f00926e9e419f47cfcfd206cc3f7f590e91b19adfe9 date
Thu Sep  2 21:49:19 UTC 2021
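The digest workaround above can be generalised: a manifest list names one digest per platform, so the arm/v6 entry can be looked up and pinned explicitly. The sketch below assumes JSON shaped like the output of docker manifest inspect (a top-level manifests array whose entries carry a digest and a platform object); the sample document, its digests and the helper names find_digest and pinned_reference are made up for illustration.

```python
import json
from typing import Optional

# Trimmed, hypothetical sample in the shape of a manifest list.
# The digests are placeholders, not real image digests.
SAMPLE = """
{
  "manifests": [
    {"digest": "sha256:placeholder-amd64",
     "platform": {"architecture": "amd64", "os": "linux"}},
    {"digest": "sha256:placeholder-armv6",
     "platform": {"architecture": "arm", "os": "linux", "variant": "v6"}},
    {"digest": "sha256:placeholder-armv7",
     "platform": {"architecture": "arm", "os": "linux", "variant": "v7"}}
  ]
}
"""


def find_digest(manifest_list_json: str, arch: str,
                variant: Optional[str] = None) -> Optional[str]:
    """Return the digest of the linux entry matching arch and variant exactly."""
    document = json.loads(manifest_list_json)
    for entry in document.get("manifests", []):
        platform = entry.get("platform", {})
        if (platform.get("os") == "linux"
                and platform.get("architecture") == arch
                and platform.get("variant") == variant):
            return entry["digest"]
    return None  # no build published for this exact platform


def pinned_reference(image: str, digest: str) -> str:
    """Build an image@digest reference that bypasses platform selection."""
    return f"{image}@{digest}"


if __name__ == "__main__":
    digest = find_digest(SAMPLE, "arm", "v6")
    if digest is None:
        print("no arm/v6 build in this manifest list")
    else:
        print(pinned_reference("alpine", digest))
```

A result of None for arm/v6 means the image has no ARMv6 build at all, in which case pinning a digest cannot help and an arm/v6 image has to be built first.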

The solution in your case is to make sure you have an arm/v6 image for your Docker Hub images. We are also working to improve multi-arch support in balena-engine so you’ll be able to create multi-arch images directly in our builders.

Ah! I just ran some tests, but it’s not easy: the docker rust image does not exist for arm/v6 either, and cross-compilation is not so easy :sweat_smile:

This is going to take some time!

Let us know how it goes @mathroc ! :slightly_smiling_face:

Thanks for the reminder. I took another look at it and finally managed to compile it for aarch64 and armv6 automatically.

That (and, I think, the fix from balena-os/balena-engine pull request #270, “Backport platform-detection fixes from containerd” by robertgzr) made it work :slight_smile: (At least the containers are starting; I’m not sure the Pi Zero is powerful enough to run FluidSynth on top of docker. I have some more investigating to do.)

Great news! Thanks for letting us know, hopefully your pi zero is up to the task :slight_smile: