Balena thinks service is downloaded but it will not start

I’ve seen this on several devices recently. We push a release, and the balenaCloud dashboard indicates “downloaded” for all the services, but some of them (always the same group) will not start.

According to the supervisor log, when I attempt to start the service the API call returns a 404. Sure enough, in /mnt/data/docker/containers there are no directories corresponding to the containers that won’t start.
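For anyone hitting something similar, here is a sketch of the checks I ran on the device. This assumes a recent balenaOS where the supervisor’s systemd unit is named `balena-supervisor` (older releases used `resin-supervisor`) and the engine binary is `balena-engine`; adjust the names for your version.

```shell
# Tail the supervisor's journal to see errors around the failed start
journalctl -u balena-supervisor --no-pager -n 200

# List the containers the engine actually knows about
balena-engine ps -a

# Compare against the on-disk container state directories
ls /mnt/data/docker/containers
```

If a service the dashboard shows as “downloaded” has no matching entry in either `balena-engine ps -a` or the containers directory, the supervisor’s view of the device has drifted from the engine’s.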

Rebooting the device often clears the problem, but I recently had a device where I had to push another release and then reboot before the containers directory was populated.

I guess my question is, can anyone comment on how my devices are getting into a state where Balena thinks the services are downloaded but they don’t appear to be?

The group of services that have the problem are all quite large, but the devices are on a reliable Ethernet connection, there are no obvious (at least to me) networking errors, and there are no space issues on the /mnt/data partition.

These devices are Generic_x86_64 or Intel_NUC, running balenaOS 2.68.1 with Supervisor 12.3.0 or 12.3.5, though there is also one device on 2.4.6 / 10.6.27 (we have some of those in the field).


Hey, I understand the confusion with this. The Supervisor is a complex application that generally works well, but there are some pain points we need to address to avoid this class of issue. Your specific state was most likely caused by the on-device database file getting out of sync with what is actually running in the engine. We plan on improving this by removing that database file, which should help significantly, but there’s no clear deadline for that, as we try to focus on patching such issues and adding feature requests from users.

In the more immediate future, I am working on a PR that will add more debug logging, so that when we look at the Supervisor’s journal logs on the device we can see why it isn’t starting containers.

Whenever you encounter such an instance, don’t hesitate to reach out; I look at these personally and can usually resolve them pretty fast. These situations are rare, but because the OS and Supervisor can be updated independently, we have seen issues tend to appear after big jumps in those versions. For example, see Properly handle legacy volumes · Issue #1604 · balena-io/balena-supervisor · GitHub.

Hi Miguel, thanks for reaching out. I will have new releases for my test fleet early next week, and I’ll leave one of the devices in this state (assuming it really does take a reboot to resolve, and not just magic) so that hopefully you can have a look at it while it’s failing to start the containers.

The issue you referenced definitely sounds like a possibility, and we did make a big supervisor jump, so hopefully we are onto something here.

Awesome. I would really appreciate some time to debug a device if you encounter the issue again.

Just a heads up: I reached out directly with the UUID and support access for a device currently in that state.


This was solved with great assistance from balena. In effect, we had a dependency chain of starting containers that was broken because one of them exited too quickly. It wasn’t obvious to us that this was happening, so thanks for the help!
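For reference, a minimal hypothetical compose sketch of the failure mode we hit. The service names here are made up, not our actual stack: a dependent service waits on a one-shot service via `depends_on`, and if the one-shot exits immediately the chain never brings the dependent up.

```yaml
version: '2.1'
services:
  init-task:
    build: ./init
    # Hypothetical one-shot service: it exits as soon as its work
    # is done, which can break the start-up chain for dependents.
  app:
    build: ./app
    depends_on:
      - init-task   # never started if init-task exits too quickly
```

Nothing in the dashboard pointed at this directly; it only became visible once we looked at which container in the chain was exiting first.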

I’m glad you’re sorted Dave! Thanks for letting us know.