Application Works on RaspberryPi3, but not Intel NUC, Endless Container Restart Loop

support
pendinguserresponse
#1

Hi, I am trying to set up 2 applications that work from the same git repository. I am using Balena’s Dockerfile.template system and my builds are working just fine. The application uses 3 containers, with 1 coming up after the other 2. This works flawlessly on the Raspberry Pi version (balenaOS 2.31.5+rev1), but on the NUC (balenaOS 2.32.0+rev2) I am having issues.
The 2 initial containers keep restarting endlessly. The logs show a loop of installing->starting->started->killing…
The containers are set to restart: always
If I ssh into the HostOS, I am able to do a balena run <image_id> and the container seems to run just fine. I am only exposing a single port on each of the containers, so I do not believe that the issue is https://github.com/balena-io/balena-supervisor/issues/824
Any thoughts would be greatly appreciated. Thanks

#3

I believe that this does have something to do with ports in the docker compose file.
I set up a separate application that just has one of the containers.
This is the compose file

version: '2'
services:
  redis:
    build: ./redis
    restart: always
    ports:
      - "6379:6379"

It ends up in the same loop.
With the ports section commented out, it runs just fine.
Additionally, if I just run the container with balena run -p 6379:6379 <image_id>, it works fine as well.
Is there something I am doing wrong here? Or could there be a bug?

#4

I’ve got the same problem. It seems to be a potential issue with balenaOS 2.32.x. The workaround seems to be to downgrade to 2.31.x, unless you can live with just using expose in docker-compose.yml rather than ports.

See Container stuck continuously restarting
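
A minimal sketch of the expose variant, adapted from the single-service compose file posted earlier in this thread. It has not been tested here, and note that expose only makes the port reachable from other containers on the same network; it does not publish the port on the host.

```yaml
version: '2'
services:
  redis:
    build: ./redis
    restart: always
    # expose instead of ports: reachable by sibling containers only,
    # not published on the device's host interface
    expose:
      - "6379"
```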

#5

This is actually exactly what I did, downgraded to balenaOS 2.31.2+rev1 and it’s working just fine now. Thanks for confirming it’s not just me :slight_smile:

#6

No worries, I spent forever thinking it was me as well… :slight_smile:

#7

@nleonardi, @kelfish, thank you for reporting this issue. Given the behaviour described regarding the ports instruction and the balenaOS 2.32.0+rev2 version, I believe this is an instance of a known supervisor issue.

Quoting the issue:

When an EXPOSE instruction is used in a Dockerfile, and an overlapping expose or ports instruction is also used in the docker-compose.yml file, the supervisor will restart the app container in a loop.
The temporary workaround is to remove the overlapping setting from either the Dockerfile or the docker-compose.yml file.
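
Applied to the redis example from earlier in the thread, one way to follow that workaround might look like the sketch below. It assumes the Dockerfile under ./redis contains an EXPOSE 6379 line, which is a guess since that Dockerfile was not posted.

```yaml
version: '2'
services:
  redis:
    # Assumption: the Dockerfile in ./redis had "EXPOSE 6379"; that line is
    # removed there so the port is declared in only one place (here).
    build: ./redis
    restart: always
    ports:
      - "6379:6379"
```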

A supervisor code fix has already been merged and is in the process of integration with a new balenaOS release. The whole process (with OS testing for all the different device types) usually takes a couple of weeks.

#9

I’m unfortunately affected by this issue as well. I’m also using openbalena, so I’m finding that I frequently need updated OS builds that include the latest and greatest fixes that stay in sync with server side changes there.

Sadly, the very long month+ validation lag time on OS builds is making it really, really hard to stay up to date and tease out issues. Building my own would theoretically be an option, but it’s a significant time investment…

I totally get the desire to only offer certified builds to your customers that have been through the testing process, but for those of us working on the open source side of things - are uncertified, preview builds of the OS available somewhere? Even if it was only for the common raspberrypi3 platform, and even if it was a development only build, it would be a huge boost to be able to validate issues are actually fixed in advance of the final build being released…

#11

The release schedule depends on a lot of things, as there are many different hardware types to support. When we test a version on one device type and find an issue that would affect all the other device types as well, there’s not much point in testing the now-known-bad release across all of them; it makes more sense to fix the issue and release a version that works. Releasing a version with known issues would be a support nightmare (in the “open” section as well as the Cloud section).

Having said that, we are working on making actual “nightly” builds available, though to the best of my knowledge this depends on some internal changes that are on the way.

Also, balenaOS is completely open source, so nothing is stopping anyone from building a release themselves from the available code. For example, here are the relevant Raspberry Pi 3 and Intel NUC repositories (the repository for any other device type can be found on the balenaOS releases page linked above, in the repo version column). Note, though, that the current repo versions are on the known-bad 2.33.0 version, which is not released; I would wait until 2.34.x is incorporated into the different device type repositories.

The custom builds are, for example, described on the balenaOS site: https://www.balena.io/os/docs/custom-build/ If you find anything amiss there, don’t hesitate to let us know!

#13

Thanks - nightlies for just one ARM (pi 3) and one X86 (NUC) would be amazing and are exactly what I’m looking for. I totally get the need to validate OS releases really carefully before deploying them broadly, but nightly builds would actually be a really useful tool to let the community validate there aren’t any lurking blocker issues before the release occurs.

(I think much of the concern could be mitigated with a scary warning page, and possibly also by only releasing development flavors.)

If I have to set up my own continuous integration system to produce builds, it’s not the end of the world, but it seemed like duplicate effort for something the Balena team probably is already doing internally. It’s great to hear that it’s something that’s coming and is just blocked on infrastructure investments!

#14

Thanks a lot, I’ve added your feedback to our internal issue tracker, and will keep you posted! :slight_smile: