Container reboot and "Failed to attach 1 to compat systemd cgroup"

The error “Failed to attach 1 to compat systemd cgroup” is reported when rebooting a container with the console command ‘reboot’. The error output looks like this:

27.08.19 12:37:52 (+0200) main Failed to attach 1 to compat systemd cgroup /docker/2b4d221cf062a04b2d2bb85dd636c7a337f8e5c748d8b91f61da1a6835b92f28/init.scope: No such file or directory
27.08.19 12:37:52 (+0200) main Failed to open pin file: No such file or directory
27.08.19 12:37:52 (+0200) main Failed to allocate manager object: No such file or directory
27.08.19 12:37:52 (+0200) main [!!!] Failed to allocate manager object, freezing.

If the device is rebooted using the ‘Restart’ button in the balena dashboard, the container starts correctly and everything works as expected. Running the console ‘reboot’ command freezes the container again.

One major change has been the migration from resin/intel-nuc-debian:stretch to resin/intel-nuc-ubuntu:bionic. I am unsure whether this is related.

Any guidance on how to solve this?

Device is running: balenaOS 2.39.0+rev3, supervisor 10.0.3

@aliasbits there isn’t really much to go on here. Restarting the container doesn’t reboot the device, so it could be that whatever your container is doing on startup is preparing something, which then is accessible after a container restart.

Could you explain a bit more about what your container is doing/trying to do, and what your motivations to change from a debian base, to an ubuntu base were?

The change from debian to ubuntu was made to align with internal development processes and to allow for the latest upstream versions of some packages.

The application is working as expected on both. Some of the system changes in the entrypoint are:

  • Set up config files
  • Set up environment vars
  • Set up network
  • Set up preconditions before serviced service startup
  • Start various systemd services
  • Set up loopback filesystems on /dev/loop0 (this has been disabled, which does not make any difference)

Many of the changes are performed in a persistent balena volume.

I added an "echo ‘entrypoint’" at the top of the entrypoint.sh script. The ‘entrypoint’ printout is shown as expected on normal startup, but is missing after the console ‘reboot’. Hence, the entrypoint script is not executed after reboot. Please note that the image does not make any docker/balena changes to the system. Everything is set up by the docker compose file and the entrypoint script.
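For anyone wanting to repeat this check, a marker along the following lines could go at the very top of the entrypoint script. This is only a sketch: mark_entrypoint is a made-up helper name, and printing the PID is an addition of mine to make it easier to see whether the script is actually running as PID 1.

```shell
#!/bin/sh
# Hypothetical marker helper for the top of an entrypoint script.
mark_entrypoint() {
    # Write to stderr so the marker still shows up if stdout is redirected later.
    # $$ is the PID of the shell running the script (1 when it is the container entrypoint).
    echo "entrypoint: started (pid $$)" >&2
}
```

Calling mark_entrypoint as the first command in entrypoint.sh should then produce one line in the logs per container start. If the line is missing after ‘reboot’, the script never ran.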

Could you share the entrypoint script at all?

I cannot share the entrypoint script

The ‘Reboot’ button in the dashboard retains the container as is and starts it up as expected.

Without some indication of what that entrypoint script is doing, it’s very hard to say what the issue is. A container restart and a reboot are equivalent from the container’s perspective; the only thing which could be different is the host system.

Hi there,

As @richbayliss mentioned, without some insight into your entry script it’ll be difficult to assist. If you are so inclined, you can try updating to the balenalib base image set since systemd in the container has been known to cause issues. Do note the breaking changes listed here: https://www.balena.io/docs/reference/base-images/base-images/#major-changes.

If you do try this, please let us know how the test goes!

I also have this problem - started with balenaOS 2.41.0r3.

Hi @xginn8

Is there any status on when the resin/ base images will be deprecated? The balenalib breaking changes need to be planned for in our case.

Hi,

Technically the resin/ base images are deprecated. We don’t update the resin/ base images any more. We’d recommend moving to balenalib ones moving forward.

Another key thing is that any further OS releases, testing, fixes are done for balenalib images only.

We still need to reproduce and investigate where the following error is coming from: “Failed to attach 1 to compat systemd cgroup”

Regards
ZubairLK

I’ve made https://github.com/balena-os/meta-balena/issues/1645 to track/investigate this

Please comment on the issue with as much detail as you can, especially if you guys can reproduce it using a simpler test case.

Hi @aliasbits,

I’ve been trying to reproduce the issue but have failed to do so.

Would it be possible to share your application in a zip file via a direct message? Trim out the bits that are unnecessary and leave just enough to reproduce this.

I’m quite stuck at this point I’m afraid…

It looks like the issue was reproduced by the balena team, according to comments at:

The GitHub issue can be subscribed to for updates, and additional comments are welcome. Thank you for reporting it!

Hi @aliasbits, do you have a development device where you can run a bit of a test for me?

  • ssh into the host OS
  • vi /etc/docker/daemon.json and add
{
  "exec-opts": ["native.cgroupdriver=systemd"]
}
  • restart balena-engine, i.e. systemctl restart balena
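
If you would rather script the steps above than edit the file by hand, something like the following might do. It is only a sketch: write_cgroup_driver_config is a made-up helper, and it overwrites the target file, so merge by hand if your daemon.json already contains other settings. Passing a scratch path as the first argument lets you check the output before touching the real file.

```shell
#!/bin/sh
# Hypothetical helper: writes the exec-opts setting from the steps above.
# Usage: write_cgroup_driver_config [path]   (defaults to /etc/docker/daemon.json)
write_cgroup_driver_config() {
    target="${1:-/etc/docker/daemon.json}"
    # Keep a copy of any existing file, since the next step overwrites it.
    if [ -f "$target" ]; then
        cp "$target" "$target.bak"
    fi
    cat > "$target" <<'EOF'
{
  "exec-opts": ["native.cgroupdriver=systemd"]
}
EOF
}

# On the host OS, as root, the remaining steps would then be:
#   write_cgroup_driver_config
#   systemctl restart balena
```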

Do you still see your app container hit the “Failed to attach 1 to compat systemd cgroup” error?

There is some more information here https://github.com/balena-os/meta-balena/issues/1645#issuecomment-528338112

Thanks

The above change on the host fixes reboot in the container when using the console command ‘reboot’.

What is the deployment/release flow for adding this to resin/balenalib base images?

https://github.com/balena-os/meta-balena/pull/1658 will be merged in the next meta-balena release and then each device goes through its own update/testing cycle.

We also have a suggested workaround to try on existing devices.

In your application, you’ll be starting systemd with some arguments (in entry.sh, if it is based on our examples).

You’ll need to add the following argument to the command that starts systemd: systemd.legacy_systemd_cgroup_controller

For example, the line at https://github.com/balena-io-playground/balenalib-systemd-example/blob/6623fc63892e92386a4885d93ef94fb14b5fb9e4/app/systemd/entry.sh#L88 needs systemd.legacy_systemd_cgroup_controller appended to it.
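
As a rough illustration of the workaround, here is a minimal sketch of how the end of an entrypoint script might pass the flag. The names INIT_BIN, DRY_RUN and start_systemd are made up for this example, /sbin/init is an assumed path, and the actual line in the linked entry.sh will look different. The only part taken from the advice above is that systemd.legacy_systemd_cgroup_controller goes at the end of the systemd arguments.

```shell
#!/bin/sh
# Sketch of the tail end of an entrypoint script (hypothetical names throughout).
INIT_BIN="${INIT_BIN:-/sbin/init}"   # assumed location of systemd's init binary

start_systemd() {
    # Keep whatever arguments the script already passes and append the workaround flag last.
    set -- "$@" systemd.legacy_systemd_cgroup_controller
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Print the command instead of running it, to check the final argument list.
        echo "$INIT_BIN $*"
        return 0
    fi
    # Replace the shell with systemd so that it runs as PID 1.
    exec "$INIT_BIN" "$@"
}

# Last line of a real entrypoint would be something like:
#   start_systemd quiet
```

Setting DRY_RUN=1 before calling start_systemd prints the command line without executing it, which is a cheap way to confirm the flag ends up where expected.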