Container fails to start - "failed to attach 1 to compat systemd cgroup"

I do wonder what’s going on with my devices that causes the service to restart anyway? I’m often seeing that the container restarts itself (which causes me to have to log in again on my wallboards). Maybe there’s still another problem to solve beyond the service restarting causing a crash-loop.

Indeed. Nothing should just restart balena-engine, so maybe we are looking at two issues: one that crashes balena-engine, and this systemd restart loop.

I have a suggestion to try.
Can you change the first line of your Dockerfile from

FROM balenalib/%%BALENA_MACHINE_NAME%%

to

FROM balenalib/%%BALENA_MACHINE_NAME%%:stretch

Your Dockerfile currently pulls the latest Debian release, buster, which ships systemd 241. I am unable to reproduce the “Failed to attach 1 to compat systemd cgroup” issue on stretch, which uses an older systemd.

Whether the issue comes from a mix of the host OS systemd, the kernel, and the container OS systemd, I don’t know yet; this still needs looking into and fixing. In the meantime, my suggestion should let you proceed with your work.
Perhaps it will also let us find the other issue, i.e. why balenaEngine restarts (which it shouldn’t do for no reason).
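
To confirm which systemd version a given base image actually ships, something like the sketch below could be run inside the container. It assumes `systemctl --version` prints a first line of the form `systemd 241 (241)`; the function names are made up for illustration and are not part of any balena tooling.

```python
import re
import subprocess


def parse_systemd_version(output):
    """Return the systemd version number from `systemctl --version` output, or None."""
    match = re.match(r"\s*systemd\s+(\d+)", output)
    return int(match.group(1)) if match else None


def installed_systemd_version():
    """Ask systemctl for its version; return None if systemctl is not available."""
    try:
        result = subprocess.run(
            ["systemctl", "--version"],
            capture_output=True,
            text=True,
            check=False,
        )
    except FileNotFoundError:
        # No systemctl on PATH, e.g. an image without systemd.
        return None
    return parse_systemd_version(result.stdout)
```

Calling `installed_systemd_version()` in a buster-based and a stretch-based container should make the version difference visible before comparing behaviour.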

Thanks - I’m giving this a try - will let you know tomorrow if there were any restarts during the night.

Not really any better I’m afraid. Around two hours in the web browser locked up. I tried restarting the container and it’s just stuck somehow - VNC connects but gives me a black screen. I had to reboot the Pi to regain control.

Thanks for the update. I have passed on your findings to the engineer who is looking at this, as he is offline right now.

The logs seem to point to something else failing… sigh
Would it be possible for you to share your application in a zip with me via a direct message? I think it exercises quite a few things, especially on 2.41 (which has a newer Pi kernel, Pi firmware, new systemd etc.)
Changing those shouldn’t ‘technically’ make a difference… but there are always subtleties…

04.09.19 20:31:02 (+0100)  wallboard  (xfce4-session:129): xfce4-session-WARNING **: failed to run script: Failed to execute child process "/usr/bin/pm-is-supported" (No such file or directory)
04.09.19 20:31:02 (+0100)  wallboard  
04.09.19 20:31:02 (+0100)  wallboard  (xfce4-session:129): xfce4-session-WARNING **: failed to run script: Failed to execute child process "/usr/bin/pm-is-supported" (No such file or directory)
04.09.19 20:31:02 (+0100)  wallboard  
04.09.19 20:31:02 (+0100)  wallboard  (xfce4-panel:147): Wnck-CRITICAL **: wnck_workspace_is_virtual: assertion 'WNCK_IS_WORKSPACE (space)' failed
04.09.19 20:31:04 (+0100)  wallboard  libGL error: MESA-LOADER: failed to retrieve device information
04.09.19 20:31:05 (+0100)  wallboard  MESA-LOADER: failed to retrieve device information
04.09.19 20:31:05 (+0100)  wallboard  MESA-LOADER: failed to retrieve device information
04.09.19 20:31:06 (+0100)  wallboard  Fontconfig warning: "/etc/fonts/fonts.conf", line 100: unknown element "blank"
04.09.19 20:31:07 (+0100)  wallboard  libGL error: MESA-LOADER: failed to retrieve device information
04.09.19 20:31:07 (+0100)  wallboard  [153:214:0904/193107.446893:ERROR:object_proxy.cc(621)] Failed to call method: org.freedesktop.Notifications.GetCapabilities: object_path= /org/freedesktop/Notifications: org.freedesktop.DBus.Error.ServiceUnknown: The name org.freedesktop.Notifications was not provided by any .service files
04.09.19 20:31:07 (+0100)  wallboard  MESA-LOADER: failed to retrieve device information
04.09.19 20:31:07 (+0100)  wallboard  MESA-LOADER: failed to retrieve device information
04.09.19 20:31:07 (+0100)  wallboard  ATTENTION: default value of option force_s3tc_enable overridden by environment.
04.09.19 20:31:08 (+0100)  wallboard  [153:195:0904/193108.110697:ERROR:top_sites_backend.cc(92)] Failed to initialize database.
04.09.19 20:31:08 (+0100)  wallboard  Draw call returned Invalid argument.  Expect corruption.
04.09.19 20:31:19 (+0100)  wallboard  Wed Sep  4 19:31:19 UTC 2019
04.09.19 20:31:24 (+0100)  wallboard  Updating DNS... Traceback (most recent call last):
04.09.19 20:31:24 (+0100)  wallboard    File "/usr/local/lib/python3.5/dist-packages/urllib3/connection.py", line 160, in _new_conn
04.09.19 20:31:24 (+0100)  wallboard      (self._dns_host, self.port), self.timeout, **extra_kw)
04.09.19 20:31:24 (+0100)  wallboard    File "/usr/local/lib/python3.5/dist-packages/urllib3/util/connection.py", line 80, in create_connection
04.09.19 20:31:24 (+0100)  wallboard      raise err
04.09.19 20:31:24 (+0100)  wallboard    File "/usr/local/lib/python3.5/dist-packages/urllib3/util/connection.py", line 70, in create_connection
04.09.19 20:31:24 (+0100)  wallboard      sock.connect(sa)
04.09.19 20:31:24 (+0100)  wallboard  ConnectionRefusedError: [Errno 111] Connection refused
04.09.19 20:31:24 (+0100)  wallboard  
04.09.19 20:31:24 (+0100)  wallboard  During handling of the above exception, another exception occurred:
04.09.19 20:31:24 (+0100)  wallboard  
04.09.19 20:31:24 (+0100)  wallboard  Traceback (most recent call last):
04.09.19 20:31:24 (+0100)  wallboard    File "/usr/local/lib/python3.5/dist-packages/urllib3/connectionpool.py", line 603, in urlopen
04.09.19 20:31:24 (+0100)  wallboard      chunked=chunked)
04.09.19 20:31:24 (+0100)  wallboard    File "/usr/local/lib/python3.5/dist-packages/urllib3/connectionpool.py", line 355, in _make_request
04.09.19 20:31:24 (+0100)  wallboard      conn.request(method, url, **httplib_request_kw)
04.09.19 20:31:24 (+0100)  wallboard    File "/usr/lib/python3.5/http/client.py", line 1107, in request
04.09.19 20:31:24 (+0100)  wallboard      self._send_request(method, url, body, headers)
04.09.19 20:31:24 (+0100)  wallboard    File "/usr/lib/python3.5/http/client.py", line 1152, in _send_request
04.09.19 20:31:24 (+0100)  wallboard      self.endheaders(body)
04.09.19 20:31:24 (+0100)  wallboard    File "/usr/lib/python3.5/http/client.py", line 1103, in endheaders
04.09.19 20:31:24 (+0100)  wallboard      self._send_output(message_body)
04.09.19 20:31:24 (+0100)  wallboard    File "/usr/lib/python3.5/http/client.py", line 934, in _send_output
04.09.19 20:31:24 (+0100)  wallboard      self.send(msg)
04.09.19 20:31:24 (+0100)  wallboard    File "/usr/lib/python3.5/http/client.py", line 877, in send
04.09.19 20:31:24 (+0100)  wallboard      self.connect()
04.09.19 20:31:24 (+0100)  wallboard    File "/usr/local/lib/python3.5/dist-packages/urllib3/connection.py", line 183, in connect
04.09.19 20:31:24 (+0100)  wallboard      conn = self._new_conn()
04.09.19 20:31:24 (+0100)  wallboard    File "/usr/local/lib/python3.5/dist-packages/urllib3/connection.py", line 169, in _new_conn
04.09.19 20:31:24 (+0100)  wallboard      self, "Failed to establish a new connection: %s" % e)
04.09.19 20:31:24 (+0100)  wallboard  urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x76006170>: Failed to establish a new connection: [Errno 111] Connection refused
04.09.19 20:31:24 (+0100)  wallboard  
04.09.19 20:31:24 (+0100)  wallboard  During handling of the above exception, another exception occurred:
04.09.19 20:31:24 (+0100)  wallboard  
04.09.19 20:31:24 (+0100)  wallboard  Traceback (most recent call last):
04.09.19 20:31:24 (+0100)  wallboard    File "/usr/local/lib/python3.5/dist-packages/requests/adapters.py", line 449, in send
04.09.19 20:31:24 (+0100)  wallboard      timeout=timeout
04.09.19 20:31:24 (+0100)  wallboard    File "/usr/local/lib/python3.5/dist-packages/urllib3/connectionpool.py", line 641, in urlopen
04.09.19 20:31:24 (+0100)  wallboard      _stacktrace=sys.exc_info()[2])
04.09.19 20:31:24 (+0100)  wallboard    File "/usr/local/lib/python3.5/dist-packages/urllib3/util/retry.py", line 399, in increment
04.09.19 20:31:24 (+0100)  wallboard      raise MaxRetryError(_pool, url, error or ResponseError(cause))
04.09.19 20:31:24 (+0100)  wallboard  urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='10.114.104.1', port=48484): Max retries exceeded with url: /v1/device?apikey=80d7e0a9ced6e88bf0260d53468fa6c5bb5b3c612276faa5699ee89a085acd (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x76006170>: Failed to establish a new connection: [Errno 111] Connection refused',))
04.09.19 20:31:24 (+0100)  wallboard  
04.09.19 20:31:24 (+0100)  wallboard  During handling of the above exception, another exception occurred:
04.09.19 20:31:24 (+0100)  wallboard  
04.09.19 20:31:24 (+0100)  wallboard  Traceback (most recent call last):
04.09.19 20:31:24 (+0100)  wallboard    File "/usr/bin/dns.py", line 47, in <module>
04.09.19 20:31:24 (+0100)  wallboard      sys.exit(main())
04.09.19 20:31:24 (+0100)  wallboard    File "/usr/bin/dns.py", line 22, in main
04.09.19 20:31:24 (+0100)  wallboard      (ip, hostname) = getSystemInfo()
04.09.19 20:31:24 (+0100)  wallboard    File "/usr/bin/dns.py", line 13, in getSystemInfo
04.09.19 20:31:24 (+0100)  wallboard      ipInfo = requests.get('%s/v1/device?apikey=%s' % (os.environ['RESIN_SUPERVISOR_ADDRESS'], os.environ['RESIN_SUPERVISOR_API_KEY']), headers={'content-type': 'application/json'}).json()
04.09.19 20:31:24 (+0100)  wallboard    File "/usr/local/lib/python3.5/dist-packages/requests/api.py", line 75, in get
04.09.19 20:31:24 (+0100)  wallboard      return request('get', url, params=params, **kwargs)
04.09.19 20:31:24 (+0100)  wallboard    File "/usr/local/lib/python3.5/dist-packages/requests/api.py", line 60, in request
04.09.19 20:31:24 (+0100)  wallboard      return session.request(method=method, url=url, **kwargs)
04.09.19 20:31:24 (+0100)  wallboard    File "/usr/local/lib/python3.5/dist-packages/requests/sessions.py", line 533, in request
04.09.19 20:31:24 (+0100)  wallboard      resp = self.send(prep, **send_kwargs)
04.09.19 20:31:24 (+0100)  wallboard    File "/usr/local/lib/python3.5/dist-packages/requests/sessions.py", line 646, in send
04.09.19 20:31:24 (+0100)  wallboard      r = adapter.send(request, **kwargs)
04.09.19 20:31:24 (+0100)  wallboard    File "/usr/local/lib/python3.5/dist-packages/requests/adapters.py", line 516, in send
04.09.19 20:31:24 (+0100)  wallboard      raise ConnectionError(e, request=request)
04.09.19 20:31:24 (+0100)  wallboard  requests.exceptions.ConnectionError: HTTPConnectionPool(host='10.114.104.1', port=48484): Max retries exceeded with url: /v1/device?apikey=80d7e0a9ced6e88bf0260d53468fa6c5bb5b3c612276faa5699ee89a085acd (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x76006170>: Failed to establish a new connection: [Errno 111] Connection refused',))
04.09.19 20:31:24 (+0100)  wallboard   done.
04.09.19 20:32:25 (+0100)  wallboard  [ TIME ] Timed out waiting for device /dev/mmcblk0p6.

That last line is quite suspicious. But I think there are other things happening here too.
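
The `dns.py` traceback in the log above shows the supervisor refusing the connection, and the script then dying with an unhandled exception. A minimal sketch of a more tolerant lookup is below. It uses only the standard library instead of `requests`, and the function names, retry count, and delay are assumptions chosen for illustration; only the `/v1/device` path and the `apikey` query parameter come from the traceback.

```python
import json
import time
import urllib.request


def fetch_json(url, timeout=5):
    """Fetch a URL and decode the JSON body."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def get_device_info(address, api_key, attempts=5, delay=2.0,
                    fetch=fetch_json, sleep=time.sleep):
    """Query /v1/device, retrying while the supervisor is unreachable."""
    url = "%s/v1/device?apikey=%s" % (address, api_key)
    last_error = None
    for attempt in range(attempts):
        try:
            return fetch(url)
        except OSError as err:
            # urllib's URLError and ConnectionRefusedError are both OSError subclasses.
            last_error = err
            if attempt < attempts - 1:
                sleep(delay)
    raise RuntimeError(
        "supervisor not reachable after %d attempts" % attempts
    ) from last_error
```

In the script from the log this would be called with `os.environ['RESIN_SUPERVISOR_ADDRESS']` and `os.environ['RESIN_SUPERVISOR_API_KEY']`. It does not explain why the supervisor was down, it only keeps the container's startup script from crashing while it is.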

I managed to make some progress on the original “Failed to attach 1” issue where we started.
I’ve added details on GitHub: https://github.com/balena-os/meta-balena/issues/1645#issuecomment-528338112

Ok. We have workaround number 1 for the Failed to attach 1 to compat systemd cgroup issue…

root@468439d:~# cat /etc/docker/daemon.json 
{ 
"exec-opts": ["native.cgroupdriver=systemd"] 
}
root@468439d:~#
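
If the file already holds other settings, the option from the workaround has to be merged in rather than overwriting the file. A minimal sketch of that merge is below; the helper name is hypothetical, and it assumes the file is writable and contains plain JSON, as in the listing above.

```python
import json
from pathlib import Path

CGROUP_OPT = "native.cgroupdriver=systemd"


def enable_systemd_cgroup_driver(path):
    """Add the systemd cgroup driver to exec-opts, keeping other settings."""
    config_file = Path(path)
    config = {}
    if config_file.exists() and config_file.read_text().strip():
        config = json.loads(config_file.read_text())
    exec_opts = config.setdefault("exec-opts", [])
    # Drop any other cgroupdriver choice so that only one remains.
    exec_opts[:] = [o for o in exec_opts
                    if not o.startswith("native.cgroupdriver=")]
    exec_opts.append(CGROUP_OPT)
    config_file.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

For example, `enable_systemd_cgroup_driver("/etc/docker/daemon.json")` would produce the content shown in the listing when the file starts out empty or missing.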

Now to try workaround 2

Can you please add this initcall_blacklist=bcm2708_fb_init to /mnt/boot/cmdline.txt ? Does that make xfce start working on your wallboard?
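
Since cmdline.txt is a single line of space-separated kernel parameters, the edit can be scripted so that the parameter is added only once and can be taken out again later. The sketch below assumes that single-line layout; the helper names are invented for illustration.

```python
from pathlib import Path

PARAM = "initcall_blacklist=bcm2708_fb_init"


def add_kernel_param(cmdline, param=PARAM):
    """Return the kernel command line with param appended exactly once."""
    parts = cmdline.split()
    if param not in parts:
        parts.append(param)
    return " ".join(parts)


def remove_kernel_param(cmdline, param=PARAM):
    """Return the kernel command line without param."""
    return " ".join(p for p in cmdline.split() if p != param)


def update_cmdline_file(path, transform):
    """Rewrite the single-line cmdline file in place using transform."""
    cmdline_file = Path(path)
    cmdline_file.write_text(transform(cmdline_file.read_text()) + "\n")
```

`update_cmdline_file("/mnt/boot/cmdline.txt", add_kernel_param)` applies the workaround, and passing `remove_kernel_param` instead reverts it. A reboot is needed for either change to take effect.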

I tried your app, and I can see Chrome open with the above workaround on 2.41. But I can’t run the wallboard fully, as I think you have some env vars which hold secret API keys.

p.s. We have taken down 2.41 from production due to these issues. We are tracing and fixing them. Mix of kernel/systemd/firmware issues coming together.

And I have also started an internal thread to improve testing in this area. Our testing uses https://github.com/balena-io-playground/x11-window-manager but that doesn’t use systemd… So I’m trying to see if we can add x11+systemd into our testing flow as well.

Thanks - do appreciate the help!

All I really want to be able to do is to have a web browser that can display two or three tabs and cycle between them. And I need VNC because I’ll need to provide login credentials as things start up. If I don’t need systemd then I’m happy to do without it…

I’ll add that text to the cmdline.txt file now.

Well after implementing that adjustment, the Pi is seriously quicker. It’s up and running still, hasn’t crashed yet (time will tell) but I’m impressed at how much quicker it is!

Thanks for the feedback, glad to know it works! Will let our team know, and see how we can improve the OS from these things we’ve learned. In the meantime, let us know if you have any other questions!

Will do - thanks for working with me on this one and looking forward to a bOS 2.42 soon!

So far the container has been up for 12 hours… however there is a problem: at some point chromium locked up. VNC was still working but then that crashed too.

06.09.19 08:29:18 (+0200) wallboard (WW) glamor: Failed to allocate 1920x1080 FBO due to GL_OUT_OF_MEMORY.
06.09.19 08:29:18 (+0200) wallboard (WW) glamor: Expect reduced performance.
06.09.19 08:29:18 (+0200) wallboard Xorg: …/…/…/…/glamor/glamor_fbo.c:57: glamor_pixmap_ensure_fb: Assertion `fbo->tex != 0’ failed.
06.09.19 08:29:18 (+0200) wallboard [158:158:0906/062918.402956:ERROR:chrome_browser_main_extra_parts_x11.cc(62)] X IO error received (X server probably went away)
06.09.19 08:29:18 (+0200) wallboard xinit: connection to X server lost
06.09.19 08:29:18 (+0200) wallboard [158:158:0906/062918.484105:ERROR:zygote_communication_linux.cc(275)] Failed to send GetTerminationStatus message to zygote
06.09.19 08:29:18 (+0200) wallboard caught signal: 1
06.09.19 08:29:18 (+0200) wallboard xfsettingsd: Fatal IO error 11 (Resource temporarily unavailable) on X server :0.0.
06.09.19 08:29:18 (+0200) wallboard xfdesktop: Fatal IO error 11 (Resource temporarily unavailable) on X server :0.0.
06.09.19 08:29:18 (+0200) wallboard xfwm4: Fatal IO error 11 (Resource temporarily unavailable) on X server :0.0.
06.09.19 08:29:19 (+0200) wallboard [158:158:0906/062919.015525:ERROR:zygote_communication_linux.cc(275)] Failed to send GetTerminationStatus message to zygote
06.09.19 08:29:19 (+0200) wallboard [158:158:0906/062919.127411:ERROR:zygote_communication_linux.cc(275)] Failed to send GetTerminationStatus message to zygote
06.09.19 08:29:19 (+0200) wallboard 06/09/2019 06:29:18 deleted 60 tile_row polling images.
06.09.19 08:29:19 (+0200) wallboard extra[1] signal: -1
06.09.19 08:29:19 (+0200) wallboard XIO: fatal IO error 11 (Resource temporarily unavailable) on X server “:0.0”
06.09.19 08:29:19 (+0200) wallboard after 35683 requests (35683 known processed) with 0 events remaining.

Hmm. This one is suspicious

06.09.19 08:29:18 (+0200) wallboard (WW) glamor: Failed to allocate 1920x1080 FBO due to GL_OUT_OF_MEMORY.

How much gpu_mem have you reserved?

64. I’ll shove it up to 128 and see what happens. Or is there another value you would suggest instead?

I’d recommend 256 as you are using a full desktop environment. 64/128 definitely feel quite low.
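
To double-check what is actually reserved, the current value can be read back from the boot configuration. The sketch below assumes the setting ends up as a `gpu_mem=` line (or a per-board variant such as `gpu_mem_1024=`) in a Raspberry Pi style config.txt; the function name is hypothetical.

```python
import re


def read_gpu_mem(config_text):
    """Return the gpu_mem settings found in config.txt-style text as a dict."""
    settings = {}
    for raw in config_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # ignore comments
        match = re.fullmatch(r"(gpu_mem(?:_\d+)?)\s*=\s*(\d+)", line)
        if match:
            settings[match.group(1)] = int(match.group(2))
    return settings
```

Feeding it the contents of the device's config.txt should show whether the value is still 64 or has been raised to 128 or 256 after the change.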

Hi, please remember to revert the cmdline.txt change on this machine when you update the host OS, because we will be fixing the graphics issue in a different way and you should use that fix instead.

For the Failed to attach 1 to compat systemd cgroup issue, please see

Hi @ajs1k,

If you let us know exactly what device you are using (I didn’t see it mentioned above), we will let you know when that host OS fix is released to production for your device type!

Thanks - Pi3 is the device I’m using.