Hi,
I’m testing balenaOS 2.60.1+rev1. I currently have memory issues that result in my container not running, and I suspect this affects ModemManager too. I think the issue already existed in my project before 2.60 but has only now become visible to us. I’m stuck analysing what exactly the problem is; I think it may be a memory leak. The container cannot start because of the following error:
Service exited 'neonlink_build sha256:9fa86c43551f19cfb5a56b05f207ac722bb0d27f49e6d8fe9766f1435c4f0f19'
Restarting service 'neonlink_build sha256:9fa86c43551f19cfb5a56b05f207ac722bb0d27f49e6d8fe9766f1435c4f0f19'
neonlink_build ./runall-dist.sh: line 26: 31126 Segmentation fault (core dumped) ps "$PID"
neonlink_build 31127 (core dumped) | grep "$PID" > /dev/null
neonlink_build 22 stopped
The segmentation fault means the process accessed memory that it doesn’t have access to. This problem came up when I ran Diagnostics on the device. Before that the container ran fine, but memory usage was already high: more than 800 MB of 924 MB.
After stopping and restarting the container, I would expect memory usage to drop to roughly 400-500 MB, but it’s still 787 MB.
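To cross-check the usage figures above directly on the host, a minimal sketch like the following could be used. It assumes that "used" means MemTotal minus MemAvailable as reported in /proc/meminfo; whether the dashboard computes its figure the same way is an assumption, and the function name mem_used_mb is purely illustrative.

```shell
#!/bin/sh
# Print used and total memory in MB from a meminfo-style file.
# Assumption: "used" = MemTotal - MemAvailable; the dashboard may
# compute its number differently.
mem_used_mb() {
    # $1: path to a meminfo file (defaults to /proc/meminfo)
    awk '
        $1 == "MemTotal:"     { total = $2 }   # values are in kB
        $1 == "MemAvailable:" { avail = $2 }
        END { printf "%d of %d MB used\n", (total - avail) / 1024, total / 1024 }
    ' "${1:-/proc/meminfo}"
}

# Only query the live system when /proc/meminfo is readable.
if [ -r /proc/meminfo ]; then
    mem_used_mb
fi
```

Running this before and after restarting the container (and before and after Diagnostics) would show whether the memory is actually released.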
Running top on the host gives the following result:
1388 1 root S 976m 105% 0% /usr/bin/balenad --experimental --log-driver=journald -s aufs -H fd:// -H unix:///var/run/balena.sock -H unix:///var/run/balena-engine.sock --dns 10.114.102.1 --bip 10.114.101.1/24 --fixed-cidr=10.114.101.0/25 --max-download-attempts=10 --exec-opt native.cgroupdriver=systemd
1444 1388 root S 975m 105% 0% balena-engine-containerd --config /var/run/balena-engine/containerd/containerd.toml --log-level info
18067 1444 root S 910m 98% 0% balena-engine-containerd-shim -namespace moby -workdir /var/lib/docker/containerd/daemon/io.containerd.runtime.v1.linux/moby/92c1c6a696d33b895eff4607a8bfc7adce5e9ca8305febb177c3962c112dd422 -address /var/run/balena-engine/containerd/balena-engine-containerd.sock -containerd-binary /u
18027 17975 root S 902m 97% 0% balena run --privileged --name resin_supervisor --restart=always --net=host --cidenv=SUPERVISOR_CONTAINER_ID --mount type=bind,source=/var/run/balena-engine.sock,target=/var/run/balena-engine.sock --mount type=bind,source=/mnt/boot/config.json,target=/boot/config.json --mount type=bi
18086 18067 root S 140m 15% 0% node /usr/src/app/dist/app.js
1376 1 root S 67464 7% 0% /usr/sbin/NetworkManager --no-daemon
1278 1 root S 51516 5% 0% /usr/sbin/ModemManager --log-journal
1260 1 root S 39640 4% 0% /usr/sbin/rngd -f -r /dev/hwrng
872 1 root S 26204 3% 0% /lib/systemd/systemd-journald
1358 1 root S 25980 3% 0% /usr/libexec/qmi-proxy
1 0 root S 25488 3% 0% {systemd} /sbin/init
1270 1 root S 12092 1% 0% /usr/sbin/chronyd -d
1265 1 root S 8956 1% 1% @sbin/plymouthd --tty=tty1 --mode=boot --pid-file=/run/plymouth/pid --attach-to-session --kernel-command-line=plymouth.ignore-serial-consoles splash
1439 1 root S 8492 1% 0% /usr/sbin/wpa_supplicant -u
12587 1 openvpn S 5432 1% 0% /usr/sbin/openvpn --writepid /run/openvpn/openvpn.pid --cd /etc/openvpn/ --config /etc/openvpn/openvpn.conf --connect-retry 5 120
1313 1 root S 5120 1% 0% /lib/systemd/systemd-logind
1432 1 root S 5056 1% 0% /usr/libexec/bluetooth/bluetoothd --experimental
28884 1 root S 4536 0% 0% sshd: root@notty
28722 1 root S 4420 0% 0% sshd: root@pts/1
904 1 root S 4316 0% 0% /lib/systemd/systemd-udevd
1366 1 avahi S 4076 0% 0% avahi-daemon: running [fea1261.local]
1367 1366 avahi S 3688 0% 0% avahi-daemon: chroot helper
1305 1 messageb S 3656 0% 0% /usr/bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation --syslog-only
17977 17975 root S 3468 0% 0% /proc/self/exe --healthcheck /usr/lib/resin-supervisor/resin-supervisor-healthcheck --pid 17975
1391 1388 root S 3468 0% 0% /proc/self/exe --healthcheck /usr/lib/balena/balena-healthcheck --pid 1388
32373 29128 root S 3040 0% 0% nslookup api.balena-cloud.com 62.140.140.251
1386 1 nobody S 2948 0% 0% /usr/bin/dnsmasq -x /run/dnsmasq.pid -a 127.0.0.2,10.114.102.1 -7 /etc/dnsmasq.d/ -r /etc/resolv.dnsmasq -z --servers-file=/run/dnsmasq.servers -k --log-facility=-
28971 28969 root S 2896 0% 0% jq -s add | {checks:.}
28895 28884 root S 2624 0% 0% bash -s -- --balenaos-registry registry2.balena-cloud.com
29128 29127 root S 2624 0% 0% bash -s -- --balenaos-registry registry2.balena-cloud.com
28969 28895 root S 2624 0% 0% bash -s -- --balenaos-registry registry2.balena-cloud.com
28970 28969 root S 2624 0% 0% bash -s -- --balenaos-registry registry2.balena-cloud.com
29127 28970 root S 2624 0% 0% bash -s -- --balenaos-registry registry2.balena-cloud.com
17975 1 root S 2448 0% 0% {start-resin-sup} /bin/sh /usr/bin/start-resin-supervisor
28732 28722 root S 2448 0% 0% /bin/bash -l
28752 28732 root R 2364 0% 0% top
1428 1 root S 1464 0% 0% /usr/bin/hciattach /dev/serial1 bcm43xx 460800 noflow - b8:27:eb:82:67:22
24139 2 root IW 0 0% 0% [kworker/u8:3-br]
27876 2 root IW 0 0% 0% [kworker/2:1-eve]
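One caveat about the listing above: assuming this is BusyBox top, the fifth column is VSZ (virtual size) and the sixth is %VSZ, so the 976m shown for balenad would be address space rather than memory that is actually resident. A hedged sketch for ranking processes by resident memory instead reads VmRSS from /proc/<pid>/status. The function name top_rss and its arguments are illustrative only.

```shell
#!/bin/sh
# List the processes with the largest resident set size (VmRSS, in kB).
# $1: proc root (defaults to /proc), $2: number of rows (defaults to 10).
# Output columns: RSS in kB, PID, first word of the process name.
top_rss() {
    for status in "${1:-/proc}"/[0-9]*/status; do
        [ -r "$status" ] || continue
        # Kernel threads have no VmRSS line and are skipped.
        awk '
            $1 == "Name:"  { name = $2 }
            $1 == "Pid:"   { pid = $2 }
            $1 == "VmRSS:" { rss = $2 }
            END { if (rss != "") printf "%s %s %s\n", rss, pid, name }
        ' "$status" 2>/dev/null
    done | sort -rn | head -n "${2:-10}"
}

top_rss
```

Comparing this output before and after running Diagnostics should show which process, if any, is really growing.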
Running journalctl -u resin-supervisor --no-pager gives:
Jan 04 14:08:40 fea1261 systemd[1]: resin-supervisor.service: Failed to run 'start-pre' task: Bad message
Jan 04 14:08:40 fea1261 systemd[1]: resin-supervisor.service: Failed with result 'resources'.
Jan 04 14:08:40 fea1261 systemd[1]: Failed to start Balena supervisor.
Jan 04 14:08:51 fea1261 systemd[1]: resin-supervisor.service: Failed to load environment files: Bad message
Jan 04 14:08:51 fea1261 systemd[1]: resin-supervisor.service: Failed to run 'start-pre' task: Bad message
Jan 04 14:08:51 fea1261 systemd[1]: resin-supervisor.service: Failed with result 'resources'.
Jan 04 14:08:51 fea1261 systemd[1]: Failed to start Balena supervisor.
Here is a global journal: https://pastebin.com/Tz2Q83HT
Update:
I’ve noticed that memory usage increases from ~500 MB to 747 MB after running diagnostics. When I run diagnostics a second time, wwan0 disappears. Here’s the log:
http://ix.io/2KPA