Balena suddenly restarts

Hi,

I was hoping someone could help me understand what is going wrong with an balena device of ours. Somehow the devices suddenly restarts itselfs which is causing us trouble.

2023-08-05T16:15:28.085969+0000 13727ba5a2d9 openvpn[1177214]: Sat Aug  5 16:15:28 2023 Connection reset, restarting [0]
2023-08-05T16:15:28.086529+0000 13727ba5a2d9 openvpn[1177214]: Sat Aug  5 16:15:28 2023 /etc/openvpn-misc/downscript.sh resin-vpn 1500 1555 10.240.84.126 52.4.252.97 restart
2023-08-05T16:15:28.093819+0000 13727ba5a2d9 3c60309dfe5b[1067]: [e[34minfoe[39m]    VPN connection is active.
2023-08-05T16:15:28.094460+0000 13727ba5a2d9 balena-supervisor[2334]: [e[34minfoe[39m]    VPN connection is active.
2023-08-05T16:15:28.094818+0000 13727ba5a2d9 openvpn[1177214]: Sat Aug  5 16:15:28 2023 SIGUSR1[soft,connection-reset] received, process restarting
2023-08-05T16:15:28.094818+0000 13727ba5a2d9 openvpn[1177214]: Sat Aug  5 16:15:28 2023 Restart pause, 5 second(s)
2023-08-05T16:15:28.784091+0000 13727ba5a2d9 3c60309dfe5b[1067]: [e[47me[30mapie[39me[49m]     GET /v1/device 200 - 8.878 ms
2023-08-05T16:15:28.784215+0000 13727ba5a2d9 balena-supervisor[2334]: [e[47me[30mapie[39me[49m]     GET /v1/device 200 - 8.878 ms
2023-08-05T16:15:32.769488+0000 13727ba5a2d9 3c60309dfe5b[1067]: [e[47me[30mapie[39me[49m]     GET /v1/healthy 200 - 0.726 ms
2023-08-05T16:15:32.771129+0000 13727ba5a2d9 balena-supervisor[2334]: [e[47me[30mapie[39me[49m]     GET /v1/healthy 200 - 0.726 ms
2023-08-05T16:15:33.094715+0000 13727ba5a2d9 openvpn[1177214]: Sat Aug  5 16:15:33 2023 NOTE: the current --script-security setting may allow this configuration to call user-defined scripts
2023-08-05T16:15:33.095249+0000 13727ba5a2d9 openvpn[1177214]: Sat Aug  5 16:15:33 2023 TCP/UDP: Preserving recently used remote address: [AF_INET]52.7.228.224:443
2023-08-05T16:15:33.095249+0000 13727ba5a2d9 openvpn[1177214]: Sat Aug  5 16:15:33 2023 Socket Buffers: R=[131072->131072] S=[16384->16384]
2023-08-05T16:15:33.095249+0000 13727ba5a2d9 openvpn[1177214]: Sat Aug  5 16:15:33 2023 Attempting to establish TCP connection with [AF_INET]52.7.228.224:443 [nonblock]
2023-08-05T16:15:34.095266+0000 13727ba5a2d9 openvpn[1177214]: Sat Aug  5 16:15:34 2023 TCP connection established with [AF_INET]52.7.228.224:443
2023-08-05T16:15:34.095266+0000 13727ba5a2d9 openvpn[1177214]: Sat Aug  5 16:15:34 2023 Socket flags: TCP_NODELAY=1 succeeded
2023-08-05T16:15:34.095862+0000 13727ba5a2d9 openvpn[1177214]: Sat Aug  5 16:15:34 2023 TCP_CLIENT link local: (not bound)
2023-08-05T16:15:34.095862+0000 13727ba5a2d9 openvpn[1177214]: Sat Aug  5 16:15:34 2023 TCP_CLIENT link remote: [AF_INET]52.7.228.224:443
2023-08-05T16:15:34.193645+0000 13727ba5a2d9 openvpn[1177214]: Sat Aug  5 16:15:34 2023 TLS: Initial packet from [AF_INET]52.7.228.224:443, sid=52bf7a6f 53458ccf
2023-08-05T16:15:34.293717+0000 13727ba5a2d9 openvpn[1177214]: Sat Aug  5 16:15:34 2023 VERIFY OK: depth=1, C=US, ST=WA, L=Seattle, O=balena.io, OU=balenaCloud, CN=open-balena-vpn-rootCA
2023-08-05T16:15:34.293717+0000 13727ba5a2d9 openvpn[1177214]: Sat Aug  5 16:15:34 2023 VERIFY KU OK
2023-08-05T16:15:34.293717+0000 13727ba5a2d9 openvpn[1177214]: Sat Aug  5 16:15:34 2023 Validating certificate extended key usage
2023-08-05T16:15:34.293717+0000 13727ba5a2d9 openvpn[1177214]: Sat Aug  5 16:15:34 2023 ++ Certificate has EKU (str) TLS Web Server Authentication, expects TLS Web Server Authentication
2023-08-05T16:15:34.293717+0000 13727ba5a2d9 openvpn[1177214]: Sat Aug  5 16:15:34 2023 VERIFY EKU OK
2023-08-05T16:15:34.293717+0000 13727ba5a2d9 openvpn[1177214]: Sat Aug  5 16:15:34 2023 VERIFY OK: depth=0, C=US, ST=WA, L=Seattle, O=balena.io, OU=balenaCloud, CN=vpn.balena-cloud.com
2023-08-05T16:15:34.417140+0000 13727ba5a2d9 openvpn[1177214]: Sat Aug  5 16:15:34 2023 Control Channel: TLSv1.3, cipher TLSv1.3 TLS_AES_256_GCM_SHA384, 2048 bit RSA
2023-08-05T16:15:34.417140+0000 13727ba5a2d9 openvpn[1177214]: Sat Aug  5 16:15:34 2023 [vpn.balena-cloud.com] Peer Connection Initiated with [AF_INET]52.7.228.224:443
2023-08-05T16:15:35.436966+0000 13727ba5a2d9 openvpn[1177214]: Sat Aug  5 16:15:35 2023 SENT CONTROL [vpn.balena-cloud.com]: 'PUSH_REQUEST' (status=1)
2023-08-05T16:15:35.536213+0000 13727ba5a2d9 openvpn[1177214]: Sat Aug  5 16:15:35 2023 PUSH: Received control message: 'PUSH_REPLY,ping 10,ping-exit 60,sndbuf 0,rcvbuf 0,route 52.4.252.97,socket-flags TCP_NODELAY,ifconfig 10.241.13.48 52.4.252.97,peer-id 1310,cipher AES-128-GCM'
2023-08-05T16:15:35.536288+0000 13727ba5a2d9 openvpn[1177214]: Sat Aug  5 16:15:35 2023 OPTIONS IMPORT: timers and/or timeouts modified
2023-08-05T16:15:35.536288+0000 13727ba5a2d9 openvpn[1177214]: Sat Aug  5 16:15:35 2023 OPTIONS IMPORT: --sndbuf/--rcvbuf options modified
2023-08-05T16:15:35.536332+0000 13727ba5a2d9 openvpn[1177214]: Sat Aug  5 16:15:35 2023 Socket Buffers: R=[131072->131072] S=[46080->46080]
2023-08-05T16:15:35.536332+0000 13727ba5a2d9 openvpn[1177214]: Sat Aug  5 16:15:35 2023 OPTIONS IMPORT: --socket-flags option modified
2023-08-05T16:15:35.536332+0000 13727ba5a2d9 openvpn[1177214]: Sat Aug  5 16:15:35 2023 Socket flags: TCP_NODELAY=1 succeeded
2023-08-05T16:15:35.536332+0000 13727ba5a2d9 openvpn[1177214]: Sat Aug  5 16:15:35 2023 OPTIONS IMPORT: --ifconfig/up options modified
2023-08-05T16:15:35.536332+0000 13727ba5a2d9 openvpn[1177214]: Sat Aug  5 16:15:35 2023 OPTIONS IMPORT: route options modified
2023-08-05T16:15:35.536332+0000 13727ba5a2d9 openvpn[1177214]: Sat Aug  5 16:15:35 2023 OPTIONS IMPORT: peer-id set
2023-08-05T16:15:35.536435+0000 13727ba5a2d9 openvpn[1177214]: Sat Aug  5 16:15:35 2023 OPTIONS IMPORT: adjusting link_mtu to 1627
2023-08-05T16:15:35.536435+0000 13727ba5a2d9 openvpn[1177214]: Sat Aug  5 16:15:35 2023 OPTIONS IMPORT: data channel crypto options modified
2023-08-05T16:15:35.536435+0000 13727ba5a2d9 openvpn[1177214]: Sat Aug  5 16:15:35 2023 Data Channel: using negotiated cipher 'AES-128-GCM'
2023-08-05T16:15:35.536488+0000 13727ba5a2d9 openvpn[1177214]: Sat Aug  5 16:15:35 2023 Outgoing Data Channel: Cipher 'AES-128-GCM' initialized with 128 bit key
2023-08-05T16:15:35.536488+0000 13727ba5a2d9 openvpn[1177214]: Sat Aug  5 16:15:35 2023 Incoming Data Channel: Cipher 'AES-128-GCM' initialized with 128 bit key
2023-08-05T16:15:35.536488+0000 13727ba5a2d9 openvpn[1177214]: Sat Aug  5 16:15:35 2023 Preserving previous TUN/TAP instance: resin-vpn
2023-08-05T16:15:35.536488+0000 13727ba5a2d9 openvpn[1177214]: Sat Aug  5 16:15:35 2023 /etc/openvpn-misc/upscript.sh resin-vpn 1500 1555 10.240.84.126 52.4.252.97 restart
2023-08-05T16:15:35.539706+0000 13727ba5a2d9 3c60309dfe5b[1067]: [e[34minfoe[39m]    VPN connection is active.
2023-08-05T16:15:35.539979+0000 13727ba5a2d9 balena-supervisor[2334]: [e[34minfoe[39m]    VPN connection is active.
2023-08-05T16:15:35.540452+0000 13727ba5a2d9 3c60309dfe5b[1067]: [e[34minfoe[39m]    VPN connection is active.
2023-08-05T16:15:35.540531+0000 13727ba5a2d9 balena-supervisor[2334]: [e[34minfoe[39m]    VPN connection is active.
2023-08-05T16:15:35.541398+0000 13727ba5a2d9 3c60309dfe5b[1067]: [e[34minfoe[39m]    VPN connection is active.
2023-08-05T16:15:35.541429+0000 13727ba5a2d9 3c60309dfe5b[1067]: [e[34minfoe[39m]    VPN connection is active.
2023-08-05T16:15:35.541557+0000 13727ba5a2d9 openvpn[1177214]: Sat Aug  5 16:15:35 2023 NOTE: Pulled options changed on restart, will need to close and reopen TUN/TAP device.
2023-08-05T16:15:35.541557+0000 13727ba5a2d9 openvpn[1177214]: Sat Aug  5 16:15:35 2023 Closing TUN/TAP interface
2023-08-05T16:15:35.541557+0000 13727ba5a2d9 openvpn[1177214]: Sat Aug  5 16:15:35 2023 /sbin/ip addr del dev resin-vpn local 10.240.84.126 peer 52.4.252.97
2023-08-05T16:15:35.541619+0000 13727ba5a2d9 balena-supervisor[2334]: [e[34minfoe[39m]    VPN connection is active.
2023-08-05T16:15:35.541619+0000 13727ba5a2d9 balena-supervisor[2334]: [e[34minfoe[39m]    VPN connection is active.
2023-08-05T16:15:35.542652+0000 13727ba5a2d9 openvpn[2176928]: ip: RTNETLINK answers: Operation not permitted
2023-08-05T16:15:35.543017+0000 13727ba5a2d9 openvpn[1177214]: Sat Aug  5 16:15:35 2023 Linux ip addr del failed: external program exited with error status: 2
2023-08-05T16:15:35.554141+0000 13727ba5a2d9 openvpn[1177214]: Sat Aug  5 16:15:35 2023 /etc/openvpn-misc/downscript.sh resin-vpn 1500 1555 10.240.84.126 52.4.252.97 init
2023-08-05T16:15:35.557200+0000 13727ba5a2d9 3c60309dfe5b[1067]: [e[34minfoe[39m]    VPN connection is not active.
2023-08-05T16:15:35.557328+0000 13727ba5a2d9 balena-supervisor[2334]: [e[34minfoe[39m]    VPN connection is not active.
2023-08-05T16:15:36.557417+0000 13727ba5a2d9 openvpn[1177214]: Sat Aug  5 16:15:36 2023 ROUTE_GATEWAY 192.168.50.1/255.255.255.0 IFACE=wlan0 HWADDR=a0:e7:0b:7e:9b:95
2023-08-05T16:15:36.557718+0000 13727ba5a2d9 openvpn[1177214]: Sat Aug  5 16:15:36 2023 ERROR: Cannot ioctl TUNSETIFF resin-vpn: Operation not permitted (errno=1)
2023-08-05T16:15:36.557718+0000 13727ba5a2d9 openvpn[1177214]: Sat Aug  5 16:15:36 2023 Exiting due to fatal error
2023-08-05T16:15:36.558334+0000 13727ba5a2d9 systemd[1]: openvpn.service: Main process exited, code=exited, status=1/FAILURE

Device info

Can anyone tell me why this is happening and how to fix this?

Thanks.

Hi there,

From the looks of the logs that you have provided, it seems that the device is losing its VPN connectivity - can you confirm that this is what is happening, and the device is not actually rebooting or restarting?

Hi,

Thanks for your response.

Will that result in restarting of the docker containers? Because what we have seen is that the application running in the docker container is rebooted. So the device itself maybe is not rebooted.

Hi there, your docker containers shouldn’t restart if the device loses VPN connectivity.

The logs you show demonstrate the vpn connection dropping, and openvpn, the service that manages this VPN connection, restarts (expected behaviour in the case of a connection loss).

But this shouldn’t be causing container restarts. The reason why is probably somewhere else in the devices logs. You might be able to see by looking at the logs generated from the docker/balena engine journalctl -a -u balena ? Or maybe from the supervisor logs. Or if the device is actually rebooting, you should be able to tell by checking something like uptime and seeing if it has been up as long as you expect

mmm,

I understand what you mean, but it somehow appens sortly after the reconnection. See the logs after systemd[1]: openvpn.service: Main process exited, code=exited, status=1/FAILURE

logs.txt (192.8 KB)

Does these logs give an possible cause?

thanks

Indeed they do

2023-08-07T22:19:50.700321+0000 13727c75de54 systemd[1]: balena.service: Watchdog timeout (limit 6min)!
2023-08-07T22:19:50.700360+0000 13727c75de54 systemd[1]: balena.service: Killing process 1068 (balenad) with signal SIGABRT.
2023-08-07T22:19:50.702528+0000 13727c75de54 balenad[1068]: SIGABRT: abort

The engine is killed due to a watchdog timeout by the looks of it!

in my experience, its quite hard to determine the root cause. I see from your screen shot that youre on a fairly old OS version - I’d recommend updating the OS once your device has stabilised as many improvements to engine stability have been made since then. Update process details - Balena Documentation

Okay, I will do that first and see if this solves te problems. Thanks for the responses!