I’ve got an application that is deployed to 1 Balena Fin v1.1 device right now for testing - my services seem to be in a constant state of restarting right now. If I look at the supervisor diagnostics (Diagnostics -> Supervisor State) I can see the following:
Service containers being starting and then seem to get a term signal and reboot. It actually looks like they are getting stopped, removed, re-installed, started, then stopped… on a continuous loop.
I suspect this is occuring because the supervisor is trying to correct whatever error is thinks occurred for the last update (pending and failed are both true) - and I’m not sure how to force a full update. Right delta updates are turned on and I have NOT tried turning delta off. I actually want to leave it in this state so someone at Balena can take a look?
Any ideas? I’m glad to grant access to support - will send details about the app
via private message if needed.
Eventually the cycle does stop or slow down with containers in a Stopped state - but, I’m guessing that because everything is going into backoff and just trying to start less often. Need help at this point!
I found that running the command journalctl -f -n 1000 -a -u balena can shed a little more light on whatever errors are occurring.
Found this error in that log output - repeats on a regular basis!
Apr 13 04:12:02 b05ae2a balenad[790]: time="2020-04-13T04:12:02.439484599Z" level=error msg="245a5107e4854b82c816e84a20bd619a2fa549aeffae1d6bc9a1ae787c266255 cleanup: failed to delete container from containerd: no such container"
Apr 13 04:12:02 b05ae2a balenad[790]: time="2020-04-13T04:12:02.439729025Z" level=error msg="Handler for POST /containers/245a5107e4854b82c816e84a20bd619a2fa549aeffae1d6bc9a1ae787c266255/start returned error: OCI runtime create failed: container_linux.go:338: creating new parent process caused \"container_linux.go:1897: running lstat on namespace path \\\"/proc/10273/ns/net\\\" caused \\\"lstat /proc/10273/ns/net: no such file or directory\\\"\": unknown"
Apr 13 04:12:02 b05ae2a 8a566be31289[790]: [error] Scheduling another update attempt in 900000ms due to failure: Error: Failed to apply state transition steps. (HTTP code 500) server error - OCI runtime create failed: container_linux.go:338: creating new parent process caused "container_linux.go:1897: running lstat on namespace path \"/proc/10273/ns/net\" caused \"lstat /proc/10273/ns/net: no such file or directory\"": unknown Steps:["start","start","kill","kill","kill","kill"]
Apr 13 04:12:02 b05ae2a 8a566be31289[790]: [error] at /usr/src/app/dist/app.js:614:16375
Apr 13 04:12:02 b05ae2a 8a566be31289[790]: [error] at c (/usr/src/app/dist/app.js:9:77566)
Apr 13 04:12:02 b05ae2a 8a566be31289[790]: [error] at O._settlePromiseFromHandler (/usr/src/app/dist/app.js:312:325515)
Apr 13 04:12:02 b05ae2a 8a566be31289[790]: [error] at O._settlePromise (/usr/src/app/dist/app.js:312:326315)
Apr 13 04:12:02 b05ae2a 8a566be31289[790]: [error] at O._settlePromise0 (/usr/src/app/dist/app.js:312:327014)
Apr 13 04:12:02 b05ae2a 8a566be31289[790]: [error] at O._settlePromises (/usr/src/app/dist/app.js:312:328255)
Apr 13 04:12:02 b05ae2a 8a566be31289[790]: [error] at d (/usr/src/app/dist/app.js:312:329960)
Apr 13 04:12:02 b05ae2a 8a566be31289[790]: [error] at p (/usr/src/app/dist/app.js:312:329899)
Apr 13 04:12:02 b05ae2a 8a566be31289[790]: [error] at s._drainQueues (/usr/src/app/dist/app.js:312:331419)
Apr 13 04:12:02 b05ae2a 8a566be31289[790]: [error] at Immediate.drainQueues [as _onImmediate] (/usr/src/app/dist/app.js:312:329641)
Apr 13 04:12:02 b05ae2a 8a566be31289[790]: [error] at runCallback (timers.js:705:18)
Apr 13 04:12:02 b05ae2a 8a566be31289[790]: [error] at tryOnImmediate (timers.js:676:5)
Apr 13 04:12:02 b05ae2a 8a566be31289[790]: [error] at processImmediate (timers.js:658:5)
Apr 13 04:12:02 b05ae2a 8a566be31289[790]: [error] Device state apply error Error: Failed to apply state transition steps. (HTTP code 500) server error - OCI runtime create failed: container_linux.go:338: creating new parent process caused "container_linux.go:1897: running lstat on namespace path \"/proc/10273/ns/net\" caused \"lstat /proc/10273/ns/net: no such file or directory\"": unknown Steps:["start","start","kill","kill","kill","kill"]
Apr 13 04:12:02 b05ae2a 8a566be31289[790]: [error] at /usr/src/app/dist/app.js:614:16375
Apr 13 04:12:02 b05ae2a 8a566be31289[790]: [error] at c (/usr/src/app/dist/app.js:9:77566)
Apr 13 04:12:02 b05ae2a 8a566be31289[790]: [error] at O._settlePromiseFromHandler (/usr/src/app/dist/app.js:312:325515)
Apr 13 04:12:02 b05ae2a 8a566be31289[790]: [error] at O._settlePromise (/usr/src/app/dist/app.js:312:326315)
Apr 13 04:12:02 b05ae2a 8a566be31289[790]: [error] at O._settlePromise0 (/usr/src/app/dist/app.js:312:327014)
Apr 13 04:12:02 b05ae2a 8a566be31289[790]: [error] at O._settlePromises (/usr/src/app/dist/app.js:312:328255)
Apr 13 04:12:02 b05ae2a 8a566be31289[790]: [error] at d (/usr/src/app/dist/app.js:312:329960)
Apr 13 04:12:02 b05ae2a 8a566be31289[790]: [error] at p (/usr/src/app/dist/app.js:312:329899)
Apr 13 04:12:02 b05ae2a 8a566be31289[790]: [error] at s._drainQueues (/usr/src/app/dist/app.js:312:331419)
Apr 13 04:12:02 b05ae2a 8a566be31289[790]: [error] at Immediate.drainQueues [as _onImmediate] (/usr/src/app/dist/app.js:312:329641)
Apr 13 04:12:02 b05ae2a 8a566be31289[790]: [error] at runCallback (timers.js:705:18)
Apr 13 04:12:02 b05ae2a 8a566be31289[790]: [error] at tryOnImmediate (timers.js:676:5)
Apr 13 04:12:02 b05ae2a 8a566be31289[790]: [error] at processImmediate (timers.js:658:5)
I’m sure you’ll need more of the log than this, but I’m not sure what parts are pertinent, yet. Will share more with support once someone is available to take a look.
[59384.096539] device vethce0d80f entered promiscuous mode
[59384.097116] IPv6: ADDRCONF(NETDEV_UP): vethce0d80f: link is not ready
[59384.223727] br-ebc537e44695: port 3(veth67f2e69) entered blocking state
[59384.223745] br-ebc537e44695: port 3(veth67f2e69) entered disabled state
[59384.223976] device veth67f2e69 entered promiscuous mode
[59384.224277] IPv6: ADDRCONF(NETDEV_UP): veth67f2e69: link is not ready
[59384.224291] br-ebc537e44695: port 3(veth67f2e69) entered blocking state
[59384.224300] br-ebc537e44695: port 3(veth67f2e69) entered forwarding state
[59385.099870] br-ebc537e44695: port 3(veth67f2e69) entered disabled state
[59385.387963] br-ebc537e44695: port 2(vethb88dfc4) entered disabled state
[59385.388551] vethbb40a59: renamed from eth0
[59385.505614] br-ebc537e44695: port 6(veth8ddbd4a) entered disabled state
[59385.508940] veth28f0639: renamed from eth0
[59385.646338] br-ebc537e44695: port 2(vethb88dfc4) entered disabled state
[59385.665720] device vethb88dfc4 left promiscuous mode
[59385.665737] br-ebc537e44695: port 2(vethb88dfc4) entered disabled state
[59385.670234] vetha3bc066: renamed from eth0
[59385.730659] br-ebc537e44695: port 4(veth930713b) entered disabled state
[59385.806571] br-ebc537e44695: port 6(veth8ddbd4a) entered disabled state
[59385.826056] device veth8ddbd4a left promiscuous mode
[59385.826080] br-ebc537e44695: port 6(veth8ddbd4a) entered disabled state
[59385.900683] br-ebc537e44695: port 4(veth930713b) entered disabled state
[59385.927718] device veth930713b left promiscuous mode
[59385.927738] br-ebc537e44695: port 4(veth930713b) entered disabled state
[59386.505356] br-ebc537e44695: port 5(veth06e5425) entered disabled state
[59386.510655] vethae54f63: renamed from eth0
[59386.833069] br-ebc537e44695: port 5(veth06e5425) entered disabled state
[59386.846914] device veth06e5425 left promiscuous mode
[59386.846935] br-ebc537e44695: port 5(veth06e5425) entered disabled state
[59387.213584] eth0: renamed from veth5180a1c
[59387.264177] eth0: renamed from vethed0998c
[59387.310341] IPv6: ADDRCONF(NETDEV_CHANGE): veth67f2e69: link becomes ready
[59387.310623] br-ebc537e44695: port 3(veth67f2e69) entered blocking state
[59387.310639] br-ebc537e44695: port 3(veth67f2e69) entered forwarding state
[59387.314350] IPv6: ADDRCONF(NETDEV_CHANGE): vethce0d80f: link becomes ready
[59387.314587] br-ebc537e44695: port 1(vethce0d80f) entered blocking state
[59387.314613] br-ebc537e44695: port 1(vethce0d80f) entered forwarding state
[59594.853977] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[59909.824857] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[60224.819264] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[60287.121532] br-ebc537e44695: port 2(vethaae3af5) entered blocking state
[60287.121550] br-ebc537e44695: port 2(vethaae3af5) entered disabled state
[60287.123723] device vethaae3af5 entered promiscuous mode
[60287.125241] IPv6: ADDRCONF(NETDEV_UP): vethaae3af5: link is not ready
[60287.548785] br-ebc537e44695: port 4(veth4a0ba06) entered blocking state
[60287.548802] br-ebc537e44695: port 4(veth4a0ba06) entered disabled state
[60287.549286] device veth4a0ba06 entered promiscuous mode
[60287.549938] IPv6: ADDRCONF(NETDEV_UP): veth4a0ba06: link is not ready
[60287.549964] br-ebc537e44695: port 4(veth4a0ba06) entered blocking state
[60287.549977] br-ebc537e44695: port 4(veth4a0ba06) entered forwarding state
[60287.979016] br-ebc537e44695: port 5(veth45c4c5a) entered blocking state
[60287.979034] br-ebc537e44695: port 5(veth45c4c5a) entered disabled state
[60287.979389] device veth45c4c5a entered promiscuous mode
[60287.983610] IPv6: ADDRCONF(NETDEV_UP): veth45c4c5a: link is not ready
[60287.983636] br-ebc537e44695: port 5(veth45c4c5a) entered blocking state
[60287.983645] br-ebc537e44695: port 5(veth45c4c5a) entered forwarding state
[60288.114942] br-ebc537e44695: port 4(veth4a0ba06) entered disabled state
[60288.115206] br-ebc537e44695: port 5(veth45c4c5a) entered disabled state
[60288.134063] br-ebc537e44695: port 6(veth776d04f) entered blocking state
[60288.134082] br-ebc537e44695: port 6(veth776d04f) entered disabled state
[60288.134350] device veth776d04f entered promiscuous mode
[60288.135484] IPv6: ADDRCONF(NETDEV_UP): veth776d04f: link is not ready
[60288.135509] br-ebc537e44695: port 6(veth776d04f) entered blocking state
[60288.135519] br-ebc537e44695: port 6(veth776d04f) entered forwarding state
[60288.292794] br-ebc537e44695: port 6(veth776d04f) entered disabled state
[60288.293411] br-ebc537e44695: port 3(veth67f2e69) entered disabled state
[60288.297930] veth5180a1c: renamed from eth0
[60288.510853] br-ebc537e44695: port 3(veth67f2e69) entered disabled state
[60288.519430] device veth67f2e69 left promiscuous mode
[60288.519451] br-ebc537e44695: port 3(veth67f2e69) entered disabled state
[60288.558525] br-ebc537e44695: port 1(vethce0d80f) entered disabled state
[60288.561308] vethed0998c: renamed from eth0
[60288.775951] br-ebc537e44695: port 1(vethce0d80f) entered disabled state
[60288.786723] device vethce0d80f left promiscuous mode
[60288.786744] br-ebc537e44695: port 1(vethce0d80f) entered disabled state
[60289.756186] eth0: renamed from vethef18472
[60289.806308] IPv6: ADDRCONF(NETDEV_CHANGE): vethaae3af5: link becomes ready
[60289.806471] br-ebc537e44695: port 2(vethaae3af5) entered blocking state
[60289.806486] br-ebc537e44695: port 2(vethaae3af5) entered forwarding state
[60290.376483] eth0: renamed from vethaf2aa43
[60290.415930] IPv6: ADDRCONF(NETDEV_CHANGE): veth4a0ba06: link becomes ready
[60290.416055] br-ebc537e44695: port 4(veth4a0ba06) entered blocking state
[60290.416067] br-ebc537e44695: port 4(veth4a0ba06) entered forwarding state
[60291.036098] eth0: renamed from vethe396aff
[60291.097267] IPv6: ADDRCONF(NETDEV_CHANGE): veth776d04f: link becomes ready
[60291.097406] br-ebc537e44695: port 6(veth776d04f) entered blocking state
[60291.097416] br-ebc537e44695: port 6(veth776d04f) entered forwarding state
[60291.426182] eth0: renamed from veth5aaddbf
[60291.478234] IPv6: ADDRCONF(NETDEV_CHANGE): veth45c4c5a: link becomes ready
[60291.478361] br-ebc537e44695: port 5(veth45c4c5a) entered blocking state
[60291.478376] br-ebc537e44695: port 5(veth45c4c5a) entered forwarding state
[60294.871359] br-ebc537e44695: port 1(veth30d08d3) entered blocking state
[60294.871375] br-ebc537e44695: port 1(veth30d08d3) entered disabled state
[60294.874606] device veth30d08d3 entered promiscuous mode
[60294.880869] IPv6: ADDRCONF(NETDEV_UP): veth30d08d3: link is not ready
[60295.168480] br-ebc537e44695: port 3(veth322e099) entered blocking state
[60295.168506] br-ebc537e44695: port 3(veth322e099) entered disabled state
[60295.168868] device veth322e099 entered promiscuous mode
[60295.169381] IPv6: ADDRCONF(NETDEV_UP): veth322e099: link is not ready
[60295.169402] br-ebc537e44695: port 3(veth322e099) entered blocking state
[60295.169410] br-ebc537e44695: port 3(veth322e099) entered forwarding state
[60295.905158] br-ebc537e44695: port 3(veth322e099) entered disabled state
[60296.109308] br-ebc537e44695: port 6(veth776d04f) entered disabled state
[60296.109626] vethe396aff: renamed from eth0
[60296.299971] br-ebc537e44695: port 6(veth776d04f) entered disabled state
[60296.309852] device veth776d04f left promiscuous mode
[60296.309879] br-ebc537e44695: port 6(veth776d04f) entered disabled state
[60296.504938] br-ebc537e44695: port 4(veth4a0ba06) entered disabled state
[60296.511532] vethaf2aa43: renamed from eth0
[60296.769741] br-ebc537e44695: port 4(veth4a0ba06) entered disabled state
[60296.783316] device veth4a0ba06 left promiscuous mode
[60296.783351] br-ebc537e44695: port 4(veth4a0ba06) entered disabled state
[60297.180850] br-ebc537e44695: port 5(veth45c4c5a) entered disabled state
[60297.181299] veth5aaddbf: renamed from eth0
[60297.393470] br-ebc537e44695: port 5(veth45c4c5a) entered disabled state
[60297.401939] device veth45c4c5a left promiscuous mode
[60297.401955] br-ebc537e44695: port 5(veth45c4c5a) entered disabled state
[60297.742132] br-ebc537e44695: port 2(vethaae3af5) entered disabled state
[60297.742605] vethef18472: renamed from eth0
[60298.006605] eth0: renamed from veth1e04dbd
[60298.067170] br-ebc537e44695: port 2(vethaae3af5) entered disabled state
[60298.079836] device vethaae3af5 left promiscuous mode
[60298.079852] br-ebc537e44695: port 2(vethaae3af5) entered disabled state
[60298.082367] eth0: renamed from vethb568c95
[60298.116108] IPv6: ADDRCONF(NETDEV_CHANGE): veth322e099: link becomes ready
[60298.116370] br-ebc537e44695: port 3(veth322e099) entered blocking state
[60298.116388] br-ebc537e44695: port 3(veth322e099) entered forwarding state
[60298.118017] IPv6: ADDRCONF(NETDEV_CHANGE): veth30d08d3: link becomes ready
[60298.118261] br-ebc537e44695: port 1(veth30d08d3) entered blocking state
[60298.118288] br-ebc537e44695: port 1(veth30d08d3) entered forwarding state
I’ve run the diagnostics and will send the output to you via private message.
I noticed yesterday that the OS version is tagged “next” – and it doesn’t seem to be the current released version. I didn’t want to change anything as I figured it was good to have some look at this behavior.
I’m guessing I downloaded that OS image before someone was able to mark is as pre-release.
@zvin - I restarted the services and everything returned back to a normal state after your fix.
I rebooted the device and it’s back into that weird state again. Is there a way to track down what’s causing it to get into this state?
We seem to have a remedy - manually correcting the issue by stopping supervisor, stopping and removing containers, and restarting supervisor - but, how do we prevent it from getting into this state to start with?
I’ll leave it in the strange state just in case you need to look at it.
I’ve updated the supervisor and purged its database.
It works as expected now.
Still I’m not sure how it gets in that weird state.
Please ping us if that happens again.
I’m able to start terminal sessions from the dashboard to the host and any service container I want. I was also able to change a service variable and see it change in the log output.
Restart and/or Reboot both throw the error Request error: Unauthorized