Error restarting the app

Hello!

I spent the whole day working with my code, and I restarted my app a lot of times today.

However, one time I got some strange behavior that ended in an endless loop.

To restart the app I was killing the main process (not using the dashboard restart button).

This is the log:

Caught SIGTERM signal!
Sending SIGTERM to remaining processes…
Sending SIGKILL to remaining processes…
Unmounting file systems.
Unmounting /sys/kernel/debug.
Unmounting /lib/modules.
Unmounting /dev/mqueue.
All filesystems unmounted.
Rebooting.

Code: System error

Message: stat /tmp/resin-supervisor/139312: no such file or directory

Frames:

0: setupRootfs
Package: github.com/opencontainers/runc/libcontainer
File: rootfs_linux.go@40

1: Init
Package: github.com/opencontainers/runc/libcontainer.(*linuxStandardInit)
File: standard_init_linux.go@57

2: StartInitialization
Package: github.com/opencontainers/runc/libcontainer.(*LinuxFactory)
File: factory_linux.go@240

3: initializer
Package: github.com/docker/docker/daemon/execdriver/native
File: init.go@35

I restarted through the dashboard and it worked fine. I did not experience the issue again.

OS: Resin OS 2.0.0+rev3 (dev)

Just thought I would share this in case it is a bug and you can solve it.

Thanks!

Hi @diegosucaria, how exactly did you kill the main process? Through the web terminal, killing a PID? Or some other way? Just so we can try to troubleshoot what’s going on (and you are using a dev image, so there is a lot that you could have done) :dolphin:

@imrehg at the end of the start.sh file I start the X server in the foreground, and the next line is a reboot.

So when I kill the X server with pkill xinit, the next line does the reboot.

Well, that sounds to me like you are not killing the main process, but actually making the container finish with a reboot.
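For readers following along, the start.sh layout described above might look roughly like the sketch below. The original script is not shown in the thread, so the function name `run_then_restart` and the variables `FOREGROUND_CMD` and `AFTER_EXIT_CMD` are hypothetical, chosen only to illustrate the "foreground process, then reboot" pattern.

```shell
#!/bin/sh
# Hypothetical reconstruction of the end of start.sh as described in the post.
# None of these names come from the original script.

# Command that blocks in the foreground (the post starts the X server via xinit).
FOREGROUND_CMD="${FOREGROUND_CMD:-xinit}"
# Command that runs once the foreground command returns (the post uses reboot).
AFTER_EXIT_CMD="${AFTER_EXIT_CMD:-reboot}"

run_then_restart() {
    # Blocks here until the foreground process exits,
    # for example after running `pkill xinit` in a terminal.
    $FOREGROUND_CMD
    # Reached only once the foreground process has ended.
    $AFTER_EXIT_CMD
}

# In a real start.sh the last line would call the function:
# run_then_restart
```

With this layout, killing xinit does not kill the script itself; it only lets the script move on to its next line.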

BTW, I think you do not need reboot - when your script finishes, that should trigger a restart by default…

The problem reappeared. I had a device online for a week, and today I found the screen off; the app container is not running and it didn’t start again.

If you want to take a look at the logs I will send you the device URL. I left it as is.

Please send us the URL. The supervisor stops restarting if there are too many restarts.

Here is the URL: https://dashboard.resin.io/apps/390300/devices/555453/summary

Hi @lifeeth, were you able to take a look into it?

Hi Diego,
I’ve just taken a look at that device and restarted the device, not just the container. Things seem to have come back up okay. I notice in the logs that your container code exits occasionally, most recent example being 19.05.17 03:13:42 (+0000)

I think you forgot to paste part of the code…

Thanks for taking a look.

Anyway, yes, I knew a reboot would fix the issue. It happened before (when I wrote the first post), but it seems to be a bug in the supervisor, perhaps?

The fact is, a container restart from inside my app made the supervisor hang. It doesn’t always happen, and I have not experienced it on the 1.x OS.

This time was particularly dangerous, because it could have been a production device (this is a dev one sitting on my desk) working well for days that suddenly hung without warning.

Hey, are you still using pkill to restart the container? Does the supervisor hang also happen when you use the API (either through the dashboard restart button, or the supervisor directly with a supervisor restart call)? Just trying to get to the bottom of why it might hang, and the more info you can give on what exactly you are doing, the more likely it is that we can replicate it. Thanks!
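For reference, a supervisor restart call from inside the container could look something like the sketch below. It assumes the supervisor exposes a `POST /v1/restart` endpoint and that `RESIN_SUPERVISOR_ADDRESS`, `RESIN_SUPERVISOR_API_KEY` and `RESIN_APP_ID` are set in the container environment; please check the supervisor API documentation for your OS version before relying on it. The helper function names are hypothetical.

```shell
#!/bin/sh
# Sketch: ask the supervisor to restart the application container.
# Assumes RESIN_SUPERVISOR_ADDRESS, RESIN_SUPERVISOR_API_KEY and RESIN_APP_ID
# are provided by the environment. Function names are illustrative only.

restart_url() {
    # Build the endpoint URL with the API key as a query parameter.
    printf '%s/v1/restart?apikey=%s' "$RESIN_SUPERVISOR_ADDRESS" "$RESIN_SUPERVISOR_API_KEY"
}

restart_payload() {
    # JSON body naming the application to restart.
    printf '{"appId": %s}' "$RESIN_APP_ID"
}

request_restart() {
    # Send the restart request to the supervisor.
    curl -X POST --header "Content-Type:application/json" \
        --data "$(restart_payload)" "$(restart_url)"
}

# To trigger the restart, call:
# request_restart
```

Comparing a restart triggered this way with one triggered by the reboot in start.sh should help narrow down where the hang comes from.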

For some reason my main app stopped responding, and the watchdog in the start.sh script triggered a reboot.

The first time I got the error it was from a pkill xinit I entered manually in the terminal, which does indeed trigger the same reboot in start.sh.

I haven’t experienced the issue by rebooting from an API call.
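The watchdog in start.sh is not shown in the thread, so the following is only a minimal sketch of what such a loop might look like. `HEALTH_CMD`, `CHECK_INTERVAL`, `MAX_FAILURES` and `watch_until_unhealthy` are hypothetical names, and the default check (`pgrep xinit`) only tests that the process exists, not that the app is actually responding.

```shell
#!/bin/sh
# Hypothetical watchdog loop; the real start.sh watchdog is not shown in the thread.

# Command whose exit status says whether the app is healthy (assumption).
HEALTH_CMD="${HEALTH_CMD:-pgrep xinit}"
# Seconds to wait between checks.
CHECK_INTERVAL="${CHECK_INTERVAL:-10}"
# Consecutive failed checks tolerated before giving up.
MAX_FAILURES="${MAX_FAILURES:-3}"

watch_until_unhealthy() {
    failures=0
    while [ "$failures" -lt "$MAX_FAILURES" ]; do
        if $HEALTH_CMD >/dev/null 2>&1; then
            failures=0                    # healthy: reset the counter
        else
            failures=$((failures + 1))    # unhealthy: count the failure
        fi
        sleep "$CHECK_INTERVAL"
    done
    # Returns once the check has failed MAX_FAILURES times in a row.
}

# In a real script: call watch_until_unhealthy, then decide how to restart.
```

Once the function returns, the script could simply exit and let the supervisor restart the container, as suggested earlier in the thread, instead of calling reboot.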

It happened again. So far this problem seems to affect only the device with the -dev variant of Resin OS.