Reboot challenges: Raspberry Pi 3 + -node:7/-node:latest (likely my fault)

Hello gurus,
I’m having great success configuring an image with a Dockerfile.template and making great things happen with a Raspberry Pi camera and video streams (different solution from examples). Cool!

My headache begins when rebooting. After a reboot, the app will not start. I can log in to the host OS, which is good, but the restart option after reboot results in an error. Rebooting again from the website doesn’t help. Purging data doesn’t help. Shutting down and booting manually doesn’t help. The only temporary solution I’ve found so far, as I continue to troubleshoot, is to burn a new disk and start working with the new device (icky).

What’s odd is that I can push new changes to the app with Git and the app will download and restart without issues as far as I can see.

I’m wondering if there is something I’m doing in the configuration that is causing the app container to not play nice with the host OS? Is that possible? I am running apt-get update and upgrade, along with installing ffmpeg from source.

Here is my Dockerfile.template:

# I don't have a huge preference on this and can adjust if needed:
FROM resin/%%RESIN_MACHINE_NAME%%-node:7

# Good, yes?:
ENV INITSYSTEM on

RUN apt-get -q update && \
apt-get upgrade && \
apt-get install -yq --no-install-recommends build-essential libraspberrypi-bin && \
apt-get clean

# Install H264 support from source
# ...

# Install ffmpeg from source
# ...

COPY . /usr/src/app

# will run when container starts up on the device
CMD ["echo", "hiya"]

Thanks for taking a look in advance! Also, thanks for tolerating my lack of docker/resin.io experience (I’m catching up!). Please let me know what additional info I can provide. I’m retrying a fresh install and can provide logs soon to start.

Kind regards,

Chris

update: I’m trying without apt-get upgrade next…
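For reference, a minimal sketch of what the package step could look like with the upgrade dropped. It assumes the same base image and packages as the Dockerfile.template above. The extra rm of the apt lists is my own addition, not something from the original file. Note that apt-get upgrade without -y can stop at a confirmation prompt that a non-interactive build cannot answer, so if the upgrade is kept it would need -y as well.

```dockerfile
FROM resin/%%RESIN_MACHINE_NAME%%-node:7

# Update, install, and clean up in a single RUN so the package lists
# and download cache do not end up stored in an image layer.
# -y answers prompts automatically; -q keeps the build log quieter.
RUN apt-get -q update && \
    apt-get install -yq --no-install-recommends build-essential libraspberrypi-bin && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*
```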

What sort of error do you see? How are you rebooting: with the “reboot” button in the dashboard? And what do you mean when you say your app will not start: how does it normally show that it is running? (I’m not sure what your app should be doing normally, hence the question.) Also, which resinOS version are you using? These details should give us some hints :slight_smile:

Hello,
Thanks for the note. Here are further details:

I’m using the reboot button in the dashboard, and also tried unplugging and re-plugging Raspberry Pi manually.

After booting, I’m supposed to see “hiya” echoed in the terminal. This comes from the last line of my Dockerfile.template (the actual app is still being built, so for now the image only installs dependencies):

CMD ["echo", "hiya"]

Once installed, I see this in the logs (awesome):

05.01.18 08:53:46 (-0700) Applying boot config: {"gpu_mem":"128","start_x":"1"}
05.01.18 08:53:49 (-0700) Applied boot config: {"gpu_mem":"128","start_x":"1"}
05.01.18 08:53:51 (-0700) Rebooting
05.01.18 08:55:06 (-0700) Downloading application 'registry2.resin.io/myapp/4780a4c5becb97ed430569292c5b07cb8aa12345'
05.01.18 09:16:56 (-0700) Downloaded application 'registry2.resin.io/myapp/4780a4c5becb97ed430569292c5b07cb8aa12345'
05.01.18 09:16:56 (-0700) Installing application 'registry2.resin.io/myapp/4780a4c5becb97ed430569292c5b07cb8aa12345'
05.01.18 09:17:11 (-0700) Installed application 'registry2.resin.io/myapp/4780a4c5becb97ed430569292c5b07cb8aa12345'
05.01.18 09:17:11 (-0700) Starting application 'registry2.resin.io/myapp/4780a4c5becb97ed430569292c5b07cb8aa12345'
05.01.18 09:17:15 (-0700) Started application 'registry2.resin.io/myapp/4780a4c5becb97ed430569292c5b07cb8aa12345'
05.01.18 09:17:15 (-0700) Systemd init system enabled.
05.01.18 09:17:15 (-0700) systemd 215 running in system mode. (+PAM +AUDIT +SELINUX +IMA +SYSVINIT +LIBCRYPTSETUP +GCRYPT +ACL +XZ -SECCOMP -APPARMOR)
05.01.18 09:17:15 (-0700) Detected virtualization 'other'.
05.01.18 09:17:15 (-0700) Detected architecture 'arm'.
05.01.18 09:17:15 (-0700) Set hostname to <abcdef0>.
05.01.18 09:17:16 (-0700) Downloading application 'registry2.resin.io/myapp/934b1f29bc4a9c8766e99ea804122c4d4d712345'
05.01.18 09:17:18 (-0700) hiya

Then I can develop and push changes with success. When I press the reboot button in the dashboard:

05.01.18 11:03:20 (-0700) Killing application 'registry2.resin.io/myapp/934b1f29bc4a9c8766e99ea804122c4d4d12345'
05.01.18 11:03:20 (-0700) Sending SIGTERM to remaining processes...
05.01.18 11:03:20 (-0700) Sending SIGKILL to remaining processes...
05.01.18 11:03:20 (-0700) Unmounting file systems.
05.01.18 11:03:20 (-0700) Unmounting /sys/kernel/debug.
05.01.18 11:03:20 (-0700) Unmounting /dev/mqueue.
05.01.18 11:03:21 (-0700) All filesystems unmounted.
05.01.18 11:03:21 (-0700) Halting system.
05.01.18 11:03:26 (-0700) Killed application 'registry2.resin.io/myapp/934b1f29bc4a9c8766e99ea804122c4d4d12345'
05.01.18 11:03:26 (-0700) Rebooting

This time I did not see any errors. Logs stop at “Rebooting”, the device status changes to “Offline”, then shortly to “Online” as expected. Then it updates to “Stopping”, and I’m not seeing any further logs. If I try to reboot again from the dashboard, I see an error message at the top:

Device not found: 944809

However, I’m still able to connect to the host OS via the dashboard terminal (which is super cool by the way):

Connecting to 9f8a7deec8f52070e2fc9e3814717a14...
Spawning shell...
=============================================================
    Welcome to ResinOS
=============================================================
bash-4.3#

I’m using the latest Raspberry Pi + Node.js dev image: 2.7.8+rev1-dev-v6.4.2

Thanks again for taking a look. Let me know if I can provide additional info.

Hi Chris, can you please provide us a link to the device on the dashboard and allow support access from the “actions” panel?

zvin and team,
Thanks for taking a look at my device. You mentioned that the /data folder was corrupt. I’ve concluded that this is DEFINITELY my fault, as the topic subject line suggests. Removing some of the Dockerfile source install code resolved the issue. While I’m still troubleshooting, I can take it from here and have learned much in this process.

Thanks again for your time and have an excellent weekend!

Kind regards,

Chris

UPDATE: The corrupt data folder was due to either the sudo calls in my Dockerfile.template (which could cause some permissions issues?), or the 1GB+ of build files that were never cleaned up and got transferred as part of the image (at the very least, that made the install process painful). I’ve been developing for a weekend without issue. :smiley:
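For anyone landing here with the same symptoms, here is a rough sketch of a source build that avoids both suspects. The download URL, the /tmp/build directory, and the bare ./configure call are placeholders for illustration, not the actual steps from my template. The idea is to leave out sudo (build steps already run as root unless a USER instruction says otherwise) and to delete the source tree in the same RUN that created it, since files removed in a later RUN still take up space in the earlier layer.

```dockerfile
FROM resin/%%RESIN_MACHINE_NAME%%-node:7

ENV INITSYSTEM on

# Placeholder URL: substitute the real source tarball for the library being built.
# Fetch, build, install, and remove the build tree in ONE layer, with no sudo.
RUN apt-get -q update && \
    apt-get install -yq --no-install-recommends build-essential curl ca-certificates && \
    mkdir -p /tmp/build && cd /tmp/build && \
    curl -fsSL https://example.com/source.tar.gz | tar xz --strip-components=1 && \
    ./configure && \
    make -j"$(nproc)" && \
    make install && \
    cd / && rm -rf /tmp/build && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*

COPY . /usr/src/app

# will run when container starts up on the device
CMD ["echo", "hiya"]
```

With the cleanup in the same RUN, the intermediate build files never reach the image that the device downloads, which should also shorten the install.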