Problems with PiHole

Since yesterday my PiHole hasn’t been working. After checking the logs it turns out that the dnscrypt-proxy service exited, and I cannot start it through the dashboard. I also get “Request error: tunneling socket could not be established, cause=socket hang up” when trying to reboot the device from the dashboard.

Hi, that second error usually indicates the device is offline. Is the device online and connected to the VPN (showing as Online in the Balena Dashboard)?

Yes. I can even SSH into the host OS through the dashboard. The LED blink option works too. Another problem is that the log hasn’t shown anything since dnscrypt exited yesterday.

If you want to post the supervisor logs (journalctl -b -au resin-supervisor), we might be able to work out what has happened, but just restarting the supervisor (systemctl restart resin-supervisor) should get you back on track.
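For reference, both are run from a host OS shell:

```
# show this boot's supervisor logs (-b: current boot, -a: all fields, -u: the unit)
journalctl -b -au resin-supervisor

# restart the supervisor service
systemctl restart resin-supervisor
```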

The supervisor might be the problem. Its journal is full of repeated errors:

```
Sep 10 19:15:58 5923b8f systemd[1]: resin-supervisor.service: Service hold-off time over, scheduling restart.
Sep 10 19:15:58 5923b8f systemd[1]: resin-supervisor.service: Scheduled restart job, restart counter is at 342.
Sep 10 19:15:58 5923b8f systemd[1]: Stopped Resin supervisor.
Sep 10 19:15:59 5923b8f systemd[1]: Starting Resin supervisor...
Sep 10 19:15:59 5923b8f resin-supervisor[25017]: Cannot connect to the balenaEngine daemon at unix:///var/run/balena-engine.sock. Is the balenaEngine daemon running?
Sep 10 19:15:59 5923b8f resin-supervisor[25023]: inactive
Sep 10 19:15:59 5923b8f systemd[1]: resin-supervisor.service: Control process exited, code=exited status=3
Sep 10 19:15:59 5923b8f systemd[1]: resin-supervisor.service: Failed with result 'exit-code'.
Sep 10 19:15:59 5923b8f systemd[1]: Failed to start Resin supervisor.
```

EDIT: Trying to start the balena-engine service returns a dependency error for balena.service.

Hey, if you run journalctl -b -au balena it should give some info as to why balena-engine is unable to start.
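Something like this should show the engine logs and the state of both units (the systemctl status line is an extra suggestion on my part, not strictly needed):

```
# engine (balenad) logs for the current boot
journalctl -b -au balena

# optional: check both units and see which dependency is failing
systemctl status balena.service balena-engine.service
```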

Hello. After containerd starts, there is a ‘layer does not exist’ error:

```
Sep 11 03:18:06 5923b8f balenad[12079]: time="2019-09-11T03:18:06Z" level=info msg="starting containerd" module=containerd revision= version=1.0.0+unknown
Sep 11 03:18:06 5923b8f balenad[12079]: time="2019-09-11T03:18:06Z" level=info msg="setting subreaper..." module=containerd
Sep 11 03:18:06 5923b8f balenad[12079]: time="2019-09-11T03:18:06Z" level=info msg="changing OOM score to -500" module=containerd
Sep 11 03:18:06 5923b8f balenad[12079]: time="2019-09-11T03:18:06Z" level=info msg="loading plugin \"io.containerd.content.v1.content\"..." module=containerd type=io.containerd.content.v1
Sep 11 03:18:06 5923b8f balenad[12079]: time="2019-09-11T03:18:06Z" level=info msg="loading plugin \"io.containerd.snapshotter.v1.overlayfs\"..." module=containerd type=io.containerd.snapshotter.v1
Sep 11 03:18:06 5923b8f balenad[12079]: time="2019-09-11T03:18:06Z" level=info msg="loading plugin \"io.containerd.metadata.v1.bolt\"..." module=containerd type=io.containerd.metadata.v1
Sep 11 03:18:06 5923b8f balenad[12079]: time="2019-09-11T03:18:06Z" level=info msg="loading plugin \"io.containerd.differ.v1.walking\"..." module=containerd type=io.containerd.differ.v1
Sep 11 03:18:06 5923b8f balenad[12079]: time="2019-09-11T03:18:06Z" level=info msg="loading plugin \"io.containerd.gc.v1.scheduler\"..." module=containerd type=io.containerd.gc.v1
Sep 11 03:18:06 5923b8f balenad[12079]: time="2019-09-11T03:18:06Z" level=info msg="loading plugin \"io.containerd.grpc.v1.containers\"..." module=containerd type=io.containerd.grpc.v1
Sep 11 03:18:06 5923b8f balenad[12079]: time="2019-09-11T03:18:06Z" level=info msg="loading plugin \"io.containerd.grpc.v1.content\"..." module=containerd type=io.containerd.grpc.v1
Sep 11 03:18:06 5923b8f balenad[12079]: time="2019-09-11T03:18:06Z" level=info msg="loading plugin \"io.containerd.grpc.v1.diff\"..." module=containerd type=io.containerd.grpc.v1
Sep 11 03:18:06 5923b8f balenad[12079]: time="2019-09-11T03:18:06Z" level=info msg="loading plugin \"io.containerd.grpc.v1.events\"..." module=containerd type=io.containerd.grpc.v1
Sep 11 03:18:06 5923b8f balenad[12079]: time="2019-09-11T03:18:06Z" level=info msg="loading plugin \"io.containerd.grpc.v1.healthcheck\"..." module=containerd type=io.containerd.grpc.v1
Sep 11 03:18:06 5923b8f balenad[12079]: time="2019-09-11T03:18:06Z" level=info msg="loading plugin \"io.containerd.grpc.v1.images\"..." module=containerd type=io.containerd.grpc.v1
Sep 11 03:18:06 5923b8f balenad[12079]: time="2019-09-11T03:18:06Z" level=info msg="loading plugin \"io.containerd.grpc.v1.leases\"..." module=containerd type=io.containerd.grpc.v1
Sep 11 03:18:06 5923b8f balenad[12079]: time="2019-09-11T03:18:06Z" level=info msg="loading plugin \"io.containerd.grpc.v1.namespaces\"..." module=containerd type=io.containerd.grpc.v1
Sep 11 03:18:06 5923b8f balenad[12079]: time="2019-09-11T03:18:06Z" level=info msg="loading plugin \"io.containerd.grpc.v1.snapshots\"..." module=containerd type=io.containerd.grpc.v1
Sep 11 03:18:06 5923b8f balenad[12079]: time="2019-09-11T03:18:06Z" level=info msg="loading plugin \"io.containerd.monitor.v1.cgroups\"..." module=containerd type=io.containerd.monitor.v1
Sep 11 03:18:06 5923b8f balenad[12079]: time="2019-09-11T03:18:06Z" level=info msg="loading plugin \"io.containerd.runtime.v1.linux\"..." module=containerd type=io.containerd.runtime.v1
Sep 11 03:18:06 5923b8f balenad[12079]: time="2019-09-11T03:18:06Z" level=info msg="loading plugin \"io.containerd.grpc.v1.tasks\"..." module=containerd type=io.containerd.grpc.v1
Sep 11 03:18:06 5923b8f balenad[12079]: time="2019-09-11T03:18:06Z" level=info msg="loading plugin \"io.containerd.grpc.v1.version\"..." module=containerd type=io.containerd.grpc.v1
Sep 11 03:18:06 5923b8f balenad[12079]: time="2019-09-11T03:18:06Z" level=info msg="loading plugin \"io.containerd.grpc.v1.introspection\"..." module=containerd type=io.containerd.grpc.v1
Sep 11 03:18:06 5923b8f balenad[12079]: time="2019-09-11T03:18:06Z" level=info msg=serving... address=/var/run/balena-engine/containerd/balena-engine-containerd-debug.sock module=containerd/debug
Sep 11 03:18:06 5923b8f balenad[12079]: time="2019-09-11T03:18:06Z" level=info msg=serving... address=/var/run/balena-engine/containerd/balena-engine-containerd.sock module=containerd/grpc
Sep 11 03:18:06 5923b8f balenad[12079]: time="2019-09-11T03:18:06Z" level=info msg="containerd successfully booted in 0.011012s" module=containerd
Sep 11 03:18:06 5923b8f balenad[12079]: Error starting daemon: layer does not exist
Sep 11 03:18:06 5923b8f systemd[1]: balena.service: Main process exited, code=exited, status=1/FAILURE
Sep 11 03:18:06 5923b8f systemd[1]: balena.service: Failed with result 'exit-code'.
```

It looks like the underlying filesystem might be corrupted. Could you run e2fsck -n /dev/mmcblk0p6 on the host OS to check for errors?
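For reference, the -n flag makes this a read-only check (it answers “no” to every repair prompt), so it won’t modify anything even though the partition is mounted:

```
# read-only filesystem check of the data partition
e2fsck -n /dev/mmcblk0p6
```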

The FS seems fine:

```
e2fsck 1.43.8 (1-Jan-2018)
Warning!  /dev/mmcblk0p6 is mounted.
Warning: skipping journal recovery because doing a read-only filesystem check.
resin-data: clean, 18650/901120 files, 217644/1783552 blocks
```

Hello, could you please enable support access for this device and provide us with its UUID?

Sure. The UUID is 5923b8f.

Thanks, I’m having a look.

The device appears to have some FS corruption; you can see it in dmesg:

EXT4-fs (mmcblk0p5): error count since last fsck: 170
EXT4-fs (mmcblk0p5): initial error at time 1551462805: ext4_find_dest_de:1808: inode 1723: block 759
EXT4-fs (mmcblk0p5): last error at time 1568117736: ext4_find_dest_de:1808: inode 1723: block 759

There are also a lot of under-voltage warnings there. These are most probably the cause of the FS corruption.
You should try running this device with another power supply.
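If you want to check for both yourself from the host OS, something along these lines should surface them (the exact kernel message wording can vary):

```
# ext4 errors reported by the kernel
dmesg | grep -i "EXT4-fs"

# Raspberry Pi under-voltage reports
dmesg | grep -i "voltage"
```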

I can try fixing this device, but I’ll need to nuke all containers and images. Do you want me to try that?
Please note that even if it fixes the device, you’ll probably get more FS corruption if you don’t change the power supply.

I’ve changed the PSU for the Pi. I cannot run fsck since openvpn is running and I can’t unmount the filesystem. About the nuking process: does that mean I will need to push the app manually, or will everything be downloaded automatically? Of course, if there’s no other way, it will have to be done.

It will be re-downloaded automatically.

You can’t fsck a mounted filesystem.
You can try removing the SD card and running fsck on it from another computer, though I’m not sure the errors are fixable.
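Roughly, from another Linux machine it would look like this; /dev/sdX6 is just a placeholder for whatever the resin-data partition shows up as, so check lsblk first:

```
# identify the SD card's data partition (ext4, labelled resin-data)
lsblk -f

# force a full check and auto-repair what is safe to repair
sudo e2fsck -f -p /dev/sdX6
```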

By nuking I meant rm -rf /var/lib/docker.
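In practice the sequence on the host OS would look roughly like this; stopping the supervisor and engine first is an assumption on my part:

```
# stop the supervisor and the engine so nothing is writing to the store
systemctl stop resin-supervisor balena

# remove the corrupted container/image store
rm -rf /var/lib/docker

# start everything again; the supervisor will re-download the app images
systemctl start balena resin-supervisor
```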

The ‘layer does not exist’ error persists, even after repairing the FS on my computer.

You’ll need to remove /var/lib/docker and re-download the images.
Do you want me to do it?

No, thanks.

After re-downloading the containers, everything seems to work again.