Device failing to download images - Error processing tar file(exit status 1): unexpected EOF

Hey team,
I’ve been banging my head against this issue today. I just set up an openBalena instance (v0.1.2) according to the quickstart guide. I can log in from my local machine and create applications successfully. I can also successfully deploy development images to the new application.

Upon spinning up a Raspberry Pi 3 with the application image flashed, I see the device gets registered but shows as offline. Looking at the device logs, I see a repeat of the following message:

Failed to download image 'registry.mydomain.io/v2/hash@sha256:hash' due to 'failed to register layer: Error processing tar file(exit status 1): unexpected EOF'

A quick Google search seems to indicate that the above message relates to Docker permissions. AFAIK, I’ve set up permissions according to the quickstart guide on client, server, and RPi.

I also see a smattering of this message in the logs:

Failed to download image 'registry.mydomain.io/v2/hash@sha256:hash' due to '(HTTP code 500) server error - Get https://registry.mydomain.io/v2/v2/hash/manifests/sha256:hash: received unexpected HTTP status: 503 Service Unavailable '

and this one:

Failed to download image 'registry.mydomain.io/v2/hash@sha256:hash' due to 'could not get decompression stream: Get https://registry.mydomain.io/v2/v2/hash/blobs/sha256:hash: read tcp x.x.x.x:33332->x.x.x.x:443: read: connection reset by peer'

curl’ing registry.mydomain.io gives me an empty 200 response.
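
An empty 200 from the root URL mostly shows that the proxy is answering, not that the registry behind it is healthy. A more telling check is the registry's `/v2/` endpoint: in the Docker Registry HTTP API V2, it should answer 200 or 401 and normally carries a `Docker-Distribution-API-Version` header. Below is a minimal sketch of such a probe. The function names are made up for illustration, and it assumes the openBalena registry follows the standard V2 behaviour here.

```python
import urllib.error
import urllib.request


def classify_registry_response(status, headers):
    """Interpret the status code and headers returned by GET <registry>/v2/."""
    version = headers.get("Docker-Distribution-API-Version", "")
    # 401 is expected when token auth is enabled; it still means the API is up.
    if status in (200, 401) and version.startswith("registry/2"):
        return "registry API reachable"
    if status in (502, 503, 504):
        return "proxy up, registry backend unavailable"
    return "unexpected response"


def probe(base_url, timeout=10):
    """Request <base_url>/v2/ and classify the outcome (hypothetical helper)."""
    url = base_url.rstrip("/") + "/v2/"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return classify_registry_response(resp.status, resp.headers)
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses arrive as exceptions but still carry headers.
        return classify_registry_response(err.code, err.headers)
    except OSError as err:
        # Covers DNS failures, refused connections, and connection resets.
        return f"connection failed: {err}"


# Example (needs network access to the instance):
# print(probe("https://registry.mydomain.io"))
```

Running the probe from both the server's network and the device's network would help separate a registry problem from a network problem.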

Looking at the haproxy container logs, I see a series of messages showing that all the front and backends have been stopped, ending with

[WARNING] 354/223058 (1) : Server backend_api/resin_api_1 is DOWN, reason: Layer4 connection problem, info: "Connection refused", check duration: 0ms. 0 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
[ALERT] 354/223058 (1) : backend 'backend_api' has no server available!
[WARNING] 354/223102 (1) : Server backend_api/resin_api_1 is UP, reason: Layer4 check passed, check duration: 0ms. 1 active and 0 backup servers online. 0 sessions requeued, 0 total in queue.

I haven’t made any changes to the haproxy config provided in the open-balena repo, though it seems from the logs that haproxy might be a culprit? It also appears that my issue is separate from the other issue about devices not being online, though I could be wrong.
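
To judge whether haproxy is really at fault, it can help to turn the health-check lines into a list of downtime windows per server. The sketch below parses lines shaped like the ones quoted above and pairs each DOWN with the next UP for the same server. The regular expression is fitted to those sample lines only, and the helper names are invented for illustration. The `354/223058` prefix appears to be day-of-year followed by HHMMSS, which would make the outage above about four seconds long.

```python
import re
from typing import NamedTuple, Optional


class ServerEvent(NamedTuple):
    timestamp: str
    backend: str
    server: str
    state: str
    reason: str


# Matches e.g. "[WARNING] 354/223058 (1) : Server backend_api/resin_api_1 is DOWN, reason: ..."
_EVENT_RE = re.compile(
    r"\[(?:WARNING|ALERT|NOTICE)\]\s+(?P<ts>\d+/\d+)\s+\(\d+\)\s*:\s*"
    r"Server (?P<backend>[^/\s]+)/(?P<server>\S+) is (?P<state>UP|DOWN)"
    r"(?:, reason: (?P<reason>[^,]+))?"
)


def parse_server_event(line: str) -> Optional[ServerEvent]:
    """Return a ServerEvent for a server UP/DOWN line, or None for any other line."""
    m = _EVENT_RE.search(line)
    if m is None:
        return None
    return ServerEvent(
        m["ts"], m["backend"], m["server"], m["state"], (m["reason"] or "").strip()
    )


def downtime_windows(lines):
    """Pair each DOWN event with the next UP event for the same backend/server."""
    down_since = {}
    windows = []
    for line in lines:
        event = parse_server_event(line)
        if event is None:
            continue
        key = (event.backend, event.server)
        if event.state == "DOWN":
            # Keep the first DOWN timestamp if several arrive in a row.
            down_since.setdefault(key, event.timestamp)
        elif key in down_since:
            windows.append((event.backend, event.server, down_since.pop(key), event.timestamp))
    return windows
```

Short windows that line up with a container restart would point at the API service starting slowly rather than at the haproxy config itself.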

Is this a networking issue on my end? I, unfortunately, don’t have control over the network the RPi is on :grimacing:

Thanks for the great work! We’ve been looking forward to switching over to openBalena since the announcement :slight_smile:

Hello, can you please inspect the server logs for any hints? It looks to me like some of the backend containers aren’t running as they should. The command you need is ./scripts/compose exec -it SERVICE_NAME journalctl -fn1000. You’ll need to run this command once for each service, replacing SERVICE_NAME with api/registry/vpn/etc.
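
Since the journal lives inside each container, it may be worth saving a copy of the logs to disk before restarting anything. Here is a rough sketch that runs a variant of the command above once per service and writes the output to a file. Two things differ from the original command and are my assumptions: `-T` replaces `-it` so the output can be captured without a TTY, and `--no-pager -n` replaces `-fn1000` so each call exits instead of following the log. It also assumes `./scripts/compose` forwards its arguments to docker-compose unchanged. Only the services named above are listed, so add any others yourself.

```python
import subprocess
from pathlib import Path

# Services named in the reply above; extend with the rest of your stack.
SERVICES = ["api", "registry", "vpn"]


def build_log_command(service, lines=1000):
    """Build a non-interactive variant of the journalctl command for one service."""
    return [
        "./scripts/compose", "exec", "-T", service,
        "journalctl", "--no-pager", "-n", str(lines),
    ]


def dump_logs(services, out_dir="service-logs", runner=subprocess.run):
    """Run the log command for each service and save its output to <out_dir>/<service>.log."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for service in services:
        result = runner(build_log_command(service), capture_output=True, text=True)
        path = out / f"{service}.log"
        # Keep stderr too, so a failed exec is visible in the saved file.
        path.write_text(result.stdout + result.stderr)
        written.append(path)
    return written


# Example (run from the open-balena checkout on the server):
# dump_logs(SERVICES)
```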

Thanks for the reply, @dfunckt. Unfortunately, I restarted the containers before I read your message and it looks like the logs disappeared. Good news is that openBalena seems to be working! Just successfully deployed to a container. Bad news is that I can’t reproduce the errors I had before.

My hunch is that it was a networking issue. The only thing that’s changed since I posted this is the network on which the device is running. I’m investigating why the previous network would have caused the aforementioned issues. I’ll post if I find out anything.

Thanks again for all the great work!