Repeated “Failed to download image” errors when updating device after pushing balenaSound to application

The device started updating after I pushed the balenaSound project to the application, but now I get the following failures:

25.02.20 20:40:35 (+0000) Applying configuration change {“SUPERVISOR_POLL_INTERVAL”:“900000”,“SUPERVISOR_DELTA_VERSION”:“3”}
25.02.20 20:40:39 (+0000) Applied configuration change {“SUPERVISOR_POLL_INTERVAL”:“900000”,“SUPERVISOR_DELTA_VERSION”:“3”}
25.02.20 20:40:56 (+0000) Creating volume ‘snapcast’
25.02.20 20:40:58 (+0000) Creating volume ‘bluetoothcache’
25.02.20 20:40:58 (+0000) Creating volume ‘spotifycache’
25.02.20 20:40:59 (+0000) Creating network ‘default’
25.02.20 20:42:11 (+0000) Downloading image ‘registry2.balena-cloud.com/v2/8fada592992e74de88bb1d14f8db58b2@sha256:7c0cf499392f2d76d71a31eb2aa2af1e9efaf7a9679c59d306d3841ad95c9c23
25.02.20 20:42:16 (+0000) Downloading image ‘registry2.balena-cloud.com/v2/9447858fe5667cd260b1f747cb27ae6c@sha256:11202ec1313ccf108efb6b8b6fd10e8406fe16d621332218a4474bd4fcd6cb47
25.02.20 20:42:21 (+0000) Downloading image ‘registry2.balena-cloud.com/v2/78e18301e85faa26afe3069bde1955d2@sha256:4e3d5117bd1504596fd72b342fd29c766b092eac6af8284346ef1c12f4d787a7
25.02.20 20:42:21 (+0000) Downloading image ‘registry2.balena-cloud.com/v2/db7a068fbc3fce8146f6b249f556d263@sha256:34145d9efb66eb4f205efa0bd31eb7526d81c5b8c7d17de3f0aeb8f75ca0e180
25.02.20 20:42:22 (+0000) Downloading image ‘registry2.balena-cloud.com/v2/9ede1621f0d59f0535eb2bf7071dd7e2@sha256:c0ba72cea3eaa23b56318aecedf1d5ab4d1367ac0930526dc24dcc23e762e3e7
25.02.20 20:42:22 (+0000) Downloading image ‘registry2.balena-cloud.com/v2/a679dbd65a9020c9a68901ed4d9de9eb@sha256:f4405dac276263e0a7ccaa2baa435063744dc6e2c7e956d4bf0924a9fcba8672
25.02.20 20:42:30 (+0000) Failed to download image ‘registry2.balena-cloud.com/v2/8fada592992e74de88bb1d14f8db58b2@sha256:7c0cf499392f2d76d71a31eb2aa2af1e9efaf7a9679c59d306d3841ad95c9c23’ due to '(HTTP code 500) server error - Get https://registry2.balena-cloud.com/v2/: net/http: TLS handshake timeout ’
25.02.20 20:42:36 (+0000) Failed to download image ‘registry2.balena-cloud.com/v2/9ede1621f0d59f0535eb2bf7071dd7e2@sha256:c0ba72cea3eaa23b56318aecedf1d5ab4d1367ac0930526dc24dcc23e762e3e7’ due to '(HTTP code 500) server error - Get https://registry2.balena-cloud.com/v2/: net/http: TLS handshake timeout ’
25.02.20 20:42:36 (+0000) Failed to download image ‘registry2.balena-cloud.com/v2/78e18301e85faa26afe3069bde1955d2@sha256:4e3d5117bd1504596fd72b342fd29c766b092eac6af8284346ef1c12f4d787a7’ due to '(HTTP code 500) server error - Get https://registry2.balena-cloud.com/v2/: net/http: TLS handshake timeout ’
25.02.20 20:42:36 (+0000) Failed to download image ‘registry2.balena-cloud.com/v2/db7a068fbc3fce8146f6b249f556d263@sha256:34145d9efb66eb4f205efa0bd31eb7526d81c5b8c7d17de3f0aeb8f75ca0e180’ due to '(HTTP code 500) server error - Get https://registry2.balena-cloud.com/v2/: net/http: TLS handshake timeout ’
25.02.20 20:42:36 (+0000) Failed to download image ‘registry2.balena-cloud.com/v2/a679dbd65a9020c9a68901ed4d9de9eb@sha256:f4405dac276263e0a7ccaa2baa435063744dc6e2c7e956d4bf0924a9fcba8672’ due to '(HTTP code 500) server error - Get https://registry2.balena-cloud.com/v2/: net/http: TLS handshake timeout ’
25.02.20 20:42:36 (+0000) Failed to download image ‘registry2.balena-cloud.com/v2/9447858fe5667cd260b1f747cb27ae6c@sha256:11202ec1313ccf108efb6b8b6fd10e8406fe16d621332218a4474bd4fcd6cb47’ due to '(HTTP code 500) server error - Get https://registry2.balena-cloud.com/v2/: net/http: TLS handshake timeout ’

Hi @pigcry,
Looking at the status page (https://status.balena.io/), we did have an API outage today, but the registry did not have any issues.
Otherwise, a TLS handshake timeout mostly points to bandwidth-related problems, so it would be good to make sure your device can reach the registry with sufficient speed.
Please also provide the device type and balenaOS version used.
Regards, Thomas

Still seeing various errors when trying the download; the latest is:

Failed to download image ‘registry2.balena-cloud.com/v2/db7a068fbc3fce8146f6b249f556d263@sha256:34145d9efb66eb4f205efa0bd31eb7526d81c5b8c7d17de3f0aeb8f75ca0e180’ due to '(HTTP code 404) no such image - no such image: registry2.balena-cloud.com/v2/db7a068fbc3fce8146f6b249f556d263@sha256:34145d9efb66eb4f205efa0bd31eb7526d81c5b8c7d17de3f0aeb8f75ca0e180: No such image: registry2.balena-cloud.com/v2/db7a068fbc3fce8146f6b249f556d263@sha256:34145d9efb66eb4f205efa0bd31eb7526d81c5b8c7d17de3f0aeb8f75ca0e180

I have flashed a faster class 10 SD card in case it was the write speed, and the device has a decent ADSL connection with 8-10Mb down speed. It got closer this time: 13% through the download and 35% through one of the containers, but then it hit the above errors. Version info is:

HOST OS VERSION - balenaOS 2.46.1+rev1
SUPERVISOR VERSION - 10.6.27

How large are the six balenaSound container downloads? I did a test 12M download in less than 20 seconds as a verification of network speed.

Hi. If you enable support access and share the UUID of the device, we can take a look.

Could you also share the application UUID?

Hi there, access is shared. The UUID is 675e98f2db1149d5009fd4a9efbc4cda

Hey, to update you on this: you can check the image sizes via the “Releases” tab on your application dashboard, but currently it’s 6 images averaging around 200MB each. As for the issue itself, it looks to me like downloading so many large images at once is overloading your Raspberry Pi, as it’s an old model and doesn’t have much RAM. The faster SD card might help, as it allows writing to disk faster and may alleviate some of the load.
Outside of that, as a workaround you can try removing a few services from the docker-compose.yml and pushing, letting those download, then adding some more and letting them download, and so on, in order to reduce the load. I’ve also raised with the supervisor team the idea of a max concurrent downloads config var to help with cases like this, where a device or network is unable to download so many images in parallel.
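
For a rough sense of scale, the figures quoted in this thread (six images of about 200MB each, and a 12M test download that took just under 20 seconds) can be turned into a back-of-the-envelope estimate. This is only a sketch: it assumes “12M” means 12MB, that the measured speed is sustained for the whole download, and that there is no registry, TLS, or disk-write overhead. The variable names are illustrative.

```shell
#!/bin/sh
# Rough estimate only. Assumes the test speed is sustained and ignores
# protocol and disk-write overhead; "12M" is assumed to mean 12 MB.
images=6          # number of service images (from the thread)
avg_mb=200        # approximate average image size in MB (from the thread)
test_mb=12        # size of the test download in MB (assumed unit)
test_seconds=20   # time the test download took (upper bound)

total_mb=$((images * avg_mb))
# seconds = total_mb / (test_mb / test_seconds), kept in integer arithmetic
est_seconds=$((total_mb * test_seconds / test_mb))
est_minutes=$((est_seconds / 60))

echo "Total to download: ${total_mb} MB"
echo "Estimated time at measured speed: about ${est_minutes} minutes"
```

At the measured rate the full set would take on the order of half an hour, which suggests the total size alone may not be the problem and fits the idea that six parallel pulls are what strains the device or the link.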

Thanks for that, I’ll give that a spin.

Thanks, that worked. Commenting out all but one service at a time in the docker-compose.yml did the trick. Thanks for your help.

@pigcry

Where do we need to modify docker-compose.yml? I see one at the location below:

(base) C02LX7P3FD57:balena-sound-master alsharm$ pwd
/Users/alsharm/Downloads/Balena/balena-sound-master
(base) C02LX7P3FD57:balena-sound-master alsharm$
(base) C02LX7P3FD57:balena-sound-master alsharm$ ls -l docker-compose.yml
-rwxr-xr-x@ 1 alsharm ABCD Users 1447 Aug 24 12:09 docker-compose.yml

But I think it’s not correct because it’s on my local machine and not on the server.
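
As far as I can tell from the earlier posts, the docker-compose.yml being edited is the local copy in the project folder: the original poster pushed the project from their own machine, so changes reach the device when the project is pushed again. A minimal sketch of the staged workaround follows. It assumes the balena CLI is installed and logged in; PROJECT_DIR and APP_NAME are placeholders, and push_stage is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch of the staged-push workaround described earlier in the thread.
# Assumptions (not from the thread): the balena CLI is installed and
# logged in; PROJECT_DIR and APP_NAME are placeholders to replace.
PROJECT_DIR="${PROJECT_DIR:-$HOME/balena-sound-master}"   # hypothetical path
APP_NAME="${APP_NAME:-myApp}"                             # hypothetical name

push_stage() {
    # Call this after editing the local docker-compose.yml so that only
    # the services for the current stage are left uncommented.
    cd "$PROJECT_DIR" || return 1
    balena push "$APP_NAME"
}

# Typical loop, done by hand:
#   1. Comment out all but one or two services in docker-compose.yml.
#   2. Run push_stage, then wait for the device to finish downloading.
#   3. Uncomment the next service(s) and repeat until all are enabled.
```

Nothing runs until push_stage is called. If a commented-out service is referenced elsewhere in the compose file, that reference may need to be commented out for the same stage.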

Is there a way to change the behavior of the supervisor so that it downloads images sequentially, to alleviate this in slow network environments?

At the moment it appears to try to download them all concurrently.

Can we bump this? I’m experiencing this as well and it prohibits us from updating our devices properly. I understand the “comment out a service at a time” solution, but that’s not viable in a production environment.

You can try modifying the balenad service on your device to limit the concurrent downloads (see systemctl cat balena).

Appending --max-concurrent-downloads 1 to BALENAD_EXTRA_ARGS should be enough, followed by a service restart. The root filesystem on the device will need to be remounted read/write in order to do this, and it may lead to unexpected side effects.

A better solution would be to move the device to a faster network.

I’ve tried this, but it did not have any effect

--max-concurrent-downloads 1 ← this caused the service to crash

--max-concurrent-downloads=1 ← this did not
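
Putting the suggestion and the reports above together, the steps on the device might look roughly like the sketch below. It is not a verified procedure: the function names are illustrative, the file that defines BALENAD_EXTRA_ARGS has to be located from the output of systemctl cat balena, and one report in this thread says the change had no effect.

```shell
#!/bin/sh
# Sketch only: run on the device's host OS, not inside a container.
# Function names are illustrative; nothing runs until you call them.

inspect_engine_unit() {
    # Print the unit file(s) for the engine, including their paths,
    # so you can find where BALENAD_EXTRA_ARGS is defined.
    systemctl cat balena
}

make_rootfs_writable() {
    # The thread notes the root filesystem must be remounted read/write
    # before the unit can be edited.
    mount -o remount,rw /
}

# Manual step between the two halves: edit the file reported by
# inspect_engine_unit and append --max-concurrent-downloads=1 to
# BALENAD_EXTRA_ARGS (the "=" form is the one reported not to crash).

restart_engine() {
    # Reload unit definitions and restart the engine to pick up the flag.
    # This is likely to interrupt the containers the engine manages.
    systemctl daemon-reload
    systemctl restart balena
}
```

The intended order is inspect_engine_unit, make_rootfs_writable, the manual edit, then restart_engine. As noted above, modifying the host OS this way may have unexpected side effects.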

Has there been any development on this?

Hi,
Just checking to see if you have managed to sort this out or if you’d like some more help.

Ramiro