Image download fail/reload repeat

I am pushing new images from the CLI, and fairly often I get a download fail/reload cycle. The download reports 100% complete and then repeats maybe 3 or 4 times before completing successfully. The CLI command that I used to initiate the download finishes successfully long before the dashboard reports success.
This all may be correlated with changing the project that I load, but I am not sure.

This is not stopping me from making progress, but it is confusing. Am I doing something wrong?

30.04.21 21:47:36 (-0700) Failed to download image 'registry2.balena-cloud.com/v2/f189a714fcc2c5e9aa92c36e6bae951d@sha256:bfb7aef0b420146cac11bae549df66bbff3e26720d8f81c92d2e99226d66b8e0' due to 'error pulling image configuration: Get https://registry-data.balena-cloud.com/prod/docker/registry/v2/blobs/sha256/22/22d198854f7455ea822d5a395fa5f4191b9fb9c39bbeaada60cd2b1c7b24a94c/data?Expires=1619845626&Signature=YmT16F1QnOUs3o0xpHZTz38Izh8Y0AncIZb2d~IeiU6boutzHUBCORAafLZ5NX6y53tjixraQwuRE7hSGBDjNXEGeSDKH3ntUxEgfxlTAVojDL-n~2nIqLBn5VLAXMmXPolxwspCmgX1pqvzgrx~8SosxV5Pn663kVPC9oBpXOmJxnX8cBb5BGIorhJqE3pO1FSSbyShVd926JR~e73qCTG1vgGqOq4b16LlF-ONgeOpdwvW5ILBdOVoQhD52C0NsQ3T9mwf6Rl3bpO90DcMv~qxrxAasra5RxLUmsRuZLa7KuqIcMQF5j7Gehe1kZVFe~Umifzr7W2~pdDWm2IgjA__&Key-Pair-Id=APKAJRCZR26VRIDKA6WQ: dial tcp: i/o timeout'
30.04.21 21:47:45 (-0700) Downloading image 'registry2.balena-cloud.com/v2/f189a714fcc2c5e9aa92c36e6bae951d@sha256:bfb7aef0b420146cac11bae549df66bbff3e26720d8f81c92d2e99226d66b8e0'

A second, somewhat related question: is it normal to see remnants of previous containers during the update process? In the example below, balena_avrdude_rs485 is the old container, and influxdb etc. are containers from the multi-container balena-sense project.

Returning to my first question: the same download seemed to end in an incomplete state, i.e. there was an old container left from the previous project, and not all the containers for balena-sense had been loaded. After a reboot, I saw the following. Note that grafana and telegraf are now being loaded.

I waited 30 minutes, and it was still loading/failing/loading. I left it overnight, and in the morning everything was fine. All the balena-sense containers are loaded and the leftover container from the previous image is gone. However, when I look at the Grafana server, I have barely 1 hour of data, which would imply that the download took 6 hours! What’s happening?

Hello

Could you please provide more details about your setup?

  • Which device type are you using, and which balenaOS version?
  • Which SD card do you have?
  • How is the device connected to the internet: Wi-Fi, Ethernet or cellular?
  • Can you run the Device Health Checks and Device Diagnostics from the Diagnostics (Experimental) tab? We would be grateful if you could provide the diagnostics output.

As for your question:

"Is it normal to see remnants of previous containers during the update process?"

I believe this is the intended behaviour if you are using the download-then-kill update strategy, which is the default.

Thanks for providing the above info. This will help us investigate the failed downloads and also why the update apparently takes so long.

[balenaOS 2.73.1+rev1]
Sandisk Ultra 32GB UHS-I class 10
Connected by WiFi.
Two failed diagnostics checks, everything else passed.


Hi

Can you tell me a little about the network that you are on? I am wondering whether it has some kind of custom DNS setup. The reason I ask is the errors we see in the diagnostics.

We have some network requirements for devices, so that the connection to balena’s cloud is smooth. You can find more about those in our relevant docs.
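
If it helps narrow things down, here is a rough sketch (not an official balena tool) that checks DNS resolution and a TCP connection for the two hosts that appear in the log line earlier in this thread. The function name check_host, the result strings, and the use of port 443 are assumptions made for illustration. It would need to run on a machine or container on the same network as the device.

```python
import socket

# Hostnames copied from the "Failed to download image" log line in this thread.
# Port 443 is an assumption (the failing URL is https).
HOSTS = ("registry2.balena-cloud.com", "registry-data.balena-cloud.com")


def check_host(host, port=443, timeout=5.0,
               resolve=socket.getaddrinfo,
               connect=socket.create_connection):
    """Return 'ok', 'dns-failure' or 'connect-failure' for one host.

    resolve and connect are injectable so the logic can be exercised
    without a network; by default they are the standard socket calls.
    """
    try:
        resolve(host, port)
    except socket.gaierror:
        # The name could not be resolved: points at DNS on this network.
        return "dns-failure"
    try:
        conn = connect((host, port), timeout=timeout)
    except OSError:
        # Covers timeouts and refused connections.
        return "connect-failure"
    conn.close()
    return "ok"


if __name__ == "__main__":
    for host in HOSTS:
        print(host, check_host(host))
```

A "dns-failure" result would point towards the DNS setup on the network, while a "connect-failure" would be closer to the "dial tcp: i/o timeout" in the log. Since the failures are intermittent, it is probably worth running this several times.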