How to stop infinite download loop

Raspberry Pi
OS 2.3.2

I have a device that was in an app with 3 services. I moved it to a new app with 2 services. It got stuck in an infinite download loop, restarting after 40-50% was downloaded, so I moved it back. Now the third service which was eliminated in the new app is stuck in an infinite download loop. No messages. Just gets to 85 or 90% and restarts. I can sometimes log into host but journalctl does not show anything recent - last messages were over 2 hrs old.

This device is in a remote location with a prepaid data plan and it’s chewing up the data. Is there a way to abort the download and just keep that one container not running? The device can limp along with the other 2 containers until balena support is available to help figure out what’s corrupted.

Thanks

You can block the update by pinning the device onto the release it is now running. That will prevent it from trying to download the new release. When things are sorted (I saw you’ve reached out separately to support) you can unpin the release and let it download.

Can you elaborate? I moved from application 1 with 3 services (A, B, C) to application 2 with 2 services (A, B). When that failed because it was stuck in an infinite download-and-fail loop, I moved back to application 1, and then it got stuck in a loop downloading and failing on service C (services A and B were OK, I guess because they never managed to update). After hours of that download/fail cycle, it finally managed to download container C.

But there was nothing to pin to.

Hi,

To pin a device to a release, enter the Application --> Device menu. To the right of the device name are the Reboot and Restart buttons, then the light bulb icon, then a menu. Choose “Pin to release” from that menu. Click “Pin to a specific release” and select the release you want to lock the device to.

John

Thanks @jtonello. I’m sorry if I was unclear - I know how to pin a release. In the use case I presented, I don’t know how it applies.

Hi,

Sorry about that! Pinning will revert to a previous version of your app, which essentially ejects the current one and replaces it with the previous one. That is, it will replace all the containers in the tracking version with those from the previous version you choose. You’re correct, though. If the problem occurred during your initial deployment there wouldn’t be any previous releases to pin to.

Since you’re working with a PiZero, initial downloads can take some time and those delays can appear as failures. If possible, you can try your deployment on, say, an RPi3 to confirm the application is working and then push to your PiZeros. Still, depending on the size of your containers and the quality of your wifi, initial image deployments to the PiZero can be much slower than to other devices.

John

Yes, thanks. I have tested in the lab and it went without issues. It was in the field that the problems showed up. I’m thinking that deploying to Pi Zeros in challenging connectivity environments may be a risky business.

Hi,

Yes, they can be hard to diagnose when we’re used to more capable devices. Fortunately, with balena’s built-in delta capabilities, updates are generally significantly smaller than initial builds and take less bandwidth.

John

On a related note:

I tried to update a device from a 3 service deployment with (example) services alpha, beta, gamma (ID 1234) to one with a single service alpha (ID 5678). In a successful update, I would see the current release matching the target release and a single service alpha running with ID 5678. However, the update keeps failing. In the meantime, I see a current release not matching target release, as expected, but I only see the single service alpha with the old ID 1234.

It seems to me that ideally the system should maintain its current state until it has successfully downloaded the new images, at which point it can terminate the old containers and launch the new ones. The current model puts us in the precarious position of having a nonfunctional system while the system is updating, which is unfortunately not a reliable process over 3G connections.

Yes, this happens because the Supervisor considers each service separately. That is, it will download the new image, kill the previous container and then start the new one. The other two services don’t exist on the target release, so there’s nothing to download and they’re just killed immediately. Ideally, the Supervisor would first complete the download of all services, and only then proceed to kill and spawn. This is a known issue we’d like to improve: https://github.com/balena-io/balena-supervisor/issues/1103
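
To make the difference concrete, here is a rough Python sketch of the two behaviours. It is a toy model only, not the Supervisor's actual code: `Device`, `download`, `per_service_update`, `all_or_nothing_update` and the `link_ok` flag are all hypothetical names made up for illustration, and the service names and IDs are the example ones from the question above.

```python
from dataclasses import dataclass, field


@dataclass
class Device:
    # Maps service name -> release ID of the container currently running.
    running: dict = field(default_factory=dict)
    # (service, release) pairs whose images are already on the device.
    images: set = field(default_factory=set)


def download(device, service, release, link_ok):
    """Pretend to pull an image; returns False when the link drops (assumption)."""
    if link_ok:
        device.images.add((service, release))
    return link_ok


def per_service_update(device, target, release, link_ok):
    """Toy version of the per-service behaviour described above."""
    # Services absent from the target have nothing to download,
    # so they are stopped straight away.
    for service in list(device.running):
        if service not in target:
            del device.running[service]
    # Remaining services: download, then replace the old container.
    for service in target:
        if download(device, service, release, link_ok):
            device.running[service] = release


def all_or_nothing_update(device, target, release, link_ok):
    """Hypothetical alternative: change nothing until every image is local."""
    for service in target:
        if not download(device, service, release, link_ok):
            return  # keep the old containers running and retry later
    device.running = {service: release for service in target}


old_state = {"alpha": 1234, "beta": 1234, "gamma": 1234}

a = Device(running=dict(old_state))
per_service_update(a, ["alpha"], 5678, link_ok=False)
print(a.running)  # {'alpha': 1234}

b = Device(running=dict(old_state))
all_or_nothing_update(b, ["alpha"], 5678, link_ok=False)
print(b.running)  # {'alpha': 1234, 'beta': 1234, 'gamma': 1234}
```

With a failing download, the per-service model ends up exactly where the question describes: only alpha is left, still on the old ID 1234. The all-or-nothing model keeps all three old containers running until the download finally succeeds.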

Might be a silly answer, but did you remove the SD card with the Etched application on it?

I left mine in and it kept reinstalling every time the BBB booted up.

@theeruditefrog the BeagleBone Black has internal storage. So, as indicated in the instructions when you download the BBB image from the dashboard, you need to insert the SD card, boot the BBB so it flashes the internal storage, wait until it shuts down, remove the SD card and boot it again.
See https://www.balena.io/docs/learn/getting-started/beaglebone-black/nodejs/#provision-device

This does not apply to Raspberry Pi devices, which have no internal storage and run directly from the SD card.

That would explain it. Thanks. Please ignore the noob!