Deploying an app times out most of the time

Hi,

I’m building an app with multiple services, all of which are prebuilt (in the docker-compose file I point to tagged images on the Docker Hub registry). When I want to deploy the app, the builder tries to fetch the images but it fails most of the time:

This is pretty hard to work with. When it works it does not take too long (~2 minutes), but when it fails it takes longer and we have to deploy multiple times to get it to work, so that’s really inconvenient. Is there anything we can do to improve that?

We were initially building the images at deploy time but it took too long (20+ minutes), which is why we switched to prebuilt images. It’s not really better, though, since we have to monitor whether the deploy succeeds and restart it when it doesn’t.

Thank you!

How often do you see this happening? When did it happen for the first time?

Looking at the release log I see 30 failures in the last 40 attempts. It started happening last week, once I switched from building the images at deploy time to using prebuilt images.
So I don’t think that’s a new issue, just one we were avoiding because of the different configuration.

Just out of curiosity, are you able to attempt a build using the balena push methodology instead of git push? I am just curious if the results are the same.

@dtischler I just tried that a few times; it failed with the same error 3 out of 5 times.

I don’t really understand why it would fail after having succeeded, though. The target images should already be in cache, maybe even in balenaCloud’s own registry. From my point of view this should almost be a no-op. Ideally even the release number should be the same, since virtually nothing changed (I’m pushing the same commit with a docker-compose file pointing to images already built and already deployed through balenaCloud).

Hi Mathieu,

A couple of things to consider: If you’re using a “free” Docker account, they’ve instituted download limits, though last time I checked it was about 1,000 pulls every 6 hours (hard to overwhelm for routine builds). You might also double-check the Docker image path and/or ensure that the repo is public (and doesn’t require authorization). The fact that all images fail could suggest a networking error (or intermittent outages). From the device’s HostOS terminal, are you able to ping remote addresses and get consistent results?

John

Also, this Forum thread might offer some insight: Deploy a pre-built image into balena using local mode

John

Hi @jtonello, thx for the feedback :slight_smile:

I’m pretty sure I’m not hitting the Docker download limits. In fact, doing a balena push immediately tells me:

[Info] Everything is up to date (use --build to force a rebuild)

and it continues to build the release (quite quickly) and then pushes it back (to the balena registry? to Docker Hub? I’m not sure).

Unfortunately, our internet connection is very limited, and even just a few hundred MB takes a very long time. Does it push the whole images each time even if they are the same?

I’ve tried with a balena push too and it also timed out:

I’m not sure where the timeout is happening here. Is it on one of the balenaCloud servers?


I was planning to configure our GitLab CI server to handle deployment. Maybe it’ll work better with its faster internet connection.

These errors are happening in the builder. Once the build completes, deltas will be generated and sent to all devices unless deltas are manually disabled (full images are sent in this case). I’ll ping the builder maintainer and we’ll investigate what might be causing this. In the meantime could you post your docker-compose.yml?

Just to confirm, building locally using balena build ... works fine?

Also, a note on the cloud builders. They are shared and on occasion (some days more often than not), the builders may be running concurrent heavy builds, which could block and cause these timeouts. If the behaviour you are experiencing is completely (seemingly?) random, then it’s probably related to this.

We are looking for next generation build infrastructure at the moment, where the builds will run in isolation, but I have no timelines I can report on at this stage.

If your internet connection is limited, balena|git push ... flows will probably not be much use, unless your resulting images are of manageable size.

In the meantime, you could run your builds in a CI pipeline and wrap them in an exponential back-off retry function, so that a failed build is retried automatically after an increasing delay.
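A minimal sketch of what such a retry wrapper could look like, in Python. The function names, the default delays and the 10% jitter are illustrative assumptions rather than anything balena provides; the only thing taken from this thread is that the wrapped command is the same balena push you would otherwise run by hand.

```python
import random
import subprocess
import sys
import time


def backoff_delays(attempts, base=30.0, cap=600.0):
    """Waits (in seconds) before each retry: base, 2*base, 4*base, ... capped at `cap`."""
    return [min(cap, base * 2 ** i) for i in range(attempts - 1)]


def run_with_backoff(command, attempts=5, base=30.0, cap=600.0,
                     run=subprocess.run, sleep=time.sleep):
    """Run `command` until it exits with status 0 or `attempts` is exhausted.

    Returns the number of the attempt that succeeded; raises RuntimeError
    if every attempt fails. `run` and `sleep` are injectable for testing.
    """
    delays = backoff_delays(attempts, base, cap)
    for attempt in range(1, attempts + 1):
        result = run(command)
        if result.returncode == 0:
            return attempt
        if attempt < attempts:
            # Up to 10% random jitter so parallel CI jobs do not retry in lockstep.
            delay = delays[attempt - 1] * (1 + random.uniform(0, 0.1))
            print(f"attempt {attempt} failed (exit {result.returncode}); "
                  f"retrying in {delay:.0f}s", file=sys.stderr)
            sleep(delay)
    raise RuntimeError(f"{command!r} failed after {attempts} attempts")
```

In a CI job this would be called as run_with_backoff(["balena", "push", "myFleet"]), where "myFleet" is a placeholder for your own fleet name. With the defaults it waits roughly 30 s, 60 s, 120 s and 240 s between the five attempts.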

Just to confirm, building locally using balena build ... works fine?

yes it works fine

We are looking for next generation build infrastructure at the moment, where the builds will run in isolation, but I have no timelines I can report on at this stage.

:+1:

If your internet connection is limited, balena|git push ... flows will probably not be much use, unless your resulting images are of manageable size.

Why would git push not help? It only pushes the docker-compose.yml from my network; the rest is done on balenaCloud. Or did I misunderstand how these commands work?

Anyway, using balena push from GitLab CI on a server with a fast internet connection works reliably and in a manageable amount of time (3 to 5 minutes).

I still think it should be faster when I’m pushing the exact same images, but it’s good enough to be usable. Maybe I’ll open another thread to discuss how things could be improved.


Thank you all!

Hey there
We would be grateful if you could provide further feedback or suggestions, as you mentioned :slight_smile: