Error with cloud arm64 build - 404 no such image

It failed. I sent you a reply to my previous PM.

Thanks a lot. We are continuing to investigate the problem. Can you confirm that, as a workaround, you can successfully push the application using balena push with the --nocache option? i.e. balena push myApp --nocache

Hello. Sadly, I cannot confirm that: the build timed out after 12 hours.

[Error]           The build took too long and was canceled.
[Info]            Built on arm01
[Error]           Not deploying release.

No other error appears in the basic logs from detached mode. This has never happened before; the build usually takes a little more than 2 hours.

I ran the build again with a live session and the --nocache option. The build failed with the error “An error occured: (HTTP code 404) no such image - no such image: cbc116b399b6: No such image: cbc116b399b6:latest” at the “Uploading images” step.

Hey Valentin, there is currently an ongoing issue with the builder. Please go to https://status.balena.io/incidents/hszhtp712z59 and subscribe to get alerts right away when we make more progress on this.

I see there was a fix implemented. The build from earlier today went through without error, which is promising. I’ll try to push some updates over the next few days to see if the problem happens again.

Thank you for confirming it @vbersier :+1:
Let us know if you come across further issues.

The same build failed today with the same error (404 no such image) during the “Uploading images” stage at the very end. The new garbage collector seems to not always succeed in keeping the images that are in use.

Hi there, could you please share the builder logs with us so that we can investigate the issue further?
You can run the following command:
balena push yourApp > logs.txt
to save the output locally, then attach the logs here.
Thanks

Logs sent as private message since they contain sensitive information

Hi, we still encounter some 404 errors for a very small percentage of the builds. From your logs, it seems that your build usually takes around 2 hours to run. The chance that the image upload will fail is related to how long the build runs. There is a higher chance for the builder GC to interfere with a build the longer the build runs.
Unfortunately, there is currently no workaround, and we are working on a permanent fix.
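
For anyone scripting their pushes in the meantime, the "try again" step can be automated. The following is a minimal Python sketch, not an official balena tool. The names `run_balena_push`, `push_with_retry` and `TRANSIENT_MARKER` are hypothetical, the marker string is taken from the error quoted earlier in this thread, and it is an assumption that the CLI prints that text to its output and returns a non-zero exit code on failure.

```python
import subprocess
from typing import Callable, Tuple

# Substring of the 404 error quoted in this thread (compared in lower case).
TRANSIENT_MARKER = "no such image"


def run_balena_push(app: str) -> Tuple[int, str]:
    """Run `balena push <app> --nocache` and return (exit code, combined output)."""
    proc = subprocess.run(
        ["balena", "push", app, "--nocache"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
    return proc.returncode, proc.stdout


def push_with_retry(build: Callable[[], Tuple[int, str]], max_attempts: int = 3) -> bool:
    """Retry a build only when it fails with the transient 404 error."""
    for attempt in range(1, max_attempts + 1):
        code, output = build()
        transient = TRANSIENT_MARKER in output.lower()
        if code == 0 and not transient:
            return True
        if not transient:
            # A different failure: retrying blindly would only waste build time.
            return False
        print(f"Attempt {attempt}/{max_attempts} hit the 404 'no such image' error")
    return False
```

It could be called as `push_with_retry(lambda: run_balena_push("myApp"))`. Keep `max_attempts` small: with builds that take 2 hours or more, every retry is expensive.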

Could you please try once again to build your image?
Thanks for bearing with us,
Georgia

Hello Georgia,

That’s what I thought, it makes sense. I built again and it succeeded the second time.

Hopefully you can find a way to perform garbage collection in a deterministic manner soon.

Best,
Val

On a side note, I also hope the developers of scipy, pandas, scikit-learn, etc. provide pre-built wheel packages for arm64 soon, because that’s the reason my build takes so long. I have to compile everything from source.
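
To illustrate why this matters: pip can only skip compilation when a published wheel carries a platform tag matching the target architecture; otherwise it falls back to the source distribution. Below is a rough Python sketch of that check based on the wheel filename convention. The function name `wheel_supports_arch` and the filenames in the usage note are made up for illustration, and real installers apply more rules (Python version, ABI, glibc level) than this does.

```python
def wheel_supports_arch(filename: str, arch: str = "aarch64") -> bool:
    """Return True if a distribution filename looks installable on `arch` without compiling."""
    if not filename.endswith(".whl"):
        # A source archive (e.g. .tar.gz) always has to be built locally.
        return False
    # Wheel names end in ...-{python tag}-{abi tag}-{platform tag}.whl
    platform_tag = filename[: -len(".whl")].split("-")[-1]
    # The platform tag may be several tags joined by dots.
    return any(
        tag == "any" or tag.endswith("_" + arch)
        for tag in platform_tag.split(".")
    )
```

For example, `wheel_supports_arch("numpy-1.19.0-cp38-cp38-manylinux1_x86_64.whl")` is `False`, while the same name ending in `manylinux2014_aarch64.whl` gives `True`.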

Thanks for trying this again, and I’m glad to hear that it worked for you this time. I’ve added a suggestion to our internal ticketing system that we consider sponsoring builds for the packages you mention – I’m sure they’d be useful to our users. In the meantime, we’re continuing to work on a permanent fix for the garbage collection issue, and will update you when we have more info.

All the best,
Hugh

That’s great to hear. Thanks for the support.

Regarding the arm64 compilation of Python packages, I came across this project: https://www.piwheels.org/

I see in the GitHub issues that the project owner is considering adding aarch64 builds to his Pi-oriented index, which could help in this regard.

Hey guys,

I’m sorry to report that I got the dreaded “404 no such image” error again today during a long build.

I can provide the logs if needed, but it’s the same story as before.

Hi Valentin, sorry to hear that. Yes, sending over the logs would be useful, thanks. I also noted that you suggested precompiled wheels might help. Have you considered putting the parts that take longest to build and change least often into a base image that you publish on Docker Hub and then reference in your Dockerfile? This should effectively allow you to avoid rebuilding them. I often do this for personal projects when using ML/AI libraries and things like OpenCV.

Hello Shaun,

Thanks for the message. I sent you the logs in a direct message. Regarding compiling wheels for the packages that take a long time to build, I think that would not save me much time, since I mainly rebuild the application when some dependency versions change (which would require a rebuild anyway). Also, I’m not sure if I could use the cloud builders for that task, and locally it takes even longer to build for arm. I manage dependencies with Poetry, and the easiest way would be to host some pre-compiled packages over at PyPI or, even better, in some custom repo, but I can’t take on the task of keeping up to date with releases for numpy, scipy, scikit-learn, pandas, statsmodels and more.

EDIT: This is hit or miss. I just tried a build again and it finished successfully after 3 hours. Hopefully the garbage collector on the builder instances can be made even more robust.

We’re actively working on the builder garbage collection, and should be able to eliminate this problem in the near future. At the moment, only a small percentage of builds hit this error, but it seems you’ve been unlucky and hit it multiple times in a short timeframe.