Failing build - Cannot overwrite digest

Hi,
This is a follow up of my last message.
Currently, pushing a 12-service application with balena push fails most of the time. Here, I pushed twice in a row without changing anything: the first time everything worked without issue, but the second time we got the following error (the exact invocation is sketched after the log):

[manager]          Successfully built 19466c3204ae
[Info]             Uploading images
[Success]          Successfully uploaded images
[Error]            Some services failed to build:
[Error]              Service: pijuice
[Error]                Error: Cannot overwrite digest sha256:b41e392e213f79e5c3a8670f5d21e4f2c2f2b91b1929c1752eb4c340bb5f74b5
[Error]              Service: artnet-receiver
[Error]                Error: Cannot overwrite digest sha256:b41e392e213f79e5c3a8670f5d21e4f2c2f2b91b1929c1752eb4c340bb5f74b5
[Info]             Built on arm01
[Error]            Not deploying release.
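
For reference, there is nothing exotic about the invocation that produced the log above; it's a plain push (the fleet name below is a placeholder for ours):

$ balena push my-fleet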

Here are the full logs of the failing deployment for more context:

More details, as requested by @georgiats (although the error is occurring during the build process, so the OS and supervisor versions shouldn't affect it):

  • We are deploying to a fleet of raspberrypi3
  • balenaOS 2.52.7+rev1
  • Supervisor 11.12.4

This really slows down our development process, as we can lose between 5 and 15 minutes every time a build fails.

Hi. One of our engineers is investigating this issue. I pinged him and he will take a look as soon as he is available.


Hi there, our engineers are aware of this issue and are working on it. A fix for this problem landed upstream through this PR: https://github.com/moby/moby/pull/37781, and has been available since v19.03.0. We are currently running v18.09.7 on our ARM builders, so this should be fixed as soon as we move to the new architecture, since we'll be updating the engine version at the same time. I'm afraid I cannot share an ETA for the resolution, but we will notify you as soon as it is updated.
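
If you want to check whether an engine you control already includes that fix, the server version is enough to tell (a quick sketch; note that this doesn't apply to our hosted builders, which you can't query directly):

$ docker version --format '{{.Server.Version}}'
# e.g. 19.03.0 — anything from 19.03.0 onwards includes moby/moby#37781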

Thanks for bearing with us,
Georgia

I see. IMO, this information should be pinned to the status page, as it affects all customers. I don't think "99.97% uptime" is accurate when half of our builds fail.

We have the fix for this in staging. It's being tested right now, and if nothing goes wrong we'll push it to production.

Hi!!

We just deployed the fix on one of our builder instances (ARM03) and we'll be monitoring it closely for some time to make sure it works properly. Once we're sure it behaves as expected, we'll deploy it to the rest of our builders.

Please try your builds again, and bear in mind that the error might still happen if the build gets scheduled on ARM01, but it shouldn't happen if it is scheduled on ARM03.

Thank you very much!