Updating fails on Jetson Nano eMMC

hello,

Whenever I make any change in my Dockerfile, I run into updating issues.
It seems like any change in my Dockerfile results in a complete download of the image.

Whenever I update the code in my application by simply rebuilding the image with the exact same Dockerfile, everything works fine.

Whenever I change the Dockerfile (for example, pip3 install Flask >> pip3 install Flask==1.1.1), the update fails. I can fix this by purging the data, but this deletes all persistent data as well.

Output of the failed update and of purging the data:

Failed to download image 'registry2.balena-cloud.com/v2/dde316e69426e2f6ae8b5f7eb32c42a9@sha256:3948731fd2e85548ac9d6efd915417d1ee91a92260d5fff67faeca8fd977538f' due to 'Got 500 when requesting v3 delta from delta server.'
Downloading delta for image 'registry2.balena-cloud.com/v2/dde316e69426e2f6ae8b5f7eb32c42a9@sha256:3948731fd2e85548ac9d6efd915417d1ee91a92260d5fff67faeca8fd977538f'
Delta still processing remotely. Will retry...
Downloading delta for image 'registry2.balena-cloud.com/v2/dde316e69426e2f6ae8b5f7eb32c42a9@sha256:3948731fd2e85548ac9d6efd915417d1ee91a92260d5fff67faeca8fd977538f'
Delta still processing remotely. Will retry...
Downloading delta for image 'registry2.balena-cloud.com/v2/dde316e69426e2f6ae8b5f7eb32c42a9@sha256:3948731fd2e85548ac9d6efd915417d1ee91a92260d5fff67faeca8fd977538f'
Delta still processing remotely. Will retry...
Downloading delta for image 'registry2.balena-cloud.com/v2/dde316e69426e2f6ae8b5f7eb32c42a9@sha256:3948731fd2e85548ac9d6efd915417d1ee91a92260d5fff67faeca8fd977538f'
Failed to download image 'registry2.balena-cloud.com/v2/dde316e69426e2f6ae8b5f7eb32c42a9@sha256:3948731fd2e85548ac9d6efd915417d1ee91a92260d5fff67faeca8fd977538f' due to 'failed to register layer: Error processing tar file(exit status 1): write /usr/local/cuda-10.2/targets/aarch64-linux/lib/libcufft_static_nocallback.a: no space left on device'
Downloading delta for image 'registry2.balena-cloud.com/v2/dde316e69426e2f6ae8b5f7eb32c42a9@sha256:3948731fd2e85548ac9d6efd915417d1ee91a92260d5fff67faeca8fd977538f'


Purging data for app 1774169
Killing service 'main sha256:a6b3756e832b0997df646fbc4a257c866de91a0a2865ad7b8b63c2d0aad96f37'
Service exited 'main sha256:a6b3756e832b0997df646fbc4a257c866de91a0a2865ad7b8b63c2d0aad96f37'
Killed service 'main sha256:a6b3756e832b0997df646fbc4a257c866de91a0a2865ad7b8b63c2d0aad96f37'
Failed to download image 'registry2.balena-cloud.com/v2/dde316e69426e2f6ae8b5f7eb32c42a9@sha256:3948731fd2e85548ac9d6efd915417d1ee91a92260d5fff67faeca8fd977538f' due to 'failed to register layer: Error processing tar file(exit status 1): write /usr/lib/python2.7/config-aarch64-linux-gnu/libpython2.7-pic.a: no space left on device'
Deleting image 'registry2.balena-cloud.com/v2/e2d300472be8df02c1c00bdaac0dda0b@sha256:a4b081609ce6530e6d694dab7ccae8a3af86caec445a57a0c640f81c30cd6146'
Deleted image 'registry2.balena-cloud.com/v2/e2d300472be8df02c1c00bdaac0dda0b@sha256:a4b081609ce6530e6d694dab7ccae8a3af86caec445a57a0c640f81c30cd6146'
Removing volume 'resin-data'
Creating volume 'resin-data'
Error purging data: Error: Failed to apply state transition steps. Cannot read property 'match' of undefined Steps:["fetch","createVolume"]
Downloading image 'registry2.balena-cloud.com/v2/dde316e69426e2f6ae8b5f7eb32c42a9@sha256:3948731fd2e85548ac9d6efd915417d1ee91a92260d5fff67faeca8fd977538f'

Hardware: Jetson Nano eMMC, 16 GB
Image: 8 GB
OS: balenaOS 2.67.3+rev2

Hi Jasper,
Instead of purging the data, can you try SSHing into the device and running balena system prune -a? This removes only unused data (stopped containers, unused networks and images not used by any container) and leaves your volumes alone. (docker system prune | Docker Documentation)

After pruning, it still gets stuck in a loop of filling the entire disk with data.

balena images shows a single 75 MB image (the resin_supervisor), but the disk usage at /mnt/data is still at 8 GB at the lowest.

I did get an additional error:

02.03.21 11:45:11 (+0100) Failed to download image ‘registry2.balena-cloud.com/v2/dde316e69426e2f6ae8b5f7eb32c42a9@sha256:3948731fd2e85548ac9d6efd915417d1ee91a92260d5fff67faeca8fd977538f’ due to 'connect ECONNREFUSED /var/run/balena-engine.soc

Hi Jasper,

How big is the image the device is trying to download?

You could use a different update strategy so that the device removes the current image before downloading the new one.
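To make the space argument concrete, here is a rough back-of-the-envelope sketch in Python. It assumes that the default strategy keeps the old image on disk until the new one is complete, that delete-then-download removes the old image first, and that in the worst case the download is as large as the full image. The function names and the 14 GB usable data partition are made up for illustration; only the 8 GB image size comes from your post.

```python
def peak_usage_gb(old_image_gb, new_image_gb, strategy="download-then-kill"):
    """Rough peak disk usage while an update is in flight (illustrative only)."""
    if strategy == "delete-then-download":
        # The old image is removed first, so only the new one occupies space.
        return new_image_gb
    # Otherwise the old image stays on disk until the new one is complete.
    return old_image_gb + new_image_gb


def fits(data_partition_gb, old_image_gb, new_image_gb, strategy="download-then-kill"):
    """True if the update's peak usage fits in the data partition."""
    return peak_usage_gb(old_image_gb, new_image_gb, strategy) <= data_partition_gb


# Assumed numbers: 8 GB image (from the post), 14 GB usable data partition (a guess).
print(fits(14, 8, 8))                          # False
print(fits(14, 8, 8, "delete-then-download"))  # True
```

With these assumed numbers, two 8 GB images cannot sit side by side on the device, which would match the "no space left on device" errors in the log.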

Phil

I prefer not to use the delete-then-download method, since I also plan to deploy small code updates and a reliable network connection is not always a given.

Besides, I think downloading only the deltas is a feature of balenaOS, and the adjustment I made (‘flask-socketio’ to ‘flask-socketio==4.3.2’) probably should not generate 6 GB of deltas.

EDIT: enabling delete-then-download does not change anything. No space is cleared up before updating (or I might be doing it wrong).

The ECONNREFUSED error shows that there is an issue with balenaEngine. Could you run diagnostics on the device and see if anything comes up red? It’s on the left side of the device’s dashboard page.

And also this:

Error purging data: Error: Failed to apply state transition steps. Cannot read property 'match' of undefined Steps:["fetch","createVolume"]

Which version of the supervisor is your device running?

And lastly, what do you mean by “purging the data”? What steps did you follow?

The supervisor is 12.3.0 and the diagnostics do not seem to show anything new. The main container is indeed stopping due to a full disk, and the temperature is a known bug.

When I purge the system, I am using the purge data option in the dashboard.

The main container is indeed stopping due to a full disk

Hey @jap937, when does this happen? Only during the update?
You noted that your image is 8GB. What’s the usual size of your persistent application data?

As you say, I’d expect the delta image size to be much smaller than the original image when you are making minimal changes in your application.

In case you haven’t seen these, I encourage you to go through our docs on optimizing your docker builds:

Perhaps you could save some room by changing how you build the final image that’s deployed to the device.

Next, can you please update the supervisor version to the latest and see if the problem still manifests?
My teammate maintaining the supervisor just noted that the latest supervisor version (v12.4.3) addresses a couple of commonly experienced issues.

If you still see issues, it’d be great to see the supervisor logs. Can you please run the command journalctl -fn 1000 -u resin-supervisor on the host OS and send us the output?
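If the output is long, a small script can pull out the interesting lines. The following is only a sketch in Python, assuming the log has been saved to a text file first. The helper name failed_writes and the shortened sample lines are made up for illustration; the error text is copied from the log you posted earlier.

```python
import re
from collections import Counter

# Matches the "no space left on device" failures seen in the device log.
NO_SPACE = re.compile(r"write (?P<path>\S+): no space left on device")


def failed_writes(log_lines):
    """Count, per file path, how many log lines report a write that ran out of space."""
    counts = Counter()
    for line in log_lines:
        # A single line can repeat the message (message plus stack trace),
        # so only the first match per line is counted.
        match = NO_SPACE.search(line)
        if match:
            counts[match.group("path")] += 1
    return counts


# Shortened sample lines; real input would be the saved journalctl output.
sample = [
    "Failed to download image due to 'failed to register layer: Error processing "
    "tar file(exit status 1): write /usr/local/cuda-10.2/targets/aarch64-linux/lib/"
    "libcufft_static_nocallback.a: no space left on device'",
    "Delta still processing remotely. Will retry...",
]
print(failed_writes(sample))
```

Reading a saved file would look like failed_writes(open("supervisor.log")), where supervisor.log is a hypothetical file name.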

Cheers…

The persistent data is just a few configuration files, usually less than a megabyte.
I also updated the supervisor to 12.4.3, but no luck.

Here are the supervisor logs from the last purge until the failing update.

Mar 08 07:51:52 0713b55 resin-supervisor[597313]: [api]     POST /v1/purge  -  ms
Mar 08 07:56:02 0713b55 resin-supervisor[597313]: [debug]   delta([main] registry2.balena-cloud.com/v2/dde316e69426e2f6ae8b5f7eb32c42a9@sha256:3948731fd2e85548ac9d6efd915417d1ee91a92260d5fff67faeca8fd977538f): Delta failed with Error: failed to register layer: Error processing tar file(exit status 1): write /usr/share/midi/freepats/Tone_000/040_Violin.pat: no space left on device
Mar 08 07:56:02 0713b55 resin-supervisor[597313]: [event]   Event: Image download error {"error":{"message":"failed to register layer: Error processing tar file(exit status 1): write /usr/share/midi/freepats/Tone_000/040_Violin.pat: no space left on device","stack":"Error: failed to register layer: Error processing tar file(exit status 1): write /usr/share/midi/freepats/Tone_000/040_Violin.pat: no space left on device\n    at Stream.<anonymous> (/usr/src/app/dist/app.js:10:2278947)\n    at Stream.emit (events.js:310:20)\n    at drain (/usr/src/app/dist/app.js:2:298862)\n    at Stream.stream.queue.stream.push (/usr/src/app/dist/app.js:2:299269)\n    at Parser.parser.onToken (/usr/src/app/dist/app.js:10:251148)\n    at Parser.proto.write (/usr/src/app/dist/app.js:10:695962)\n    at Stream.<anonymous> (/usr/src/app/dist/app.js:10:249612)\n    at Stream.stream.write (/usr/src/app/dist/app.js:2:299138)\n    at IncomingMessage.ondata (_stream_readable.js:695:22)\n    at IncomingMessage.emit (events.js:310:20)\n    at addChunk (_stream_readable.js:286:12)\n    at readableAddChunk (_stream_readable.js:268:9)\n    at IncomingMessage.Readable.push (_stream_readable.js:209:10)\n    at HTTPParser.parserOnBody (_http_common.js:132:24)\n    at Socket.socketOnData (_http_client.js:476:22)\n    at Socket.emit (events.js:310:20)"},"image":{"name":"registry2.balena-cloud.com/v2/6cc5f64d33f244d8e3dce4b038bc94d8@sha256:a1cff9a63383d80506e18fa252a96c454266ae081065b223fd7a868815cba832","appId":1774169,"serviceId":835637,"serviceName":"main","imageId":3351715,"releaseId":1718361,"dependent":0,"dockerImageId":null}}
Mar 08 07:56:08 0713b55 resin-supervisor[597313]: [api]     GET /v1/healthy 200 - 9.756 ms
Mar 08 07:56:09 0713b55 resin-supervisor[597313]: [event]   Event: Image removal {"image":{"id":27,"name":"registry2.balena-cloud.com/v2/dde316e69426e2f6ae8b5f7eb32c42a9@sha256:3948731fd2e85548ac9d6efd915417d1ee91a92260d5fff67faeca8fd977538f","appId":1774169,"serviceId":835637,"serviceName":"main","imageId":3327926,"releaseId":1711387,"dependent":0,"dockerImageId":"sha256:05aea6e2bcccabffa8cc93f50525c4898e41a1231f30ca9caf6050fbe7a94413"}}
Mar 08 07:56:09 0713b55 resin-supervisor[597313]: [event]   Event: Image removed {"image":{"id":27,"name":"registry2.balena-cloud.com/v2/dde316e69426e2f6ae8b5f7eb32c42a9@sha256:3948731fd2e85548ac9d6efd915417d1ee91a92260d5fff67faeca8fd977538f","appId":1774169,"serviceId":835637,"serviceName":"main","imageId":3327926,"releaseId":1711387,"dependent":0,"dockerImageId":"sha256:05aea6e2bcccabffa8cc93f50525c4898e41a1231f30ca9caf6050fbe7a94413"}}
Mar 08 07:56:24 0713b55 resin-supervisor[597313]: [success] Device state apply success
Mar 08 07:56:24 0713b55 resin-supervisor[597313]: [event]   Event: Volume removal {}
Mar 08 07:56:25 0713b55 resin-supervisor[597313]: [event]   Event: Volume creation {}
Mar 08 07:56:25 0713b55 resin-supervisor[597313]: [error]   Device state apply error Error: Failed to apply state transition steps. Cannot read property 'match' of undefined Steps:["fetch","createVolume"]
Mar 08 07:56:25 0713b55 resin-supervisor[597313]: [error]         at fn (/usr/src/app/dist/app.js:6:8488)
Mar 08 07:56:25 0713b55 resin-supervisor[597313]: [event]   Event: Purge data error {"appId":1774169,"error":{"message":"Failed to apply state transition steps. Cannot read property 'match' of undefined Steps:[\"fetch\",\"createVolume\"]","stack":"Error: Failed to apply state transition steps. Cannot read property 'match' of undefined Steps:[\"fetch\",\"createVolume\"]\n    at fn (/usr/src/app/dist/app.js:6:8488)"}}
Mar 08 07:56:25 0713b55 resin-supervisor[597313]: [error]   Error on POST /v1/purge:  Error: Failed to apply state transition steps. Cannot read property 'match' of undefined Steps:["fetch","createVolume"]
Mar 08 07:56:25 0713b55 resin-supervisor[597313]: [error]         at fn (/usr/src/app/dist/app.js:6:8488)
Mar 08 07:56:25 0713b55 resin-supervisor[597313]: [event]   Event: Docker image download {"image":{"name":"registry2.balena-cloud.com/v2/6cc5f64d33f244d8e3dce4b038bc94d8@sha256:a1cff9a63383d80506e18fa252a96c454266ae081065b223fd7a868815cba832","appId":1774169,"serviceId":835637,"serviceName":"main","imageId":3351715,"releaseId":1718361,"dependent":0,"dockerImageId":null}}
Mar 08 08:00:02 0713b55 resin-supervisor[597313]: [error]   Error from the API: 503
Mar 08 08:00:02 0713b55 resin-supervisor[597313]: [error]   Non-200 response from the API! Status code: 503 - message: Error
Mar 08 08:00:02 0713b55 resin-supervisor[597313]: [error]         at /usr/src/app/dist/app.js:22:554765
Mar 08 08:00:02 0713b55 resin-supervisor[597313]: [error]       at runMicrotasks (<anonymous>)
Mar 08 08:00:02 0713b55 resin-supervisor[597313]: [error]       at processTicksAndRejections (internal/process/task_queues.js:97:5)
Mar 08 08:00:02 0713b55 resin-supervisor[597313]: [error]       at async /usr/src/app/dist/app.js:22:554073
Mar 08 08:00:02 0713b55 resin-supervisor[597313]: [error]       at async /usr/src/app/dist/app.js:22:555595
Mar 08 08:01:06 0713b55 resin-supervisor[597313]: [debug]   Attempting container log timestamp flush...
Mar 08 08:01:06 0713b55 resin-supervisor[597313]: [debug]   Container log timestamp flush complete
Mar 08 08:01:09 0713b55 resin-supervisor[597313]: [api]     GET /v1/healthy 200 - 16.671 ms
Mar 08 08:06:10 0713b55 resin-supervisor[597313]: [api]     GET /v1/healthy 200 - 13.440 ms
Mar 08 08:07:33 0713b55 resin-supervisor[597313]: [event]   Event: Image downloaded {"image":{"name":"registry2.balena-cloud.com/v2/6cc5f64d33f244d8e3dce4b038bc94d8@sha256:a1cff9a63383d80506e18fa252a96c454266ae081065b223fd7a868815cba832","appId":1774169,"serviceId":835637,"serviceName":"main","imageId":3351715,"releaseId":1718361,"dependent":0,"dockerImageId":null}}
Mar 08 08:07:34 0713b55 resin-supervisor[597313]: [event]   Event: Service install {"service":{"appId":1774169,"serviceId":835637,"serviceName":"main","releaseId":1718361}}
Mar 08 08:07:34 0713b55 resin-supervisor[597313]: [event]   Event: Service installed {"service":{"appId":1774169,"serviceId":835637,"serviceName":"main","releaseId":1718361}}
Mar 08 08:07:34 0713b55 resin-supervisor[597313]: [event]   Event: Service start {"service":{"appId":1774169,"serviceId":835637,"serviceName":"main","releaseId":1718361}}
Mar 08 08:07:35 0713b55 resin-supervisor[597313]: [event]   Event: Service started {"service":{"appId":1774169,"serviceId":835637,"serviceName":"main","releaseId":1718361}}
Mar 08 08:07:35 0713b55 resin-supervisor[597313]: [debug]   Spawning journald with: chroot  /mnt/root journalctl -a -S 2021-03-08 08:07:35 -o json CONTAINER_ID_FULL=148645a621b7975ce92af232bf0aa165be214b7dd7ef4db0bf611eae6930cea9
Mar 08 08:07:36 0713b55 resin-supervisor[597313]: [debug]   Finished applying target state
Mar 08 08:07:36 0713b55 resin-supervisor[597313]: [success] Device state apply success
Mar 08 08:07:36 0713b55 resin-supervisor[597313]: [info]    Applying target state
Mar 08 08:07:36 0713b55 resin-supervisor[597313]: [debug]   Finished applying target state
Mar 08 08:07:36 0713b55 resin-supervisor[597313]: [success] Device state apply success
Mar 08 08:11:06 0713b55 resin-supervisor[597313]: [debug]   Attempting container log timestamp flush...
Mar 08 08:11:06 0713b55 resin-supervisor[597313]: [debug]   Container log timestamp flush complete
Mar 08 08:11:11 0713b55 resin-supervisor[597313]: [api]     GET /v1/healthy 200 - 14.502 ms
Mar 08 08:13:04 0713b55 resin-supervisor[597313]: [event]   Event: Update notification {}
Mar 08 08:13:04 0713b55 resin-supervisor[597313]: [api]     POST /v1/update 204 - 59.533 ms
Mar 08 08:13:05 0713b55 resin-supervisor[597313]: [info]    Applying target state
Mar 08 08:13:05 0713b55 resin-supervisor[597313]: [debug]   Replacing container for service main because of config changes:
Mar 08 08:13:05 0713b55 resin-supervisor[597313]: [debug]     Non-array fields:  {"added":{},"deleted":{"entrypoint":{},"environment":{},"labels":{}},"updated":{"image":"registry2.balena-cloud.com/v2/dde316e69426e2f6ae8b5f7eb32c42a9@sha256:3948731fd2e85548ac9d6efd915417d1ee91a92260d5fff67faeca8fd977538f","workingDir":""}}
Mar 08 08:13:05 0713b55 resin-supervisor[597313]: [debug]   Replacing container for service main because of config changes:
Mar 08 08:13:05 0713b55 resin-supervisor[597313]: [debug]     Non-array fields:  {"added":{},"deleted":{"entrypoint":{},"environment":{},"labels":{}},"updated":{"image":"registry2.balena-cloud.com/v2/dde316e69426e2f6ae8b5f7eb32c42a9@sha256:3948731fd2e85548ac9d6efd915417d1ee91a92260d5fff67faeca8fd977538f","workingDir":""}}
Mar 08 08:13:05 0713b55 resin-supervisor[597313]: [event]   Event: Delta image download {"image":{"name":"registry2.balena-cloud.com/v2/dde316e69426e2f6ae8b5f7eb32c42a9@sha256:3948731fd2e85548ac9d6efd915417d1ee91a92260d5fff67faeca8fd977538f","appId":1774169,"serviceId":835637,"serviceName":"main","imageId":3327926,"releaseId":1711387,"dependent":0,"dockerImageId":null}}
Mar 08 08:13:05 0713b55 resin-supervisor[597313]: [debug]   delta([main] registry2.balena-cloud.com/v2/6cc5f64d33f244d8e3dce4b038bc94d8@sha256:a1cff9a63383d80506e18fa252a96c454266ae081065b223fd7a868815cba832): Starting delta to registry2.balena-cloud.com/v2/dde316e69426e2f6ae8b5f7eb32c42a9@sha256:3948731fd2e85548ac9d6efd915417d1ee91a92260d5fff67faeca8fd977538f
Mar 08 08:13:06 0713b55 resin-supervisor[597313]: [debug]   delta([main] registry2.balena-cloud.com/v2/6cc5f64d33f244d8e3dce4b038bc94d8@sha256:a1cff9a63383d80506e18fa252a96c454266ae081065b223fd7a868815cba832): Applying balena delta...
Mar 08 08:13:06 0713b55 resin-supervisor[597313]: [debug]   delta([main] registry2.balena-cloud.com/v2/6cc5f64d33f244d8e3dce4b038bc94d8@sha256:a1cff9a63383d80506e18fa252a96c454266ae081065b223fd7a868815cba832): Using registry auth token
Mar 08 08:15:10 0713b55 resin-supervisor[597313]: [error]   Error from the API: 503
Mar 08 08:15:10 0713b55 resin-supervisor[597313]: [error]   Non-200 response from the API! Status code: 503 - message: Error
Mar 08 08:15:10 0713b55 resin-supervisor[597313]: [error]         at /usr/src/app/dist/app.js:22:554765
Mar 08 08:15:10 0713b55 resin-supervisor[597313]: [error]       at runMicrotasks (<anonymous>)
Mar 08 08:15:10 0713b55 resin-supervisor[597313]: [error]       at processTicksAndRejections (internal/process/task_queues.js:97:5)
Mar 08 08:15:10 0713b55 resin-supervisor[597313]: [error]       at async /usr/src/app/dist/app.js:22:554073
Mar 08 08:15:10 0713b55 resin-supervisor[597313]: [error]       at async /usr/src/app/dist/app.js:22:555595
Mar 08 08:16:12 0713b55 resin-supervisor[597313]: [api]     GET /v1/healthy 200 - 23.233 ms
Mar 08 08:19:59 0713b55 resin-supervisor[597313]: [debug]   delta([main] registry2.balena-cloud.com/v2/6cc5f64d33f244d8e3dce4b038bc94d8@sha256:a1cff9a63383d80506e18fa252a96c454266ae081065b223fd7a868815cba832): Delta failed with Error: failed to register layer: Error processing tar file(exit status 1): write /usr/bin/nmap: no space left on device
Mar 08 08:19:59 0713b55 resin-supervisor[597313]: [event]   Event: Image download error {"error":{"message":"failed to register layer: Error processing tar file(exit status 1): write /usr/bin/nmap: no space left on device","stack":"Error: failed to register layer: Error processing tar file(exit status 1): write /usr/bin/nmap: no space left on device\n    at Stream.<anonymous> (/usr/src/app/dist/app.js:10:2278947)\n    at Stream.emit (events.js:310:20)\n    at drain (/usr/src/app/dist/app.js:2:298862)\n    at Stream.stream.queue.stream.push (/usr/src/app/dist/app.js:2:299269)\n    at Parser.parser.onToken (/usr/src/app/dist/app.js:10:251148)\n    at Parser.proto.write (/usr/src/app/dist/app.js:10:695962)\n    at Stream.<anonymous> (/usr/src/app/dist/app.js:10:249612)\n    at Stream.stream.write (/usr/src/app/dist/app.js:2:299138)\n    at IncomingMessage.ondata (_stream_readable.js:695:22)\n    at IncomingMessage.emit (events.js:310:20)\n    at addChunk (_stream_readable.js:286:12)\n    at readableAddChunk (_stream_readable.js:268:9)\n    at IncomingMessage.Readable.push (_stream_readable.js:209:10)\n    at HTTPParser.parserOnBody (_http_common.js:132:24)\n    at Socket.socketOnData (_http_client.js:476:22)\n    at Socket.emit (events.js:310:20)"},"image":{"name":"registry2.balena-cloud.com/v2/dde316e69426e2f6ae8b5f7eb32c42a9@sha256:3948731fd2e85548ac9d6efd915417d1ee91a92260d5fff67faeca8fd977538f","appId":1774169,"serviceId":835637,"serviceName":"main","imageId":3327926,"releaseId":1711387,"dependent":0,"dockerImageId":null}}
Mar 08 08:20:07 0713b55 resin-supervisor[597313]: [debug]   Replacing container for service main because of config changes:
Mar 08 08:20:07 0713b55 resin-supervisor[597313]: [debug]     Non-array fields:  {"added":{},"deleted":{"entrypoint":{},"environment":{},"labels":{}},"updated":{"image":"registry2.balena-cloud.com/v2/dde316e69426e2f6ae8b5f7eb32c42a9@sha256:3948731fd2e85548ac9d6efd915417d1ee91a92260d5fff67faeca8fd977538f","workingDir":""}}
Mar 08 08:20:07 0713b55 resin-supervisor[597313]: [debug]   Replacing container for service main because of config changes:
Mar 08 08:20:07 0713b55 resin-supervisor[597313]: [debug]     Non-array fields:  {"added":{},"deleted":{"entrypoint":{},"environment":{},"labels":{}},"updated":{"image":"registry2.balena-cloud.com/v2/dde316e69426e2f6ae8b5f7eb32c42a9@sha256:3948731fd2e85548ac9d6efd915417d1ee91a92260d5fff67faeca8fd977538f","workingDir":""}}
Mar 08 08:20:07 0713b55 resin-supervisor[597313]: [event]   Event: Delta image download {"image":{"name":"registry2.balena-cloud.com/v2/dde316e69426e2f6ae8b5f7eb32c42a9@sha256:3948731fd2e85548ac9d6efd915417d1ee91a92260d5fff67faeca8fd977538f","appId":1774169,"serviceId":835637,"serviceName":"main","imageId":3327926,"releaseId":1711387,"dependent":0,"dockerImageId":null}}
Mar 08 08:20:07 0713b55 resin-supervisor[597313]: [debug]   delta([main] registry2.balena-cloud.com/v2/6cc5f64d33f244d8e3dce4b038bc94d8@sha256:a1cff9a63383d80506e18fa252a96c454266ae081065b223fd7a868815cba832): Starting delta to registry2.balena-cloud.com/v2/dde316e69426e2f6ae8b5f7eb32c42a9@sha256:3948731fd2e85548ac9d6efd915417d1ee91a92260d5fff67faeca8fd977538f
Mar 08 08:20:08 0713b55 resin-supervisor[597313]: [debug]   delta([main] registry2.balena-cloud.com/v2/6cc5f64d33f244d8e3dce4b038bc94d8@sha256:a1cff9a63383d80506e18fa252a96c454266ae081065b223fd7a868815cba832): Applying balena delta...
Mar 08 08:20:08 0713b55 resin-supervisor[597313]: [debug]   delta([main] registry2.balena-cloud.com/v2/6cc5f64d33f244d8e3dce4b038bc94d8@sha256:a1cff9a63383d80506e18fa252a96c454266ae081065b223fd7a868815cba832): Using registry auth token
Mar 08 08:21:06 0713b55 resin-supervisor[597313]: [debug]   Attempting container log timestamp flush...
Mar 08 08:21:06 0713b55 resin-supervisor[597313]: [debug]   Container log timestamp flush complete
Mar 08 08:21:13 0713b55 resin-supervisor[597313]: [api]     GET /v1/healthy 200 - 16.543 ms
Mar 08 08:25:40 0713b55 resin-supervisor[597313]: [debug]   delta([main] registry2.balena-cloud.com/v2/6cc5f64d33f244d8e3dce4b038bc94d8@sha256:a1cff9a63383d80506e18fa252a96c454266ae081065b223fd7a868815cba832): Delta failed with Error: failed to register layer: Error processing tar file(exit status 1): write /usr/bin/nmap: no space left on device
Mar 08 08:25:40 0713b55 resin-supervisor[597313]: [event]   Event: Image download error {"error":{"message":"failed to register layer: Error processing tar file(exit status 1): write /usr/bin/nmap: no space left on device","stack":"Error: failed to register layer: Error processing tar file(exit status 1): write /usr/bin/nmap: no space left on device\n    at Stream.<anonymous> (/usr/src/app/dist/app.js:10:2278947)\n    at Stream.emit (events.js:310:20)\n    at drain (/usr/src/app/dist/app.js:2:298862)\n    at Stream.stream.queue.stream.push (/usr/src/app/dist/app.js:2:299269)\n    at Parser.parser.onToken (/usr/src/app/dist/app.js:10:251148)\n    at Parser.proto.write (/usr/src/app/dist/app.js:10:695962)\n    at Stream.<anonymous> (/usr/src/app/dist/app.js:10:249612)\n    at Stream.stream.write (/usr/src/app/dist/app.js:2:299138)\n    at IncomingMessage.ondata (_stream_readable.js:695:22)\n    at IncomingMessage.emit (events.js:310:20)\n    at addChunk (_stream_readable.js:286:12)\n    at readableAddChunk (_stream_readable.js:268:9)\n    at IncomingMessage.Readable.push (_stream_readable.js:209:10)\n    at HTTPParser.parserOnBody (_http_common.js:132:24)\n    at Socket.socketOnData (_http_client.js:476:22)\n    at Socket.emit (events.js:310:20)"},"image":{"name":"registry2.balena-cloud.com/v2/dde316e69426e2f6ae8b5f7eb32c42a9@sha256:3948731fd2e85548ac9d6efd915417d1ee91a92260d5fff67faeca8fd977538f","appId":1774169,"serviceId":835637,"serviceName":"main","imageId":3327926,"releaseId":1711387,"dependent":0,"dockerImageId":null}}
Mar 08 08:25:51 0713b55 resin-supervisor[597313]: [debug]   Replacing container for service main because of config changes:
Mar 08 08:25:51 0713b55 resin-supervisor[597313]: [debug]     Non-array fields:  {"added":{},"deleted":{"entrypoint":{},"environment":{},"labels":{}},"updated":{"image":"registry2.balena-cloud.com/v2/dde316e69426e2f6ae8b5f7eb32c42a9@sha256:3948731fd2e85548ac9d6efd915417d1ee91a92260d5fff67faeca8fd977538f","workingDir":""}}
Mar 08 08:25:51 0713b55 resin-supervisor[597313]: [debug]   Replacing container for service main because of config changes:
Mar 08 08:25:51 0713b55 resin-supervisor[597313]: [debug]     Non-array fields:  {"added":{},"deleted":{"entrypoint":{},"environment":{},"labels":{}},"updated":{"image":"registry2.balena-cloud.com/v2/dde316e69426e2f6ae8b5f7eb32c42a9@sha256:3948731fd2e85548ac9d6efd915417d1ee91a92260d5fff67faeca8fd977538f","workingDir":""}}
Mar 08 08:25:51 0713b55 resin-supervisor[597313]: [event]   Event: Delta image download {"image":{"name":"registry2.balena-cloud.com/v2/dde316e69426e2f6ae8b5f7eb32c42a9@sha256:3948731fd2e85548ac9d6efd915417d1ee91a92260d5fff67faeca8fd977538f","appId":1774169,"serviceId":835637,"serviceName":"main","imageId":3327926,"releaseId":1711387,"dependent":0,"dockerImageId":null}}
Mar 08 08:25:51 0713b55 resin-supervisor[597313]: [debug]   delta([main] registry2.balena-cloud.com/v2/6cc5f64d33f244d8e3dce4b038bc94d8@sha256:a1cff9a63383d80506e18fa252a96c454266ae081065b223fd7a868815cba832): Starting delta to registry2.balena-cloud.com/v2/dde316e69426e2f6ae8b5f7eb32c42a9@sha256:3948731fd2e85548ac9d6efd915417d1ee91a92260d5fff67faeca8fd977538f
Mar 08 08:25:52 0713b55 resin-supervisor[597313]: [debug]   delta([main] registry2.balena-cloud.com/v2/6cc5f64d33f244d8e3dce4b038bc94d8@sha256:a1cff9a63383d80506e18fa252a96c454266ae081065b223fd7a868815cba832): Applying balena delta...
Mar 08 08:25:52 0713b55 resin-supervisor[597313]: [debug]   delta([main] registry2.balena-cloud.com/v2/6cc5f64d33f244d8e3dce4b038bc94d8@sha256:a1cff9a63383d80506e18fa252a96c454266ae081065b223fd7a868815cba832): Using registry auth token
Mar 08 08:26:13 0713b55 resin-supervisor[597313]: [api]     GET /v1/healthy 200 - 6.071 ms

This appears to be a Dockerfile-related problem (though delete-then-download still has no effect).
I require OpenCV for my application, but any Dockerfile change above the OpenCV install causes a massive delta update.
I delete any downloaded OpenCV packages after the install on the same layer. Does the way delta updates are handled affect the way I am required to build packages from source (are multi-stage builds essential for this)?

Simplified Dockerfile:

  1. install jetpack core packages (BSP)
  2. install some basic packages
  3. install opencv from source
  4. install some application specific packages

Hello, the delta updates look at binary differences between the whole old and new image, regardless of the layers. The delta will be most effective (smallest) when there is little change from one release to the next. However, in the worst case the delta won’t be any larger than the size of the whole image pull. Multi-stage builds are not essential for efficient deltas, but are a good overall strategy to reduce image size when appropriate, especially when building packages such as OpenCV. You might want to consider reordering your Dockerfile into two stages like this:

First stage:

  1. install OpenCV prereqs
  2. build OpenCV from source

Second stage:

  1. copy OpenCV files/needed libraries (CUDA, etc.) from the previous stage
  2. install jetpack core packages (BSP)
  3. install some basic packages
  4. install some application specific packages

One goal is to place the items that are likely to change more often closer to the end of the Dockerfile. Here’s an example that you may find helpful: jetson-nano-sample-new/Dockerfile at master · balena-io-playground/jetson-nano-sample-new · GitHub

Thanks for the clarification,
I will try to adjust the image accordingly.
I am still curious as to why the delete-then-download strategy does not appear to work, but I might turn that into another thread.

I managed to postpone the issue by reducing the size of the container to less than half of the available storage.
After creating the multi-stage build, the core problem still persists.
I built 2 images, and then updated from the first to the second.

build 1:
build opencv+cuda / copy opencv+cuda / install the rest

build 2:
build opencv+cuda / "RUN apt clean" / copy opencv+cuda / install the rest

Updating from build 1 to 2 still results in a download as large as the entire image. I am still relatively new to Docker, so is this to be expected?

Possibly relevant note: I now get the following message after building: "Failed to generate deltas due to an internal error; will be generated on-demand"

Hey @jap937

Could you please share the 2 different dockerfiles you used?

Looking at what you mentioned (build opencv+cuda / "RUN apt clean" / copy opencv+cuda / install the rest): the way the Docker build cache works, if you add this command further up your Dockerfile, the cached layers after it are invalidated. This could explain why the delta size is still relatively large.

You can look at the Dockerfile best practices for multi-stage builds and Leverage build cache for more details.

if your build contains several layers, you can order them from the less frequently changed (to ensure the build cache is reusable) to the more frequently changed
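As a toy illustration of that ordering advice, the sketch below models the build cache in Python: each layer's cache key depends on its own instruction and on its parent's key, so one change invalidates everything after it. This is a simplified model under my own assumptions (the real cache also checks file contents for COPY/ADD, and it says nothing about how balena computes deltas); the function names and instruction strings are hypothetical.

```python
import hashlib


def layer_keys(instructions):
    """Return one cache key per instruction; each key also depends on its parent."""
    keys, parent = [], ""
    for instruction in instructions:
        parent = hashlib.sha256((parent + "\n" + instruction).encode()).hexdigest()
        keys.append(parent)
    return keys


def reused_layers(old, new):
    """Count how many leading layers of `new` can be taken from the cache of `old`."""
    reused = 0
    for old_key, new_key in zip(layer_keys(old), layer_keys(new)):
        if old_key != new_key:
            break
        reused += 1
    return reused


base = ["FROM base", "RUN install bsp", "RUN build opencv", "RUN pip3 install Flask"]
changed_late = base[:3] + ["RUN pip3 install Flask==1.1.1"]
changed_early = ["FROM base", "RUN apt clean"] + base[1:]

print(reused_layers(base, changed_late))   # 3
print(reused_layers(base, changed_early))  # 1
```

Changing the last instruction keeps the three layers before it, while inserting a command near the top leaves only the base layer reusable.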

Hope this is helpful

Thanks, this clarifies the delta behaviour a bit more.
I prefer not to share my entire Dockerfiles, but I have created a test to exclude potential mistakes I made.
If this behaviour is to be expected, I will work around it and keep it in mind for future development.

I copied the example multi-stage CUDA Dockerfile (1.9 GB) and added a single line, RUN apt-get update && apt-get clean, at the end of the build stage.
This resulted in the same behaviour, with deltas (mostly) the size of the entire image.

storage bar at idle:
Screenshot from 2021-04-08 11-24-42

storage bar 100% downloaded
Screenshot from 2021-04-08 11-24-08

dockerfile 1

FROM balenalib/jetson-nano-ubuntu:bionic as buildstep

WORKDIR /usr/src/app

# Don't prompt with any configuration questions
ENV DEBIAN_FRONTEND noninteractive

# Install CUDA, CUDA compiler and some utilities
RUN \
    apt-get update && apt-get install -y cuda-toolkit-10-2 cuda-compiler-10-2 \
    lbzip2 xorg-dev \
    cmake wget unzip \
    libgtk2.0-dev \
    libavcodec-dev \
    libgstreamer1.0-dev \
    libgstreamer-plugins-base1.0-dev \
    libjpeg-dev \
    libpng-dev \
    libtiff-dev \
    libdc1394-22-dev -y --no-install-recommends && \
    echo "/usr/lib/aarch64-linux-gnu/tegra" > /etc/ld.so.conf.d/nvidia-tegra.conf && \
    ldconfig && \
    wget https://github.com/opencv/opencv/archive/4.0.1.zip && \
    unzip 4.0.1.zip && rm 4.0.1.zip

RUN \
    wget https://github.com/opencv/opencv_contrib/archive/4.0.1.zip -O opencv_modules.4.0.1.zip && \
    unzip opencv_modules.4.0.1.zip && rm opencv_modules.4.0.1.zip && \
    export CUDA_HOME=/usr/local/cuda-10.2/ && \
    export LD_LIBRARY_PATH=${CUDA_HOME}/lib64 && \
    PATH=${CUDA_HOME}/bin:${PATH} && export PATH && \
    mkdir -p opencv-4.0.1/build && cd opencv-4.0.1/build && \
    cmake -D WITH_CUDA=ON -D CUDA_ARCH_BIN="5.3"  -D BUILD_LIST=cudev,highgui,videoio,cudaimgproc,ximgproc -D OPENCV_EXTRA_MODULES_PATH=../../opencv_contrib-4.0.1/modules -D CUDA_ARCH_PTX="" -D WITH_GSTREAMER=ON -D WITH_LIBV4L=ON -D BUILD_TESTS=ON -D BUILD_PERF_TESTS=ON -D BUILD_SAMPLES=ON -D BUILD_EXAMPLES=ON -D CMAKE_BUILD_TYPE=RELEASE -D WITH_GTK=on -D BUILD_DOCS=OFF -D CMAKE_INSTALL_PREFIX=/usr/local .. && make -j32 && make install && \
    cp /usr/src/app/opencv-4.0.1/build/bin/opencv_version /usr/src/app/ && \
    cp /usr/src/app/opencv-4.0.1/build/bin/example_ximgproc_paillou_demo /usr/src/app/ && \
    cp /usr/src/app/opencv-4.0.1/build/bin/example_ximgproc_fourier_descriptors_demo /usr/src/app/ && \
    cd /usr/src/app/ && rm -rf /usr/src/app/opencv-4.0.1 && \
    mv opencv_contrib-4.0.1/samples/data/corridor.jpg /usr/src/app/ && \
    rm -rf /usr/src/app/opencv_contrib-4.0.1

FROM balenalib/jetson-nano-ubuntu:bionic as final

# Starting with a fresh new base image, but with access to files in previous build

# Uncomment if planning to use libs from here
#COPY --from=buildstep /usr/local/cuda-10.2 /usr/local/cuda-10.2
# Minimum CUDA runtime libraries
COPY --from=buildstep /usr/lib/aarch64-linux-gnu /usr/lib/aarch64-linux-gnu
# OpenCV runtime libraries
COPY --from=buildstep /usr/local/lib /usr/local/lib
# Demo apps
COPY --from=buildstep /usr/src/app/ /usr/src/app/

ENV DEBIAN_FRONTEND noninteractive

# Download and install BSP binaries for L4T 32.4.4
RUN apt-get update && apt-get install -y wget tar lbzip2 python3 libegl1 && \
    wget https://developer.nvidia.com/embedded/L4T/r32_Release_v4.4/r32_Release_v4.4-GMC3/T210/Tegra210_Linux_R32.4.4_aarch64.tbz2 && \
    tar xf Tegra210_Linux_R32.4.4_aarch64.tbz2 && \
    cd Linux_for_Tegra && \
    sed -i 's/config.tbz2\"/config.tbz2\" --exclude=etc\/hosts --exclude=etc\/hostname/g' apply_binaries.sh && \
    sed -i 's/install --owner=root --group=root \"${QEMU_BIN}\" \"${L4T_ROOTFS_DIR}\/usr\/bin\/\"/#install --owner=root --group=root \"${QEMU_BIN}\" \"${L4T_ROOTFS_DIR}\/usr\/bin\/\"/g' nv_tegra/nv-apply-debs.sh && \
    sed -i 's/LC_ALL=C chroot . mount -t proc none \/proc/ /g' nv_tegra/nv-apply-debs.sh && \
    sed -i 's/umount ${L4T_ROOTFS_DIR}\/proc/ /g' nv_tegra/nv-apply-debs.sh && \
    sed -i 's/chroot . \//  /g' nv_tegra/nv-apply-debs.sh && \
    ./apply_binaries.sh -r / --target-overlay && cd .. && \
    rm -rf Tegra210_Linux_R32.4.4_aarch64.tbz2 && \
    rm -rf Linux_for_Tegra && \
    echo "/usr/lib/aarch64-linux-gnu/tegra" > /etc/ld.so.conf.d/nvidia-tegra.conf && ldconfig

RUN apt-get update && apt-get install -y lbzip2 xorg
ENV UDEV=1
ENV LD_LIBRARY_PATH=/usr/local/lib
WORKDIR /usr/src/app/
CMD [ "sleep", "infinity" ]

dockerfile 2

FROM balenalib/jetson-nano-ubuntu:bionic as buildstep

WORKDIR /usr/src/app

# Don't prompt with any configuration questions
ENV DEBIAN_FRONTEND noninteractive

# Install CUDA, CUDA compiler and some utilities
RUN \
    apt-get update && apt-get install -y cuda-toolkit-10-2 cuda-compiler-10-2 \
    lbzip2 xorg-dev \
    cmake wget unzip \
    libgtk2.0-dev \
    libavcodec-dev \
    libgstreamer1.0-dev \
    libgstreamer-plugins-base1.0-dev \
    libjpeg-dev \
    libpng-dev \
    libtiff-dev \
    libdc1394-22-dev -y --no-install-recommends && \
    echo "/usr/lib/aarch64-linux-gnu/tegra" > /etc/ld.so.conf.d/nvidia-tegra.conf && \
    ldconfig && \
    wget https://github.com/opencv/opencv/archive/4.0.1.zip && \
    unzip 4.0.1.zip && rm 4.0.1.zip

RUN \
    wget https://github.com/opencv/opencv_contrib/archive/4.0.1.zip -O opencv_modules.4.0.1.zip && \
    unzip opencv_modules.4.0.1.zip && rm opencv_modules.4.0.1.zip && \
    export CUDA_HOME=/usr/local/cuda-10.2/ && \
    export LD_LIBRARY_PATH=${CUDA_HOME}/lib64 && \
    PATH=${CUDA_HOME}/bin:${PATH} && export PATH && \
    mkdir -p opencv-4.0.1/build && cd opencv-4.0.1/build && \
    cmake -D WITH_CUDA=ON -D CUDA_ARCH_BIN="5.3"  -D BUILD_LIST=cudev,highgui,videoio,cudaimgproc,ximgproc -D OPENCV_EXTRA_MODULES_PATH=../../opencv_contrib-4.0.1/modules -D CUDA_ARCH_PTX="" -D WITH_GSTREAMER=ON -D WITH_LIBV4L=ON -D BUILD_TESTS=ON -D BUILD_PERF_TESTS=ON -D BUILD_SAMPLES=ON -D BUILD_EXAMPLES=ON -D CMAKE_BUILD_TYPE=RELEASE -D WITH_GTK=on -D BUILD_DOCS=OFF -D CMAKE_INSTALL_PREFIX=/usr/local .. && make -j32 && make install && \
    cp /usr/src/app/opencv-4.0.1/build/bin/opencv_version /usr/src/app/ && \
    cp /usr/src/app/opencv-4.0.1/build/bin/example_ximgproc_paillou_demo /usr/src/app/ && \
    cp /usr/src/app/opencv-4.0.1/build/bin/example_ximgproc_fourier_descriptors_demo /usr/src/app/ && \
    cd /usr/src/app/ && rm -rf /usr/src/app/opencv-4.0.1 && \
    mv opencv_contrib-4.0.1/samples/data/corridor.jpg /usr/src/app/ && \
    rm -rf /usr/src/app/opencv_contrib-4.0.1

RUN apt-get update && apt-get clean

FROM balenalib/jetson-nano-ubuntu:bionic as final

# Starting with a fresh new base image, but with access to files in previous build

# Uncomment if planning to use libs from here
#COPY --from=buildstep /usr/local/cuda-10.2 /usr/local/cuda-10.2
# Minimum CUDA runtime libraries
COPY --from=buildstep /usr/lib/aarch64-linux-gnu /usr/lib/aarch64-linux-gnu
# OpenCV runtime libraries
COPY --from=buildstep /usr/local/lib /usr/local/lib
# Demo apps
COPY --from=buildstep /usr/src/app/ /usr/src/app/

ENV DEBIAN_FRONTEND noninteractive

# Download and install BSP binaries for L4T 32.4.4
RUN apt-get update && apt-get install -y wget tar lbzip2 python3 libegl1 && \
    wget https://developer.nvidia.com/embedded/L4T/r32_Release_v4.4/r32_Release_v4.4-GMC3/T210/Tegra210_Linux_R32.4.4_aarch64.tbz2 && \
    tar xf Tegra210_Linux_R32.4.4_aarch64.tbz2 && \
    cd Linux_for_Tegra && \
    sed -i 's/config.tbz2\"/config.tbz2\" --exclude=etc\/hosts --exclude=etc\/hostname/g' apply_binaries.sh && \
    sed -i 's/install --owner=root --group=root \"${QEMU_BIN}\" \"${L4T_ROOTFS_DIR}\/usr\/bin\/\"/#install --owner=root --group=root \"${QEMU_BIN}\" \"${L4T_ROOTFS_DIR}\/usr\/bin\/\"/g' nv_tegra/nv-apply-debs.sh && \
    sed -i 's/LC_ALL=C chroot . mount -t proc none \/proc/ /g' nv_tegra/nv-apply-debs.sh && \
    sed -i 's/umount ${L4T_ROOTFS_DIR}\/proc/ /g' nv_tegra/nv-apply-debs.sh && \
    sed -i 's/chroot . \//  /g' nv_tegra/nv-apply-debs.sh && \
    ./apply_binaries.sh -r / --target-overlay && cd .. && \
    rm -rf Tegra210_Linux_R32.4.4_aarch64.tbz2 && \
    rm -rf Linux_for_Tegra && \
    echo "/usr/lib/aarch64-linux-gnu/tegra" > /etc/ld.so.conf.d/nvidia-tegra.conf && ldconfig

RUN apt-get update && apt-get install -y lbzip2 xorg
ENV UDEV=1
ENV LD_LIBRARY_PATH=/usr/local/lib
WORKDIR /usr/src/app/
CMD [ "sleep", "infinity" ]

Hey there,

Thanks for providing the additional details.

Can you please clarify what you mean by "storage bar at idle"? Is that right after provisioning a device, when there are no releases yet?

From the Dockerfiles you shared, as far as I know, the build cache should not be invalidated. In fact, the change should also not affect the final image size, since the added line only clears the cache in the build-stage image, and only the final-stage image is sent to the device.
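
One way to check this locally is to compare the layer lists of the two final images. The following is only a rough sketch, assuming both Dockerfiles have been built on a machine with Docker and tagged with the hypothetical names app:v1 and app:v2. The helper name count_shared_layers is also just for illustration. It shows whether the final image changed at the layer level; it does not reproduce how the delta server computes deltas.

```shell
#!/bin/sh
# Count how many leading layers two images have in common.
# Arguments: two files, each listing one layer digest per line (base layer first).
count_shared_layers() {
    old_list=$1
    new_list=$2
    # paste joins line N of both files with a tab; awk stops at the first mismatch.
    paste "$old_list" "$new_list" | awk -F '\t' '
        $1 == "" || $1 != $2 { exit }
        { shared++ }
        END { print shared + 0 }'
}

# Example usage (app:v1 and app:v2 are hypothetical local tags):
#   docker image inspect --format '{{range .RootFS.Layers}}{{println .}}{{end}}' app:v1 > v1_layers.txt
#   docker image inspect --format '{{range .RootFS.Layers}}{{println .}}{{end}}' app:v2 > v2_layers.txt
#   count_shared_layers v1_layers.txt v2_layers.txt
```

If the count equals the total number of layers in both lists, the final-stage image is identical at the layer level, and the large delta would have to come from somewhere else.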

I think this looks like an issue with delta generation itself, or some other issue, and I think it’s worth investigating further.

Would you mind provisioning a device with the latest OS image available for this device so we can start from scratch?

Thanks for your help

By idle, I mean before and after the update, when disk usage is minimal.
I will flash a new device with the latest version (2.69+rev1), but I have been reflashing modules frequently with version 2.67.3+rev2 due to another issue.

Thanks, please do get back to us when you have. Happy to help!