/mnt/data/docker/aufs/diff is full

Did you run that cleanup script from the host OS?

Yes, of course.

AFAIK it’s the only way to access the large build-up of non-cleared container layers.

Can you also make sure your application itself does the cleanup?

  1. It’s not the application’s job to clean up after resin-supervisor
  2. It’s not possible for the container to access the host OS’s root filesystem and other resources in order to perform these operations without major hacks.
  3. The bash script approach is horrifically hacky

What I meant was: if your application is writing a lot of data in the user container, then it should also do some housekeeping. There’s no need for the user container to do anything in the host OS; just make sure your application is not filling up the space in its own container.

I don’t think you understand the problem; see the ls output provided, which shows there are over 500 excess layers. All of these are previous instances of the container’s layers.

This has nothing to do with data written from the container.

I understand the issue. Let me ask the supervisor guys whether it’s possible for the /var/lib/balena/aufs/diff folder to contain this many layers of the user container application. This location is supposed to hold all of the layers your application container contains, so it’s normal for there to be more than one there; I’m just not sure it can reach that many. Also, did you do many updates to your application? That may explain the large number of writable layers in that directory.

It’s entirely possible for this device to have had a lot of releases, probably around 50-100. It’s been around since near the start of our project.

Our container had 5 layers last I checked. Two layers are ours (RUN & CMD), with the RUN layer being invalidated on every build (resin-nocache) and delta updated.

Hey @SplitIce, looking at the output of ls, it appears that all of the leftover diffs are from July 4th and 5th. It could be possible that the supervisor or balena got into a weird state with a release from around that time frame.

Can you remember any weirdness going on at around that time? Perhaps a bug in the application code that created more data than usual. Regardless, this is something that should have been handled automatically, and we’re going to investigate.

In the meantime, a fix for this is to clear out the docker directory. Unfortunately, without knowing which diff is for which container, you’d have to remove the user images and containers too. I can do this for you if you like, with a dashboard link and support access enabled. Alternatively, and for the benefit of the thread, the commands would be:

systemctl stop resin-supervisor
systemctl stop balena
rm -rf /var/lib/docker/{aufs,diff,overlay,containers,image,tmp}
systemctl start balena
update-resin-supervisor
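
To see how much space the leftover layers are actually consuming before and after the cleanup, a quick check from the host OS helps (a rough sketch; on overlayfs devices adjust the path to overlay or overlay2):

df -h /mnt/data
du -sh /var/lib/docker/aufs/diff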

All our devices have some degree of wastage in this folder (or /var/lib/docker/overlay for those running overlayfs). Skimming some of those, I can see this hub was alone in its storage of July 4th and 5th. It’s possible the internet was unstable that day or something.

If I were to guess, I’d say the supervisor is not cleaning up old container layers.
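
One rough way to sanity-check that from the host OS is to compare what is on disk with what the engine actually tracks (a sketch only; the engine CLI may be balena, balena-engine or docker depending on the OS version, and aufs diff IDs are cache IDs, so the counts are only indicative):

# on-disk layer directories, excluding the -init ones
ls -1 /var/lib/docker/aufs/diff | grep -v -- '-init$' | wc -l
# images and containers the engine currently knows about
balena images --all | wc -l
balena ps --all | wc -l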

We are still seeing this with Resin OS 2.14.3+rev5 / supervisor 7.19.4 (although the storage utilisation has moved to /mnt/data/docker/overlay2)

Had a device with a 330MB application and 600MB of data hit 100% (8GB storage) today.

Tracked it down to excess overlays in /mnt/data/docker/overlay2. At a guess, it’s failed updates not being cleared, as this device has had periods of instability (it’s a staging device).
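
For anyone hitting the same thing, the quickest way I found to confirm it is leftover layers rather than application data is to look at the biggest directories directly (a sketch; sizes are in KB so it also works with the busybox tools on the host OS):

du -sk /mnt/data/docker/overlay2/* 2>/dev/null | sort -n | tail -n 20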

Hello @SplitIce

Can you please enable support access and provide us the device uuid in a private message?

@SplitIce I’d recommend using a latest balenaOS version. v2.14.3 was taken down from production and shouldn’t be used. There was an issue where balena-engine would keep trying to download updates and fail…
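
If the engine does get stuck in a download-and-retry loop like that, the journal on the host OS is the first place to look (a sketch; unit names vary by OS version, e.g. resin-supervisor on older releases and balena-supervisor on newer ones):

journalctl -u balena -n 200 --no-pager
journalctl -u resin-supervisor -n 200 --no-pager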

Hi,

We are experiencing this same issue, and don’t really understand why it is happening.
Yesterday we cleaned up the folder:
/mnt/data/docker/aufs/diff

and at that point we had 550M of storage used; today the storage had already filled up again, and /mnt/data/docker/aufs/diff was around 6.4G.

Does anyone have a clue why this happens? We did no deploys during this period.

Thanks

Here is a printout of /mnt/data/docker/aufs/diff:

root@b4bb522:/mnt/data/docker/aufs/diff# ls -lat
total 316
drwxr-xr-x 9 root root 32768 Jul 5 21:04 cb37f1b96eb3a18d64fa959bfdc88da85e79347e24578d5a5e4cb07815ccb5a1
drwx------ 70 root root 12288 Jul 5 21:04 .
drwxr-xr-x 6 root root 4096 Jul 5 14:07 cb37f1b96eb3a18d64fa959bfdc88da85e79347e24578d5a5e4cb07815ccb5a1-init
drwxr-xr-x 7 root root 4096 Jul 5 13:02 3ad7d9dd991818415e1bf415a236128bb5ab994641f70c6f800080a0d203879d
drwxr-xr-x 6 root root 4096 Jul 5 13:02 3ad7d9dd991818415e1bf415a236128bb5ab994641f70c6f800080a0d203879d-init
drwxr-xr-x 3 root root 4096 Jul 5 13:02 98fc9a9aaef501f60bae2a7b762df6e7b3721661f06e0d030d922fbce02a06ac
drwxr-xr-x 9 root root 4096 Jul 5 13:01 bd53dbcff43ce49fdddb664a2d5885442ba803c407b862f17baebd631d36f69f
drwxr-xr-x 6 root root 4096 Jul 5 13:01 bd53dbcff43ce49fdddb664a2d5885442ba803c407b862f17baebd631d36f69f-init
drwxr-xr-x 5 root root 4096 Jul 5 11:39 f770c7917c43e64c034468437821bd337fc0956ee5efc82b41ea60939ecf06fc
drwxr-xr-x 8 root root 4096 Jul 5 11:39 f770c7917c43e64c034468437821bd337fc0956ee5efc82b41ea60939ecf06fc-init
drwxr-xr-x 10 root root 4096 Jul 5 11:39 d4d7acd3f3f6489ae04631ed32aba91e3ad2acc90d3dae34137469230e61f911
drwxr-xr-x 19 root root 4096 Jul 5 10:47 f19c0fb62faa6dcdc11fd968aa40febab83577bd8e16105474eaa9ec3c565a13
drwxr-xr-x 9 root root 4096 Jul 5 10:00 32a79cdc1adacda7332854301d36e6124573f2c773b2fd7029d74d91f8311ba7
drwxr-xr-x 3 root root 4096 Jul 5 10:00 d31e54dcf13d5c7d9c7b31fa7cf59f2a61866f6ea9021a2666babecd850e5fa8
drwxr-xr-x 3 root root 4096 Jul 5 10:00 9f1181bdec039958e45e831e8698dee29ecfe0aca8e8f2178da938fc900d87d7
drwxr-xr-x 3 root root 4096 Jul 5 10:00 8f4d16a7dbab60b4d3fe17fd38a0e2c8890bedc115f8d4fe8c5f85430846cb3a
drwxr-xr-x 3 root root 4096 Jul 5 10:00 660746d5e74731d0377df7b07989360f657f9987f9286e74814028f44185540a
drwxr-xr-x 5 root root 4096 Jul 5 10:00 0e73b56a2e8cd974b965cfb3ebf6767e4ac568eee20d09fc087f867ee06661d1
drwxr-xr-x 3 root root 4096 Jul 5 10:00 051b2f467a9e2a046aaa6215637bec849b07c4d06669f6d1411221411fa3caf7
drwxr-xr-x 3 root root 4096 Jul 5 10:00 aecf0836207880b5ebcd6b1b0f397a7c673ae4d57d8587c3539f4fdd71f18359
drwxr-xr-x 3 root root 4096 Jul 5 10:00 8bad7c4b5a13d41695b9946ddbbb5997c7f386cbd1699ba9f337e3ad50de4984
drwxr-xr-x 3 root root 4096 Jul 5 10:00 cbe81c768fce8e923414efaed12fdda8bbcede758ad8b4c23156c8fd81a47ba0
drwxr-xr-x 8 root root 4096 Jul 5 09:59 163de25ad07e83d52f71fb6b1de476fb9956a0588527caefa103475df4fe068e
drwxr-xr-x 6 root root 4096 Jul 5 09:59 3e0f82968bf562cb1a295690942343080631d16c9800df25158b1588b67c04c5
drwxr-xr-x 4 root root 4096 Jul 5 09:59 19c63668e805186defafb261fe25dccc19414f380ad048cc056cc9d74d528f50
drwxr-xr-x 4 root root 4096 Jul 5 09:59 24c89ff16a97c04243136e2ab826fc8b32632ed7c693accc6e6ea4cd28515a38
drwxr-xr-x 5 root root 4096 Jul 5 09:59 33263617eda49f7303f2f28e69b7d34c5b56dfd0498c0aaf452910447d5db460
drwxr-xr-x 3 root root 4096 Jul 5 09:59 81395918e9b3a15742bdd99cab76f115d401f6db6207f91ff0e904a20a91bc3d
drwxr-xr-x 3 root root 4096 Jul 5 09:59 ed2ba6fe155d732cec5ef456d1a9b09e27fe0119bf0a8317d132cb0303997857
drwxr-xr-x 3 root root 4096 Jul 5 09:59 5c083aae1f5f54086f2b9f35516a18897a4a9e970b85b83eebca32faa954a2a3
drwxr-xr-x 3 root root 4096 Jul 5 09:59 23b31574c4e5ba972e80211b0820ea07a1923ed5ce29ac8342b0aacf8a511e27
drwxr-xr-x 8 root root 4096 Jul 5 09:59 7a6c321d6884c57207a94684fd93537ee29b73e6ab19e4dba4e707d48621d9c5
drwxr-xr-x 8 root root 4096 Jul 5 09:59 c0dc41d8ddaa66e5f7479c2b59cc6d9b2e4a8870d0eb005cff97e039cc63995c
drwxr-xr-x 3 root root 4096 Jul 5 09:59 f1580939eb06a71be4e0d7aff025b1f77d20bb42ee29e12837028a638deadeb8
drwxr-xr-x 3 root root 4096 Jul 5 09:59 6f5c696dcad3fedba8b9036f0d85c7d58b4d78c24e9cfd953535e2e7b7c67d2b
drwxr-xr-x 3 root root 4096 Jul 5 09:59 4d1c08c90e8e1cfe8fa7ada94fa13023b55d5276be4c0488acfc92979750e7fa
drwxr-xr-x 3 root root 4096 Jul 5 09:59 d06b885e2f2721f325d5b2a25a7bceef75ce4b1596c02dcae3535f183ceeea37
drwxr-xr-x 3 root root 4096 Jul 5 09:59 49bd282745692f095f0c944f748ec73a0e6062259b13516e9c800492864e83a8
drwxr-xr-x 3 root root 4096 Jul 5 09:59 f19b033d5b9698fc06a9998ba07368490e47a527e1a891bb980a8385b8f6e653
drwxr-xr-x 3 root root 4096 Jul 5 09:59 f2911a9e387807664db9484ca4a8629ff40f17c3b963c58ecc032c96e7dede03
drwxr-xr-x 3 root root 4096 Jul 5 09:59 303b6e201bbba217cfce6b6051bb34795bda6c8edd3694634da05f2832cc20c5
drwxr-xr-x 3 root root 4096 Jul 5 09:59 ffbff63de064a8c4bb92ba7b764308351e29d3734e8f3e7c352f439a09605813
drwxr-xr-x 3 root root 4096 Jul 5 09:59 30f34007a7a9df55b1f971048de7e1130795ada779e4062861143fa84096d02a
drwxr-xr-x 3 root root 4096 Jul 5 09:59 ea8ff9dea0de2f563856de081207c2bba1aff12c9fed4b1b090137a8957661eb
drwxr-xr-x 3 root root 4096 Jul 5 09:59 993c945e6417d12cd83a36228eb6b41d340e20dd38a5ce894584ed67ef3fe17f
drwxr-xr-x 9 root root 4096 Jul 5 09:59 a0c3ecc61797e33e6c8a37007f5e9869071d8b1a0631ce1a9313304af4dec792
drwxr-xr-x 9 root root 4096 Jul 5 09:58 376dc9d2c7748b34e8a84b005344107e1cbf88fb6861a31ddc56e7b78564acfb
drwxr-xr-x 3 root root 4096 Jul 5 09:58 f2b93b66aaaa6486a1f57d66a05c7637d5a1fe7a2f90e835c5f8272ca53ffdf3
drwxr-xr-x 3 root root 4096 Jul 5 09:58 462348c2b8a0af7f1a6640b20872a7f04acd71c3d1c8cfdd03e2a0c9b42cabc8
drwxr-xr-x 19 root root 4096 Jul 5 09:58 a6a063509f905fe215798ff40f114790efc2b61fe204709a581b07ed3fc7f149
drwxr-xr-x 19 root root 4096 Jul 5 09:58 0a4582dac868a9e7324c358e789720522ab7afaad693d9ae40aefaa822b268b1
drwxr-xr-x 9 root root 4096 Jul 5 09:58 e1ea12c788e203cd6c1749f9df33e53e6cbbd8347d3435d878218f57049b5ed2
drwxr-xr-x 6 root root 4096 Jul 5 09:58 e1ea12c788e203cd6c1749f9df33e53e6cbbd8347d3435d878218f57049b5ed2-init
drwxr-xr-x 4 root root 4096 Jul 5 09:58 07df1fd481bf6680f846bc435261d21e6707dfe7e424eb89e77d2dacd7a5b04a
drwxr-xr-x 3 root root 4096 Jul 5 09:58 e84442ec3bb6120df3b265e9d410a7ecc21f0e09ae31e16a3e54a5164c65e985
drwxr-xr-x 4 root root 4096 Jul 5 09:58 894bf77debf7734f75c1ef5c8d480e6bec702b9ed3e03eeaa9f1fc9785a589f1
drwxr-xr-x 3 root root 4096 Jul 5 09:58 449a4fea1469486a99e780cc526b1a78cb0531d245b76c6736ab6a26c9c4e828
drwxr-xr-x 3 root root 4096 Jul 5 09:58 f800dc5f00322c8bcb4daf222c3dba663527e36ff4af75019b5c0c0006c187f5
drwxr-xr-x 3 root root 4096 Jul 5 09:58 4d836c20be1c179f108f64f56e8caca558425fab136f6012c385caaf18f1161d
drwxr-xr-x 3 root root 4096 Jul 5 09:58 3e20089aea27e96de64bd22499a74d74428961de45d6019cce0ae1dce3477cd6
drwxr-xr-x 3 root root 4096 Jul 5 09:58 b0f1b8e4cfbf22460648b6aeea55f46bdc2d8eaec6ebb9c9669ba0be182aa88d
drwxr-xr-x 3 root root 4096 Jul 5 09:58 a0025101dd313cbf40188d4dc63f2aacbaf8b13fc35c04ac313b9343cd7218af
drwxr-xr-x 9 root root 4096 Jul 5 09:58 e6009ee2bc79c1bffab0586b6180cc835a3feb80314097f8b4d48c6750f18dd5
drwxr-xr-x 3 root root 4096 Jul 5 09:57 811268151cd295e380dd14988127fa8c482a19d2f4a98a8373b20b8b607e8b2f
drwxr-xr-x 3 root root 4096 Jul 5 09:57 82442018ecc38ae325d84b58d64ee77744ee80c8ee56b2a853ab4236df6dcc24
drwxr-xr-x 6 root root 4096 Jul 5 09:57 cde1cea1d288d0565f32ed75008e7c3192ccc5a4902f2da8816e6178251c8df9
drwxr-xr-x 3 root root 4096 Jul 5 09:57 1a2945c92e9b327f672389d2e109d5dd27f12dd3f661ee9876d4e2d0fba23f70
drwxr-xr-x 19 root root 4096 Jul 5 09:57 94b9091d9ff37d521e2385ee9fb47cd9749e7b0f6c153a98424adf5155b1b76f
drwxr-xr-x 2 root root 4096 Jul 5 09:57 be6f166af0475e0e1d165df315a90e0639308583a5ae5386429d460a44ff97ec

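If you want to check whether a given diff directory is still referenced by the engine before deleting anything, you can grep for its ID under the engine’s image metadata (a sketch; the layerdb layout is engine-internal and may differ between versions, and no match suggests the diff is orphaned):

grep -rl cb37f1b96eb3 /mnt/data/docker/image/aufs/layerdb/ 2>/dev/null
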
Whoever comes upon this set of commands in the future, please exercise caution when attempting them, as they will result in data loss and may have unintended consequences. They are meant as a “nuke everything” option when no other debug attempt works. Instead, please make a post in the forums or use our paid support, and balena support agents or the community will attempt to help.

If you must use the commands, though, the correct sequence is as below:

systemctl stop balena-supervisor
systemctl stop balena
rm -rf /var/lib/docker/{aufs,overlay2,containers,image,tmp}
systemctl start balena
update-balena-supervisor

Corrected typo: overlay → overlay2.
Replace balena with resin if you have an older OS. To verify whether resin or balena works, run the first command only and view the results. If it errors, replace balena with resin.
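
If you are not sure which names apply on your device, you can also list the units first (a sketch):

systemctl list-unit-files | grep -E '(balena|resin)'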