/mnt/data is full - not because of application

hey guys,
i am having issues with device UUID 63cd82358b01ebd927a8cc32e07a8889: /mnt/data is full, but it’s not because of application content.

running: [balenaOS 2.48.0+rev1] - i granted support access - can you please point me to what i am doing wrong?

I tried “balena system prune” as advised in another forum post, but it reported 0 B reclaimed.

the same thing is happening with device 8c86b19de34a413f6483a06aa02205a5, but balena engine isn’t running there. i granted support access for that one too.

i would appreciate it if you also told me what you did to fix it, so I don’t have to bother you in the future :slight_smile:

thanks in advance,

Hi, the data partition is indeed full. I had a quick look at its contents, and the bulk of it is Docker application layers. I would normally suggest looking at the balena images and cleaning up unused ones, but balena-engine is not running and cannot start, probably because of the lack of space in the data partition.
My approach would be to remove the Docker layers, restart the engine and the supervisor, and let the application re-download. However, this means removing application data from the device, so it might lead to data loss, and it will also stop the application from working for a while.
There have been roadmap discussions about monitoring disk space and letting the administrator know once usage reaches a threshold, but unfortunately I don’t have an ETA for this.
Let me know how you want to proceed.
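
Until something like that threshold alert exists, a rough check can be run from anywhere Python is available on the device (for example, inside an application container that can see the partition). This is only a minimal sketch: the function names and the 90% default are illustrative assumptions, not part of balenaOS or the supervisor.

```python
import shutil


def disk_usage_percent(path):
    """Return the percentage of the filesystem containing `path` that is in use."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return 100.0 * usage.used / usage.total


def is_over_threshold(path, threshold_percent=90.0):
    """True when usage is at or above the (hypothetical) alert threshold."""
    return disk_usage_percent(path) >= threshold_percent
```

For example, `is_over_threshold("/mnt/data")` could be polled periodically and wired to whatever notification channel is already in use.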

there’s no serious loss if the docker layers are removed and the app re-downloads, so you can proceed. just let me know of the things you did please, so if this re-occurs - i can fix it. thanks again.

Hi again. Since neither the balena engine nor the supervisor was running, I went ahead and removed /var/lib/balena/{overlay2,containers,image,tmp} to free up some space in /mnt/data. After that I restarted the balena engine and re-installed the supervisor, and the application is downloading now.
As I mentioned before, this was an extreme case, as balena-engine had stopped and could not be started. We would usually analyze the Docker layers and remove the ones that are no longer needed.
Once the application finishes downloading the device should be functional. Please let us know if there is any further problem.
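
To see which directories are actually taking up the space before removing anything, something along these lines can help. It is a sketch under assumptions: the helper names are made up for illustration, it sums apparent file sizes rather than allocated disk blocks, and it is not the tooling used by balena support.

```python
import os


def tree_size(path):
    """Sum the apparent sizes (bytes) of all files below `path`, without following symlinks."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(path, onerror=lambda err: None):
        for name in filenames:
            try:
                total += os.lstat(os.path.join(dirpath, name)).st_size
            except OSError:
                # File vanished or is unreadable; skip it.
                pass
    return total


def largest_subdirs(root, top=10):
    """Return up to `top` (size_in_bytes, name) pairs for the immediate subdirectories of `root`, largest first."""
    sizes = []
    with os.scandir(root) as entries:
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                sizes.append((tree_size(entry.path), entry.name))
    sizes.sort(reverse=True)
    return sizes[:top]
```

Running `largest_subdirs("/var/lib/balena")` or `largest_subdirs("/mnt/data")` as root would show whether the layer directories mentioned above are the ones filling the partition.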

07.04.20 19:49:00 (+0200) Supervisor starting
07.04.20 19:49:03 (+0200) Downloading image ‘registry2.balena-cloud.com/v2/53f214ef7d1e0201165421461a43b70d@sha256:0e6d84b1e090126dfe1c5f5cbb877ae6cd88056814709833d7cc075e87020e6d
07.04.20 19:54:50 (+0200) Failed to download image ‘registry2.balena-cloud.com/v2/53f214ef7d1e0201165421461a43b70d@sha256:0e6d84b1e090126dfe1c5f5cbb877ae6cd88056814709833d7cc075e87020e6d’ due to ‘connect ECONNRESET /var/run/balena-engine.sock’
07.04.20 19:55:17 (+0200) Supervisor starting
07.04.20 19:55:21 (+0200) Downloading image ‘registry2.balena-cloud.com/v2/53f214ef7d1e0201165421461a43b70d@sha256:0e6d84b1e090126dfe1c5f5cbb877ae6cd88056814709833d7cc075e87020e6d

doesn’t look like it’s working tho.

i rebooted it, to see if it picks itself up.

nope, it doesn’t work even after rebooting:

07.04.20 20:01:37 (+0200) Supervisor starting
07.04.20 20:01:37 (+0200) Downloading image ‘registry2.balena-cloud.com/v2/53f214ef7d1e0201165421461a43b70d@sha256:0e6d84b1e090126dfe1c5f5cbb877ae6cd88056814709833d7cc075e87020e6d
07.04.20 20:10:18 (+0200) Failed to download image ‘registry2.balena-cloud.com/v2/53f214ef7d1e0201165421461a43b70d@sha256:0e6d84b1e090126dfe1c5f5cbb877ae6cd88056814709833d7cc075e87020e6d’ due to ‘connect ECONNRESET /var/run/balena-engine.sock’
07.04.20 20:11:00 (+0200) Supervisor starting
07.04.20 20:11:01 (+0200) Downloading image ‘registry2.balena-cloud.com/v2/53f214ef7d1e0201165421461a43b70d@sha256:0e6d84b1e090126dfe1c5f5cbb877ae6cd88056814709833d7cc075e87020e6d

Hi, I can see this device is in the process of updating. It’s quite slow, and I’ve been having some issues connecting to the device over the VPN. What are the network conditions like where this device is? That said, it looks like it should complete without any intervention reasonably shortly.

not really, those are the logs i pasted above. the machine gets to around 40-50% and then the download restarts. you can check the timestamps.
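
For reference, the restart loop can be read straight off the timestamps. The sketch below is an illustration only: the function name is made up, the image reference in the sample lines is shortened to `<image>`, and the date is assumed to be in day.month.year order. It pairs each "Downloading image" line with the next "Failed to download image" line and reports how long each attempt ran before failing.

```python
import re
from datetime import datetime

# Matches lines such as: "07.04.20 19:49:03 (+0200) Downloading image ..."
LINE_RE = re.compile(r"^(\d{2}\.\d{2}\.\d{2} \d{2}:\d{2}:\d{2}) \([+-]\d{4}\) (.*)$")

# Timestamps and messages taken from the logs above; image reference shortened.
SAMPLE = [
    "07.04.20 19:49:00 (+0200) Supervisor starting",
    "07.04.20 19:49:03 (+0200) Downloading image <image>",
    "07.04.20 19:54:50 (+0200) Failed to download image <image> due to 'connect ECONNRESET /var/run/balena-engine.sock'",
    "07.04.20 19:55:17 (+0200) Supervisor starting",
    "07.04.20 20:01:37 (+0200) Supervisor starting",
    "07.04.20 20:01:37 (+0200) Downloading image <image>",
    "07.04.20 20:10:18 (+0200) Failed to download image <image> due to 'connect ECONNRESET /var/run/balena-engine.sock'",
]


def failed_download_durations(lines):
    """Return the seconds elapsed between each download start and its subsequent failure."""
    durations = []
    started = None
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue
        timestamp = datetime.strptime(match.group(1), "%d.%m.%y %H:%M:%S")
        message = match.group(2)
        if message.startswith("Downloading image"):
            started = timestamp
        elif message.startswith("Failed to download image") and started is not None:
            durations.append((timestamp - started).total_seconds())
            started = None
    return durations
```

On the sample lines this gives 347 and 521 seconds, i.e. each attempt runs for several minutes before the engine socket connection is reset and the supervisor starts over.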

This looks network-related, in that the download can’t complete and is restarting. In addition, as I noted previously, I’m having a hard time connecting to the device. Does this device have a stable network connection?

yeah there’s 8 other devices on that network. the other 8 updated balenaOS just fine as well as app deployed today. only these 2 i mentioned above are stuck in this download loop. i can provide access to another machine that’s on the same net - working fine if you want.

Hi, I managed to run the device diagnostics on these devices, and they are both experiencing very slow disk writes. This is likely manifesting itself in other issues, such as my inability to reliably connect to the device. What SD cards are the devices running? It is possible they are failing. Do you have the ability to swap out one of the SD cards to test?
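
For a rough comparison between a suspect card and a known-good one, a synced write can be timed by hand. This is a minimal sketch and an assumption on my part, not what the device diagnostics run; the function name and default sizes are illustrative. It needs enough free space for the test file, and it adds a little wear to the card.

```python
import os
import tempfile
import time


def measure_write_throughput(directory, total_mib=16, chunk_mib=1):
    """Write about `total_mib` MiB to a temp file in `directory`, fsync it, and return MiB/s."""
    chunk = os.urandom(chunk_mib * 1024 * 1024)
    chunks = max(1, total_mib // chunk_mib)
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as handle:
            for _ in range(chunks):
                handle.write(chunk)
            handle.flush()
            os.fsync(handle.fileno())  # force the data to the device, not just the page cache
        elapsed = time.perf_counter() - start
    finally:
        os.remove(path)  # always clean up the test file
    written_mib = chunks * chunk_mib
    return written_mib / max(elapsed, 1e-9)
```

Running `measure_write_throughput("/mnt/data")` on one of the stuck devices and on one of the healthy devices on the same network would give a like-for-like number to compare.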

yeah we are going to replace the sd cards. meanwhile i did a rm on /mnt/data/docker and issued a reboot from the console, so everything gets recreated. so far it’s downloading fine. will update once it’s done.