Failed to download image; no space left on device, jetson agx xavier

avinash · February 20, 2021, 1:50am

Hi,
I’m new to balenaCloud. I just added a Jetson AGX Xavier device to my application, flashed the Jetson using jetson-flash with this image (balena-cloud-XAVIER-jetson-xavier-2.67.3+rev5-dev-v12.3.0.img), and pushed the Hello World C++ sample app using balena-cli. The push command was successful and the device’s target release was updated. However, when the device tries to download the image, I get the following error:
Failed to download image 'registry2.balena-cloud.com/v2/1c20ca3cbf0f43b574ed193512392af1@sha256:7362eefcd868a06e46ad8e046f79a4ce904af0309459ec103ccd5edd4e418802' due to 'failed to register layer: Error processing tar file(exit status 1): write /usr/lib/aarch64-linux-gnu/libcrypto.so.1.1: no space left on device'
So it looks like the docker pull is failing because the device has run out of storage space.
I believe the Xavier should have enough storage to download the image and run it.
Can someone help me with this? Thanks in advance.

the-real-kenna · February 20, 2021, 2:50am

Hi there,
Can you confirm if this is where you’re getting the Hello World code and the instructions you’ve been following up to this point? Get started with Nvidia Jetson Xavier and C++ - Balena Documentation
Thanks!

avinash · February 20, 2021, 2:55am

Yes, I have been following the instructions from that link and I downloaded the hello world code from the link in that guide. However, the Dockerfile.template included in the zip file that I downloaded was using Resin which did not work, so I replaced resin with balenalib to get the build to work.

the-real-kenna · February 20, 2021, 2:59am

Thanks for confirming, and for telling me about the outdated reference to Resin - I’ll work on getting that updated.

avinash · February 20, 2021, 3:03am

I don’t know if this helps, but I thought I’d share it anyway. I don’t know why the storage shows only 169 MB. [Screenshot: balena_dashboard]

the-real-kenna · February 20, 2021, 3:04am

That file is really small, and wouldn’t have taken up enough space to cause the error you’re seeing. Neither should our base image since we built it for the Xavier. I’m wondering if there was a hiccup during the pull… is there any chance you could try pulling it again for me to see if you see the same? That error feels like something else is going on, but I would still like to give a second pull a shot if you have the ability…

the-real-kenna · February 20, 2021, 3:05am

Oh interesting…

the-real-kenna · February 20, 2021, 3:06am

Could you go to the Diagnostics page and run the Healthchecks to see if it’s showing anything wrong with storage?

avinash · February 20, 2021, 3:12am

I’m running the health check now. The Xavier did try pulling multiple times. I also tried rebooting, rebuilding the image, and pushing it again, but it failed every time.

avinash · February 20, 2021, 3:16am

I get the following error when I run the healthchecks:
An error occurred while querying checks data:
Bus n/a: changing state UNSET → OPENING
Bus n/a: changing state OPENING → AUTHENTICATING
Bus n/a: changing state AUTHENTICATING → RUNNING
Sent message type=method_call sender=n/a destination=org.freedesktop.systemd1 path=/org/freedesktop/systemd1/unit/balena_2eservice interface=org.freedesktop.DBus.Properties member=Get cookie=1 reply_cookie=0 signature=ss error-name=n/a error-message=n/a
Got message type=method_return sender=org.freedesktop.systemd1
Health check output json:
{"diagnose_version":"4.20.23","checks":[{"name":"check_balenaOS","success":true,"status":"Supported balenaOS 2.x detected"},{"name":"check_container_engine","success":true,"status":"No container_engine issues detected"},{"name":"check_localdisk","success":true,"status":"No localdisk issues detected"},{"name":"check_memory","success":true,"status":"93% memory available"},{"name":"check_networking","success":true,"status":"No networking issues detected"},{"name":"check_os_rollback","success":true,"status":"No OS rollbacks detected"},{"name":"check_service_restarts","success":true,"status":"No services are restarting unexpectedly"},{"name":"check_supervisor","success":false,"status":"Supervisor is running, but may be unhealthy"},{"name":"check_temperature","success":false,"status":"Some temperature issues detected: \ntest_current_temperature Temperature above 80C detected (/sys/class/thermal/thermal_zone4)"},{"name":"check_timesync","success":true,"status":"Time is synchronized"}]}
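For anyone reading a report like this later, here is a minimal Python sketch that pulls out only the failing checks from the health check JSON. It assumes the report keeps the shape shown above (a top-level "checks" list with "name", "success", and "status" fields); the function name failed_checks and the abbreviated sample string are made up for illustration and are not part of any balena tooling.

```python
import json


def failed_checks(report_json: str) -> list[tuple[str, str]]:
    """Return (name, status) for every check whose "success" flag is not true."""
    report = json.loads(report_json)
    return [
        (check["name"], check["status"])
        for check in report.get("checks", [])
        if not check.get("success", False)
    ]


# Abbreviated sample: two of the checks from the report in this thread.
sample = (
    '{"diagnose_version":"4.20.23","checks":['
    '{"name":"check_localdisk","success":true,'
    '"status":"No localdisk issues detected"},'
    '{"name":"check_supervisor","success":false,'
    '"status":"Supervisor is running, but may be unhealthy"}]}'
)

for name, status in failed_checks(sample):
    print(f"{name}: {status}")
```

Run against the full report above, this would list check_supervisor and check_temperature, the two entries with "success":false.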

avinash · February 20, 2021, 3:25am

I don’t know if this helps, but here is the df -h output when I SSH into the Xavier:

root@a6e8f9e:~# df -h
Filesystem                      Size  Used Avail Use% Mounted on
devtmpfs                         16G     0   16G   0% /dev
tmpfs                            16G  172K   16G   1% /tmp
/dev/disk/by-state/resin-rootA  461M  336M   98M  78% /mnt/sysroot/active
/dev/disk/by-state/resin-state   19M  391K   17M   3% /mnt/state
overlay                         461M  336M   98M  78% /
/dev/mmcblk0p41                 170M   79M   79M  51% /mnt/data
tmpfs                            16G     0   16G   0% /dev/shm
tmpfs                            16G   39M   16G   1% /run
tmpfs                            16G     0   16G   0% /sys/fs/cgroup
/dev/mmcblk0p37                 120M   54M   66M  45% /mnt/boot
tmpfs                            16G   28K   16G   1% /var/volatile
/dev/mmcblk0p39                 461M  2.3M  431M   1% /mnt/sysroot/inactive
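The line that matters here is /mnt/data, which is only 170M. As a rough sketch, the Python below picks a single mount point out of pasted df -h text so the data partition size can be compared across re-flashes. It is meant to run on a workstation against copied output, not on the device; the function name find_mount and the shortened sample are hypothetical.

```python
def find_mount(df_output: str, mount_point: str):
    """Return the df row for mount_point as a dict, or None if it is absent."""
    for line in df_output.strip().splitlines()[1:]:  # skip the header row
        fields = line.split(None, 5)  # at most 6 columns; the last is the mount path
        if len(fields) == 6 and fields[5] == mount_point:
            filesystem, size, used, avail, use_pct, _ = fields
            return {
                "filesystem": filesystem,
                "size": size,
                "used": used,
                "avail": avail,
                "use_pct": use_pct,
            }
    return None


# Shortened sample copied from the df -h output above.
SAMPLE_DF = """\
Filesystem                      Size  Used Avail Use% Mounted on
overlay                         461M  336M   98M  78% /
/dev/mmcblk0p41                 170M   79M   79M  51% /mnt/data
/dev/mmcblk0p37                 120M   54M   66M  45% /mnt/boot
"""

row = find_mount(SAMPLE_DF, "/mnt/data")
if row is not None:
    print(f"{row['filesystem']}: size={row['size']} avail={row['avail']}")
```

With only 79M available on /mnt/data, a layer containing libcrypto and the rest of a base image would plausibly not fit, which is consistent with the "no space left on device" error.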

the-real-kenna · February 20, 2021, 3:26am

Okay, darn. At first I thought maybe it was a partitioning error that happened during the imaging process, but now I’m wondering if that’s not the case. I’m sending this question to our broader team since I see both high temperature and Supervisor warnings. I also see you’re working with Ross on our Customer Success team, so I’ve shared this thread with him for visibility on his end. We’ll get back to you as soon as the broader team has had a chance to review this and can provide some suggestions.

avinash · February 20, 2021, 3:29am

Ok, thank you for looking into this. The Xavier actually feels pretty cool to the touch, whereas it used to get a lot warmer when I was working with the JetPack image. So I think that diagnostic message may not be accurate. I have shared the thread with Ross as well.

the-real-kenna · February 20, 2021, 3:37am

That’s interesting… 80C would definitely not be cool to the touch. Something’s off for sure. Thanks for sharing the extra info Avi. We’ll be in touch with more info (and probably more questions) soon.

avinash · February 20, 2021, 3:39am

Ok, thank you! I’ll try re-flashing the image in the meantime, since I haven’t tried that yet.

avinash · February 20, 2021, 3:55am

I tried re-flashing, but unfortunately it behaves the same. Same error when it tries to pull the image and same errors when I run diagnostics. And the storage size is still ~170MB (/dev/mmcblk0p41 170M 79M 79M 51% /mnt/data)

the-real-kenna · February 20, 2021, 4:00am

Thanks for sharing that, it’s very good info to have as we continue troubleshooting and trying to determine root cause.

In addition, I had a nagging feeling that I had seen a high temperature listed on NVIDIA devices before, along with a note from NVIDIA somewhere that it was considered normal, and I finally found the reference to it. Interestingly, when thermal_zone4 is read using the driver value, there are some cases where it can be fixed to indicate a normal status. I don’t know more about this than what I’m seeing here, but I wanted to share it with you while I was thinking about it, in case something similar applies to the Xavier AGX. thermal_zone4 reports 100 degree celcius ? - #5 by sjlin - Jetson TX1 - NVIDIA Developer Forums

avinash · February 20, 2021, 4:18am

I don’t know what to make of that either, since I don’t know which thermal zone your diagnostic queries to generate that report. However, I can confirm that the thermal_zone4 type on the AGX Xavier is PMIC-Die and that it has a temp value of what I think is 100C.

root@3c882a0:~# cat /sys/class/thermal/thermal_zone0/type
CPU-therm
root@3c882a0:~# cat /sys/class/thermal/thermal_zone1/type
GPU-therm
root@3c882a0:~# cat /sys/class/thermal/thermal_zone2/type
AUX-therm
root@3c882a0:~# cat /sys/class/thermal/thermal_zone3/type
AO-therm
root@3c882a0:~# cat /sys/class/thermal/thermal_zone4/type
PMIC-Die
root@3c882a0:~# cat /sys/class/thermal/thermal_zone4/temp
100000
root@3c882a0:~# cat /sys/class/thermal/thermal_zone0/temp
28500
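The Linux thermal sysfs interface reports temp in millidegrees Celsius, so 100000 is 100 °C for PMIC-Die and 28500 is 28.5 °C for CPU-therm. Here is a small Python sketch that lists every zone with its type and converted temperature in one pass, instead of running cat on each file. It assumes Python is available wherever it runs and that each zone directory has type and temp files as shown above; read_thermal_zones is a made-up helper name.

```python
from pathlib import Path


def read_thermal_zones(base="/sys/class/thermal"):
    """Map zone name -> (type, degrees Celsius) for each readable thermal zone."""
    zones = {}
    for zone in sorted(Path(base).glob("thermal_zone*")):
        try:
            kind = (zone / "type").read_text().strip()
            millidegrees = int((zone / "temp").read_text().strip())
        except (OSError, ValueError):
            continue  # skip zones that are missing files or cannot be read
        zones[zone.name] = (kind, millidegrees / 1000.0)
    return zones


if __name__ == "__main__":
    for name, (kind, celsius) in read_thermal_zones().items():
        print(f"{name} ({kind}): {celsius:.1f} C")
```

A diagnostic that flags anything above 80C would trip on the PMIC-Die reading even though the CPU zone is at 28.5 °C, which would match the device feeling cool to the touch.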

the-real-kenna · February 20, 2021, 4:37am

That’s great to know, thanks for sharing Avi.

In addition, we’ve got a PR set up to update the Getting Started page. Thanks again for letting us know it was outdated. Outdated base image referenced in Getting Started for Jetson Xavier · Issue #1595 · balena-io/docs · GitHub

rosswesleyporter · February 20, 2021, 11:32pm

Hello Avi. As mentioned, I asked our devices team for their insights on the storage question and we’ll let you know as soon as we have a suggestion for you. BTW, thanks for sending such good diagnostic information.
