I’ve created a small benchmark application that uses TensorFlow Lite to run 100 inferences on a toy neural network (model.tflite). The benchmark demonstrates a significant slowdown (75%) on the Jetson Nano when upgrading balenaOS from 2.45 to 2.47 (production images). I have no idea what is causing it; it seems that calling the invoke function does something in the OS that got much worse when moving from 2.45 to 2.47. I’m not very familiar with the OS itself, so it would be great if anybody has an idea what’s causing it and, more importantly, whether and how it could be fixed. To reproduce my results (see below), use the following commands (assuming you have an appropriate balena application <app-name> set up):
After doing 100 inferences, the benchmark script will print out the minimum, maximum and average time it takes per inference. My results are pasted here:
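For reference, the timed loop inside such a benchmark, sketched here with the tflite_runtime Python API, looks roughly like the following (a hypothetical minimal version, not my exact script; it assumes the tflite_runtime package is installed in the container and that model.tflite sits in the working directory):

import time
import numpy as np
from tflite_runtime.interpreter import Interpreter

# Load the toy model and prepare a random input of the expected shape/type.
interpreter = Interpreter(model_path="model.tflite")
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
dummy = np.random.random_sample(inp["shape"]).astype(inp["dtype"])

# Time only the invoke() call, 100 times.
timings = []
for _ in range(100):
    interpreter.set_tensor(inp["index"], dummy)
    start = time.perf_counter()
    interpreter.invoke()
    timings.append(time.perf_counter() - start)

print("min: %.2f ms" % (min(timings) * 1000))
print("max: %.2f ms" % (max(timings) * 1000))
print("avg: %.2f ms" % (sum(timings) / len(timings) * 1000))

Only the invoke() call is timed, which is the call that appears to have slowed down between the two OS versions.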
I can’t find 2.47.1+rev1 in the dashboard for the nano, only 2.47.1+rev3. A few questions to clarify:
Do you use a custom build?
Did you flash 2.47.1+rev3 to the SD card, or did you do a hostOS update to 2.47.1+rev3?
Is the sample app running on the Nano itself, or on an M.2 Coral TPU connected to the Nano?
Tags 2.45.1+rev3 and 2.47.1+rev1 are both on L4T 32.2, so there shouldn’t be any difference between them; however, 2.47.1+rev3 is a production image that upgraded to L4T 32.3.1.
I wonder if it’s L4T 32.3.1 that’s causing the slowdown with that particular tflite version.
Sorry for the typo, I indeed meant 2.47.1+rev3. Regarding your questions:
No, we are not using a custom build. It’s the Jetson Nano Dev Kit.
For both runs, I flashed the OS to (the same) SD card.
The sample app is running on the CPU of the Nano itself. No peripherals were attached.
A change in L4T version could definitely impact it! My initial thought was that the power settings had changed; the power mode of the device can be adjusted, and perhaps 2.47.1 is using a very conservative mode by default.
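If it were the power model, that should show up in the clocks. As a quick check (hypothetical; not something I have run on these exact images), the CPU governor and frequency limits could be read from the standard Linux cpufreq sysfs nodes on both OS versions and compared:

from pathlib import Path

# Standard cpufreq nodes; these are normally world-readable from inside
# the container. Compare the output between 2.45 and 2.47.
cpufreq = Path("/sys/devices/system/cpu/cpu0/cpufreq")
for name in ("scaling_governor", "scaling_cur_freq",
             "scaling_min_freq", "scaling_max_freq"):
    node = cpufreq / name
    if node.exists():
        print(name, "=", node.read_text().strip())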
I’ve tried your example on my sd-card Jetson Nano, both with a 5V 5A barrel-jack adapter and with a 2A charger on the micro-USB port. I ran the tests directly on 2.47.1+rev3:
Hi @acostach, I’m surprised to see that you do get good results, even though we are both using a Jetson Nano Dev kit and the same software. I’ve received another Jetson Nano Dev kit today, and I still get the lower inference speeds on balenaOS 2.47. This time, I have powered the board from USB (5V, 2A). I’ve also taken a snapshot of top while it was running the benchmark in 2.47:
Yes, I’ve tried a few more times, but I don’t see this at all. By the way, I think a barrel jack with a 5V 4A adapter is usually recommended for the Nano.
@pdboef so, do you have a B01 devkit? I received one today and could reproduce the slowdown in your app, but only with the B01; the A02 revision does not exhibit this.
Until a new image is available, a short-term solution is to take another SD card and flash the 32.3.1 L4T Driver Package (BSP) using:
sudo ./flash.sh jetson-nano-qspi-sd mmcblk0p1
You will see the partition names and what’s written to them:
[ 45.1447 ] Writing partition RP1 with tegra210-p3448-0000-p3449-0000-b00.dtb.encrypt
...
[ 45.4537 ] Writing partition DTB with tegra210-p3448-0000-p3449-0000-b00.dtb.encrypt
Then take out this SD card, put the one with the balena image back into the board, boot it, and attach the freshly flashed SD card through a USB card reader. In the balena hostOS, write the DTB from the USB card to the SD card that’s running balena:
dd if=/dev/sda3 of=/dev/mmcblk0p2
dd if=/dev/sda3 of=/dev/mmcblk0p9
Please double-check with ls -l /dev/disk/by-partlabel that mmcblk0p2 and mmcblk0p9 are indeed the RP1 and DTB partitions. After writing them, you can sync && reboot the board and it should run fine again.
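If it helps, the same label check can be scripted; here is a small (hypothetical) snippet that resolves the RP1 and DTB labels to their device nodes before running dd:

import os

# Resolve the partition labels to the actual device nodes. With the USB
# card still attached, a duplicate label may point at /dev/sdaN instead,
# so confirm the output refers to the mmcblk0 partitions.
by_label = "/dev/disk/by-partlabel"
for label in ("RP1", "DTB"):
    link = os.path.join(by_label, label)
    if os.path.islink(link):
        print(label, "->", os.path.realpath(link))
    else:
        print(label, "not found under", by_label)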
Hi @acostach, I’m happy to hear that you were able to reproduce my issue, thanks for following up! I have confirmed that we indeed have the B01 revision, and I will update my application to reflect the insight that the phenomenon is related to the board revision. We will hold off on updating balenaOS until the issue has been resolved.
It sounds great that balenaOS 2.51.1+rev1 improves the speed again. Thanks for sharing this with us.
Does the average inference time still cause any significant problems in your app?
I’ll share your results with the team, and see if there is anything we can look into.
Thanks,
Georgia