Our Balena use case is managing a number of devices deployed in low-bandwidth environments and running machine learning on the edge with them. We are currently using NVIDIA Jetson AGX Orin devices with a multi-container release structure. One of these containers hosts the model repository and runs inference on those models with NVIDIA Triton Inference Server. Each device has to use a different machine learning model, as each model is trained specifically to work in that device’s environment.
Because of the size of the models and the number of devices we have, it isn’t feasible to fit every model used across all devices into one container on the device’s 64 GB of eMMC. We have come up with two possible solutions:
- Store the Balena containers on a massive SD card so we have as much storage as we could want to store ML models
- Set up a release versioning structure with multiple sub-versions that denote which models the device has on it (e.g. v0.0.1-device1, v0.0.1-device2, …)
The issue with solution 1 is that we have not been able to mount the SD card: the Jetson AGX Orin devices running Balena don’t register SD cards or USB devices when they are plugged in.
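For anyone trying to reproduce this, here is a minimal sketch of one way to check whether the kernel exposes the card at all, by reading the standard Linux sysfs entries under `/sys/block`. It assumes the container is allowed to read `/sys/block`; the function name `list_block_devices` is just illustrative, and this is not necessarily how we tested it.

```python
from pathlib import Path


def list_block_devices(sys_block="/sys/block"):
    """Return (name, removable, size_bytes) for each block device in sysfs."""
    devices = []
    root = Path(sys_block)
    if not root.is_dir():
        # sysfs is not available here (e.g. not Linux, or /sys not visible).
        return devices
    for dev in sorted(root.iterdir()):
        try:
            removable = (dev / "removable").read_text().strip() == "1"
            # sysfs reports the size in 512-byte sectors.
            sectors = int((dev / "size").read_text().strip())
        except (OSError, ValueError):
            # Skip entries that lack the expected attribute files.
            continue
        devices.append((dev.name, removable, sectors * 512))
    return devices


if __name__ == "__main__":
    for name, removable, size in list_block_devices():
        print(f"{name}\tremovable={removable}\t{size / 1e9:.1f} GB")
```

If the card never shows up in this list after being inserted, the problem is below the mount step. The `removable` flag is not always set for SD slots, so the device name and size are the more useful columns.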
The issue with solution 2 is that it is extremely clunky and could be a nightmare to manage (though CI could mitigate this).
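To make solution 2 concrete, here is a rough sketch of the kind of table a CI job could iterate over to produce one tagged release per device. The device and model names are placeholders, and `release_tag` and `build_matrix` are hypothetical helpers, not part of any Balena tooling.

```python
# Hypothetical device-to-model mapping; all names here are placeholders.
DEVICE_MODELS = {
    "device1": ["model_a"],
    "device2": ["model_b"],
}


def release_tag(base_version, device):
    """Build a tag in the v<version>-<device> style, e.g. v0.0.1-device1."""
    return f"v{base_version}-{device}"


def build_matrix(base_version, device_models):
    """Return one entry per device: its release tag and the models to bundle."""
    return [
        {"tag": release_tag(base_version, device), "device": device, "models": list(models)}
        for device, models in sorted(device_models.items())
    ]


if __name__ == "__main__":
    for entry in build_matrix("0.0.1", DEVICE_MODELS):
        print(entry["tag"], entry["models"])
```

Keeping the mapping in one place at least means a new device is a one-line change, but the number of releases still grows with the number of devices, which is the part that worries us.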
I am looking for alternative solutions from anyone who has solved a similar issue, or for help getting solution 1 working. Thank you!