James’ suggestion is to build a new image with CUDA/etc., push it to Docker Hub, and then reference it in the FROM line of whichever services need CUDA. Because Docker images are made of shared layers, everything contained in the base image is downloaded only once per device, no matter how many services are built on it.
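
As a rough sketch of what a service's Dockerfile could look like under this approach: the image name `youruser/cuda-base:1.0` and the `run.sh` script are hypothetical placeholders, not names from James' suggestion. It assumes the base image was already built and pushed with `docker build -t youruser/cuda-base:1.0 .` followed by `docker push youruser/cuda-base:1.0`.

```dockerfile
# Dockerfile for one service that needs CUDA.
# "youruser/cuda-base:1.0" is a hypothetical name for the shared image
# pushed to Docker Hub; substitute your own repository and tag.
FROM youruser/cuda-base:1.0

# Service-specific layers go on top of the shared base layers.
WORKDIR /app
COPY . .

# Hypothetical start script for this service.
CMD ["./run.sh"]
```

Every service that starts from the same FROM line reuses the base layers already on the device, so only the small service-specific layers differ. Pinning an explicit tag rather than relying on `latest` keeps all services on the same base layers.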