Yes, that works!
But sadly, nvidia-smi doesn’t detect my board…
root@balena:/# lshw -C display
*-display UNCLAIMED
description: VGA compatible controller
product: GM204 [GeForce GTX 970]
vendor: NVIDIA Corporation
physical id: 0
bus info: pci@0000:01:00.0
version: a1
width: 64 bits
clock: 33MHz
capabilities: pm msi pciexpress vga_controller bus_master cap_list
configuration: latency=0
resources: memory:a2000000-a2ffffff memory:90000000-9fffffff memory:a0000000-a1ffffff ioport:3000(size=128) memory:c0000-dffff
root@balena:/# nvidia-smi
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.
Not sure what’s wrong here — any ideas?
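(For context: the nvidia-smi error above usually just means the nvidia kernel module never loaded. A quick sketch to confirm that, assuming a standard Linux /proc — on my device I'd expect it to report "not loaded":)

```shell
# Check whether the nvidia kernel module is currently loaded by
# looking for it in /proc/modules (equivalent to lsmod | grep nvidia).
if grep -qs '^nvidia ' /proc/modules; then
    echo "nvidia module loaded"
else
    echo "nvidia module not loaded"
fi
```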
Here is my full Dockerfile.template file:
FROM balenalib/%%BALENA_MACHINE_NAME%%-ubuntu-python:3.6-bionic-build
ENV RESINOS_VERSION=2.29.0%2Brev1.prod
ENV YOCTO_VERSION=4.12.12
RUN wget https://files.resin.io/images/intel-nuc/${RESINOS_VERSION}/kernel_modules_headers.tar.gz
RUN tar -xf kernel_modules_headers.tar.gz && rm -rf kernel_modules_headers.tar.gz
RUN mkdir -p /lib/modules/${YOCTO_VERSION}-yocto-standard
RUN mv ./kernel_modules_headers /lib/modules/${YOCTO_VERSION}-yocto-standard/build
RUN ln -s /lib64/ld-linux-x86-64.so.2 /lib/ld-linux-x86-64.so.2
RUN apt-get update && apt-cache search nvidia-driver
ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get install -y nvidia-driver-390
# Enable udevd so that plugged dynamic hardware devices show up in our container.
ENV UDEV=1
ENV INITSYSTEM=on
CMD ["sleep", "infinity"]
@scarlyon, I don’t have a solution but I can ask some questions that may help us help you:
What is the output of the following command at the host OS prompt of your NUC device? uname -a && cat /etc/issue && lsmod
What is the build output for the Dockerfile? (E.g. the output of the "balena push" or "git push" command.) Could there have been any errors while building the drivers?
If the output is too large to paste in the body of a message, it could be attached as a zip or shared via a gist.github.com page. Just some suggestions.
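(To make that easier, the requested diagnostics can be collected into a single file for attaching — a small sketch; the filename diagnostics.txt is just a suggestion:)

```shell
# Gather the diagnostics requested above into one file that can be
# attached to a post or pasted into a gist.
{
  echo "== uname -a =="
  uname -a
  echo "== /etc/issue =="
  cat /etc/issue 2>/dev/null
  echo "== lsmod =="
  lsmod 2>/dev/null
} > diagnostics.txt
wc -l diagnostics.txt
```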
Hi Shane. One thing I noticed from your output here is that you are running the container on balenaOS 2.44, but building the kernel module against headers for balenaOS 2.29. If I recall correctly, there were some major kernel bumps from 4.18 to 5.2 between those two OS versions, so it might be worth making sure your Dockerfile targets the 2.44 version.
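(One way to avoid this kind of mismatch is to derive the version from the kernel release the device actually reports — e.g. uname -r gives 5.2.10-yocto-standard on balenaOS 2.44, and everything before the first dash is the value YOCTO_VERSION should carry. A sketch, with the uname -r output hardcoded for illustration:)

```shell
# Stand-in for "$(uname -r)" as reported at the host OS prompt.
kernel_release="5.2.10-yocto-standard"

# Strip everything from the first "-" onwards to get the base version
# that the kernel headers directory name must match.
yocto_version="${kernel_release%%-*}"

echo "YOCTO_VERSION=$yocto_version"   # → YOCTO_VERSION=5.2.10
```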
Hey Shane,
Can you also confirm you're setting the right YOCTO_VERSION env var? From the output shared above, it should be set to 5.2.10. Could you share the relevant section of your Dockerfile?
Thanks!
FROM balenalib/%%BALENA_MACHINE_NAME%%-ubuntu-python:3.6-bionic-build
ENV RESINOS_VERSION=2.44.0%2Brev1.prod
ENV YOCTO_VERSION=5.2.10
RUN wget https://files.resin.io/images/intel-nuc/${RESINOS_VERSION}/kernel_modules_headers.tar.gz
RUN tar -xf kernel_modules_headers.tar.gz && rm -rf kernel_modules_headers.tar.gz
RUN mkdir -p /lib/modules/${YOCTO_VERSION}-yocto-standard
RUN mv ./kernel_modules_headers /lib/modules/${YOCTO_VERSION}-yocto-standard/build
RUN ln -s /lib64/ld-linux-x86-64.so.2 /lib/ld-linux-x86-64.so.2
RUN apt-get update && apt-cache search nvidia-driver
ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get install -y nvidia-driver-390
# Enable udevd so that plugged dynamic hardware devices show up in our container.
ENV UDEV=1
ENV INITSYSTEM=on
CMD ["sleep", "infinity"]
I’ve noticed that there’s a failure whilst building modules with that Dockerfile:
[main] DKMS: install completed.
[main] Building initial module for 5.2.10-yocto-standard
[main] Error! Bad return status for module build on kernel: 5.2.10-yocto-standard (x86_64)
[main] Consult /var/lib/dkms/nvidia/390.116/build/make.log for more information.
Can you try with RUN apt-get install -y nvidia-driver-435 instead of RUN apt-get install -y nvidia-driver-390? That version seems to build cleanly against this kernel. Let us know how it goes!
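(If the DKMS build fails again after that change, the tail of the make.log referenced in the error above usually contains the actual compiler error — a quick sketch for pulling it out:)

```shell
# Path taken from the DKMS error message above; adjust the driver
# version segment if you install a different nvidia-driver package.
LOG=/var/lib/dkms/nvidia/390.116/build/make.log

if [ -f "$LOG" ]; then
  # The real failure is almost always in the last few dozen lines.
  tail -n 40 "$LOG"
else
  echo "no DKMS make.log found at $LOG"
fi
```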