nvidia-x86 on balena

Hello.

I am trying to build a setup that allows me to have graphical output on x86 devices using NVIDIA cards.
I searched a bit and I ended up in this GitHub thread.

I tried running the provided examples as-is, but my GPU container is not working.

It fails to unload the nouveau module: rmmod: ERROR: Module nouveau is in use.

Then it also fails to load the driver modules:

insmod: ERROR: could not insert module /nvidia/driver/nvidia.ko: No such device
insmod: ERROR: could not insert module /nvidia/driver/nvidia-modeset.ko: Unknown symbol in module
insmod: ERROR: could not insert module /nvidia/driver/nvidia-uvm.ko: Unknown symbol in module
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.

I have been going in circles online without any success so far.
Any help would be very welcome.

Thanks,

Nuno

Some additional information about things I’ve already tried:

  • Unloading the nouveau driver usually requires a system reboot.

  • Based on this thread, I tried blacklisting the module by adding a blacklist-nouveau.conf file to /etc/modprobe.d/ with the following content:

blacklist nouveau
options nouveau modeset=0
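
For anyone following along, here is a minimal sketch of writing that file from a script. The function name write_nouveau_blacklist is hypothetical, and the target directory is passed as a parameter because whether /etc/modprobe.d/ is writable on a given device is an assumption. It only writes the two lines quoted above; it does not unload a module that is already loaded.

```shell
#!/bin/sh
# Hypothetical helper (not part of the project): write the two blacklist
# lines from the post into <dir>/blacklist-nouveau.conf.
# <dir> defaults to /etc/modprobe.d; pass a scratch directory to try it out.
write_nouveau_blacklist() {
    dir="${1:-/etc/modprobe.d}"
    mkdir -p "$dir" || return 1
    printf '%s\n' \
        'blacklist nouveau' \
        'options nouveau modeset=0' \
        > "$dir/blacklist-nouveau.conf"
}
```

Usage would be something like write_nouveau_blacklist /etc/modprobe.d, run before the module gets loaded.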

Hi @ngmartins-ff,

Welcome to the balena forum. I have limited knowledge of the GPU setup above, but we had a similar thread some time back where the nouveau driver was successfully blacklisted, and a Dockerfile was shared there. Ref: blacklist drivers in host OS - #25 by jgordon

Also, I have pinged some of our folks who have been involved with nvidia-x86 on balena to see if they have seen something similar. Let us know if the blacklisting steps above make a difference.

Regards,
Nitish

Hi @nitish.

Thanks a lot for your response. After looking into the post you shared, I managed to fix the issue I was experiencing.

My GPU container is now running fine. I suggest updating the Git project so that it works as provided.
Next, I will move forward and try to use the driver from another app.

Just to add here: I have been unable to reproduce any issues unloading the nouveau driver, but I only have access to one NVIDIA GPU and x86 setup. There are many other varieties, and they could behave differently. The link that my colleague provided (which we also include in the entry.sh script) is worth a try. It uses a slight variation on this project’s method of unloading the driver, and hopefully it works for you. Let us know how it goes.

[quote="alanb128, post:6, topic:361438"]
Let us know how it goes.
[/quote]

Hello @alanb128. The link shared above helped me to unload the module and move forward:
Stopping the plymouth service, which is used for the splash screen display, did the trick for me.
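
In case it helps someone debugging the same "Module nouveau is in use" error, here is a minimal sketch (not taken from the linked post) for checking whether something is still holding the module. It reads the reference count, the third field of /proc/modules. The function name module_refcount is hypothetical, and the file path is a parameter only so that the sketch can be tried against a sample file.

```shell
#!/bin/sh
# Hypothetical helper: print the reference count of a kernel module.
# /proc/modules lines look like: name size refcount dependents state address
# Returns non-zero if the module is not listed (i.e. not loaded).
module_refcount() {
    name="$1"
    file="${2:-/proc/modules}"
    awk -v m="$name" '
        $1 == m { print $3; found = 1 }
        END     { exit !found }
    ' "$file"
}
```

Running module_refcount nouveau before and after stopping the splash screen should show whether the count drops; rmmod is expected to keep refusing while the count is non-zero.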

Glad it worked out @ngmartins-ff, thanks for letting us know. I updated the documentation on the repo to point people to that post if they run into trouble.