blacklist drivers in host OS

Hi,

I am using a new generic x86_64 OS type. It ships the nouveau drivers, but my container needs the nvidia drivers, so I am running into a few problems. On some GPUs, like the GeForce GTX 16 series, the nouveau drivers are not loaded when the kernel boots, and my container runs with the nvidia drivers it has installed. But on some desktops, like the HP Omen where I have a GeForce RTX 2070, the nouveau drivers are loaded and my container does not work. So, in order for my container to work on HP Omens, I remounted the host OS with read and write permissions and blacklisted the nouveau drivers so they do not load at boot time. I did this because I was unable to blacklist them from my services; however many blacklist conf files I tried to write, it did not work. Now the nvidia drivers are getting loaded and I am able to run my application. But I have another problem. The nvidia-uvm, nvidia and nvidia-dkms modules all get loaded, but nvidia-drm does not. When I try to insert that module using insmod, it fails with "unknown symbol". I want to know whether what I did broke something and, if what I did is correct, whether there is a cleaner way to handle it. Thanks!

Hi Latitha, thanks for contacting support.

I understand you are using the Generic x86_64 device type and have included the proprietary nvidia drivers in your application container. As you say, the Generic x86_64 device type provides a wide selection of modules, and these are available to your multicontainer application when you use the io.balena.features.kernel-modules label in your docker-compose file (see https://www.balena.io/docs/learn/develop/multicontainer/#labels).
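As a quick sanity check, you could confirm from inside the container that the modules for the running kernel are actually visible. This is only a sketch: it assumes the label exposes the host modules at the usual /lib/modules location, and MODULES_ROOT and the function names are hypothetical, not part of any balena tooling.

```shell
#!/bin/sh
# Sketch: check that modules for the running kernel are visible.
# MODULES_ROOT is a hypothetical override; the default is the standard
# directory that modprobe searches.
modules_dir() {
    printf '%s/%s\n' "${MODULES_ROOT:-/lib/modules}" "$(uname -r)"
}

# Succeeds only if a module directory exists for the running kernel.
has_kernel_modules() {
    [ -d "$(modules_dir)" ]
}
```

If has_kernel_modules fails in your service, the label is probably missing from that service in the docker-compose file.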

Once all the modules are accessible from the application containers, there is no need to modify the hostOS in any way. Doing so is not recommended, as there is no way to replicate those changes automatically across a fleet of devices, and the changes will be lost when you perform a hostOS update on the device.

You should be able to manage the modules from your application container as you would from a normal Linux system. For example, if you want to blacklist a module, you should add it to the modprobe.d/blacklist.conf file.
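For example, a start script in your service could add the blacklist entry along these lines. This is a minimal sketch: the function name is hypothetical, and the directory defaults to /etc/modprobe.d inside the container. Keep in mind that a blacklist entry only stops modprobe from loading the module automatically through its aliases, and only for modprobe calls that read this configuration.

```shell
#!/bin/sh
# Sketch: add "blacklist nouveau" to modprobe.d/blacklist.conf.
# The optional first argument is the modprobe.d directory to write to.
write_nouveau_blacklist() {
    dir="${1:-/etc/modprobe.d}"
    conf="$dir/blacklist.conf"
    mkdir -p "$dir" || return 1
    # Append only if the exact entry is not already present,
    # so repeated container starts do not duplicate it.
    if ! grep -qx 'blacklist nouveau' "$conf" 2>/dev/null; then
        printf '%s\n' 'blacklist nouveau' >> "$conf"
    fi
}
```

Calling write_nouveau_blacklist with no argument writes to /etc/modprobe.d/blacklist.conf in the container.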

Back to your message: as you say, the hostOS includes the nouveau drivers, and they may be loaded automatically if the kernel needs them, for example to display a splash screen. You can try explicitly running rmmod on nouveau in your application container before using modprobe to load the nvidia drivers.
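A rough sketch of that sequence is below. It assumes your container has enough privileges to load and unload kernel modules, and it uses the module names from your message. RUN and PROC_MODULES are hypothetical hooks so you can dry-run it: set RUN=echo to print the commands instead of executing them.

```shell
#!/bin/sh
# Sketch: unload nouveau if it is loaded, then load the nvidia modules.

# Succeeds if the named module appears in the loaded-module list.
# /proc/modules has one module per line, with the name first.
is_loaded() {
    grep -q "^$1 " "${PROC_MODULES:-/proc/modules}" 2>/dev/null
}

swap_to_nvidia() {
    if is_loaded nouveau; then
        # rmmod fails if nouveau is still in use; stop here in that case.
        $RUN rmmod nouveau || return 1
    fi
    for m in nvidia nvidia-uvm nvidia-drm; do
        $RUN modprobe "$m" || return 1
    done
}
```

If rmmod reports that nouveau is in use, something on the device is still holding it and the nvidia modules will not be attempted.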

Regarding your last point, the unknown symbol error usually means that the module you are trying to load depends on other modules that have not been loaded yet. To work around this, I suggest you use modprobe instead of insmod, since modprobe loads a module's dependencies first.
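If you want to see which modules nvidia-drm expects, something like the following may help. It is a sketch built on modinfo's depends field; the function name and the MODINFO override are hypothetical and exist only to make it easy to try out.

```shell
#!/bin/sh
# Sketch: print the modules that a given module depends on, one per line.
# modinfo reports them as a single comma-separated list.
module_deps() {
    ${MODINFO:-modinfo} -F depends "$1" | tr ',' '\n' | sed '/^$/d'
}
```

Running module_deps nvidia-drm lists what insmod would need loaded beforehand, while modprobe nvidia-drm takes care of loading those for you.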

Let us know if that helps.

Hi,

Thank you so much for getting back to me. I tried using rmmod, and I also tried to blacklist the drivers from my service, but neither disabled the nouveau drivers. That is why I went into the hostOS and made the change there. I know that it is not advisable and is a very janky way of doing it. I will keep testing and see if I can make the changes from my services instead. Thank you for taking the time to answer my questions.

I hope you have better luck making the change from your service. Let us know if you need any more guidance. We are always happy to help.