Use Nvidia GPU on Proxmox K3s LXC

Proxmox Host

This guide covers using an Nvidia GPU from K3s running in a privileged LXC. The LXC must run the same Nvidia driver version as the Proxmox (PVE) host.

Install the Nvidia drivers on the PVE host first. Make a note of the driver version you install, because you will need to install the same version in the LXC.

Instructions taken from this guide:

echo "blacklist nouveau" >> /etc/modprobe.d/blacklist.conf

apt update
apt install dkms libc6-dev proxmox-default-headers --no-install-recommends

wget -O NVIDIA-Linux-x86_64-550.120.run https://us.download.nvidia.com/XFree86/Linux-x86_64/550.120/NVIDIA-Linux-x86_64-550.120.run
chmod +x NVIDIA-Linux-x86_64-550.120.run
./NVIDIA-Linux-x86_64-550.120.run --no-nouveau-check --dkms
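
Because the host and the LXC must run the same driver version, it can help to check the loaded driver against the installer you downloaded. The following is a minimal sketch, not part of the original instructions: `installer_version` and `running_version` are hypothetical helper names, and the sketch assumes the installer keeps the `NVIDIA-Linux-x86_64-<version>.run` naming used above.

```shell
# Extract the driver version from an installer file name,
# e.g. NVIDIA-Linux-x86_64-550.120.run -> 550.120
installer_version() {
  name="${1##*/}"                      # drop any leading directory
  name="${name#NVIDIA-Linux-x86_64-}"  # drop the fixed prefix (assumed naming)
  printf '%s\n' "${name%.run}"         # drop the .run suffix
}

# Report the version of the loaded driver.
# Only works once the driver is installed and nvidia-smi is available.
running_version() {
  nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1
}

# Usage, on the PVE host and again inside the LXC:
#   [ "$(running_version)" = "$(installer_version NVIDIA-Linux-x86_64-550.120.run)" ] \
#     && echo "driver version matches" || echo "driver version mismatch"
```

Running the same check in both places gives a quick way to confirm the two versions agree before debugging anything else.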

You must add the following udev rules to create the Nvidia devices on the PVE host:

cat << EOF > /etc/udev/rules.d/70-nvidia.rules
KERNEL=="nvidia", RUN+="/bin/bash -c '/usr/bin/nvidia-smi -L && /bin/chmod 666 /dev/nvidia*'"
KERNEL=="nvidia_uvm", RUN+="/bin/bash -c '/usr/bin/nvidia-modprobe -c0 -u && /bin/chmod 0666 /dev/nvidia-uvm*'"
EOF

Once the Nvidia devices exist, note their major device numbers (195 and 236 in the output below):

root@pve-media:~# ls -la /dev/nvid*
crw-rw-rw- 1 root root 195,   0 Sep 27 19:40 /dev/nvidia0
crw-rw-rw- 1 root root 195, 255 Sep 27 19:40 /dev/nvidiactl
crw-rw-rw- 1 root root 236,   0 Sep 27 19:40 /dev/nvidia-uvm
crw-rw-rw- 1 root root 236,   1 Sep 27 19:40 /dev/nvidia-uvm-tools

/dev/nvidia-caps:
total 0
drwxr-xr-x  2 root root     80 Sep 27 19:40 .
drwxr-xr-x 20 root root   5060 Sep 27 20:08 ..
cr--------  1 root root 239, 1 Sep 27 19:40 nvidia-cap1
cr--r--r--  1 root root 239, 2 Sep 27 19:40 nvidia-cap2
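
The major numbers can also be pulled out of that listing mechanically instead of read by eye. A minimal sketch, assuming `ls -l` style output where the fifth field is the major number followed by a comma; `list_majors` is a hypothetical helper name.

```shell
# Print the unique major numbers of the character devices listed on stdin.
# Expects `ls -l` style lines; only lines starting with "c" (character
# devices) are considered, so directory entries and headers are skipped.
list_majors() {
  awk '/^c/ { sub(",", "", $5); print $5 }' | sort -un
}

# Usage on the PVE host:
#   ls -l /dev/nvidia* | list_majors
```

Note that this also reports the major number of the devices under /dev/nvidia-caps (239 in the listing above), which the LXC config in this guide does not allow-list.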

Next, add the following to the LXC config:

/etc/pve/lxc/<lxc_id>.conf

mp1: /usr/lib/modules,mp=/usr/lib/modules
lxc.cgroup2.devices.allow: c 195:* rw
lxc.cgroup2.devices.allow: c 236:* rw
lxc.mount.entry: /dev/nvidia0 dev/nvidia0 none bind,optional,create=file
lxc.mount.entry: /dev/nvidiactl dev/nvidiactl none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm dev/nvidia-uvm none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm-tools dev/nvidia-uvm-tools none bind,optional,create=file

These lines perform the following:

  • Mount the host's kernel headers so that the gpu-operator Helm chart on K3s can build the Nvidia drivers
  • Create cgroup2 allowlist entries for the major device numbers of the Nvidia devices
  • Pass the Nvidia devices through to the LXC via bind mounts
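
Since the major numbers can differ between hosts, the cgroup2 and mount lines may need adjusting. The sketch below generates them from the values found on your host; `emit_allow` and `emit_mounts` are hypothetical helper names, and the output format simply mirrors the config lines shown above.

```shell
# Emit one cgroup2 allowlist line per major device number given.
emit_allow() {
  for major in "$@"; do
    printf 'lxc.cgroup2.devices.allow: c %s:* rw\n' "$major"
  done
}

# Emit one bind-mount line per device path given.
# The mount target is the same path without its leading slash.
emit_mounts() {
  for dev in "$@"; do
    printf 'lxc.mount.entry: %s %s none bind,optional,create=file\n' "$dev" "${dev#/}"
  done
}

# Usage on the PVE host (review the output before appending it to the LXC config):
#   emit_allow 195 236
#   emit_mounts /dev/nvidia0 /dev/nvidiactl /dev/nvidia-uvm /dev/nvidia-uvm-tools
```

Printing the lines rather than writing them directly keeps the step reviewable, which matters because a wrong entry in the LXC config can prevent the container from starting.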

References

https://pve.proxmox.com/wiki/NVIDIA_vGPU_on_Proxmox_VE#Preparation
https://github.com/UntouchedWagons/K3S-NVidia
https://theorangeone.net/posts/lxc-nvidia-gpu-passthrough/
https://forum.proxmox.com/threads/sharing-gpu-to-lxc-container-failed-to-initialize-nvml-unknown-error.98905/
