Commit 421654
2024-10-16 02:14:17 admin: pve host install guide
| /dev/null .. kubernetes/use nvidia gpu on proxmox k3s lxc.md | |
| @@ 0,0 1,74 @@ | |
| + | # Use Nvidia GPU on Proxmox K3s LXC |
| + | |
| + | # Proxmox Host |
| + | This guide pertains to installing Nvidia drivers on K3s running on a privileged LXC. Make sure that you install the same Nvidia driver version on the LXC that you installed on the Proxmox (PVE) host. |
| + | |
| + | |
| + | Install the Nvidia drivers on the PVE host. Make a note of the driver version you install, because you will need to install the same version on the LXC. |
| + | |
| + | |
| + | Instructions taken from this [guide](https://pve.proxmox.com/wiki/NVIDIA_vGPU_on_Proxmox_VE#Preparation): |
| + | ``` |
| + | # Prevent the open-source nouveau driver from loading |
| + | echo "blacklist nouveau" >> /etc/modprobe.d/blacklist.conf |
| + | |
| + | # Install the build dependencies for the DKMS kernel module |
| + | apt update |
| + | apt install dkms libc6-dev proxmox-default-headers --no-install-recommends |
| + | |
| + | # Download and run the Nvidia installer (driver version 550.120) |
| + | wget -O NVIDIA-Linux-x86_64-550.120.run https://us.download.nvidia.com/XFree86/Linux-x86_64/550.120/NVIDIA-Linux-x86_64-550.120.run |
| + | chmod +x NVIDIA-Linux-x86_64-550.120.run |
| + | ./NVIDIA-Linux-x86_64-550.120.run --no-nouveau-check --dkms |
| + | ``` |
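
Because the same driver version has to be installed on the LXC later, it may help to keep the version in a single variable. The following is a minimal sketch and not part of the original guide: `NVIDIA_VERSION`, `INSTALLER`, and `URL` are hypothetical variable names, and the commands are only printed so the sketch can be tried without changing the host.

```shell
# Sketch: derive the installer name and download URL from one version string.
# 550.120 is the version used in this guide; the variable names are illustrative.
NVIDIA_VERSION="550.120"
INSTALLER="NVIDIA-Linux-x86_64-${NVIDIA_VERSION}.run"
URL="https://us.download.nvidia.com/XFree86/Linux-x86_64/${NVIDIA_VERSION}/${INSTALLER}"

# Print the commands instead of running them.
echo "wget -O ${INSTALLER} ${URL}"
echo "chmod +x ${INSTALLER}"
echo "./${INSTALLER} --no-nouveau-check --dkms"
```

Reusing the same `NVIDIA_VERSION` value when preparing the LXC is one way to keep the two installs in step.
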
| + | |
| + | |
| + | You must add the following udev rules to create the Nvidia devices on the PVE host: |
| + | ``` |
| + | cat << EOF > /etc/udev/rules.d/70-nvidia.rules |
| + | KERNEL=="nvidia", RUN+="/bin/bash -c '/usr/bin/nvidia-smi -L && /bin/chmod 666 /dev/nvidia*'" |
| + | KERNEL=="nvidia_uvm", RUN+="/bin/bash -c '/usr/bin/nvidia-modprobe -c0 -u && /bin/chmod 0666 /dev/nvidia-uvm*'" |
| + | EOF |
| + | ``` |
| + | |
| + | |
| + | Once the Nvidia devices exist, obtain their major device numbers (195 and 236 in the listing below). The major number is the first of the two comma-separated numbers shown before the date: |
| + | ``` |
| + | root@pve-media:~# ls -la /dev/nvid* |
| + | crw-rw-rw- 1 root root 195, 0 Sep 27 19:40 /dev/nvidia0 |
| + | crw-rw-rw- 1 root root 195, 255 Sep 27 19:40 /dev/nvidiactl |
| + | crw-rw-rw- 1 root root 236, 0 Sep 27 19:40 /dev/nvidia-uvm |
| + | crw-rw-rw- 1 root root 236, 1 Sep 27 19:40 /dev/nvidia-uvm-tools |
| + | |
| + | /dev/nvidia-caps: |
| + | total 0 |
| + | drwxr-xr-x 2 root root 80 Sep 27 19:40 . |
| + | drwxr-xr-x 20 root root 5060 Sep 27 20:08 .. |
| + | cr-------- 1 root root 239, 1 Sep 27 19:40 nvidia-cap1 |
| + | cr--r--r-- 1 root root 239, 2 Sep 27 19:40 nvidia-cap2 |
| + | ``` |
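
The major numbers in the listing above can also be extracted with a short script rather than read by eye. This is a minimal sketch, assuming the `ls -l` format shown above; the function name `cgroup_allow_lines` and the embedded sample listing are illustrative only. It prints one cgroup2 allow line per distinct major number.

```shell
# Sample listing copied from the output above (character devices only).
listing='crw-rw-rw- 1 root root 195, 0 Sep 27 19:40 /dev/nvidia0
crw-rw-rw- 1 root root 195, 255 Sep 27 19:40 /dev/nvidiactl
crw-rw-rw- 1 root root 236, 0 Sep 27 19:40 /dev/nvidia-uvm
crw-rw-rw- 1 root root 236, 1 Sep 27 19:40 /dev/nvidia-uvm-tools'

# Read an "ls -l" listing on stdin and print one allow line per major number.
# For character devices (lines starting with "c"), field 5 is "MAJOR,".
cgroup_allow_lines() {
  awk '/^c/ { sub(",", "", $5); print $5 }' | sort -un |
    while read -r major; do
      echo "lxc.cgroup2.devices.allow: c ${major}:* rw"
    done
}

printf '%s\n' "$listing" | cgroup_allow_lines
```

On a real host the listing could presumably be supplied by `ls -ld /dev/nvidia*` instead of the sample text; the `-d` flag keeps `ls` from descending into `/dev/nvidia-caps`, whose devices this guide does not add to the allowlist.
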
| + | |
| + | |
| + | Next, add the following to the LXC config: |
| + | |
| + | `/etc/pve/lxc/<lxc_id>.conf` |
| + | |
| + | ``` |
| + | mp1: /usr/lib/modules,mp=/usr/lib/modules |
| + | lxc.cgroup2.devices.allow: c 195:* rw |
| + | lxc.cgroup2.devices.allow: c 236:* rw |
| + | lxc.mount.entry: /dev/nvidia0 dev/nvidia0 none bind,optional,create=file |
| + | lxc.mount.entry: /dev/nvidiactl dev/nvidiactl none bind,optional,create=file |
| + | lxc.mount.entry: /dev/nvidia-uvm dev/nvidia-uvm none bind,optional,create=file |
| + | lxc.mount.entry: /dev/nvidia-uvm-tools dev/nvidia-uvm-tools none bind,optional,create=file |
| + | ``` |
| + | |
| + | |
| + | These lines perform the following: |
| + | - Mount the host kernel headers (`/usr/lib/modules`) so the gpu-operator Helm chart on K3s can build the Nvidia drivers |
| + | - Create cgroup2 allowlist entries for the major device numbers of the Nvidia devices |
| + | - Pass the Nvidia devices through to the LXC with bind mounts |
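
The bind-mount entries all follow one pattern, so they can be generated from a list of device names. This is a minimal sketch under that assumption; `mount_entries` is a hypothetical helper, and the device names are the four from the listing earlier in this guide.

```shell
# Print one lxc.mount.entry line per device name given as an argument.
# The options mirror the entries in the LXC config above.
mount_entries() {
  for dev in "$@"; do
    printf 'lxc.mount.entry: /dev/%s dev/%s none bind,optional,create=file\n' "$dev" "$dev"
  done
}

mount_entries nvidia0 nvidiactl nvidia-uvm nvidia-uvm-tools
```

A host with more than one GPU would presumably pass additional names such as `nvidia1`; that case is not covered by the original guide.
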
| + | |
| + | |
| + | # References |
| + | - https://pve.proxmox.com/wiki/NVIDIA_vGPU_on_Proxmox_VE#Preparation |
| + | - https://github.com/UntouchedWagons/K3S-NVidia |
| + | - https://theorangeone.net/posts/lxc-nvidia-gpu-passthrough/ |
| + | - https://forum.proxmox.com/threads/sharing-gpu-to-lxc-container-failed-to-initialize-nvml-unknown-error.98905/ |
