Commit c9a46a

2024-10-16 14:48:50 admin: update k3s nvidia
kubernetes/use nvidia gpu on proxmox k3s lxc.md ..
@@ 1,13 1,8 @@
- # Use Nvidia GPU on Proxmox K3s LXC
-
- # Proxmox Host
+ # Install Nvidia GPU on Proxmox K3s LXC
This guide pertains to installing Nvidia drivers on K3s running on a privileged LXC. Make sure that you install the same Nvidia driver version on the LXC that you installed on the Proxmox (PVE) host.
-
- Install Nvidia drivers on the PVE host. Make note of the driver version you install you'll need to install the same version on the LXC.
-
-
- Instructions taken from this [guide](https://pve.proxmox.com/wiki/NVIDIA_vGPU_on_Proxmox_VE#Preparation):
+ ## Installing Nvidia Drivers on the Proxmox Host
+ Make note of the driver version you install as you'll need to install the same version later on the K3s LXC. Use the following instructions to install the Nvidia driver on your Proxmox host ([Proxmox official docs](https://pve.proxmox.com/wiki/NVIDIA_vGPU_on_Proxmox_VE#Preparation)):
```
echo "blacklist nouveau" >> /etc/modprobe.d/blacklist.conf
@@ 19,8 14,7 @@
./NVIDIA-Linux-x86_64-550.120.run --no-nouveau-check --dkms
```
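Since the LXC needs the same driver version as the host, it can help to record the host version right after the install and compare it later. The following is a minimal sketch: `check_driver_match` is a hypothetical helper name, not part of any Nvidia tooling, and the commented usage assumes `nvidia-smi` is available on both the host and the LXC.

```shell
# Hypothetical helper: compare two driver version strings.
# Prints "match" on success; prints a mismatch message and returns 1 otherwise.
check_driver_match() {
  if [ "$1" = "$2" ]; then
    echo "match"
  else
    echo "mismatch: host=$1 lxc=$2"
    return 1
  fi
}

# Usage (assumption: run the query on each machine and copy the values over):
# host_version=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n1)
# check_driver_match "$host_version" "<version reported inside the LXC>"
```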
-
- You must add the following udev rules to create the Nvidia devices on the PVE host:
+ You must also add the following udev rules to create the Nvidia devices on the Proxmox host:
```
cat << EOF > /etc/udev/rules.d/70-nvidia.rules
KERNEL=="nvidia", RUN+="/bin/bash -c '/usr/bin/nvidia-smi -L && /bin/chmod 666 /dev/nvidia*'"
@@ 28,8 22,10 @@
EOF
```
+ Reboot your Proxmox host to load the Nvidia driver and create the udev devices.
- Once the Nvidia devices exist you need to obtain their major device numbers (e.g. 195 and 236):
+ ## Configuring the LXC
+ Once the driver is loaded and the Nvidia devices exist, you need to obtain their major device numbers. Run the following on the Proxmox host to get your numbers. In my case they are 195 and 236:
```
root@pve-media:~# ls -la /dev/nvid*
crw-rw-rw- 1 root root 195, 0 Sep 27 19:40 /dev/nvidia0
@@ 45,11 41,14 @@
cr--r--r-- 1 root root 239, 2 Sep 27 19:40 nvidia-cap2
```
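If you would rather not read the major numbers off the listing by eye, something like the sketch below could extract them. `nvidia_majors` is a hypothetical helper name; it assumes the `ls -l` layout shown above, where the fifth field of a character device line is the major number followed by a comma.

```shell
# Hypothetical helper: read `ls -l` style output on stdin and print the
# unique major numbers of character devices, one per line.
nvidia_majors() {
  awk '$1 ~ /^c[rwx-]/ { gsub(",", "", $5); print $5 }' | sort -un
}

# Usage on the Proxmox host (assumes the Nvidia devices already exist):
# ls -l /dev/nvidia* | nvidia_majors
```

Lines that are not character devices, such as the `total` line or directory headers, are skipped.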
+ Next you must add the following to the LXC config, changing the device numbers as needed:
- Next you must add to following to the LXC config:
-
+ Edit:
`/etc/pve/lxc/<lxc_id>.conf`
+
+ Add:
+
```
mp1: /usr/lib/modules,mp=/usr/lib/modules
lxc.cgroup2.devices.allow: c 195:* rw
@@ 60,11 59,10 @@
lxc.mount.entry: /dev/nvidia-uvm-tools dev/nvidia-uvm-tools none bind,optional,create=file
```
-
- These lines perform the following:
- - Mount host kernel headers so gpu-operator helm chart on K3s can build Nvidia drivers
+ These lines perform the following for your LXC (in order):
+ - Mount the host kernel headers so the gpu-operator Helm chart on K3s can build the Nvidia drivers
- Create cgroup2 allowlist entries for the major device numbers of the Nvidia devices
- - Passthrough Nvidia devices through mounts
+ - Pass the Nvidia devices through to the LXC
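
Because the major numbers differ between hosts, the cgroup2 allowlist lines can be generated rather than typed by hand. This is a minimal sketch: `cgroup_allow_lines` is a hypothetical helper name, and the output format simply mirrors the `lxc.cgroup2.devices.allow` line shown in the config above.

```shell
# Hypothetical helper: print one cgroup2 allowlist line per major number
# passed as an argument, in the same form used in the LXC config above.
cgroup_allow_lines() {
  for major in "$@"; do
    printf 'lxc.cgroup2.devices.allow: c %s:* rw\n' "$major"
  done
}

# Usage: review the output, then paste it into /etc/pve/lxc/<lxc_id>.conf
# cgroup_allow_lines 195 236
```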
# References