Commit 5f54c9

2024-10-17 01:13:57 admin: editing updates
kubernetes/use nvidia gpu on proxmox k3s lxc.md ..
@@ 12,7 12,7 @@
echo "blacklist nouveau" >> /etc/modprobe.d/blacklist.conf
apt update
- apt install dkms libc6-dev proxmox-default-headers --no-install-recommends
+ apt install -y dkms libc6-dev proxmox-default-headers --no-install-recommends
wget -O NVIDIA-Linux-x86_64-550.120.run https://us.download.nvidia.com/XFree86/Linux-x86_64/550.120/NVIDIA-Linux-x86_64-550.120.run
chmod +x NVIDIA-Linux-x86_64-550.120.run
@@ 172,7 172,7 @@
helm repo update
```
- Next install the gpu-operator Helm chart using the values tailored for K3s:
+ Next install the `gpu-operator` Helm chart using the following values file tailored for K3s:
```
cat << EOF > gpu-operator-values.yaml
toolkit:
@@ 186,7 186,7 @@
helm install gpu-operator nvidia/gpu-operator --create-namespace --values gpu-operator-values.yaml
```
- This will create a bunch of pods in the `gpu-operator` namespace that will build the Nvidia drivers. You will see some of these pods restarting, this is normal as some pods are dependant on others to complete. Overall the build process should take a couple of minutes. If it's taking longer than 10 minutes you likely have an issue and should look at the logs of the `gpu-operator-node-feature-discovery-worker` pods (This is how I figure out you need to mount the Proxmox host kernel headers on the LXC as the pod couldn't find the kernel modules).
+ This will create a bunch of pods in the `gpu-operator` namespace that will build the Nvidia drivers. You will see some of these pods restarting; this is normal, as some pods depend on others completing first. Overall the build process should take a couple of minutes. If it's taking longer than 10 minutes you likely have an issue and should look at the logs of the `gpu-operator-node-feature-discovery-worker` pods (this is how I figured out that you need to mount the Proxmox host kernel headers on the LXC, as the pod couldn't find the kernel modules).
You can verify that everything is working correctly by spinning up a pod that uses the GPU:
```