#+TITLE: Qubes NPU Setup - sys-ai
#+AUTHOR: Amr
#+CREATED: [2026-04-24 Fri]

#+BEGIN_COMMENT
Documentation for setting up sys-ai Qube with AMD Ryzen AI NPU for llama.cpp.
#+END_COMMENT

* Hardware
- *Laptop:* Framework Laptop 13 (AMD)
- *CPU:* AMD Ryzen AI 5 340 (6 cores, no SMT)
- *NPU:* AMD XDNA2 (Strix Point) - c2:00.1 / dom0:00_08.2-00_00.1
- *RAM:* 96GB total

* Current Progress
** DONE [X] Create sys-ai AppVM (HVM, 64GB, 2 vCPUs)
#+begin_src bash
# Run in dom0
qvm-create --label purple \
  --property netvm=sys-firewall \
  --property memory=65536 \
  --property vcpus=2 \
  --property virt_mode=HVM \
  sys-ai
#+end_src

** DONE [X] Attach NPU PCI device to sys-ai
#+begin_src bash
# Run in dom0
qvm-pci attach -o no-strict-reset=true sys-ai dom0:00_08.2-00_00.1 --persistent
#+end_src

** TODO [ ] Fix repository configuration in sys-ai
*Status:* Package repositories missing in fedora-43-ai template. Fedora 43 uses DNF5 with different repo paths.

*Next Step:* Qubes OS templates typically get packages installed via ~qubes-vm-update~ or dom0 commands. Try the Qubes way to install packages.
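Before changing anything, it may help to confirm what "repositories missing" means in practice. The following is a minimal sketch, meant to be run inside the fedora-43-ai template. It assumes repository definitions live in =/etc/yum.repos.d=, which is an assumption about this template, not something verified here. The function names =count_repo_files= and =check_repos= are hypothetical helpers.

#+begin_src bash
# Sketch: check whether a template has any DNF repository definitions.
# Assumption: repo files are *.repo files under /etc/yum.repos.d.

# Count *.repo files in a directory (defaults to /etc/yum.repos.d).
count_repo_files() {
    local dir
    local n
    local f
    dir="${1:-/etc/yum.repos.d}"
    n=0
    for f in "$dir"/*.repo; do
        # An unmatched glob stays literal, so test that the file exists.
        if [ -e "$f" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

# Print a short diagnosis; returns non-zero when no repo files are found.
check_repos() {
    local dir
    local n
    dir="${1:-/etc/yum.repos.d}"
    n="$(count_repo_files "$dir")"
    if [ "$n" -eq 0 ]; then
        echo "no .repo files in $dir"
        return 1
    fi
    echo "$n .repo file(s) in $dir"
}
#+end_src

After defining the functions, =check_repos= reports on the default directory, and =dnf repolist= then shows which of those repositories DNF actually treats as enabled. If the files exist but =dnf repolist= is empty, the problem is more likely in the repo definitions than in their location.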
** TODO [ ] Verify NPU is accessible inside sys-ai
#+begin_src bash
# Install pciutils
sudo dnf install pciutils

# Check NPU is visible
lspci | grep -i neural
#+end_src

** TODO [ ] Install AMD NPU drivers in sys-ai
#+begin_src bash
# Enable Copr repository
sudo dnf copr enable xanderlent/amd-npu-driver

# Install drivers
sudo dnf install xrt xdna-driver tcsh

# Setup environment
source /usr/xrt/setup.sh

# Verify NPU detection
xrt-smi examine
#+end_src

** TODO [ ] Build llama.cpp with AMD XDNA2 NPU backend
#+begin_src bash
# Install build dependencies
sudo dnf install cmake gcc-c++ python3.11 git

# Clone NPU fork
git clone https://github.com/BrandedTamarasu-glitch/OllamaAMDNPU.git
cd OllamaAMDNPU

# Build with NPU backend
cmake -B build -DGGML_XDNA=ON -DGGML_BACKEND_DL=ON -DBUILD_SHARED_LIBS=ON
cmake --build build --parallel
#+end_src

** TODO [ ] Download model and test inference
#+begin_src bash
# Download GGUF model (Qwen3 1B or 3B quantized)
# ... model download command ...

# Run with NPU offload
./build/bin/llama-cli -m model.gguf -p "Hello" -n 256 --npu-split 1
#+end_src

* Next Step
Resolve the "Fix repository configuration in sys-ai" step above first. Every remaining step starts with a ~dnf install~, so none of them can proceed until the package repositories work.
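Once packages can be installed, each remaining TODO step assumes a particular tool is present (=lspci=, =xrt-smi=, =cmake=, =git=). A small preflight check can show how far the setup has progressed. This is only a sketch: =missing_tools= and =has_npu= are hypothetical helper names, and =has_npu= uses the same "neural" match as the ~lspci | grep -i neural~ check above, which assumes the device description contains that word.

#+begin_src bash
# Sketch: preflight helpers for the remaining sys-ai setup steps.

# Print the names of the given tools that are not on PATH, one per line.
missing_tools() {
    local tool
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
        fi
    done
}

# Read lspci-style output on stdin; succeed if any line mentions "neural".
# Assumption: the NPU's PCI description contains that word.
has_npu() {
    grep -qi 'neural'
}
#+end_src

Inside sys-ai, ~missing_tools lspci xrt-smi cmake git~ lists whatever still needs installing, and ~lspci | has_npu && echo "NPU visible"~ confirms the passthrough device shows up in the guest.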