VM Resource Allocation
Proper resource allocation for KVM/libvirt VMs ensures stable performance and prevents host resource exhaustion.
Memory Allocation
Static Allocation
Set fixed memory in the VM XML by giving the <memory> and <currentMemory> elements the same value.
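libvirt stores memory sizes in KiB unless a unit attribute is given, so a dumped domain shows 16 GiB as a large KiB figure. A minimal sketch of that conversion follows; the helper name gib_to_kib is hypothetical, and the 16 GiB value simply mirrors the virsh example in this section.

```shell
# Hypothetical helper: convert GiB to KiB, libvirt's default unit for <memory>.
gib_to_kib() {
  echo $(( $1 * 1024 * 1024 ))
}

size_kib=$(gib_to_kib 16)

# Print both elements with identical values (static allocation, no ballooning gap).
printf "<memory unit='KiB'>%s</memory>\n" "$size_kib"
printf "<currentMemory unit='KiB'>%s</currentMemory>\n" "$size_kib"
```

The printed lines can be compared against the output of virsh dumpxml myvm to confirm that both elements carry the same value.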
Or via virsh:
# Set maximum memory (requires VM shutdown)
virsh setmaxmem myvm 16G --config
# Set current memory
virsh setmem myvm 16G --config
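Memory Ballooning
Ballooning adjusts a running VM's memory dynamically: the guest's balloon driver hands unused memory back to the host, and the host returns it when the allocation is raised again.
Enable it in the VM XML with a virtio balloon device, <memballoon model='virtio'/>, inside <devices>.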
Adjust at runtime:
# Reduce to 8GB (guest releases memory)
virsh setmem myvm 8G --live
# Increase to 12GB
virsh setmem myvm 12G --live
Note
Ballooning requires guest cooperation. Windows needs the virtio balloon driver; Linux includes it by default.
Memory Locking
For latency-sensitive VMs, lock guest memory so the host cannot swap it out, by adding <locked/> inside <memoryBacking>.
This may require raising the locked-memory limit (ulimit -l) for libvirt.
Hugepages
Hugepages reduce TLB misses for memory-intensive VMs.
Configure Host Hugepages
# Check current hugepages
grep Huge /proc/meminfo
# Reserve 16GB of 2MB hugepages (8192 pages)
echo 8192 | sudo tee /proc/sys/vm/nr_hugepages
# Make persistent
echo "vm.nr_hugepages = 8192" | sudo tee /etc/sysctl.d/hugepages.conf
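The page count above comes from dividing the VM's memory by the page size. A small sketch of that arithmetic, with a hypothetical helper name (hugepages_needed) and sizes chosen to match the 16GB example:

```shell
# Hypothetical helper: pages needed for a VM of <GiB> using pages of <MiB> each.
hugepages_needed() {
  vm_gib=$1
  page_mib=$2
  echo $(( vm_gib * 1024 / page_mib ))
}

hugepages_needed 16 2      # 2 MiB pages  -> prints 8192
hugepages_needed 16 1024   # 1 GiB pages  -> prints 16
```

In practice it may be worth reserving slightly more than the VM's nominal size, since other consumers can also draw from the pool; that margin is a judgment call rather than a fixed rule.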
For 1GB hugepages (better for large VMs):
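1 GiB pages usually have to be reserved at boot through the kernel command line, because large contiguous regions are hard to find once the system has been running. The sketch below only builds the parameter string; the helper name is hypothetical, and where the string goes (for example GRUB_CMDLINE_LINUX in /etc/default/grub) depends on the bootloader and is an assumption here.

```shell
# Hypothetical helper: kernel parameters reserving <count> 1 GiB hugepages at boot.
hugepage_boot_args() {
  printf 'default_hugepagesz=1G hugepagesz=1G hugepages=%s\n' "$1"
}

# 16 pages of 1 GiB each, matching the 16GB reservation used earlier.
hugepage_boot_args 16
```

Before relying on this, it is worth checking that the CPU advertises 1 GiB page support (the pdpe1gb flag in /proc/cpuinfo) and, after a reboot, confirming the reservation with grep Huge /proc/meminfo.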
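VM Hugepages Configuration
Back the VM's memory with hugepages by adding a <hugepages/> element inside <memoryBacking>.
For 1GB pages, give <hugepages> a <page size='1' unit='G'/> child.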
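Both variants can be sketched with one small generator that prints the memoryBacking fragment for a given page size. The helper name hugepages_xml is hypothetical; the element and attribute names follow libvirt's domain XML format, and the output is meant to be pasted in via virsh edit rather than applied automatically.

```shell
# Hypothetical helper: print a <memoryBacking> fragment for a given page size.
hugepages_xml() {
  size=$1   # numeric page size, e.g. 2 or 1
  unit=$2   # libvirt unit suffix, e.g. M or G
  cat <<EOF
<memoryBacking>
  <hugepages>
    <page size='${size}' unit='${unit}'/>
  </hugepages>
</memoryBacking>
EOF
}

hugepages_xml 2 M   # 2 MiB pages
hugepages_xml 1 G   # 1 GiB pages
```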
vCPU Assignment
Basic vCPU Configuration
Set the vCPU count with the <vcpu> element in the VM XML, or via virsh:
# Set maximum, then current, vCPU count (raising the maximum requires a shutdown)
virsh setvcpus myvm 8 --config --maximum
virsh setvcpus myvm 8 --config
CPU Topology
Define sockets/cores/threads to match guest OS expectations:
<vcpu placement='static'>8</vcpu>
<cpu mode='host-passthrough'>
<topology sockets='1' dies='1' cores='4' threads='2'/>
</cpu>
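The topology values are expected to multiply out to the vCPU count (1 x 1 x 4 x 2 = 8 in the example above). A quick sanity check before editing the XML might look like this; the helper name topology_vcpus is hypothetical.

```shell
# Hypothetical helper: vCPU count implied by a <topology> element.
topology_vcpus() {
  sockets=$1; dies=$2; cores=$3; threads=$4
  echo $(( sockets * dies * cores * threads ))
}

vcpus=8                              # value of the <vcpu> element
implied=$(topology_vcpus 1 1 4 2)    # sockets dies cores threads

if [ "$implied" -eq "$vcpus" ]; then
  echo "topology matches <vcpu>: $implied"
else
  echo "mismatch: topology implies $implied, <vcpu> says $vcpus" >&2
fi
```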
CPU Pinning
Pin vCPUs to specific host cores for consistent performance:
<vcpu placement='static'>4</vcpu>
<cputune>
<vcpupin vcpu='0' cpuset='4'/>
<vcpupin vcpu='1' cpuset='5'/>
<vcpupin vcpu='2' cpuset='6'/>
<vcpupin vcpu='3' cpuset='7'/>
</cputune>
View the host CPU topology with lscpu --extended.
Strix Halo CCX pinning
The Ryzen AI Max+ 395 has 2 x CCX of 8 Zen 5 cores each, each CCX with its own L3 cache. Crossing the CCX boundary inside a single VM incurs a noticeable L3-miss penalty. For latency-sensitive guests (Windows VM for interactive use, gaming VM if you ever go that route), pin all vCPUs of one VM to a single CCX.
Check the CCX layout with lscpu --extended and look at the L3 value (the last field of the L1d:L1i:L2:L3 column); cores sharing an L3 value are on the same CCX. Typical layout:
- CCX 0: cores 0-7 (siblings 16-23 with SMT)
- CCX 1: cores 8-15 (siblings 24-31 with SMT)
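To confirm the layout on a given machine rather than trusting the typical one, the logical CPUs can be grouped by L3 ID. This is a sketch under assumptions: the helper name group_by_l3 is hypothetical, it expects the CPU,CACHE format printed by lscpu -p=CPU,CACHE on recent util-linux, and it treats the last cache field as L3. The sample input is illustrative, not captured from real hardware.

```shell
# Hypothetical helper: group logical CPUs by L3 cache ID.
# Expects "CPU,L1d:L1i:L2:L3" lines on stdin, as printed by: lscpu -p=CPU,CACHE
# Assumes L3 is the last cache level listed.
group_by_l3() {
  awk -F, '
    /^#/ { next }                       # skip lscpu header comments
    {
      n = split($2, cache, ":")         # cache IDs for this CPU
      l3 = cache[n]                     # last field is taken as the L3 ID
      if (l3 in cpus) { cpus[l3] = cpus[l3] " " $1 }
      else            { cpus[l3] = $1 }
    }
    END { for (id in cpus) print "L3 " id ": " cpus[id] }
  ' | sort
}

# Illustrative input only.
group_by_l3 <<'EOF'
# CPU,CACHE
0,0:0:0:0
16,0:0:0:0
8,8:8:8:1
24,8:8:8:1
EOF
```

On a real host the function would be fed directly: lscpu -p=CPU,CACHE | group_by_l3.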
Example: pin an 8-vCPU Windows VM entirely to CCX 1:
<vcpu placement='static'>8</vcpu>
<cputune>
<vcpupin vcpu='0' cpuset='8'/>
<vcpupin vcpu='1' cpuset='24'/> <!-- SMT sibling -->
<vcpupin vcpu='2' cpuset='9'/>
<vcpupin vcpu='3' cpuset='25'/>
<vcpupin vcpu='4' cpuset='10'/>
<vcpupin vcpu='5' cpuset='26'/>
<vcpupin vcpu='6' cpuset='11'/>
<vcpupin vcpu='7' cpuset='27'/>
<emulatorpin cpuset='0-1'/> <!-- emulator threads on the OTHER CCX -->
</cputune>
This pattern leaves CCX 0 (cores 0-7 + siblings 16-23) for the host, Docker services, and Ollama/llama.cpp — which is roughly the right split given the workload mix.
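The pinning list in that example follows a regular pattern: each physical core is followed by its SMT sibling. A small generator can produce it for any starting core; the helper name gen_vcpupin and its arguments are hypothetical, and the sibling offset of 16 is taken from the typical layout listed earlier, so it should be checked against the actual host.

```shell
# Hypothetical helper: emit <vcpupin> lines pairing each core with its SMT sibling.
#   $1 = first host core, $2 = number of cores, $3 = offset from a core to its sibling
gen_vcpupin() {
  first=$1; cores=$2; offset=$3
  vcpu=0
  i=0
  while [ "$i" -lt "$cores" ]; do
    core=$(( first + i ))
    printf "<vcpupin vcpu='%s' cpuset='%s'/>\n" "$vcpu" "$core"
    printf "<vcpupin vcpu='%s' cpuset='%s'/>\n" "$(( vcpu + 1 ))" "$(( core + offset ))"
    vcpu=$(( vcpu + 2 ))
    i=$(( i + 1 ))
  done
}

# Four cores starting at core 8, siblings 16 higher: reproduces the CCX 1 example.
gen_vcpupin 8 4 16
```

Keeping sibling threads on adjacent vCPU numbers lets a guest topology with threads='2' line up with the real SMT pairs.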
Emulator Pinning
Pin QEMU emulator threads separately:
<cputune>
<vcpupin vcpu='0' cpuset='4'/>
<vcpupin vcpu='1' cpuset='5'/>
<emulatorpin cpuset='0-1'/>
</cputune>
NUMA Configuration
For multi-socket systems or large VMs, NUMA awareness improves performance.
View Host NUMA
List the host's NUMA nodes with numactl --hardware.
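The kernel also exposes NUMA nodes under /sys/devices/system/node, which works without extra packages. A minimal sketch that lists each node with its CPUs; the helper name list_numa_nodes and its optional root argument are hypothetical. A single-socket machine will typically show only node0.

```shell
# Hypothetical helper: list NUMA nodes and their CPUs from sysfs.
# The optional argument overrides the sysfs root (handy for testing).
list_numa_nodes() {
  root=${1:-/sys/devices/system/node}
  for dir in "$root"/node[0-9]*; do
    [ -d "$dir" ] || continue           # glob matched nothing, or not a node dir
    printf '%s: cpus %s\n' "${dir##*/}" "$(cat "$dir/cpulist")"
  done
}

list_numa_nodes
```

The node numbers printed here are the values to use in the nodeset attribute of numatune.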
VM NUMA Configuration
<numatune>
<memory mode='strict' nodeset='0'/>
</numatune>
<cpu mode='host-passthrough'>
<numa>
<cell id='0' cpus='0-7' memory='16' unit='GiB'/>
</numa>
</cpu>
I/O Resource Control
Disk I/O Limits
<disk type='file' device='disk'>
<driver name='qemu' type='qcow2'/>
<source file='/mnt/tank/vm/myvm/disk.qcow2'/>
<target dev='vda' bus='virtio'/>
<iotune>
<read_bytes_sec>104857600</read_bytes_sec> <!-- 100 MiB/s -->
<write_bytes_sec>52428800</write_bytes_sec> <!-- 50 MiB/s -->
<read_iops_sec>1000</read_iops_sec>
<write_iops_sec>500</write_iops_sec>
</iotune>
</disk>
Set at runtime with virsh blkdeviotune; see I/O Limits under Live Resource Adjustment.
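The byte values in the iotune block are easier to get right when computed. A sketch that converts MiB/s to bytes/s and prints the matching virsh blkdeviotune command as a dry run; the helper name mib_to_bytes is hypothetical, while the domain (myvm) and target (vda) are the ones used throughout this page.

```shell
# Hypothetical helper: MiB/s to bytes/s, the unit blkdeviotune expects.
mib_to_bytes() {
  echo $(( $1 * 1024 * 1024 ))
}

read_bps=$(mib_to_bytes 100)
write_bps=$(mib_to_bytes 50)

# Print the command instead of running it, so it can be reviewed first.
echo "virsh blkdeviotune myvm vda --read-bytes-sec $read_bps --write-bytes-sec $write_bps --live"
```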
Network Bandwidth
<interface type='bridge'>
<source bridge='br0'/>
<bandwidth>
<inbound average='125000' peak='250000' burst='256'/> <!-- average/peak in KB/s, burst in KB -->
<outbound average='125000' peak='250000' burst='256'/>
</bandwidth>
</interface>
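The average and peak values are kilobytes per second, which is easy to confuse with link speeds quoted in Mbit/s. A small conversion sketch, assuming 1 KB = 1000 bytes (under that assumption 125000 corresponds to 1 Gbit/s); the helper name mbit_to_kbytes is hypothetical.

```shell
# Hypothetical helper: Mbit/s to kilobytes/s, assuming 1 KB = 1000 bytes.
mbit_to_kbytes() {
  echo $(( $1 * 1000 / 8 ))
}

mbit_to_kbytes 1000   # 1 Gbit/s -> prints 125000
mbit_to_kbytes 2000   # 2 Gbit/s -> prints 250000
```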
Resource Monitoring
VM Statistics
# Overview of all VMs
virsh list --all
# CPU and memory for running VMs
virsh domstats
# Specific VM stats
virsh domstats myvm
# CPU usage
virsh cpu-stats myvm
# Memory stats (requires balloon)
virsh dommemstat myvm
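The dommemstat output is a list of name/value pairs in KiB, so a rough "memory in use inside the guest" figure can be derived as available minus unused. The sketch below reads that format from stdin; the helper name guest_used_kib is hypothetical, both fields only appear when the balloon driver is reporting, and the sample numbers are illustrative rather than real measurements.

```shell
# Hypothetical helper: guest memory in use, from "name value" lines on stdin.
# Uses available - unused; prints n/a when either field is missing.
guest_used_kib() {
  awk '
    $1 == "available" { avail = $2 }
    $1 == "unused"    { unused = $2 }
    END {
      if (avail == "" || unused == "") { print "n/a"; exit }
      print avail - unused
    }
  '
}

# Illustrative numbers only.
guest_used_kib <<'EOF'
actual 16777216
unused 12582912
available 16252928
EOF
```

Against a running VM the same function would be used as: virsh dommemstat myvm | guest_used_kib.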
Detailed Statistics
# Block device stats
virsh domblkstat myvm vda
# Network stats
virsh domifstat myvm vnet0
# Complete domain info
virsh dominfo myvm
Real-Time Monitoring
# Watch VM stats (updates every 2 seconds)
watch -n2 'virsh domstats --cpu-total --balloon --block --interface'
# virt-top for interactive view
sudo apt install virt-top
virt-top
Live Resource Adjustment
Memory
# Adjust current memory (within max)
virsh setmem myvm 8G --live
# Check current allocation
virsh dominfo myvm | grep memory
vCPUs
# Reduce vCPUs (if hotplug enabled)
virsh setvcpus myvm 4 --live
# Check current vCPUs
virsh vcpucount myvm
I/O Limits
# Adjust disk I/O
virsh blkdeviotune myvm vda \
--read-bytes-sec 209715200 \
--write-bytes-sec 104857600 \
--live
# Adjust network bandwidth
virsh domiftune myvm vnet0 --inbound 250000,500000,512 --live
Best Practices
Windows VMs
| Setting | Recommendation |
|---|---|
| Memory | Fixed allocation, no ballooning |
| vCPUs | Even number, matching host topology |
| Storage | virtio with latest drivers |
| Network | virtio with latest drivers |
Windows-specific XML:
<features>
<hyperv mode='custom'>
<relaxed state='on'/>
<vapic state='on'/>
<spinlocks state='on' retries='8191'/>
<vpindex state='on'/>
<synic state='on'/>
<stimer state='on'/>
</hyperv>
</features>
<clock offset='localtime'>
<timer name='hypervclock' present='yes'/>
</clock>
Linux VMs
| Setting | Recommendation |
|---|---|
| Memory | Ballooning enabled for flexibility |
| vCPUs | Match workload, can overcommit |
| Storage | virtio-scsi for multiple disks |
| Network | virtio (default driver) |
General Guidelines
- Don't overcommit memory for production VMs
- Reserve host resources: Leave 4-8GB RAM and 2-4 cores for host
- Use CPU pinning for latency-sensitive workloads
- Enable hugepages for VMs >8GB
- Monitor regularly with virsh domstats and virt-top
Resource Budget Example
For a 128GB / 16-core (32 threads with SMT) host running mixed workloads:
| Workload | Memory | vCPUs | Notes |
|---|---|---|---|
| Host reserved | 8GB | 4 | OS, Docker, services |
| Windows VM | 24GB | 6 | Gaming, GPU passthrough |
| Linux VM | 16GB | 4 | Development |
| LLM inference | 64GB | 8 | Ollama containers |
| Available | 16GB | - | Headroom |
See Capacity Planning for system-wide resource strategy.
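A budget like the one in the table is worth totalling before committing to it. The sketch below sums the two numeric columns. The memory rows add up to exactly 128GB, while the vCPU rows add up to 22, which is more than 16 physical cores; that only works if SMT threads are counted or some CPU overcommit is accepted, an interpretation the table itself does not state.

```shell
# Sum the memory (GB) and vCPU columns of the budget table.
# Rows: host reserved, Windows VM, Linux VM, LLM inference, headroom.
total_mem=0
for gb in 8 24 16 64 16; do
  total_mem=$(( total_mem + gb ))
done

# The headroom row has no vCPU figure, so only four rows are summed.
total_vcpus=0
for n in 4 6 4 8; do
  total_vcpus=$(( total_vcpus + n ))
done

echo "memory: ${total_mem}GB of 128GB"
echo "vCPUs:  ${total_vcpus} against 16 cores"
```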
Next Steps
- KVM Setup for installation and configuration
- GPU Passthrough for dedicated VM graphics
- Windows 11 VM for gaming VM setup