GPU Setup¶

Configure AMD integrated graphics for AI workloads on the MS-S1 MAX.

Overview¶

The AMD Ryzen AI Max+ 395 (Strix Halo) APU combines CPU and GPU in a single package, and both draw on the same system memory. Unlike a discrete GPU with dedicated VRAM, the integrated RDNA 3.5 graphics shares the 128GB of quad-channel LPDDR5X-8000 system memory with the CPU.

For a comprehensive explanation of the APU architecture, memory subsystem, and design trade-offs, see Hardware Architecture.

Quick Reference¶

Aspect	MS-S1 MAX APU
Architecture	RDNA 3.5
Compute Units	40 CUs
GPU ID	gfx1151
Memory	Shared 128GB LPDDR5X-8000 (quad-channel, soldered)
Bandwidth	~256 GB/s peak, ~210-220 GB/s real-world
ROCm Support	Native (Ubuntu 26.04 ships ROCm 7.1 in Universe; gfx1151 supported upstream)
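The peak bandwidth figure in the table can be reproduced from the memory speed. The sketch below assumes a 256-bit memory bus, which is not stated in the table, and uses decimal gigabytes; the function name is illustrative.

```python
def peak_bandwidth_gb_s(transfer_rate_mt_s: float, bus_width_bits: int) -> float:
    """Theoretical peak memory bandwidth in GB/s (decimal gigabytes)."""
    bytes_per_transfer = bus_width_bits / 8
    return transfer_rate_mt_s * 1e6 * bytes_per_transfer / 1e9


# LPDDR5X-8000 is 8000 MT/s; the 256-bit bus width is an assumption.
peak = peak_bandwidth_gb_s(8000, 256)
print(f"theoretical peak: {peak:.0f} GB/s")

# Compare the real-world range quoted in the table against that peak.
for measured in (210, 220):
    print(f"{measured} GB/s is {measured / peak:.0%} of peak")
```

Under that assumption, the quoted real-world range works out to a little over 80% of the theoretical figure.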

Why APU for LLMs?¶

The MS-S1 MAX's 128GB configuration lets you run large models that exceed the VRAM of typical discrete GPUs:

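70B models at high quantization - Q6 or Q8 fits entirely in memory
405B models only at very low quantization - Q2-class formats (about 2 bits per weight) are the practical ceiling; 3 bits per weight already needs roughly 150GB
No model offloading - The entire model stays in GPU-accessible memory
Simple setup - No PCIe passthrough needed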
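Whether a model fits can be sanity-checked with back-of-the-envelope arithmetic. This is a rough sketch: it counts weights only, treats the quantization level as a flat bits-per-weight figure (real formats mix bit widths and add metadata), and ignores the KV cache and the memory the OS itself needs. The function names and the 128GB default budget are illustrative.

```python
def weights_size_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in decimal GB."""
    return params_billions * bits_per_weight / 8


def fits(params_billions: float, bits_per_weight: float, budget_gb: float = 128.0) -> bool:
    """True if the weights alone fit in the given memory budget."""
    return weights_size_gb(params_billions, bits_per_weight) <= budget_gb


for params, bits in [(70, 8), (70, 6), (405, 3), (405, 2)]:
    size = weights_size_gb(params, bits)
    verdict = "fits" if fits(params, bits) else "does not fit"
    print(f"{params}B at {bits} bits/weight: ~{size:.1f} GB, {verdict} in 128 GB")
```

In practice the usable budget is smaller than 128GB, so passing a lower `budget_gb` gives a more realistic answer.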

The tradeoff is lower memory bandwidth than a discrete GPU, which means fewer tokens per second. However, the ability to run larger models often outweighs raw speed.
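The link between bandwidth and generation speed can be approximated with a simple ceiling. The sketch assumes a dense model whose weights are all read once per generated token and ignores KV-cache traffic and compute time, so real throughput will be lower. The function name and the example sizes are illustrative.

```python
def max_tokens_per_second(bandwidth_gb_s: float, weights_gb: float) -> float:
    """Upper bound on decode speed when every weight is read once per token."""
    return bandwidth_gb_s / weights_gb


# Example: real-world bandwidth range against a ~70 GB set of weights
# (roughly a 70B model at 8 bits per weight).
for bandwidth in (210, 220):
    ceiling = max_tokens_per_second(bandwidth, 70)
    print(f"{bandwidth} GB/s over 70 GB of weights: at most ~{ceiling:.1f} tokens/s")
```

The same formula shows why a smaller quantization of the same model runs faster: halving the bytes read per token doubles the ceiling.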

Section Contents¶

Quick Start¶

Get from a bare Ubuntu 26.04 LTS install to running LLMs on the GPU in one page:

Linux 7.0 (default), apt install rocm, VRAM allocation, Ollama

ROCm Installation¶

Native ROCm installation for Ubuntu 26.04:

APU compatibility and current support status
In-distro ROCm 7.1 vs upstream AMD repo (newer ROCm)
Verification with rocminfo and rocm-smi
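As a rough illustration of the verification step, the sketch below scans `rocminfo` output for GPU target names and checks for `gfx1151`. The helper names are hypothetical, the regular expression is a loose heuristic rather than a parser for the tool's format, and the script only invokes `rocminfo` if it is on the `PATH`.

```python
import re
import shutil
import subprocess
from typing import List, Optional

# Loose heuristic: "gfx" followed by 3-5 hex digits, e.g. gfx1151.
GFX_PATTERN = re.compile(r"\bgfx[0-9a-f]{3,5}\b")


def find_gfx_targets(rocminfo_text: str) -> List[str]:
    """Return the distinct gfx target names found in the text, in order."""
    seen: List[str] = []
    for name in GFX_PATTERN.findall(rocminfo_text):
        if name not in seen:
            seen.append(name)
    return seen


def read_rocminfo() -> Optional[str]:
    """Run rocminfo if it is installed; return its stdout, else None."""
    if shutil.which("rocminfo") is None:
        return None
    result = subprocess.run(["rocminfo"], capture_output=True, text=True, check=False)
    return result.stdout


if __name__ == "__main__":
    output = read_rocminfo()
    if output is None:
        print("rocminfo not found; is ROCm installed?")
    elif "gfx1151" in find_gfx_targets(output):
        print("gfx1151 detected")
    else:
        print("gfx1151 not reported; targets seen:", find_gfx_targets(output))
```

The ROCm Installation page remains the reference for what a healthy `rocminfo` and `rocm-smi` report looks like.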

Driver Updates¶

Keeping AMD drivers current:

Checking installed versions
Update procedures
Handling conflicts
Rollback if needed

Memory Configuration¶

Optimizing memory for AI workloads:

Software VRAM allocation with amd-ttm (108-115GB)
UMA Frame Buffer Size settings
Kernel parameter alternatives
Bandwidth considerations
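For the kernel-parameter route, an allocation target usually has to be converted into a page count. The sketch below assumes the 108-115GB range is meant as GiB, that pages are 4 KiB, and that the parameter in question counts pages (the TTM `pages_limit` module parameter is assumed here as the example; the Memory Configuration page is the authority on which parameter actually applies). The function name is illustrative.

```python
PAGE_SIZE_BYTES = 4096  # assumption: 4 KiB pages


def gib_to_pages(gib: int) -> int:
    """Convert a size in GiB to a count of 4 KiB pages."""
    return gib * 1024**3 // PAGE_SIZE_BYTES


for target_gib in (108, 115):
    print(f"{target_gib} GiB = {gib_to_pages(target_gib)} pages")
```

Whatever remains above the chosen value stays available to the OS and CPU-side processes, which is why the range stops short of the full 128GB.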

Related Documentation¶

BIOS Setup - Configure BIOS for optimal APU performance
Hardware - MS-S1 MAX specifications
GPU Containers - ROCm in Docker
Unified Memory - Memory concepts (written for Apple Silicon, but the concepts apply here too)
