
GPU Setup

Configure AMD integrated graphics for AI workloads on the MS-S1 MAX.

Overview

The AMD Ryzen AI Max+ 395 APU combines CPU and GPU on a single chip, and both draw on the same pool of system memory. Unlike discrete GPUs with dedicated VRAM, the integrated RDNA 3.5 graphics shares the 128GB of LPDDR5X system RAM.

For a comprehensive explanation of the APU architecture, memory subsystem, and design trade-offs, see Hardware Architecture.

Quick Reference

| Aspect | MS-S1 MAX APU |
|---|---|
| Architecture | RDNA 3.5 |
| Compute Units | 40 CUs |
| GPU ID | gfx1151 |
| Memory | Shared 128GB LPDDR5X |
| Bandwidth | ~256 GB/s theoretical peak |
| ROCm Support | Experimental (requires HSA_OVERRIDE_GFX_VERSION) |

Why APU for LLMs?

The MS-S1 MAX's 128GB configuration enables running large models that exceed typical discrete GPU VRAM:

  • 70B models at high-precision quantization - Q6 or Q8 fits entirely in memory
  • 405B models at low-bit quantization - Q2-Q3 within reach
  • No model offloading - Entire model stays in accessible memory
  • Simple setup - No PCIe passthrough needed
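
As a rough sanity check on these sizes, weight memory can be approximated as parameter count times bits per weight, divided by eight. The sketch below is a back-of-the-envelope estimate only: the names `weights_gb` and `fits` are illustrative, the bit widths are nominal (real quantization formats carry extra bits for scales and metadata), and the estimate ignores the KV cache, activations, and memory reserved for the operating system.

```python
def weights_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight size in GB (10^9 bytes) for a quantized model."""
    # (params_billion * 1e9 weights) * (bits / 8 bytes per weight) / 1e9 bytes per GB
    return params_billion * bits_per_weight / 8


def fits(params_billion: float, bits_per_weight: float, budget_gb: float = 128.0) -> bool:
    """True if the weights alone fit within the given memory budget."""
    return weights_gb(params_billion, bits_per_weight) <= budget_gb


if __name__ == "__main__":
    for bits in (6, 8):
        print(f"70B at {bits} bits/weight: ~{weights_gb(70, bits):.1f} GB")
```

At nominal widths a 70B model needs roughly 52.5 GB at 6 bits and 70 GB at 8 bits, which leaves headroom for context. A 405B model needs about 101 GB at a nominal 2 bits per weight, so only the most aggressive quantization levels approach the 128GB budget.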

The tradeoff is lower memory bandwidth than discrete GPUs offer, which translates into fewer tokens per second. For many workloads, the ability to run larger models outweighs raw speed.
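
To see why bandwidth limits generation speed, a common rule of thumb is that a dense model reads all of its weights once per generated token, so single-stream decode speed is bounded by bandwidth divided by weight size. The following is a minimal sketch of that upper bound, not a benchmark: the function name is hypothetical, and the bandwidth passed in should be the figure measured on your own system.

```python
def max_decode_tokens_per_s(bandwidth_gb_s: float, weights_gb: float) -> float:
    """Upper bound on single-stream decode speed for a dense model.

    Assumes every weight is read once per token and ignores compute time,
    KV-cache traffic, and cache effects, so real throughput will be lower.
    """
    if bandwidth_gb_s <= 0 or weights_gb <= 0:
        raise ValueError("bandwidth and weight size must be positive")
    return bandwidth_gb_s / weights_gb


if __name__ == "__main__":
    # 100 GB/s is a placeholder round number, not a measured value.
    print(f"{max_decode_tokens_per_s(100.0, 50.0):.1f} tokens/s at most")
```

The bound makes the tradeoff concrete: halving the weight size through stronger quantization roughly doubles the ceiling on tokens per second at the same bandwidth.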

Section Contents

ROCm Installation

Native ROCm installation for Ubuntu 24.04:

  • APU compatibility and current support status
  • Installation using amdgpu-install
  • Environment variables for Strix Halo
  • Verification with rocminfo and rocm-smi
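
The override and verification steps listed above can be sketched in a small helper that sets HSA_OVERRIDE_GFX_VERSION for a child process and scans the rocminfo output for gfx target names. This is an illustration under stated assumptions: the value "11.0.0" is only a placeholder (the correct override depends on the ROCm release and is covered on the ROCm Installation page), and the function names are hypothetical.

```python
import os
import re
import shutil
import subprocess

# ASSUMPTION: placeholder override value, not a recommendation from this guide.
PLACEHOLDER_OVERRIDE = "11.0.0"


def rocm_env(override: str = PLACEHOLDER_OVERRIDE) -> dict:
    """Return a copy of the current environment with the override set."""
    env = dict(os.environ)
    env["HSA_OVERRIDE_GFX_VERSION"] = override
    return env


def gfx_targets(rocminfo_output: str) -> list:
    """Extract the unique gfx target names (such as gfx1151) from text."""
    return sorted(set(re.findall(r"\bgfx[0-9a-f]+\b", rocminfo_output)))


def detect_gpu(override: str = PLACEHOLDER_OVERRIDE) -> list:
    """Run rocminfo if it is on PATH and return the gfx targets it reports."""
    exe = shutil.which("rocminfo")
    if exe is None:
        return []  # ROCm is not installed or not on PATH
    result = subprocess.run(
        [exe], env=rocm_env(override), capture_output=True, text=True, check=False
    )
    return gfx_targets(result.stdout)


if __name__ == "__main__":
    print(detect_gpu() or "no gfx targets reported")
```

An empty result means either that rocminfo is missing or that it did not list a GPU agent, so it is a prompt to revisit the installation steps rather than a definitive diagnosis.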

Driver Updates

Keeping AMD drivers current:

  • Checking installed versions
  • Update procedures
  • Handling conflicts
  • Rollback if needed

Memory Configuration

Optimizing memory for AI workloads:

  • UMA Frame Buffer Size settings
  • Memory allocation strategies
  • Bandwidth considerations
  • Monitoring memory usage
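
For monitoring, the amdgpu kernel driver exposes memory counters as sysfs files, which can be read without any ROCm tooling. The sketch below assumes the GPU is `card0` (the index can differ on systems with more than one DRM device) and uses a hypothetical helper name. On an APU, the VRAM counters describe the UMA frame buffer carve-out, while the GTT counters describe system memory the GPU can map.

```python
from pathlib import Path

# ASSUMPTION: the APU's GPU is card0; adjust the index if needed.
DEFAULT_DEVICE_DIR = Path("/sys/class/drm/card0/device")

# Byte counters published by the amdgpu driver.
COUNTERS = (
    "mem_info_vram_total",
    "mem_info_vram_used",
    "mem_info_gtt_total",
    "mem_info_gtt_used",
)


def read_gpu_memory(device_dir: Path = DEFAULT_DEVICE_DIR) -> dict:
    """Return the available counters in GiB, skipping files that are missing."""
    usage = {}
    for name in COUNTERS:
        path = Path(device_dir) / name
        if path.is_file():
            usage[name] = int(path.read_text().strip()) / 2**30
    return usage


if __name__ == "__main__":
    for name, gib in read_gpu_memory().items():
        print(f"{name}: {gib:.2f} GiB")
```

Polling these values while a model loads shows whether allocations land in the dedicated carve-out or in shared system memory, which helps when tuning the UMA Frame Buffer Size.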