GPU Setup

This guide explains how to configure Ollama for GPU acceleration on NVIDIA, AMD, and Intel GPUs.

GPU Acceleration Overview

Compared with CPU-only inference, GPU acceleration improves Ollama's performance in several ways:

  • Faster model loading times

  • Increased inference speed (token generation)

  • Higher throughput for concurrent requests

  • Ability to run larger models efficiently

Hardware Requirements

| GPU Manufacturer | Minimum Requirements | Recommended |
|---|---|---|
| NVIDIA | CUDA-capable GPU with compute capability 5.0+ (Maxwell generation or newer, which includes the GTX 10xx/Pascal series) | RTX series (30xx/40xx) |
| AMD | ROCm-compatible GPU (CDNA/RDNA), Radeon RX 6000 series or newer | Radeon RX 7000 series |
| Intel | Intel Arc GPUs with oneAPI support | Intel Arc A770/A750 |

NVIDIA GPU Setup

NVIDIA GPUs offer the best performance and compatibility with Ollama through CUDA integration.

Prerequisites

  1. Install the NVIDIA driver:

  2. Install the CUDA toolkit (11.4 or newer recommended):

  3. Add CUDA to your PATH:
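
The three steps above depend on the distribution. As a minimal sketch, assuming Ubuntu and a toolkit installed under /usr/local/cuda (both are assumptions, not requirements of Ollama), they might look like the following. The function names are illustrative and the install steps are wrapped in functions so that nothing is installed until you call them.

```shell
# Step 1 (assumes Ubuntu): install the recommended proprietary driver.
install_nvidia_driver() {
    sudo ubuntu-drivers autoinstall
}

# Step 2 (assumes Ubuntu's packaged toolkit; NVIDIA's own repository is an alternative).
install_cuda_toolkit() {
    sudo apt-get install -y nvidia-cuda-toolkit
}

# Step 3: put the CUDA binaries and libraries on the search paths.
# Only needed when the toolkit lives under /usr/local/cuda, as with NVIDIA's installer.
export PATH="/usr/local/cuda/bin:$PATH"
export LD_LIBRARY_PATH="/usr/local/cuda/lib64${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
```

Add the two export lines to your shell profile to make them persistent, and reboot after installing the driver.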

Configuring Ollama for NVIDIA GPUs

Ollama automatically detects NVIDIA GPUs when available. You can customize GPU utilization with environment variables:
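
As an illustration, the sketch below sets a few environment variables before starting the server. The values are examples only, and start_ollama is a hypothetical helper name. Exported variables affect an ollama serve process started from the same shell; if Ollama runs as a systemd service, set them in the service unit instead (for example via systemctl edit ollama.service).

```shell
# Restrict Ollama to one NVIDIA GPU. Numeric IDs can be reordered between
# boots; GPU UUIDs from `nvidia-smi -L` are a more stable alternative.
export CUDA_VISIBLE_DEVICES=0
# Number of requests each loaded model serves in parallel (example value).
export OLLAMA_NUM_PARALLEL=2
# Upper bound on models kept loaded at the same time (example value).
export OLLAMA_MAX_LOADED_MODELS=1

# Start the server with these settings (wrapped in a function so that
# sourcing this snippet does not launch anything).
start_ollama() {
    ollama serve
}
```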

Verifying GPU Usage
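
One way to check, sketched here under the assumption that a model is already loaded (for example by a running ollama run session), is to compare the driver's view with Ollama's own. The helper names have and check_nvidia_gpu are illustrative.

```shell
# Return success when the named command is installed.
have() {
    command -v "$1" >/dev/null 2>&1
}

# Print driver-level and Ollama-level views of GPU usage.
check_nvidia_gpu() {
    if have nvidia-smi; then
        nvidia-smi    # GPU utilization, memory in use, and owning processes
    else
        echo "nvidia-smi not found: is the NVIDIA driver installed?"
    fi
    if have ollama; then
        ollama ps     # the PROCESSOR column shows the GPU/CPU split per loaded model
    else
        echo "ollama not found on PATH"
    fi
}
```

Call check_nvidia_gpu while a model is loaded; an Ollama process listed in the nvidia-smi output and a GPU share in the PROCESSOR column both indicate that the GPU is in use.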

NVIDIA Docker Setup

For Docker-based deployments:
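
A minimal sketch, assuming the NVIDIA Container Toolkit is already installed and configured for Docker. The function name is illustrative; the volume name, container name, and port follow the commonly used defaults for the ollama/ollama image.

```shell
# Start Ollama in a container with access to all NVIDIA GPUs.
# Model data persists in the named volume "ollama"; the API listens on port 11434.
start_ollama_nvidia_container() {
    docker run -d --gpus=all \
        -v ollama:/root/.ollama \
        -p 11434:11434 \
        --name ollama \
        ollama/ollama
}
```

After calling the function, docker exec -it ollama ollama run followed by a model name starts a model inside the container.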

AMD GPU Setup

AMD GPU support in Ollama uses the ROCm platform.

Prerequisites

  1. Install the ROCm driver stack:

  2. Add your user to the render group:

  3. Set up environment variables:
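
The steps above can be sketched as follows, assuming AMD's amdgpu-install helper has already been set up from AMD's package repository and that ROCm uses its default /opt/rocm prefix. The function names are illustrative, and the privileged steps are wrapped in functions so that nothing changes until you call them.

```shell
# Step 1 (assumes the amdgpu-install helper is available).
install_rocm() {
    sudo amdgpu-install --usecase=rocm
}

# Step 2: grant the current user access to the GPU device nodes.
# Log out and back in afterwards so the new group membership applies.
add_user_to_render_group() {
    sudo usermod -aG render "$USER"
}

# Step 3: expose the ROCm command-line tools (rocm-smi, rocminfo).
export PATH="/opt/rocm/bin:$PATH"
```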

Configuring Ollama for AMD GPUs
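
As a hedged example, the variables below select a device and, where needed, override the reported GPU architecture. The values are illustrative. Set them in the environment of the Ollama server process before it starts.

```shell
# Limit Ollama to the first AMD GPU (comma-separated list of device indices).
export ROCR_VISIBLE_DEVICES=0
# Only for GPUs that ROCm does not officially support: present the device as a
# supported target. 10.3.0 corresponds to gfx1030 (RDNA 2). Remove this line
# on officially supported GPUs.
export HSA_OVERRIDE_GFX_VERSION=10.3.0
```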

Verifying AMD GPU Support

AMD Docker Setup
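
A minimal sketch using the ROCm variant of the ollama/ollama image. The function name is illustrative; the two --device flags pass the host's ROCm compute and render device nodes into the container.

```shell
# Start Ollama in a container with access to the host's AMD GPU devices.
# /dev/kfd is the ROCm compute interface; /dev/dri holds the render nodes.
start_ollama_amd_container() {
    docker run -d \
        --device /dev/kfd --device /dev/dri \
        -v ollama:/root/.ollama \
        -p 11434:11434 \
        --name ollama \
        ollama/ollama:rocm
}
```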

Intel GPU Setup

Intel Arc GPUs can accelerate Ollama through oneAPI integration.

Prerequisites

  1. Install the Intel GPU drivers:

  2. Add oneAPI to your PATH:
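
For step 2, oneAPI ships a setvars.sh script that adds its tools and libraries to the current shell's environment. The sketch below assumes the default system-wide install prefix /opt/intel/oneapi; ONEAPI_SETVARS is an illustrative variable name. Driver installation (step 1) is distribution-specific and is not shown.

```shell
# Load the oneAPI environment (PATH, library paths) into the current shell.
# /opt/intel/oneapi is the default system-wide install prefix (assumed here).
ONEAPI_SETVARS="/opt/intel/oneapi/setvars.sh"
if [ -f "$ONEAPI_SETVARS" ]; then
    . "$ONEAPI_SETVARS"
else
    echo "oneAPI not found at $ONEAPI_SETVARS; adjust the path for your install"
fi
```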

Configuring Ollama for Intel GPUs

Verifying Intel GPU Support

Troubleshooting GPU Issues

Common NVIDIA Issues

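| Issue | Solution |
|---|---|
| CUDA not found | Verify the CUDA installation: nvcc --version |
| Insufficient memory | Use a smaller or quantized model (for example a 7B q4_0 variant), or reduce the context window through the num_ctx parameter, for example /set parameter num_ctx 2048 inside an ollama run session |
| Multiple GPU conflict | Specify the device: export CUDA_VISIBLE_DEVICES=0 |
| Driver/CUDA mismatch | Install compatible versions (see NVIDIA Compatibility) |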
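
For the insufficient-memory case, a smaller context window can also be made permanent by deriving a model variant from a Modelfile. This is a minimal sketch: mistral and mistral-2k are example model names, and create_small_context_model is an illustrative helper.

```shell
# Write a Modelfile that derives a variant of an existing model with a
# 2048-token context window.
workdir="$(mktemp -d)"
cat > "$workdir/Modelfile" <<'EOF'
FROM mistral
PARAMETER num_ctx 2048
EOF

# Build and run the variant (wrapped in a function so nothing is pulled
# or started until you call it).
create_small_context_model() {
    ollama create mistral-2k -f "$workdir/Modelfile"
    ollama run mistral-2k
}
```

A smaller num_ctx shrinks the KV cache, which is often enough to fit a model that otherwise spills out of GPU memory.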

Common AMD Issues

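| Issue | Solution |
|---|---|
| ROCm device not found | Check the installation: rocm-smi |
| HIP runtime error | Set HSA_OVERRIDE_GFX_VERSION=10.3.0 (needed only for GPUs that ROCm does not officially support) |
| Permission issues | Add the user to the render group: sudo usermod -aG render $USER |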

Common Intel Issues

| Issue | Solution |
|---|---|
| GPU not detected | Verify the driver installation: clinfo |
| Memory allocation failed | Pass the OpenCL build option -cl-intel-greater-than-4GB-buffer-required |
| Driver too old | Update the Intel GPU driver |

Performance Optimization

NVIDIA Performance Tips
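
As a starting point, the following server-side environment variables are worth experimenting with. The values are examples rather than recommendations, and their effect depends on the model and GPU. Set them in the environment of the Ollama server before it starts.

```shell
# Enable flash attention, which can reduce memory use at larger context sizes.
export OLLAMA_FLASH_ATTENTION=1
# Quantize the KV cache to 8 bits (takes effect only with flash attention enabled).
export OLLAMA_KV_CACHE_TYPE=q8_0
# Keep models resident for 30 minutes after the last request to avoid reloads.
export OLLAMA_KEEP_ALIVE=30m
```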

AMD Performance Tips

Intel Performance Tips

Multi-GPU Configuration

For systems with multiple GPUs:
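
A minimal sketch for two NVIDIA GPUs; the device list is an example. By default Ollama places a model on a single GPU when it fits and splits it across GPUs only when it does not. The second variable changes that behavior.

```shell
# Expose the first two NVIDIA GPUs to Ollama.
export CUDA_VISIBLE_DEVICES=0,1
# Spread each model across all visible GPUs instead of packing it onto
# as few GPUs as will fit.
export OLLAMA_SCHED_SPREAD=1
```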

Real-World Deployment Examples

High-Performance Server (4x NVIDIA RTX 4090)

Mixed GPU Environment (NVIDIA + AMD)

For environments with both NVIDIA and AMD GPUs:

NixOS GPU Configuration

For NixOS users, configure GPU acceleration in configuration.nix:

Next Steps

After configuring GPU acceleration for Ollama:

  1. Explore available models optimized for GPU acceleration

  2. Set up advanced configurations for optimal performance

  3. Set up Open WebUI for a graphical interface
