GPU Setup
This guide explains how to configure Ollama for GPU acceleration on NVIDIA, AMD, and Intel GPUs.
GPU Acceleration Overview
Running models on a GPU instead of the CPU typically improves Ollama's performance in several ways:
Faster model loading times
Increased inference speed (token generation)
Higher throughput for concurrent requests
Ability to run larger models efficiently
Hardware Requirements
NVIDIA
Minimum: CUDA-capable GPU (compute capability 5.0+), Pascal/10xx series or newer
Recommended: RTX series (30xx/40xx)
AMD
Minimum: ROCm-compatible GPU (CDNA/RDNA), Radeon RX 6000+ series
Recommended: Radeon RX 7000 series
Intel
Minimum: Intel Arc GPU with oneAPI support
Recommended: Intel Arc A770/A750
NVIDIA GPU Setup
Ollama supports NVIDIA GPUs through CUDA, and this is generally the most broadly compatible GPU path.
Prerequisites
Install the NVIDIA driver:
Install the CUDA toolkit (11.4 or newer recommended):
Add CUDA to your PATH:
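The PATH step can be sketched as follows. It assumes the toolkit lives under /usr/local/cuda, the usual prefix for NVIDIA's Linux packages; CUDA_HOME is only a naming convention used here and should be adjusted if your distribution installs CUDA elsewhere. The loop at the end simply reports whether the driver and toolkit from the earlier steps are reachable.

```shell
# Assumed install prefix; export CUDA_HOME beforehand to override it.
CUDA_HOME="${CUDA_HOME:-/usr/local/cuda}"

# Put the CUDA tools (nvcc and friends) on PATH and expose the libraries.
export PATH="$CUDA_HOME/bin:$PATH"
export LD_LIBRARY_PATH="$CUDA_HOME/lib64${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"

# Report whether the driver tool and the toolkit compiler can be found.
for tool in nvidia-smi nvcc; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "$tool: found at $(command -v "$tool")"
  else
    echo "$tool: not found"
  fi
done
```

To make the change persistent, append the two export lines to ~/.bashrc or your shell's equivalent startup file.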
Configuring Ollama for NVIDIA GPUs
Ollama automatically detects NVIDIA GPUs when available. You can customize GPU utilization with environment variables:
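As a minimal sketch, the variables below restrict Ollama to one GPU and cap concurrency. The values are illustrative assumptions rather than recommendations, and they must be set in the environment of the process that runs the Ollama server.

```shell
# Restrict Ollama to one GPU; a numeric index or a GPU UUID is accepted.
export CUDA_VISIBLE_DEVICES=0
# Number of requests each loaded model may serve in parallel (example value).
export OLLAMA_NUM_PARALLEL=2
# Maximum number of models kept loaded at the same time (example value).
export OLLAMA_MAX_LOADED_MODELS=1

echo "GPUs visible to Ollama: $CUDA_VISIBLE_DEVICES"
# Start the server from this shell so it inherits the settings:
# ollama serve
```

If Ollama runs as a systemd service, exports in an interactive shell do not reach it; set the same variables in the service's environment instead.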
Verifying GPU Usage
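One way to check that a model actually runs on the GPU is to combine nvidia-smi with ollama ps. The sketch below guards each command so it can be run on any machine; the gpu_tool variable is a hypothetical helper used only for reporting.

```shell
# Show driver version, GPU memory use, and processes using the GPU.
if command -v nvidia-smi >/dev/null 2>&1; then
  nvidia-smi || echo "nvidia-smi failed: check that the driver is loaded"
  gpu_tool="present"
else
  echo "nvidia-smi not found: the NVIDIA driver is probably not installed"
  gpu_tool="missing"
fi

# List loaded models; the PROCESSOR column shows GPU versus CPU placement.
if command -v ollama >/dev/null 2>&1; then
  ollama ps || echo "ollama ps failed: is the Ollama server running?"
fi
```

Run it while a model is loaded. A PROCESSOR value of 100% GPU in the ollama ps output indicates the model is fully offloaded.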
NVIDIA Docker Setup
For Docker-based deployments:
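A typical invocation, assuming the NVIDIA Container Toolkit is already installed on the host so that Docker's --gpus flag works, looks like the sketch below. The volume and container names are examples, and docker_status is a hypothetical helper variable.

```shell
# Start the official image with all host GPUs exposed to the container.
if command -v docker >/dev/null 2>&1; then
  docker run -d --gpus=all \
    -v ollama:/root/.ollama \
    -p 11434:11434 \
    --name ollama \
    ollama/ollama || echo "docker run failed: check the daemon and GPU runtime"
  docker_status="attempted"
else
  echo "docker not found: install Docker first"
  docker_status="skipped"
fi
```

Once the container is up, a model can be started inside it with docker exec -it ollama ollama run mistral.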
AMD GPU Setup
AMD GPU support in Ollama uses the ROCm platform.
Prerequisites
Install the ROCm driver stack:
Add your user to the render group:
Set up environment variables:
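The environment step might look like the following sketch. It assumes ROCm is installed under /opt/rocm; the HSA_OVERRIDE_GFX_VERSION line is left commented out because it is only needed for GPUs that ROCm does not officially support. The last part checks the render group membership from the previous step.

```shell
# Assumed ROCm install prefix; adjust if your packages use a versioned path.
ROCM_PATH="${ROCM_PATH:-/opt/rocm}"
export ROCM_PATH
export PATH="$ROCM_PATH/bin:$PATH"

# Only for GPUs missing from ROCm's support list: treat the card as
# gfx1030 (10.3.0). Uncomment and adjust the version to your GPU family.
# export HSA_OVERRIDE_GFX_VERSION=10.3.0

# Check whether the current user is in the render group.
if id -nG | tr ' ' '\n' | grep -qx render; then
  echo "user is in the render group"
else
  echo "user is not in the render group (log out and back in after usermod)"
fi
```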
Configuring Ollama for AMD GPUs
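On systems with more than one AMD GPU, Ollama can be limited to a subset through ROCR_VISIBLE_DEVICES. The sketch below selects the first device; the index is an example, so list your devices with rocminfo to find the right one.

```shell
# Limit Ollama to the first AMD GPU; use a comma-separated list for several.
export ROCR_VISIBLE_DEVICES=0

echo "AMD GPUs visible to Ollama: $ROCR_VISIBLE_DEVICES"
# Start the server from this shell so it inherits the setting:
# ollama serve
```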
Verifying AMD GPU Support
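A quick check, assuming the ROCm command-line tools are on PATH, is to confirm that ROCm lists a gfx target and then see where Ollama placed the model. The rocm_tool variable is a hypothetical helper used only for reporting.

```shell
# rocminfo lists the agents ROCm can see; GPU agents report a gfx target.
if command -v rocminfo >/dev/null 2>&1; then
  rocminfo | grep -i "gfx" || echo "no gfx target reported by rocminfo"
  rocm_tool="present"
else
  echo "rocminfo not found: is the ROCm stack installed and on PATH?"
  rocm_tool="missing"
fi

# List loaded models; the PROCESSOR column shows GPU versus CPU placement.
if command -v ollama >/dev/null 2>&1; then
  ollama ps || echo "ollama ps failed: is the Ollama server running?"
fi
```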
AMD Docker Setup
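For containers, the ROCm build of the official image needs the host's /dev/kfd and /dev/dri devices passed through. A minimal sketch, with example volume and container names and a hypothetical docker_status helper:

```shell
# Start the ROCm image with the AMD compute and render devices passed in.
if command -v docker >/dev/null 2>&1; then
  docker run -d \
    --device /dev/kfd --device /dev/dri \
    -v ollama:/root/.ollama \
    -p 11434:11434 \
    --name ollama \
    ollama/ollama:rocm || echo "docker run failed: check the daemon and devices"
  docker_status="attempted"
else
  echo "docker not found: install Docker first"
  docker_status="skipped"
fi
```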
Intel GPU Setup
Intel Arc GPUs can accelerate Ollama through oneAPI integration.
Prerequisites
Install the Intel GPU drivers:
Add oneAPI to your PATH:
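Intel's toolkits ship a setvars.sh script that sets PATH and the library paths in one step. The sketch below assumes the default /opt/intel/oneapi prefix and sources the script only if it exists; oneapi_status is a hypothetical helper variable.

```shell
# Assumed default install prefix for the Intel toolkits.
ONEAPI_ROOT="${ONEAPI_ROOT:-/opt/intel/oneapi}"

if [ -f "$ONEAPI_ROOT/setvars.sh" ]; then
  # setvars.sh extends PATH and LD_LIBRARY_PATH for the current shell.
  . "$ONEAPI_ROOT/setvars.sh"
  oneapi_status="loaded"
else
  echo "setvars.sh not found under $ONEAPI_ROOT: is the toolkit installed?"
  oneapi_status="missing"
fi
```

The script has to be sourced, not executed, so that the variables persist in the calling shell.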
Configuring Ollama for Intel GPUs
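Device selection on Intel hardware is normally handled by the SYCL runtime, not by Ollama itself. As a sketch, the ONEAPI_DEVICE_SELECTOR variable below pins SYCL applications to the first Level Zero GPU. Whether your Ollama build honors it is an assumption that depends on how that build was compiled.

```shell
# Ask the SYCL runtime to expose only the first Level Zero GPU.
export ONEAPI_DEVICE_SELECTOR="level_zero:0"

echo "SYCL device selector: $ONEAPI_DEVICE_SELECTOR"
# Start the server from this shell so it inherits the setting:
# ollama serve
```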
Verifying Intel GPU Support
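Two tools are commonly used to confirm that the Intel GPU is visible: sycl-ls from the oneAPI toolkit and clinfo for OpenCL. The sketch below runs whichever is installed; intel_tool is a hypothetical helper variable.

```shell
intel_tool="missing"

# sycl-ls lists the devices visible to the SYCL runtime.
if command -v sycl-ls >/dev/null 2>&1; then
  sycl-ls || echo "sycl-ls failed"
  intel_tool="present"
fi

# clinfo -l prints a compact list of OpenCL platforms and devices.
if command -v clinfo >/dev/null 2>&1; then
  clinfo -l || echo "clinfo failed"
  intel_tool="present"
fi

if [ "$intel_tool" = "missing" ]; then
  echo "neither sycl-ls nor clinfo found: install the Intel GPU runtime tools"
fi
```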
Troubleshooting GPU Issues
Common NVIDIA Issues
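CUDA not found: verify the CUDA installation with nvcc --version
Insufficient memory: switch to a smaller or more heavily quantized model, or reduce the context window, for example with /set parameter num_ctx 2048 inside an ollama run session
Multiple GPU conflict: pin Ollama to one device with export CUDA_VISIBLE_DEVICES=0
Driver/CUDA mismatch: install driver and CUDA versions that are compatible with each other (see NVIDIA's CUDA compatibility documentation)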
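For the insufficient-memory case, a smaller context window can also be baked into a model variant through a Modelfile. This is a sketch: the variant name mistral-2k is a hypothetical example, and 2048 is the context size used in the entry above.

```shell
# Write a Modelfile that derives a 2048-token-context variant of mistral.
cat > Modelfile <<'EOF'
FROM mistral
PARAMETER num_ctx 2048
EOF

# Build the variant if Ollama is available (the name is only an example).
if command -v ollama >/dev/null 2>&1; then
  ollama create mistral-2k -f Modelfile \
    || echo "ollama create failed: is the Ollama server running?"
else
  echo "ollama not found: Modelfile written but no model created"
fi
```

The variant can then be started with ollama run mistral-2k. A smaller context reduces the memory reserved for the KV cache.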
Common AMD Issues
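ROCm device not found: check the installation with rocm-smi
HIP runtime error: set HSA_OVERRIDE_GFX_VERSION=10.3.0 (this value targets gfx1030; other GPU families need a different version)
Permission issues: add your user to the render group with sudo usermod -aG render $USER, then log out and back in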
Common Intel Issues
GPU not detected: verify the driver installation with clinfo
Memory allocation failed: pass the OpenCL build option -cl-intel-greater-than-4GB-buffer-required
Driver too old: update the Intel GPU driver
Performance Optimization
NVIDIA Performance Tips
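A few server-side settings tend to matter most on NVIDIA hardware. The sketch below enables flash attention, quantizes the KV cache, and keeps models resident between requests. The specific values are assumptions to tune for your workload, not recommendations.

```shell
# Flash attention lowers memory use as the context grows.
export OLLAMA_FLASH_ATTENTION=1
# Quantize the KV cache to 8 bits (takes effect with flash attention on).
export OLLAMA_KV_CACHE_TYPE=q8_0
# Keep models loaded for 30 minutes after the last request.
export OLLAMA_KEEP_ALIVE=30m

# Optional, needs root: keep the driver initialized between jobs.
# sudo nvidia-smi -pm 1

echo "flash attention: $OLLAMA_FLASH_ATTENTION, KV cache: $OLLAMA_KV_CACHE_TYPE"
```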
AMD Performance Tips
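On AMD cards the first thing to check is whether the model fits entirely in VRAM, since spilling layers to the CPU slows generation. As a sketch, rocm-smi can report utilization and memory while a model is loaded; rocm_smi_status is a hypothetical helper variable.

```shell
# Show GPU utilization and VRAM usage for all detected AMD GPUs.
if command -v rocm-smi >/dev/null 2>&1; then
  rocm-smi --showuse --showmeminfo vram || echo "rocm-smi failed"
  rocm_smi_status="present"
else
  echo "rocm-smi not found: is the ROCm stack installed and on PATH?"
  rocm_smi_status="missing"
fi
```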
Intel Performance Tips
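SYCL-based applications compile GPU kernels at first use, which can make the first request slow. The sketch below turns on the SYCL runtime's persistent kernel cache so later starts can reuse compiled kernels. It assumes your Ollama build uses the SYCL runtime, which this guide cannot confirm for every build.

```shell
# Cache compiled SYCL kernels on disk so later runs skip recompilation.
export SYCL_CACHE_PERSISTENT=1

echo "persistent SYCL cache: $SYCL_CACHE_PERSISTENT"
# Start the server from this shell so it inherits the setting:
# ollama serve
```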
Multi-GPU Configuration
For systems with multiple GPUs:
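A minimal sketch for two NVIDIA GPUs follows. By default Ollama places a model on a single GPU when it fits, and OLLAMA_SCHED_SPREAD asks the scheduler to spread it across all visible GPUs instead. The device indices are examples.

```shell
# Expose two GPUs to Ollama (indices or UUIDs, comma-separated).
export CUDA_VISIBLE_DEVICES=0,1
# Spread each model across all visible GPUs instead of packing onto one.
export OLLAMA_SCHED_SPREAD=1

echo "GPUs visible to Ollama: $CUDA_VISIBLE_DEVICES"
# Start the server from this shell so it inherits the settings:
# ollama serve
```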
Real-World Deployment Examples
High-Performance Server (4x NVIDIA RTX 4090)
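On a Linux server where Ollama runs as a systemd service, settings like these usually go into a service drop-in. The sketch below writes one to the current directory and prints the install commands. The file name and all values are illustrative assumptions for a four-GPU machine.

```shell
# Write a systemd drop-in with the GPU and concurrency settings.
cat > ollama-gpu.conf <<'EOF'
[Service]
Environment="CUDA_VISIBLE_DEVICES=0,1,2,3"
Environment="OLLAMA_HOST=0.0.0.0:11434"
Environment="OLLAMA_NUM_PARALLEL=4"
Environment="OLLAMA_MAX_LOADED_MODELS=4"
EOF

# Print the commands that install the drop-in and restart the service.
echo "Install with:"
echo "  sudo install -D -m 644 ollama-gpu.conf /etc/systemd/system/ollama.service.d/gpu.conf"
echo "  sudo systemctl daemon-reload && sudo systemctl restart ollama"
```

Binding OLLAMA_HOST to 0.0.0.0 exposes the API on every network interface, so put a firewall or reverse proxy in front of it.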
Mixed GPU Environment (NVIDIA + AMD)
For environments with both NVIDIA and AMD GPUs:
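One possible layout, offered as an assumption and not an official recipe, is to run one Ollama server per vendor on separate ports. Each server hides the other vendor's GPU by giving it an invalid device ID. The addresses and the mixed_status variable below are hypothetical.

```shell
NVIDIA_ADDR="127.0.0.1:11434"
AMD_ADDR="127.0.0.1:11435"

if command -v ollama >/dev/null 2>&1; then
  # Server 1: NVIDIA only (device ID -1 hides the AMD GPU from it).
  CUDA_VISIBLE_DEVICES=0 ROCR_VISIBLE_DEVICES=-1 \
    OLLAMA_HOST="$NVIDIA_ADDR" ollama serve &
  # Server 2: AMD only (device ID -1 hides the NVIDIA GPU from it).
  CUDA_VISIBLE_DEVICES=-1 ROCR_VISIBLE_DEVICES=0 \
    OLLAMA_HOST="$AMD_ADDR" ollama serve &
  mixed_status="started"
else
  echo "ollama not found: nothing started"
  mixed_status="skipped"
fi
```

Clients choose a backend through the address, for example OLLAMA_HOST=127.0.0.1:11435 ollama run mistral to use the AMD server.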
NixOS GPU Configuration
For NixOS users, configure GPU acceleration in configuration.nix:
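The fragment below is a sketch for an NVIDIA system. The option names follow the NixOS Ollama module as commonly documented, but they are an assumption here because they have changed between NixOS releases, so check the options search for your release before using them.

```nix
{ config, pkgs, ... }:
{
  # The NVIDIA driver and CUDA libraries are unfree packages.
  nixpkgs.config.allowUnfree = true;

  # Load the proprietary NVIDIA driver.
  services.xserver.videoDrivers = [ "nvidia" ];

  # Run Ollama as a service with CUDA acceleration
  # (assumed option; "rocm" is the AMD counterpart).
  services.ollama = {
    enable = true;
    acceleration = "cuda";
  };
}
```

Apply the change with sudo nixos-rebuild switch.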
Next Steps
After configuring GPU acceleration for Ollama:
Explore available models optimized for GPU acceleration
Set up advanced configurations for optimal performance
Set up Open WebUI for a graphical interface