Docker Setup

This guide provides detailed instructions for deploying Ollama in Docker containers, enabling consistent, isolated environments and streamlined deployment across different systems.

Why Use Ollama with Docker?

Docker provides several advantages for running Ollama:

  • Isolation: Run Ollama in a contained environment without affecting the host system

  • Portability: Deploy the same Ollama setup across different environments

  • Resource control: Limit CPU, memory, and GPU resources allocated to Ollama

  • Version management: Easily switch between different Ollama versions

  • Orchestration: Integrate with Kubernetes or Docker Swarm for scaling

Prerequisites

Before getting started, ensure you have:

  1. Docker installed on your system:

    # Linux
    curl -fsSL https://get.docker.com | sh
    sudo usermod -aG docker $USER
    # Log out and back in to apply group changes
    
    # Verify Docker installation
    docker --version
  2. Docker Compose (optional but recommended):

    # Install Docker Compose V2
    sudo apt update && sudo apt install -y docker-compose-plugin
    
    # Verify installation
    docker compose version
  3. At least 8GB of RAM and sufficient disk space for models (~5-10GB per model)

Basic Ollama Docker Setup

Using the Official Docker Image

Pull and run the official Ollama Docker image:
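
The commands below use the official ollama/ollama image from Docker Hub, a named volume for model storage, and Ollama's default API port 11434:

    # Pull the latest official image
    docker pull ollama/ollama

    # Start Ollama in the background, persisting models in a named volume
    docker run -d \
      --name ollama \
      -v ollama:/root/.ollama \
      -p 11434:11434 \
      ollama/ollama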

Testing Your Ollama Container
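
Once the container is running, pull and chat with a model through the container, or query the REST API from the host (llama3.2 is just an example model name):

    # Run a model interactively inside the container
    docker exec -it ollama ollama run llama3.2

    # List installed models via the API
    curl http://localhost:11434/api/tags

    # Send a generation request
    curl http://localhost:11434/api/generate -d '{
      "model": "llama3.2",
      "prompt": "Why is the sky blue?"
    }'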

Docker Compose Setup

Docker Compose provides a more manageable way to configure and run Ollama.

Basic Docker Compose Configuration

Create a file named docker-compose.yml:
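
A minimal sketch, mirroring the docker run command above:

    services:
      ollama:
        image: ollama/ollama
        container_name: ollama
        ports:
          - "11434:11434"
        volumes:
          - ollama:/root/.ollama
        restart: unless-stopped

    volumes:
      ollama: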

Run with:
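
    # From the directory containing docker-compose.yml
    docker compose up -d

    # Follow the logs to confirm startup
    docker compose logs -f ollama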

Advanced Docker Compose with Resource Limits

For more control over container resources:
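
A sketch using the Compose deploy.resources syntax, which Docker Compose v2 honors outside Swarm; the CPU and memory values are placeholders to adjust for your hardware:

    services:
      ollama:
        image: ollama/ollama
        container_name: ollama
        ports:
          - "11434:11434"
        volumes:
          - ollama:/root/.ollama
        restart: unless-stopped
        deploy:
          resources:
            limits:
              cpus: "4.0"
              memory: 16G
            reservations:
              memory: 8G

    volumes:
      ollama: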

GPU-Accelerated Docker Setup

NVIDIA GPU Support

To enable NVIDIA GPU acceleration with Docker:

  1. Install the NVIDIA Container Toolkit:
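
    # Debian/Ubuntu setup, following NVIDIA's documented apt repository steps
    curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | \
      sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
    curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
      sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
      sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
    sudo apt-get update
    sudo apt-get install -y nvidia-container-toolkit

    # Configure Docker to use the NVIDIA runtime, then restart it
    sudo nvidia-ctk runtime configure --runtime=docker
    sudo systemctl restart docker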

  2. Run Ollama with GPU support:
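
    # Expose all GPUs to the container (requires the toolkit above)
    docker run -d \
      --name ollama \
      --gpus=all \
      -v ollama:/root/.ollama \
      -p 11434:11434 \
      ollama/ollama

    # Verify the GPU is visible inside the container
    docker exec -it ollama nvidia-smi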

Docker Compose with NVIDIA GPUs
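
Compose can reserve GPUs through the deploy.resources.reservations.devices block; a minimal sketch:

    services:
      ollama:
        image: ollama/ollama
        container_name: ollama
        ports:
          - "11434:11434"
        volumes:
          - ollama:/root/.ollama
        restart: unless-stopped
        deploy:
          resources:
            reservations:
              devices:
                - driver: nvidia
                  count: all
                  capabilities: [gpu]

    volumes:
      ollama: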

AMD ROCm GPU Support

For AMD GPUs with ROCm:
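
Ollama publishes a dedicated ROCm image tag; the container needs access to the host's /dev/kfd and /dev/dri devices:

    docker run -d \
      --name ollama \
      --device /dev/kfd \
      --device /dev/dri \
      -v ollama:/root/.ollama \
      -p 11434:11434 \
      ollama/ollama:rocm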

Multi-Container Setups

Ollama with Open WebUI

This setup combines Ollama with Open WebUI to provide a user-friendly web interface:
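
A sketch pairing the ollama/ollama image with the ghcr.io/open-webui/open-webui image; Open WebUI reaches Ollama over the Compose network via the service name, and the host port 3000 is an arbitrary choice:

    services:
      ollama:
        image: ollama/ollama
        container_name: ollama
        volumes:
          - ollama:/root/.ollama
        restart: unless-stopped

      open-webui:
        image: ghcr.io/open-webui/open-webui:main
        container_name: open-webui
        ports:
          - "3000:8080"
        environment:
          - OLLAMA_BASE_URL=http://ollama:11434
        volumes:
          - open-webui:/app/backend/data
        depends_on:
          - ollama
        restart: unless-stopped

    volumes:
      ollama:
      open-webui: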

Ollama for DevOps

A setup designed for DevOps workflows, combining Ollama with components for retrieval-augmented generation (RAG):
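
One possible sketch, pairing Ollama with a vector database; Qdrant is used here purely as an example store, and the application layer that performs embedding and retrieval is not shown:

    services:
      ollama:
        image: ollama/ollama
        container_name: ollama
        ports:
          - "11434:11434"
        volumes:
          - ollama:/root/.ollama
        restart: unless-stopped

      qdrant:
        image: qdrant/qdrant
        container_name: qdrant
        ports:
          - "6333:6333"
        volumes:
          - qdrant:/qdrant/storage
        restart: unless-stopped

    volumes:
      ollama:
      qdrant: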

Docker Network Configuration

Creating an Isolated Network

For multi-container deployments, create an isolated network:
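
For example, using a user-defined bridge network (ollama-net is an arbitrary name); publishing port 11434 to the host becomes optional when only other containers need access:

    # Create the network
    docker network create ollama-net

    # Attach Ollama to it
    docker run -d \
      --name ollama \
      --network ollama-net \
      -v ollama:/root/.ollama \
      ollama/ollama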

Accessing Ollama from Other Containers

Other containers can access Ollama using the container name as hostname:
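
For example, from a throwaway container on the same network (the hostname ollama resolves to the container started above):

    docker run --rm --network ollama-net curlimages/curl \
      curl -s http://ollama:11434/api/tags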

Custom Ollama Docker Images

Creating a Custom Dockerfile

Create a Dockerfile with pre-loaded models and custom configuration:
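
A sketch of one common pattern; because ollama pull needs a running server, the build step briefly starts ollama serve in the background. The model name llama3.2 and the OLLAMA_KEEP_ALIVE value are placeholders, and baking models into the image will make it large:

    FROM ollama/ollama

    # Example configuration: keep loaded models in memory for 24 hours
    ENV OLLAMA_KEEP_ALIVE=24h

    # Pre-load a model at build time: start the server briefly, then pull
    RUN ollama serve & \
        sleep 5 && \
        ollama pull llama3.2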

Build and run your custom image:
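
    # Build the image (the tag name is arbitrary)
    docker build -t my-ollama .

    # Run it the same way as the stock image
    docker run -d --name ollama -p 11434:11434 my-ollama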

Production Best Practices

Security Considerations

  1. TLS Encryption:
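
    # Ollama serves plain HTTP, so TLS is typically terminated by a reverse
    # proxy in front of it. Nginx sketch; server_name and certificate paths
    # are placeholders:
    server {
        listen 443 ssl;
        server_name ollama.example.com;

        ssl_certificate     /etc/nginx/certs/fullchain.pem;
        ssl_certificate_key /etc/nginx/certs/privkey.pem;

        location / {
            proxy_pass http://ollama:11434;
        }
    }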

  2. Authentication (using a reverse proxy like Nginx):
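
    # Basic-auth sketch; create the credentials file first with:
    #   htpasswd -c /etc/nginx/.htpasswd myuser
    server {
        listen 80;

        location / {
            auth_basic           "Ollama";
            auth_basic_user_file /etc/nginx/.htpasswd;
            proxy_pass           http://ollama:11434;
        }
    }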

Health Checks and Monitoring

Add health checks to your Docker Compose:
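
A sketch using the ollama CLI as the probe, since it exits non-zero when the server is unreachable; the intervals are placeholders:

    services:
      ollama:
        image: ollama/ollama
        # ...rest of the service definition as above...
        healthcheck:
          test: ["CMD", "ollama", "list"]
          interval: 30s
          timeout: 10s
          retries: 3
          start_period: 20s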

Docker Swarm and Kubernetes

Docker Swarm Deployment
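
In Swarm mode, Ollama can run as a replicated service; a minimal sketch:

    # Initialize Swarm on the manager node (if not already active)
    docker swarm init

    # Deploy Ollama as a service with a named volume for models
    docker service create \
      --name ollama \
      --publish 11434:11434 \
      --mount type=volume,source=ollama,target=/root/.ollama \
      ollama/ollama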

Kubernetes Deployment

Create a kubernetes.yaml file:
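
A minimal sketch with a Deployment, a PersistentVolumeClaim for model storage, and a ClusterIP Service; names and the storage size are placeholders:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: ollama
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: ollama
      template:
        metadata:
          labels:
            app: ollama
        spec:
          containers:
            - name: ollama
              image: ollama/ollama
              ports:
                - containerPort: 11434
              volumeMounts:
                - name: ollama-data
                  mountPath: /root/.ollama
          volumes:
            - name: ollama-data
              persistentVolumeClaim:
                claimName: ollama-pvc
    ---
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: ollama-pvc
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 50Gi
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: ollama
    spec:
      selector:
        app: ollama
      ports:
        - port: 11434
          targetPort: 11434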

Apply with:
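
    kubectl apply -f kubernetes.yaml

    # Confirm the pod is running
    kubectl get pods -l app=ollama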

Troubleshooting Docker Issues

Common issues and their solutions:

  • Container won't start: check the logs with docker logs ollama

  • Permission errors: verify volume permissions with docker exec -it ollama ls -la /root/.ollama

  • Network connectivity: test with docker exec -it ollama curl localhost:11434/api/tags

  • Out of memory: increase the memory limits in your Docker settings

  • GPU not detected: verify the NVIDIA Container Toolkit installation and check the container logs

Next Steps

After setting up Ollama in Docker:

  1. Explore GPU acceleration for faster model inference

  2. Configure and optimize models for your specific use cases

  3. Set up Open WebUI for a graphical user interface
