Docker Setup
This guide provides detailed instructions for deploying Ollama in Docker containers, enabling consistent, isolated environments and streamlined deployment across different systems.
Why Use Ollama with Docker?
Docker provides several advantages for running Ollama:
Isolation: Run Ollama in a contained environment without affecting the host system
Portability: Deploy the same Ollama setup across different environments
Resource control: Limit CPU, memory, and GPU resources allocated to Ollama
Version management: Easily switch between different Ollama versions
Orchestration: Integrate with Kubernetes or Docker Swarm for scaling
Prerequisites
Before getting started, ensure you have:
Docker installed on your system:
```bash
# Linux
curl -fsSL https://get.docker.com | sh
sudo usermod -aG docker $USER
# Log out and back in to apply group changes

# Verify Docker installation
docker --version
```
Docker Compose (optional but recommended):
```bash
# Install Docker Compose V2
sudo apt update && sudo apt install -y docker-compose-plugin

# Verify installation
docker compose version
```
At least 8GB of RAM and sufficient disk space for models (~5-10GB per model)
Basic Ollama Docker Setup
Using Official Docker Image
Pull and run the official Ollama Docker image:
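```bash
# Download the latest official image
docker pull ollama/ollama

# Start Ollama, persisting models in a named volume and exposing the API port
docker run -d \
  -v ollama:/root/.ollama \
  -p 11434:11434 \
  --name ollama \
  ollama/ollama
```
The named volume keeps downloaded models across container restarts and upgrades.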
Testing Your Ollama Container
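Once the container is running, confirm it responds. A quick check, using llama3 purely as an example model:
```bash
# Run a model interactively inside the container
docker exec -it ollama ollama run llama3

# Or query the REST API from the host
curl http://localhost:11434/api/generate -d '{
  "model": "llama3",
  "prompt": "Why is the sky blue?"
}'
```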
Docker Compose Setup
Docker Compose offers a more manageable way to configure and run Ollama, keeping the whole setup in a single file.
Basic Docker Compose Configuration
Create a file named docker-compose.yml:
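A minimal configuration, matching the volume and port used in the earlier docker run example:
```yaml
services:
  ollama:
    image: ollama/ollama:latest
    container_name: ollama
    ports:
      - "11434:11434"
    volumes:
      - ollama:/root/.ollama
    restart: unless-stopped

volumes:
  ollama:
```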
Run with:
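```bash
docker compose up -d
```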
Advanced Docker Compose with Resource Limits
For more control over container resources:
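Compose V2 honors the deploy.resources section even outside Swarm; the limits below are illustrative and should be sized to your hardware:
```yaml
services:
  ollama:
    image: ollama/ollama:latest
    ports:
      - "11434:11434"
    volumes:
      - ollama:/root/.ollama
    deploy:
      resources:
        limits:
          cpus: "4.0"
          memory: 16G
        reservations:
          memory: 8G
    restart: unless-stopped

volumes:
  ollama:
```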
GPU-Accelerated Docker Setup
NVIDIA GPU Support
To enable NVIDIA GPU acceleration with Docker:
Install the NVIDIA Container Toolkit:
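On Debian/Ubuntu, following NVIDIA's documented repository setup:
```bash
# Add NVIDIA's package repository and signing key
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | \
  sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
  sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
  sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list

# Install the toolkit and configure the Docker runtime
sudo apt update && sudo apt install -y nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
```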
Run Ollama with GPU support:
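```bash
# --gpus=all exposes every NVIDIA GPU to the container
docker run -d --gpus=all \
  -v ollama:/root/.ollama \
  -p 11434:11434 \
  --name ollama \
  ollama/ollama
```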
Docker Compose with NVIDIA GPUs
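The equivalent Compose configuration requests GPUs through a device reservation:
```yaml
services:
  ollama:
    image: ollama/ollama:latest
    ports:
      - "11434:11434"
    volumes:
      - ollama:/root/.ollama
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
    restart: unless-stopped

volumes:
  ollama:
```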
AMD ROCm GPU Support
For AMD GPUs with ROCm:
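```bash
# Use the ROCm image variant and pass through the AMD GPU device nodes
docker run -d \
  --device /dev/kfd \
  --device /dev/dri \
  -v ollama:/root/.ollama \
  -p 11434:11434 \
  --name ollama \
  ollama/ollama:rocm
```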
Multi-Container Setups
Ollama with Open WebUI
This setup combines Ollama with the Open WebUI for a more user-friendly interface:
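A sketch of the combined stack. The WebUI reaches Ollama over the Compose network by service name, so Ollama itself needs no published port:
```yaml
services:
  ollama:
    image: ollama/ollama:latest
    volumes:
      - ollama:/root/.ollama
    restart: unless-stopped

  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    ports:
      - "3000:8080"
    environment:
      - OLLAMA_BASE_URL=http://ollama:11434
    volumes:
      - open-webui:/app/backend/data
    depends_on:
      - ollama
    restart: unless-stopped

volumes:
  ollama:
  open-webui:
```
The interface is then available at http://localhost:3000.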
Ollama for DevOps
A setup designed for DevOps workflows with Ollama and RAG capabilities:
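A minimal sketch of one such stack, pairing Ollama with a vector database for retrieval; Qdrant is used here purely as an example component and can be swapped for another vector store:
```yaml
services:
  ollama:
    image: ollama/ollama:latest
    volumes:
      - ollama:/root/.ollama
    restart: unless-stopped

  # Example vector database for RAG pipelines
  qdrant:
    image: qdrant/qdrant:latest
    ports:
      - "6333:6333"
    volumes:
      - qdrant:/qdrant/storage
    restart: unless-stopped

volumes:
  ollama:
  qdrant:
```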
Docker Network Configuration
Creating an Isolated Network
For multi-container deployments, create an isolated network:
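The network name ollama-network below is arbitrary:
```bash
# Create a user-defined bridge network
docker network create ollama-network

# Attach Ollama to it; omit -p if only other containers need access
docker run -d \
  --network ollama-network \
  -v ollama:/root/.ollama \
  --name ollama \
  ollama/ollama
```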
Accessing Ollama from Other Containers
Other containers can access Ollama using the container name as hostname:
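For example, a throwaway container on the same network can reach the API at http://ollama:11434 (curlimages/curl is used here just as a convenient client image):
```bash
docker run --rm --network ollama-network curlimages/curl \
  -s http://ollama:11434/api/tags
```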
Custom Ollama Docker Images
Creating a Custom Dockerfile
Create a Dockerfile with pre-loaded models and custom configuration:
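A sketch of such a Dockerfile; llama3 stands in for whichever model you want baked into the image, and the short sleep assumes the server comes up within a few seconds:
```dockerfile
FROM ollama/ollama:latest

# The server must be running for `ollama pull`, so start it briefly
# during the build, fetch the model, then stop it
RUN ollama serve & sleep 5 && ollama pull llama3 && kill $!

# Ensure the server listens on all interfaces (may already be the image default)
ENV OLLAMA_HOST=0.0.0.0:11434
EXPOSE 11434
```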
Build and run your custom image:
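```bash
# "my-ollama" is an arbitrary image tag
docker build -t my-ollama .
docker run -d -p 11434:11434 --name my-ollama my-ollama
```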
Production Best Practices
Security Considerations
TLS Encryption:
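Ollama does not terminate TLS itself, so the usual approach is a reverse proxy in front of it. A minimal Nginx sketch, assuming certificates already exist at the paths shown and the proxy shares a Docker network with the ollama container:
```nginx
server {
    listen 443 ssl;
    server_name ollama.example.com;

    ssl_certificate     /etc/nginx/certs/fullchain.pem;
    ssl_certificate_key /etc/nginx/certs/privkey.pem;

    location / {
        proxy_pass http://ollama:11434;
        proxy_set_header Host $host;
    }
}
```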
Authentication (using a reverse proxy like Nginx):
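Basic authentication can be layered onto the same proxy; the password file is generated beforehand with the htpasswd utility:
```nginx
location / {
    # Created with: htpasswd -c /etc/nginx/.htpasswd <user>
    auth_basic           "Ollama";
    auth_basic_user_file /etc/nginx/.htpasswd;
    proxy_pass           http://ollama:11434;
}
```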
Health Checks and Monitoring
Add health checks to your Docker Compose:
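One simple probe is ollama list, which only succeeds when the API is up:
```yaml
services:
  ollama:
    image: ollama/ollama:latest
    ports:
      - "11434:11434"
    volumes:
      - ollama:/root/.ollama
    healthcheck:
      test: ["CMD", "ollama", "list"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 20s
    restart: unless-stopped

volumes:
  ollama:
```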
Docker Swarm and Kubernetes
Docker Swarm Deployment
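The resource-limited Compose file from earlier can be deployed largely unchanged as a Swarm stack, since the deploy section is native to Swarm:
```bash
# Initialize a swarm (once per cluster)
docker swarm init

# Deploy the stack; "ollama-stack" is an arbitrary stack name
docker stack deploy -c docker-compose.yml ollama-stack

# Check service status
docker service ls
```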
Kubernetes Deployment
Create a kubernetes.yaml file:
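A minimal manifest with a Deployment and a Service; emptyDir is used for brevity, but a PersistentVolumeClaim is the better choice if models should survive pod rescheduling:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ollama
spec:
  replicas: 1
  selector:
    matchLabels:
      app: ollama
  template:
    metadata:
      labels:
        app: ollama
    spec:
      containers:
        - name: ollama
          image: ollama/ollama:latest
          ports:
            - containerPort: 11434
          volumeMounts:
            - name: ollama-data
              mountPath: /root/.ollama
      volumes:
        - name: ollama-data
          emptyDir: {}
---
apiVersion: v1
kind: Service
metadata:
  name: ollama
spec:
  selector:
    app: ollama
  ports:
    - port: 11434
      targetPort: 11434
```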
Apply with:
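```bash
kubectl apply -f kubernetes.yaml
```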
Troubleshooting Docker Issues
Container won't start: check the logs with docker logs ollama
Permission errors: verify volume permissions with docker exec -it ollama ls -la /root/.ollama
Network connectivity: test the API with docker exec -it ollama curl localhost:11434/api/tags
Out of memory: increase the memory limits in your Docker settings or Compose file
GPU not detected: verify the NVIDIA Container Toolkit installation and check the container logs
Next Steps
After setting up Ollama in Docker:
Explore GPU acceleration for faster model inference
Configure and optimize models for your specific use cases
Implement DevOps workflows with Ollama
Set up Open WebUI for a graphical user interface