Docker Profiles

FOVEA uses Docker Compose profiles to support both CPU and GPU deployments from a single configuration file.

Introduction

Docker Compose profiles enable explicit deployment choices without maintaining separate configuration files:

  • Default profile: CPU mode (works on any system)
  • GPU profile: Requires an NVIDIA GPU and nvidia-docker2
  • Build modes: minimal (fast builds) vs full (all inference engines)
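
In Compose, a service with a profiles key is only created when that profile is activated; services without one always start. A minimal sketch of how such a layout might look (the service names and build arguments are illustrative, not FOVEA's actual compose file):

# docker-compose.yml (illustrative sketch)
services:
  model-service:
    build:
      context: .
      args:
        BUILD_MODE: ${BUILD_MODE:-minimal}
    # No profiles key: part of the default profile, runs CPU inference.

  model-service-gpu:
    build:
      context: .
      args:
        BUILD_MODE: ${BUILD_MODE:-full}
    profiles: ["gpu"]   # only started by: docker compose --profile gpu up
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]

The deploy.resources.reservations.devices block is how Compose requests GPU access from the NVIDIA container runtime.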

CPU vs GPU Comparison

Performance Comparison

Operation                        CPU Minimal   CPU Full   GPU Minimal   GPU Full
Video Summarization (30s clip)   45s           40s        8s            5s
Object Detection (1 frame)       1.2s          1.0s       0.2s          0.15s
Object Tracking (100 frames)     25s           20s        5s            3s
Cold Start (model loading)       10s           15s        20s           30s
Memory Usage                     2GB           3GB        6GB           10GB

These benchmarks use a sample video with 1920x1080 resolution at 30 FPS. Actual performance varies based on hardware, video complexity, and model configuration.

Build Modes

Minimal Build

The minimal build mode (BUILD_MODE=minimal) includes base dependencies only:

  • Faster build times (5 minutes vs 15 minutes)
  • Smaller image size (2GB vs 8GB)
  • Suitable for CI/CD pipelines and local development
  • Includes basic inference capabilities

Full Build

The full build mode (BUILD_MODE=full) includes all inference engines:

  • SGLang inference engine (optimized for VLMs and LLMs)
  • vLLM fallback inference engine
  • All object detection models (YOLO variants)
  • All tracking models (SAMURAI, SAM2Long, ByteTrack)
  • Longer build times but better performance
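
One common way to wire up such a switch is a Dockerfile build argument that gates the heavy dependencies. A hedged sketch, assuming a pip-based image (the base image and requirements file names are illustrative):

# Dockerfile (illustrative sketch)
FROM python:3.11-slim

# Passed via: docker compose build --build-arg BUILD_MODE=full
ARG BUILD_MODE=minimal

# Base dependencies are installed in every build.
COPY requirements-base.txt .
RUN pip install --no-cache-dir -r requirements-base.txt

# Inference engines (SGLang, vLLM) and extra models only in full mode.
COPY requirements-full.txt .
RUN if [ "$BUILD_MODE" = "full" ]; then \
      pip install --no-cache-dir -r requirements-full.txt; \
    fi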

When to Use Each Mode

CPU + Minimal

Use for:

  • CI/CD pipelines (fast builds, low resource usage)
  • Local development without GPU
  • Testing core functionality without AI features
  • Systems where model inference is not required

CPU + Full

Use for:

  • Production CPU deployments (cloud servers without GPUs)
  • Testing inference engines before GPU deployment
  • Development with full AI features on CPU-only machines

GPU + Minimal

Use for:

  • Quick GPU testing
  • Prototyping with minimal dependencies
  • Resource-constrained GPU environments

GPU + Full

Use for:

  • Production deployments with NVIDIA GPUs
  • Maximum AI inference performance
  • Complete feature set with all models available

Commands

Start CPU Mode (Default)

docker compose up

Starts all services in CPU mode. The model service uses CPU inference.
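
For long-running deployments, the same command can run detached, with a follow-up check that all services came up:

docker compose up -d
docker compose ps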

Start GPU Mode

docker compose --profile gpu up

Starts services with GPU support. Requires an NVIDIA GPU and nvidia-docker2 to be installed.
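
The profile can also be activated through the COMPOSE_PROFILES environment variable, which is convenient in a .env file or CI configuration:

# Equivalent to passing --profile gpu on every command
export COMPOSE_PROFILES=gpu
docker compose up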

Build with Specific Mode

# Minimal build (fast, smaller image)
docker compose build --build-arg BUILD_MODE=minimal

# Full build (slow, larger image, all engines)
docker compose build --build-arg BUILD_MODE=full
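
Build mode and profile are independent axes, so they combine. For example, to build and start a full GPU deployment:

# Build the full image for the gpu-profiled services, then start them
docker compose --profile gpu build --build-arg BUILD_MODE=full
docker compose --profile gpu up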

Verify GPU Access

After starting GPU mode, verify GPU is accessible:

docker compose exec model-service nvidia-smi

This command should display NVIDIA GPU information. If it fails, check that the NVIDIA driver and nvidia-docker2 are installed correctly.
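
If the check fails, it helps to separate host problems from Compose problems by testing GPU access with a bare container (the CUDA image tag here is illustrative):

docker run --rm --gpus all nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi

If this works but the Compose service cannot see the GPU, the issue is in the compose configuration rather than in the driver stack.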
