Deploy using Docker/Podman
This guide walks through deploying NetApp Project Neo v4 using Docker or Podman Compose. The deployment includes six services, plus an optional load balancer:
- postgres: PostgreSQL 17 shared database
- api: FastAPI service (HTTP API + MCP transport) on port 8000
- worker: Background processing (crawling, upload, NER orchestration)
- extractor: Content extraction (MarkItDown, Docling, VLM)
- ner: GLiNER2 Named Entity Recognition
- neoui: Web management console on port 8081
- nginx: Optional load balancer for scaled deployments
Prerequisites
Docker
- Docker installed on your system. You can download Docker from the official Docker website.
- Docker Compose installed. You can find installation instructions on the Docker Compose installation page.
Podman
- Podman installed on your system. You can download Podman from the official Podman website.
- Podman Compose installed. You can install it using the Podman Compose installation instructions.
- Some Linux distributions, such as RHEL-based ones, might not install every Podman package needed for advanced networking configuration, such as podman-plugins and containernetworking-plugins.
WARNING
For this deployment, the main difference between Docker and Podman is that Podman needs a sudo prefix to run privileged containers, whereas the Docker daemon already runs as root.
- Sufficient system resources to run NetApp Neo. Refer to the Sizing Guide in the Deployment section for recommended specifications.
- cifs-utils package installed on the Linux host (required for SMB share mounting by the extractor service).
- SELinux contexts may require adjustments based on your specific Linux host security profile.
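The prerequisites above can be checked with a short script before deploying. The following is a minimal Python sketch, not part of Neo: the function name and tool lists are illustrative, and it assumes that cifs-utils provides the mount.cifs helper, as it does on common distributions. It does not verify that the Compose plugin is present, because that is not a standalone executable.

```python
import shutil
from typing import Callable, Iterable, List, Optional

# Executables implied by the prerequisites. mount.cifs is the helper that the
# cifs-utils package normally installs (assumption: standard package layout).
DOCKER_TOOLS = ["docker", "mount.cifs"]
PODMAN_TOOLS = ["podman", "mount.cifs"]


def missing_tools(
    required: Iterable[str],
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return the required executables that cannot be found on PATH."""
    return [tool for tool in required if which(tool) is None]
```

Calling missing_tools(DOCKER_TOOLS) or missing_tools(PODMAN_TOOLS) returns an empty list when everything is found. Because mount.cifs usually lives in /sbin or /usr/sbin, a non-root shell with a restricted PATH may report it as missing even when the package is installed.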
Deployment Guide
TIP
Both the docker-compose.yml and .env files provide comprehensive inline documentation for every environment variable and GPU configuration option.
Docker Compose file
Create a directory, e.g., neov4:
mkdir neov4 && cd neov4
The following Docker Compose file can be copied as docker-compose.yaml:
# =============================================================================
# NetApp Project Neo - Docker Compose Example
# =============================================================================
#
# Multi-service deployment with independently scalable components:
#   - postgres: PostgreSQL database (shared by all services)
#   - api: HTTP API + MCP transport (user-facing)
#   - worker: Background processing (crawling, upload, orchestration)
#   - extractor: Content extraction (MarkItDown, Docling, VLM)
#   - ner: Named Entity Recognition (GLiNER2)
#   - neoui: Web management console
#   - nginx: Load balancer if api service is scaled up
#
# Quick start:
#   1. Copy this file: cp docker-compose.example.yml docker-compose.yml
#   2. Configure environment variables in .env
#   3. Launch: docker compose up -d --build
#
# Scale independently:
#   docker compose up -d --scale worker=3
#   docker compose up -d --scale extractor=5 --scale ner=2
#
# =============================================================================
services:
  # ---------------------------------------------------------------------------
  # PostgreSQL Database
  # ---------------------------------------------------------------------------
  # Shared by API, worker, and extractor services.
  # Data persists in the postgres_data volume.
  postgres:
    hostname: ${POSTGRES_HNAME}
    image: docker.io/library/postgres:17
    container_name: neo-postgres
    env_file:
      - .env
    ports:
      - "5432:5432"
    volumes:
      - postgres_data:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U ${POSTGRES_USER} -d ${POSTGRES_DB}"]
      interval: 10s
      timeout: 5s
      retries: 5
    networks:
      - neo-network
    restart: unless-stopped
  # ---------------------------------------------------------------------------
  # API Service
  # ---------------------------------------------------------------------------
  # Lightweight FastAPI service handling HTTP endpoints, MCP transport, and
  # OAuth. No background processing; delegates to the worker service.
  api:
    image: ghcr.io/netapp/netapp-neo-api:${NEO_VERSION}
    ports:
      - "8000:8000"
    env_file:
      - .env
    environment:
      # -----------------------------------------------------------------------
      # Internal service communication
      # -----------------------------------------------------------------------
      # WORKER_SERVICE_URL: URL of the worker service. Used by the API to
      # proxy SMB connection tests to the worker (which has SMB tools).
      WORKER_SERVICE_URL: http://worker:8000
      # NER_SERVICE_URL: URL of the NER service. Used by the API to
      # forward device configuration changes (GPU/CPU switching).
      NER_SERVICE_URL: http://ner:8000
    depends_on:
      postgres:
        condition: service_healthy
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 10s
    networks:
      - neo-network
    restart: unless-stopped
  # ---------------------------------------------------------------------------
  # Extractor Service
  # ---------------------------------------------------------------------------
  # Extracts text content from documents on SMB shares. Supports multiple
  # backends: MarkItDown (Office/PDF), Docling (OCR/tables), Docling VLM
  # (vision language models). Mounts shares independently via CIFS.
  extractor:
    image: ghcr.io/netapp/netapp-neo-extractor:${NEO_VERSION}
    # privileged needed for NFS/CIFS mounting inside the container
    privileged: true
    env_file:
      - .env
    depends_on:
      postgres:
        condition: service_healthy
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 30s
    networks:
      - neo-network
    restart: unless-stopped
  # ---------------------------------------------------------------------------
  # NER Service
  # ---------------------------------------------------------------------------
  # Named Entity Recognition using GLiNER2. Receives extracted text from the
  # worker, returns entities (people, organizations, dates, etc.),
  # classifications (document type), and structured data. Stateless; no file
  # or database access needed.
  ner:
    image: ghcr.io/netapp/netapp-neo-ner:${NEO_VERSION}
    env_file:
      - .env
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
      interval: 30s
      timeout: 10s
      retries: 5
      start_period: 180s # GLiNER2 model loading on CUDA takes ~2 min
    networks:
      - neo-network
    restart: unless-stopped
  # ---------------------------------------------------------------------------
  # Worker Service
  # ---------------------------------------------------------------------------
  # Background processing: SMB file crawling, content extraction orchestration,
  # Microsoft Graph upload, ACL resolution, and NER orchestration. Delegates
  # extraction to the extractor service and NER to the NER service.
  worker:
    image: ghcr.io/netapp/netapp-neo-worker:${NEO_VERSION}
    cap_add:
      - SYS_ADMIN
      - DAC_READ_SEARCH
    security_opt:
      - apparmor:unconfined
    env_file:
      - .env
    environment:
      # -----------------------------------------------------------------------
      # External service URLs
      # -----------------------------------------------------------------------
      # EXTRACTOR_SERVICE_URL: URL of the extractor service. The worker sends
      # document extraction requests here instead of extracting locally.
      EXTRACTOR_SERVICE_URL: http://extractor:8000
      # NER_SERVICE_URL: URL of the NER service. The worker sends extracted
      # text here for entity recognition instead of running GLiNER2 locally.
      NER_SERVICE_URL: http://ner:8000
    depends_on:
      postgres:
        condition: service_healthy
      extractor:
        condition: service_healthy
      ner:
        condition: service_healthy
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 60s
    networks:
      - neo-network
    restart: unless-stopped
  # ---------------------------------------------------------------------------
  # Neo UI (Web Management Console)
  # ---------------------------------------------------------------------------
  # Browser-based management console for configuring shares, monitoring
  # crawls, and managing the connector. Connects to the API service.
  neoui:
    hostname: neoui
    image: ghcr.io/beezy-dev/neo-ui-framework:${NUI_VERSION}
    container_name: neoui
    ports:
      - "8081:80"
    environment:
      # NEO_API: URL of the API service. The UI proxies all requests here.
      NEO_API: http://api:8000
    depends_on:
      api:
        condition: service_healthy
    networks:
      - neo-network
    restart: unless-stopped
  # ---------------------------------------------------------------------------
  # Nginx Load Balancer (optional)
  # ---------------------------------------------------------------------------
  # Only starts when using the "with-lb" profile:
  #   docker compose --profile with-lb up -d
  #
  # Requires nginx.conf and (optionally) certs/ in the project root.
  # nginx:
  #   hostname: nginx
  #   image: nginx:alpine
  #   container_name: neo-lb
  #   ports:
  #     - "80:80"
  #     - "443:443"
  #   volumes:
  #     - ./nginx.conf:/etc/nginx/nginx.conf:ro
  #     - ./certs:/etc/nginx/certs:ro
  #   depends_on:
  #     - api
  #   networks:
  #     - neo-network
  #   restart: unless-stopped
  #   profiles:
  #     - with-lb
# =============================================================================
# Volumes & Networks
# =============================================================================
volumes:
  postgres_data:
    driver: local
networks:
  neo-network:
    driver: bridge
Environment file (.env)
Aside from the versioning and database parameters, Neo services can be configured after startup either via the UI or the API. Here are the parameters to modify in the example .env file:
# Neo container image versioning
NEO_VERSION=4.0.3p7
NUI_VERSION=3.2.2
## Database Settings (required)
# Modify according to your preferences
# CANNOT BE MODIFIED AFTER FIRST RUN.
POSTGRES_HNAME=postgres
POSTGRES_USER=neo
POSTGRES_PASSWORD=neo_password
POSTGRES_DB=neo_connector
POSTGRES_PORT=5432
Once you have settled on the above parameters, copy the following .env file into the same directory as the docker-compose.yaml file, and modify the versioning and database parameters according to your preferences:
# =============================================================================
# NetApp Project Neo v4 - .env Example
# =============================================================================
#
# Copy this file as .env in the same directory as the docker-compose.yaml
# =============================================================================
## MUST BE CONFIGURED PARAMETERS
# =============================================================================
# Neo container image versioning
NEO_VERSION=4.0.3p7
NUI_VERSION=3.2.2
## Database Settings (required)
# Modify according to your preferences
# CANNOT BE MODIFIED AFTER FIRST RUN.
POSTGRES_HNAME=postgres
POSTGRES_USER=neo
POSTGRES_PASSWORD=neo_password
POSTGRES_DB=neo_connector
POSTGRES_PORT=5432
# =============================================================================
## OPTIONAL PARAMETERS
# =============================================================================
# License key for the connector.
# Can be configured via API at /api/v1/setup/license after deployment.
# NETAPP_CONNECTOR_LICENSE=your-license-key
## Authentication
# JWT_SECRET_KEY: Secret key for signing JWT access tokens. Auto-generated and stored in the database if not set. All services sharing the same database will use the same key automatically.
# Only set this if you need to override the database-stored key.
# JWT_SECRET_KEY=
# ACCESS_TOKEN_EXPIRE_MINUTES: How long JWT tokens remain valid.
# Default: 1440 (24 hours).
# ACCESS_TOKEN_EXPIRE_MINUTES=1440
## Encryption
# ENCRYPTION_KEY: Fernet key for encrypting sensitive data (SMB
# passwords). Auto-generated and stored in the database on first
# startup if not set. All services sharing the same database will
# retrieve the same key automatically. Only set this to override
# the database-stored key.
# ENCRYPTION_KEY=
## Microsoft Graph (optional - for M365 Copilot integration)
# MS_GRAPH_TENANT_ID=
# MS_GRAPH_CLIENT_ID=
# MS_GRAPH_CLIENT_SECRET=
# MS_GRAPH_CONNECTOR_ID=netappneo
## MCP OAuth (optional - for securing MCP endpoints)
# MCP_OAUTH_ENABLED=false
# MCP_OAUTH_TENANT_ID=
# MCP_OAUTH_CLIENT_ID=
## MCP API Key (optional - alternative to OAuth for MCP)
# MCP_API_KEY=
## Worker Concurrency
# NUM_UPLOAD_WORKERS=3
# NUM_EXTRACTION_WORKERS=2
# NUM_ACL_RESOLUTION_WORKERS=2
# NUM_NER_WORKERS=1
## NER Settings
# NER_CONFIDENCE_THRESHOLD=0.7
# NER_DEVICE=auto
## Extractor Settings
# EXTRACTOR_LOG_LEVEL=INFO
# EXTRACTOR_DEFAULT_PIPELINE=markitdown
# =============================================================================
## ONLY CHANGE IF INSTRUCTED BY NETAPP SUPPORT
# =============================================================================
# License Connector Identifier.
# Default: netappneo. Only change if instructed by NetApp support.
CONNECTOR_ID=netappneo
## Constructed DATABASE_URL (used by all services)
DATABASE_URL=postgresql://${POSTGRES_USER}:${POSTGRES_PASSWORD}@${POSTGRES_HNAME}:${POSTGRES_PORT}/${POSTGRES_DB}
TIP
Using Docker/Podman secrets is recommended for production deployments to avoid storing credentials in plain text.
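Because DATABASE_URL is assembled from the database parameters, a password containing characters such as @ or : would produce a malformed URL. The following Python sketch is an illustration only, not part of Neo, and all function names are hypothetical. It checks a .env file for the required parameters and shows the URL shape with the user and password percent-encoded. Standard PostgreSQL connection URIs accept percent-encoding; whether every Neo service does is an assumption.

```python
from typing import Dict, List
from urllib.parse import quote

# Parameters listed under "MUST BE CONFIGURED PARAMETERS" in the .env example.
REQUIRED_KEYS = [
    "NEO_VERSION", "NUI_VERSION", "POSTGRES_HNAME", "POSTGRES_USER",
    "POSTGRES_PASSWORD", "POSTGRES_DB", "POSTGRES_PORT",
]


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_keys(values: Dict[str, str]) -> List[str]:
    """Return the required keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not values.get(key)]


def build_database_url(values: Dict[str, str]) -> str:
    """Assemble a URL in the same shape as DATABASE_URL in the .env file.

    The user and password are percent-encoded so that characters such as
    '@' or ':' cannot be mistaken for URL delimiters.
    """
    user = quote(values["POSTGRES_USER"], safe="")
    password = quote(values["POSTGRES_PASSWORD"], safe="")
    return (
        f"postgresql://{user}:{password}@{values['POSTGRES_HNAME']}"
        f":{values['POSTGRES_PORT']}/{values['POSTGRES_DB']}"
    )
```

To use it, read the file with parse_env(open(".env").read()), fix anything reported by missing_keys, and compare the result of build_database_url with the value the services receive.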
Start the containers
Docker
docker compose up -d --build
docker compose ps
Expected output:
NAME          IMAGE                                     STATUS                   PORTS
neo-postgres  postgres:17                               Up 30 seconds (healthy)
api-1         neo-api                                   Up 25 seconds (healthy)  0.0.0.0:8000->8000/tcp
extractor-1   neo-extractor                             Up 28 seconds (healthy)
ner-1         neo-ner                                   Up 28 seconds (healthy)
worker-1      neo-worker                                Up 20 seconds (healthy)
neoui         ghcr.io/beezy-dev/neo-ui-framework:3.2.2  Up 18 seconds            0.0.0.0:8081->80/tcp
View logs:
docker compose logs -f
Podman
sudo podman compose up -d --build
sudo podman compose ps
Expected output:
NAME IMAGE STATUS PORTS
neo-postgres postgres:17 Up 30 seconds (healthy)
api-1 neo-api Up 25 seconds (healthy) 0.0.0.0:8000->8000/tcp
extractor-1 neo-extractor Up 28 seconds (healthy)
ner-1 neo-ner Up 28 seconds (healthy)
worker-1 neo-worker Up 20 seconds (healthy)
neoui ghcr.io/beezy-dev/neo-ui-framework:3.2.2 Up 18 seconds 0.0.0.0:8081->80/tcp
View logs:
sudo podman compose logs -f
TIP
The NER service can take 2-3 minutes to start on first launch while it downloads the GLiNER2 model. The worker service waits for NER to become healthy before starting.
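Since startup can take a few minutes, scripted deployments may want to wait for the API health endpoint rather than sleep for a fixed time. The following Python sketch shows one way to poll it. The function names, the 300-second timeout, and the 5-second interval are illustrative assumptions; only the /health URL on port 8000 comes from this guide.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_ok(url: str, timeout: float = 5.0) -> bool:
    """Return True if a GET request to url answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status.
        return False


def wait_until_healthy(
    check: Callable[[], bool],
    timeout_s: float = 300.0,
    interval_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Call check() every interval_s seconds until it succeeds or timeout_s elapses."""
    deadline = clock() + timeout_s
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

For example, wait_until_healthy(lambda: http_ok("http://localhost:8000/health")) returns True once the API answers, or False after the timeout. The sleep and clock parameters exist only so the loop can be exercised without real delays.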
You should see logs indicating that the API service has started in setup mode:
api-1 | INFO | app.main:lifespan - Starting up application...
api-1 | INFO | app.main:lifespan - Setup mode: Skipping license validation and Graph initialization
api-1 | INFO | app.main:lifespan - Complete setup via /api/v1/setup endpoints to enable full functionality
api-1 | INFO: Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)
Scale services independently
Neo v4 supports independent scaling of worker, extractor, and NER services:
Docker
# Scale workers for higher crawling throughput
docker compose up -d --scale worker=3
# Scale extractors for faster document processing
docker compose up -d --scale extractor=5 --scale ner=2
Podman
sudo podman compose up -d --scale worker=3
sudo podman compose up -d --scale extractor=5 --scale ner=2
Configure
via GUI
Neo Console is available at http://your-host:8081 and will present the setup wizard on first launch.
Go to Settings and select the Neo Core tab to begin configuration.
- Enter a valid license key and save.
- Optionally configure Microsoft Graph, SSL, or proxy settings.
- Click Setup Complete to finalize. This triggers a restart of the services with the configured settings.
Once setup completes, the page displays a status of "Complete" and an Admin Credentials button appears with temporary login credentials.
IMPORTANT
The temporary password will not be accessible again after you log in. Save it in your password manager or change it immediately in the Users page.
via API
Neo can also be configured via the API. The interactive API documentation is available at http://your-host:8000/docs.
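The setup steps in this section can also be scripted. The following Python sketch builds the same POST requests as the curl commands, using only the endpoints and payload fields shown in this section. The function names are hypothetical, and the script does not interpret the API responses because their format is not described here.

```python
import json
import urllib.request
from typing import Dict, List, Optional


def build_setup_requests(
    base_url: str,
    license_key: str,
    graph: Optional[Dict[str, str]] = None,
) -> List[urllib.request.Request]:
    """Build the setup POST requests in order, without sending them."""
    base = base_url.rstrip("/")

    def post(path: str, payload: Optional[dict] = None) -> urllib.request.Request:
        if payload is None:
            return urllib.request.Request(f"{base}{path}", method="POST")
        return urllib.request.Request(
            f"{base}{path}",
            data=json.dumps(payload).encode("utf-8"),
            headers={"Content-Type": "application/json"},
            method="POST",
        )

    calls = [post("/api/v1/setup/license", {"license_key": license_key})]
    if graph is not None:
        # Optional step: tenant_id, client_id, and client_secret.
        calls.append(post("/api/v1/setup/graph", graph))
    calls.append(post("/api/v1/setup/complete"))
    return calls


def send_all(calls: List[urllib.request.Request]) -> List[int]:
    """Send each request in order and return the HTTP status codes.

    urlopen raises HTTPError on a 4xx/5xx answer, so the sequence stops at
    the first failing step.
    """
    statuses = []
    for call in calls:
        with urllib.request.urlopen(call, timeout=30) as response:
            statuses.append(response.status)
    return statuses
```

A typical call would be send_all(build_setup_requests("http://localhost:8000", "your-license-key")). Building the requests separately from sending them makes it easy to inspect what will be posted before touching a live deployment.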
Step 1: Set the license key
curl -X POST http://localhost:8000/api/v1/setup/license \
  -H "Content-Type: application/json" \
  -d '{"license_key": "your-license-key"}'
Step 2: (Optional) Configure Microsoft Graph
curl -X POST http://localhost:8000/api/v1/setup/graph \
  -H "Content-Type: application/json" \
  -d '{
    "tenant_id": "your-tenant-id",
    "client_id": "your-client-id",
    "client_secret": "your-client-secret"
  }'
Step 3: Complete setup
curl -X POST http://localhost:8000/api/v1/setup/complete
GPU Acceleration (optional)
The ner and extractor services support GPU acceleration for faster inference.
NVIDIA GPU
Add the following to the ner and/or extractor service in your docker-compose.yml:
deploy:
  resources:
    reservations:
      devices:
        - driver: nvidia
          count: 1
          capabilities: [gpu]
Requires nvidia-container-toolkit installed on the host.
AMD ROCm GPU
Add the following to the ner and/or extractor service in your docker-compose.yml:
devices:
  - /dev/kfd:/dev/kfd
  - /dev/dri:/dev/dri
group_add:
  - video
  - render
environment:
  NER_DEVICE: cuda # ROCm uses the CUDA compatibility layer
Requires ROCm drivers installed on the host.
Troubleshooting
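Before checking individual services, it can help to list which containers are not running or not yet healthy. The following Python sketch parses the output of docker compose ps --format json. It assumes the Name, State, and Health fields that Docker Compose v2 emits, and accepts either a JSON array or one JSON object per line, since the shape has differed between Compose releases. The function names are hypothetical, and Podman Compose output may differ.

```python
import json
from typing import Dict, List


def parse_ps_json(output: str) -> List[dict]:
    """Parse `docker compose ps --format json` output.

    Accepts a single JSON array or one JSON object per line.
    """
    text = output.strip()
    if not text:
        return []
    if text.startswith("["):
        return json.loads(text)
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def services_needing_attention(containers: List[dict]) -> Dict[str, str]:
    """Map container name to a short reason for anything not running or not healthy."""
    problems: Dict[str, str] = {}
    for item in containers:
        name = item.get("Name", "<unknown>")
        state = item.get("State", "")
        health = item.get("Health", "")
        if state != "running":
            problems[name] = f"state={state or 'unknown'}"
        elif health not in ("", "healthy"):
            # An empty Health value means no healthcheck is defined (e.g. neoui).
            problems[name] = f"health={health}"
    return problems
```

Feed it the captured command output, for example from subprocess.run(["docker", "compose", "ps", "--all", "--format", "json"], capture_output=True, text=True).stdout, then follow the per-service steps for each name it reports.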
PostgreSQL
Check if the database was created:
Docker
docker exec -it neo-postgres psql -h localhost -U neo -d neo_connector -c '\l'
Podman
sudo podman exec -it neo-postgres psql -h localhost -U neo -d neo_connector -c '\l'
Expected output should include neo_connector in the database list.
API Service
Docker
docker compose logs -f api
Podman
sudo podman compose logs -f api
Check the health endpoint:
curl http://localhost:8000/health
Worker Service
Docker
docker compose logs -f worker
Podman
sudo podman compose logs -f worker
TIP
The worker requires the SYS_ADMIN and DAC_READ_SEARCH capabilities and the apparmor:unconfined security option. If the worker fails to start, verify that your container runtime supports these settings.
Extractor Service
Docker
docker compose logs -f extractor
Podman
sudo podman compose logs -f extractor
TIP
The extractor runs in privileged mode to support NFS/CIFS mounting inside the container. If mounting fails, verify that cifs-utils is installed on the host.
NER Service
Docker
docker compose logs -f ner
Podman
sudo podman compose logs -f ner
The NER service downloads the GLiNER2 model on first startup. If it fails, check network connectivity and disk space.
Neo UI
Docker
docker logs -f neoui
Podman
sudo podman logs -f neoui
Check the browser developer console for additional error messages.
Next steps
This concludes the steps to deploy NetApp Neo using Docker/Podman Compose. For more advanced configurations and management options, refer to the Management section of the documentation.