| Software | Minimum Version | Purpose |
|---|---|---|
| Docker | 24.0+ | Container runtime |
| Docker Compose | V2 (2.20+) | Multi-container orchestration |
| Docker Buildx | 0.12+ | BuildKit image builder |
| Git | 2.30+ | Cloning repositories |
| ProtonVPN Account | Active subscription | VPN credentials for Gluetun |
# Check if Buildx is installed
docker buildx version
# If not installed, install it (Debian/Ubuntu)
sudo apt-get update
sudo apt-get install -y docker-buildx-plugin
# Enable BuildKit
echo 'export DOCKER_BUILDKIT=1' >> ~/.bashrc
source ~/.bashrc
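The minimum versions in the table above can be checked with a small script. The following is a minimal sketch, assuming GNU `sort -V` is available; `version_ge` and `check_min` are hypothetical helper names, not part of Docker or Firecrawl.

```shell
#!/usr/bin/env bash
# Succeeds when version $1 is greater than or equal to version $2.
# Relies on `sort -V` (version sort) from GNU coreutils.
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Hypothetical helper: report whether an installed version meets the minimum.
check_min() {
  name=$1; installed=$2; minimum=$3
  if version_ge "$installed" "$minimum"; then
    echo "OK   $name $installed (>= $minimum)"
  else
    echo "FAIL $name $installed (< $minimum)"
    return 1
  fi
}

# Example calls (run on the target host, not executed here):
#   check_min Docker  "$(docker version --format '{{.Server.Version}}')" 24.0
#   check_min Compose "$(docker compose version --short)" 2.20
#   check_min Git     "$(git --version | awk '{print $3}')" 2.30
```

Keeping the comparison in one function means the same check can be reused for Buildx once its version string has been parsed from `docker buildx version`.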
| Component | Minimum | Recommended |
|---|---|---|
| CPU | 2 cores | 4+ cores |
| RAM | 4 GB | 8 GB+ |
| Disk | 20 GB | 50 GB+ SSD |
| Network | Stable internet connection | Low-latency to VPN servers |
Create the base directory structure for your deployment:
# Create base directory (adjust path to your preference)
mkdir -p /opt/firecrawl-vpn && cd /opt/firecrawl-vpn
# Create data directories
mkdir -p firecrawl-postgres
mkdir -p firecrawl-redis
mkdir -p firecrawl-data
/opt/firecrawl-vpn/
├── .env                    # Environment variables
├── docker-compose.yaml     # Service definitions
├── settings.yml            # SearXNG configuration
├── firecrawl-postgres/     # PostgreSQL data volume
│   └── ...
├── firecrawl-redis/        # Redis data volume
│   └── dump.rdb
└── firecrawl-data/         # Firecrawl working directory
    └── ...
Firecrawl requires a custom Docker network for inter-service communication. The network subnet must be known ahead of time to configure Gluetun's firewall correctly.
# Create the arrs network with a specific subnet
docker network create --driver bridge --subnet 172.28.0.0/16 arrs
IMPORTANT: Note the subnet (172.28.0.0/16). You will need this exact value for Gluetun's FIREWALL_OUTBOUND_SUBNETS setting in Section 4.
docker network inspect arrs
Gluetun is a Docker container that creates a VPN tunnel using WireGuard (or OpenVPN). All traffic from containers using the VPN will exit through your ProtonVPN server.
ca-wireguard.protonvpn.net
Set these essential variables in your docker-compose.yaml:
environment:
  - VPN_SERVICE_PROVIDER=protonvpn
  - VPN_TYPE=wireguard
  - WIREGUARD_PRIVATE_KEY=<YOUR_WIREGUARD_PRIVATE_KEY>
  - WIREGUARD_ADDRESSES=<YOUR_WIREGUARD_IP>/32
  - VPN_PORT_FORWARDING=on
  - HTTPPROXY=on # Enable HTTP proxy on 8888
  - HTTPPROXY_LOG=on # Optional: log proxy requests
  - FIREWALL_OUTBOUND_SUBNETS=172.28.0.0/16 # Must match your Docker network subnet
  - TZ=America/Chicago
The FIREWALL_OUTBOUND_SUBNETS setting tells Gluetun which networks it should allow traffic to without going through the VPN tunnel. This is essential because Docker internal services (Redis, PostgreSQL, RabbitMQ) live on a Docker bridge network that must be reachable directly.
- FIREWALL_OUTBOUND_SUBNETS=172.28.0.0/16
Why this matters: Without this setting, Gluetun will block all traffic to your internal Docker services, causing Redis connection timeouts and PostgreSQL connection failures. The value 172.28.0.0/16 must match the subnet you created in Section 3. If you used a different subnet, update this value accordingly.
How to find your Docker network subnet:
docker network inspect arrs | grep Subnet
# Output: "Subnet": "172.28.0.0/16"
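Because a mismatch between the two values is easy to miss, the comparison can be scripted. This is a rough sketch, assuming the variable is written in list form (`- FIREWALL_OUTBOUND_SUBNETS=...`) in docker-compose.yaml; the function names are illustrative only.

```shell
#!/usr/bin/env bash
# Print the FIREWALL_OUTBOUND_SUBNETS value found in a compose file
# (first occurrence, list-style "- KEY=value" entries only).
compose_outbound_subnets() {
  sed -n 's/^[[:space:]]*-[[:space:]]*FIREWALL_OUTBOUND_SUBNETS=\([^[:space:]#]*\).*/\1/p' "$1" | head -n 1
}

# Print the subnet Docker reports for a network (requires the Docker CLI).
docker_network_subnet() {
  docker network inspect "$1" --format '{{range .IPAM.Config}}{{.Subnet}}{{end}}'
}

# Compare the two values and report the result.
subnets_match() {
  if [ "$1" = "$2" ]; then
    echo "match: $1"
  else
    echo "MISMATCH: compose has '$1', network has '$2'"
    return 1
  fi
}

# Example (run in the deployment directory):
#   subnets_match "$(compose_outbound_subnets docker-compose.yaml)" "$(docker_network_subnet arrs)"
```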
Gluetun must be connected to the arrs network so that all services can reach each other via Docker DNS.
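networks:
  - arrs
Recommended pattern: attach each service to the shared network with networks: - arrs and send its web traffic through Gluetun's HTTP proxy, rather than using network_mode: "service:gluetun".
Create a .env file with the following template. Replace placeholder values with your own credentials.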
# ============================================
# Firecrawl Core Settings
# ============================================
FIRECRAWL_VERSION=latest
FIRECRAWL_BULL_AUTH_KEY=YOUR_ADMIN_API_KEY_HERE
FIRECRAWL_BASE_URL=http://localhost:3002
# ============================================
# Database Authentication
# For local/self-hosted use, set to false to avoid Supabase errors
# ============================================
FIRECRAWL_USE_DB_AUTH=false
# ============================================
# PostgreSQL Configuration
# ============================================
POSTGRES_USER=firecrawl
POSTGRES_PASSWORD=YOUR_SECURE_PASSWORD_HERE
POSTGRES_DB=firecrawl
NUQ_POSTGRES_URL=postgresql://firecrawl:YOUR_SECURE_PASSWORD_HERE@nuq-postgres:5432/firecrawl
# ============================================
# Redis Configuration
# ============================================
REDIS_URL=redis://redis:6379
REDIS_RATE_LIMIT_URL=redis://redis:6379
# ============================================
# RabbitMQ Configuration (REQUIRED for NuQ workers)
# ============================================
NUQ_RABBITMQ_URL=amqp://rabbitmq:5672
# ============================================
# AI / LLM Configuration (OpenAI-compatible endpoint)
# ============================================
OPENAI_API_KEY=YOUR_OPENAI_API_KEY_HERE
OPENAI_BASE_URL=https://api.openai.com/v1
# ============================================
# Worker / Concurrency Settings
# ============================================
FIRECRAWL_CONCURRENCY=10
PLAYWRIGHT_MAX_CONCURRENCY_PER_PROJECT=5
# ============================================
# System Resource Thresholds
# ============================================
CPU_THRESHOLD_PERCENT=80
MEMORY_THRESHOLD_PERCENT=90
DISK_THRESHOLD_PERCENT=85
# ============================================
# Logging
# ============================================
LOG_LEVEL=info
# ============================================
# SearXNG (Self-Hosted Search Engine)
# ============================================
SEARXNG_SECRET=YOUR_SEARXNG_SECRET_KEY_HERE
SEARXNG_IMAGE_PROXY=true
SEARXNG_PORT=8085
SEARXNG_BIND_ADDRESS=0.0.0.0
Security Note: Generate secure random values for all password and secret fields. Do not use the placeholder values shown above in production.
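One way to follow this advice is to generate the secrets from the kernel's random source and then confirm that no template placeholder is left in the file. The sketch below uses only standard utilities; `gen_secret` and `find_placeholders` are hypothetical helper names.

```shell
#!/usr/bin/env bash
# Generate a random hex secret (default: 32 bytes, i.e. 64 hex characters).
gen_secret() {
  head -c "${1:-32}" /dev/urandom | od -An -tx1 | tr -d ' \n'
}

# Print lines of an env file that still contain template placeholders of the
# form YOUR_..._HERE. Returns non-zero if any are found.
find_placeholders() {
  if grep -n 'YOUR_[A-Z_]*_HERE' "$1"; then
    return 1
  fi
  return 0
}

# Example:
#   echo "SEARXNG_SECRET=$(gen_secret)"
#   find_placeholders .env && echo "no placeholders left"
```

Hex output contains no characters that need escaping, which matters for POSTGRES_PASSWORD because the same value is embedded in the NUQ_POSTGRES_URL connection string.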
Create a docker-compose.yaml file with all services. This configuration includes Gluetun, the Firecrawl API, the Playwright service, SearXNG, Redis, NuQ PostgreSQL, and RabbitMQ:
services:
  # ============================================
  # Gluetun VPN Tunnel
  # ============================================
  gluetun:
    image: qmcgaw/gluetun
    container_name: firecrawl-gluetun
    cap_add:
      - NET_ADMIN
    devices:
      - /dev/net/tun:/dev/net/tun
    ports:
      - 8888:8888/tcp # HTTP proxy (used by Firecrawl/Playwright)
      - 8002:8000/tcp # Gluetun HTTP control server (optional)
      # Add other ports for tunneled services as needed
    volumes:
      - ./auth:/gluetun/auth # Auth config for control server / proxy
    environment:
      - VPN_SERVICE_PROVIDER=protonvpn
      - VPN_TYPE=wireguard
      - WIREGUARD_PRIVATE_KEY=<YOUR_WIREGUARD_PRIVATE_KEY>
      - WIREGUARD_ADDRESSES=<YOUR_WIREGUARD_IP>/32
      - VPN_PORT_FORWARDING=on
      - HTTPPROXY=on
      - HTTPPROXY_LOG=on
      - FIREWALL_OUTBOUND_SUBNETS=172.28.0.0/16
      - TZ=America/Chicago
      - UPDATER_PERIOD=24h
    networks:
      - arrs
    restart: unless-stopped

  # ============================================
  # Firecrawl API (on arrs network, web traffic via Gluetun proxy)
  # ============================================
  firecrawl-api:
    build:
      context: ../firecrawl-src/apps/api
      dockerfile: Dockerfile
    container_name: firecrawl-api
    ulimits:
      nofile:
        soft: 65535
        hard: 65535
    networks:
      - arrs
    ports:
      - "3002:3002"
    depends_on:
      gluetun:
        condition: service_started
      redis:
        condition: service_started
      rabbitmq:
        condition: service_started
      playwright-service:
        condition: service_started
      nuq-postgres:
        condition: service_healthy
    environment:
      HOST: "0.0.0.0"
      PORT: "3002"
      WORKER_PORT: "3005"
      ENV: local
      # Use Docker service names (no hardcoded IPs)
      REDIS_URL: "redis://redis:6379"
      REDIS_RATE_LIMIT_URL: "redis://redis:6379"
      PLAYWRIGHT_MICROSERVICE_URL: "http://playwright-service:3000/scrape"
      # Variable names match the .env template above
      POSTGRES_USER: ${POSTGRES_USER:-firecrawl}
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD}
      POSTGRES_DB: ${POSTGRES_DB:-firecrawl}
      POSTGRES_HOST: "nuq-postgres"
      POSTGRES_PORT: "5432"
      # For local-only/self-hosted use, disable DB auth to avoid Supabase errors
      USE_DB_AUTHENTICATION: "false"
      NUM_WORKERS_PER_QUEUE: ${FIRECRAWL_NUM_WORKERS:-16}
      CRAWL_CONCURRENT_REQUESTS: ${FIRECRAWL_CONCURRENT_REQUESTS:-20}
      MAX_CONCURRENT_JOBS: ${FIRECRAWL_MAX_JOBS:-10}
      BROWSER_POOL_SIZE: ${FIRECRAWL_BROWSER_POOL:-10}
      OPENAI_BASE_URL: ${OPENAI_BASE_URL}
      OPENAI_API_KEY: ${OPENAI_API_KEY}
      MODEL_NAME: ${FIRECRAWL_MODEL_NAME}
      MODEL_EMBEDDING_NAME: ${FIRECRAWL_MODEL_EMBEDDING_NAME}
      BULL_AUTH_KEY: ${FIRECRAWL_BULL_AUTH_KEY}
      TEST_API_KEY: ${FIRECRAWL_TEST_API_KEY}
      LOGGING_LEVEL: ${LOG_LEVEL:-info}
      # Route scraping traffic through Gluetun VPN via HTTP proxy
      PROXY_SERVER: "http://gluetun:8888"
      PROXY_USERNAME: ${PROXY_USERNAME:-}
      PROXY_PASSWORD: ${PROXY_PASSWORD:-}
      # Internal services must bypass the proxy
      NO_PROXY: "localhost,127.0.0.1,redis,rabbitmq,nuq-postgres,playwright-service,searxng,host.docker.internal"
      SEARXNG_ENDPOINT: ${FIRECRAWL_SEARXNG_ENDPOINT}
      SEARXNG_ENGINES: ${FIRECRAWL_SEARXNG_ENGINES:-}
      SEARXNG_CATEGORIES: ${FIRECRAWL_SEARXNG_CATEGORIES:-}
      MAX_CPU: ${FIRECRAWL_MAX_CPU:-0.85}
      MAX_RAM: ${FIRECRAWL_MAX_RAM:-0.90}
      NUQ_RABBITMQ_URL: "amqp://rabbitmq:5672"
    volumes:
      - ./firecrawl-data:/app/data
    # Adjust the limits below to your host; Docker rejects a cpus value
    # larger than the number of cores available.
    cpus: 16.0
    mem_limit: 32G
    memswap_limit: 32G
    restart: unless-stopped
    logging:
      driver: "json-file"
      options:
        max-size: "10m"
        max-file: "5"
        compress: "true"

  # ============================================
  # Playwright Service (on arrs network, web traffic via Gluetun proxy)
  # ============================================
  playwright-service:
    build:
      context: ../firecrawl-src/apps/playwright-service-ts
      dockerfile: Dockerfile
    container_name: firecrawl-playwright
    shm_size: "2g"
    networks:
      - arrs
    depends_on:
      - gluetun
    environment:
      PORT: "3000"
      # Route browser traffic through Gluetun HTTP proxy
      PROXY_SERVER: "http://gluetun:8888"
      PROXY_USERNAME: ${PROXY_USERNAME:-}
      PROXY_PASSWORD: ${PROXY_PASSWORD:-}
      BLOCK_MEDIA: ${BLOCK_MEDIA:-}
      NO_PROXY: "localhost,127.0.0.1,redis,rabbitmq,nuq-postgres,playwright-service,searxng,host.docker.internal"
      MAX_CONCURRENT_PAGES: ${CRAWL_CONCURRENT_REQUESTS:-20}
    cpus: 8.0
    mem_limit: 16G
    memswap_limit: 16G
    restart: unless-stopped
    logging:
      driver: "json-file"
      options:
        max-size: "10m"
        max-file: "5"
        compress: "true"

  # ============================================
  # SearXNG (on arrs network, web traffic via Gluetun proxy)
  # ============================================
  searxng:
    container_name: firecrawl-searxng
    image: docker.io/searxng/searxng:${SEARXNG_VERSION:-latest}
    networks:
      - arrs
    ports:
      - "8085:8080" # Host port matches the curl test and Common URLs below
    env_file: ./.env
    volumes:
      - ./searxng-tunneled/:/etc/searxng/:Z
      - ./searxng-tunneled/core-data:/var/cache/searxng/
    restart: unless-stopped
    depends_on:
      - gluetun

  # ============================================
  # Redis (internal, NO VPN)
  # ============================================
  redis:
    image: redis:7-alpine
    container_name: firecrawl-redis
    # Use noeviction so jobs/queues are not silently dropped when memory is full
    command: redis-server --bind 0.0.0.0 --maxmemory 8gb --maxmemory-policy noeviction
    volumes:
      - ./firecrawl-redis:/data
    networks:
      - arrs
    restart: unless-stopped
    logging:
      driver: "json-file"
      options:
        max-size: "5m"
        max-file: "3"
        compress: "true"

  # ============================================
  # NuQ PostgreSQL (internal, NO VPN)
  # ============================================
  nuq-postgres:
    build:
      context: ../firecrawl-src/apps/nuq-postgres
      dockerfile: Dockerfile
    container_name: firecrawl-nuq-postgres
    environment:
      POSTGRES_USER: ${POSTGRES_USER:-firecrawl}
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD}
      POSTGRES_DB: ${POSTGRES_DB:-firecrawl}
    volumes:
      - ./firecrawl-postgres:/var/lib/postgresql/data
    networks:
      - arrs
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U ${POSTGRES_USER:-firecrawl} -d ${POSTGRES_DB:-firecrawl}"]
      start_period: 30s
      interval: 10s
      timeout: 5s
      retries: 10
    restart: unless-stopped
    logging:
      driver: "json-file"
      options:
        max-size: "10m"
        max-file: "5"
        compress: "true"

  # ============================================
  # RabbitMQ (message queue for workers)
  # ============================================
  rabbitmq:
    image: rabbitmq:3-management-alpine
    container_name: firecrawl-rabbitmq
    networks:
      - arrs
    restart: unless-stopped
    logging:
      driver: "json-file"
      options:
        max-size: "5m"
        max-file: "3"
        compress: "true"

networks:
  arrs:
    # Use the network created in Section 3 so the subnet stays 172.28.0.0/16
    external: true
The NuQ PostgreSQL database requires specific tables for job queues (nuq.queue_scrape, nuq.queue_crawl_finished, etc.). These tables are created by the nuq.sql init script, but only on first database initialization. If your PostgreSQL volume already contains data from a previous installation, you must apply the schema manually.
If you have the Firecrawl source code:
# Path to the NuQ SQL init script
FIRECRAWL_SRC_PATH=../firecrawl-src/apps/nuq-postgres/nuq.sql
If you don't have the source, clone it:
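cd /opt
# Clone into firecrawl-src so the ../firecrawl-src paths used in this guide resolve
git clone https://github.com/mendableai/firecrawl.git firecrawl-src
cd /opt/firecrawl-vpn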
# Copy the SQL file into the PostgreSQL container
docker cp ../firecrawl-src/apps/nuq-postgres/nuq.sql firecrawl-nuq-postgres:/tmp/nuq.sql
# Execute it against the database
docker exec firecrawl-nuq-postgres psql -U firecrawl -d firecrawl -f /tmp/nuq.sql
# Clean up
docker exec firecrawl-nuq-postgres rm /tmp/nuq.sql
# Connect to PostgreSQL and list tables
docker exec -it firecrawl-nuq-postgres psql -U firecrawl -d firecrawl
# In psql, run:
\dt nuq.*
# Expected output should show:
# List of relations
# Schema | Name | Type | Owner
# --------+--------------------+-------+----------
# nuq | queue_crawl_finished | table | firecrawl
# nuq | queue_crawl_init | table | firecrawl
# nuq | queue_scrape | table | firecrawl
# nuq | queue_scrape_done | table | firecrawl
# (4 rows)
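For scripted deployments, the same check can be made non-interactive. The sketch below assumes the container name, user, and database used in this guide and relies on PostgreSQL's `to_regclass`, which returns NULL (an empty line with `psql -tA`) when the relation does not exist. The `DOCKER` override is an illustrative hook for testing, not a Docker feature.

```shell
#!/usr/bin/env bash
# Succeeds when the nuq.queue_scrape table exists.
# Returns 1 when it is missing and 2 when the query itself fails.
# DOCKER may be overridden (for example with a stub); it defaults to docker.
nuq_schema_present() {
  result=$(${DOCKER:-docker} exec firecrawl-nuq-postgres \
    psql -U firecrawl -d firecrawl -tAc "SELECT to_regclass('nuq.queue_scrape')") || return 2
  [ -n "$result" ]
}

# Example:
#   if nuq_schema_present; then echo "schema present"; else echo "apply nuq.sql"; fi
```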
# Navigate to your deployment directory
cd /opt/firecrawl-vpn
# Enable BuildKit for faster builds
export DOCKER_BUILDKIT=1
export COMPOSE_DOCKER_CLI_BUILD=1
# Start all services in detached mode
docker compose up -d --remove-orphans
# Watch all container logs
docker compose logs -f
# Watch specific service logs
docker compose logs -f firecrawl-api
docker compose logs -f gluetun
Wait for all services to become healthy. The PostgreSQL container will show healthy status when ready.
docker compose ps
Expected output:
NAME                     STATUS
firecrawl-api            Up
firecrawl-gluetun        Up
firecrawl-nuq-postgres   Up (healthy)
firecrawl-playwright     Up
firecrawl-rabbitmq       Up
firecrawl-redis          Up
firecrawl-searxng        Up
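Instead of re-running docker compose ps by hand, the wait can be scripted. This is a minimal sketch that polls `docker inspect` for the health status. It only makes sense for containers that define a healthcheck, which in this guide's compose file is nuq-postgres. `wait_healthy` and the `DOCKER` override are hypothetical names.

```shell
#!/usr/bin/env bash
# Poll a container's health status until it is "healthy" or the timeout expires.
# Usage: wait_healthy <container> [timeout_seconds] [interval_seconds]
wait_healthy() {
  container=$1; timeout=${2:-120}; interval=${3:-5}; waited=0
  while [ "$waited" -lt "$timeout" ]; do
    status=$(${DOCKER:-docker} inspect --format '{{.State.Health.Status}}' "$container" 2>/dev/null)
    if [ "$status" = "healthy" ]; then
      echo "$container is healthy after ${waited}s"
      return 0
    fi
    sleep "$interval"
    waited=$((waited + interval))
  done
  echo "$container not healthy after ${timeout}s (last status: ${status:-unknown})"
  return 1
}

# Example:
#   wait_healthy firecrawl-nuq-postgres 180 5
```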
curl -X POST http://localhost:3002/v1/scrape \
-H "Content-Type: application/json" \
-d '{
"url": "https://example.com"
}' | jq .
Expected response: A JSON object with success: true and data containing the scraped page content.
curl -X POST http://localhost:3002/v1/crawl \
-H "Content-Type: application/json" \
-d '{
"url": "https://example.com"
}' | jq .
Expected response: A JSON object with success: true and an id (the crawl job ID) that you can use to check crawl status.
curl -X POST http://localhost:3002/v1/search \
-H "Content-Type: application/json" \
-d '{
"query": "test query"
}' | jq .
curl -X POST http://localhost:3002/v1/extract \
-H "Content-Type: application/json" \
-d '{
"urls": ["https://example.com"]
}' | jq .
curl "http://localhost:8085/search?q=test&format=json" | jq .
Expected response: JSON with search results from the self-hosted SearXNG engine.
Open your browser and navigate to:
http://localhost:3002/admin/YOUR_ADMIN_API_KEY_HERE/queues
You should see the BullMQ dashboard showing queue statuses for scrape, crawl, and extract jobs.
Check the Gluetun container logs for your public IP:
docker logs firecrawl-gluetun 2>&1 | grep -i "public ip"
Or test by scraping a site that returns your IP:
curl -X POST http://localhost:3002/v1/scrape \
-H "Content-Type: application/json" \
-d '{
"url": "https://api.ipify.org?format=json"
}' | jq .
The returned IP should match your ProtonVPN server's exit IP, not your real public IP.
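The comparison can also be done directly against Gluetun's HTTP proxy, which the compose file publishes on host port 8888. The following sketch assumes the proxy is reachable without credentials and that api.ipify.org is acceptable as an IP echo service; `vpn_check` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Compare the host's direct public IP with the IP seen through the Gluetun proxy.
# Returns 0 when they differ, 1 when they are equal (leak), 2 when either is empty.
vpn_check() {
  direct=$1; via_proxy=$2
  if [ -z "$direct" ] || [ -z "$via_proxy" ]; then
    echo "UNKNOWN: could not determine one of the addresses"
    return 2
  elif [ "$direct" = "$via_proxy" ]; then
    echo "LEAK: proxy traffic exits from your real IP ($direct)"
    return 1
  else
    echo "OK: direct=$direct vpn=$via_proxy"
  fi
}

# Example (run on the Docker host):
#   vpn_check "$(curl -fsS https://api.ipify.org)" \
#             "$(curl -fsS -x http://localhost:8888 https://api.ipify.org)"
```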
Symptom: Firecrawl API logs show Error: connect ETIMEDOUT when connecting to Redis at 172.28.0.2:6379.
Root Cause: Gluetun's firewall is blocking traffic to the Docker bridge network because either:
- FIREWALL_OUTBOUND_SUBNETS does not match the actual Docker network subnet, or
- Gluetun is not attached to the arrs network.
Fix:
# 1. Verify the Docker network subnet
docker network inspect arrs | grep Subnet
# 2. Check that FIREWALL_OUTBOUND_SUBNETS matches (in the gluetun environment block of docker-compose.yaml)
# Should be: FIREWALL_OUTBOUND_SUBNETS=172.28.0.0/16 (or whatever your subnet is)
# 3. Verify Gluetun is on the arrs network
docker network inspect arrs | grep gluetun
# 4. If missing, add networks: - arrs to the gluetun service in docker-compose.yaml
# 5. Recreate affected services so the configuration change takes effect
docker compose up -d gluetun firecrawl-api
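When the log names a specific address such as 172.28.0.2, it can help to confirm that the address actually falls inside the subnet Gluetun allows. The sketch below does the IPv4 arithmetic in plain shell; `ip_to_int` and `ip_in_cidr` are illustrative helpers and handle IPv4 only.

```shell
#!/usr/bin/env bash
# Convert a dotted-quad IPv4 address to an integer.
ip_to_int() {
  old_ifs=$IFS; IFS=.
  set -- $1            # split the address on dots into $1..$4
  IFS=$old_ifs
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeeds when address $1 lies inside CIDR block $2 (for example 172.28.0.0/16).
ip_in_cidr() {
  ip=$1; network=${2%/*}; bits=${2#*/}
  mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
  [ $(( $(ip_to_int "$ip") & mask )) -eq $(( $(ip_to_int "$network") & mask )) ]
}

# Example:
#   ip_in_cidr 172.28.0.2 172.28.0.0/16 && echo "inside allowed subnet"
```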
Symptom: Logs show relation "nuq.queue_scrape" does not exist or similar errors.
Root Cause: The NuQ database schema was never applied because the PostgreSQL volume already had data from a previous installation.
Fix:
# Apply the NuQ schema manually (see Section 7.2)
docker cp ../firecrawl-src/apps/nuq-postgres/nuq.sql firecrawl-nuq-postgres:/tmp/nuq.sql
docker exec firecrawl-nuq-postgres psql -U firecrawl -d firecrawl -f /tmp/nuq.sql
docker exec firecrawl-nuq-postgres rm /tmp/nuq.sql
# Restart firecrawl-api
docker compose restart firecrawl-api
Symptom: The extract-worker process crashes with Error: NUQ_RABBITMQ_URL is not configured.
Root Cause: RabbitMQ service is missing from the docker-compose.yaml or the environment variable is not set.
Fix:
# 1. Add RabbitMQ service to docker-compose.yaml (see Section 6)
rabbitmq:
  image: rabbitmq:3-management-alpine
  container_name: firecrawl-rabbitmq
  networks:
    - arrs
  restart: unless-stopped
# 2. Add NUQ_RABBITMQ_URL to firecrawl-api environment
NUQ_RABBITMQ_URL: amqp://rabbitmq:5672
# 3. Add rabbitmq to firecrawl-api depends_on
depends_on:
  rabbitmq:
    condition: service_started
# 4. Restart all services
docker compose up -d --remove-orphans
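Symptom: Logs show Supabase-related authentication errors.
Cause: DB/Supabase auth is enabled but not configured.
Fix (recommended for local/self-hosted): Set USE_DB_AUTHENTICATION: "false" in the firecrawl-api environment in docker-compose.yaml, then recreate the service with docker compose up -d firecrawl-api.
Symptom: firecrawl-api cannot connect to Redis, PostgreSQL, or RabbitMQ.
Cause: Containers were started from an older configuration and are not on the correct network.
Fix: Recreate the stack so every container joins the arrs network (docker compose down, then docker compose up -d --remove-orphans), and confirm with docker network inspect arrs that all services are listed.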
Symptom: Gluetun logs show repeated connection attempts or authentication failures.
Fix:
# Check Gluetun logs for specific error
docker logs firecrawl-gluetun --tail 50
# Common fixes:
# 1. Verify WIREGUARD_PRIVATE_KEY and WIREGUARD_ADDRESSES are correct
# 2. Check if the selected server is available
# 3. Try a different VPN_TYPE (wireguard vs openvpn)
# 4. Ensure the /dev/net/tun device is mounted
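Symptom: Scraped content shows your real IP instead of the VPN exit IP.
Fix: Confirm that firecrawl-api and playwright-service both set PROXY_SERVER=http://gluetun:8888, that Gluetun has HTTPPROXY=on, and that the Gluetun logs report a VPN public IP. Then recreate the services with docker compose up -d.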
Symptom: SearXNG returns empty results or errors when queried.
Fix:
# 1. Check SearXNG logs
docker logs firecrawl-searxng --tail 50
# 2. Verify settings.yml is mounted correctly
docker compose exec searxng ls /etc/searxng/settings.yml
# 3. Ensure SEARXNG_SECRET is set (required for security)
# 4. Check that SearXNG is on the same network as firecrawl-api (arrs)
Symptom: System becomes unresponsive or containers are killed due to OOM.
Fix:
# 1. Reduce concurrency in .env
FIRECRAWL_CONCURRENCY=5
PLAYWRIGHT_MAX_CONCURRENCY_PER_PROJECT=2
# 2. Limit container memory in docker-compose.yaml
deploy:
  resources:
    limits:
      memory: 2G
# 3. Monitor resource usage
docker stats
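To spot the containers that are closest to their limit, the docker stats output can be filtered. This is a small sketch built on `docker stats --no-stream --format`; `over_mem_threshold` is a hypothetical helper. Note that the percentage is relative to each container's memory limit, or to host memory when no limit is set.

```shell
#!/usr/bin/env bash
# Read "name percent%" lines on stdin and print those at or above a threshold
# (default 80). Expects input like: firecrawl-api 91.20%
over_mem_threshold() {
  awk -v limit="${1:-80}" '{
    pct = $2
    sub(/%$/, "", pct)            # strip the trailing percent sign
    if (pct + 0 >= limit + 0) print $1, $2
  }'
}

# Example:
#   docker stats --no-stream --format '{{.Name}} {{.MemPerc}}' | over_mem_threshold 80
```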
Note: If using pre-built Firecrawl images instead of building from source, you may encounter Supabase-related errors. The NuQ schema and worker setup described in this guide is specific to the self-hosted NuQ architecture. Pre-built images may expect different database tables or connection strings.
# Navigate to deployment directory
cd /opt/firecrawl-vpn
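# Pull latest images for services that use prebuilt images
docker compose pull
# Update the source and rebuild the services that are built from it
git -C ../firecrawl-src pull
docker compose build --pull
# Stop and remove old containers
docker compose down
# Start with new images
export DOCKER_BUILDKIT=1
docker compose up -d --remove-orphans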
# Check logs for any migration issues
docker compose logs -f firecrawl-api
# Create a timestamped backup directory
BACKUP_DIR=/opt/firecrawl-vpn-backups/$(date +%Y%m%d_%H%M%S)
mkdir -p "$BACKUP_DIR/redis-data"
# Back up PostgreSQL data (bind-mounted at ./firecrawl-postgres; stop the stack first for a consistent copy)
docker run --rm \
  -v "$(pwd)/firecrawl-postgres:/data/postgres:ro" \
  -v "$BACKUP_DIR:/data/backup" \
  alpine tar czf /data/backup/postgres-backup-$(date +%Y%m%d).tar.gz -C /data/postgres .
# Back up Redis data
cp -r firecrawl-redis/. "$BACKUP_DIR/redis-data/"
# Back up configuration files
cp .env docker-compose.yaml settings.yml "$BACKUP_DIR/"
echo "Backup complete: $BACKUP_DIR"
# Stop services
docker compose down
# Point at the backup to restore (replace the timestamp placeholders)
BACKUP_DIR=/opt/firecrawl-vpn-backups/YYYYMMDD_HHMMSS
# Restore PostgreSQL
docker run --rm \
  -v "$(pwd)/firecrawl-postgres:/data/postgres" \
  -v "$BACKUP_DIR:/data/backup:ro" \
  alpine sh -c "cd /data/postgres && tar xzf /data/backup/postgres-backup-YYYYMMDD.tar.gz"
# Restore Redis
cp -r "$BACKUP_DIR"/redis-data/. firecrawl-redis/
# Restart services
docker compose up -d
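Before relying on a backup, it is worth checking that the archive is readable and looks like a PostgreSQL data directory. The sketch below assumes the archive was created from the root of the data directory, as in the backup command, so that PG_VERSION sits at the top level; `verify_pg_backup` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Succeeds when the file is a readable gzip tarball with PG_VERSION at its root.
verify_pg_backup() {
  tar -tzf "$1" 2>/dev/null | grep -Eqx '(\./)?PG_VERSION'
}

# Example:
#   verify_pg_backup "$BACKUP_DIR/postgres-backup-YYYYMMDD.tar.gz" && echo "archive looks valid"
```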
# View all container status
docker compose ps
# View logs for a specific service
docker compose logs -f firecrawl-api
# Restart a specific service
docker compose restart firecrawl-api
# Stop all services
docker compose down
# Stop and remove volumes (WARNING: deletes all data)
docker compose down -v
# Execute shell inside a container
docker exec -it firecrawl-api sh
docker exec -it firecrawl-nuq-postgres psql -U firecrawl -d firecrawl
# Check network connectivity between containers
docker exec firecrawl-api ping redis
docker exec firecrawl-api ping nuq-postgres
docker exec firecrawl-api ping rabbitmq
# View container resource usage
docker stats
# Inspect Docker network
docker network inspect arrs
# Check Gluetun VPN connection status
docker logs firecrawl-gluetun | grep -i "public ip\|connected"
# Test if traffic is going through VPN
curl -X POST http://localhost:3002/v1/scrape \
-H "Content-Type: application/json" \
-d '{"url": "https://api.ipify.org?format=json"}' | jq .
# List active WireGuard interfaces in Gluetun
docker exec firecrawl-gluetun wg show
| Variable | Description | Default | Required |
|---|---|---|---|
| FIRECRAWL_VERSION | Firecrawl image version | latest | No |
| FIRECRAWL_BULL_AUTH_KEY | Admin API key for BullMQ UI | None | Yes |
| FIRECRAWL_BASE_URL | Base URL for the API | http://localhost:3002 | No |
| FIRECRAWL_USE_DB_AUTH | Enable database (Supabase) authentication | false | Yes (set to false for self-hosted) |
| FIRECRAWL_CONCURRENCY | Max concurrent scrape jobs | 10 | No |
| LOG_LEVEL | Logging verbosity | info | No |
| Variable | Description | Default | Required |
|---|---|---|---|
| POSTGRES_USER | PostgreSQL username | firecrawl | Yes |
| POSTGRES_PASSWORD | PostgreSQL password | None | Yes |
| POSTGRES_DB | Database name | firecrawl | No |
| NUQ_POSTGRES_URL | Full connection string to NuQ PostgreSQL | None | Yes |
| Variable | Description | Default | Required |
|---|---|---|---|
| REDIS_URL | Redis connection URL | redis://redis:6379 | Yes |
| REDIS_RATE_LIMIT_URL | Redis rate limiting URL | redis://redis:6379 | Yes |
| Variable | Description | Default | Required |
|---|---|---|---|
| NUQ_RABBITMQ_URL | RabbitMQ connection URL | amqp://rabbitmq:5672 | Yes |
| Variable | Description | Default | Required |
|---|---|---|---|
| OPENAI_API_KEY | OpenAI-compatible API key | None | Conditional |
| OPENAI_BASE_URL | OpenAI-compatible endpoint URL | https://api.openai.com/v1 | No |
| Variable | Description | Default | Required |
|---|---|---|---|
| VPN_SERVICE_PROVIDER | VPN provider name | protonvpn | Yes |
| VPN_TYPE | VPN protocol | wireguard | Yes |
| WIREGUARD_PRIVATE_KEY | WireGuard private key from your ProtonVPN configuration | None | Yes |
| WIREGUARD_ADDRESSES | WireGuard interface address from your ProtonVPN configuration | None | Yes |
| SERVER_COUNTRIES | Preferred VPN server country | US | No |
| FIREWALL_ENABLED | Enable outbound firewall | yes | Yes |
| FIREWALL_OUTBOUND_PORTS | Allowed outbound ports | 80,443 | No |
| FIREWALL_OUTBOUND_SUBNETS | Internal networks to allow | Must match Docker subnet | Yes |
| Variable | Description | Default | Required |
|---|---|---|---|
| SEARXNG_SECRET | SearXNG secret key for sessions | None | Yes |
| SEARXNG_IMAGE_PROXY | Enable image proxy | true | No |
| SEARXNG_PORT | SearXNG listening port | 8085 | No |
Internet
    |
    v
[Gluetun VPN Tunnel]
    |
    +-- HTTP Proxy (port 8888)
    |
    +-- firecrawl-api (networks: arrs, uses PROXY_SERVER=http://gluetun:8888)
    |     |
    |     +-- playwright-service (networks: arrs, uses PROXY_SERVER=http://gluetun:8888)
    |     |
    |     +-- searxng (networks: arrs, web traffic via same proxy)
    |
[arrs Docker Network] <--- All services share this network
    |
    +-- redis
    +-- nuq-postgres
    +-- rabbitmq
User Request
     │
     ▼
Firecrawl API (on arrs network, uses PROXY_SERVER=http://gluetun:8888)
     │
     ├──► Redis (redis:6379) ──────► Session/Rate Limiting
     │
     ├──► NuQ PostgreSQL (nuq-postgres:5432) ──► Store Scraped Data
     │
     ├──► RabbitMQ (rabbitmq:5672) ──────► Job Queue
     │                                        │
     │                                        ▼
     │                          Extract Worker / Crawl Worker
     │
     ├──► Playwright Service (playwright-service:3000) ──► Browser Rendering
     │
     └──► SearXNG (searxng:8080) ──► Search Results
| Service | CPU | RAM | Disk |
|---|---|---|---|
| Gluetun | 0.1 core | 100 MB | Minimal |
| Firecrawl API | 0.5 core | 500 MB | Minimal |
| Playwright | 0.5 core | 500 MB | Minimal |
| SearXNG | 0.2 core | 200 MB | Minimal |
| Redis | 0.1 core | 256 MB | Depends on data |
| PostgreSQL | 0.5 core | 512 MB | Depends on data |
| RabbitMQ | 0.1 core | 128 MB | Depends on queue size |
Total Minimum: 2 cores, 2.2 GB RAM, 20 GB disk
Total Recommended: 4 cores, 8 GB RAM, 50 GB SSD
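The minimum totals follow from summing the per-service rows. As a quick arithmetic check, the sketch below adds the CPU and RAM columns with awk. The input is the table's figures retyped as plain text, and `sum_resources` is an illustrative helper.

```shell
#!/usr/bin/env bash
# Sum the CPU (column 2, cores) and RAM (column 3, MB) figures read from stdin.
sum_resources() {
  awk '{ cpu += $2; ram += $3 } END { printf "%.1f cores, %d MB\n", cpu, ram }'
}

sum_resources <<'EOF'
Gluetun 0.1 100
Firecrawl-API 0.5 500
Playwright 0.5 500
SearXNG 0.2 200
Redis 0.1 256
PostgreSQL 0.5 512
RabbitMQ 0.1 128
EOF
# prints: 2.0 cores, 2196 MB
```

2196 MB rounds to the 2.2 GB quoted above. The 4 GB system minimum leaves headroom for the operating system and for browser rendering spikes.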
┌─────────────────────────────────────────────────────────────┐
│                    Quick Start Checklist                    │
├─────────────────────────────────────────────────────────────┤
│ □ Install Docker, Docker Compose V2, Buildx                 │
│ □ Create Docker network: docker network create --subnet ... │
│ □ Get ProtonVPN credentials                                 │
│ □ Create .env file with your credentials                    │
│ □ Create docker-compose.yaml (Section 6)                    │
│ □ Create settings.yml for SearXNG                           │
│ □ Start services: docker compose up -d                      │
│ □ Apply NuQ schema if needed (Section 7)                    │
│ □ Verify with curl tests (Section 9)                        │
│ □ Check VPN is working (Section 9.7)                        │
└─────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────┐
│                         Common URLs                         │
├─────────────────────────────────────────────────────────────┤
│ Firecrawl API: http://<host>:3002/v1/scrape                 │
│ Admin UI:      http://<host>:3002/admin/<key>/queues        │
│ SearXNG:       http://<host>:8085/search                    │
│ RabbitMQ Mgmt: http://<host>:15672 (default guest/guest)    │
└─────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────┐
│                      Critical Settings                      │
├─────────────────────────────────────────────────────────────┤
│ FIREWALL_OUTBOUND_SUBNETS must match Docker network subnet  │
│ Gluetun MUST have: networks: - arrs                         │
│ USE_DB_AUTHENTICATION should be "false" when self-hosted    │
│ NUQ_RABBITMQ_URL must be set for extract workers            │
│ NuQ schema must be applied if PostgreSQL volume exists      │
│ Use PROXY_SERVER=http://gluetun:8888 for VPN routing        │
└─────────────────────────────────────────────────────────────┘