Service Health Monitoring for Self-Hosted Supabase: Configure Health Checks and Auto-Recovery

Learn to configure Docker health checks, monitoring endpoints, and auto-recovery for self-hosted Supabase services.


Running self-hosted Supabase in production means taking responsibility for uptime. Unlike Supabase Cloud's 99.9% SLA, your self-hosted instance has no safety net—when services fail, you need to know immediately and recover automatically. This guide walks through configuring health checks, monitoring endpoints, and auto-recovery for every Supabase service.

If you're new to self-hosting Supabase, start with our deployment guide first. This article assumes you have a running Docker Compose deployment and want to harden it for production reliability.

Understanding the Supabase Service Architecture

Self-hosted Supabase runs eight or more Docker containers, each with different failure modes. Understanding the dependencies between services helps you prioritize what to monitor.

┌─────────────────────────────────────────────────────────┐
│                    Kong (API Gateway)                   │
│                    Port 8000/8443                       │
└────────────┬──────────┬──────────┬──────────┬──────────┘
             │          │          │          │
    ┌────────▼───┐ ┌────▼────┐ ┌───▼───┐ ┌───▼─────┐
    │  PostgREST │ │  Auth   │ │Storage│ │Realtime │
    │  Port 3000 │ │Port 9999│ │  5000 │ │  4000   │
    └──────┬─────┘ └────┬────┘ └───┬───┘ └────┬────┘
           │            │          │          │
    ┌──────▼────────────▼──────────▼──────────▼──────┐
    │               PostgreSQL (db)                   │
    │                   Port 5432                     │
    └─────────────────────────────────────────────────┘

Critical path dependencies:

  • PostgreSQL (db) must be healthy before any other service starts
  • Kong routes all external traffic—if it fails, everything appears down
  • Auth, PostgREST, Storage, and Realtime depend on the database
  • Supavisor (connection pooler) sits between services and the database

The order matters. A healthy-looking Auth service is useless if the database connection is broken.
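Docker Compose can encode this ordering directly. The official compose file uses depends_on with health conditions; a sketch of the pattern (service names match the defaults, but verify against your own file):

```yaml
services:
  auth:
    depends_on:
      db:
        # Wait until the db healthcheck passes, not merely until the container starts
        condition: service_healthy
  rest:
    depends_on:
      db:
        condition: service_healthy
```

With condition: service_healthy, a dependent service won't start until the database's health check reports healthy, which prevents the restart churn you'd otherwise see on boot.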

Built-In Docker Health Checks

The official Supabase Docker Compose file includes health checks for most services. Let's examine what's configured by default and how to verify it's working.

Checking Current Health Status

Run this command to see the health status of all containers:

docker compose ps --format "table {{.Name}}\t{{.Status}}\t{{.Health}}"

You should see output like:

NAME                 STATUS          HEALTH
supabase-db          Up 2 hours      healthy
supabase-kong        Up 2 hours      healthy
supabase-auth        Up 2 hours      healthy
supabase-rest        Up 2 hours      healthy
supabase-realtime    Up 2 hours      healthy
supabase-storage     Up 2 hours      healthy
supabase-studio      Up 2 hours      healthy

If any service shows unhealthy or starting for more than a few minutes, something is wrong.
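If you want to act on that output from a script, filter the HEALTH column. A minimal sketch, with sample output inlined so it runs anywhere; in practice you'd pipe docker compose ps into the awk:

```shell
# List services whose HEALTH column is not "healthy".
# The sample stands in for real output of:
#   docker compose ps --format "table {{.Name}}\t{{.Status}}\t{{.Health}}"
sample=$(printf 'NAME\tSTATUS\tHEALTH\nsupabase-db\tUp 2 hours\thealthy\nsupabase-auth\tUp 2 hours\tunhealthy\nsupabase-storage\tUp 5 seconds\tstarting')

# Skip the header row; print names where the third column is not "healthy"
unhealthy_list=$(printf '%s\n' "$sample" | awk -F'\t' 'NR > 1 && $3 != "healthy" { print $1 }')
printf '%s\n' "$unhealthy_list"
```

This prints supabase-auth and supabase-storage for the sample above, giving you a list you can feed to a restart or alerting step.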

Database Health Check Configuration

PostgreSQL is the foundation. The default health check uses pg_isready:

db:
  healthcheck:
    test: ["CMD", "pg_isready", "-U", "postgres", "-h", "localhost"]
    interval: 5s
    timeout: 5s
    retries: 10

This checks if Postgres is accepting connections. For production, consider a more thorough check that also confirms the server can execute queries (note that SELECT 1 proves the database is queryable, not writable):

db:
  healthcheck:
    test: ["CMD-SHELL", "pg_isready -U postgres && psql -U postgres -c 'SELECT 1'"]
    interval: 10s
    timeout: 5s
    retries: 5
    start_period: 30s

The start_period gives the database time to initialize before Docker starts counting failures.
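When tuning these values, it helps to know the worst-case detection delay: roughly start_period plus interval times retries before Docker flags the container unhealthy. For the values above:

```shell
# Worst-case detection delay ~= start_period + interval * retries
# (each failed probe can also add up to `timeout` seconds on top)
start_period=30; interval=10; retries=5
detect_delay=$(( start_period + interval * retries ))
echo "${detect_delay}s until Docker flags the container unhealthy"
```

Eighty seconds may be fine for a database that restarts rarely; for a stateless service you might halve the interval and retries to detect failures faster.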

Auth Service Health Check

The Auth service (GoTrue) exposes a /health endpoint. The default configuration checks this endpoint:

auth:
  healthcheck:
    test: ["CMD", "wget", "--no-verbose", "--tries=1", "--spider", "http://localhost:9999/health"]
    interval: 10s
    timeout: 5s
    retries: 3

You can also test this manually:

curl http://localhost:9999/health
# Response: {"version":"2.x.x","name":"GoTrue","description":"..."}

Realtime Service Health Check

The Realtime service has a tenant-specific health endpoint. This is where things get tricky for self-hosted deployments:

realtime:
  healthcheck:
    test: ["CMD", "curl", "-sSfL", "--head", "-o", "/dev/null", 
           "-H", "Authorization: Bearer ${ANON_KEY}",
           "http://localhost:4000/api/tenants/realtime-dev/health"]
    interval: 10s
    timeout: 5s
    retries: 3

A common issue: the Realtime service enters a restart loop due to an unbound RLIMIT_NOFILE variable. If your Realtime container keeps restarting, add these environment variables:

realtime:
  environment:
    RLIMIT_NOFILE: "10000"
    SEED_SELF_HOST: "true"

This stabilizes the service and ensures proper initialization in self-hosted environments.

PostgREST Health Check

PostgREST doesn't expose a dedicated health endpoint, but its root endpoint works as a liveness probe. Note that some PostgREST images are minimal and ship without curl or wget; if the probe below fails with an executable-not-found error, run the check from the host or a sidecar container instead.

rest:
  healthcheck:
    test: ["CMD", "curl", "-sSf", "http://localhost:3000/"]
    interval: 10s
    timeout: 5s
    retries: 3

A successful response returns the OpenAPI schema. Any non-200 response indicates a problem—usually a database connection issue.
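Checking the response body, not just the status code, catches more failure modes. A sketch with a sample body inlined (in practice, substitute the output of curl -sf http://localhost:3000/; the top-level key is "swagger" on older PostgREST versions and may be "openapi" on newer ones):

```shell
# Sample body standing in for: curl -sf http://localhost:3000/
body='{"swagger":"2.0","info":{"title":"PostgREST API"}}'

# A healthy PostgREST serves its OpenAPI document at the root
if printf '%s' "$body" | grep -q -e '"swagger"' -e '"openapi"'; then
  rest_status="rest ok"
else
  rest_status="rest degraded"
fi
echo "$rest_status"
```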

External Health Monitoring

Docker health checks handle container-level recovery, but you also need external monitoring to alert you when something fails. The health checks run inside your server; if the server itself fails, they can't notify you.

HTTP Endpoint Monitoring

Set up external uptime monitoring for these critical endpoints:

Service        Endpoint                                        Expected Response
API Gateway    https://your-domain.com/rest/v1/                200 OK
Auth           https://your-domain.com/auth/v1/health          200 with JSON
Storage        https://your-domain.com/storage/v1/             200 OK
Realtime       wss://your-domain.com/realtime/v1/websocket     WebSocket upgrade

Note that Kong requires an apikey header (the anon key) on most routes, so configure your monitor to send it or the checks will return 401.

Services like Better Uptime, Uptime Robot, or self-hosted Uptime Kuma work well. Configure alerts for:

  • Response time exceeding 500ms
  • Any non-2xx status code
  • SSL certificate expiration (7 days warning)

PostgreSQL-Specific Monitoring

The database needs deeper monitoring than HTTP checks. Track these metrics:

-- Active connections (should be < max_connections)
SELECT count(*) FROM pg_stat_activity;

-- Long-running queries (potential locks)
SELECT pid, now() - pg_stat_activity.query_start AS duration, query
FROM pg_stat_activity
WHERE state != 'idle'
AND now() - pg_stat_activity.query_start > interval '5 minutes';

-- Dead tuples (indicates need for vacuum)
SELECT relname, n_dead_tup
FROM pg_stat_user_tables
WHERE n_dead_tup > 1000
ORDER BY n_dead_tup DESC;
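A thin shell wrapper turns the first query into an alert. A sketch with illustrative numbers; in production, feed it the output of psql -U postgres -tAc 'SELECT count(*) FROM pg_stat_activity' and your configured max_connections:

```shell
# Warn when active connections cross 80% of max_connections.
check_connections() {
  current=$1; max=$2
  threshold=$(( max * 80 / 100 ))
  if [ "$current" -ge "$threshold" ]; then
    echo "WARN: $current/$max connections (threshold $threshold)"
    return 1
  fi
  echo "OK: $current/$max connections"
}

check_connections 45 100          # under threshold
check_connections 92 100 || true  # over threshold -- fire an alert here
```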

For comprehensive database monitoring, see our guide on monitoring self-hosted Supabase.

Configuring Auto-Recovery

Docker's restart policies handle automatic recovery when containers fail. But the default configuration may not match your production needs.

Restart Policies Explained

Docker supports four restart policies:

Policy           Behavior
no               Never restart
on-failure       Restart only on non-zero exit code
always           Always restart; a manually stopped container comes back when the Docker daemon restarts
unless-stopped   Restart on failure, but stay stopped after a manual stop

For production, use unless-stopped for most services:

services:
  db:
    restart: unless-stopped
  
  kong:
    restart: unless-stopped
  
  auth:
    restart: unless-stopped

Combining Health Checks with Restart Policies

When a health check fails, Docker marks the container as unhealthy but doesn't automatically restart it. You need external orchestration or a workaround.

Option 1: Use Docker autoheal

Deploy docker-autoheal alongside your Supabase stack:

autoheal:
  image: willfarrell/autoheal
  restart: always
  environment:
    AUTOHEAL_CONTAINER_LABEL: all
  volumes:
    - /var/run/docker.sock:/var/run/docker.sock

This watches all containers and restarts any that become unhealthy.
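If you'd rather opt services in one by one than watch everything, autoheal also supports label scoping. A sketch of that pattern (per the project's README, the default watch label is autoheal):

```yaml
autoheal:
  image: willfarrell/autoheal
  restart: always
  environment:
    AUTOHEAL_CONTAINER_LABEL: autoheal   # only watch labeled containers
  volumes:
    - /var/run/docker.sock:/var/run/docker.sock

db:
  labels:
    - autoheal=true
```

Label scoping avoids surprise restarts of containers you'd rather inspect manually, at the cost of remembering to label each service you want covered.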

Option 2: Systemd timer for health checks

Create a systemd service that checks health and restarts unhealthy containers:

# /etc/systemd/system/supabase-health.service
[Unit]
Description=Check Supabase health and restart unhealthy containers

[Service]
Type=oneshot
ExecStart=/usr/local/bin/supabase-health-check.sh

And the script it invokes:

#!/bin/bash
# /usr/local/bin/supabase-health-check.sh

cd /path/to/supabase || exit 1

# docker compose ps can't filter on health, so query the Docker engine directly
unhealthy=$(docker ps -q --filter "health=unhealthy" --filter "name=supabase")

if [ -n "$unhealthy" ]; then
    echo "Unhealthy containers found, restarting..."
    # docker restart takes container IDs; docker compose restart expects service names
    docker restart $unhealthy

    # Send alert (webhook, email, etc.)
    curl -X POST "https://your-webhook-url" \
         -H "Content-Type: application/json" \
         -d '{"text":"Supabase containers restarted due to health check failure"}'
fi

Schedule it with a timer:

# /etc/systemd/system/supabase-health.timer
[Unit]
Description=Run Supabase health check every minute

[Timer]
OnBootSec=1min
OnUnitActiveSec=1min

[Install]
WantedBy=timers.target

Enable with: systemctl enable --now supabase-health.timer

Handling Common Failure Scenarios

Scenario 1: Database Connection Pool Exhaustion

Symptoms: Auth and PostgREST return 500 errors. Logs show "too many connections."

Health check impact: Services may still appear healthy because they're running—they just can't connect to the database.

Solution: Monitor connection count and configure connection pooling. Add a custom health check that validates database connectivity:

auth:
  healthcheck:
    # Quote the URL so the shell doesn't treat & as a background operator
    test: ["CMD-SHELL", "wget -q --spider http://localhost:9999/health && wget -q --spider --header='Authorization: Bearer ${SERVICE_ROLE_KEY}' 'http://localhost:9999/admin/users?page=1&per_page=1' || exit 1"]

This fails if Auth can't actually query the database.

Scenario 2: Storage Running But Files Inaccessible

Symptoms: Storage API returns 200 but file uploads fail silently.

Root cause: Usually S3/MinIO credentials expired or storage backend unreachable.

Better health check:

storage:
  healthcheck:
    # /bucket requires authentication, so pass the service role key
    test: ["CMD-SHELL", "curl -sf http://localhost:5000/status && curl -sf -H 'Authorization: Bearer ${SERVICE_ROLE_KEY}' http://localhost:5000/bucket || exit 1"]
    interval: 30s

For production storage configuration, see our guide on S3 storage for self-hosted Supabase.

Scenario 3: Realtime Websocket Degradation

Symptoms: Websocket connections drop intermittently. Health check passes but clients experience disconnects.

Root cause: The Realtime service runs on the BEAM VM (Elixir/Erlang), which handles degraded states gracefully. It may report healthy while shedding connections under load.

Solution: Add connection count monitoring:

# Check current websocket connections
curl -s "http://localhost:4000/api/tenants/realtime-dev/stats" \
     -H "Authorization: Bearer ${ANON_KEY}"

Alert if connections drop suddenly or if the connection count exceeds your expected baseline.
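The sudden-drop heuristic can be sketched as follows (the 50% floor and the numbers are illustrative; in practice, parse the count out of the stats endpoint response above):

```shell
# Flag a websocket connection count more than 50% below baseline.
check_ws_drop() {
  current=$1; baseline=$2
  floor=$(( baseline / 2 ))
  if [ "$current" -lt "$floor" ]; then
    echo "ALERT: connections dropped to $current (baseline $baseline)"
    return 1
  fi
  echo "OK: $current connections (baseline $baseline)"
}

check_ws_drop 480 500          # within normal range
check_ws_drop 120 500 || true  # sudden drop -- page someone
```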

Creating a Comprehensive Health Dashboard

For teams managing self-hosted Supabase, a single dashboard showing all service health saves debugging time. Here's a simple approach using a shell script and a status page:

#!/bin/bash
# /usr/local/bin/supabase-status.sh

echo "=== Supabase Health Status ==="
echo "Time: $(date)"
echo ""

# Check each service
services=("db" "kong" "auth" "rest" "realtime" "storage" "studio")

for service in "${services[@]}"; do
    status=$(docker inspect --format='{{.State.Health.Status}}' "supabase-$service" 2>/dev/null || echo "not found")
    
    case $status in
        healthy)   emoji="✓" ;;
        unhealthy) emoji="✗" ;;
        starting)  emoji="⟳" ;;
        *)         emoji="?" ;;
    esac
    
    printf "%-12s %s %s\n" "$service" "$emoji" "$status"
done

echo ""
echo "=== Resource Usage ==="
docker stats --no-stream --format "table {{.Name}}\t{{.CPUPerc}}\t{{.MemUsage}}" | grep supabase

Run this as a cron job and pipe output to a monitoring system or serve it via a simple HTTP endpoint.
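A crontab entry for this might look like the following (the path and the five-minute interval are examples, not requirements):

```
# m h dom mon dow  command
*/5 * * * * /usr/local/bin/supabase-status.sh > /var/www/html/supabase-status.txt 2>&1
```

Writing the output to a file under your web root gives you a zero-dependency status page any HTTP server can serve.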

Supascale: Automated Health Management

Configuring health checks, monitoring, and auto-recovery for eight services is operational overhead. If you're running multiple self-hosted Supabase projects, this complexity multiplies.

Supascale handles service health monitoring automatically. The platform monitors all Supabase services, alerts you before issues cause downtime, and provides one-click access to logs when debugging is needed. Combined with automated backups and custom domain management, you get production-grade operations without the DevOps burden.

Summary

Service health monitoring for self-hosted Supabase requires:

  1. Docker health checks for each service with appropriate intervals and retry counts
  2. External uptime monitoring to catch server-level failures
  3. Auto-recovery automation using autoheal, systemd timers, or orchestration tools
  4. Database-specific monitoring beyond simple HTTP checks
  5. Alerting before users notice problems

Start with the default health checks in Docker Compose, add external monitoring for critical endpoints, and implement auto-recovery for unattended operation. For teams who'd rather focus on building than operating, Supascale provides this infrastructure out of the box.
