Running self-hosted Supabase in production means taking responsibility for uptime. Unlike Supabase Cloud's 99.9% SLA, your self-hosted instance has no safety net—when services fail, you need to know immediately and recover automatically. This guide walks through configuring health checks, monitoring endpoints, and auto-recovery for every Supabase service.
If you're new to self-hosting Supabase, start with our deployment guide first. This article assumes you have a running Docker Compose deployment and want to harden it for production reliability.
Understanding the Supabase Service Architecture
Self-hosted Supabase runs eight or more Docker containers, each with different failure modes. Understanding the dependencies between services helps you prioritize what to monitor.
┌─────────────────────────────────────────────────────────┐
│ Kong (API Gateway) │
│ Port 8000/8443 │
└────────────┬──────────┬──────────┬──────────┬──────────┘
│ │ │ │
┌────────▼───┐ ┌────▼────┐ ┌───▼───┐ ┌───▼─────┐
│ PostgREST │ │ Auth │ │Storage│ │Realtime │
│ Port 3000 │ │Port 9999│ │ 5000 │ │ 4000 │
└──────┬─────┘ └────┬────┘ └───┬───┘ └────┬────┘
│ │ │ │
┌──────▼────────────▼──────────▼──────────▼──────┐
│ PostgreSQL (db) │
│ Port 5432 │
└─────────────────────────────────────────────────┘
Critical path dependencies:
- PostgreSQL (db) must be healthy before any other service starts
- Kong routes all external traffic—if it fails, everything appears down
- Auth, PostgREST, Storage, and Realtime depend on the database
- Supavisor (connection pooler) sits between services and the database
The order matters. A healthy-looking Auth service is useless if the database connection is broken.
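If you script restarts or deployments yourself, you can respect this ordering from the shell. A minimal sketch, assuming the default container and service names from the official Compose file (adjust to your project):

# Wait for the database to report healthy before starting dependent services
until [ "$(docker inspect --format='{{.State.Health.Status}}' supabase-db)" = "healthy" ]; do
  sleep 2
done
docker compose up -d auth rest storage realtime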
Built-In Docker Health Checks
The official Supabase Docker Compose file includes health checks for most services. Let's examine what's configured by default and how to verify it's working.
Checking Current Health Status
Run this command to see the health status of all containers:
docker compose ps --format "table {{.Name}}\t{{.Status}}\t{{.Health}}"
You should see output like:
NAME                 STATUS       HEALTH
supabase-db          Up 2 hours   healthy
supabase-kong        Up 2 hours   healthy
supabase-auth        Up 2 hours   healthy
supabase-rest        Up 2 hours   healthy
supabase-realtime    Up 2 hours   healthy
supabase-storage     Up 2 hours   healthy
supabase-studio      Up 2 hours   healthy
If any service shows unhealthy or starting for more than a few minutes, something is wrong.
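To see why a specific container is failing, inspect its health check log. A quick sketch, assuming jq is installed; swap in whichever container you're debugging:

# Show the last three health check attempts, including exit codes and output
docker inspect --format '{{json .State.Health}}' supabase-auth | jq '.Log[-3:]'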
Database Health Check Configuration
PostgreSQL is the foundation. The default health check uses pg_isready:
db:
healthcheck:
test: ["CMD", "pg_isready", "-U", "postgres", "-h", "localhost"]
interval: 5s
timeout: 5s
retries: 10
This checks whether Postgres is accepting connections. For production, consider a more thorough check that also confirms the database can execute queries:
db:
healthcheck:
test: ["CMD-SHELL", "pg_isready -U postgres && psql -U postgres -c 'SELECT 1'"]
interval: 10s
timeout: 5s
retries: 5
start_period: 30s
The start_period gives the database time to initialize before Docker starts counting failures.
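You can run the same command manually inside the container to confirm the check passes before relying on it:

docker compose exec db sh -c "pg_isready -U postgres && psql -U postgres -c 'SELECT 1'"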
Auth Service Health Check
The Auth service (GoTrue) exposes a /health endpoint. The default configuration checks this endpoint:
auth:
healthcheck:
test: ["CMD", "wget", "--no-verbose", "--tries=1", "--spider", "http://localhost:9999/health"]
interval: 10s
timeout: 5s
retries: 3
You can also test this manually:
curl http://localhost:9999/health
# Response: {"version":"2.x.x","name":"GoTrue","description":"..."}
Realtime Service Health Check
The Realtime service has a tenant-specific health endpoint. This is where things get tricky for self-hosted deployments:
realtime:
healthcheck:
test: ["CMD", "curl", "-sSfL", "--head", "-o", "/dev/null",
"-H", "Authorization: Bearer ${ANON_KEY}",
"http://localhost:4000/api/tenants/realtime-dev/health"]
interval: 10s
timeout: 5s
retries: 3
A common issue: the Realtime service enters a restart loop due to an unbound RLIMIT_NOFILE variable. If your Realtime container keeps restarting, add these environment variables:
realtime:
environment:
RLIMIT_NOFILE: "10000"
SEED_SELF_HOST: "true"
This stabilizes the service and ensures proper initialization in self-hosted environments.
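Once the container settles, hit the tenant health endpoint from the host to confirm it responds (same endpoint and ANON_KEY as the health check above):

curl -sf -H "Authorization: Bearer ${ANON_KEY}" \
  http://localhost:4000/api/tenants/realtime-dev/health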
PostgREST Health Check
PostgREST doesn't expose a health endpoint on its main port, but you can check its root endpoint:
rest:
healthcheck:
test: ["CMD", "curl", "-sSf", "http://localhost:3000/"]
interval: 10s
timeout: 5s
retries: 3
A successful response returns the OpenAPI schema. Any non-200 response indicates a problem—usually a database connection issue.
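A quick manual check from the host confirms the status code without dumping the full schema:

# Prints only the HTTP status code; expect 200
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:3000/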
External Health Monitoring
Docker health checks handle container-level recovery, but you also need external monitoring to alert you when something fails. The health checks run inside your server; if the server itself fails, they can't notify you.
HTTP Endpoint Monitoring
Set up external uptime monitoring for these critical endpoints:
| Service | Endpoint | Expected Response |
|---|---|---|
| API Gateway | https://your-domain.com/rest/v1/ | 200 OK |
| Auth | https://your-domain.com/auth/v1/health | 200 with JSON |
| Storage | https://your-domain.com/storage/v1/ | 200 OK |
| Realtime | wss://your-domain.com/realtime/v1/websocket | WebSocket upgrade |
Services like Better Uptime, Uptime Robot, or self-hosted Uptime Kuma work well. Configure alerts for:
- Response time exceeding 500ms
- Any non-2xx status code
- SSL certificate expiration (7 days warning)
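If you'd rather not depend on a hosted service alone, a small probe script run from a separate machine covers the basics. A sketch using the endpoints from the table above and a placeholder webhook URL; depending on your Kong configuration, the REST and Storage routes may require the apikey header shown here:

#!/bin/bash
# External probe: alert on any non-2xx response or timeout
endpoints=(
  "https://your-domain.com/rest/v1/"
  "https://your-domain.com/auth/v1/health"
  "https://your-domain.com/storage/v1/"
)
for url in "${endpoints[@]}"; do
  code=$(curl -s -o /dev/null -w "%{http_code}" --max-time 10 -H "apikey: ${ANON_KEY}" "$url")
  if [ "$code" -lt 200 ] || [ "$code" -ge 300 ]; then
    curl -s -X POST "https://your-webhook-url" \
      -H "Content-Type: application/json" \
      -d "{\"text\":\"Probe failed for $url (HTTP $code)\"}"
  fi
done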
PostgreSQL-Specific Monitoring
The database needs deeper monitoring than HTTP checks. Track these metrics:
-- Active connections (should be < max_connections)
SELECT count(*) FROM pg_stat_activity;

-- Long-running queries (potential locks)
SELECT pid, now() - pg_stat_activity.query_start AS duration, query
FROM pg_stat_activity
WHERE state != 'idle'
  AND now() - pg_stat_activity.query_start > interval '5 minutes';

-- Dead tuples (indicates need for vacuum)
SELECT relname, n_dead_tup
FROM pg_stat_user_tables
WHERE n_dead_tup > 1000
ORDER BY n_dead_tup DESC;
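To run the connection-count check unattended, wrap it in a small script on the host. A sketch with an illustrative threshold and a placeholder webhook URL:

# Alert when active connections approach the limit (threshold is illustrative)
count=$(docker compose exec -T db psql -U postgres -tAc "SELECT count(*) FROM pg_stat_activity;")
if [ "$count" -gt 80 ]; then
  curl -s -X POST "https://your-webhook-url" \
    -H "Content-Type: application/json" \
    -d "{\"text\":\"High Postgres connection count: $count\"}"
fi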
For comprehensive database monitoring, see our guide on monitoring self-hosted Supabase.
Configuring Auto-Recovery
Docker's restart policies handle automatic recovery when containers fail. But the default configuration may not match your production needs.
Restart Policies Explained
Docker supports four restart policies:
| Policy | Behavior |
|---|---|
| no | Never restart |
| on-failure | Restart only when the container exits with a non-zero code |
| always | Always restart; a manually stopped container comes back up when the Docker daemon restarts |
| unless-stopped | Restart on failure, but stay stopped after a manual stop, even across daemon restarts |
For production, use unless-stopped for most services:
services:
db:
restart: unless-stopped
kong:
restart: unless-stopped
auth:
restart: unless-stopped
Combining Health Checks with Restart Policies
When a health check fails, Docker marks the container as unhealthy but doesn't automatically restart it. You need external orchestration or a workaround.
Option 1: Use Docker autoheal
Deploy docker-autoheal alongside your Supabase stack:
autoheal:
image: willfarrell/autoheal
restart: always
environment:
AUTOHEAL_CONTAINER_LABEL: all
volumes:
- /var/run/docker.sock:/var/run/docker.sock
This watches all containers and restarts any that become unhealthy.
Option 2: Systemd timer for health checks
Create a systemd service that checks health and restarts unhealthy containers:
# /etc/systemd/system/supabase-health.service
[Unit]
Description=Check Supabase health and restart unhealthy containers

[Service]
Type=oneshot
ExecStart=/usr/local/bin/supabase-health-check.sh
#!/bin/bash
# /usr/local/bin/supabase-health-check.sh
cd /path/to/supabase
unhealthy=$(docker compose ps --filter "health=unhealthy" -q)
if [ -n "$unhealthy" ]; then
echo "Unhealthy containers found, restarting..."
docker compose restart $unhealthy
# Send alert (webhook, email, etc.)
curl -X POST "https://your-webhook-url" \
-H "Content-Type: application/json" \
-d '{"text":"Supabase containers restarted due to health check failure"}'
fi
Schedule it with a timer:
# /etc/systemd/system/supabase-health.timer
[Unit]
Description=Run Supabase health check every minute

[Timer]
OnBootSec=1min
OnUnitActiveSec=1min

[Install]
WantedBy=timers.target
Enable with: systemctl enable --now supabase-health.timer
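After enabling it, confirm the timer is actually scheduled and check the last run:

systemctl list-timers supabase-health.timer
systemctl status supabase-health.service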
Handling Common Failure Scenarios
Scenario 1: Database Connection Pool Exhaustion
Symptoms: Auth and PostgREST return 500 errors. Logs show "too many connections."
Health check impact: Services may still appear healthy because they're running—they just can't connect to the database.
Solution: Monitor connection count and configure connection pooling. Add a custom health check that validates database connectivity:
auth:
healthcheck:
test: ["CMD-SHELL", "wget -q --spider http://localhost:9999/health && curl -sf http://localhost:9999/admin/users?page=1&per_page=1 -H 'Authorization: Bearer ${SERVICE_ROLE_KEY}' || exit 1"]
This fails if Auth can't actually query the database.
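It also helps to watch how much headroom remains before the pool is exhausted. A one-liner you can run from the host or feed into the alert script above:

# Remaining connection slots = max_connections minus current sessions
docker compose exec -T db psql -U postgres -tAc \
  "SELECT current_setting('max_connections')::int - count(*) FROM pg_stat_activity;"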
Scenario 2: Storage Running But Files Inaccessible
Symptoms: Storage API returns 200 but file uploads fail silently.
Root cause: Usually S3/MinIO credentials expired or storage backend unreachable.
Better health check:
storage:
healthcheck:
test: ["CMD-SHELL", "curl -sf http://localhost:5000/status && curl -sf http://localhost:5000/bucket || exit 1"]
interval: 30s
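For a true end-to-end signal, periodically upload and read back a tiny object rather than trusting status endpoints alone. A sketch that assumes a bucket named health-check already exists, SERVICE_ROLE_KEY is exported, and the Storage API is reachable directly on port 5000; the x-upsert header lets repeated runs overwrite the same object:

# Round-trip check against the Storage API (direct to the storage container)
echo "ok $(date +%s)" > /tmp/probe.txt
curl -sf -X POST "http://localhost:5000/object/health-check/probe.txt" \
  -H "Authorization: Bearer ${SERVICE_ROLE_KEY}" \
  -H "Content-Type: text/plain" \
  -H "x-upsert: true" \
  --data-binary @/tmp/probe.txt
curl -sf "http://localhost:5000/object/health-check/probe.txt" \
  -H "Authorization: Bearer ${SERVICE_ROLE_KEY}" -o /dev/null && echo "storage round trip OK"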
For production storage configuration, see our guide on S3 storage for self-hosted Supabase.
Scenario 3: Realtime Websocket Degradation
Symptoms: Websocket connections drop intermittently. Health check passes but clients experience disconnects.
Root cause: The Realtime service uses the BEAM VM (Erlang), which can handle degraded states gracefully. It may report healthy while shedding connections under load.
Solution: Add connection count monitoring:
# Check current websocket connections
curl -s "http://localhost:4000/api/tenants/realtime-dev/stats" \
-H "Authorization: Bearer ${ANON_KEY}"
Alert if connections drop suddenly or if the connection count exceeds your expected baseline.
Creating a Comprehensive Health Dashboard
For teams managing self-hosted Supabase, a single dashboard showing all service health saves debugging time. Here's a simple approach using a shell script and a status page:
#!/bin/bash
# /usr/local/bin/supabase-status.sh
echo "=== Supabase Health Status ==="
echo "Time: $(date)"
echo ""
# Check each service
services=("db" "kong" "auth" "rest" "realtime" "storage" "studio")
for service in "${services[@]}"; do
status=$(docker inspect --format='{{.State.Health.Status}}' "supabase-$service" 2>/dev/null || echo "not found")
case $status in
healthy) emoji="✓" ;;
unhealthy) emoji="✗" ;;
starting) emoji="⟳" ;;
*) emoji="?" ;;
esac
printf "%-12s %s %s\n" "$service" "$emoji" "$status"
done
echo ""
echo "=== Resource Usage ==="
docker stats --no-stream --format "table {{.Name}}\t{{.CPUPerc}}\t{{.MemUsage}}" | grep supabase
Run this as a cron job and pipe output to a monitoring system or serve it via a simple HTTP endpoint.
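For example, a crontab entry that refreshes a plain-text status file every minute for a status page or reverse proxy to serve (the output path is just an example):

# Example crontab entry: refresh a plain-text status file every minute
* * * * * /usr/local/bin/supabase-status.sh > /var/www/status/supabase.txt 2>&1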
Supascale: Automated Health Management
Configuring health checks, monitoring, and auto-recovery for eight services is operational overhead. If you're running multiple self-hosted Supabase projects, this complexity multiplies.
Supascale handles service health monitoring automatically. The platform monitors all Supabase services, alerts you before issues cause downtime, and provides one-click access to logs when debugging is needed. Combined with automated backups and custom domain management, you get production-grade operations without the DevOps burden.
Summary
Service health monitoring for self-hosted Supabase requires:
- Docker health checks for each service with appropriate intervals and retry counts
- External uptime monitoring to catch server-level failures
- Auto-recovery automation using autoheal, systemd timers, or orchestration tools
- Database-specific monitoring beyond simple HTTP checks
- Alerting before users notice problems
Start with the default health checks in Docker Compose, add external monitoring for critical endpoints, and implement auto-recovery for unattended operation. For teams who'd rather focus on building than operating, Supascale provides this infrastructure out of the box.
