Error Handling and Retry Patterns for Self-Hosted Supabase

Build resilient apps with retry logic, circuit breakers, and graceful degradation for self-hosted Supabase production deployments.

When you're running self-hosted Supabase, you take on responsibility for handling the edge cases that managed services typically abstract away. Network blips, database connection timeouts, and transient failures become your problem to solve. The good news: implementing proper error handling and retry patterns isn't complicated—and it makes the difference between an app that frustrates users and one that gracefully recovers from inevitable hiccups.

This guide covers practical resilience patterns for self-hosted Supabase deployments, from built-in retry mechanisms to custom circuit breakers.

Why Self-Hosted Deployments Need Extra Resilience

On Supabase Cloud, infrastructure engineers handle connection pooling, load balancing, and automatic failover. When you self-host, your single VPS or Docker Compose stack doesn't have that luxury.

Common failure scenarios include:

  • Network timeouts between your app and Postgres
  • Connection pool exhaustion during traffic spikes
  • Temporary service unavailability during container restarts or updates
  • Database locks from concurrent operations
  • Edge Function cold starts causing initial request failures

Without proper handling, these translate to user-facing errors, lost transactions, and frustrated customers.

Built-In Retry Logic in supabase-js

Starting with supabase-js v2.102.0, PostgREST queries include automatic retries for transient errors. This is enabled by default and uses exponential backoff with jitter.

What Gets Retried Automatically

The client retries on these HTTP status codes:

  • 408 - Request Timeout
  • 409 - Conflict
  • 503 - Service Unavailable
  • 504 - Gateway Timeout

Network-level failures also trigger retries. The client attempts up to 3 retries with exponential backoff (1s → 2s → 4s, capped at 30s).
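
To build intuition for that schedule, here's a rough sketch of exponential backoff with full jitter. It's an illustration only, not the library's actual implementation, and the exact formula supabase-js uses may differ:

// Illustration: exponential backoff with full jitter, capped at 30 seconds
function backoffDelay(attempt: number, baseMs = 1000, capMs = 30000): number {
  const exponential = Math.min(capMs, baseMs * 2 ** attempt); // nominal 1s, 2s, 4s, ...
  return Math.random() * exponential; // jitter spreads retries out to avoid thundering herds
}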

Checking Retry Behavior

Each retry includes an X-Retry-Count header, so you can observe retry patterns in your server logs:

const { data, error } = await supabase
  .from('orders')
  .select('*')
  .eq('user_id', userId);

if (error) {
  // After 3 retries, this error is likely a persistent issue
  console.error('Query failed after retries:', error.message);
}

Configuring Custom Retry Behavior

For more control, wrap the native fetch with a retry library:

import { createClient } from '@supabase/supabase-js';
import fetchRetry from 'fetch-retry';

const fetchWithRetry = fetchRetry(fetch, {
  retries: 5,
  retryDelay: (attempt) => Math.pow(2, attempt) * 1000, // Exponential backoff
  retryOn: [503, 504, 520, 408],
});

const supabase = createClient(
  process.env.SUPABASE_URL!,
  process.env.SUPABASE_ANON_KEY!,
  {
    global: {
      fetch: fetchWithRetry,
    },
  }
);

This gives you fine-grained control over which errors trigger retries and how long to wait between attempts.

Implementing Circuit Breakers

Retries are great for transient failures, but what happens when your database is truly down? Hammering a failing service with retries makes things worse. Circuit breakers solve this by "tripping" after repeated failures and giving the service time to recover.

Basic Circuit Breaker Pattern

class CircuitBreaker {
  private failures = 0;
  private lastFailure: number | null = null;
  private state: 'closed' | 'open' | 'half-open' = 'closed';
  
  constructor(
    private threshold: number = 5,
    private resetTimeout: number = 30000
  ) {}

  async execute<T>(fn: () => Promise<T>): Promise<T> {
    if (this.state === 'open') {
      if (Date.now() - (this.lastFailure || 0) > this.resetTimeout) {
        this.state = 'half-open';
      } else {
        throw new Error('Circuit breaker is open');
      }
    }

    try {
      const result = await fn();
      this.onSuccess();
      return result;
    } catch (error) {
      this.onFailure();
      throw error;
    }
  }

  private onSuccess() {
    this.failures = 0;
    this.state = 'closed';
  }

  private onFailure() {
    this.failures++;
    this.lastFailure = Date.now();
    if (this.failures >= this.threshold) {
      this.state = 'open';
    }
  }
}

Using the Circuit Breaker with Supabase

const dbCircuitBreaker = new CircuitBreaker(5, 30000);

async function getUser(userId: string) {
  return dbCircuitBreaker.execute(async () => {
    const { data, error } = await supabase
      .from('users')
      .select('*')
      .eq('id', userId)
      .single();
    
    if (error) throw error;
    return data;
  });
}

// In your API handler
try {
  const user = await getUser(userId);
  return Response.json(user);
} catch (error) {
  if (error.message === 'Circuit breaker is open') {
    // Return cached data or graceful degradation
    return Response.json({ error: 'Service temporarily unavailable' }, { status: 503 });
  }
  throw error;
}

For production use, consider a library like opossum, which provides a battle-tested circuit breaker implementation.
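
If you'd rather not maintain your own implementation, a rough equivalent with opossum looks like the sketch below. The option values are illustrative, not recommendations:

import CircuitBreaker from 'opossum';

const userBreaker = new CircuitBreaker(
  async (userId: string) => {
    const { data, error } = await supabase
      .from('users')
      .select('*')
      .eq('id', userId)
      .single();
    if (error) throw error;
    return data;
  },
  {
    timeout: 5000,                // treat calls slower than 5s as failures
    errorThresholdPercentage: 50, // open the circuit when half of recent calls fail
    resetTimeout: 30000,          // attempt a half-open probe after 30s
  }
);

// Serve a degraded response instead of an exception while the circuit is open
userBreaker.fallback(() => ({ error: 'Service temporarily unavailable' }));
userBreaker.on('open', () => console.warn('users circuit opened'));

const user = await userBreaker.fire(userId);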

Idempotency for Safe Retries

Retrying a failed request is only safe if the operation is idempotent—meaning running it twice produces the same result as running it once. GET requests are naturally idempotent, but writes require careful handling.

Idempotency Keys for Write Operations

import { randomUUID } from 'crypto';

async function createOrder(orderData: OrderInput, idempotencyKey?: string) {
  const key = idempotencyKey || randomUUID();
  
  // Check if we've already processed this request
  const { data: existing } = await supabase
    .from('processed_requests')
    .select('result')
    .eq('idempotency_key', key)
    .single();

  if (existing) {
    return existing.result;
  }

  // Process the order
  const { data: order, error } = await supabase
    .from('orders')
    .insert(orderData)
    .select()
    .single();

  if (error) throw error;

  // Record the successful request
  await supabase.from('processed_requests').insert({
    idempotency_key: key,
    result: order,
    created_at: new Date().toISOString(),
  });

  return order;
}

Database-Level Idempotency

For critical operations, enforce idempotency at the database level with unique constraints:

-- Add an idempotency_key column
ALTER TABLE orders ADD COLUMN idempotency_key TEXT UNIQUE;

-- Create orders with client-generated keys
INSERT INTO orders (user_id, total, idempotency_key)
VALUES ($1, $2, $3)
ON CONFLICT (idempotency_key) DO NOTHING
RETURNING *;

This prevents duplicate orders even if your application retries the same request multiple times. Note that with ON CONFLICT ... DO NOTHING, the RETURNING clause yields no row when the key already exists, so treat an empty result as "this order was already created" and fetch the existing row if you need to return it.
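
If you go through supabase-js rather than raw SQL, a sketch of the same idea uses an upsert that ignores duplicates. The function name and shape here are hypothetical; it assumes the unique idempotency_key column created above and a client-generated key:

async function createOrderIdempotent(orderData: OrderInput, key: string) {
  const { data: order, error } = await supabase
    .from('orders')
    .upsert(
      { ...orderData, idempotency_key: key },
      { onConflict: 'idempotency_key', ignoreDuplicates: true }
    )
    .select()
    .maybeSingle();

  if (error) throw error;
  if (order) return order;

  // null means the key already existed: this call was a retry, so return the original row
  const { data: existing, error: fetchError } = await supabase
    .from('orders')
    .select('*')
    .eq('idempotency_key', key)
    .single();

  if (fetchError) throw fetchError;
  return existing;
}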

Graceful Degradation Strategies

When failures happen, don't just show users an error page. Implement graceful degradation to provide a reduced but functional experience.

Caching for Read Resilience

Cache frequently accessed data so you can serve stale data when the database is unavailable:

import { Redis } from '@upstash/redis';

const redis = new Redis({
  url: process.env.UPSTASH_REDIS_URL!,
  token: process.env.UPSTASH_REDIS_TOKEN!,
});

async function getProductWithFallback(productId: string) {
  const cacheKey = `product:${productId}`;
  
  try {
    // Try to fetch fresh data
    const { data, error } = await supabase
      .from('products')
      .select('*')
      .eq('id', productId)
      .single();
    
    if (error) throw error;
    
    // Update cache with fresh data
    await redis.set(cacheKey, JSON.stringify(data), { ex: 3600 });
    return { data, stale: false };
  } catch (error) {
    // Fall back to cached data (the Upstash client may return it already deserialized)
    const cached = await redis.get(cacheKey);
    if (cached) {
      const data = typeof cached === 'string' ? JSON.parse(cached) : cached;
      return { data, stale: true };
    }
    throw error;
  }
}

Write Buffering for Eventual Consistency

For non-critical writes, buffer them locally and sync when the database is available:

async function trackEvent(event: AnalyticsEvent) {
  try {
    await supabase.from('analytics_events').insert(event);
  } catch (error) {
    // Store locally for later sync
    const pending = JSON.parse(localStorage.getItem('pendingEvents') || '[]');
    pending.push({ ...event, failed_at: Date.now() });
    localStorage.setItem('pendingEvents', JSON.stringify(pending));
    
    // Schedule retry
    scheduleEventSync();
  }
}

function scheduleEventSync() {
  setTimeout(async () => {
    const pending = JSON.parse(localStorage.getItem('pendingEvents') || '[]');
    if (pending.length === 0) return;
    
    try {
      await supabase.from('analytics_events').insert(pending);
      localStorage.removeItem('pendingEvents');
    } catch {
      scheduleEventSync(); // Try again later
    }
  }, 60000); // Retry in 1 minute
}
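
In the browser you can also flush the buffer as soon as connectivity returns instead of waiting for the next timer tick:

// Optional: sync immediately when the browser reports it is back online
window.addEventListener('online', () => scheduleEventSync());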

Edge Function Error Handling

Edge Functions require special attention since they run in isolated environments with cold starts and timeouts.

Structured Error Responses

class AppError extends Error {
  constructor(
    message: string,
    public code: string,
    public statusCode: number = 500
  ) {
    super(message);
  }
}

function errorResponse(error: AppError | Error) {
  const statusCode = error instanceof AppError ? error.statusCode : 500;
  const code = error instanceof AppError ? error.code : 'INTERNAL_ERROR';
  
  return new Response(
    JSON.stringify({
      error: {
        message: error.message,
        code,
      },
    }),
    {
      status: statusCode,
      headers: { 'Content-Type': 'application/json' },
    }
  );
}

Deno.serve(async (req) => {
  try {
    // Your function logic
    const result = await processRequest(req);
    return Response.json(result);
  } catch (error) {
    console.error('Function error:', JSON.stringify({
      message: error.message,
      stack: error.stack,
      timestamp: new Date().toISOString(),
    }));
    
    return errorResponse(error);
  }
});
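
The processRequest function above stands in for your actual logic; the point is to throw AppError with a specific code so the handler can turn it into a structured response. A hypothetical example, assuming a supabase client is available inside the function:

async function processRequest(req: Request) {
  const { orderId } = await req.json();

  if (!orderId) {
    throw new AppError('orderId is required', 'VALIDATION_ERROR', 400);
  }

  const { data, error } = await supabase
    .from('orders')
    .select('*')
    .eq('id', orderId)
    .single();

  if (error) {
    throw new AppError('Order not found', 'NOT_FOUND', 404);
  }
  return data;
}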

Timeout Handling

Edge Functions have execution limits. Handle timeouts gracefully:

async function withTimeout<T>(
  promise: Promise<T>,
  ms: number
): Promise<T> {
  const timeout = new Promise<never>((_, reject) => {
    setTimeout(() => reject(new AppError('Operation timed out', 'TIMEOUT', 504)), ms);
  });
  
  return Promise.race([promise, timeout]);
}

// Use in your function
const data = await withTimeout(
  supabase.from('large_table').select('*'),
  25000 // 25 second timeout
);

Monitoring Failures in Self-Hosted Environments

You can't fix what you can't see. Set up proper monitoring to catch issues before they become outages.

Logging Retry Attempts

const supabase = createClient(url, key, {
  global: {
    fetch: async (url, options) => {
      const start = Date.now();
      const response = await fetch(url, options);
      const duration = Date.now() - start;
      
      // Log slow or failed requests
      if (duration > 1000 || !response.ok) {
        console.log(JSON.stringify({
          type: 'supabase_request',
          url: url.toString(),
          status: response.status,
          duration,
          retry_count: response.headers.get('X-Retry-Count') || '0',
        }));
      }
      
      return response;
    },
  },
});

Health Check Endpoints

Create a health check that tests your Supabase connection:

// /api/health
export async function GET() {
  const checks = {
    database: false,
    auth: false,
    storage: false,
  };

  try {
    // Test database
    const { error: dbError } = await supabase.from('health_check').select('*').limit(1);
    checks.database = !dbError;

    // Test auth
    const { error: authError } = await supabase.auth.getSession();
    checks.auth = !authError;

    // Test storage
    const { error: storageError } = await supabase.storage.listBuckets();
    checks.storage = !storageError;

    const healthy = Object.values(checks).every(Boolean);
    return Response.json({ status: healthy ? 'healthy' : 'degraded', checks }, {
      status: healthy ? 200 : 503,
    });
  } catch (error) {
    return Response.json({ status: 'unhealthy', error: error.message }, { status: 503 });
  }
}

Connection Pool Management

Connection exhaustion is a common failure mode. Configure your connection pooling properly and handle pool exhaustion gracefully:

// supabase-js talks to PostgREST, which manages its own Postgres connection pool,
// so the API URL passed to createClient never needs a pooler port. The transaction
// mode pooler port (6543 instead of 5432) applies to clients that connect to
// Postgres directly, such as a SQL client running in a serverless function.
const supabase = createClient(
  process.env.SUPABASE_URL!,
  process.env.SUPABASE_ANON_KEY!,
  {
    db: {
      schema: 'public',
    },
  }
);

// Add connection timeout handling
async function queryWithPoolCheck<T>(
  query: () => Promise<{ data: T; error: any }>
): Promise<T> {
  try {
    const { data, error } = await query();
    if (error?.code === 'XX000' || error?.message?.includes('connection')) {
      throw new AppError('Database connection unavailable', 'POOL_EXHAUSTED', 503);
    }
    if (error) throw error;
    return data;
  } catch (error) {
    if (error.message?.includes('timeout') || error.message?.includes('ECONNREFUSED')) {
      throw new AppError('Database temporarily unavailable', 'CONNECTION_TIMEOUT', 503);
    }
    throw error;
  }
}
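
Usage is then a matter of wrapping individual calls. The table and filter here are just examples:

const orders = await queryWithPoolCheck(async () =>
  await supabase.from('orders').select('*').eq('user_id', userId)
);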

Testing Your Resilience Patterns

Don't wait for production failures to test your error handling. Use chaos engineering principles:

// Test helper that randomly fails
function chaosWrapper<Args extends unknown[], T>(
  fn: (...args: Args) => Promise<T>,
  failureRate = 0.1
): (...args: Args) => Promise<T> {
  return async (...args: Args) => {
    if (process.env.CHAOS_TESTING && Math.random() < failureRate) {
      throw new Error('Chaos monkey strikes!');
    }
    return fn(...args); // forward the original arguments so the wrapped call still works
  };
}

// In tests or staging
const getUser = chaosWrapper(async (id: string) => {
  const { data, error } = await supabase
    .from('users')
    .select()
    .eq('id', id)
    .single();
  if (error) throw error;
  return data;
});
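
In a staging script you can then hammer the wrapped call and check that the layers above it (retries, the circuit breaker, cached fallbacks) behave as expected. A rough sketch, with a placeholder user ID:

process.env.CHAOS_TESTING = '1';

let failed = 0;
for (let i = 0; i < 100; i++) {
  try {
    await dbCircuitBreaker.execute(() => getUser('00000000-0000-0000-0000-000000000000'));
  } catch {
    failed++; // injected failures land here; the breaker should only open during sustained bursts
  }
}
console.log(`chaos run complete: ${failed} of 100 calls failed`);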

Key Takeaways

Building resilient self-hosted Supabase applications requires thinking about failure modes upfront:

  1. Leverage built-in retries in supabase-js v2.102.0+ for transient failures
  2. Implement circuit breakers to prevent cascading failures
  3. Use idempotency keys to make write operations safe to retry
  4. Cache aggressively for graceful degradation when the database is unavailable
  5. Monitor failures to catch issues before they escalate
  6. Test your resilience patterns before production exposes their weaknesses

With Supascale, you get automated backups and one-click restore, so even catastrophic failures don't mean data loss. But proper error handling in your application code prevents most failures from reaching that point.

