OpenTelemetry for Self-Hosted Supabase: Distributed Tracing Guide

Set up OpenTelemetry tracing for self-hosted Supabase with Jaeger, Grafana Tempo, or commercial APM tools for full observability.

When something goes wrong in your self-hosted Supabase stack, how do you trace a request from your frontend, through Kong, PostgREST, and into PostgreSQL? If you're relying solely on container logs, you're debugging blind. OpenTelemetry (OTel) has become the standard for distributed tracing, and integrating it with your self-hosted Supabase deployment gives you the visibility you need to diagnose performance issues and failures across your entire stack.

This guide walks you through setting up OpenTelemetry tracing for self-hosted Supabase, connecting to backends like Jaeger or Grafana Tempo, and instrumenting your application code to achieve end-to-end request visibility.

Why OpenTelemetry Matters for Self-Hosted Supabase

Supabase Cloud offers built-in observability features through its dashboard: metrics, logs, and query performance tools. Self-hosted deployments don't have these luxuries out of the box. You're responsible for your own monitoring setup, and that's where OpenTelemetry fills the gap.

OpenTelemetry provides three pillars of observability:

  1. Traces: Follow a request's journey through your distributed system
  2. Metrics: Quantitative measurements (latency histograms, error rates, throughput)
  3. Logs: Contextual event records correlated with traces

The power comes from correlation. When a user reports "the app is slow," you can trace their specific request through Kong's API gateway, into PostgREST, down to the exact PostgreSQL query that took 3 seconds, and see which table scan caused it.

The Self-Hosted Observability Gap

According to Supabase's telemetry documentation, they're actively adding OpenTelemetry support across their products. However, features like the Metrics API aren't available in self-hosted instances. You need to build this yourself.

The good news: PostgreSQL, PostgREST, and the other components in your stack can all export telemetry data. You just need to connect the pieces.

Architecture Overview

Here's what we're building:

┌─────────────┐    ┌──────────────┐    ┌────────────────┐
│ Your App    │───▶│ Kong Gateway │───▶│ PostgREST/Auth │
│ (OTel SDK)  │    │  (tracing)   │    │   (tracing)    │
└─────────────┘    └──────────────┘    └────────────────┘
       │                  │                    │
       │                  │                    │
       ▼                  ▼                    ▼
┌───────────────────────────────────────────────────────┐
│              OpenTelemetry Collector                  │
│         (receives, processes, exports)                │
└───────────────────────────────────────────────────────┘
                         │
        ┌────────────────┼────────────────┐
        ▼                ▼                ▼
┌──────────────┐  ┌─────────────┐  ┌──────────────┐
│    Jaeger    │  │   Grafana   │  │  Datadog/    │
│   (traces)   │  │    Tempo    │  │  Honeycomb   │
└──────────────┘  └─────────────┘  └──────────────┘

The OpenTelemetry Collector acts as a central hub, receiving traces from all components and forwarding them to your backend of choice.

Setting Up the OpenTelemetry Collector

First, add the OTel Collector to your Docker Compose stack. Create a file called otel-collector-config.yaml:

receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318

processors:
  batch:
    timeout: 1s
    send_batch_size: 1024
  
  # Add resource attributes to identify the service
  resource:
    attributes:
      - key: deployment.environment
        value: production
        action: upsert

exporters:
  # For local development with Jaeger
  otlp/jaeger:
    endpoint: jaeger:4317
    tls:
      insecure: true
  
  # For Grafana Tempo
  otlp/tempo:
    endpoint: tempo:4317
    tls:
      insecure: true
  
  # Debug output (remove in production)
  debug:
    verbosity: detailed

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch, resource]
      exporters: [otlp/jaeger, debug]

Add to your docker-compose.yml:

services:
  otel-collector:
    image: otel/opentelemetry-collector-contrib:0.96.0
    command: ["--config=/etc/otel-collector-config.yaml"]
    volumes:
      - ./otel-collector-config.yaml:/etc/otel-collector-config.yaml
    ports:
      - "4317:4317"   # OTLP gRPC
      - "4318:4318"   # OTLP HTTP
    networks:
      - supabase_network

  jaeger:
    image: jaegertracing/all-in-one:1.54
    ports:
      - "16686:16686"  # Jaeger UI
      - "4317"         # OTLP gRPC (internal)
    environment:
      - COLLECTOR_OTLP_ENABLED=true
    networks:
      - supabase_network

Instrumenting Your Application

The most value comes from instrumenting your application code. Here's how to set up OpenTelemetry in a Node.js/TypeScript application using the Supabase client:

Install Dependencies

npm install @opentelemetry/api \
  @opentelemetry/sdk-node \
  @opentelemetry/auto-instrumentations-node \
  @opentelemetry/exporter-trace-otlp-grpc \
  @opentelemetry/resources \
  @opentelemetry/semantic-conventions

Create Instrumentation File

Create instrumentation.ts and load it before the rest of your application (for example by preloading it with node --require ./instrumentation.js once compiled to JavaScript), so the auto-instrumentations can patch modules before they're used:

import { NodeSDK } from '@opentelemetry/sdk-node'
import { getNodeAutoInstrumentations } from '@opentelemetry/auto-instrumentations-node'
import { OTLPTraceExporter } from '@opentelemetry/exporter-trace-otlp-grpc'
import { Resource } from '@opentelemetry/resources'
import { SemanticResourceAttributes } from '@opentelemetry/semantic-conventions'

const sdk = new NodeSDK({
  resource: new Resource({
    [SemanticResourceAttributes.SERVICE_NAME]: 'my-supabase-app',
    [SemanticResourceAttributes.SERVICE_VERSION]: '1.0.0',
    [SemanticResourceAttributes.DEPLOYMENT_ENVIRONMENT]: process.env.NODE_ENV || 'development',
  }),
  traceExporter: new OTLPTraceExporter({
    url: process.env.OTEL_EXPORTER_OTLP_ENDPOINT || 'http://localhost:4317',
  }),
  instrumentations: [
    getNodeAutoInstrumentations({
      '@opentelemetry/instrumentation-http': {
        enabled: true,
      },
      '@opentelemetry/instrumentation-pg': {
        enabled: true, // Traces PostgreSQL queries
      },
    }),
  ],
})

sdk.start()

process.on('SIGTERM', () => {
  sdk.shutdown()
    .then(() => console.log('Tracing terminated'))
    .catch((error) => console.log('Error terminating tracing', error))
    .finally(() => process.exit(0))
})

Wrap Supabase Operations with Custom Spans

For more granular tracing, wrap your Supabase operations:

import { trace, SpanStatusCode } from '@opentelemetry/api'
import { createClient } from '@supabase/supabase-js'

const tracer = trace.getTracer('supabase-client')

// Client setup (substitute your own URL/key configuration)
const supabase = createClient(process.env.SUPABASE_URL!, process.env.SUPABASE_ANON_KEY!)

export async function fetchUserProfile(userId: string) {
  return tracer.startActiveSpan('supabase.query.profiles', async (span) => {
    try {
      span.setAttribute('db.system', 'postgresql')
      span.setAttribute('db.operation', 'SELECT')
      span.setAttribute('db.table', 'profiles')
      span.setAttribute('user.id', userId)

      const { data, error } = await supabase
        .from('profiles')
        .select('*')
        .eq('id', userId)
        .single()

      if (error) {
        span.setStatus({ code: SpanStatusCode.ERROR, message: error.message })
        span.recordException(error)
        throw error
      }

      span.setAttribute('db.rows_affected', data ? 1 : 0)
      span.setStatus({ code: SpanStatusCode.OK })
      return data
    } finally {
      span.end()
    }
  })
}

Tracing PostgreSQL Queries

For deep visibility into database performance, enable pg_stat_statements (likely already enabled in your Supabase setup) and consider adding query logging:

-- Enable query logging for slow queries (> 1 second)
ALTER SYSTEM SET log_min_duration_statement = 1000;
SELECT pg_reload_conf();
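
Once a slow span points you at a table or endpoint, pg_stat_statements gives you the aggregate view of the same problem. A quick query, assuming the extension is enabled:

-- Top 10 statements by average execution time
SELECT query,
       calls,
       round(mean_exec_time::numeric, 2) AS mean_ms,
       round(total_exec_time::numeric, 2) AS total_ms
FROM pg_stat_statements
ORDER BY mean_exec_time DESC
LIMIT 10;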

If you're using the PostgreSQL OpenTelemetry instrumentation, queries will automatically appear as spans with:

  • Query text (sanitized to remove parameters)
  • Execution time
  • Rows returned
  • Connection details
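
If you rely on the auto-instrumentation, two of its pg options are worth tuning. The sketch below assumes the option names exposed by @opentelemetry/instrumentation-pg (enhancedDatabaseReporting and requireParentSpan), so verify them against the version you install:

import { NodeSDK } from '@opentelemetry/sdk-node'
import { getNodeAutoInstrumentations } from '@opentelemetry/auto-instrumentations-node'

// Same NodeSDK setup as before, with the pg instrumentation tuned
const sdk = new NodeSDK({
  instrumentations: [
    getNodeAutoInstrumentations({
      '@opentelemetry/instrumentation-pg': {
        enabled: true,
        // Attach query parameter values to spans: useful locally,
        // risky in production if queries contain sensitive data
        enhancedDatabaseReporting: process.env.NODE_ENV !== 'production',
        // Only create query spans when a parent span exists,
        // which cuts noise from background pool activity
        requireParentSpan: true,
      },
    }),
  ],
})

sdk.start()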

Kong Gateway Tracing

Kong, the API gateway in Supabase's stack, supports OpenTelemetry natively via its bundled opentelemetry plugin (available since Kong 3.0, so check which Kong image your docker-compose.yml pins). Add this to your Kong configuration:

# In your kong.yml or via environment variables
plugins:
  - name: opentelemetry
    config:
      endpoint: "http://otel-collector:4318/v1/traces"
      resource_attributes:
        service.name: "supabase-kong"
      headers:
        X-Custom-Header: "supabase-gateway"

Or via environment variables in docker-compose.yml:

kong:
  environment:
    KONG_TRACING_INSTRUMENTATIONS: all
    KONG_TRACING_SAMPLING_RATE: 1.0
    KONG_PLUGINS: bundled,opentelemetry

Correlating Traces Across Services

The magic of distributed tracing is correlation. When your application makes a request to Supabase, the trace context propagates through:

  1. Your app → Creates trace, adds span for "fetch user"
  2. Kong → Receives trace headers, adds gateway span
  3. PostgREST → Adds API processing span
  4. PostgreSQL → Query execution span

This requires proper context propagation. The traceparent header (format version, trace ID, parent span ID, and trace flags) carries the trace context between services:

traceparent: 00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01

The Supabase JavaScript client doesn't automatically propagate trace context, so you'll need to add it manually for full end-to-end visibility:

import { context, propagation } from '@opentelemetry/api'

// Inject trace context into Supabase requests
const headers: Record<string, string> = {}
propagation.inject(context.active(), headers)

const { data } = await supabase
  .from('profiles')
  .select('*')
  .setHeader('traceparent', headers.traceparent)
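
Setting the header per query works, but it's easy to forget. An alternative sketch, assuming supabase-js v2's global.fetch client option, wraps fetch once so every request the client makes carries the active trace context:

import { context, propagation } from '@opentelemetry/api'
import { createClient } from '@supabase/supabase-js'

// Inject the active trace context into every outgoing Supabase request
const tracedFetch: typeof fetch = (input, init = {}) => {
  const carrier: Record<string, string> = {}
  propagation.inject(context.active(), carrier)

  const headers = new Headers(init.headers)
  for (const [key, value] of Object.entries(carrier)) {
    headers.set(key, value)
  }

  return fetch(input, { ...init, headers })
}

// Substitute your own URL/key configuration
const supabase = createClient(process.env.SUPABASE_URL!, process.env.SUPABASE_ANON_KEY!, {
  global: { fetch: tracedFetch },
})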

Visualizing Traces in Jaeger

Once everything is connected, open Jaeger UI at http://localhost:16686. You'll see:

  • Service list: All instrumented services
  • Trace search: Filter by service, operation, duration, tags
  • Trace timeline: Visual representation of request flow
  • Span details: Individual operation metadata

Look for:

  • Slow spans: Database queries taking >100ms
  • Error spans: Failed operations with exception details
  • Gap analysis: Time spent between spans (network latency)

Production Considerations

Sampling Strategy

Tracing everything in production generates massive data volumes. Configure sampling:

# In otel-collector-config.yaml
processors:
  # Option 1: head sampling keeps a fixed percentage of all traces
  probabilistic_sampler:
    sampling_percentage: 10  # Sample 10% of traces

  # Option 2: tail sampling keeps errors and slow requests,
  # and samples everything else
  tail_sampling:
    decision_wait: 10s
    policies:
      - name: error-policy
        type: status_code
        status_code: {status_codes: [ERROR]}
      - name: slow-policy
        type: latency
        latency: {threshold_ms: 1000}
      - name: baseline-policy
        type: probabilistic
        probabilistic: {sampling_percentage: 10}

With the tail-sampling policies above, the collector keeps every error trace and every slow request while sampling only 10% of normal traffic; the probabilistic sampler is the simpler choice when you don't need that guarantee.
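
Whichever option you choose, the processor only takes effect once it's referenced in the traces pipeline, so extend the service section of the collector config accordingly. A sketch based on the pipeline shown earlier:

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [tail_sampling, batch, resource]
      exporters: [otlp/jaeger]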

Cost Management

If using commercial backends (Datadog, Honeycomb), trace volume directly impacts cost. Consider:

  • Aggressive sampling (1-5% for high-traffic apps)
  • Filtering out health checks and internal traffic (see the sketch after this list)
  • Shorter retention periods for normal traces
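
For the filtering point above, the collector's filter processor (included in the contrib image) can drop noisy spans before they reach a paid backend. A sketch that assumes your HTTP spans carry an http.route attribute; adjust the condition to whatever attribute your instrumentation actually emits:

processors:
  filter/drop-noise:
    error_mode: ignore
    traces:
      span:
        - 'attributes["http.route"] == "/health"'
        - 'attributes["http.route"] == "/metrics"'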

Security

Traces may contain sensitive data. Configure the collector to redact:

processors:
  attributes:
    actions:
      - key: db.statement
        action: hash  # Hash SQL queries
      - key: http.url
        action: delete  # Remove URLs with tokens

Alternative: Grafana Tempo with Loki

For a fully open-source stack that integrates with your existing log management, consider Grafana Tempo + Loki:

tempo:
  image: grafana/tempo:latest
  command: ["-config.file=/etc/tempo.yaml"]
  volumes:
    - ./tempo.yaml:/etc/tempo.yaml
  ports:
    - "3200:3200"   # Tempo API
    - "4317"        # OTLP gRPC

grafana:
  image: grafana/grafana:latest
  environment:
    - GF_AUTH_ANONYMOUS_ENABLED=true
  volumes:
    - ./grafana-datasources.yaml:/etc/grafana/provisioning/datasources/datasources.yaml
  ports:
    - "3000:3000"

This gives you trace-to-log correlation, letting you jump from a slow span directly to the relevant log entries.
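
The grafana-datasources.yaml mounted above is where that correlation is wired up. A minimal sketch, assuming you also run Loki (not shown in the compose snippet) and give its datasource the uid loki:

apiVersion: 1
datasources:
  - name: Tempo
    type: tempo
    access: proxy
    url: http://tempo:3200
    jsonData:
      tracesToLogsV2:
        datasourceUid: loki   # jump from a span to matching Loki log lines
        filterByTraceID: true
  - name: Loki
    type: loki
    uid: loki
    access: proxy
    url: http://loki:3100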

How Supascale Helps

Managing observability infrastructure adds complexity to an already complex self-hosted setup. Supascale simplifies your operational burden by handling Supabase lifecycle management—deployment, backups, upgrades—so you can focus on building observability for your specific needs.

While you configure OpenTelemetry and tracing backends for your custom requirements, Supascale ensures your Supabase infrastructure stays healthy with automated backups to S3-compatible storage, one-click restores, and simplified environment variable management. For teams serious about production self-hosting, this separation of concerns matters.

Summary

OpenTelemetry brings modern observability to self-hosted Supabase:

  1. Deploy the OTel Collector as your central telemetry hub
  2. Instrument your application with the OpenTelemetry SDK
  3. Configure Kong for gateway-level tracing
  4. Propagate context through the Supabase client
  5. Choose your backend: Jaeger for simplicity, Tempo for Grafana integration, or commercial APM for enterprise features

The investment in tracing infrastructure pays off the first time you debug a production issue in minutes instead of hours. Start with application-level instrumentation, then expand to cover the full request path.


Further Reading