Running a self-hosted Supabase instance gives you full control over your data—but what happens when that data needs to flow elsewhere? Whether you're feeding a data warehouse, syncing with a legacy system, or powering real-time analytics dashboards, PostgreSQL's logical replication is your escape hatch from data silos.
This guide walks through setting up logical replication on self-hosted Supabase to stream changes to external PostgreSQL databases, data warehouses, and analytics platforms. No managed service lock-in required.
Why Logical Replication Matters for Self-Hosted Users
Physical replication copies the entire database at the binary level—great for read replicas, but inflexible. Logical replication works differently: it streams row-level changes (INSERT, UPDATE, DELETE) using the Write-Ahead Log (WAL), letting you:
- Sync specific tables to external systems without copying everything
- Transform data in flight before it reaches the destination
- Replicate across PostgreSQL versions (even major version differences)
- Feed non-PostgreSQL destinations using CDC tools like Debezium
For self-hosted Supabase users, this opens up powerful integrations that Supabase Cloud's managed replication (currently in private alpha) doesn't yet fully support.
Real Use Cases from the Community
The Supabase community on GitHub Discussions has documented several practical applications:
- Analytics pipelines: Streaming operational data to BigQuery or ClickHouse for reporting
- Multi-database architectures: Keeping a legacy MySQL system in sync during migration
- Compliance and auditing: Replicating to an immutable audit database
- Cross-region distribution: Syncing data to PostgreSQL instances closer to users
Prerequisites
Before diving into configuration, ensure your self-hosted Supabase setup meets these requirements:
PostgreSQL Configuration
Logical replication requires specific PostgreSQL settings. If you deployed Supabase using our installation guide, you'll need to verify these settings in your postgresql.conf:
-- Check current settings
SHOW wal_level;              -- Must be 'logical'
SHOW max_replication_slots;  -- At least 4 (1 per subscription + buffer)
SHOW max_wal_senders;        -- At least 4
If wal_level isn't set to logical, you'll need to modify your Docker configuration. In your docker-compose.yml, add these command arguments to the db service:
db:
image: supabase/postgres:15.6.1.145
command: >
postgres
-c wal_level=logical
-c max_replication_slots=10
-c max_wal_senders=10
Important: Changing wal_level requires a PostgreSQL restart. Plan for brief downtime.
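Once the database is back up, it helps to verify the settings programmatically rather than eyeballing `SHOW` output. Here's a minimal sketch of such a check; it assumes you've already fetched the setting values with any PostgreSQL client, and the names and thresholds simply mirror the requirements above:

```python
# Minimal sketch: validate replication-related settings (as returned by
# SHOW queries) against the requirements described above.
REQUIRED = {
    "wal_level": "logical",      # exact match required
    "max_replication_slots": 4,  # minimum
    "max_wal_senders": 4,        # minimum
}

def check_settings(settings: dict) -> list:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if settings.get("wal_level") != REQUIRED["wal_level"]:
        problems.append(
            f"wal_level is {settings.get('wal_level')!r}, must be 'logical'"
        )
    for name in ("max_replication_slots", "max_wal_senders"):
        value = int(settings.get(name, 0))
        if value < REQUIRED[name]:
            problems.append(f"{name} is {value}, need at least {REQUIRED[name]}")
    return problems
```

Running this after every restart (or in CI against a staging instance) catches the classic failure mode where a compose-file change didn't actually take effect.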
Network Requirements
Your destination database needs to reach the source Supabase instance. For self-hosted setups, this typically means:
- Opening port 5432 (or your custom PostgreSQL port) in your firewall
- Ensuring the destination supports IPv6, or configuring IPv4 access
- Using direct connections—connection poolers (PgBouncer/Supavisor) don't support replication protocols
Setting Up Logical Replication
The setup involves three components working together:
- Publication: Defines which tables to replicate on the source
- Replication Slot: Tracks WAL position for the subscriber
- Subscription: Created on the destination to pull changes
Step 1: Create a Publication on Supabase
Connect to your Supabase PostgreSQL instance and create a publication for the tables you want to replicate:
-- Replicate specific tables
CREATE PUBLICATION my_analytics_pub
  FOR TABLE public.orders, public.users, public.products;

-- Or replicate all tables in a schema (PostgreSQL 15+)
CREATE PUBLICATION all_public FOR TABLES IN SCHEMA public;
Best practice: Start with specific tables rather than FOR ALL TABLES. This reduces WAL traffic and makes it easier to manage schema changes.
Step 2: Create a Replication User
Don't use your superuser credentials for replication. Create a dedicated user with minimal required permissions:
-- Create replication user
CREATE USER replication_user WITH REPLICATION LOGIN PASSWORD 'secure_password_here';

-- Grant access to the published tables
GRANT USAGE ON SCHEMA public TO replication_user;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO replication_user;

-- For future tables
ALTER DEFAULT PRIVILEGES IN SCHEMA public
  GRANT SELECT ON TABLES TO replication_user;
Step 3: Configure pg_hba.conf for Replication Access
Add a line to pg_hba.conf allowing replication connections from your destination:
# Allow replication from a specific IP
hostssl replication replication_user 203.0.113.50/32 scram-sha-256

# Or from a subnet (less secure)
hostssl replication replication_user 10.0.0.0/8 scram-sha-256
Reload PostgreSQL to apply:
docker exec supabase-db psql -U postgres -c "SELECT pg_reload_conf();"
Step 4: Create Subscription on Destination
On your destination PostgreSQL database, create the subscription:
CREATE SUBSCRIPTION my_analytics_sub
  CONNECTION 'host=your-supabase-host.com port=5432 dbname=postgres user=replication_user password=secure_password_here sslmode=require'
  PUBLICATION my_analytics_pub;
PostgreSQL will:
- Create a replication slot on the source automatically
- Copy existing table data (initial sync)
- Begin streaming new changes
Step 5: Verify Replication Status
Check that replication is working on the source:
-- View active replication slots
SELECT slot_name, active, restart_lsn, confirmed_flush_lsn
FROM pg_replication_slots;

-- Monitor replication lag
SELECT client_addr, state,
       sent_lsn, write_lsn, flush_lsn, replay_lsn,
       pg_wal_lsn_diff(sent_lsn, replay_lsn) AS lag_bytes
FROM pg_stat_replication;
On the destination, verify data is flowing:
-- Check subscription status
SELECT subname, subenabled, subslotname FROM pg_subscription;

-- View subscription stats
SELECT * FROM pg_stat_subscription;
Handling Common Challenges
Schema Changes (DDL)
Here's the hard truth: PostgreSQL logical replication doesn't replicate schema changes. If you add a column on your Supabase source, the destination won't automatically get it.
Workarounds:
- Manual synchronization: Apply DDL changes to both databases within your deployment pipeline
- Use tools like pgroll: Automates schema migrations across replicated databases
- Schema versioning tables: Track schema versions and alert when they diverge
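In the spirit of the last workaround, drift between the two schemas can be detected mechanically by comparing per-table column sets (for example, fetched from information_schema.columns on each side). This is a hypothetical sketch; the table and column names are illustrative:

```python
# Hypothetical sketch: detect schema drift between source and destination
# by comparing column sets per table. Input dicts map table name to a
# list of column names, as you might fetch from information_schema.columns.
def schema_drift(source_cols: dict, dest_cols: dict) -> dict:
    """Map each table to columns present on the source but missing on the
    destination -- the case that breaks logical replication apply."""
    drift = {}
    for table, cols in source_cols.items():
        missing = set(cols) - set(dest_cols.get(table, []))
        if missing:
            drift[table] = sorted(missing)
    return drift
```

Run it on a schedule and alert on any non-empty result before the subscription's apply worker starts erroring out.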
Primary Key Requirements
Every replicated table needs a replica identity, which by default means a primary key. Without one, PostgreSQL can't identify which row to update or delete, and UPDATE and DELETE on the published table will fail (INSERTs still replicate):
-- Check for tables without primary keys
SELECT t.table_schema, t.table_name
FROM information_schema.tables t
LEFT JOIN information_schema.table_constraints tc
  ON t.table_schema = tc.table_schema
 AND t.table_name = tc.table_name
 AND tc.constraint_type = 'PRIMARY KEY'
WHERE t.table_type = 'BASE TABLE'
  AND t.table_schema = 'public'
  AND tc.constraint_name IS NULL;
WAL Retention and Disk Space
Replication slots prevent WAL segments from being deleted until the subscriber confirms receipt. If your subscriber goes offline, WAL files accumulate:
-- Check WAL retention per slot
SELECT slot_name,
pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) AS retained_wal
FROM pg_replication_slots;
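If you'd rather compute lag client-side (say, in an external monitoring job that only scrapes raw LSN strings), the arithmetic behind pg_wal_lsn_diff is simple to reproduce. An LSN like 16/B374D848 is a 64-bit WAL position: the hex part before the slash is the high 32 bits, the part after is the low 32 bits. A small sketch:

```python
# Sketch: compute WAL lag in bytes from two LSN strings client-side,
# mirroring what pg_wal_lsn_diff() does on the server.
def lsn_to_int(lsn: str) -> int:
    """Convert an LSN like '16/B374D848' to a 64-bit byte position."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)

def lag_bytes(current_lsn: str, restart_lsn: str) -> int:
    """Bytes of WAL between the server's current position and a slot's
    restart_lsn -- i.e. how much WAL the slot is still retaining."""
    return lsn_to_int(current_lsn) - lsn_to_int(restart_lsn)
```

For example, lag_bytes("0/2000000", "0/1000000") is 16 MiB of retained WAL.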
Configure a maximum WAL retention to prevent disk exhaustion:
# In postgresql.conf or as a command argument
max_slot_wal_keep_size = '10GB'
When this limit is reached, the slot becomes invalid and you'll need to recreate the subscription.
Replicating to Non-PostgreSQL Destinations
For data warehouses like BigQuery, Snowflake, or ClickHouse, you'll need a CDC (Change Data Capture) tool to read the WAL and translate it:
Option 1: Debezium
Debezium is the industry standard for CDC. It connects to your PostgreSQL WAL and outputs changes to Kafka, which can then feed any destination:
# docker-compose addition for Debezium
debezium:
image: debezium/connect:2.5
environment:
BOOTSTRAP_SERVERS: kafka:9092
GROUP_ID: supabase-cdc
CONFIG_STORAGE_TOPIC: debezium_configs
OFFSET_STORAGE_TOPIC: debezium_offsets
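Once the Connect worker is up, you register a source connector by POSTing a JSON config to its REST API (port 8083 by default). The sketch below builds such a payload; the property names follow Debezium 2.x, while the hostnames, slot name, and password are assumptions carried over from this guide's examples:

```python
# Hypothetical sketch: build a Debezium 2.x PostgreSQL source connector
# payload. Hostname 'db', slot name, and password are assumptions from
# this guide -- adjust to your deployment.
import json

def connector_payload(name: str, tables: list) -> dict:
    return {
        "name": name,
        "config": {
            "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
            "plugin.name": "pgoutput",          # built-in logical decoding plugin
            "database.hostname": "db",
            "database.port": "5432",
            "database.user": "replication_user",
            "database.password": "secure_password_here",
            "database.dbname": "postgres",
            "topic.prefix": "supabase",         # prefix for Kafka topic names
            "publication.name": "my_analytics_pub",
            "slot.name": "debezium_slot",
            "table.include.list": ",".join(tables),
        },
    }

# POST the JSON to the Connect REST API, e.g. with curl:
#   curl -X POST http://localhost:8083/connectors \
#        -H 'Content-Type: application/json' -d @payload.json
payload = json.dumps(connector_payload("supabase-orders", ["public.orders"]))
```

Using plugin.name = pgoutput means Debezium relies on the same built-in logical decoding plugin as native subscriptions, so no extra extension needs to be installed in the Supabase database.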
This is covered in more detail in our Change Data Capture guide.
Option 2: Supabase's Open-Source ETL
Supabase's managed replication is built on an open-source ETL tool. For self-hosted users, you can run it yourself:
git clone https://github.com/supabase/etl
cd etl
# Configure destination in config.yaml
docker-compose up -d
This is particularly useful if you're eventually planning to use Supabase Warehouse (currently in development).
Monitoring Your Replication Pipeline
Once replication is running, you need visibility into its health. Here's a minimal monitoring query you can run periodically or integrate with your monitoring stack:
-- Replication health check
SELECT
s.slot_name,
s.active,
s.confirmed_flush_lsn,
pg_current_wal_lsn() AS current_lsn,
pg_wal_lsn_diff(pg_current_wal_lsn(), s.confirmed_flush_lsn) AS lag_bytes,
CASE
WHEN pg_wal_lsn_diff(pg_current_wal_lsn(), s.confirmed_flush_lsn) > 1073741824
THEN 'CRITICAL: >1GB lag'
WHEN pg_wal_lsn_diff(pg_current_wal_lsn(), s.confirmed_flush_lsn) > 104857600
THEN 'WARNING: >100MB lag'
ELSE 'OK'
END AS status
FROM pg_replication_slots s
WHERE s.slot_type = 'logical';
Set up alerts when:
- active becomes false (subscriber disconnected)
- lag_bytes exceeds your threshold (data falling behind)
- The replication slot doesn't exist (it was dropped)
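The same thresholds used in the SQL health check above can live in a small pure function that an alerting script calls on each scrape. A sketch:

```python
# Sketch: classify replication slot health using the same thresholds as
# the SQL health check (1 GiB critical, 100 MiB warning).
CRITICAL_BYTES = 1 << 30    # 1073741824 bytes = 1 GiB
WARNING_BYTES = 100 << 20   # 104857600 bytes = 100 MiB

def slot_status(active: bool, lag: int) -> str:
    """Return a status string for one logical replication slot."""
    if not active:
        return "CRITICAL: subscriber disconnected"
    if lag > CRITICAL_BYTES:
        return "CRITICAL: >1GB lag"
    if lag > WARNING_BYTES:
        return "WARNING: >100MB lag"
    return "OK"
```

Wiring this into your existing monitoring stack is usually just a matter of emitting a metric or page whenever the result isn't "OK".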
Performance Considerations
Logical replication adds overhead to your Supabase instance. Here's what to watch:
CPU Impact
Decoding WAL for logical replication uses CPU. On busy databases, consider:
- Deploying on instances with higher CPU allocation
- Keeping tables at REPLICA IDENTITY DEFAULT (only the primary key is written to WAL for updates and deletes) rather than FULL (the entire old row)
Network Bandwidth
Initial sync copies all existing data. For large tables:
- Schedule initial sync during low-traffic periods
- Use copy_data = false in the subscription if the destination already has the data
Write Amplification
Each write to a replicated table generates WAL entries that must be:
- Written to disk
- Read by the WAL sender
- Sent over the network
- Applied at the destination
For write-heavy workloads, this can impact overall throughput.
Managing Replication with Supascale
Setting up logical replication manually works, but managing multiple Supabase projects with different replication targets gets complicated. Supascale helps by providing:
- Centralized configuration: Manage PostgreSQL settings across projects
- Backup coordination: Ensure backups include replication slot state for clean failover
- Monitoring integration: Track replication lag alongside other metrics
The platform handles the operational complexity while you focus on what data goes where.
Troubleshooting
"Replication slot does not exist"
The destination lost its slot reference. On the source:
-- Create slot manually if needed
SELECT pg_create_logical_replication_slot('my_slot_name', 'pgoutput');
Then recreate the subscription with WITH (create_slot = false, slot_name = 'my_slot_name') so it attaches to the existing slot instead of trying to create a new one.
"Could not start WAL streaming"
Usually a network or authentication issue. Verify:
- Firewall allows connections on port 5432
- pg_hba.conf has a correct entry for replication
- SSL settings match between source and destination
Replication Lag Keeps Growing
The destination can't keep up. Check:
- Destination database performance (CPU, disk I/O)
- Network latency between source and destination
- Complex triggers or constraints on destination tables slowing inserts
Conclusion
Logical replication transforms self-hosted Supabase from an isolated database into a data hub that feeds your entire infrastructure. Whether you're building analytics pipelines, maintaining compliance systems, or architecting multi-database applications, the ability to stream changes to external systems is essential.
The setup requires careful attention to PostgreSQL configuration, network security, and monitoring—but once running, it's remarkably reliable. Start with a single table, validate the pipeline end-to-end, then expand to more tables as you build confidence.
For teams managing multiple Supabase projects with complex data flows, Supascale provides the operational layer that makes this manageable at scale.
