Replication slot too large and not decreasing
A growing replication slot means PostgreSQL is retaining WAL segments that have not yet been consumed. If the slot size keeps increasing and never decreases, the retained WAL can eventually exhaust disk space.
Symptoms
- Replication slot size is growing continuously
- WAL disk usage is increasing or triggering storage alerts
- The retained_wal value from the diagnostic query below keeps climbing
Diagnosing the issue
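A minimal sketch of a diagnostic query for this step, assuming PostgreSQL 13 or later (the wal_status column was added in version 13) and a session connected to the primary. The retained_wal alias matches the value named in the symptoms above; the exact shape of the query is an assumption, not necessarily the one Artie uses.

```sql
-- Sketch: list each replication slot with the amount of WAL it is holding back.
-- Run on the primary; pg_current_wal_lsn() is not available during recovery.
SELECT
    slot_name,
    active,        -- false means no consumer is currently connected
    wal_status,    -- reserved, extended, unreserved, or lost
    pg_size_pretty(
        pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)
    ) AS retained_wal
FROM pg_replication_slots
ORDER BY pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) DESC NULLS LAST;
```

A single reading says little on its own. What matters is whether retained_wal shrinks again between runs.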
Check your replication slot size and status, and repeat the check a few minutes apart to confirm whether the slot is still growing.
Common causes and resolutions
Long-running transactions
Long-running or idle-in-transaction sessions prevent PostgreSQL from advancing the replication slot past their transaction boundary.
Resolution: Terminate the blocking session and consider setting idle_in_transaction_session_timeout to automatically terminate sessions that stay idle inside a transaction.
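One way to find the blocking session is to list the oldest open transactions from pg_stat_activity. This is a sketch: the 80-character query preview, the pid 12345, the database name app_db, and the 10-minute timeout are all illustrative assumptions.

```sql
-- Sketch: oldest open transactions first.
SELECT
    pid,
    usename,
    state,                            -- e.g. 'active' or 'idle in transaction'
    now() - xact_start AS xact_age,
    left(query, 80)    AS last_query
FROM pg_stat_activity
WHERE xact_start IS NOT NULL
  AND pid <> pg_backend_pid()         -- skip the session running this query
ORDER BY xact_start
LIMIT 10;

-- After confirming a session is safe to end, terminate it by pid.
-- 12345 is a placeholder pid, so the statement is left commented out.
-- SELECT pg_terminate_backend(12345);

-- Illustrative timeout for a hypothetical database named app_db.
-- ALTER DATABASE app_db SET idle_in_transaction_session_timeout = '10min';
```

The ALTER DATABASE setting only applies to sessions opened after the change, so existing idle sessions still need to be terminated by hand.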
Idle database without heartbeats enabled
On idle databases (especially AWS RDS), WAL segments accumulate because the replicated tables produce no changes for the replication slot to consume, so the slot never advances. RDS writes internal heartbeats to rdsadmin every 5 minutes, generating ~18 GB of WAL per day on an otherwise idle instance.
Resolution: Enable heartbeats in Artie to periodically advance the replication slot. See Enabling heartbeats for setup instructions. For RDS-specific details, see Preventing WAL growth on RDS.
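The ~18 GB figure can be reproduced with a quick calculation under two assumptions that are not stated above: each 5-minute heartbeat results in one new WAL segment, and each segment is 64 MB. Check the second assumption on your own instance before relying on the number.

```sql
-- Check the segment size on this instance (assumed to be 64MB below).
SHOW wal_segment_size;

-- Sketch: WAL written per day by heartbeats alone, under the assumptions above.
SELECT pg_size_pretty(
    (24 * 60 / 5)          -- 288 heartbeats per day, one every 5 minutes
    * 64                   -- assumed segment size in MB
    * 1024 * 1024::bigint  -- MB to bytes
) AS idle_wal_per_day;
```

With a 64 MB segment size this works out to 18 GB per day. A smaller segment size scales the result down proportionally.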
Pipeline is paused or unhealthy
If the Artie pipeline is paused, stopped, or in an error state, it will not consume from the replication slot, causing WAL to accumulate.
Resolution: Check the pipeline status in the Artie dashboard and resume or re-deploy it.
max_slot_wal_keep_size is not configured
By default, max_slot_wal_keep_size is set to -1 (unlimited), meaning PostgreSQL will retain WAL indefinitely for a slot. This can lead to unbounded disk growth.
Resolution: Set max_slot_wal_keep_size to a reasonable value to cap WAL retention. Note that if the limit is reached, PostgreSQL will invalidate the slot (see Replication slot lost below).
Replication slot lost
A lost replication slot means the slot was either dropped or invalidated, so the pipeline can no longer stream changes from where it left off.
Symptoms
- Pipeline errors indicating the replication slot does not exist
- Errors referencing WAL segments that have been removed
- pg_replication_slots returns no rows for your slot, or shows wal_status = 'lost'
Diagnosing the issue
Check whether the slot still exists and its status, then check the max_slot_wal_keep_size setting.
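Both checks can be sketched as follows, assuming PostgreSQL 13 or later (wal_status and max_slot_wal_keep_size were both added in version 13). The slot name artie_slot is hypothetical; substitute the name your pipeline uses.

```sql
-- No rows returned: the slot was dropped or never existed on this server.
-- wal_status = 'lost': the slot exists but has been invalidated.
SELECT slot_name, active, wal_status, restart_lsn
FROM pg_replication_slots
WHERE slot_name = 'artie_slot';   -- hypothetical slot name

-- -1 means unlimited retention; any other value is the invalidation threshold.
SHOW max_slot_wal_keep_size;
```

After a failover, run these against the new primary, since slot state on the old primary no longer applies.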
Common causes and resolutions
Slot invalidated by max_slot_wal_keep_size
This is often a consequence of the slot growing too large (see Replication slot too large above). When max_slot_wal_keep_size is configured, PostgreSQL will invalidate any slot whose retained WAL exceeds the limit.
Resolution: Re-deploy the pipeline in Artie to recreate the replication slot, then trigger a backfill to re-sync your data. To prevent this from happening again, enable heartbeats to keep the slot advancing. See Enabling heartbeats.
Manual slot deletion
Someone manually dropped the replication slot using pg_drop_replication_slot().
Resolution: Re-deploy the pipeline in Artie to recreate the slot, then trigger a backfill.
Database failover
During a failover event (especially on Amazon Aurora), replication slots on the old primary are not automatically carried over to the new primary.
Resolution: Re-deploy the pipeline in Artie to create a new replication slot on the new primary, then trigger a backfill to re-sync data.
Provider-specific WAL retention limits
Some managed PostgreSQL providers impose their own WAL retention limits that can cause slot invalidation independently of max_slot_wal_keep_size.
Resolution: Check your provider’s documentation for WAL retention policies. Re-deploy the pipeline and trigger a backfill to recover. Enable heartbeats to keep the slot active and prevent future invalidation. See Enabling heartbeats.
Setting max_slot_wal_keep_size
max_slot_wal_keep_size caps how much WAL PostgreSQL retains for a replication slot before invalidating it. Treat this as a break-glass mechanism: if the limit is hit, the slot is invalidated and a full backfill is required to recover.
When choosing a value, consider:
- Your database size - larger databases generate more WAL and need more headroom.
- Historical slot size - check how large your slot has grown during normal operations and past incidents (pipeline pauses, long-running transactions, etc.).
- Set it high enough that you don’t have to think about it - aim for at least 3-5x your observed peak slot size.
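As a sketch of applying these guidelines on a self-managed server, assume the largest slot size observed so far was about 20 GB; that figure and the resulting 100 GB limit are illustrative only. Managed providers often disallow ALTER SYSTEM, in which case the same parameter is usually set through the provider's configuration interface instead.

```sql
-- Assumption: the largest slot size observed so far was about 20 GB.
-- 5 x 20 GB = 100 GB, the upper end of the 3-5x guideline.
ALTER SYSTEM SET max_slot_wal_keep_size = '100GB';

-- The parameter can be changed without a restart; reload the configuration.
SELECT pg_reload_conf();

-- Confirm the new value.
SHOW max_slot_wal_keep_size;
```

Make sure the volume has at least that much free space beyond normal usage. Otherwise the disk can fill before the limit takes effect.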