How it works
Artie runs two parallel processes during a backfill:
1. Backfill process (historical data)
- Scans full table in batches, writes directly to destination
- Bypasses Kafka
- Can use read replica to minimize primary DB load
2. CDC process (live changes)
- Reads transaction logs immediately
- Queues all changes (inserts, updates, deletes) in Kafka
- Applies queued changes after backfill completes
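The interplay between the two processes can be sketched as a small in-memory simulation. This is an illustrative model only, not Artie's implementation: `source_rows` stands in for the source table, and `cdc_events` for changes captured from the transaction log while the backfill runs (queued in Kafka in Artie's case).

```python
from collections import deque

def backfill_and_stream(source_rows, cdc_events, batch_size=2):
    """Hypothetical two-phase sync: batched backfill, then queued CDC replay."""
    destination = {}
    queue = deque(cdc_events)  # changes buffered during the backfill

    # Phase 1: backfill — scan the full table in batches, write directly.
    for i in range(0, len(source_rows), batch_size):
        for row_id, value in source_rows[i:i + batch_size]:
            destination[row_id] = value

    # Phase 2: apply the queued CDC changes in order.
    while queue:
        op, row_id, value = queue.popleft()
        if op == "delete":
            destination.pop(row_id, None)
        else:  # insert or update
            destination[row_id] = value
    return destination

rows = [(1, "a"), (2, "b"), (3, "c")]
events = [("update", 2, "b2"), ("delete", 3, None), ("insert", 4, "d")]
print(backfill_and_stream(rows, events))  # {1: 'a', 2: 'b2', 4: 'd'}
```

Because the queued changes are replayed in log order after the backfill finishes, the destination converges to the source's current state even though the two phases overlap in time.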
Managing backfills
You can track the progress of your backfills in the Analytics Portal or on the Pipeline Overview page.
Triggering an ad hoc backfill
You can trigger an ad hoc backfill from the Pipeline Overview page by clicking on the Backfill tables button.

Canceling a backfill
You can cancel a backfill from the Pipeline Overview page by clicking on the Cancel backfill button.

Questions
When do backfills occur?
- When you first launch a pipeline
- When you add new tables
- When you trigger an ad hoc backfill from the dashboard
How many tables backfill at once?
Default: 10 tables in parallel per pipeline to keep source DB load manageable. Table states:
- Queued to backfill — Waiting in the queue
- Backfilling — Actively running
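A simplified sketch of this queueing behavior, with FIFO ordering and the default concurrency cap of 10. All names here are illustrative (Artie's internal scheduler is not public), and real scheduling would start a new table as soon as a slot frees up rather than in lockstep batches:

```python
from collections import deque

MAX_CONCURRENT = 10  # default parallel backfills per pipeline

def schedule(tables, max_concurrent=MAX_CONCURRENT):
    """Yield groups of tables to backfill, FIFO, up to the concurrency limit."""
    queued = deque(tables)  # "Queued to backfill" state
    while queued:
        # Promote up to `max_concurrent` tables to the "Backfilling" state.
        yield [queued.popleft() for _ in range(min(max_concurrent, len(queued)))]

tables = [f"table_{i}" for i in range(12)]
batches = list(schedule(tables))
print(batches[0])  # first 10 tables, in the order they were added
print(batches[1])  # ['table_10', 'table_11']
```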
How are backfills ordered?
FIFO (first-in, first-out). Tables backfill in the order they were added, up to the concurrency limit.
What happens to CDC changes during a backfill?
Our process continues to capture all changes to Kafka in the background. Once the backfill completes:
- We switch to the CDC stream
- We apply queued changes in order
- The table transitions to the fully streaming state