Overview
An online migration lets you stand up a new pipeline alongside an existing one without adding any load to your source database and without an interruption to your current replication. You do this by sharing one source reader between the old pipeline and the new one: Artie reads your source once and fans the change stream out to both pipelines.
Use this when you want to:
- Move a pipeline to a different destination database (e.g. a new DuckDB database) without re-reading the source.
- Re-create a pipeline with different settings (new schema mapping, history mode, partitioning, etc.) and cut over once it has caught up.
- Validate a new destination in parallel before retiring the old one.
How it works
Before the migration you have one pipeline with a dedicated source reader.
During the migration you promote that source reader to shared and attach a second pipeline to it. You are still only reading from MySQL once: the single reader publishes to Kafka, and both pipelines consume from it independently.
Once the new pipeline has backfilled and caught up to the source, you delete the old pipeline and the migration is complete.
Before you start
- Your existing pipeline and its source reader must be managed in Terraform, or imported into Terraform state. See terraform import for the source reader and pipeline resources.
- A shared source reader must declare its tables explicitly via the tables block. List every table that any attached pipeline needs (see Step 1).
- The destination connector for the new pipeline (e.g. a second DuckDB / MotherDuck database) should already exist as an artie_connector resource.
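The requirement that the shared reader lists every table any attached pipeline needs can be checked before you apply anything. The following is a minimal sketch, not part of Artie or its Terraform provider: the function name `missing_reader_tables` and the table and pipeline names are hypothetical, and it assumes you have already collected the table lists yourself (for example, by reading them out of your Terraform configuration).

```python
from typing import Dict, Iterable, Set


def missing_reader_tables(
    reader_tables: Iterable[str],
    pipeline_tables: Dict[str, Iterable[str]],
) -> Dict[str, Set[str]]:
    """Return, per pipeline, the tables it needs that the reader does not list.

    An empty result means the reader's table list is a superset of every
    attached pipeline's tables.
    """
    declared = set(reader_tables)
    gaps: Dict[str, Set[str]] = {}
    for pipeline, tables in pipeline_tables.items():
        missing = set(tables) - declared
        if missing:
            gaps[pipeline] = missing
    return gaps


if __name__ == "__main__":
    # Hypothetical table names, for illustration only.
    reader = ["orders", "customers"]
    pipelines = {
        "old_pipeline": ["orders", "customers", "refunds"],
        "new_pipeline": ["orders"],
    }
    print(missing_reader_tables(reader, pipelines))
    # → {'old_pipeline': {'refunds'}}
```

Any non-empty result names a table that should be added to the reader's tables list before the reader is promoted to shared.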
Steps
Promote the existing source reader to shared
Set is_shared = true on the source reader and add a tables block listing every table the reader should capture. A shared reader requires the table list because it is no longer tied to a single pipeline’s table selection.
Run terraform plan and confirm the only change is on the source reader, then run terraform apply.
Create the new pipeline against the same source reader
Add a new artie_pipeline resource that points at the same source_reader_uuid as your existing pipeline, but writes to a different DuckDB database. This is the new pipeline that will eventually replace the old one.
Run terraform apply. Artie creates the new pipeline, backfills its tables into analytics_v2, and then begins consuming live changes from the shared reader’s Kafka stream.
Let the new pipeline backfill and catch up
- Old pipeline: MySQL → Kafka → DuckDB analytics_v1
- New pipeline: MySQL → Kafka → DuckDB analytics_v2
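Once the new pipeline has caught up, a per-table row-count comparison between analytics_v1 and analytics_v2 is one simple check to run. The sketch below is illustrative only: the function name `row_count_mismatches` and the sample counts are hypothetical, and it assumes you have already gathered the counts from each database (for example, with a COUNT(*) query per table). Because both pipelines keep consuming live changes, small differences can be transient, so re-check before treating a mismatch as a real problem.

```python
from typing import Dict, List, Optional, Tuple


def row_count_mismatches(
    old_counts: Dict[str, int],
    new_counts: Dict[str, int],
) -> List[Tuple[str, int, Optional[int]]]:
    """Compare per-table row counts from the old and new destinations.

    Returns (table, old_count, new_count) for every table in the old
    destination whose count differs in the new one. new_count is None
    when the table is missing from the new destination entirely.
    """
    mismatches: List[Tuple[str, int, Optional[int]]] = []
    for table in sorted(old_counts):
        old = old_counts[table]
        new = new_counts.get(table)
        if new != old:
            mismatches.append((table, old, new))
    return mismatches


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    v1 = {"orders": 1200, "customers": 300}
    v2 = {"orders": 1200, "customers": 298}
    print(row_count_mismatches(v1, v2))
    # → [('customers', 300, 298)]
```

An empty list means every table present in the old destination has the same row count in the new one. It does not replace spot checks or running your downstream queries.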
Validate analytics_v2 (row counts, spot checks, downstream queries) before cutting over.
Cut over and delete the old pipeline
Once you have confirmed that analytics_v2 is correct, point your downstream consumers at the new database, remove the old pipeline’s resource block from your Terraform configuration, and run terraform apply.
You are now left with a single source reader feeding only the new pipeline, with no downtime and no extra load on your source.
(Optional) Convert the reader back to UI-managed
Set is_shared = false and remove the now-redundant tables block (a non-shared reader infers its tables from its single pipeline).
Run terraform apply. Artie tears down the standalone shared-reader deployment and folds the reader back into the new pipeline’s own deployment. From this point the reader and pipeline are fully manageable from the dashboard.
Frequently asked questions
Does this add load to my source database?
Will promoting the reader to shared interrupt my existing pipeline?
What if I forget to list a table the old pipeline needs?
A shared reader captures only the tables listed in its tables block. If you omit a table the old pipeline depends on, Artie will stop capturing changes for that table. Always make the reader’s tables block a superset of every table used by every attached pipeline.
Can I do this from the dashboard UI?
Shared source readers are managed in Terraform: define an artie_source_reader with is_shared = true and reference its UUID from each artie_pipeline resource.
Can I convert the reader back to dashboard-managed after the migration?
Yes. Set is_shared = false in Terraform and apply. Artie tears down the standalone shared-reader deployment and folds the reader back into its pipeline, which re-enables editing it from the dashboard. This only works when at most one pipeline is still attached, so delete any extra pipelines first.
Can I write to more than two destinations at once?