Online migration

Overview

An online migration lets you stand up a new pipeline alongside an existing one without adding any load to your source database and without an interruption to your current replication. You do this by sharing one source reader between the old pipeline and the new one: Artie reads your source once and fans the change stream out to both pipelines. Use this when you want to:

Move a pipeline to a different destination database (e.g. a new DuckDB database) without re-reading the source.
Re-create a pipeline with different settings (new schema mapping, history mode, partitioning, etc.) and cut over once it has caught up.
Validate a new destination in parallel before retiring the old one.

Sharing a source reader is currently only configurable through Terraform. It cannot be done from the dashboard UI yet. The examples below use the artie Terraform provider.

How it works

Before the migration, you have one pipeline with a dedicated source reader.

During the migration, you promote that source reader to shared and attach a second pipeline to it. You are still reading from MySQL only once: the single reader publishes to Kafka, and both pipelines consume from it independently.

Once the new pipeline has backfilled and caught up to the source, you delete the old pipeline and the migration is complete.

Offsets are tracked per pipeline. Promoting the source reader to shared does not reset, replay, or interfere with the old pipeline’s position in the stream. The old pipeline keeps replicating exactly as before while the new pipeline backfills and catches up at its own pace.

Before you start

Your existing pipeline and its source reader must be managed in Terraform, or imported into Terraform state. See terraform import for the source reader and pipeline resources.
A shared source reader must declare its tables explicitly via the tables block. List every table that any attached pipeline needs (see "Promote the existing source reader to shared" under Steps).
The destination connector for the new pipeline (e.g. a second DuckDB / MotherDuck database) should already exist as an artie_connector resource.
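
If the existing reader and pipeline were created in the dashboard, one way to bring them under Terraform is with import blocks (available in Terraform 1.5 and later). The sketch below assumes the artie provider imports these resources by their UUID. The placeholder IDs are hypothetical, so check the provider's import documentation for the exact ID format.

```hcl
# Sketch only: assumes the artie provider imports by resource UUID.
import {
  to = artie_source_reader.mysql_reader
  id = "<source-reader-uuid>" # placeholder, replace with the real ID
}

import {
  to = artie_pipeline.mysql_to_duckdb_v1
  id = "<pipeline-uuid>" # placeholder, replace with the real ID
}
```

With matching resource blocks in your configuration, terraform plan should list both resources as imports rather than as new resources to create.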

Steps

Promote the existing source reader to shared

Set is_shared = true on the source reader and add a tables block listing every table the reader should capture. A shared reader requires the table list because it is no longer tied to a single pipeline’s table selection.

resource "artie_source_reader" "mysql_reader" {
  name           = "MySQL Production Reader"
  connector_uuid = artie_connector.mysql_prod.uuid
  database_name  = "app"

  is_shared = true # ← promote to shared

  # Required once shared: list every table the reader should capture.
  tables = {
    "app.orders" = {
      name   = "orders"
      schema = "app"
    }
    "app.customers" = {
      name   = "customers"
      schema = "app"
    }
  }
}

Promoting a reader to shared is a non-destructive change — it does not restart replication or reset offsets for the existing pipeline. However, make sure the tables block is a superset of every table your existing pipeline replicates. Omitting a table the old pipeline depends on will stop changes for that table from being captured.
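
One way to guard the superset requirement is a Terraform check block (Terraform 1.5 and later). The sketch below is an assumption about how you might organize your configuration, not something the artie provider requires: the local names reader_tables and pipeline_tables are hypothetical, and the table maps are kept in locals so the check can compare their keys.

```hcl
locals {
  # Hypothetical locals: tables captured by the shared reader.
  reader_tables = {
    "app.orders"    = { name = "orders", schema = "app" }
    "app.customers" = { name = "customers", schema = "app" }
  }

  # Hypothetical locals: tables replicated by the attached pipeline(s).
  pipeline_tables = {
    "app.orders"    = { name = "orders", schema = "app" }
    "app.customers" = { name = "customers", schema = "app" }
  }
}

check "reader_covers_pipeline_tables" {
  assert {
    # Any key in pipeline_tables that is absent from reader_tables fails the check.
    condition = length(setsubtract(
      keys(local.pipeline_tables),
      keys(local.reader_tables)
    )) == 0
    error_message = "The shared reader's tables block is missing a table that a pipeline replicates."
  }
}
```

A failed check is reported as a warning during plan and apply and does not block the operation, so treat it as a reminder rather than a hard stop. If the reader and pipeline tables attributes accept this object shape (as the examples in this guide suggest), the resources could reference the locals directly, for example tables = local.reader_tables.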

Run terraform plan and confirm the only change is on the source reader, then terraform apply.

Create the new pipeline against the same source reader

Add a second artie_pipeline resource that points at the same source_reader_uuid as your existing pipeline, but writes to a different DuckDB database. This is the new pipeline that will eventually replace the old one.

# Existing pipeline (unchanged) → DuckDB database "analytics_v1"
resource "artie_pipeline" "mysql_to_duckdb_v1" {
  name               = "MySQL to DuckDB (v1)"
  source_reader_uuid = artie_source_reader.mysql_reader.uuid

  destination_connector_uuid = artie_connector.duckdb.uuid
  destination_config = {
    database = "analytics_v1"
    schema   = "main"
  }

  tables = {
    "app.orders"    = { name = "orders", schema = "app" }
    "app.customers" = { name = "customers", schema = "app" }
  }
}

# New pipeline → DuckDB database "analytics_v2", same source reader
resource "artie_pipeline" "mysql_to_duckdb_v2" {
  name               = "MySQL to DuckDB (v2)"
  source_reader_uuid = artie_source_reader.mysql_reader.uuid # ← shared reader

  destination_connector_uuid = artie_connector.duckdb.uuid
  destination_config = {
    database = "analytics_v2" # ← different database
    schema   = "main"
  }

  tables = {
    "app.orders"    = { name = "orders", schema = "app" }
    "app.customers" = { name = "customers", schema = "app" }
  }
}

Run terraform apply. Artie creates the new pipeline, backfills its tables into analytics_v2, and then begins consuming live changes from the shared reader’s Kafka stream.

Because both pipelines consume from the same Kafka stream, your MySQL database is still only read once. Adding the new pipeline does not add a second connection, replication slot, or binlog reader on the source.

Let the new pipeline backfill and catch up

At this point you have one source reader and two pipelines:

Old pipeline: MySQL → Kafka → DuckDB analytics_v1
New pipeline: MySQL → Kafka → DuckDB analytics_v2

Watch the new pipeline in the analytics portal until its backfill completes and its replication lag drops to near zero, which means it has caught up to the live change stream and is in sync with the old pipeline.

Validate the data in analytics_v2 (row counts, spot checks, downstream queries) before cutting over.

Cut over and delete the old pipeline

Once you have confirmed the new pipeline is caught up and the data in analytics_v2 is correct, point your downstream consumers at the new database, remove the old pipeline’s resource block from your Terraform configuration, and run terraform apply.

You are now left with a single source reader feeding only the new pipeline, with no downtime and no extra load on your source.
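
As a rough sketch, the pipeline section of the configuration after cutover might look like the following, reusing the example resource names from this guide. The v1 block is simply deleted and the v2 block is left untouched.

```hcl
# The artie_pipeline.mysql_to_duckdb_v1 block has been deleted from the file.
# Terraform plans a destroy for any resource that is in state but no longer
# declared in the configuration.

# New pipeline (unchanged): DuckDB database "analytics_v2", same source reader
resource "artie_pipeline" "mysql_to_duckdb_v2" {
  name               = "MySQL to DuckDB (v2)"
  source_reader_uuid = artie_source_reader.mysql_reader.uuid

  destination_connector_uuid = artie_connector.duckdb.uuid
  destination_config = {
    database = "analytics_v2"
    schema   = "main"
  }

  tables = {
    "app.orders"    = { name = "orders", schema = "app" }
    "app.customers" = { name = "customers", schema = "app" }
  }
}
```

Before applying, terraform plan should show the old pipeline as the only resource to destroy, with no changes to the source reader or the new pipeline. If the plan shows anything else, stop and review it before running terraform apply.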

(Optional) Convert the reader back to UI-managed

Once the new pipeline is the only pipeline attached to the reader, you can convert the reader back to a regular, single-pipeline reader so it becomes editable from the dashboard again. While a reader is shared, its source configuration is locked in the UI (“This source reader is shared by multiple pipelines, so it can only be changed via Terraform”). Flipping it back unlocks that.

Set is_shared = false and remove the now-redundant tables block (a non-shared reader infers its tables from its single pipeline):

resource "artie_source_reader" "mysql_reader" {
  name           = "MySQL Production Reader"
  connector_uuid = artie_connector.mysql_prod.uuid
  database_name  = "app"

  is_shared = false # ← convert back to single-pipeline
}

Run terraform apply. Artie tears down the standalone shared-reader deployment and folds the reader back into the new pipeline’s own deployment. From this point the reader and pipeline are fully manageable from the dashboard.

You can only set is_shared = false when at most one pipeline is attached to the reader. If more than one pipeline still references it, the change is rejected (“this source reader is being used by multiple pipelines so it cannot be set to not shared”). Delete the extra pipelines first.

If you expect to run more online migrations in the future, you can leave is_shared = true and skip this step — a shared reader works perfectly well with a single pipeline. Only convert back if you want the source reader to be editable from the dashboard.

Frequently asked questions

Does this add load to my source database?

No. The source reader reads your database once and publishes changes to Kafka. Every pipeline attached to the reader consumes from that same Kafka stream, so there is only ever one connection, one replication slot, and one log reader on your source — regardless of how many pipelines you run.

Will promoting the reader to shared interrupt my existing pipeline?

No. Offsets are tracked per pipeline, so the existing pipeline keeps replicating from exactly where it left off. Promoting the reader to shared only changes how many pipelines may attach to it.

What if I forget to list a table the old pipeline needs?

A shared source reader only captures the tables listed in its tables block. If you omit a table the old pipeline depends on, Artie will stop capturing changes for that table. Always make the reader’s tables block a superset of every table used by every attached pipeline.

Can I do this from the dashboard UI?

Not yet. Sharing a source reader across multiple pipelines is currently only available through Terraform. Create an artie_source_reader with is_shared = true and reference its UUID from each artie_pipeline resource.

Can I convert the reader back to dashboard-managed after the migration?

Yes. Once the reader is down to a single pipeline, set is_shared = false in Terraform and apply. Artie tears down the standalone shared-reader deployment and folds the reader back into its pipeline, which re-enables editing it from the dashboard. This only works when at most one pipeline is still attached — delete any extra pipelines first.

Can I write to more than two destinations at once?

Yes. A shared source reader can feed any number of pipelines in parallel — for example streaming the same source to DuckDB, Snowflake, and S3 simultaneously. See Writing to multiple destinations.
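
As an illustration, a third pipeline attached to the same shared reader could look like the sketch below. The artie_connector.snowflake resource and the database and schema values are hypothetical, and the sketch assumes that destination_config for Snowflake takes the same keys as the DuckDB examples in this guide, so check the provider documentation for the real attributes.

```hcl
# Hypothetical third pipeline: same shared reader, different destination.
# artie_connector.snowflake is an assumed resource, not defined in this guide.
resource "artie_pipeline" "mysql_to_snowflake" {
  name               = "MySQL to Snowflake"
  source_reader_uuid = artie_source_reader.mysql_reader.uuid # shared reader

  destination_connector_uuid = artie_connector.snowflake.uuid

  # Assumption: same keys as the DuckDB destination_config examples.
  destination_config = {
    database = "ANALYTICS"
    schema   = "PUBLIC"
  }

  tables = {
    "app.orders" = { name = "orders", schema = "app" }
  }
}
```

Every table listed in a pipeline like this one must also appear in the shared reader's tables block, since the reader only captures the tables it declares.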
