

A reference of terms and concepts you’ll encounter when working with Artie.

A

Ad hoc backfill: A manually triggered backfill initiated from the pipeline overview, as opposed to the automatic backfill that runs during initial onboarding.

B

Backfill: A process that copies existing historical data from a source table to the destination. Backfills run alongside ongoing CDC so that both current and historical rows are replicated. See the Backfill guide for more details.
Bring your own cloud (BYOC): A deployment model where the data plane runs inside your own cloud account or network while Artie hosts the control plane. This keeps all customer data within your infrastructure.

C

Change data capture (CDC): A continuous process that captures inserts, updates, and deletes from a database's transaction log. Artie uses CDC to stream changes in near real-time to your destination.
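As a mental model, a log-based CDC event carries the operation type plus the row state before and after the change, and replication amounts to applying those events in order. A minimal Python sketch (the event shape and field names are illustrative, not Artie's wire format):

```python
def apply_event(dest: dict, event: dict) -> None:
    """Apply one change event to a toy in-memory 'destination' keyed by id."""
    if event["op"] in ("c", "u"):              # c = insert, u = update
        row = event["after"]
        dest[row["id"]] = row
    elif event["op"] == "d":                   # d = delete
        dest.pop(event["before"]["id"], None)

# Replaying the log reproduces the source table's current state.
dest = {}
apply_event(dest, {"op": "c", "after": {"id": 1, "status": "pending"}})
apply_event(dest, {"op": "u", "after": {"id": 1, "status": "paid"}})
apply_event(dest, {"op": "d", "before": {"id": 1}})
```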
Column encryption: Encrypting column values before writing them to the destination. Values can be decrypted later using either an Artie-managed key or a customer-provided KMS data encryption key (DEK). See Tables.
Column exclusion: A filter that replicates all columns except those in a specified list, useful for omitting PII or large unused columns. See Column inclusion and exclusion.
Column hashing: Replacing column values with a deterministic SHA-256 hash before writing to the destination. Hashed values are masked but remain joinable and stable across rows. See Tables.
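The joinability of hashed columns follows from SHA-256 being deterministic: the same input always yields the same digest, in any table. A minimal Python sketch of the idea (illustrative only, not Artie's implementation):

```python
import hashlib

def hash_column_value(value: str) -> str:
    """Deterministically hash a column value with SHA-256."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()

# The same email hashes identically in two different tables,
# so a join on the hashed column still matches.
users_key = hash_column_value("alice@example.com")
orders_key = hash_column_value("alice@example.com")
assert users_key == orders_key
assert users_key != hash_column_value("bob@example.com")
```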
Column inclusion: A filter that replicates only an explicit list of columns, excluding everything else by default. See Column inclusion and exclusion.
Compaction: An in-memory mechanism that reduces the volume of writes sent to the destination by compacting multiple changes to the same row into a single write before flushing.
Control plane: The Artie-hosted layer responsible for the dashboard, orchestration, pipeline configuration, monitoring, and metrics. No customer row data passes through the control plane. See Architecture.
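Compaction can be sketched as keeping only the most recent buffered change per primary key, a rough model of the idea rather than Artie's implementation:

```python
def compact(changes):
    """Collapse multiple changes to the same row into one write.

    `changes` is an ordered list of (primary_key, row) events; only the
    most recent event per key survives, so a row updated several times
    within one buffer produces a single destination write.
    """
    latest = {}
    for pk, row in changes:
        latest[pk] = row  # later events overwrite earlier ones
    return list(latest.items())

buffered = [
    (1, {"status": "pending"}),
    (2, {"status": "pending"}),
    (1, {"status": "paid"}),     # supersedes the first change to row 1
]
compacted = compact(buffered)
# Two writes reach the destination instead of three.
```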

D

Data plane: The environment where all data processing happens: reading from your source, buffering changes in Kafka, and writing to your destination. A data plane is scoped to a cloud provider and region. In BYOC deployments the data plane runs in your own network. See Architecture.
Destination table name: A configurable name override that lets a source table appear under a different name in the destination. See Tables.

F

Flush condition: One of three thresholds (time elapsed, deduplicated row count, or deduplicated byte size) that triggers a flush when any single condition is met. See Flush rules.
Flushing: The process of writing buffered, in-memory data to the destination as an optimized batch, then committing the Kafka offset and resuming reads. See Flush rules.
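The "any single condition" semantics of the flush thresholds can be sketched as a simple disjunction; the threshold values below are placeholders, not Artie's defaults:

```python
def should_flush(elapsed_s: float, row_count: int, byte_size: int,
                 max_elapsed_s: float = 10.0,
                 max_rows: int = 50_000,
                 max_bytes: int = 75 * 1024 * 1024) -> bool:
    """Return True when any one flush condition is met.

    Threshold defaults here are illustrative placeholders only.
    """
    return (elapsed_s >= max_elapsed_s
            or row_count >= max_rows
            or byte_size >= max_bytes)

# Time alone can trigger a flush even with a nearly empty buffer.
assert should_flush(elapsed_s=12.0, row_count=3, byte_size=512)
```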

H

History table: A table setting that creates a companion {TABLE}__HISTORY table recording every insert, update, and delete for auditing and point-in-time analysis. See Tables.

M

Merge predicate: An additional destination column (such as a partition or cluster key) included in merge logic so the data warehouse can prune work during merges and improve performance. See Tables.
Multi-step merge: For high-throughput OLAP destinations, splitting one large merge into sequential smaller steps to avoid timeouts on very large flush batches. See Tables.

P

Pipeline: A configured replication path from a source through Artie to a destination, including which tables to replicate and their individual settings.

S

Schema evolution: Automatic management of destination schema changes: adding new columns, optionally dropping columns that were removed upstream after a safety verification period, and supporting column type changes. See the Schema evolution guide for more details.
Skip deletes: A table setting that causes Artie to ignore delete events so deleted source rows are not removed from the destination. See Tables.
Soft delete: A destination setting where deleted source rows are retained in the destination and flagged with a delete column (such as __artie_delete) instead of being physically removed.
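With soft deletes enabled, consumers filter on the delete flag rather than relying on rows disappearing. A toy illustration (the row shape is hypothetical; only the __artie_delete column name comes from the docs):

```python
# Rows as they might land in the destination with soft deletes on.
rows = [
    {"id": 1, "name": "alice", "__artie_delete": False},
    {"id": 2, "name": "bob",   "__artie_delete": True},   # deleted upstream
]

# Queries that want only live rows filter on the delete flag;
# the deleted row remains available for audits and backfills.
live = [r for r in rows if not r["__artie_delete"]]
```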
Soft partitioning: Splitting rows into time-based physical tables (for example, monthly or daily) while exposing a single unified view for queries. See Soft partitioning.
System columns: Optional destination columns prefixed with __artie_ that Artie adds for tracking operations, deletes, source metadata, and static tags. See System columns.

T

Compressing column values in flight so large payloads fit within Kafka message size limits. Values are decompressed before landing in the destination, so the logical schema is unchanged. See Tables.

U

Union view: In soft partitioning, a database view that combines all partition tables using UNION ALL so consumers can query a single logical table name. See Soft partitioning.
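The union view can be sketched by generating a UNION ALL statement over the partition tables (illustrative SQL generation; the exact view syntax and table naming vary by destination warehouse):

```python
def union_view_sql(view_name: str, partition_tables: list[str]) -> str:
    """Build a CREATE VIEW statement that unions time-based partition
    tables into one logical table name for consumers to query."""
    selects = "\nUNION ALL\n".join(
        f"SELECT * FROM {table}" for table in partition_tables
    )
    return f"CREATE OR REPLACE VIEW {view_name} AS\n{selects}"

print(union_view_sql("orders", ["orders_2024_01", "orders_2024_02"]))
```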