
How it works

The flush process is a critical part of Artie’s data pipeline that determines when and how data gets written to your destination.
1. Data buffering

  • Artie’s reading process reads changes from your source database and publishes them to Kafka
  • Artie’s writing process reads messages from Kafka and writes them to your destination
  • Messages are temporarily stored in memory and deduplicated based on primary key(s) or a unique index
  • Multiple changes to the same record are merged to reduce write volume
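The buffering and merge behavior above can be sketched as a small in-memory buffer keyed by primary key. This is an illustrative sketch, not Artie’s actual implementation; the class and method names are hypothetical.

```python
# Sketch of an in-memory dedup buffer (hypothetical, for illustration):
# later changes to the same primary key overwrite earlier ones, so only
# the latest version of each record is written at flush time.

class DedupBuffer:
    def __init__(self):
        self._rows = {}  # primary key -> latest buffered change event

    def add(self, primary_key, event):
        # Merge: the newest change for a key replaces any buffered one.
        self._rows[primary_key] = event

    def flush(self):
        # Hand back the deduplicated batch and reset the buffer.
        batch = list(self._rows.values())
        self._rows.clear()
        return batch

buf = DedupBuffer()
buf.add(1, {"id": 1, "name": "alice"})
buf.add(2, {"id": 2, "name": "bob"})
buf.add(1, {"id": 1, "name": "alice v2"})  # supersedes the first change
print(len(buf.flush()))  # two deduplicated rows, not three
```

Because multiple changes to one record collapse into a single row, the write volume to the destination scales with the number of distinct records changed, not the number of change events.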
2. Flush trigger evaluation

  • Artie continuously monitors three flush conditions
  • When any condition is met, a flush is triggered
  • Reading from Kafka pauses during the flush operation
3. Data loading

  • Buffered data is written to your destination in an optimized batch
  • After the write completes, Artie commits the Kafka offset and resumes reading
  • The cycle repeats for continuous data flow
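The full cycle — buffer, flush, commit, resume — can be sketched end to end. This is a simplified single-threaded model with hypothetical names; it is not Artie’s code, and the flush condition here is reduced to a simple message-count check.

```python
from collections import namedtuple

# Sketch of the flush cycle (illustrative, not Artie's implementation):
# read messages into a buffer deduplicated by key; when a flush condition
# fires, write the batch to the destination, then commit the offset.

Message = namedtuple("Message", ["key", "value"])

def run_cycle(messages, destination_batches, flush_every=2):
    buffer = {}
    committed_offset = -1
    for offset, msg in enumerate(messages):
        buffer[msg.key] = msg.value          # dedup by key while buffering
        if len(buffer) >= flush_every:       # a flush condition is met
            # Reading pauses here while the buffered batch is written...
            destination_batches.append(list(buffer.values()))
            buffer.clear()
            committed_offset = offset        # ...then the offset is committed
    return committed_offset
```

Committing the offset only after the destination write succeeds is what makes the cycle safe to repeat: if a flush fails, the uncommitted messages are simply re-read from Kafka.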

Conditions

Artie evaluates three conditions to determine when to flush data. Any one of these conditions will trigger a flush:

Time elapsed

Maximum time in seconds — Ensures data freshness even during low-volume periods

Message count

Number of deduplicated messages — Based on unique primary keys or a unique index

Byte size

Total bytes of deduplicated data — Actual payload size after deduplication
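Because the byte-size condition counts the payload after deduplication, superseded versions of a record do not count toward the threshold. A sketch of that accounting, using JSON-serialized length as a stand-in for whatever internal size measure the pipeline actually uses (an assumption for illustration):

```python
import json

def deduplicated_byte_size(events):
    """Approximate buffered payload size after dedup by primary key.

    JSON-serialized length is used here as a stand-in for the pipeline's
    real internal accounting (an assumption for illustration).
    """
    latest = {}
    for event in events:
        latest[event["id"]] = event   # a later change replaces an earlier one
    return sum(len(json.dumps(e).encode("utf-8")) for e in latest.values())

events = [
    {"id": 1, "name": "alice"},
    {"id": 1, "name": "alice-updated"},  # supersedes the first row
    {"id": 2, "name": "bob"},
]
# Only the two surviving rows count toward the byte-size threshold.
```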