
Why destination type matters

Every flush carries a fixed overhead cost: staging table creation, time spent merging the data, and any cleanup work. The key to tuning flush rules is understanding how that fixed cost relates to your destination type.

Both OLAP and OLTP destinations follow the same general flush pattern (stage, merge, clean up); the difference is how much latency each step adds.

OLAP destinations (Snowflake, BigQuery, Databricks, Redshift) add significantly more latency per flush. Staging requires uploading files to cloud storage, DDL operations in a warehouse are slow, and, critically, the MERGE step requires a full table scan because OLAP databases don't have indexes on primary keys. Every flush must scan the entire destination table to find matching rows, regardless of how many rows are in the batch. With small batches, this fixed latency dominates:
Batch size    | Overhead | Merge time | Total | Per-row cost
1,000 rows    | ~5s      | ~0.1s      | ~5.1s | 5.1ms
100,000 rows  | ~5s      | ~2s        | ~7s   | 0.07ms
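The per-row figures in the table come from simple amortization of the fixed cost, which can be reproduced with a quick calculation (the overhead and merge times below are the table's illustrative values, not measurements):

```python
# Amortizing fixed flush overhead across the rows in a batch.
def per_row_cost_ms(rows, overhead_s, merge_s):
    """Total flush latency divided across the batch, in ms per row."""
    total_s = overhead_s + merge_s
    return total_s * 1000 / rows

print(per_row_cost_ms(1_000, 5, 0.1))   # ~5.1 ms/row
print(per_row_cost_ms(100_000, 5, 2))   # ~0.07 ms/row
```

The fixed ~5s is nearly identical in both cases; only the batch size changes, which is why larger batches are two orders of magnitude cheaper per row.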
OLTP destinations (PostgreSQL, MySQL, SQL Server) go through the same steps, but transactional databases have B-tree indexes on primary keys, so the MERGE uses fast index lookups instead of table scans. Combined with lightweight DDL operations, the fixed overhead per flush is low enough that smaller, more frequent flushes work well.
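The gap between the two MERGE strategies can be sketched with a toy cost model. This is an illustration of the asymptotics only, not how any of these databases actually execute MERGE:

```python
import math

# Toy comparison counts for the two MERGE strategies described above.

def merge_comparisons_olap(table_rows, batch_rows):
    # No primary-key index: every flush scans the whole destination
    # table, so cost is independent of how small the batch is.
    return table_rows

def merge_comparisons_oltp(table_rows, batch_rows):
    # B-tree index on the primary key: one O(log n) lookup per batch row.
    return batch_rows * math.ceil(math.log2(table_rows))

# 10M-row destination table, 1,000-row batch:
print(merge_comparisons_olap(10_000_000, 1_000))   # 10,000,000
print(merge_comparisons_oltp(10_000_000, 1_000))   # 24,000
```

Under this model the OLTP merge touches orders of magnitude fewer rows for small batches, which is why frequent small flushes are affordable there but not in a warehouse.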
For analytical databases like Snowflake, Databricks, BigQuery, or Redshift:
Setting flush rules too low can hinder throughput and cause latency spikes:
  • Fixed overhead costs: Each flush has connection/metadata overhead that dominates processing time with small batches
  • Inefficient resource usage: OLAP systems are designed for large parallel operations, not frequent micro-operations
  • Storage and query degradation: Many small files hurt compression, increase metadata lookups, and trigger excessive compaction

Recommended approach

Larger, less frequent flushes are optimal because:
  • Columnar storage benefits from batch processing
  • Reduced metadata overhead and better compression
  • More efficient query performance with fewer small files
Example configuration:
  • Rows: 100k-500k
  • Bytes: 50-500 MB
  • Time: 3-15 minutes
For tables with very high write throughput, multi-step merge can be enabled to support extremely large flush batches (1 GB+).
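To see how the three limits interact, here is a small sketch that computes which rule would fire first for a given incoming data rate. The function and its thresholds are illustrative (using the recommended OLAP ranges above), not Artie's actual flush logic:

```python
# Which flush rule triggers first for a steady incoming data rate?
# Thresholds default to mid-range OLAP recommendations: 250k rows,
# 100 MB, 5 minutes. Illustrative model only.

def first_trigger(rows_per_sec, bytes_per_sec,
                  max_rows=250_000, max_bytes=100 * 1024**2, max_secs=300):
    times = {
        "rows": max_rows / rows_per_sec,
        "bytes": max_bytes / bytes_per_sec,
        "time": float(max_secs),
    }
    rule = min(times, key=times.get)
    return rule, times[rule]

# A low-throughput table: the time rule fires long before rows or bytes.
print(first_trigger(rows_per_sec=100, bytes_per_sec=50_000))  # ('time', 300.0)
```

For low-throughput tables the time rule dominates, so raising row and byte limits alone will not change flush behavior there.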

Debugging high latency

If your pipeline latency is higher than expected, use the “Flush Count” graph in the analytics portal to identify which condition is triggering flushes (size, rows, or time), then adjust accordingly.
1. Check the flush reason

Look at the “Flush Count” graph to see which condition is triggering flushes.
2. If the reason is size or rows

Your flushes are triggering before enough data accumulates, producing small batches with high per-row overhead. Increase the limits toward the upper end of the recommended range:
  • Rows: increase toward 500k
  • Bytes: increase toward 500 MB
3. If the reason is time

Increase the time interval (e.g., from 3 minutes to 10-15 minutes). This may seem counterintuitive, but waiting longer allows more data to accumulate per flush, which increases overall throughput by amortizing the fixed merge overhead across a larger batch.
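A quick calculation shows why the longer interval helps, using the illustrative ~5s fixed overhead per flush from earlier (example figures, not benchmarks):

```python
# Fixed overhead consumed per hour at different flush intervals,
# assuming ~5s of fixed overhead per flush (illustrative figure).
overhead_per_flush_s = 5.0

for interval_min in (3, 10, 15):
    flushes_per_hour = 60 / interval_min
    wasted_s = flushes_per_hour * overhead_per_flush_s
    print(f"every {interval_min:>2} min: {flushes_per_hour:.0f} flushes/hr, "
          f"~{wasted_s:.0f}s/hr of fixed overhead")
```

Going from a 3-minute to a 15-minute interval cuts the hourly fixed overhead from ~100s to ~20s, while each flush carries five times as much data.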
For more details on how flush rules work, see the overview.

Monitoring

You can see which flush rule triggered each flush in the analytics portal:
[Image: Flush count graph in the Artie analytics portal showing the number of flushes triggered by each condition (size, rows, or time) over a selected time period]