> ## Documentation Index
> Fetch the complete documentation index at: https://artie.com/docs/llms.txt
> Use this file to discover all available pages before exploring further.

# Enable partitioning

> Partition your BigQuery destination tables to improve Artie merge performance and reduce costs.

Partitioning your BigQuery tables reduces the amount of data scanned during Artie's merge operations, which lowers merge latency and can significantly reduce your BigQuery bill. This guide walks through converting an existing unpartitioned table to a partitioned one while keeping your Artie pipeline in sync.

## Prerequisites

* An existing Artie pipeline writing to a BigQuery destination
* Permissions to run DDL statements (`CREATE TABLE`, `DROP TABLE`, `ALTER TABLE`) in your BigQuery dataset

## Enable partitioning

For this example, consider a source table in Postgres:

```sql theme={null}
CREATE TABLE events (
    event_id SERIAL PRIMARY KEY,
    event_name VARCHAR(255) NOT NULL,
    description TEXT,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
```

<Steps>
  <Step title="Pause your Artie pipeline">
    In the Artie dashboard, navigate to your pipeline and click **Pause**. This prevents new data from being merged while you restructure the table.
  </Step>

  <Step title="Choose a partition column">
    Pick a `TIMESTAMP` or `DATE` column that your queries frequently filter on. Good candidates are columns like `created_at`, `updated_at`, or `event_date`.

    For most CDC workloads, **daily time partitioning** on a `created_at`-style column provides the best balance between partition pruning and manageability.
  </Step>

  <Step title="Recreate the table with partitioning in BigQuery">
    Run the following SQL in the BigQuery console to create a partitioned copy, drop the original, and rename the copy:

    <Warning>
      This operation drops the original table. Verify that the partitioned copy contains all expected data before running the `DROP TABLE` statement.
    </Warning>

    ```sql theme={null}
    CREATE TABLE artie.events_copy
      PARTITION BY DATE(created_at)
      AS SELECT * FROM artie.events;
    ```

    After verifying the data in `events_copy` looks correct:

    ```sql theme={null}
    DROP TABLE artie.events;
    ALTER TABLE artie.events_copy RENAME TO events;
    ```
  </Step>

  <Step title="Update table settings in Artie">
    In the Artie dashboard, go to your pipeline > **Edit** > **Tables** > find the `events` table > **Table settings**.

    Under the **BigQuery partitioning** setting, specify the partition column (e.g. `created_at`). This tells Artie to include the partition column in merge predicates, which enables BigQuery to prune partitions during merges.

    See [Table settings](/pipelines/tables#table-settings) for more details on table-level configuration.
  </Step>

  <Step title="Deploy and verify">
    Click **Save** and then **Deploy** to resume your pipeline.

    To confirm partitioning is active, run the following query in BigQuery:

    ```sql theme={null}
    SELECT table_name, partition_id, total_rows
    FROM `artie.INFORMATION_SCHEMA.PARTITIONS`
    WHERE table_name = 'events'
    ORDER BY partition_id;
    ```

    You should see multiple rows, one per partition, instead of a single `NULL` partition.
  </Step>
</Steps>

## Background

<Accordion title="What is a partitioned table?">
  A partitioned table is divided into segments called partitions, making it easier to manage and query your data. By dividing a large table into smaller partitions, you can improve query performance and control costs by reducing the number of bytes read by a query. You partition tables by specifying a partition column which is used to segment the table.
</Accordion>

<Accordion title="Why use table partitions?">
  * **Faster merges** - Artie's merge operations scan only the relevant partitions instead of the entire table.
  * **Lower costs** - Fewer bytes scanned means lower BigQuery compute charges, especially under [on-demand pricing](/guides/database/bigquery/capacity-pricing).
  * **Higher quotas** - Partitioned tables have higher [DML quotas](https://cloud.google.com/bigquery/quotas#standard_tables) than unpartitioned tables.
  * **Partition-level management** - You can write to, delete, or expire individual partitions without affecting the rest of the table.
</Accordion>

<Accordion title="What are the different partitioning strategies?">
  | Partitioning type                                                                                                | Description                                                                                                               | Artie recommendation                                                                                                                                                   |
  | ---------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
  | [Time partitioning](https://cloud.google.com/bigquery/docs/partitioned-tables#date_timestamp_partitioned_tables) | Partitions based on a `TIMESTAMP`, `DATE`, or `DATETIME` column. Supports hourly, daily, monthly, and yearly granularity. | **Recommended.** Works well with CDC workloads where data naturally has a time dimension.                                                                              |
  | [Integer range](https://cloud.google.com/bigquery/docs/partitioned-tables#integer_range)                         | Partitions based on value ranges of an integer column (e.g. `customer_id` 0–9 in one partition, 10–19 in the next).       | Useful when your data has no natural time column but has a well-distributed integer key.                                                                               |
  | [Ingestion time](https://cloud.google.com/bigquery/docs/partitioned-tables#ingestion_time)                       | Partitions based on when rows were inserted into BigQuery.                                                                | **Not recommended.** Artie's merge operations update existing rows, which can create duplicates across partitions since the ingestion timestamp changes on each merge. |
</Accordion>

## Related

* [BigQuery destination setup](/destinations/bigquery) - Configure BigQuery as a destination in Artie, including queue priority settings.
* [Capacity pricing](/guides/database/bigquery/capacity-pricing) - Understand BigQuery's pricing models and avoid unexpected costs.
