# MySQL Source Connector: Setup and Configuration

> Configure MySQL as a source in Artie with binlog replication, service account setup, GTID support, and automatic gh-ost migration handling.

## Required settings

* Database connection
* Service account
* Binary logging enabled with `binlog_format` set to `ROW`
* Binlog retention of at least 24 hours (`binlog retention hours` on AWS RDS / Aurora)

## Setup

<Steps>
  <Step title="Create a service account">
    Run the following SQL to create a dedicated user for Artie:

    ```sql setup.sql theme={null}
    CREATE USER 'artie' IDENTIFIED BY 'password';
    -- SELECT is required for backfills.
    -- REPLICATION CLIENT and REPLICATION SLAVE are required for Artie to read the binary log.
    GRANT SELECT, REPLICATION CLIENT, REPLICATION SLAVE ON *.* TO 'artie';
    ```
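
    To confirm the grants took effect, a quick check like the one below can help. It assumes the user was created as `'artie'` with the default `'%'` host, as in the statement above.

    ```sql
    -- Lists the privileges held by the artie user. Expect SELECT,
    -- REPLICATION SLAVE, and REPLICATION CLIENT on *.*
    SHOW GRANTS FOR 'artie';
    ```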
  </Step>

  <Step title="Enable binlog (ROW format)">
    Binary logging must be enabled with `binlog_format` set to `ROW`.

    <Tabs>
      <Tab title="AWS RDS / Aurora">
        You will need to create or update a parameter group for your RDS instance.

        <Note>
          For Aurora clusters, create a **DB cluster parameter group** instead of a DB parameter group.
        </Note>

        <Steps>
          <Step title="Create a parameter group">
            Navigate to **RDS** > **Parameter groups** > **Create parameter group**. Select the appropriate family for your MySQL version.
          </Step>

          <Step title="Set binlog_format to ROW">
            Open the newly created parameter group, search for `binlog_format`, and set it to `ROW`.
          </Step>

          <Step title="Apply the parameter group">
            Associate the parameter group with your RDS instance (or Aurora cluster), then reboot the database for the change to take effect.
          </Step>
        </Steps>
      </Tab>

      <Tab title="Self-hosted MySQL">
        Add the following to your MySQL configuration file (`my.cnf` or `my.ini`):

        ```ini theme={null}
        [mysqld]
        binlog_format = ROW
        ```

        Then restart the MySQL server.
      </Tab>
    </Tabs>
  </Step>

  <Step title="Set binlog retention period">
    Artie requires at least 24 hours of binlog retention. On AWS RDS / Aurora, `binlog retention hours` defaults to `NULL`, which means binary logs are not retained, so it must be set explicitly.

    <Tabs>
      <Tab title="AWS RDS / Aurora">
        ```sql theme={null}
        CALL mysql.rds_set_configuration('binlog retention hours', 24);
        ```
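
        To confirm the new value, RDS provides a companion procedure that prints the current configuration:

        ```sql
        -- Shows the current 'binlog retention hours' value among the RDS settings.
        CALL mysql.rds_show_configuration;
        ```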
      </Tab>

      <Tab title="Azure">
        ```sql theme={null}
        SET GLOBAL binlog_expire_logs_seconds = 604800; -- 604800 seconds = 7 days
        ```
      </Tab>

      <Tab title="GCP CloudSQL / Self-hosted">
        No action needed - binlog retention is managed automatically.
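
        If you want to check the effective retention on a self-hosted server, the sketch below may help. Which variable applies depends on the MySQL version: `binlog_expire_logs_seconds` exists in MySQL 8.0 and later, while MySQL 5.7 uses `expire_logs_days`.

        ```sql
        -- MySQL 8.0+: retention in seconds (86400 = 24 hours; 0 means no automatic purging).
        SHOW VARIABLES LIKE 'binlog_expire_logs_seconds';
        -- MySQL 5.7: retention in days (0 means no automatic purging).
        SHOW VARIABLES LIKE 'expire_logs_days';
        ```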
      </Tab>
    </Tabs>
  </Step>
</Steps>

## Additional features

* **GTID support** - when enabled, Artie automatically uses GTID-based replication for more resilient syncing.

<Accordion title="Enabling GTID">
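  GTID mode has to be enabled in stages. `enforce_gtid_consistency` must be `ON` before `gtid_mode` can be switched to `ON`, and `gtid_mode` can only move one step at a time (`OFF`, `OFF_PERMISSIVE`, `ON_PERMISSIVE`, `ON`). Run the following on your MySQL server, in order:

  ```sql theme={null}
  SET GLOBAL enforce_gtid_consistency = ON;
  SET GLOBAL gtid_mode = OFF_PERMISSIVE;
  SET GLOBAL gtid_mode = ON_PERMISSIVE;
  SET GLOBAL gtid_mode = ON;
  ```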

  You can verify the settings with:

  ```sql theme={null}
  SHOW VARIABLES LIKE 'gtid_mode';
  SHOW VARIABLES LIKE 'enforce_gtid_consistency';
  ```
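
  If GTID mode is being switched on while the server is taking writes, MySQL's documented online procedure waits for anonymous (non-GTID) transactions to drain before `gtid_mode` is finally set to `ON`. A sketch of that check:

  ```sql
  -- Should report 0 before gtid_mode is set to ON.
  SHOW STATUS LIKE 'ONGOING_ANONYMOUS_TRANSACTION_COUNT';
  ```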

  <Info>
    Once GTID is enabled, Artie will automatically detect it and switch to GTID-based replication. No additional configuration is needed in the Artie dashboard.
  </Info>
</Accordion>

* **Automatic [gh-ost](https://github.com/github/gh-ost) migration handling** - Artie detects gh-ost schema migrations and processes them seamlessly.

<Accordion title="How does Artie handle gh-ost migrations?">
  [gh-ost](https://github.com/github/gh-ost) is a popular tool for online schema migrations in MySQL. During a gh-ost migration, a ghost table is created and data is copied over before the tables are swapped.

  Artie automatically detects gh-ost ghost tables in the binlog stream and handles the table swap transparently - no manual intervention or pipeline restarts required.
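
  For reference, gh-ost by default names its working tables after the original table: `_<table>_gho` for the ghost table, `_<table>_ghc` for its changelog table, and `_<table>_del` for the old table after the swap. To see whether a migration is currently in flight, a query like the one below can be run in the affected schema. It only illustrates the naming convention; how Artie detects these tables internally is not described here.

  ```sql
  -- Lists gh-ost ghost tables in the current schema.
  -- Underscores are escaped because "_" is a single-character wildcard in LIKE.
  SHOW TABLES LIKE '\_%\_gho';
  ```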
</Accordion>

## Frequently asked questions

### Which MySQL versions are supported?

Artie supports MySQL 5.7 and above, including MySQL 8.x. We also support Amazon Aurora MySQL, Azure Database for MySQL, and GCP CloudSQL for MySQL.

### Can I replicate from a read replica?

Not for CDC. Artie requires access to the binary log, which is only available on the primary (writer) instance. However, Artie can connect to a read replica for backfills.

### How do I verify that binlog is enabled?

Run the following query on your MySQL instance:

```sql theme={null}
SHOW VARIABLES LIKE 'binlog_format';
```

The result should show `ROW`. If it shows `STATEMENT` or `MIXED`, update your configuration as described in the [setup steps above](#setup).
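
Binary logging itself must also be switched on. A minimal sketch that checks both settings in one query:

```sql
-- log_bin should be 1 (ON) and binlog_format should be ROW.
SELECT @@global.log_bin AS log_bin, @@global.binlog_format AS binlog_format;
```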

### What happens if binlog retention is too short?

If binlog files are purged before Artie can read them, the pipeline will need to re-snapshot the affected tables. We recommend at least 24 hours of retention to provide a comfortable buffer.
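
To see which binlog files are still on the server, the following can be run with the `REPLICATION CLIENT` privilege granted during setup:

```sql
-- Lists the binary log files currently retained on the server and their sizes.
SHOW BINARY LOGS;
```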
