# DynamoDB Source Connector: Setup and Configuration

> Configure DynamoDB as a source in Artie using DynamoDB Streams, point-in-time recovery backfills, and S3 export for efficient data replication.

## Required settings

* DynamoDB Streams ARN. The stream's view type must be `NEW_IMAGE` or `NEW_AND_OLD_IMAGES`.
* Point-in-time recovery (PITR) enabled on the DynamoDB table, if you want the table backfilled.
* Service account

<Accordion title="Finding DynamoDB Streams ARN">
  <img src="https://mintcdn.com/artie/cR74rDu7gj_LCvTI/assets/dynamodb_streams.png?fit=max&auto=format&n=cR74rDu7gj_LCvTI&q=85&s=cc6d45344a60ae98a23962ed728c98d7" alt="DynamoDB Streams" width="1440" height="685" data-path="assets/dynamodb_streams.png" />
</Accordion>
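
If you manage the table with Terraform, the stream and point-in-time recovery settings listed above can be sketched as follows. This is a minimal sketch rather than a required configuration: the resource label, table name, key schema, and billing mode are hypothetical placeholders, and only the stream and PITR arguments relate to the settings above.

```hcl theme={null}
# Hypothetical table: the name, key schema, and billing mode are placeholders.
resource "aws_dynamodb_table" "example" {
  name         = "example-table"
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "id"

  attribute {
    name = "id"
    type = "S"
  }

  # The view type must be NEW_IMAGE or NEW_AND_OLD_IMAGES.
  stream_enabled   = true
  stream_view_type = "NEW_AND_OLD_IMAGES"

  # Only needed if you want the table backfilled.
  point_in_time_recovery {
    enabled = true
  }
}

# The DynamoDB Streams ARN to provide when configuring the source.
output "dynamodb_stream_arn" {
  value = aws_dynamodb_table.example.stream_arn
}
```

After `terraform apply`, `terraform output dynamodb_stream_arn` prints the ARN.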

<Accordion title="Creating a service account (Terraform)">
  ```hcl theme={null}
  provider "aws" {
    region = "us-east-1"
  }

  resource "aws_iam_role" "dynamodb_streams_role" {
    name = "DynamoDBStreamsRole"

    assume_role_policy = jsonencode({
      Version   = "2012-10-17",
      Statement = [
        {
          Action = "sts:AssumeRole",
          Principal = {
            Service = "ec2.amazonaws.com"
          },
          Effect = "Allow",
          Sid    = ""
        }
      ]
    })
  }

  resource "aws_iam_policy" "dynamodb_streams_access" {
    name        = "DynamoDBStreamsAccess"
    description = "Policy that grants access to DynamoDB streams and exports for Artie"

    policy = jsonencode({
      Version   = "2012-10-17",
      Statement = [
        {
          Effect = "Allow",
          Action = [
            "dynamodb:GetShardIterator",
            "dynamodb:DescribeStream",
            "dynamodb:GetRecords",
            "dynamodb:ListStreams",

            // We'll need this to check if the table has PITR enabled
            "dynamodb:DescribeContinuousBackups",

            // Required for export
            "dynamodb:DescribeTable",
            "dynamodb:ListExports",
            "dynamodb:DescribeExport",
            "dynamodb:ExportTableToPointInTime"
          ],
          // Don't want to use "*"? Restrict access to specific tables and streams instead, for example:
          // Resource = ["<TABLE_ARN>", "<TABLE_ARN>/stream/*"]
          Resource = "*"
        },
        // Export (snapshot) requires access to S3
        {
          Effect = "Allow",
          Action = [
            "s3:ListBucket",
            // Bucket-level action, so it is granted on the bucket ARN
            "s3:GetBucketLocation"
          ],
          Resource = "arn:aws:s3:::artie-exports"
        },
        {
          Effect = "Allow",
          Action = [
            "s3:GetObject",
            // Required for export
            "s3:PutObject"
          ],
          Resource = "arn:aws:s3:::artie-exports/*"
        }
      ]
    })
  }

  resource "aws_iam_role_policy_attachment" "dynamodb_streams_role_policy_attachment" {
    role       = aws_iam_role.dynamodb_streams_role.name
    policy_arn = aws_iam_policy.dynamodb_streams_access.arn
  }

  output "service_role_arn" {
    value = aws_iam_role.dynamodb_streams_role.arn
  }

  resource "aws_iam_user" "dynamodb_streams_user" {
    name = "dynamodb-artie-user"
    path = "/"
  }

  resource "aws_iam_user_policy_attachment" "user_dynamodb_streams_attachment" {
    user       = aws_iam_user.dynamodb_streams_user.name
    policy_arn = aws_iam_policy.dynamodb_streams_access.arn
  }

  resource "aws_iam_access_key" "dynamodb_streams_user_key" {
    user = aws_iam_user.dynamodb_streams_user.name
  }

  output "aws_access_key_id" {
    value     = aws_iam_access_key.dynamodb_streams_user_key.id
    sensitive = true
  }

  output "aws_secret_access_key" {
    value     = aws_iam_access_key.dynamodb_streams_user_key.secret
    sensitive = true
  }
  ```
</Accordion>

## Backfills

To backfill a DynamoDB table, Artie exports the table's data to S3 and then loads the exported data into the target table.

This approach is preferable because the export does not consume read capacity units (RCUs) from the source table. You pay only for data transfer costs, which are a fraction of the cost of the equivalent RCUs.
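
For context, the export step corresponds to DynamoDB's `ExportTableToPointInTime` operation, which the service account policy permits. The sketch below expresses the same kind of export in Terraform, assuming the AWS provider's `aws_dynamodb_table_export` resource. The `table_arn` variable and the resource label are hypothetical, and the bucket name is taken from the policy example. The export is performed for you during a backfill, so this is an illustration of the operation, not something you need to apply.

```hcl theme={null}
# Illustration only: the backfill performs this export on your behalf.
variable "table_arn" {
  type        = string
  description = "ARN of the DynamoDB table to export (placeholder)"
}

# The table must have point-in-time recovery enabled for the export to succeed.
resource "aws_dynamodb_table_export" "illustration" {
  table_arn = var.table_arn
  s3_bucket = "artie-exports"
}
```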
