# MongoDB

> Learn how to use Artie to replicate data from MongoDB via change streams.

Artie syncs data directly from MongoDB [change streams](https://www.mongodb.com/docs/manual/changeStreams/), capturing insert, update, and delete operations in real time.

<Note>
  Your MongoDB server **must** be part of a replica set. If your deployment only has a `standalone` server, you can convert it into a replica set with one member.

  Need help? Check out this [guide](https://www.mongodb.com/docs/manual/tutorial/convert-standalone-to-replica-set/).
</Note>

<Info>
  Using AWS DocumentDB? See our [DocumentDB guide](/sources/documentdb) instead.
</Info>

## Required settings

* Connection string
* Service account

### Connection string

1. Go to the Atlas UI
2. Find your cluster and click "Connect"

We support both the MongoDB SRV format and the standard connection string format.
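
As a rough illustration of the two formats, the sketch below tells them apart by their scheme prefix. The `connectionStringFormat` helper and the example hosts and credentials are hypothetical placeholders; they are not part of Artie or the MongoDB driver.

```js
// Hypothetical helper: classifies a MongoDB connection string by its scheme.
function connectionStringFormat(uri) {
  if (uri.startsWith("mongodb+srv://")) return "srv";
  if (uri.startsWith("mongodb://")) return "standard";
  throw new Error("Not a MongoDB connection string");
}

// Placeholder values; substitute your own user, password, and hosts.
const srvUri = "mongodb+srv://artie:<password>@cluster0.example.mongodb.net/";
const standardUri =
  "mongodb://artie:<password>@host1.example.com:27017,host2.example.com:27017/?replicaSet=rs0";

console.log(connectionStringFormat(srvUri));
console.log(connectionStringFormat(standardUri));
```

The SRV form names a single DNS hostname, while the standard form lists each host and port explicitly.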

<Accordion title="Retrieve connection string from Atlas">
  <img src="https://mintcdn.com/artie/cR74rDu7gj_LCvTI/assets/mongodb_connection_string.png?fit=max&auto=format&n=cR74rDu7gj_LCvTI&q=85&s=096e3c6b75e5c4f47b15cb0a4dabda69" alt="MongoDB connection string" width="755" height="486" data-path="assets/mongodb_connection_string.png" />
</Accordion>

### Service account

You can create a service account through the Atlas UI or by running a script.

<Accordion title="Option #1 - Atlas UI">
  * Click `Database Access` in the left sidebar
  * Click `Add New Database User`
  * Under `Database User Privileges`, open `Built-in Role` and select `Only read any database`

      <img src="https://mintcdn.com/artie/cR74rDu7gj_LCvTI/assets/mongodb_atlas.png?fit=max&auto=format&n=cR74rDu7gj_LCvTI&q=85&s=5ba4ab3f4253bb1c26176672b3179d8b" alt="MongoDB Atlas" width="921" height="687" data-path="assets/mongodb_atlas.png" />
</Accordion>

<Accordion title="Option #2 - Service account script">
  ```js theme={null}
  /* If the user does not exist */
  use admin;
  db.createUser({ user: "artie", pwd: "<password>", roles: ["readAnyDatabase", { role: "read", db: "local" }] });

  /* If the user already exists. Note that updateUser replaces the user's existing roles. */
  db.updateUser("artie", { roles: ["readAnyDatabase", { role: "read", db: "local" }] });
  ```
</Accordion>

## Advanced

<Accordion title="Enabling full document before change">
  <Note>This feature requires MongoDB 6.0 or later.</Note>

  If you are replicating a MongoDB collection into a partitioned table downstream, consider enabling this setting so that the full document before the change is available for deletes.

  A delete event otherwise contains only the document key, so Artie needs the previous version of the document to read the partition field(s) and select the right partition downstream.

  To enable this, run the following commands:

  ```js theme={null}
  // Enable preAndPostImage on the replica set
  use admin;
  db.runCommand({
     setClusterParameter: {
        changeStreamOptions: {
           preAndPostImages: {
              expireAfterSeconds: 100
           }
        }
     }
  });

  // Enable preAndPostImages on the collection
  use databaseName;
  db.runCommand({ collMod: "collectionName", changeStreamPreAndPostImages: { enabled: true } });

  // View the current cluster-level setting
  db.adminCommand({ getClusterParameter: "changeStreamOptions" });

  // To disable the expiration, set expireAfterSeconds to "off".
  // setClusterParameter must run against the admin database, hence adminCommand.
  db.adminCommand({
     setClusterParameter: {
        changeStreamOptions: {
           preAndPostImages: { expireAfterSeconds: "off" }
        }
     }
  });
  ```
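
  To illustrate why the pre-image matters, here is a minimal sketch of reading a partition field from a change event. `operationType`, `documentKey`, `fullDocument`, and `fullDocumentBeforeChange` are standard change event fields; the `partitionValue` helper and the `created_at` field are hypothetical, and this is not Artie's actual implementation.

  ```js
  // Hypothetical helper: picks the partition value for a change event.
  function partitionValue(event, partitionField) {
    // Delete events carry no fullDocument, so the pre-image is the only source.
    const doc = event.operationType === "delete"
      ? event.fullDocumentBeforeChange
      : event.fullDocument;
    return doc ? doc[partitionField] : undefined;
  }

  const deleteWithPreImage = {
    operationType: "delete",
    documentKey: { _id: 1 },
    fullDocumentBeforeChange: { _id: 1, created_at: "2023-03-16" },
  };
  const deleteWithoutPreImage = { operationType: "delete", documentKey: { _id: 1 } };

  console.log(partitionValue(deleteWithPreImage, "created_at"));    // 2023-03-16
  console.log(partitionValue(deleteWithoutPreImage, "created_at")); // undefined
  ```

  Without the pre-image, the second delete gives no way to tell which partition the row lived in.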
</Accordion>

<Accordion title="How do you handle typing and nested fields?">
  **Data types:** Artie determines the data type for each field by looking at the first non-null value and using its BSON type. This type is then mapped to the appropriate destination column type.
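
  The "first non-null value" rule can be sketched roughly as follows. This is an illustration only, not Artie's implementation: `typeName` and `inferFieldType` are hypothetical helpers, and plain JavaScript values stand in for BSON values, so distinctions such as int versus double are lost.

  ```js
  // Hypothetical sketch: names the type of a plain JavaScript value.
  // Real BSON has more types (for example int vs. double); this is simplified.
  function typeName(value) {
    if (Array.isArray(value)) return "array";
    if (value instanceof Date) return "date";
    if (typeof value === "boolean") return "bool";
    if (typeof value === "number") return "double";
    if (typeof value === "string") return "string";
    return "object";
  }

  // Returns the type of the first non-null value seen for `field`, or null if none.
  function inferFieldType(documents, field) {
    for (const doc of documents) {
      const value = doc[field];
      if (value !== null && value !== undefined) return typeName(value);
    }
    return null;
  }

  const docs = [{ age: null }, { name: "a" }, { age: 42 }, { age: "forty-two" }];
  console.log(inferFieldType(docs, "age")); // double
  ```

  In this sketch, later values of a different type do not change the inferred type.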

  **Nested objects:** Nested documents and arrays are preserved as JSON objects in the destination. Artie does not flatten or unfurl nested structures into separate columns.

  For example, if the source document contains:

  ```json theme={null}
  {
    "test_nested_object": {
      "a": {"b": {"c": "hello"}},
      "test_timestamp": {"$timestamp": { "t": 1678929517, "i": 1 }},
      "super_nested": {
        "test_timestamp": {"$timestamp": { "t": 1678929517, "i": 1 }},
        "foo": "bar"
      }
    }
  }
  ```

  Artie will output `test_nested_object` as:

  ```json theme={null}
  {
    "a": {
      "b": {
        "c": "hello"
      }
    },
    "super_nested": {
      "foo": "bar",
      "test_timestamp": "2023-03-16T01:18:37Z"
    },
    "test_timestamp": "2023-03-16T01:18:37Z"
  }
  ```

  Note that BSON-specific types (like `$timestamp`) are converted to standard formats. If the destination does not support a `VARIANT` or semi-structured type, the nested object will be stored as a JSON string.
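
  The `$timestamp` conversion in the example above can be reproduced with a short sketch. It assumes the `t` field holds seconds since the Unix epoch and drops the ordinal `i`; `convertTimestamps` is a hypothetical helper, not Artie's code.

  ```js
  // Hypothetical sketch: recursively replaces {"$timestamp": {t, i}} with an ISO 8601 string.
  function convertTimestamps(value) {
    if (Array.isArray(value)) return value.map(convertTimestamps);
    if (value === null || typeof value !== "object") return value;
    if (value.$timestamp && typeof value.$timestamp.t === "number") {
      // `t` is seconds since the Unix epoch; the ordinal `i` is dropped here.
      return new Date(value.$timestamp.t * 1000).toISOString().replace(".000Z", "Z");
    }
    return Object.fromEntries(
      Object.entries(value).map(([key, nested]) => [key, convertTimestamps(nested)])
    );
  }

  const nested = {
    a: { b: { c: "hello" } },
    test_timestamp: { $timestamp: { t: 1678929517, i: 1 } },
    super_nested: { test_timestamp: { $timestamp: { t: 1678929517, i: 1 } }, foo: "bar" },
  };
  console.log(JSON.stringify(convertTimestamps(nested), null, 2));
  ```

  Both `test_timestamp` values come out as `"2023-03-16T01:18:37Z"`, matching the output shown above (key order may differ).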
</Accordion>
