Setting up Pub/Sub
Overview
In this tutorial, we will learn how to run Debezium Server with the Pub/Sub sink and Artie Transfer locally using Docker.
Prerequisites
- Terraform
- Docker
- gcloud CLI
- GCP Project
Set-up
gcloud CLI
Please visit this link to download the CLI. Once you have done so, run this command to set up Application Default Credentials:
gcloud auth application-default login
Pub/Sub API
To use Pub/Sub in your GCP project, you will also need to enable the Pub/Sub API. Visit this link to enable it.
Creating a service account
locals {
  project = "PROJECT_ID"
  role    = "roles/pubsub.editor"
}

provider "google" {
  project = local.project
}

resource "google_service_account" "artie-svc-account" {
  account_id   = "artie-service-account"
  display_name = "Service Account for Artie Transfer and Debezium"
}

resource "google_project_iam_member" "transfer" {
  project = local.project
  role    = local.role
  member  = "serviceAccount:${google_service_account.artie-svc-account.email}"
}
$ terraform init
$ terraform plan
$ terraform apply
Download the service account credentials
Once your service account has been created, head to the GCP console and create a key for the service account. Save the key file, as we will reference its path in later steps.
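Before wiring the key into Docker, it can help to confirm that the downloaded file really is a service account key. The sketch below is a minimal sanity check, not part of Artie Transfer or Debezium; the function name `check_service_account_key` and the choice of which fields to require are assumptions made for illustration.

```python
import json
from pathlib import Path

# Fields that a service account key JSON file is expected to carry.
# (Assumption: we only check a small subset that the later steps rely on.)
REQUIRED_FIELDS = ("type", "project_id", "client_email", "private_key")


def check_service_account_key(path):
    """Return a list of problems found in the key file (empty list means OK)."""
    data = json.loads(Path(path).read_text())
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if not data.get(name)]
    # A user credential file has a different "type" and will not work here.
    if data.get("type") and data["type"] != "service_account":
        problems.append(f"unexpected type: {data['type']!r}")
    return problems
```

Calling `check_service_account_key("/path/to/key.json")` returns an empty list when the file looks usable, and a list of human-readable problems otherwise.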
Create the Pub/Sub topic and subscriptions
Debezium will not automatically create topics or subscriptions for you.
resource "google_pubsub_topic" "customer_topic" {
  name    = "dbserver1.inventory.customers"
  project = local.project

  timeouts {}
}

resource "google_pubsub_subscription" "customer_subscription" {
  ack_deadline_seconds         = 300
  enable_exactly_once_delivery = false
  enable_message_ordering      = true
  message_retention_duration   = "604800s"
  name                         = "transfer_${google_pubsub_topic.customer_topic.name}"
  project                      = local.project
  retain_acked_messages        = false
  topic                        = google_pubsub_topic.customer_topic.id

  timeouts {}
}
$ terraform plan
$ terraform apply
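Since the topics and subscriptions have to be created by hand, a typo in a name only surfaces when `terraform apply` or the pipeline fails. The following is a small sketch that checks a candidate name against the Pub/Sub resource naming rules as I understand them (starts with a letter, 3 to 255 characters, a limited character set, and no `goog` prefix). The helper name `is_valid_pubsub_id` is hypothetical.

```python
import re

# Letter first, then 2-254 more characters from the allowed set.
_RESOURCE_ID = re.compile(r"[A-Za-z][A-Za-z0-9\-_.~+%]{2,254}")


def is_valid_pubsub_id(name):
    """Return True if `name` looks like a valid Pub/Sub topic or subscription ID."""
    if name.startswith("goog"):
        return False  # names beginning with "goog" are reserved
    return _RESOURCE_ID.fullmatch(name) is not None


if __name__ == "__main__":
    topic = "dbserver1.inventory.customers"
    for candidate in (topic, f"transfer_{topic}"):
        print(candidate, is_valid_pubsub_id(candidate))
```

Both the topic name and the `transfer_`-prefixed subscription name used in this tutorial pass this check.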
Running Debezium
Within the pubsub examples folder, modify application.properties so that debezium.sink.pubsub.project.id is set to your GCP project ID, replacing the PROJECT_ID placeholder. If you need help locating your GCP project ID, see #getting-your-project-identifier.
debezium.source.offset.storage.file.filename=/tmp/foo
debezium.source.offset.flush.interval.ms=0
debezium.sink.type=pubsub
debezium.sink.pubsub.project.id=PROJECT_ID
debezium.sink.pubsub.ordering.enabled=true
debezium.source.connector.class=io.debezium.connector.postgresql.PostgresConnector
debezium.source.database.hostname=postgres
debezium.source.database.port=5432
debezium.source.database.user=postgres
debezium.source.database.password=postgres
debezium.source.database.dbname=postgres
debezium.source.topic.prefix=dbserver1
debezium.source.table.include.list=inventory.customers
debezium.source.plugin.name=pgoutput
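The Debezium PostgreSQL connector names each topic `<topic.prefix>.<schema>.<table>`, which is why the topic in this tutorial is `dbserver1.inventory.customers`. The sketch below reads the properties shown above and derives the topic names that need to exist in Pub/Sub, along with a check that the `PROJECT_ID` placeholder was replaced. The helper names are hypothetical, and the parser is deliberately simplified: it assumes plain `key=value` lines and a literal, comma-separated `table.include.list` (Debezium also accepts regular expressions there, which this does not expand).

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def expected_topics(props):
    """Derive '<prefix>.<schema>.<table>' for each literal table entry."""
    prefix = props["debezium.source.topic.prefix"]
    tables = props["debezium.source.table.include.list"].split(",")
    return [f"{prefix}.{table.strip()}" for table in tables if table.strip()]


def project_id_is_set(props):
    """Return True once the PROJECT_ID placeholder has been replaced."""
    project_id = props.get("debezium.sink.pubsub.project.id", "")
    return bool(project_id) and project_id != "PROJECT_ID"
```

For example, `expected_topics(parse_properties(open("application.properties").read()))` lists the topics to create before starting Debezium Server.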
Running Transfer
Below is the config.yaml. It uses the test output source, which just prints the query commands to the terminal. Make sure to also fill out projectID with your own GCP project ID. See options.md for all the possible options for your configuration file, and examples.md for examples.
outputSource: test
queue: pubsub
pubsub:
  projectID: artie-labs
  pathToCredentials: /tmp/credentials/service-account.json
  topicConfigs:
    - db: customers
      tableName: customers
      schema: public
      topic: "dbserver1.inventory.customers"
      cdcFormat: debezium.postgres.wal2json
      cdcKeyFormat: org.apache.kafka.connect.json.JsonConverter
telemetry:
  metrics:
    provider: datadog
    settings:
      tags:
        - env:production
      namespace: "transfer."
      addr: "127.0.0.1:8125"
Docker Compose File
Now, within the docker-compose.yaml file, replace each REPLACE_ME with the path to the credentials file that you downloaded in the earlier step (#download-the-service-account-credentials).
version: '3.9'
services:
  postgres:
    image: quay.io/debezium/example-postgres:2.0
    ports:
      - 5432:5432
    environment:
      - POSTGRES_USER=postgres
      - POSTGRES_PASSWORD=postgres
  debezium-server:
    image: quay.io/debezium/server:2.0
    container_name: debezium-server
    command: sh -c "sleep 15 && /debezium/run.sh"
    environment:
      GOOGLE_APPLICATION_CREDENTIALS: /tmp/credentials/service-account.json
    links:
      - postgres
    ports:
      - 8080:8080
    volumes:
      - ./application.properties:/debezium/conf/application.properties
      - REPLACE_ME:/tmp/credentials/service-account.json
    depends_on:
      - postgres
  transfer:
    build:
      context: .
      dockerfile: Dockerfile
    volumes:
      - REPLACE_ME:/tmp/credentials/service-account.json
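The `REPLACE_ME` placeholder appears in two volume mounts, and both need the same host path. If you would rather not edit the file by hand, the substitution can be sketched as below. This is only an illustration: `fill_credentials_path` is a hypothetical helper, and it assumes the placeholder always appears as the host side of a `REPLACE_ME:` mount.

```python
from pathlib import Path


def fill_credentials_path(compose_text, key_path):
    """Replace every 'REPLACE_ME:' host path with the absolute path to the key file."""
    # Docker bind mounts are least surprising with an absolute host path.
    absolute = str(Path(key_path).expanduser().absolute())
    return compose_text.replace("REPLACE_ME:", absolute + ":")


def update_compose_file(compose_path, key_path):
    """Rewrite the compose file in place with the placeholder filled in."""
    path = Path(compose_path)
    path.write_text(fill_credentials_path(path.read_text(), key_path))
```

For example, `update_compose_file("docker-compose.yaml", "~/keys/service-account.json")` fills in both mounts in one pass.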
Putting everything together
When running this, the PostgreSQL database already contains some seeded data. As a result, you will see a merge statement being issued to add the seeded data.
Now that PostgreSQL is running locally on 0.0.0.0:5432, you can open a SQL editor to interact with the data model. In the example below, we update the first_name of a customer, and the change is streamed directly to Artie.
We hope you found this tutorial helpful.
- The code for this tutorial can be found here.
- To understand how Artie Transfer works with Google Pub/Sub under the hood, please click on this link.
- If you run into any other issues, please file a bug report on our GitHub page or get in touch at [email protected].