CDC Pipeline Using a Single Slot
Debezium can capture change data (CDC) from multiple tables within the same PostgreSQL database using just one replication slot. This means it doesn’t need to create a separate slot for each table - it streams all table changes through a single slot, which reduces overhead and makes replication more efficient.
Instead of maintaining one replication slot per table, all relevant changes are multiplexed through a single slot, minimizing replication slot overhead, simplifying resource management, and ensuring consistent log consumption across the subscribed tables.
Pipeline creation in DeltaStream
Example Scenario
Create a CDC pipeline that captures changes from two PostgreSQL tables—public.customers and test.users—and writes each table’s changes into a corresponding Snowflake table.
Step 1
Run DDL to define a single CDC source to read changes that Debezium captured for source tables.
When capturing changes from multiple tables with different schemas, the before and after fields, representing the state of a row before and after a change, are defined as BYTES. This choice allows flexibility, as using a fixed schema would not accommodate the variety in column names and data types across different source tables. The BYTES data type enables a generic representation of these changes until they are transformed and written into a destination with defined schemas.
CREATE STREAM cdc_raw_source (
op VARCHAR,
ts_ms BIGINT,
`before` BYTES,
`after` BYTES,
`source` STRUCT<`schema` VARCHAR, `table` VARCHAR>
) WITH (
'store'='postgres_store',
'value.format'='json',
'postgresql.cdc.table.list'='public.customers,test.users');When capturing changes from multiple tables, set the postgresql.cdc.table.list source property to a comma-separated list of fully qualified table names in Postgres in the form <schema>.<table>.
Step 2
Run query to write CDC records for all tables into a shared Kafka topic.
Step 3
Run one query per table to write its captured changes into its own dedicated Kafka topic.
Step 4
you can use "Generate Stream DDL" command to obtain the tables' schema, if the input tables' schema is complex.
Step 5
Run a DDL per topic to define table specific CDC source for the table.
Step 6
Write a query for each source to sink its changes into a Snowflake table
Last updated

