Data Store

How DeltaStream works with your data stores

In DeltaStream, data stores are where your streaming data resides. Apache Kafka and Amazon Kinesis are two types of data stores. DeltaStream reads data from streaming data stores, performs the computation you define, and writes the results either back to the same data store or to a different one.

You own and manage your data stores. To let DeltaStream access the data in a store, you declare the store in DeltaStream and provide its connectivity and access details. For instance, if you have an Apache Kafka cluster provided by Confluent Cloud, you can declare it as a data store in DeltaStream by setting up that connectivity and access. Once the data store is defined, DeltaStream can read from and write to entities in the Kafka cluster.
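The read, compute, and write flow described above can be pictured with a small in-memory model. This is only a conceptual sketch: the `DataStore` class, the `run_query` function, and the entity names are hypothetical and are not part of any DeltaStream API.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DataStore:
    """Toy stand-in for an external store: entity name -> list of records."""
    name: str
    entities: Dict[str, List[dict]] = field(default_factory=dict)

    def read(self, entity: str) -> List[dict]:
        # Return a copy so callers cannot mutate the stored records list.
        return list(self.entities.get(entity, []))

    def write(self, entity: str, records: List[dict]) -> None:
        # Create the entity on first write, then append.
        self.entities.setdefault(entity, []).extend(records)


def run_query(source: DataStore, source_entity: str,
              sink: DataStore, sink_entity: str, min_amount: float) -> int:
    """Read from one entity, filter the records, and write the results."""
    results = [r for r in source.read(source_entity) if r["amount"] >= min_amount]
    sink.write(sink_entity, results)
    return len(results)


# Hypothetical store holding one source entity.
kafka = DataStore("my_kafka", {"orders": [
    {"id": 1, "amount": 5.0},
    {"id": 2, "amount": 50.0},
    {"id": 3, "amount": 120.0},
]})

# Source and sink are the same store here; a different store works the same way.
written = run_query(kafka, "orders", kafka, "large_orders", min_amount=50.0)
print(written, [r["id"] for r in kafka.entities["large_orders"]])
```

Passing a second `DataStore` instance as the sink models the case where results land in a different store from the one that was read.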

DeltaStream supports the following data stores:

Apache Kafka (Amazon MSK, Confluent Cloud, and Redpanda)
Amazon Kinesis
PostgreSQL
Snowflake
Databricks (only as a sink for CTAS queries)
ClickHouse (for materialized views)
Amazon S3
Apache Iceberg with the REST catalog
Apache Iceberg with the AWS Glue catalog

Streaming and Non-streaming Entities

Streaming Entity

A DeltaStream entity is an abstraction over the unit that a physical streaming store uses to organize events. In Apache Kafka data stores, an entity corresponds to a Kafka topic; in Amazon Kinesis data stores, an entity corresponds to a Kinesis data stream.

DeltaStream uses entities to store the data backing streams and changelogs.

Note: You can create, delete, and view the content of entities.

Non-streaming Entity

DeltaStream also uses entities to represent the tables in non-streaming data stores such as PostgreSQL, Snowflake, and Databricks. As with entities in streaming data stores, you use these entities to refer to, inspect, add, or delete tables in those stores.
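The correspondence between store types and what an entity stands for can be summarized as a small lookup. This is a sketch for illustration only: the mapping restates the descriptions above, while the key names and the `describe_entity` helper are hypothetical.

```python
# What an entity corresponds to in each store type, per the text above.
# The lowercase keys are illustrative labels, not DeltaStream identifiers.
ENTITY_KIND = {
    "kafka": "topic",
    "kinesis": "data stream",
    "postgresql": "table",
    "snowflake": "table",
    "databricks": "table",
}

# Store types whose entities are streaming entities.
STREAMING = {"kafka", "kinesis"}


def describe_entity(store_type: str) -> str:
    """Return the entity category and the physical object it represents."""
    key = store_type.lower()
    if key not in ENTITY_KIND:
        raise ValueError(f"no entity mapping listed for store type: {store_type}")
    category = "streaming" if key in STREAMING else "non-streaming"
    return f"{category} entity -> {ENTITY_KIND[key]}"


print(describe_entity("Kafka"))
print(describe_entity("Snowflake"))
```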

Schema Registry

A schema registry is a centralized repository for managing and validating the schemas of data in Apache Kafka topics. In DeltaStream, a schema registry represents a schema registry service for Apache Kafka clusters.

Here's an example: If you use Confluent Cloud with a schema registry service, you can define a schema registry in DeltaStream that represents that service. You can then assign the schema registry to the data stores that use it. DeltaStream uses the assigned schema registry to fetch topic schemas and deserialize topic content.
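To make the "fetch the schema, then deserialize" step concrete, here is a minimal sketch of how a consumer can resolve a schema for a message written with Confluent's serializers, which prefix each message with a magic byte of 0 and a 4-byte big-endian schema ID. It illustrates the general technique and does not describe DeltaStream's internals. The in-memory `FAKE_REGISTRY`, the `decode` helper, and the JSON payload are assumptions made to keep the example self-contained; real topics often carry Avro or Protobuf payloads, and a real consumer would look the schema up from the registry service.

```python
import json
import struct

MAGIC_BYTE = 0  # first byte of the Confluent wire format

# Hypothetical stand-in for a schema registry: schema ID -> expected field names.
FAKE_REGISTRY = {42: ["user_id", "page"]}


def split_wire_format(message: bytes):
    """Split a framed message into (schema_id, payload)."""
    if len(message) < 5:
        raise ValueError("message too short for a schema registry header")
    # ">bI" = big-endian, 1 signed byte (magic) + 4-byte unsigned int (schema ID).
    magic, schema_id = struct.unpack(">bI", message[:5])
    if magic != MAGIC_BYTE:
        raise ValueError(f"unexpected magic byte: {magic}")
    return schema_id, message[5:]


def decode(message: bytes, registry: dict) -> dict:
    """Look up the schema by ID and keep only the fields it names."""
    schema_id, payload = split_wire_format(message)
    fields = registry[schema_id]  # a real client would query the registry service
    record = json.loads(payload)
    return {name: record[name] for name in fields}


# Build a sample framed message: header followed by a JSON payload.
body = json.dumps({"user_id": 7, "page": "/home"}).encode("utf-8")
message = struct.pack(">bI", MAGIC_BYTE, 42) + body

print(decode(message, FAKE_REGISTRY))
```

Because the schema ID travels with each message, the consumer needs only the registry's location and credentials to interpret the content of a topic.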

