
Database

Data organization in DeltaStream


Last updated 5 months ago

Databases are the foundation for organizing data in DeltaStream. They provide the building block of its namespacing model.

You create databases as logical groupings for different teams or projects. For instance, you can create one database for a logging project and another for an ads team.

Schema

A schema is a logical grouping of relational objects such as streams, changelogs, materialized views, and tables. Schemas are grouped in a database. The combination of databases and schemas enables you to organize your streams, changelogs, and other database objects hierarchically in DeltaStream. These hierarchies are also one of the bases for role-based access control (RBAC) in DeltaStream, as they are in other relational databases.

Relation

DeltaStream provides a relational model for streaming data wherein data is stored in relations. DeltaStream supports the following relation types:

  • Stream

  • Changelog

  • Materialized View

  • Table

In DeltaStream, these relations are the building blocks of your applications and pipelines. You can specify relation names as fully or partially qualified names by specifying a database and/or schema name in the format [<database_name>.<schema_name>.]<relation_name>, like this:

db1.public.pageviews

Otherwise, DeltaStream uses the current database and schema in the scope of a client to identify a relation.
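The name resolution described above can be modeled with a short Python sketch. This is a conceptual illustration only, not DeltaStream code: the function name `resolve_relation`, the `RelationName` type, and the two-part `schema.relation` form are assumptions made for the example, and quoted identifiers that contain dots are not handled.

```python
from typing import NamedTuple


class RelationName(NamedTuple):
    """Fully qualified relation name (hypothetical type for illustration)."""
    database: str
    schema: str
    relation: str


def resolve_relation(name: str, current_database: str, current_schema: str) -> RelationName:
    """Expand a possibly partial relation name using the client's current scope."""
    parts = name.split(".")
    if len(parts) == 3:
        # database.schema.relation: nothing to fill in
        return RelationName(*parts)
    if len(parts) == 2:
        # schema.relation (assumed form): fill in the current database
        return RelationName(current_database, *parts)
    if len(parts) == 1:
        # relation only: fill in the current database and schema
        return RelationName(current_database, current_schema, parts[0])
    raise ValueError(f"too many name parts: {name!r}")


# A bare name is resolved against the current database and schema.
print(resolve_relation("pageviews", "db1", "public"))
# A fully qualified name ignores the current scope.
print(resolve_relation("db1.public.pageviews", "other_db", "other_schema"))
```

Both calls resolve to the same relation, which is the point of the scoping rule: the current database and schema only fill in the parts that the name leaves out.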

Stream

A stream is a sequence of immutable, partitioned, and partially-ordered events.

Tip: DeltaStream uses the terms "events" and "records" synonymously.

  • A stream is a relational representation of data in streaming stores, such as the data in a Kafka topic or a Kinesis stream.

  • The records in a stream are independent of each other; there is no correlation between two records in a stream.

  • A stream declares the schema of the records; this includes the column name, the column type, and optional constraints.

Changelog

As with a stream, a changelog is

  • a sequence of partitioned and partially-ordered events

  • a relational representation of data in the streaming stores, such as the data in a Kafka topic or a Kinesis stream.

A changelog defines a PRIMARY KEY used to represent the change over time for records with the same primary key. Records in a changelog correlate with each other based on the PRIMARY KEY. This means a record in a changelog is either an insert (if it’s the first time a record with the given PRIMARY KEY is appended to the changelog) or an upsert (if a previous record with the same PRIMARY KEY has already been inserted into the changelog).
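To make the insert/upsert distinction concrete, here is a minimal Python sketch that labels each record in a sequence according to whether its primary key has been seen before. It is a conceptual model rather than DeltaStream's implementation; the function name `classify_changes` and the sample `userid` and `city` columns are hypothetical.

```python
from typing import Any, Dict, Iterable, List, Tuple


def classify_changes(
    records: Iterable[Dict[str, Any]], primary_key: str
) -> List[Tuple[str, Dict[str, Any]]]:
    """Label each record as an insert or an upsert based on its primary key."""
    seen = set()
    labeled = []
    for record in records:
        key = record[primary_key]
        # First occurrence of a key is an insert; later occurrences are upserts.
        kind = "upsert" if key in seen else "insert"
        seen.add(key)
        labeled.append((kind, record))
    return labeled


# Hypothetical changelog records keyed by userid.
events = [
    {"userid": "u1", "city": "Oslo"},
    {"userid": "u2", "city": "Lima"},
    {"userid": "u1", "city": "Rome"},  # same key as the first record
]
for kind, record in classify_changes(events, primary_key="userid"):
    print(kind, record)
```

In a stream, by contrast, the three records would be treated as independent events with no relationship between the first and the third.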

Materialized View

A materialized view creates a snapshot of a streaming query result and continuously updates the snapshot as records arrive at the query input(s). A materialized view is queryable in DeltaStream; when you query it, the results are computed from the data in the snapshot at query runtime.

Note: Queries on a materialized view are not streaming queries. They are the same as the queries on tables and materialized views in traditional relational databases.
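The split between continuous maintenance and point-in-time querying can be sketched in Python as follows. This is a toy model under stated assumptions, not how DeltaStream stores a materialized view: the class `PageviewCounts`, its methods, and the `pageid` column are hypothetical, and the "streaming query" is fixed to a count of events per page.

```python
from collections import Counter
from typing import Any, Dict


class PageviewCounts:
    """Toy materialized view: a running count of events per page."""

    def __init__(self) -> None:
        self._snapshot: Counter = Counter()

    def on_record(self, record: Dict[str, Any]) -> None:
        # Continuous maintenance: each arriving record updates the snapshot.
        self._snapshot[record["pageid"]] += 1

    def query(self, pageid: str) -> int:
        # A query reads the snapshot as it is now and then returns;
        # it does not keep running as new records arrive.
        return self._snapshot[pageid]


view = PageviewCounts()
for event in [{"pageid": "home"}, {"pageid": "cart"}, {"pageid": "home"}]:
    view.on_record(event)

print(view.query("home"))  # count for "home" in the current snapshot
```

Running the same query later may return a different answer, because the snapshot keeps changing while each individual query is a one-time read.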

Table

A table is similar to a materialized view in that it stores records from a streaming source. Unlike materialized views, however, tables do not support upserts. Instead, DeltaStream stores all records from a source or an upstream query operation (such as a JOIN or aggregation) as a sequence of records, as they are provided, for the sink that writes to the table. When you use a table with records that have a primary key (for example, records from a changelog), the resulting rows in the table represent the incremental changes to each record key.

Row Key

Each record in a stream or changelog can have a row key. (Defining a row key is optional for a relation.) The value of the key for a given record is extracted from its corresponding message, which is read from the source relation’s entity. For example, if you use a Kafka topic as the relation’s entity, the key bytes of each Kafka message supply the row key value for the corresponding record, based on the relation’s row key definition (if any).

When writing query results to a sink, the records’ keys are written as the messages’ keys into the sink relation’s entity. For example, when the result of a join query is written into a Kafka topic, the row keys of the resulting records are set as the Kafka messages’ keys.

Note: Some operations such as GROUP BY and JOIN impact the row key definition and add row keys to their results’ records.

For more details, see Row Key Definition.
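The way a row key travels from a source message to a sink message can be sketched in Python. This is a simplified model, not DeltaStream behavior in detail: it assumes the key is a plain UTF-8 string (real key formats depend on the relation's row key definition), and the names `read_record`, `write_record`, and `has_row_key` are hypothetical.

```python
from typing import Optional, Tuple

Message = Tuple[Optional[bytes], bytes]  # (key bytes, value bytes)
Record = Tuple[Optional[str], str]       # (row key, value)


def read_record(message: Message, has_row_key: bool) -> Record:
    """Build a record from a source message, extracting the row key if one is defined."""
    key_bytes, value_bytes = message
    if has_row_key and key_bytes is not None:
        row_key = key_bytes.decode("utf-8")  # assumed key encoding
    else:
        row_key = None  # no row key definition, or the message has no key
    return row_key, value_bytes.decode("utf-8")


def write_record(record: Record) -> Message:
    """Turn a result record into a sink message, using the row key as the message key."""
    row_key, value = record
    key_bytes = row_key.encode("utf-8") if row_key is not None else None
    return key_bytes, value.encode("utf-8")


source_message = (b"user_42", b'{"pageid": "home"}')
record = read_record(source_message, has_row_key=True)
print(record)                # the record carries the key taken from the message
print(write_record(record))  # the key is written back as the sink message key
```

With `has_row_key=False`, the same source message yields a record without a row key, which matches the statement that defining a row key is optional.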
