
What is DeltaStream?



DeltaStream is a serverless stream processing platform that integrates with streaming storage services including Apache Kafka, AWS Kinesis, Confluent Cloud, AWS MSK, and Redpanda. Think of it as the compute layer on top of your streaming storage.

DeltaStream provides a SQL-based interface in which you can easily create stream processing applications such as streaming pipelines, materialized views, microservices, and more.

DeltaStream is more than simply a query processing layer on top of Kafka or Kinesis. It brings relational database concepts to the data streaming world, including namespacing and role-based access controls that enable you to securely access, process, and share your streaming data regardless of where it is stored. Unlike existing solutions that focus primarily on processing capabilities, DeltaStream provides a holistic solution for both processing and operational management of your streaming data.

DeltaStream’s primary capabilities make it uniquely suited for processing and managing data streams:

  • DeltaStream is serverless. No longer must you worry about clusters/servers, architecting, or scaling infrastructure to run real-time applications. Gone are the days of cluster sizing, keeping track of which cluster queries run in, or knowing how many tasks to allocate to your applications. DeltaStream removes much of the complexity; queries can

    • run in isolation

    • scale up/down independently

    • seamlessly recover from failures.

    This enables you to focus just on building the core products that bring value to you and your organization.

  • SQL as the primary interface. Do all you need to do in a simple and familiar SQL interface:

    • Create databases and streams

    • Run continuous queries

    • Build materialized views on these streams.

    DeltaStream provides SQL extensions that enable you to express streaming concepts that don’t have equivalents in traditional SQL. Additionally, if your compute logic requires more than SQL, you can use DeltaStream’s UDFs/UDAFs to define and perform such computations.

  • Always up-to-date materialized views. Materialized views are a native capability in DeltaStream. You use continuous queries to build “always up-to-date” materialized views. Once you create a materialized view, you can query it the same way you query materialized views in relational databases.

  • Unified view over multiple streaming stores. DeltaStream gives you a single view into all your streaming data across all your streaming stores. Whether you are using one or multiple Kafka clusters or multiple platforms such as Kafka and Kinesis, DeltaStream provides a unified view of the streaming data. Further, you can write queries on these streams regardless of where they are stored.

  • Intuitive namespacing. Streaming storage systems such as Apache Kafka have a flat namespace — roughly analogous to a file system with no folders. This makes it challenging to organize streams in such systems. By providing namespacing, DeltaStream enables you to organize your streams in databases and schemas similar to the way you'd organize your tables in relational databases. Such storage abstraction enables you to organize your streaming data across all your streaming storage systems.

  • Fine-grained security that is familiar and straightforward. You can define fine-grained access privileges to determine who can access and perform which operations on objects in DeltaStream. With DeltaStream’s role-based access control (RBAC) you define roles and assign them to users. And you can do it all in familiar SQL. For instance, with just a one-line statement you can give read privileges on a specific stream to a given role.

  • Break down silos for your streaming data with secure sharing. With namespacing, storage abstraction, and role-based access control, DeltaStream breaks down silos for your streaming data and enables you to share streaming data securely across multiple teams in your organization.

  • Push notifications. You can create notifications on results of your continuous queries and push them to a variety of services such as Slack, email, PagerDuty, or custom API calls. For instance, with a stream of sensor data from vehicles, you can write a query to compute the average speed of each vehicle and send a notification to the driver if the average is higher than a threshold for a given time window.
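The push-notification example in the last bullet can be sketched in plain Python. This is only an illustration of the underlying computation (a per-vehicle average over fixed time windows, compared against a threshold), not DeltaStream's implementation or syntax. The window size, threshold, and event layout are assumptions made for the example, and the sketch processes a finished list of events rather than an unbounded stream.

```python
from collections import defaultdict

WINDOW_SECONDS = 60      # hypothetical tumbling-window size
SPEED_THRESHOLD = 100.0  # hypothetical alert threshold


def speeding_alerts(events, window_seconds=WINDOW_SECONDS, threshold=SPEED_THRESHOLD):
    """Return (vehicle_id, window_start, avg_speed) for windows above the threshold.

    events: iterable of (timestamp_seconds, vehicle_id, speed) tuples.
    """
    # (vehicle_id, window_start) -> [sum of speeds, number of readings]
    totals = defaultdict(lambda: [0.0, 0])
    for ts, vid, speed in events:
        # Assign each event to the fixed window that contains its timestamp.
        window_start = int(ts // window_seconds) * window_seconds
        bucket = totals[(vid, window_start)]
        bucket[0] += speed
        bucket[1] += 1

    alerts = []
    for (vid, window_start), (total, count) in sorted(totals.items()):
        avg = total / count
        if avg > threshold:
            alerts.append((vid, window_start, avg))
    return alerts


events = [
    (0, "v1", 90.0), (30, "v1", 120.0),  # v1, window 0: average 105
    (10, "v2", 60.0), (50, "v2", 70.0),  # v2, window 0: average 65
    (70, "v1", 80.0),                    # v1, window 60: average 80
]
alerts = speeding_alerts(events)
print(alerts)
```

In a real deployment each alert would be handed to a notification channel such as Slack or email; here the alerts are simply collected in a list.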

Interacting with DeltaStream

You can interact with DeltaStream through its REST API, a Web application, or the CLI. The following figure displays a screenshot of the DeltaStream Web application. Also, using our REST API, you can have your own application call the API, or have tools such as GitHub Actions submit a set of statements that define an application or pipeline.

When Should You Use DeltaStream?

With the aforementioned capabilities you can quickly and easily build streaming applications and pipelines on your streaming data. If you are already using a streaming storage service such as Apache Kafka, AWS Kinesis, Confluent Cloud, AWS MSK, or Redpanda, consider using DeltaStream.

Here are two use cases:

You have a vehicle information topic in your production Kafka cluster where you ingest real-time information such as GPS coordinates, speed, and other vehicle data. You need to share this stream in real time with another team, but only wish to share information from vehicles in a certain geographic region while also obfuscating some of the data. Further, you don’t want to give access to the production Kafka cluster and wish to provide the shared information in a topic in a new Kafka cluster.

To do this, you can write a SQL query in DeltaStream, such as the one shown below, that reads the original stream and performs the desired projection, transformations, and filtering. The query continuously writes the result into a new stream backed by a topic in the new Kafka cluster, which you have already declared as a store named test_kafka.

CREATE STREAM resultStream WITH('store'='test_kafka') AS 
SELECT 
     vid, lat, lon, mask(pii, '*') 
FROM vehicleStream 
WHERE isInGeoFence(lat, lon) = true;
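To make the per-record behavior of the query above concrete, here is a minimal Python sketch of the same filter, projection, and masking steps. It is not how DeltaStream executes the query. The `mask` and `is_in_geofence` helpers are hypothetical stand-ins whose behavior is assumed for illustration (the source does not define the actual functions), and the bounding box coordinates are made up.

```python
def mask(value, mask_char="*"):
    """Hypothetical masking: replace every character with mask_char."""
    return mask_char * len(value)


def is_in_geofence(lat, lon):
    """Hypothetical geofence: a fixed, made-up bounding box."""
    return 37.0 <= lat <= 38.0 and -123.0 <= lon <= -122.0


def transform(records):
    """Filter, project, and mask records one at a time."""
    for r in records:
        if is_in_geofence(r["lat"], r["lon"]):
            # Only the projected columns are emitted; speed is dropped.
            yield {
                "vid": r["vid"],
                "lat": r["lat"],
                "lon": r["lon"],
                "pii": mask(r["pii"]),
            }


source = [
    {"vid": "v1", "lat": 37.5, "lon": -122.4, "speed": 50, "pii": "Alice"},
    {"vid": "v2", "lat": 40.7, "lon": -74.0, "speed": 65, "pii": "Bob"},
]
result = list(transform(source))
print(result)
```

Only the record inside the assumed geofence reaches the result, and its sensitive field is obfuscated before it is written out.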

Once you have the result stream, you can use the following statement to grant read privileges to the team. They see only the result stream and never see the source stream or the production Kafka cluster.

GRANT USAGE, SELECT PRIVILEGE ON resultStream TO analyst;
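The effect of such a grant can be pictured with a small Python model of role-based access control: privileges attach to roles, roles attach to users, and a check succeeds only if one of the user's roles holds the privilege on the object. This is a conceptual sketch, not DeltaStream's access-control implementation. All user, role, and object names are invented for the example.

```python
# role -> set of (privilege, object) pairs; every name here is illustrative
role_privileges = {
    "analyst": {("SELECT", "result_stream")},
    "producer": {("SELECT", "source_stream"), ("SELECT", "result_stream")},
}

# user -> set of roles assigned to that user
user_roles = {
    "dana": {"analyst"},
    "eli": {"producer"},
}


def can(user, privilege, obj):
    """Return True if any role assigned to the user holds the privilege on obj."""
    return any(
        (privilege, obj) in role_privileges.get(role, set())
        for role in user_roles.get(user, set())
    )


print(can("dana", "SELECT", "result_stream"))
print(can("dana", "SELECT", "source_stream"))
```

In this model the analyst role can read the shared result but has no path to the source stream, which mirrors the sharing scenario described above.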

The second use case is a wiki service where all user interactions with every wiki page are streamed into a Kinesis stream.

In this case assume you wish to provide real-time page statistics such as the number of edits per wiki page. You can easily build a materialized view in DeltaStream using an aggregate query, as in the following:

CREATE MATERIALIZED VIEW wiki_edit_count AS 
SELECT 
    page_id, count(*) AS edit_count 
FROM wiki_events 
WHERE wiki_event_type = 'edit' 
GROUP BY page_id;

This creates a materialized view in DeltaStream that gives you the edit count per wiki page; every time an edit event is appended to the wiki_events stream, the view updates in real time. To display the current edit count for a wiki page every time it is loaded, simply query the materialized view and include the edit count in the wiki page. DeltaStream ensures that every time someone opens a wiki page, they see the up-to-date edit count for that page.
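The idea of a view that is updated as events arrive, rather than recomputed on each read, can be sketched in Python as follows. This is a conceptual model only and says nothing about how DeltaStream stores or maintains views. The `EditCountView` class and the event dictionaries are hypothetical; the field names follow the columns used in the query above.

```python
from collections import Counter


class EditCountView:
    """Hypothetical in-memory stand-in for an incrementally maintained view."""

    def __init__(self):
        self._counts = Counter()

    def on_event(self, event):
        # Apply the WHERE filter, then update the per-page aggregate.
        if event["wiki_event_type"] == "edit":
            self._counts[event["page_id"]] += 1

    def get(self, page_id):
        # Reads are lookups; pages with no edits report zero.
        return self._counts.get(page_id, 0)


view = EditCountView()
for event in [
    {"page_id": "home", "wiki_event_type": "edit"},
    {"page_id": "home", "wiki_event_type": "view"},
    {"page_id": "faq", "wiki_event_type": "edit"},
    {"page_id": "home", "wiki_event_type": "edit"},
]:
    view.on_event(event)

print(view.get("home"), view.get("faq"), view.get("missing"))
```

Because each event adjusts the stored count directly, a page load only needs a single lookup to show the current number of edits.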

Figure: DeltaStream is a relational platform
Figure: DeltaStream Web App workspace