
Starting with the Web App

How to get started with streaming analytics using the DeltaStream user interface



This guide steps you through the process of building an end-to-end streaming application with DeltaStream’s Web application. By the end of this guide, you will have hands-on experience with foundational concepts in DeltaStream and be able to build your own similar applications.

Here's what you'll do:

  1. Connect to your streaming store (in this case, Apache Kafka) by creating a store in DeltaStream.

  2. Create your first database.

  3. Create streams and changelogs for your Kafka topics.

  4. Create new streams, changelogs, and materialized views using DeltaStream’s continuous queries.

  5. (Optional) Share your streaming data with other members of your organization.

Before you begin:

  • Create a DeltaStream account and sign in to the Web app.

  • Create a DeltaStream organization or join an existing one.

Note The examples in this guide use topics in Apache Kafka, but the steps are the same if your data is in another streaming store, such as Amazon Kinesis or Redpanda.

Create a Store

In DeltaStream, a store is a streaming service, such as Apache Kafka or Amazon Kinesis, where your streaming data resides.

Before you process any data using DeltaStream SQL statements, you must configure DeltaStream to connect to the store where your data resides.

To create and configure a new store:

  1. In the lefthand navigation, click Resources to display the Resources page.

  2. Click + Add Store, and from the menu that displays click the store type you want (in this case, Kafka).

Note Be sure to match the store type with the streaming store where your streaming data resides. In addition to self-hosted services, for Kafka store types you can also configure Confluent Cloud and Amazon MSK in DeltaStream.

  3. Choose a unique Name for the store. Use only alphanumeric characters, dashes, and underscores, and limit the name to a maximum of 255 characters. To follow along here, name the store mskconsumer. This guide uses an Amazon MSK cluster.

  4. Select the Amazon Availability Zone.

    Availability zone information tells DeltaStream where to run the queries that use this store.

Tip In practice, it's best to select the same region as the one where your data is stored. That helps minimize data transfer costs.

  5. Add at least one URI and port to connect to, then click +. Separate multiple entries with ",".

  6. Optionally, complete the authentication options as appropriate for the store. For details, see Authentication Parameters.

  7. Select a Schema Registry.

  8. Click Add.
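The same configuration can be scripted from the SQL pane with the CREATE STORE DDL (see the DDL reference). The sketch below is illustrative only; the property values are placeholders, so check the CREATE STORE page for the exact properties your store type requires.

-- A sketch of creating a Kafka store with DDL. The broker URIs are
-- placeholders; consult the CREATE STORE reference for your store type.
CREATE STORE mskconsumer WITH (
    'type' = KAFKA,
    'access_region' = "AWS us-east-1",
    'uris' = 'broker1.example.com:9092,broker2.example.com:9092'
);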

Your new store displays in the list of existing stores on the Resources page.

To ensure your new store is set up correctly, click on it to expand it and display the Topics section. From here you can view the list of store entities.

In the list of topics, find the topic you want and click on it. Then click Print to see the messages coming into this topic in real time.
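If you prefer working in the SQL pane, the Command reference also lists a PRINT ENTITY command for the same purpose. A minimal sketch, assuming pageviews is a topic in your new store:

-- Streams incoming messages from the pageviews topic; assumes the
-- current default store contains this topic.
PRINT ENTITY pageviews;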

Create a Database

DeltaStream provides a relational model on top of your streaming data. Similar to other relational systems, DeltaStream uses databases and schemas for namespacing and organizing your data.

To create a new database:

  1. In the lefthand navigation, click Catalog to display the Catalog page.

  2. At the top of the Catalog pane click + and then click Database.

  3. At the prompt enter a unique name for the database and then click Add.

In this guide the database is named TestDB.

Note You can create as many databases as you wish. Any database you create includes a schema named public, but you can add more schemas if you wish.
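You can also create databases and schemas from the SQL pane. A minimal sketch using the CREATE DATABASE and CREATE SCHEMA DDL statements from the reference; the schema name below is a hypothetical example:

CREATE DATABASE TestDB;

-- Optional: add a schema beyond the default public schema.
-- This creates the schema in the current database.
CREATE SCHEMA analytics;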

Create Streams and Changelogs

Now use DeltaStream’s DDL statements to create relations on top of your Kafka topics.

To work with the streaming data in an entity as an append-only stream, in which each event is independent, you define the entity as a stream. In this guide you declare a stream on the pageviews topic, since each pageview is an independent event. To define the stream, in the lefthand navigation click Workspace, and then write the DDL statement in the SQL pane.

Here is the statement to create a pageviews stream; you can copy it and paste it into your work:

CREATE STREAM pageviews (
    viewtime BIGINT,
    userid VARCHAR,
    pageid VARCHAR
) WITH (
    'topic' = 'pageviews',
    'value.format' = 'JSON'
);

This stream is created in the current database and schema -- in this case, TestDB and public, respectively. And because no store is specified in the WITH clause, DeltaStream uses your default store, the one you created above, as the store that contains the pageviews topic.
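If your session currently points at a different namespace, you can switch it before running the DDL. A minimal sketch with the USE command from the Command reference, assuming the names used in this guide:

-- Make TestDB.public the current namespace for subsequent statements.
USE DATABASE TestDB;
USE SCHEMA public;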

Next, declare a changelog for the users topic. A changelog indicates you want to interpret events in an entity as UPSERT events. In this case the events must have a primary key; each event is interpreted as an insert or update for the given primary key.

Use the following statement in the Workspace SQL pane to declare the users_log changelog:

CREATE CHANGELOG users_log (
    registertime BIGINT,
    userid VARCHAR,
    regionid VARCHAR,
    gender VARCHAR,
    interests ARRAY<VARCHAR>,
    contactinfo STRUCT<phone VARCHAR, city VARCHAR, "state" VARCHAR, zipcode VARCHAR>,
    PRIMARY KEY(userid)
) WITH (
    'topic' = 'users',
    'key.format' = 'json',
    'key.type' = 'STRUCT<userid VARCHAR>',
    'value.format' = 'json'
);

When you declare the pageviews stream and users_log changelog, they display in the public schema of the TestDB database. To view them, click Catalog in the lefthand navigation, then go to TestDB > public.
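You can also confirm from the SQL pane that both relations exist in the current schema. A quick check with the LIST RELATIONS command from the Command reference:

-- Lists the streams and changelogs in the current database and schema.
LIST RELATIONS;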

Run Queries

After you declare streams and changelogs, you can write continuous queries in SQL to process this streaming data in real time.

You can run two types of queries:

  1. Interactive queries

  2. Streaming or continuous queries

In interactive queries, the query results stream back to you. Use these queries to inspect your streams and changelogs, or to build queries iteratively by inspecting the results.

In streaming or continuous queries, the query results are stored back in a store or a materialized view instead of streaming back to you.

Let's start with an interactive query: inspect the pageviews stream with the following statement:

SELECT * FROM pageviews;

When you run this query, DeltaStream compiles it into a streaming job, runs it, and streams the results back into the Web app.

Now try a persistent query that joins the pageviews stream with the users changelog to create an enriched pageviews stream that includes user details for each pageview event. While you're at it, use the TO_TIMESTAMP_LTZ function to convert the epoch time to a timestamp with a time zone:

CREATE STREAM csas_enriched_pv AS 
SELECT 
    TO_TIMESTAMP_LTZ(viewtime, 3) AS viewtime,  
    p.userid AS userid, 
    pageid, 
    TO_TIMESTAMP_LTZ(registertime, 3) AS registertime, 
    regionid, 
    gender, 
    interests, 
    contactinfo
FROM pageviews p
    JOIN users_log u ON u.userid = p.userid;

Click RUN. In the background DeltaStream compiles and launches your query as an Apache Flink streaming job and displays a confirmation once the query launches.

To view the query, along with its status, in the lefthand navigation click Queries. After the query successfully runs, there's a new Kafka topic named csas_enriched_pv in your Kafka cluster and a new stream added to the streams in your database, TestDB.

To examine the content of the new stream, run the following interactive query from the SQL page in the Web app:

SELECT * FROM csas_enriched_pv;

The result of the above query streams continuously back to the client.

With the pageviews stream enriched, you can build a materialized view to compute the number of pageviews per user. To do this, enter the following statement in the SQL pane of your workspace to create the materialized view:

CREATE MATERIALIZED VIEW user_view_count AS 
SELECT
    userid, 
    COUNT(*) AS view_count 
FROM csas_enriched_pv 
GROUP BY userid;

Note Materialized views are not available to free trial users.

When you run this query, DeltaStream launches a streaming job that runs the SELECT statement and materializes the query result. You can query this materialized view the same way you would query a materialized view in a traditional relational database. But the difference here is that DeltaStream leverages the streaming job to always keep the data in the materialized view fresh.

The following is a simple query to get the current view count for the user with the userid of User_2:

SELECT * FROM user_view_count WHERE userid = 'User_2';

The result of this query displays in a single row.

Note that at the time of running the above query the number of pageviews for User_2 is 3. Now run the query again. This time you should see an updated result for the pageview count for the user. This demonstrates that every time you run a query on a materialized view, you receive the most up-to-date result. DeltaStream ensures the data in the view is continuously updated, using the continuous query that declared the materialized view.

Running the same query on the materialized view just a few seconds later, the result is updated again -- in this case, to 11 from the previous value of 3.

Clean Up

It's important to practice good hygiene when you're done! To clean up your DeltaStream environment:

  1. Terminate the queries. To view them, in the lefthand navigation click Queries to display the Queries page.

  2. Navigate to the corresponding database and schema and drop the streams, changelogs, and materialized views you created, as in the sketch below.

Tip If there is a query that uses a stream, changelog, or materialized view, be sure to terminate the query before you drop the relation.
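From the SQL pane, the same cleanup can be written with TERMINATE QUERY and the DROP statements from the reference. A minimal sketch; the query ID is a placeholder, and whether DROP RELATION covers materialized views is an assumption to verify against the DDL reference:

-- Stop running queries first. Copy the real query IDs from the
-- Queries page or from LIST QUERIES; the UUID below is a placeholder.
TERMINATE QUERY 00000000-0000-0000-0000-000000000000;

-- Drop the relations created in this guide, dependents first.
DROP RELATION user_view_count;  -- assumption: see the DDL reference for materialized views
DROP STREAM csas_enriched_pv;
DROP CHANGELOG users_log;
DROP STREAM pageviews;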
