Free Trial Quick Start

How to get started for free with DeltaStream

DeltaStream provides a relational model on top of your streaming data. Similar to other relational systems, DeltaStream uses databases and namespaces for organizing your data.

Using DeltaStream’s free 14-day trial? Follow this guide to build an end-to-end streaming application in minutes. We provide you with a default organization – named after the email address you used to sign on – and a default Kafka store with synthetic data. You’ll use these resources to:

  1. Inspect the data in the streaming trial store.

  2. Create a database.

  3. Create a stream and changelog for your Kafka topics.

  4. Enrich your data and query it.

Note The trial version limits you to 3 queries. Also, user-defined functions aren’t supported, and there are no materialized views. You can add your own external store, but it must be available via the Internet. Contact DeltaStream support if you wish to set up a private store.

1. Inspect Data in Your Trial Store

You receive access to a pre-defined DeltaStream trial_store when you sign in to your trial account. This store is a discrete Amazon MSK (Managed Streaming for Apache Kafka) cluster that includes several topics with synthetic data producers; the producers continuously publish messages into these topics.

Note In DeltaStream you define stores to represent each Kafka cluster. DeltaStream also works with other stores, such as AWS Kinesis and Postgres.

The trial store displays in several places:

  • The Welcome page

  • The Workspace page

  • The Resources page

When you log on, DeltaStream displays the Workspace page. This page provides an at-a-glance dashboard view of your overall DeltaStream organization.

To begin exploring your trial store, in the lefthand navigation click Resources. The Resources page displays with the Data Stores tab active and your trial store listed beneath it.

To display the topics contained in the trial store, click anywhere in the trial store row to open the trial_store page.

Now confirm the store connectivity and inspect the data in a topic. To do this:

  1. Click anywhere in the row of the topic you want. The topic Details pane slides open.

  2. Click Print. This displays the live stream of data flowing to the topic.

You can also display the topic Details pane by clicking the corresponding icon under the Actions column.

Here is an image of the data flowing into the pageviews topic.

Tip After you verify that data is streaming into your trial store, you may wish to click Stop to halt the stream.

There's a range of additional information you can view. For more details, please see Explore Data Store and Topic Details.
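
If you prefer SQL, you can perform the same inspection from the workspace SQL pane. Here is a minimal sketch, assuming the default trial store and the pageviews topic; the exact clauses are documented on the LIST ENTITIES and PRINT ENTITY reference pages:

-- List the topics (entities) available in the trial store
LIST ENTITIES IN STORE trial_store;

-- Print the live stream of records flowing into the pageviews topic
PRINT ENTITY pageviews IN STORE trial_store;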

2. Create a Database

Now it's time to declare a database and write queries on the streaming data. Databases present a logical organization layer for your streaming data. They make it possible to provide access controls and governance across all your data.

Note DeltaStream objects are the building blocks of user applications and pipelines. You must create an object to represent each Kafka topic you wish to include in a query.

To create a new database

  1. From the lefthand navigation, click Databases and then click + Add Database.

  2. Enter the database name. In this guide we name it DemoDB.

  3. Click SAVE.

The newly-created database displays all the topics in the Kafka cluster to which you have access.

You can create as many databases as you wish. Each new database includes a namespace named public, but you can add more namespaces if you wish.

To add a new namespace

  1. Click Databases in the lefthand navigation, click the database you want, and towards the right click + Add Namespace.

  2. At the prompt, enter the namespace name. Then click SAVE.
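
You can also create databases and namespaces in SQL from the workspace. Here is a minimal sketch; analytics is just an example namespace name, and the exact clauses are documented on the CREATE DATABASE and CREATE SCHEMA reference pages:

-- Create the database used in this guide
CREATE DATABASE DemoDB;

-- Add a namespace (schema) beyond the default public one
CREATE SCHEMA analytics IN DATABASE DemoDB;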

3. Create Streams and Changelogs

Your goal here is to understand pageviews by users over time. You do this by joining the pageviews and users topics.

Start by creating relations backed by Kafka topics. Use DeltaStream’s DDL statements to define your streaming data in a topic as an append-only stream.

Note In DeltaStream, a stream is simply one type of object.

To create a stream

  1. Navigate to the main workspace.

  2. Copy the SQL DDL statement below, and paste it into the SQL pane (above the Results pane). This creates a discrete stream backed by the pageviews topic from the Kafka cluster; each pageview is an independent event. This stream reflects the view time of each page by user.

  3. Click Run.

CREATE STREAM pageviews (
    viewtime BIGINT, 
    userid VARCHAR, 
    pageid VARCHAR
) WITH (
    'topic'='pageviews', 
    'value.format'='JSON'
);

DeltaStream displays a Success message in the Results pane, followed by details of the stream you just created.

Tip You may need to expand the Results pane to see all of the details. To do this, click and drag the pane handle. See Modifying Your Workspace, below, for more tips on modifying your DeltaStream workspace.

Note The above stream is created in the currently-used database and namespace – DemoDB and public, respectively. DeltaStream uses the default store declared above as the store for the pageviews topic. To specify another store, use the WITH clause.
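
For example, here is a sketch of the same DDL pinned to a hypothetical store named my_kafka_store via the 'store' property; see the CREATE STREAM reference for the full list of supported properties:

CREATE STREAM pageviews_copy (
    viewtime BIGINT,
    userid VARCHAR,
    pageid VARCHAR
) WITH (
    'store'='my_kafka_store',  -- hypothetical store name
    'topic'='pageviews',
    'value.format'='JSON'
);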

Next, declare a changelog backed by the users topic and ordered by UserID. A changelog enables you to interpret events in a topic as UPSERT events. (In DeltaStream, changelogs are simply another type of object.) Events require a primary key; DeltaStream interprets each event as an insert or update for the given primary key. In this case, the changelog relation reflects specific details by user, such as gender and interests.

To declare the users changelog, paste the following statement in the SQL pane and then click Run:

CREATE CHANGELOG users_log (
    registertime BIGINT, 
    userid VARCHAR, 
    regionid VARCHAR, 
    gender VARCHAR, 
    interests ARRAY<VARCHAR>, 
    contactinfo STRUCT<phone VARCHAR, city VARCHAR, "state" VARCHAR, zipcode VARCHAR>, 
    PRIMARY KEY(userid)
) WITH (
    'topic'='users', 
    'key.format'='json', 
    'key.type'='STRUCT<userid VARCHAR>', 
    'value.format'='json'
);

As with the pageviews stream, the users_log changelog displays in the DemoDB public namespace. To view these relations, in the lefthand navigation click Databases, and in the Databases pane expand the DemoDB database and public namespace.

4. Run Queries

Now you can write a continuous query in SQL to process this streaming data in real time.

Let’s start with an interactive query, in which the query results stream back to you. You can use such queries to:

  • inspect your streams and changelogs

  • build queries iteratively by inspecting the query result.

Let’s inspect the pageviews stream. To do this, enter the following interactive query and then click Run:

SELECT * FROM pageviews;

DeltaStream compiles your query into a streaming job, runs the job, and streams the result into the Results pane, as shown below.

Tip After you verify that data is streaming in, you may wish to click Stop Query.

In addition to interactive queries, whose results stream back to you, DeltaStream provides persistent queries. These are continuous queries whose results are written continuously either to a store or to a materialized view.

Let’s write a persistent query that joins the pageviews stream with the users_log changelog. This creates a third object, an enriched pageviews stream, that provides user details for each pageview event, including the view time of each page by user and detailed user information.

While we’re at it, we also convert the epoch time to a timestamp with a time zone using the TO_TIMESTAMP_LTZ function.

Start by creating a stream called enriched_pv. Then join the pageviews stream with data from the users_log changelog and write the results to the enriched_pv stream.

CREATE STREAM enriched_pv 
AS SELECT
    TO_TIMESTAMP_LTZ(viewtime, 3) AS viewtime,  
    p.userid AS userid, 
    pageid, 
    TO_TIMESTAMP_LTZ(registertime, 3) AS registertime, 
    regionid, 
    gender, 
    interests, 
    contactinfo
FROM pageviews p WITH ( 'starting.position'='latest')
JOIN users_log u WITH ( 'starting.position'='latest')
ON u.userid = p.userid;

Note The above persistent query creates a new topic in the trial store. When you create a new topic, DeltaStream adds a prefix to the topic name based on your trial email and some unique random characters. For example, for the email test@gmail.com, DeltaStream creates a topic prefix like t_testgmailcom_4evmsyg_. Creating the topic enriched_pv in turn creates the topic t_testgmailcom_4evmsyg_enriched_pv. You can view these topics in the trial store topics list.

Important Topic name prefixes are a requirement only for the trial store we have set up. Prefixes are not added when you use any other store, such as your own Apache Kafka or AWS Kinesis.

When the query finishes, you have a new Kafka topic named enriched_pv in your Kafka cluster and a new stream added to the streams in your DemoDB database.

Note DeltaStream compiles and launches the query as an Apache Flink streaming job. You can view the query along with its status on the Query Management page; to do this, in the lefthand navigation click Queries.
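
You can also check on your queries from the SQL pane. A minimal sketch (see the LIST QUERIES reference page for details):

-- Show the queries in your organization along with their current state
LIST QUERIES;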

Finally, examine the contents of the new stream. Run the following simple query in the SQL pane:

SELECT * FROM enriched_pv;

The result of this interactive continuous query is an enriched pageviews stream that streams to the client as shown below.

That’s it. In just a few steps you’ve used DeltaStream to connect two different Kafka topics and persist the enriched data, either to query in real time or to write out to its final destination. In so doing you avoid the extra steps and expense you might incur with a data warehouse.

5. Clean Up

When your task is completed, it’s time to clean up your environment. To do this:

  1. Terminate the queries. To do this, navigate to the Queries page, and next to the query you wish to terminate, click the terminate icon. Follow the instructions in the prompt, and then click Terminate. The system displays a message indicating you've marked that query for termination.

  2. Drop the created streams, changelogs, and materialized views. To do this, navigate to the corresponding database and namespace, and, as with the query, follow the prompt to drop each object.

Important If you have a query that uses a stream, changelog, or materialized view, you must terminate the query before dropping the relation.
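
If you prefer SQL, here is a sketch of the same cleanup; the query ID below is a placeholder for the ID that LIST QUERIES displays, and the exact syntax is on the TERMINATE QUERY and DROP reference pages:

-- Terminate the persistent query first; replace the placeholder ID with your own
TERMINATE QUERY 0a1b2c3d-0000-0000-0000-000000000000;

-- Then drop the objects created in this guide
DROP STREAM enriched_pv;
DROP STREAM pageviews;
DROP CHANGELOG users_log;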

Modifying Your Workspace

This quickstart guide used simple examples to get you up and running quickly. But the DeltaStream workspace is customizable. If you begin using more extensive queries or a greater number of objects, you may find it more efficient to modify the size of your workspace panes, or even toggle on or off specific sections, to focus on the parts of the workspace that matter most at any given time. You can:

  • Hide the Results pane. This gives you more room in the SQL pane to work with more extensive SQL. To do this, click the corresponding icon at the top of your workspace. Click it a second time to re-display the Results pane.

  • Hide the SQL pane. This gives you more room to examine query results. To do this, click the corresponding icon at the top of your workspace. Click it a second time to re-display the SQL pane.

  • Hide the lefthand (Database and Stores) panes. This gives you more horizontal screen real estate and creates a cleaner, more expansive workspace. To do this, click the corresponding icon at the top of your workspace. Click it a second time to re-display the lefthand panes.

When activated, the icons display in color.

Tip You can click any two or all three of these icons at once to isolate the precise workspace you wish. For example, if you hide both the Results pane and the lefthand panes, you have almost the entire screen to work with SQL.

Finally, you can manually re-size your panes without hiding them altogether. To resize the SQL and Results panes, or the Database and Stores panes, click and drag the handle between them. To resize the left and right panes, click and drag the handle between the panes.
