LogoLogo
Start Trial
  • Overview
    • What is DeltaStream?
    • Core Concepts
      • Access Control
      • Compute Pools
      • Data Store
      • Database
      • Function
      • Query
      • SQL
      • Visualizing Data Lineage
  • Getting Started
    • Free Trial Quick Start
    • Starting with the Web App
    • Starting with the CLI
  • How do I...?
    • Create and Manage Data Stores
      • Create Data Stores for Streaming Data
      • Explore Data Store and Topic Details
      • Use Multiple Data Stores in Queries
    • Manage Users and User Roles
      • Inviting Users to an Organization
      • Administering Users in your Organization
      • Using the CLI to Manage User Roles
      • Example: Setting Up Custom Roles for Production and Stage
    • Create DeltaStream Objects to Structure Raw Data
    • Use Namespacing for Organizing Data
    • Create and Query Materialized Views
    • Create a Compute Pool to Work with Iceberg
    • Create a Function
    • Secure my Connection to a Data Store
      • Introducing DeltaStream Private Links
      • Creating an AWS Private Link from DeltaStream to your Confluent Kafka Dedicated Cluster
      • Enabling Private Link Connectivity to Confluent Enterprise Cluster and Schema Registry
      • Creating a Private Link from DeltaStream to Amazon MSK
      • Creating a Private Link for RDS Databases
      • Deleting a Private Link
    • Serialize my Data
      • Working with ProtoBuf Serialized Data and DeltaStream Descriptors
      • Working with Avro Serialized Data and Schema Registries
      • Configuring Deserialization Error Handling
  • Integrations
    • Setting up Data Store Integrations
      • AWS S3
      • ClickHouse
      • Confluent Cloud
      • Databricks
      • Iceberg REST Catalog
      • PostgreSQL
      • Snowflake
      • WarpStream
  • Setting up Enterprise Security Integrations
    • Okta SAML Integration
    • Okta SCIM Integration
  • use cases
    • Using an AWS S3 Store as a Source to Feed an MSK Topic
  • Reference
    • Metrics
      • Prometheus Integration
      • Built-In Metrics
      • Custom Metrics in Functions
    • SQL Syntax
      • Data Formats (Serialization)
        • Serializing with JSON
        • Serializing with Primitive Data Types
        • Serializing with Protobuf
      • Data Types
      • Identifiers and Keywords
      • Command
        • ACCEPT INVITATION
        • CAN I
        • COPY DESCRIPTOR_SOURCE
        • COPY FUNCTION_SOURCE
        • DESCRIBE ENTITY
        • DESCRIBE QUERY
        • DESCRIBE QUERY METRICS
        • DESCRIBE QUERY EVENTS
        • DESCRIBE QUERY STATE
        • DESCRIBE RELATION
        • DESCRIBE RELATION COLUMNS
        • DESCRIBE ROLE
        • DESCRIBE SECURITY INTEGRATION
        • DESCRIBE <statement>
        • DESCRIBE STORE
        • DESCRIBE USER
        • GENERATE COLUMNS
        • GENERATE TEMPLATE
        • GRANT OWNERSHIP
        • GRANT PRIVILEGES
        • GRANT ROLE
        • INVITE USER
        • LIST API_TOKENS
        • LIST COMPUTE_POOLS
        • LIST DATABASES
        • LIST DESCRIPTORS
        • LIST DESCRIPTOR_SOURCES
        • LIST ENTITIES
        • LIST FUNCTIONS
        • LIST FUNCTION_SOURCES
        • LIST INVITATIONS
        • LIST METRICS INTEGRATIONS
        • LIST ORGANIZATIONS
        • LIST QUERIES
        • LIST RELATIONS
        • LIST ROLES
        • LIST SCHEMAS
        • LIST SCHEMA_REGISTRIES
        • LIST SECRETS
        • LIST SECURITY INTEGRATIONS
        • LIST STORES
        • LIST USERS
        • PRINT ENTITY
        • REJECT INVITATION
        • REVOKE INVITATION
        • REVOKE PRIVILEGES
        • REVOKE ROLE
        • SET DEFAULT
        • USE
        • START COMPUTE_POOL
        • STOP COMPUTE_POOL
      • DDL
        • ALTER API_TOKEN
        • ALTER SECURITY INTEGRATION
        • CREATE API_TOKEN
        • CREATE CHANGELOG
        • CREATE COMPUTE_POOL
        • CREATE DATABASE
        • CREATE DESCRIPTOR_SOURCE
        • CREATE ENTITY
        • CREATE FUNCTION_SOURCE
        • CREATE FUNCTION
        • CREATE INDEX
        • CREATE METRICS INTEGRATION
        • CREATE ORGANIZATION
        • CREATE ROLE
        • CREATE SCHEMA_REGISTRY
        • CREATE SCHEMA
        • CREATE SECRET
        • CREATE SECURITY INTEGRATION
        • CREATE STORE
        • CREATE STREAM
        • CREATE TABLE
        • DROP API_TOKEN
        • DROP CHANGELOG
        • DROP COMPUTE_POOL
        • DROP DATABASE
        • DROP DESCRIPTOR_SOURCE
        • DROP ENTITY
        • DROP FUNCTION_SOURCE
        • DROP FUNCTION
        • DROP METRICS INTEGRATION
        • DROP RELATION
        • DROP ROLE
        • DROP SCHEMA
        • DROP SCHEMA_REGISTRY
        • DROP SECRET
        • DROP SECURITY INTEGRATION
        • DROP STORE
        • DROP STREAM
        • DROP USER
        • START/STOP COMPUTE_POOL
        • UPDATE COMPUTE_POOL
        • UPDATE ENTITY
        • UPDATE SCHEMA_REGISTRY
        • UPDATE SECRET
        • UPDATE STORE
      • Query
        • APPLICATION
        • Change Data Capture (CDC)
        • CREATE CHANGELOG AS SELECT
        • CREATE STREAM AS SELECT
        • CREATE TABLE AS SELECT
        • Function
          • Built-in Functions
          • Row Metadata Functions
        • INSERT INTO
        • Materialized View
          • CREATE MATERIALIZED VIEW AS
          • SELECT (FROM MATERIALIZED VIEW)
        • Query Name and Version
        • Resume Query
        • RESTART QUERY
        • SELECT
          • FROM
          • JOIN
          • MATCH_RECOGNIZE
          • WITH (Common Table Expression)
        • TERMINATE QUERY
      • Sandbox
        • START SANDBOX
        • DESCRIBE SANDBOX
        • STOP SANDBOX
      • Row Key Definition
    • DeltaStream OpenAPI
      • Deltastream
      • Models
Powered by GitBook
On this page
  • Create a Materialized View
  • Example 1: Top-K query
  • Example 2. Combine two streams’ records
  1. How do I...?

Create and Query Materialized Views

PreviousUse Namespacing for Organizing DataNextCreate a Compute Pool to Work with Iceberg

Last updated 16 days ago

A is a DeltaStream object that captures the result of a query and is continuously updated as new records are ingested in the query’s source object(s).

This article demonstrates how to create materialized views and run queries on them.

Create a Materialized View

To begin, use the CREATE MATERIALIZED VIEW AS statement to create a new materialized view. Note that with this statement, you can only generate columns with when you use the SELECT query.

Example 1: Top-K query

Create a materialized view using window functions

In this first example, create a called mv_pageviews on the data stored in the ds_pageviews topic in your default data store. Use this DDL statement to create this stream:

CREATE STREAM mv_pageviews (
    viewtime BIGINT,
    userid VARCHAR,
    pageid VARCHAR
) WITH ('topic'='pageviews', 'value.format'='JSON');

Now, if you wish to track how many pages each user visits every 30 seconds, you can use a query with a to create a materialized view. Call this materialized view visits_count. DeltaStream keeps its content up to date in a real-time manner.

CREATE MATERIALIZED VIEW
  visits_count
AS SELECT 
  window_start, 
  window_end,
  userid, 
  COUNT(pageid) AS pgcnt 
FROM TUMBLE(pageviews, size 30 SECONDS) 
GROUP BY window_start, window_end, userid;

Top-k query

When you have created the visits_count materialized view, you can run a top-k query against it. This query retrieves the top 3 users who have visited the most number of pages so far. It uses GROUP BY and the aggregate function SUM to calculate the total number of pages each user has visited across all time windows. The system orders results based on the total page counts and the top 3 records that are returned:

SELECT userid, SUM(pgcnt) AS total
FROM visits_count
GROUP BY userid
ORDER BY total DESC
LIMIT 3;

Here is a sample result for the query:

For more details on query capabilities on materialized views, please see SELECT (FROM MATERIALIZED VIEW)

Example 2. Combine two streams’ records

In this example, you use a materialized view to run different queries on JOIN results. Assume you have a new stream, called users, created using the DDL below. This stream captures some information about each user, such as the time a given user has registered and the user’s interests.

CREATE STREAM users_info (
  registertime BIGINT,
  userid VARCHAR,
  regionid VARCHAR,
  gender VARCHAR,
  interests ARRAY<VARCHAR>,
  contactinfo STRUCT<phone VARCHAR, city VARCHAR, "state" VARCHAR, zipcode VARCHAR>
) WITH ('topic'='ds_users', 'value.format'='json');

Create a materialized view using JOIN

CREATE MATERIALIZED VIEW
  visits_info
AS SELECT p.userid AS uid,
          p.pageid,
          u.registertime,
          u.interests[1] AS top_interest
FROM pageviews p JOIN users u
WITHIN 1 minute
ON u.userid = p.userid;

Run queries on the JOIN results

Now that the visits_info materialized view is created, DeltaStream continually updates its content in real-time as new records are ingested in either the pageviews or the users. You can retrieve the latest records in the materialized view by running queries directly on it.

For example, the below query finds info on pages visited by users whose top interest is gaming.

SELECT *
FROM visits_info
WHERE top_interest = 'Game';

The output looks like this:

As another example, you can find top interests of all the users who have visited a specific page (such as Page_15) so far and sort them according to the total user count per interest.

SELECT top_interest, count(uid) AS cnt
FROM visits_info
WHERE pageid = 'Page_15'
GROUP BY top_interest
ORDER BY cnt DESC;

The query results look like this:

To review the details on the above two objects, in the lefthand navigation click Databases ( ), and drill down on the database to display your streams.

Now suppose you wish to enrich the information already captured in the pageviews stream by combining it with each user’s information from users. To do this, you can run an query and store the result in a materialized view called visits_info.

#_stream
Materialized View
Drilling down from Database to display streams
Result of the query
Output of a sample query
Sample query results
Primitive Data Types
#window-function
#interval-join-stream-stream