Creating and Querying Materialized Views
Last updated
Last updated
A Materialized View is a relation that captures the result of a query and is continuously updated as new records are ingested in the query’s source relation(s).
This tutorial demonstrates how to create materialized views and run queries on them.
To begin, use the CREATE MATERIALIZED VIEW AS statement to create a new materialized view. Note that with this statement, you can only generate columns with when you use the SELECT query.
In this first example, create a called mv_pageviews
on the data stored in the ds_pageviews
topic in your default store. Use this DDL statement to create this stream:
Now, if you wish to track how many pages each user visits every 30 seconds, you can use a query with a #window-function to create a materialized view. Call this materialized view visits_count
. DeltaStream keeps its content up to date in a real-time manner.
When you have created the visits_count
materialized view, you can run a top-k query against it. This query retrieves the top 3 users who have visited the most number of pages so far. It uses GROUP BY
and the aggregate function SUM
to calculate the total number of pages each user has visited across all time windows. Results are ordered based on the total page counts and the top 3 records that are returned:
Here is a sample result for the query:
For more details on query capabilities on materialized views, please see SELECT (FROM MATERIALIZED VIEW)
In the second example, you use a materialized view to run different queries on JOIN
results. Assume you have a new stream, called users
, created using the DDL below. This stream captures some information about each user, such as the time a given user has registered along with the user’s interests.
JOIN
Now suppose you wish to enrich the information already captured in the pageviews
stream by combining it with each user’s information from users
. To do this, you can run an #interval-join-stream-stream query and store the result in a materialized view called visits_info
.
JOIN
resultsNow that the visits_info
materialized view is created, DeltaStream continually updates its content in real-time as new records are ingested in either the pageviews
or the users
. You can retrieve the latest records in the materialized view by running queries directly on it.
For example, the below query finds info on pages visited by users whose top interest is gaming.
The output would look like this:
As another example, you can find top interests of all the users who have visited a specific page (such as Page_15
) so far and sort them according to the total user count per interest.
The query results would look like this:
To review the details on the above two relations, in the lefthand navigation click Catalog ( ), and drill down on the database to display your streams.