Comment on page
CREATE MATERIALIZED VIEW AS
CREATE MATERIALIZED VIEW
materialized_view_name
[WITH (materialized_view_parameter = value [, ... ])]
AS select_statement;
CREATE MATERIALIZED VIEW AS
is a statement that:- Launches a new query to write the results of the SELECT statement into the newly created Materialized View.
The specified
select_statement
generates records with columns of a primitive data type. See SELECT for the statement’s parameters and properties.If the result of the statement is a Stream, new records are added to the Materialized View in the append mode, where new records are appended and there is no update of the existing records. Otherwise, if the statement creates a Changelog, new records are added to the Materialized View in the upsert mode, where a new record updates an existing record if there is one with matching value(s) on the
PRIMARY KEY
column(s). See CREATE CHANGELOG for further information on PRIMARY KEY
column(s).When creating a Materialized View, you can define a retention period for the rows to determine how long the Materialized View retains a row before deleting it. By default, a Materialized View keeps all the rows and never deletes any of them. Therefore, size of the Materialized View keeps growing as new rows are added. Defining an appropriate retention period for the Materialized View allows you to control its size and let the older rows be purged automatically. Once a row is expired (according to the retention period) and purged, it is deleted forever and its purging can not be reverted. Currently, retention period has to be defined at the creation time for a Materialized View and user can not enable/disable purging, or change the retention period later. See Materialized View Parameters to see how retention period can be defined.
This specifies the name of the new Materialized View. Optionally, use
<database_name>.<schema_name>
as the prefix to the name to create the Relation in that scope.Parameter Name | Description |
---|---|
retention.millis | The duration (in milliseconds) to determine how long Materialized View retains rows before they are purged. If not set, the rows will not be purged.
Required: No
Default value: Infinity
Type: String |
timestamp | Name of the column in the Materialized View to use as the timestamp. If not set, the timestamp of the message, coming from (one of) the Materialized view's sources, is used as the timestamp.
Required: No
Default value: Record’s timestamp.
Type: String
Valid values: Must be of type BIGINT or TIMESTAMP . See Data Types. |
| |
For the following examples, assume a Stream named
pageviews
has been created using the below DDL statement:CREATE STREAM pageviews (
viewtime BIGINT,
userid VARCHAR,
pageid VARCHAR
) WITH (
'topic' = 'pageviews',
'value.format' = 'json'
);
The following statement creates a Materialized View, named
visits
, by selecting userid
and pageid
columns from all records in the pageviews
Stream except those for User_5
. Note that the result of the SELECT
statement is a Stream. Therefore, new rows are simply added to the Materialized View with no impact on the existing ones.CREATE MATERIALIZED VIEW
visits
AS SELECT
userid, pageid
FROM pageviews
WHERE userid <> 'User_5';
The
visits
Materialized View will have two columns, userid
and pageid
, both with the VARCHAR
data type. For the details on how this view can be queried, check SELECT (FROM MATERIALIZED VIEW).The following statement creates a Materialized View, named
pagevisits
, by grouping records from the pageviews
Stream using the pageid
column and reporting the total number of records in each group as cnt
. The result of the SELECT
statement is a Changelog with the pageid
column as the Primary Key. Therefore, new rows are added to the Materialized View in the upsert mode. Each pageid
value has only one row in pagevists
and a new row with matching pageid
value overwrites the existing one, if any.CREATE MATERIALIZED VIEW
pagevisits
AS SELECT
pageid, count(userid) as cnt
FROM pageviews
GROUP BY pageid;
The
pagevisits
Materialized View will have two columns, pageid
with the VARCHAR
data type and cnt
with the BIGINT
data type. For the details on how this view can be queried, check SELECT (FROM MATERIALIZED VIEW).The following statement creates a Materialized View, named
visit_stats
, by aggregating the total number of pages a given user visited every 30 seconds. Given that the SELECT
statement generates results as a Changelog with the Primary Key of (window_start, window_end, userid)
, new rows are added to the Materialized View in the upsert mode, and there is only one row for any given userid
in each 30-second interval.CREATE MATERIALIZED VIEW
visit_stats
AS SELECT
window_start,
window_end,
userid,
COUNT(pageid) AS pgcnt
FROM TUMBLE(pageviews, size 30 SECONDS)
GROUP BY window_start, window_end, userid;
The
visit_stats
Materialized View will have four columns, window_start
and window_end
with the TIMESTAMP(3)
data type, as well as userid
with the VARCHAR
data type and pgcnt
with the BIGINT
data type. For the details on how this view can be queried, check SELECT (FROM MATERIALIZED VIEW).The following statement creates a Materialized View named
visits_rate
, by aggregating the total number of pages a given user visits every 10 seconds. Moreover, the retention.millis
parameter is set to 600000
milliseconds (i.e. 10 minutes). This means the retention period for any row is 10 minutes and the row is purged after that. This will help controlling the total size of visits_rate
by deleting older rows which are no longer needed.CREATE MATERIALIZED VIEW
visits_rate
WITH ('retention.millis' = '600000')
AS SELECT
window_start AS start_time,
window_end AS end_time,
userid,
count(pageid) AS cnt
FROM Tumble(pageviews, size 10 seconds)
GROUP BY userid, window_start, window_end;
Last modified 2mo ago