INSERT INTO
For an existing Relation, INSERT INTO runs the given query (i.e. a SELECT statement) and adds its results to the sink Relation. The list of columns of the sink Relation and the SELECT columns list in the query must be data type compatible. Moreover, the Relation type of the sink Relation must match the Relation type of the query results. For example, the results of a query that uses grouping aggregation cannot be inserted into a Stream, since the result type for a query with GROUP BY is a Changelog.
INSERT INTO does not support MATERIALIZED VIEW as the sink Relation.
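A minimal sketch of the statement shape, inferred from the argument descriptions below (bracketed parts are optional):

```sql
INSERT INTO [<database_name>.<schema_name>.]<relation_name>
<select_statement>
[PARTITION BY <column_name> [, ...]];
```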
This specifies the name of the Relation to add results to. Relation names can be fully or partially qualified by specifying the database_name and/or schema_name in the format [<database_name>.<schema_name>.]<relation_name> (such as db1.public.pageviews). Otherwise, the current Database and Schema are used to identify the Relation. Case-sensitive names must be wrapped in double quotes; otherwise, the lowercase name will be used.
This clause specifies the SELECT statement to run; see SELECT for more information.
Optionally, this clause sets the partition key of records according to their values for a given set of columns. The PARTITION BY clause in the statement defines a list of one or more columns (separated by commas) as partitioning columns. By default, the key for the sink's records has a data format equal to the sink's value data format. To use a different key format, set the key.format Stream parameter. PARTITION BY is supported for CREATE STREAM AS SELECT and INSERT INTO queries where the sink is a Stream. Currently, PARTITION BY only applies to queries whose sink Stream is backed by a Kafka store.

When using PARTITION BY in an INSERT INTO query, the key type produced by the PARTITION BY clause must match the key type in the sink's output Stream.
INSERT INTO
The following copies all data from the source Relation and inserts it into a preexisting Relation.
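A minimal sketch, assuming a source Stream pageviews and a preexisting sink Stream pageviews_copy with a compatible column list (both names are illustrative):

```sql
-- Copy every record from the source Relation into the existing sink Relation
INSERT INTO pageviews_copy
SELECT * FROM pageviews;
```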
INSERT INTO with grouping and aggregation

The following runs a query that finds the average ViewTime in a 5-second window and inserts the results into the already existing Relation Aggr Pageviews2.
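A sketch of the referenced query; the TUMBLE windowing function, the window_start/window_end columns, and the viewtime/userid column names are assumptions, and the case-sensitive sink name is quoted per the naming rules above:

```sql
-- Average ViewTime per user over 5-second tumbling windows
INSERT INTO "Aggr Pageviews2"
SELECT window_start, window_end, userid, AVG(viewtime) AS avg_viewtime
FROM TUMBLE(pageviews, SIZE 5 SECONDS)
GROUP BY window_start, window_end, userid;
```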
INSERT INTO

INSERT INTO can be used to combine the results of multiple queries into a single sink Relation, as long as:
Every query has the same sink Relation type.
The SELECT columns list in every query has the same number of columns, with similar data types, in the same order.
For example, assume two Changelogs are created from the users Stream to collect stats on the total number of users in different cities in Europe and the U.S. Moreover, assume we are interested in keeping track of cities in Europe or the U.S. with more than a thousand users in a single Relation. We can create a third Changelog, named total_users, with the below DDL and use the following two INSERT INTO statements to combine results from the above Changelogs and add them to the total_users Changelog:
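A sketch of the referenced DDL and INSERT INTO statements; the source Changelog names (europe_users, us_users), the column names (city, total), and the WITH properties are assumptions:

```sql
-- Sink Changelog that combines stats from both regions
CREATE CHANGELOG total_users (
  city VARCHAR,
  total BIGINT,
  PRIMARY KEY (city)
) WITH ('topic' = 'total_users', 'value.format' = 'json');

-- European cities with more than a thousand users
INSERT INTO total_users
SELECT city, total
FROM europe_users
WHERE total > 1000;

-- U.S. cities with more than a thousand users
INSERT INTO total_users
SELECT city, total
FROM us_users
WHERE total > 1000;
```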
The below DDL statements create two Streams, pageviews and keyed_pageviews. Notice that keyed_pageviews has the key.format and key.type Stream properties set, while pageviews doesn't. These DDLs are followed by an INSERT INTO query with a PARTITION BY clause that sets the key for the sink Stream. For the query to be accepted, the key type generated by the PARTITION BY clause must match the key.type specified in the CREATE STREAM keyed_pageviews DDL, which it does, as shown in the example below.
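A sketch of the referenced DDLs and query; the column list, topic names, and the 'primitive' key format are assumptions:

```sql
-- pageviews: no key.format or key.type set
CREATE STREAM pageviews (
  viewtime BIGINT,
  userid VARCHAR,
  pageid VARCHAR
) WITH ('topic' = 'pageviews', 'value.format' = 'json');

-- keyed_pageviews: key format and key type declared explicitly
CREATE STREAM keyed_pageviews (
  viewtime BIGINT,
  userid VARCHAR,
  pageid VARCHAR
) WITH (
  'topic' = 'keyed_pageviews',
  'value.format' = 'json',
  'key.format' = 'primitive',
  'key.type' = 'VARCHAR'
);

-- PARTITION BY userid yields a VARCHAR key, matching key.type above
INSERT INTO keyed_pageviews
SELECT viewtime, userid, pageid
FROM pageviews
PARTITION BY userid;
```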
Given this input for pageviews:
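Hypothetical input records, assuming the column set above (keys shown as empty since pageviews has no key set):

```
key: <none>  value: {"viewtime": 1001, "userid": "User_1", "pageid": "Page_21"}
key: <none>  value: {"viewtime": 1002, "userid": "User_2", "pageid": "Page_88"}
```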
We can expect the following output in keyed_pageviews:
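Hypothetical output matching the input above; PARTITION BY userid makes each record's userid value its key:

```
key: "User_1"  value: {"viewtime": 1001, "userid": "User_1", "pageid": "Page_21"}
key: "User_2"  value: {"viewtime": 1002, "userid": "User_2", "pageid": "Page_88"}
```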