ClickHouse
ClickHouse is a fast and resource-efficient real-time data warehouse and open-source database.
This document walks through setting up ClickHouse to be used as a data store in DeltaStream.
Step through the following demo to get an overview. Detailed instructions are below the demo.
Setting up a ClickHouse Account
To set up your ClickHouse Cloud account, please review the ClickHouse Cloud Quick Start guide.
Adding ClickHouse as a DeltaStream Data Store
Open DeltaStream. In the lefthand navigation, click Resources (
) and towards the top right click + Add Data Store.
When the Choose a Data Store window opens, click ClickHouse and then click Next.

3. The Add Data Store window displays.

Enter the authentication and connection parameters. These include:
Name – A name that identifies your ClickHouse store.
URL – JDBC URL from the ClickHouse Cloud console. Be sure to preface this URL with a
clickhouse://
scheme:

Username – User login name for the ClickHouse account.
Password – Password for the login name of the ClickHouse account.
4. Click Add to save and create the store.
For instructions on creating the store using DSQL, see CREATE STORE.
Process Streaming Data and Sink it to ClickHouse
Important Before you proceed, you must already have:
A stream defined in DeltaStream labeled
pageviews,
which is backed by a topic in an Apache Kafka Store.A ClickHouse store labeled
clickhouse_store
(see ClickHouse).
With these two items in place you can perform a simple filter on the pageviews
stream and then sink the results into ClickHouse.
Inspect the ClickHouse Store
In the lefthand navigation, click Resources (
). This displays a list of the existing data stores.
Click
clickhouse_store.
The store page opens, displaying a list of the existing databases in your ClickHouse account:(Optional) Create a new database. To do this:
Click + Add Database. When prompted, enter a name for the new database and click Add. The new database displays in the list.
To see the tables that exist under a particular database, click the database name.
Write a CTAS (CREATE TABLE AS SELECT) Query to Sink Data into ClickHouse
In the lefthand navigation, click Workspace (
).
In the SQL pane of your workspace, write the CREATE TABLE AS SELECT (CTAS) query to ingest from pageviews and output to a new table titled ch_pv_table.
Click Run.
View the existing queries, including the query from the step immediately prior. To do this, in the lefthand navigation click Queries (
). It may take a few moments for the query to transition into the Running state.
To see more details about the status of the query, click the query row :
View the results
In the lefthand navigation, click Resources (
). This displays a list of the existing data stores.
To view the new table created by the above CTAS, navigate to
clickhouse_store
→Databases
→abc
→ch_pv_table
.To view a sample of the data in your ClickHouse table, click Print.

Data Type Handling
At DeltaStream, all data types are nullable by default. But in ClickHouse, data types are not-null by default. For a smooth conversion between SQL and ClickHouse data types, DeltaStream provides a reasonable default for nullable columns to prevent Nullable()
ClickHouse data types.
Data type defaults are as followed:
BOOLEAN
Boolean
false
CHAR
String
''
VARCHAR
String
''
VARBINARY
String
''
TINYINT
Int8
0
SMALLINT
Int16
0
INTEGER
Int32
0
BIGINT
Int64
0
DECIMAL
Decimal(P, S)
0.00x
FLOAT
Float32
0.0
DOUBLE
Float64
0.0
DATE
Date32
toDate('1970-01-01')
TIMESTAMP
DateTime64(P)
toDateTime64('1970-01-01 00:00:00', P)
TIMESTAMP_WITH_LOCAL_TIME_ZONE
DateTime64(P)
toDateTime64('1970-01-01 00:00:00', P)
ARRAY
Array(T)
[]
MAP
Map(K, V)
map()
STRUCT
Tuple(T1, T2, ...)
[]
If you must use a specific default that is different than what's provided in the table above, you can use a CASE
function to alter the value at query execution:
In the above query, whenever the original pageviews
column pageid
is null
, the value Page_0
is projected to the results. This effectively eliminates all null
values for the column and also allows you to define the resulting data type for the column as pageid VARCHAR NOT NULL
.
Last updated