# ClickHouse

[ClickHouse](https://clickhouse.com/) is a fast and resource-efficient real-time data warehouse and open-source database.

This document walks through setting up ClickHouse to be used as a data store in DeltaStream.

Step through the following demo to get an overview. Detailed instructions are below the demo.

{% @storylane/embed subdomain="deltastream" url="<https://deltastream.storylane.io/share/qhiujgcjlwmq>" linkValue="qhiujgcjlwmq" %}

## Setting up a ClickHouse Account

To set up your ClickHouse Cloud account, please review the [ClickHouse Cloud Quick Start](https://clickhouse.com/docs/cloud/get-started/cloud-quick-start) guide.

## Adding ClickHouse as a DeltaStream Data Store

1. Open DeltaStream. In the lefthand navigation, click **Resources (** ![](/files/Zwq1BBdRyaRsv55N3KNm) ) and towards the top right click + **Add Data Store**.
2. When the **Choose a Data Store** window opens, click **ClickHouse** and then click **Next**.

<figure><img src="/files/DIoKKcvbTndDMPvoYgIg" alt="" width="307"><figcaption></figcaption></figure>

3\. The **Add Data Store** window displays.

<figure><img src="/files/jwenjjb9hSkQg2cMD32R" alt="" width="461"><figcaption></figcaption></figure>

4. Enter the authentication and connection parameters. These include:

* **Name** – A name that identifies your ClickHouse store.
* **URL** – JDBC URL from the ClickHouse Cloud console. Be sure to preface this URL with a `clickhouse://` scheme:

<figure><img src="/files/rowGA1iAtLjUGv89A7TC" alt="" width="375"><figcaption></figcaption></figure>

* **Username** – User login name for the ClickHouse account.
* **Password** – Password for the login name of the ClickHouse account.

4\. Click **Add** to save and create the store.

*For instructions on creating the store using DSQL, see* [CREATE STORE](/reference/sql-syntax/ddl/create-store.md).

## Process Streaming Data and Sink it to ClickHouse

{% hint style="warning" %}
**Important** Before you proceed, you must already have:

* A [stream](/overview/core-concepts/databases.md#stream) defined in DeltaStream labeled `pageviews,` which is backed by a topic in an Apache Kafka Store.
* A ClickHouse store labeled `clickhouse_store` (see [#adding-clickhouse-as-a-deltastream-store](#adding-clickhouse-as-a-deltastream-store "mention")).

With these two items in place you can perform a simple filter on the `pageviews` stream and then sink the results into ClickHouse.
{% endhint %}

### Inspect the ClickHouse Store

1. In the lefthand navigation, click **Resources (** ![](/files/Zwq1BBdRyaRsv55N3KNm) ). This displays a list of the existing data stores.<br>

   <figure><img src="/files/CMPH5Xad2RgJWBtFQeCt" alt="" width="563"><figcaption></figcaption></figure>
2. Click `clickhouse_store.` The store page opens, displaying a list of the existing databases in your ClickHouse account:<br>

   <figure><img src="/files/N7NURbjrvATB5UxK1TWo" alt=""><figcaption></figcaption></figure>
3. (Optional) Create a new database. To do this:
   * Click **+ Add Database**. When prompted, enter a name for the new database and click **Add**. The new database displays in the list.
4. To see the tables that exist under a particular database, click the database name.

### Write a CTAS (CREATE TABLE AS SELECT) Query to Sink Data into ClickHouse

1. In the lefthand navigation, click **Workspace** ( ![](/files/ZXcAkgugP7AuG9QFRXKO) ).
2. In the SQL pane of your workspace, write the [CREATE TABLE AS SELECT](/reference/sql-syntax/query/create-table-as.md) (CTAS) query to ingest from **pageviews** and output to a new table titled **ch\_pv\_table**.<br>

   ```sql
   CREATE TABLE ch_pv_table WITH (
     'store' = 'clickhouse_store',
     'clickhouse.db.name' = 'abc'
   ) AS
   SELECT
     viewtime,
     pageid,
     userid
   FROM
     pageviews
   WHERE
     pageid != 'Page_3';
   ```
3. Click **Run**.
4. View the existing queries, including the query from the step immediately prior. To do this, in the lefthand navigation click **Queries** ( ![](/files/HOEvY09XthGMf2h6wEx6) ). It may take a few moments for the query to transition into the **Running** state.
5. To see more details about the status of the query, click the query row :<br>

   <figure><img src="/files/Ke9qcKz8KL3Yg1fvD3kl" alt=""><figcaption></figcaption></figure>

### View the results

1. In the lefthand navigation, click **Resources (** ![](/files/Zwq1BBdRyaRsv55N3KNm) ). This displays a list of the existing data stores.
2. To view the new table created by the above CTAS, navigate to `clickhouse_store` → `Databases` → `abc` → `ch_pv_table`.
3. To view a sample of the data in your ClickHouse table, click **Print**.

<figure><img src="/files/cRpjbkGccnAPXWa8wXyD" alt="" width="563"><figcaption></figcaption></figure>

### Data Type Handling

At DeltaStream, all data types are nullable by default. But in ClickHouse, data types are not-null by default. For a smooth conversion between SQL and ClickHouse data types, DeltaStream provides a reasonable default for nullable columns to prevent `Nullable()` ClickHouse data types.

Data type defaults are as followed:

<table><thead><tr><th width="331.209228515625" align="center">SQL Data Type</th><th width="147.1939697265625" align="center">ClickHouse Data Type</th><th align="center">Nullable Column Default</th></tr></thead><tbody><tr><td align="center">BOOLEAN</td><td align="center">Boolean</td><td align="center">false</td></tr><tr><td align="center">CHAR</td><td align="center">String</td><td align="center">''</td></tr><tr><td align="center">VARCHAR</td><td align="center">String</td><td align="center">''</td></tr><tr><td align="center">VARBINARY</td><td align="center">String</td><td align="center">''</td></tr><tr><td align="center">TINYINT</td><td align="center">Int8</td><td align="center">0</td></tr><tr><td align="center">SMALLINT</td><td align="center">Int16</td><td align="center">0</td></tr><tr><td align="center">INTEGER</td><td align="center">Int32</td><td align="center">0</td></tr><tr><td align="center">BIGINT</td><td align="center">Int64</td><td align="center">0</td></tr><tr><td align="center">DECIMAL</td><td align="center">Decimal(P, S)</td><td align="center">0.00x</td></tr><tr><td align="center">FLOAT</td><td align="center">Float32</td><td align="center">0.0</td></tr><tr><td align="center">DOUBLE</td><td align="center">Float64</td><td align="center">0.0</td></tr><tr><td align="center">DATE</td><td align="center">Date32</td><td align="center">toDate('1970-01-01')</td></tr><tr><td align="center">TIMESTAMP</td><td align="center">DateTime64(P)</td><td align="center">toDateTime64('1970-01-01 00:00:00', P)</td></tr><tr><td align="center">TIMESTAMP_WITH_LOCAL_TIME_ZONE</td><td align="center">DateTime64(P)</td><td align="center">toDateTime64('1970-01-01 00:00:00', P)</td></tr><tr><td align="center">ARRAY</td><td align="center">Array(T)</td><td align="center">[]</td></tr><tr><td align="center">MAP</td><td align="center">Map(K, V)</td><td align="center">map()</td></tr><tr><td align="center">STRUCT</td><td align="center">Tuple(T1, T2, ...)</td><td align="center">[]</td></tr></tbody></table>

If you must use a specific default that is different than what's provided in the table above, you can use a `CASE` function to alter the value at query execution:

```sql
SELECT
  viewtime,
  userid,
  CASE
    WHEN pageid IS NULL
    THEN 'Page_0'
    END AS pageid_my_default
FROM pageviews;
```

In the above query, whenever the original `pageviews` column `pageid` is `null`, the value `Page_0` is projected to the results. This effectively eliminates all `null` values for the column and also allows you to define the resulting data type for the column as `pageid VARCHAR NOT NULL`.


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://docs.deltastream.io/integrations/setting-up-data-store-integrations/clickhouse.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
