
ClickHouse



ClickHouse is a fast, resource-efficient, open-source real-time data warehouse and database.

This document walks through setting up ClickHouse to be used as a data store in DeltaStream.


Setting up a ClickHouse Account

To set up your ClickHouse Cloud account, review the ClickHouse Cloud Quick Start guide.

Adding ClickHouse as a DeltaStream Data Store

  1. Open DeltaStream. In the lefthand navigation, click Resources, and towards the top right click + Add Data Store.

  2. When the Choose a Data Store window opens, click ClickHouse and then click Next.

  3. The Add Data Store window displays.

  4. Enter the authentication and connection parameters. These include:

  • Name – A name that identifies your ClickHouse store.

  • URL – The JDBC URL from the ClickHouse Cloud console. Be sure to preface this URL with the clickhouse:// scheme.

  • Username – User login name for the ClickHouse account.

  • Password – Password for the login name of the ClickHouse account.

  5. Click Add to save and create the store.

For instructions on creating the store using DSQL, see CREATE STORE.
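As a rough sketch, the equivalent DSQL looks like the following. The property names here ('uris', 'clickhouse.username', 'clickhouse.password') are illustrative assumptions, not confirmed syntax — consult the CREATE STORE reference for the exact parameters:

```sql
-- Hedged sketch: create a ClickHouse store via DSQL.
-- Property names are illustrative; see the CREATE STORE reference for exact syntax.
CREATE STORE clickhouse_store WITH (
  'type' = CLICKHOUSE,
  'uris' = 'clickhouse://<your-instance>.clickhouse.cloud:8443',
  'clickhouse.username' = '<username>',
  'clickhouse.password' = '<password>'
);
```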

Process Streaming Data and Sink it to ClickHouse

Important Before you proceed, you must already have:

  • A stream defined in DeltaStream labeled pageviews, backed by a topic in an Apache Kafka store.

  • A ClickHouse store labeled clickhouse_store (see Adding ClickHouse as a DeltaStream Data Store, above).

With these two items in place you can perform a simple filter on the pageviews stream and then sink the results into ClickHouse.

Inspect the ClickHouse Store

  1. In the lefthand navigation, click Resources. This displays a list of the existing data stores. Click clickhouse_store. The store page opens, displaying a list of the existing databases in your ClickHouse account.

  2. (Optional) Create a new database. To do this:

    • Click + Add Database. When prompted, enter a name for the new database and click Add. The new database displays in the list.

  3. To see the tables that exist under a particular database, click the database name.

Write a CTAS (CREATE TABLE AS SELECT) Query to Sink Data into ClickHouse

  1. In the lefthand navigation, click Workspace. In the SQL pane, write the CREATE TABLE AS SELECT (CTAS) query to ingest from pageviews and output to a new table titled ch_pv_table.

    CREATE TABLE ch_pv_table WITH (
      'store' = 'clickhouse_store',
      'clickhouse.db.name' = 'abc'
    ) AS
    SELECT
      viewtime,
      pageid,
      userid
    FROM
      pageviews
    WHERE
      pageid != 'Page_3';
  2. Click Run.

  3. To view the existing queries, including the query you just ran, in the lefthand navigation click Queries. It may take a few moments for the query to transition into the Running state. To see more details about the status of the query, click the query row.

View the results

  1. In the lefthand navigation, click Resources to display a list of the existing data stores. To view the new table created by the above CTAS, navigate to clickhouse_store → Databases → abc → ch_pv_table.

  2. To view a sample of the data in your ClickHouse table, click Print.

Data Type Handling

In DeltaStream, all data types are nullable by default; in ClickHouse, data types are non-nullable by default. To smooth the conversion between SQL and ClickHouse data types, DeltaStream substitutes a reasonable default value for nullable columns, avoiding the need for Nullable() ClickHouse data types.

Data type defaults are as follows:

| SQL Data Type | ClickHouse Data Type | Nullable Column Default |
| --- | --- | --- |
| BOOLEAN | Boolean | false |
| CHAR | String | '' |
| VARCHAR | String | '' |
| VARBINARY | String | '' |
| TINYINT | Int8 | 0 |
| SMALLINT | Int16 | 0 |
| INTEGER | Int32 | 0 |
| BIGINT | Int64 | 0 |
| DECIMAL | Decimal(P, S) | 0.00x |
| FLOAT | Float32 | 0.0 |
| DOUBLE | Float64 | 0.0 |
| DATE | Date32 | toDate('1970-01-01') |
| TIMESTAMP | DateTime64(P) | toDateTime64('1970-01-01 00:00:00', P) |
| TIMESTAMP_WITH_LOCAL_TIME_ZONE | DateTime64(P) | toDateTime64('1970-01-01 00:00:00', P) |
| ARRAY | Array(T) | [] |
| MAP | Map(K, V) | map() |
| STRUCT | Tuple(T1, T2, ...) | [] |
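To illustrate, a nullable VARCHAR column in DeltaStream lands in ClickHouse as a plain String, with '' substituted for NULLs. In ClickHouse DDL terms, the difference looks roughly like this (a hedged sketch only — DeltaStream generates the actual table definition, and the engine and ordering key shown here are assumptions):

```sql
-- Without defaults, nullable columns would need ClickHouse's Nullable() wrapper:
--   pageid Nullable(String)
-- With DeltaStream's defaults, the sink table can use plain types instead:
CREATE TABLE ch_pv_table (
  viewtime Int64,      -- BIGINT; NULLs become 0
  pageid   String,     -- VARCHAR; NULLs become ''
  userid   String      -- VARCHAR; NULLs become ''
) ENGINE = MergeTree
ORDER BY viewtime;
```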

If you must use a specific default that differs from what's provided in the table above, you can use a CASE expression to alter the value at query execution:

SELECT
  viewtime,
  userid,
  CASE
    WHEN pageid IS NULL
    THEN 'Page_0'
    ELSE pageid
  END AS pageid_my_default
FROM pageviews;

In the above query, whenever the original pageviews column pageid is null, the value Page_0 is projected to the results. This effectively eliminates all null values for the column and also allows you to define the resulting data type for the column as pageid VARCHAR NOT NULL.
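Combining this with the earlier CTAS, the custom default can be applied as the data is sunk to ClickHouse. The sketch below reuses the clickhouse_store store and abc database from the examples above; the table name ch_pv_defaults is hypothetical:

```sql
-- Sketch: sink pageviews to ClickHouse with a custom default for pageid.
-- 'clickhouse_store' and database 'abc' follow the examples above;
-- 'ch_pv_defaults' is an illustrative table name.
CREATE TABLE ch_pv_defaults WITH (
  'store' = 'clickhouse_store',
  'clickhouse.db.name' = 'abc'
) AS
SELECT
  viewtime,
  userid,
  CASE
    WHEN pageid IS NULL THEN 'Page_0'
    ELSE pageid
  END AS pageid
FROM pageviews;
```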
