Snowflake

Snowflake is a fully managed cloud-based data platform for storing and processing data. Snowflake uses its own SQL query engine to provide its users all of the functionality of an enterprise analytic database, along with many additional special features and unique capabilities.

This document will walk through setting up Snowflake to be used as a data Store in DeltaStream.

Setting up Snowflake Account

Prerequisites

  1. Unix-like environment (Linux, Mac OS X)

Add Snowflake Warehouse

  1. Under Admin in the left side panel, select Warehouses. You'll see a list of existing warehouses in your Snowflake account. Snowflake creates a starter warehouse for you called COMPUTE_WH.

  2. To create a new Warehouse, select the + Warehouse button in the top right of the screen. To edit an existing warehouse, click the 3 horizontal dots on the right of each listed warehouse and click Edit.

  3. Configure your warehouse with your preferred specifications. To learn more about configuring your warehouse, take a look at the Snowflake warehouse documentation.

Create Snowflake Key Pair Authentication

  1. In a terminal window, generate a new private key

    1. Without encryption:

      $ openssl genrsa 2048 | openssl pkcs8 -topk8 -inform PEM -out rsa_key.p8 -nocrypt

      A new file called rsa_key.p8 should be generated which contains the private key in PEM format.

    2. With encryption:

      $ openssl genrsa 2048 | openssl pkcs8 -topk8 -traditional -inform PEM -out rsa_key.p8

      Enter a password for this new private key when prompted, remember this password as it is necessary in the following steps.

      A new file called rsa_key.p8 should be generated which contains the encrypted private key in PEM format.

  2. Generate a public key

    $ openssl rsa -in rsa_key.p8 -pubout -out rsa_key.pub

    A new file called rsa_key.pub should be generated which contains the public key in PEM format.

  3. Ensure these new private and public keys are stored securely. If these files are kept on local disk, then consider restricting read and write access to these files or securing them behind a password.

  4. Assign the public key to a Snowflake user

    1. Start the snowsql REPL in your terminal. Here's the example of starting snowsql for account ABCDEF-ABC12345 with the user jsmith and warehouse COMPUTE_WH. For more information about using snowsql, see Snowflake's documentation.

      $ snowSQL -a ABCDEF-ABC12345 -u jsmith -w COMPUTE_WH
    2. Alter a user to set the RSA_PUBLIC_KEY property If your public key looks like this:

      $ cat rsa_key.pub
      -----BEGIN PUBLIC KEY-----
      MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEAwViztpm3l7IfLN/prlBb
      KJi/yTMngmTBKFp5YOyWqswlXwVlH6zehbCgFYxjo36RxxDtwqTSzaEalraalvjE
      aRKKb+esi8TYvbd1eiJGJqoq5WjTIHs0lwRMwLkCn6mHX9I4J0Wp6mR2DhjWvey6
      IE4aKT0kXrKChs/GCOGFi3lAIRIXy9FQusYW2OyNoG1iogFLoWPH1Od5bmVl6TJv
      xXXKeUj0omTq+3CKZxELxgr5nYU44MQgq/NH1X+nEjIiA0W2RiS6heKNgwRUg2EQ
      8UXbrHCjhgnONRnnHGukh9CpH4KTTmQc2hifTbH7g39f31hrEWBfVTECESUAY+qz
      kwIDAQAB
      -----END PUBLIC KEY-----

      Then the command to set the RSA_PUBLIC_KEY for user jsmith is the following:

      ALTER USER jsmith SET RSA_PUBLIC_KEY='MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEAwViztpm3l7IfLN/prlBb
      KJi/yTMngmTBKFp5YOyWqswlXwVlH6zehbCgFYxjo36RxxDtwqTSzaEalraalvjE
      aRKKb+esi8TYvbd1eiJGJqoq5WjTIHs0lwRMwLkCn6mHX9I4J0Wp6mR2DhjWvey6
      IE4aKT0kXrKChs/GCOGFi3lAIRIXy9FQusYW2OyNoG1iogFLoWPH1Od5bmVl6TJv
      xXXKeUj0omTq+3CKZxELxgr5nYU44MQgq/NH1X+nEjIiA0W2RiS6heKNgwRUg2EQ
      8UXbrHCjhgnONRnnHGukh9CpH4KTTmQc2hifTbH7g39f31hrEWBfVTECESUAY+qz
      kwIDAQAB';
    3. Note: Only security administrators (i.e. users with the SECURITYADMIN role) or higher can alter a user.

See Snowflake documentation for setting up key pair authentication for more infromation.

Adding Snowflake as a DeltaStream Store

  1. Navigate to the DeltaStream web console

  2. Select Stores in the left panel then select + New Store in the top right

  3. Enter the following information

    • Store Name – A name to identify your Snowflake Store (See Store)

    • Store TypeSnowflake

    • Access Region – DeltaStream access region to associate with the Store (See Region)

    • URL – URL for Snowflake account in the format https://<account_id>.snowflakecomputing.com (See Snowflake Account Identifiers documentation)

    • Account ID – Account identifier in the form <orgname>-<account_name> (See Snowflake Account Identifiers documentation)

    • Role Name – The name of the access control role to use for the Store operations after connecting to Snowflake (See Snowflake docs on Roles). Roles can be found by clicking on the dropdown in the top user profile icon.

    • Username – User login name for the Snowflake account

    • Warehouse Name – The name for a Snowflake compute Warehouse to use for queries and other store operations that require compute resources (See Add Snowflake Warehouse)

    • Client Key File – Snowflake account's private key in PEM format (See Create Snowflake Key Pair Authentication)

    • Key Passphrase – If applicable, passphrase for decrypting the Snowflake account's private key (See Create Snowflake Key Pair Authentication)

  4. Click Save to create the Store

For instructions on creating the Store using DSQL, see CREATE STORE.

Process Streaming Data and Sink to Snowflake

For the steps below, let's assume you already have a Stream defined called pageviews which is backed by a topic in an Apache Kafka Store. We'll also assume that there is a Snowflake Store called snowflake_store (See Adding Snowflake as a DeltaStream Store). We are going to perform a simple filter on the pageviews Stream and sink the results into Snowflake.

If you are curious about how to set up a Stream or a Kafka Store, see Starting with the Web App or Starting with the CLI.

Inspect the Snowflake store

  1. Navigate to the Stores tab in the left side panel. This will display a list of the existing Stores:

  2. Select the Store called snowflake_store then select Entities. This will bring up a list of the existing databases in your Snowflake account for the role that was specified as part of the Store's creation:

  3. (Optional) Create a new database

    1. Click on the 3 vertical dots next to your store name and select Create Entity:

    2. In the popup, enter a name for the new database and select Create. You should now be able to see the new database in the entities list.

  4. Click on a database name to see the schemas that exist under that database

  5. (Optional) Create a new schema

    1. Select + New Entity to create a new schema

    2. In the popup, enter a name for the new schema and select Create. You should now be able to see the new schema in the entities list.

  6. Click on a schema name to see the tables that exist under that schema

Write a CTAS query to sink data into Snowflake

  1. Navigate to the "SQL Editor" tab in the left side panel:

CREATE TABLE pv_table WITH (
  'store' = 'snowflake_store', 
  'snowflake.db.name' = 'new_db',
  'snowflake.schema.name' = 'new_schema'
) AS 
SELECT 
  viewtime, 
  pageid, 
  userid 
FROM 
  pageviews 
WHERE 
  pageid != 'Page_3';
  1. Select Run

  2. Navigate to the Queries tab in the left side panel to see the existing queries, including the query from step 2

  3. Refresh until you see the query is in the Running state:

  1. Navigate to the Stores tab in the left side panel

  2. Navigate to the new Table created by the above CTAS: snowflake_store --> Entities --> new_database --> new_schema --> pv_table. NOTE: If you wrote your CTAS such that the store/database/schema/table names are different, then navigate accordingly.

  3. Select Print to see a sample of the data from your Snowflake table, pv_table:

Last updated