Data Types
DeltaStream's data types are similar to SQL standard's data types. They are used to declare the value type for columns in relations or input/output parameters in functions.
Primitive Data Types
Boolean
BOOLEAN
Boolean value representing true
or false.
Character string
VARCHAR
VARCHAR(n)
Variable-length character string.
Type parameter n
represents maximum length of string (number of code points) and has to be an integer between 1
and 2,147,483,647
.
If no length specified, maximum length is picked.
Numeric
TINYINT
1-byte signed integer.
It can hold an integer value between -128
and 127
.
SMALLINT
2-byte signed integer.
It can hold an integer value between -32,768
and 32,767
.
INTEGER
4-byte signed integer.
It can hold an integer value between -2^31
and 2^31-1
.
BIGINT
8-byte signed integer.
It can hold an integer value between -2^63
and 2^63-1
FLOAT
4-byte variable precision floating point number.
DOUBLE
8-byte variable precision floating point number.
DECIMAL
DECIMAL(p, s)
Decimal number with fixed precision and scale.
Decimal type accepts two type parameters: p
: for precision and s
for scale.
Precision is the maximum total number of digits to be stored in the number (an integer value between 1
and 38
).
Scale is the number of decimal digits to the right of the decimal point (an integer value between 0
and p
).
If no type parameters specified, default precision of 10
and default scale of 0
are picked.
Date and Time
DATE
Calendar date consisting of year-month-day
with values between 0000-01-01
and 9999-12-31
.
TIME
TIME(p)
Time of day, without time zone, consisting of hour:minute:second[.fractional]
(up to nanosecond precision) with values between 00:00:00.000000000
and 23:59:59.999999999
.
Type parameter p
represents precision
which is the number of digits of fractional seconds (an integer between 0
and 9
).
If no precision specified, default precision of 0
is picked.
TIMESTAMP
TIMESTAMP(p)
Timestamp, without time zone, consisting of year-month-day hour:minute:second[.fractional]
(up to nanosecond precision) with values between 0000-01-01 00:00:00.000000000
and 9999-12-31 23:59:59.999999999
.
Type parameter p
represents precision
which is the number of digits of fractional seconds (an integer between 0
and 9
).
If no precision specified, default precision of 6
is picked.
There are two types of formats for timestamps that are supported – sql
and iso8601
.
TIMESTAMP WITH LOCAL TIME ZONE
TIMESTAMP(p) WITH LOCAL TIME ZONE
TIMESTAMP_LTZ
TIMESTAMP_LTZ(p)
Timestamp with local time zone consisting of year-month-day hour:minute:second[.fractional] zone
(up to nanosecond precision) with values between 0000-01-01 00:00:00.000000000Z
to 9999-12-31 23:59:59.999999999Z
.
Values for this type assume Instant
semantics in the UTC time zone.
Type parameter p
represents precision
which is the number of digits of fractional seconds (an integer between 0
and 9
).
If no precision specified, default precision of 6
is picked. TIMESTAMP_LTZ
is a synonym for TIMESTAMP WITH LOCAL TIME ZONE
.
There are two types of formats for timestamps that are supported – sql
and iso8601
.
Binary string
VARBINARY
VARBINARY(n)
BYTES
Variable-length sequence of bytes.
Type parameter n
represents maximum length of sequence (number of bytes) and has to be an integer between 1
and 2,147,483,647
.
If no length specified, default length of 1
is picked.
BYTES
is a synonym for VARBINARY
with maximum length (VARBINARY(2147483647)
).
Constructed Data Types
ARRAY<t>
Array of elements with same data type.
Type parameter t
is required and represents data type of array elements.
The maximum number of elements of an array can not be specified and is fixed at 2,147,483,647
.
Similar to SQL, arrays are one-indexed (first element is at index 1
) and a given element can be accessed using the []
operator with the index.
Example. In a relation DDL statement, the expression employees ARRAY<VARCHAR>
declares a column named employees
with values of type array holding character strings. employees[6]
is used to access the 6th element in a given value from this column.
MAP<k,v>
Associative array that maps keys to values.
No duplicate keys can exist in a map and each key can map to at most one value.
Type parameters k
and v
are required and they represent data type of the key elements and the value elements, respectively, throughout the map.
Order of keys within a MAP
is not guaranteed, and may use one of ARRAY
or STRUCT
to guaranteed ordering within a set of queries.
For a given key, the associated value can be accessed using the []
operator with the key.
Example. In a relation DDL statement, the expression manager MAP<VARCHAR, VARCHAR>
declares a column named manager
with values of type map holding key value pairs of character string types. manager['HR']
is used to access the value associated with the key HR
in a given value instance from this column.
STRUCT<fn ft, ...>
Strongly typed structured data type that represents an ordered collection of one or more fields
.
A field
consists of a name and a data type and captures a specific value in a given instance of struct.
Sequence of type parameter pair(s) fn ft
represent name and data type for each and every field, in order.
A given field of a struct instance is accessed using the ->
operator with the field name.
Example. In a relation DDL statement, the expression address STRUCT<city VARCHAR, zipcode VARCHAR>
declares a column named address
with values of type struct holding two fields of character string types, named city
and zipcode
. address->city
is used to access the city
field in a given value instance from this column.
Last updated