Built-in Functions
The built-in functions described in this page are usable from DeltaStream SQL streaming queries.
Aggregate Functions
COUNT([ ALL | DISTINCT] expression)
By default or with ALL, returns the number of distinct input rows for which the expression
is not NULL
. Use DISTINCT for one unique instance of each value.
COUNT(*)
Returns the number of input rows.
AVG([ ALL | DISTINCT] expression)
SUM([ ALL | DISTINCT] expression)
By default or with ALL, returns the sum of the expression
across all input rows. Use DISTINCT for one unique instance of each value.
MAX([ ALL | DISTINCT] expression)
By default or with ALL, returns the maximum value of the expression
across all input rows. Use DISTINCT for one unique instance of each value.
MIN([ ALL | DISTINCT] expression)
By default or with ALL, returns the minimum value of the expression
across all input rows. Use DISTINCT for one unique instance of each value.
STDDEV_POP([ ALL | DISTINCT] expression)
By default or with ALL, returns the population standard deviation of the expression
across all input rows. Use DISTINCT for one unique instance of each value.
STDDEV_SAMP([ ALL | DISTINCT] expression)
By default or with ALL, returns the sample standard deviation of the expression
across all input rows. Use DISTINCT for one unique instance of each value.
VAR_POP([ ALL | DISTINCT] expression)
By default or with ALL, returns the population variance (square of the population standard deviation) of the expression
across all input rows. Use DISTINCT for one unique instance of each value.
VAR_SAMP([ ALL | DISTINCT] expression)
By default or with ALL, returns the sample variance (square of the sample standard deviation) of the expression
across all input rows. Use DISTINCT for one unique instance of each value.
VARIANCE([ ALL | DISTINCT] expression)
Synonyms for VAR_SAMP()
.
FIRST_VALUE(expression)
Returns the first value in an ordered set of values.
LAST_VALUE(expression)
Returns the last value in an ordered set of values.
LISTAGG(expression [, separator])
Concatenates the values of string expression
and places separator values between them. The separator
is not added at the end of string. The default value of separator
is ,
.
DS_LAG_<TYPE>(value <TYPE>, integer)
Returns the value
at integer
number of rows before the current row in the window. The type of the value must match the DS_LAG_<TYPE>
function and must not be NULL
.
If no such value can be found, NULL
is returned.
Note: if your value
is nullable, one workaround is to use the IFNULL
function to change nulls to some other value.
Example: DS_LAG_VARCHAR(IFNULL(col1, 'EMPTY'), 2)
Arithmetic Functions
+ numeric
Returns numeric
.
- numeric
Returns negative numeric
.
numeric1 + numeric2
Returns numeric1
plus numeric2
.
numeric1 - numeric2
Return numeric1
minus numeric2
.
numeric1 * numberic2
Returns numeric1
multiplied by numeric2
.
numeric1 / numeric2
Returns numeric1
divided by numeric2
.
numeric1 % numeric2
Returns the remainder (modulus) of numeric1
divided by numeric2
. The result is negative only if numeric1
is negative.
POWER(numeric1, numeric2)
numeric1
to the power of numeric2
.
ABS(numeric)
Returns the absolute value of numeric
.
SQRT(numeric)
Returns the square root of numeric
.
LN(numeric)
Returns the natural logarithm (base e) of numeric
.
LOG10(numeric)
Returns the base 10 logarithm of numeric
.
LOG2(numeric)
Returns the base 2 logarithm of numeric
.
LOG(numeric1 [, numeric2])
When called with one argument, returns the natural logarithm of numeric2
. When called with two arguments, this function returns the logarithm of numeric2
to the base numeric1
. numeric2
must be greater than 0
and numeric1
must be greater than 1
.
EXP(numeric)
Returns e raised to the power of numeric
.
CEIL(numeric) CEILING(numeric)
Rounds numeric
up, and returns the smallest number that is greater than or equal to numeric
.
FLOOR(numeric)
Rounds numeric
down, and returns the largest number that is less than or equal to numeric
.
SIN(numeric)
Returns the sine of numeric
.
SINH(numeric)
Returns the hyperbolic sine of numeric
. The return type is DOUBLE
.
COS(numeric)
Returns the cosine of numeric
.
TAN(numeric)
Returns the tangent of numeric
.
TANH(numeric)
Returns the hyperbolic tangent of numeric
. The return type is DOUBLE
.
COT(numeric)
Returns the cotangent of a numeric
.
ASIN(numeric)
Returns the arc sine of numeric
.
ACOS(numeric)
Returns the arc cosine of numeric
.
ATAN(numeric)
Returns the arc tangent of numeric
.
ATAN2(numeric1, numeric2)
Returns the arc tangent of a coordinate (numeric1, numeric2)
.
COSH(numeric)
Returns the hyperbolic cosine of numeric
. Return value type is DOUBLE
.
DEGREES(numeric)
Returns the degree representation of a radian numeric
.
RADIANS(numeric)
Returns the radian representation of a degree numeric
.
SIGN(numeric)
Returns the signum of numeric
.
ROUND(numeric, integer)
Returns a number rounded to integer
decimal places for numeric
.
PI()
Returns a value that is closer than any other values to pi.
E()
Returns a value that is closer than any other values to e.
RAND()
Returns a pseudorandom double value in the range [0.0, 1.0)
RAND(integer)
Returns a pseudorandom double value in the range [0.0, 1.0) with an initial seed integer
. Two RAND
functions will return identical sequences of numbers if they have the same initial seed.
RAND_INTEGER(integer)
Returns a pseudorandom integer value in the range [0, integer
).
RAND_INTEGER(integer1, integer2)
Returns a pseudorandom integer value in the range [0, integer2
) with an initial seed integer1
. Two RAND_INTGER
functions will return identical sequences of numbers if they have the same initial seed and bound.
UUID()
Returns a Universally Unique Identifier (UUID) string (e.g., 3d3c68f7-f608-473f-b60c-b0c44ad4cc4e
) according to RFC 4122 type 4 (pseudo randomly generated) UUID. The UUID is generated using a cryptographically strong pseudo random number generator.
BIN(integer)
Returns a string representation of integer
in binary format. Returns NULL
, if integer
is NULL
.
HEX(numeric) HEX(string)
Returns a string representation of an integer numeric
value or a string
in hex format. Returns NULL
, if the argument is NULL
.
TRUNCATE(numeric1 [, integer2])
Returns a numeric
of truncated to integer2
decimal places. Returns NULL
, if numeric1
or integer2
is NULL
. If integer2
is 0
, the result has no decimal point or fractional part. integer2
can be negative to cause integer2
digits left of the decimal point of the value to become zero.
This function can also pass in only one numeric1
parameter and not set Integer2
to use. If Integer2
is not set, the function truncates as if Integer2
were 0
.
Collection Functions
CARDINALITY(array)
Comparison Functions
value1 = value2
Returns TRUE
, if value1 is equal to value2
; returns NULL
, if value1
or value2
is NULL
.
value1 <> value2
Returns TRUE
, if value1
is not equal to value2
; returns NULL
, if value1
or value2
is NULL
.
value1 > value2
Returns TRUE
, if value1
is greater than value2
; returns NULL
, if value1
or value2
is NULL
.
value1 >= value2
Returns TRUE
, if value1
is greater than or equal to value2
; returns NULL
, if value1
or value2
is NULL
.
value1 < value2
Returns TRUE
, if value1
is less than value2
; returns NULL
, if value1
or value2
is NULL
.
value1 <= value2
Returns TRUE
, if value1
is less than or equal to value2
; returns NULL
, if value1
or value2
is NULL
.
value IS NULL
Returns TRUE
, if value is NULL
.
value IS NOT NULL
Returns TRUE
, if value is NULL
.
value1 IS DISTINCT FROM value2
Returns TRUE
, if two values are different. NULL
values are treated as identical.
value1 IS NOT DISTINCT FROM value2
Returns TRUE
, if two values are equal. NULL
values are treated as identical.
value1 BETWEEN value2 AND value3
Returns TRUE
, if value1
is greater than or equal to value2
and less than or equal to value3
. When either value2
or value3
is NULL
, returns FALSE
or NULL
.
value1 NOT BETWEEN value2 AND value3
Returns TRUE
, if value1
is less than value2
or greater than value3
. When either value2
or value3
is NULL
, returns TRUE
or NULL
.
string1 LIKE string2
Returns TRUE
, if string1
matches pattern string2
; returns NULL
, if string1
or string2
is NULL
.
string1 NOT LIKE string2
Returns TRUE
, if string1
does not match pattern string2
; returns NULL
if string1
or string2
is NULL
.
value1 IN (value2 [, value3]* )
Returns TRUE
, if value1
exists in the given list (value2, value3, …)
. When (value2, value3, …)
. contains NULL
, returns TRUE
, if the element can be found and NULL
otherwise. Always returns NULL
if value1
is NULL
.
value1 NOT IN (value2 [, value3]* )
Returns TRUE
if value1 does not exist in the given list (value2, value3, …)
. When (value2, value3, …)
. contains NULL
, returns FALSE
if value1
can be found and NULL
otherwise. Always returns NULL
if value1
is NULL
.
Conditional Functions
CASE value WHEN value1_1 [, value1_2]* THEN RESULT1 (WHEN value2_1 [, value2_2 ]* THEN result_2)* (ELSE result_z) END
Returns resultX
when the first time value is contained in (valueX_1
, valueX_2
, …). When no value matches, returns result_z
, if it is provided and returns NULL
otherwise.
CASE WHEN condition1 THEN result1 (WHEN condition2 THEN result2)* (ELSE result_z) END
Returns resultX
when the first conditionX
is met. When no condition is met, returns result_z
, if it is provided and returns NULL
otherwise.
NULLIF(value1, value2)
Returns NULL
if value1
is equal to value2
; returns value1
otherwise.
COALESCE(value1 [, value2]*)
Returns the first argument that is not NULL
.
If all arguments are NULL
, the query fails. The return type is the least restrictive and common type of all of its arguments. The return type is nullable if all arguments are nullable as well.
IF(condition, true_value, false_value)
Returns the true_value
if the condition is met, otherwise false_value
.
IFNULL(input, null_replacement)
Returns null_replacement if input is NULL
; otherwise input is returned.
Compared to COALESCE
or CASE WHEN
, this function returns a data type that is very specific in terms of nullability. The returned type is the common type of both arguments but only nullable if the null_replacement is nullable.
The function allows to pass nullable columns into a function or table that is declared with a NOT NULL
constraint.
IS_ALPHA(string)
Returns TRUE
if all characters in string
are letter; otherwise FALSE
.
IS_DECIMAL(string)
Returns TRUE
if string
can be parsed to a valid numeric; otherwise FALSE
.
IS_DIGIT(string)
Returns TRUE
, if all characters in string
are digits; otherwise FALSE
.
GREATEST(value1[, value2]*)
Returns the greatest value of the list of arguments. Returns NULL
, if any argument is NULL
.
LEAST(value1[, value2]*)
Returns the least value of the list of arguments. Returns NULL
, if any argument is NULL
.
Hash Functions
MD5(string)
Returns the MD5 hash of string
as a string of 32 hexadecimal digits; returns NULL
, if string
is NULL
.
SHA1(string)
Returns the SHA-1 hash of string
as a string of 40 hexadecimal digits; returns NULL
, if string
is NULL
.
SHA224(string)
Returns the SHA-224 hash of string
as a string of 56 hexadecimal digits; returns NULL
, if string
is NULL
.
SHA256(string)
Returns the SHA-256 hash of string
as a string of 64 hexadecimal digits; returns NULL
, if string
is NULL
.
SHA384(string)
Returns the SHA-384 hash of string
as a string of 96 hexadecimal digits; returns NULL
, if string
is NULL
.
SHA512(string)
Returns the SHA-512 hash of string
as a string of 128 hexadecimal digits; returns NULL
, if string
is NULL
.
SHA2(string, hashLength)
Returns the hash of string
using the SHA-2 family of hash functions (SHA-224, SHA-256, SHA-384, or SHA-512). hashLength
is the bit length of the result (224
, 256
, 384
, or 512
). Returns NULL
, if string
or hashLength
is NULL
.
DS_HASHUUID(string)
Returns the hash of string
to a UUID (returned as a String) generated from the string
bytes.
Logical Functions
boolean1 OR boolean2
Returns TRUE
, if boolean1
is TRUE
or boolean2
is TRUE
. Supports three-valued logic.
boolean1 AND boolean2
Returns TRUE
, if boolean1
and boolean2
are both TRUE
. Supports three-valued logic.
NOT boolean
Returns TRUE
, if boolean
is FALSE
; returns FALSE
, if boolean
is TRUE
; returns NULL
, if boolean
is NULL
.
Offset Functions
These functions are used in the scope of the MATCH_RECOGNIZE operator:
LAST(variable.column [, n])
Returns the value of the column
from the record that was mapped to the n
-th last element of the variable
. The counting starts at the last element mapped.
If n
is not provided, the last occurance of the variable definition is returned.
FIRST(variable.column [, n])
Returns the value of the column
from the record that was mapped to the n
-th element of the variable
. The counting starts at the first element mapped.
If n
is not provided, the first occurance of the variable definition is returned.
String Functions
string1 || string2
Returns the concatenation of string1
and string2
.
CHAR_LENGTH(string) CHARACTER_LENGTH(string)
Returns the number of characters in string
.
UPPER(string)
Returns string
in uppercase.
LOWER(string)
Returns string
in lowercase.
POSITION(string1 IN string2)
Returns the position (start from 1) of the first occurrence of string1
in string2
; returns 0
if string1
cannot be found in string2
.
TRIM([ BOTH | LEADING | TRAILING ] string1 FROM string2)
Returns a string that removes leading and/or trailing characters string2
from string2
. By default, whitespaces at both sides are removed.
LTRIM(string)
Returns a string that removes the left whitespaces from string
.
RTRIM(string)
Returns a string that removes the right whitespaces from string
.
REPEAT(string, integer)
Returns a string that repeats the base string
, integer
times.
REGEXP_REPLACE(string1, string2, string3)
Returns a string from string1
with all the substrings that match a regular expression string2
consecutively being replaced with string3
.
OVERLAY(string1 PLACING string2 FROM integer1 [ FOR integer2 ])
Returns a string that replaces integer2
(string2
’s length by default) characters of string1
with string2
from position integer1
.
SUBSTRING(string FROM integer1 [ FOR integer2 ])
Returns a substring of string
starting from position integer1
with length integer2
(to the end by default).
REPLACE(string1, string2, string3)
Returns a new string which replaces all the occurrences of string2
with string3
(non-overlapping) from string1
.
REGEXP_EXTRACT(string1, string2[, integer])
Returns a string from string1
, which extracted with a specified regular expression string2
and a regex match group index integer
.
The regex match group index starts from 1
and 0
means matching the whole regex. In addition, the regex match group index should not exceed the number of the defined groups.
INITCAP(string)
Returns a new form of string
with the first character of each word converted to uppercase and the rest characters to lowercase. Here a word means a sequence of alphanumeric characters.
CONCAT(string1, string2,...)
Returns a string that concatenates string1
, string2
, … Returns NULL
, if any argument is NULL
.
CONCAT_WS(string1, string2, string3,...)
Returns a string that concatenates string2
, string3
, … with a separator string1
. The separator is added between the strings to be concatenated. Returns NULL
, If string1
is NULL
. Compared with CONCAT()
, CONCAT_WS()
automatically skips NULL
arguments.
LPAD(string1, integer, string2)
Returns a new string from string1
left-padded with string2
to a length of integer
characters. If the length of string1
is shorter than integer
, returns string1
shortened to integer
characters.
RPAD(string1, integer, string2)
Returns a new string from string1
right-padded with string2
to a length of integer
characters. If the length of string1
is shorter than integer
, returns string1
shortened to integer
characters.
FROM_BASE64(string)
Returns the base64-decoded result from string
; returns NULL
, if string is NULL
.
TO_BASE64(string)
Returns the base64-encoded result from string
; returns NULL
, if string is NULL
.
DS_FROMBASE64(string)
Returns the decoded bytes as a value of the data type BYTES
for a base64-encoded string
; returns NULL
, if string is NULL
.
DS_TOBASE64(bytes)
Returns the base64-encoded string
from BYTES
; returns NULL
, if input bytes argument is NULL
.
ASCII(string)
Returns the numeric value of the first character of string. Returns NULL
if string is NULL
.
CHR(integer)
Returns the ASCII character having the binary equivalent to an integer
. If an integer
is larger than 255, we will get the modulus of integer divided by 255 first, and returns CHR
of the modulus. Returns NULL
, if integer is NULL
.
DECODE(binary, string)
Decodes binary
into a STRING
using the string
character set. Valid character sets are US-ASCII
, ISO-8859-1
, UTF-8
, UTF-16BE
, UTF-16LE
, UTF-16
. If either argument is NULL
, the result will also be NULL
.
ENCODE(string1, string2)
Encodes string1
into a BINARY
using string2
character set. Valid character sets are US-ASCII
, ISO-8859-1
, UTF-8
, UTF-16BE
, UTF-16LE
, UTF-16
. If either argument is NULL
, the result will also be NULL
.
INSTR(string1, string2)
Returns the position of the first occurrence of string2
in string1
. Returns NULL
, if any of the arguments is NULL
.
LEFT(string, integer)
Returns the leftmost integer
characters from the string
. Returns EMPTY
String, if integer
is negative. Returns NULL
, if any argument is NULL
.
RIGHT(string, integer)
Returns the rightmost integer
characters from the string
. Returns EMPTY
String, if integer
is negative. Returns NULL
if any argument is NULL
.
LOCATE(string1, string2[, integer])
Returns the position of the first occurrence of string1
in string2
after position integer
. Returns 0
, if not found. Returns NULL
, if any argument is NULL
.
PARSE_URL(string1, string2[, string3])
Returns the specified part from the URL. Valid values for string2
include HOST
, PATH
, QUERY
, REF
, PROTOCOL
, AUTHORITY
, FILE
, and USERINFO
. Returns NULL
if any argument is NULL
.
A value of a particular key in string1
can also be extracted by providing the key as the third argument string3
.
REGEXP(string1, string2)
Returns TRUE
if any (possibly empty) substring of string1
matches the Java regular expression string2
; otherwise FALSE
. Returns NULL
, if an argument is NULL
.
REVERSE(string)
Returns the reversed string. Returns NULL
if string is NULL
.
SPLIT_INDEX(string1, string2, integer)
Splits string1
by the delimiter string2
, returns the integer
index (zero-based) string of the split strings. Returns NULL
, if integer
is negative. Returns NULL
, if any argument is NULL
.
STR_TO_MAP(string1[, string2, string3])
Returns a map after splitting the string1
into key/value pairs using delimiters. string2
is the pair delimiter, default is ,
. And string3
is the key-value delimiter, default is =
. Both the pair delimiter and key-value delimiter are treated as regular expressions. So special characters (e.g. <([{\^-=$!|]})?*+.>
) need to be properly escaped before using as a delimiter literally.
SUBSTR(string, integer1[, integer2])
Returns a substring of string
starting from position integer1
with length integer2
(to the end by default).
Temporal Functions
DATE string
Returns a SQL date parsed from string
in form of yyyy-MM-dd
.
TIME string
Returns a SQL time parsed from string
in form of HH:mm:ss
.
NOW()
Returns the current SQL timestamp in the local time zone; this is a synonym of CURRENT_TIMESTAMP()
.
CURRENT_ROW_TIMESTAMP()
Returns the current SQL timestamp in the local time zone; the return type is TIMESTAMP_LTZ(3)
. It is evaluated for each record no matter in batch or streaming mode.
EXTRACT(timeinteravlunit FROM temporal)
Returns a long value extracted from the timeintervalunit
part of temporal
.
YEAR(date)
Returns the year from the SQL date. Equivalent to EXTRACT(YEAR FROM date)
.
QUARTER(date)
Returns the quarter of a year (an integer between 1 and 4) from the SQL date. Equivalent to EXTRACT(QUARTER FROM date)
.
MONTH(date)
Returns the month of a year (an integer between 1 and 12) from the SQL date. Equivalent to EXTRACT(MONTH FROM date)
.
WEEK(date)
Returns the week of a year (an integer between 1 and 53) from the SQL date. Equivalent to EXTRACT(WEEK FROM date)
.
DAYOFYEAR(date)
Returns the day of a year (an integer between 1 and 366) from the SQL date. Equivalent to EXTRACT(DOY FROM date)
.
DAYOFMONTH(date)
Returns the day of a month (an integer between 1 and 31) from the SQL date. Equivalent to EXTRACT(DAY FROM date)
.
DAYOFWEEK(date)
Returns the day of a week (an integer between 1 and 7) from the SQL date. Equivalent to EXTRACT(DOW FROM date)
.
HOUR(timestamp)
Returns the hour of a day (an integer between 0 and 23) from the SQL timestamp. Equivalent to EXTRACT(HOUR FROM timestamp)
.
MINUTE(timestamp)
Returns the minute of an hour (an integer between 0 and 59) from the SQL timestamp. Equivalent to EXTRACT(MINUTE FROM timestamp)
.
SECOND(timestamp)
Returns the second of a minute (an integer between 0 and 59) from the SQL timestamp. Equivalent to EXTRACT(SECOND FROM timestamp)
.
FLOOR(timepoint TO timeintervalunit)
Returns a value that rounds the timepoint
down to the time unit timeintervalunit
.
CEIL(timepoint TO timeintervaluntit)
Returns a value that rounds the timepoint
up to the time unit timeintervalunit
.
CONVERT_TZ(string1, string2, string3)
Converts a datetime string1
(with default ISO timestamp format yyyy-MM-dd HH:mm:ss
) from time zone string2
to time zone string3
. The format of the time zone should be either an abbreviation such as PST
, a full name such as America/Los_Angeles
, or a custom ID such as GMT-08:00
.
FROM_UNIXTIME(numeric[, string])
Returns a representation of the numeric
argument as a value in string
format (default is yyyy-MM-dd HH:mm:ss
). numeric
is an internal timestamp value representing seconds since 1970-01-01 00:00:00
UTC, such as produced by the UNIX_TIMESTAMP()
function. The return value is expressed in the session time zone.
UNIX_TIMESTAMP()
Gets current Unix timestamp in seconds. This function is not deterministic, which means the value would be recalculated for each record.
UNIX_TIMESTAMP(string1[, string2])
Converts date time string string1
in format string2
(by default: yyyy-MM-dd HH:mm:ss
, if not specified) to a Unix timestamp (in seconds), using the specified timezone in table config.
TO_DATE(string1[, string2])
Converts a date string string1
with format string2
(by default yyyy-MM-dd
) to a date.
TO_TIMESTAMP_LTZ(numeric, precision)
Converts epoch seconds or epoch milliseconds to a TIMESTAMP_LTZ
; the valid precision is 0
or 3
, and the 0
represents TO_TIMESTAMP_LTZ(epochSeconds, 0)
, the 3
represents TO_TIMESTAMP_LTZ(epochMilliseconds, 3)
.
TO_TIMESTAMP(string1[, string2])
Converts date time string string1
with format string2
(by default: yyyy-MM-dd HH:mm:ss
) under the UTC+0
time zone to a timestamp.
DS_TOEPOCH(<TIMESTAMP_LTZ>)
Converts a TIMESTAMP_LTZ
instance (in the UTC time zone) to its epoch milliseconds.
Type Conversion Functions
CAST(expression AS datatype)
Returns a new value being cast to datatype
. A CAST()
error throws an exception and stops a query. When performing a cast operation that may fail, like STRING
to INTEGER
, one should rather use TRY_CAST()
, in order to handle errors.
TRY_CAST(expression AS datatype)
Like CAST()
, but in case of error, returns NULL
rather than failing the query.
Value Construction Functions
STRUCT(expression [, expression, ...])
Returns a new STRUCT<ft1, ft2, ...>
created from a list of items (expression1item, expression2item, …)
from (expression1, expression2, ...)
, respectively.
See CAST()
and TRY_CAST()
for user-defined field names for constructing a new STRUCT<fn ft, ...>
value.
MAP[expression := expression [, expression := expression, ...]]
Returns a MAP<k,v>
created from a list of key-value pairs ((k1, v1), (k2, v2), …).
ARRAY[expression [, expression, ...]]
We use Apache Flink® as our processing engine, and most of these functions come fully-supported with Flink® out of the box. We'd like to thank and give credit to the Flink® contributors and community for providing a great open source project.
Last updated