Builtin Functions

Numeric Functions

abs

  • Syntax:

    abs(numeric_value)
    
  • Computes the absolute value of the argument.

  • Arguments:
    • numeric_value: a tinyint/smallint/integer/bigint/float/double value.
  • Return Value:
    • The absolute value of the argument with the same type as the input argument,
    • missing if the argument is a missing value,
    • null if the argument is a null value,
    • any other non-numeric input value will cause a type error.
  • Example:

    { "v1": abs(2013), "v2": abs(-4036), "v3": abs(0), "v4": abs(float("-2013.5")), "v5": abs(double("-2013.593823748327284")) };
    
  • The expected result is:

    { "v1": 2013, "v2": 4036, "v3": 0, "v4": 2013.5, "v5": 2013.5938237483274 }
    

acos

  • Syntax:

    acos(numeric_value)
    
  • Computes the arc cosine value of the argument.

  • Arguments:
    • numeric_value: a tinyint/smallint/integer/bigint/float/double value.
  • Return Value:
    • the double arc cosine in radians for the argument, if the argument is in the range of -1 (inclusive) to 1 (inclusive),
    • missing if the argument is a missing value,
    • null if the argument is a null value,
    • any other non-numeric input value will cause a type error,
    • “NaN” for other legitimate numeric values.
  • Example:

    { "v1": acos(1), "v2": acos(2), "v3": acos(0), "v4": acos(float("0.5")), "v5": acos(double("-0.5")) };
    
  • The expected result is:

    { "v1": 0.0, "v2": "NaN", "v3": 1.5707963267948966, "v4": 1.0471975511965979, "v5": 2.0943951023931957 }
    

asin

  • Syntax:

    asin(numeric_value)
    
  • Computes the arc sine value of the argument.

  • Arguments:
    • numeric_value: a tinyint/smallint/integer/bigint/float/double value.
  • Return Value:
    • the double arc sin in radians for the argument, if the argument is in the range of -1 (inclusive) to 1 (inclusive),
    • missing if the argument is a missing value,
    • null if the argument is a null value,
    • any other non-numeric input value will cause a type error,
    • “NaN” for other legitimate numeric values.
  • Example:

    { "v1": asin(1), "v2": asin(2), "v3": asin(0), "v4": asin(float("0.5")), "v5": asin(double("-0.5")) };
    
  • The expected result is:

    { "v1": 1.5707963267948966, "v2": "NaN", "v3": 0.0, "v4": 0.5235987755982989, "v5": -0.5235987755982989 }
    

atan

  • Syntax:

    atan(numeric_value)
    
  • Computes the arc tangent value of the argument.

  • Arguments:
    • numeric_value: a tinyint/smallint/integer/bigint/float/double value.
  • Return Value:
    • the double arc tangent in radians for the argument,
    • missing if the argument is a missing value,
    • null if the argument is a null value,
    • any other non-numeric input value will cause a type error.
  • Example:

    { "v1": atan(1), "v2": atan(2), "v3": atan(0), "v4": atan(float("0.5")), "v5": atan(double("1000")) };
    
  • The expected result is:

    { "v1": 0.7853981633974483, "v2": 1.1071487177940904, "v3": 0.0, "v4": 0.4636476090008061, "v5": 1.5697963271282298 }
    

atan2

  • Syntax:

    atan2(numeric_value1, numeric_value2)
    
  • Computes the arc tangent value of numeric_value2/numeric_value1.

  • Arguments:
    • numeric_value1: a tinyint/smallint/integer/bigint/float/double value,
    • numeric_value2: a tinyint/smallint/integer/bigint/float/double value.
  • Return Value:
    • the double arc tangent in radians for numeric_value1 and numeric_value2,
    • missing if any argument is a missing value,
    • null if any argument is a null value but no argument is a missing value,
    • any other non-numeric input value will cause a type error.
  • Example:

    { "v1": atan2(1, 2), "v2": atan2(0, 4), "v3": atan2(float("0.5"), double("-0.5")) };
    
  • The expected result is:

    { "v1": 0.4636476090008061, "v2": 0.0, "v3": 2.356194490192345 }
    

ceil

  • Syntax:

    ceil(numeric_value)
    
  • Computes the smallest (closest to negative infinity) number with no fractional part that is not less than the value of the argument. If the argument is already equal to mathematical integer, then the result is the same as the argument.

  • Arguments:
    • numeric_value: a tinyint/smallint/integer/bigint/float/double value.
  • Return Value:
    • The ceiling value for the given number in the same type as the input argument,
    • missing if the argument is a missing value,
    • null if the argument is a null value,
    • any other non-numeric input value will cause a type error.
  • Example:

    {
      "v1": ceil(2013),
      "v2": ceil(-4036),
      "v3": ceil(0.3),
      "v4": ceil(float("-2013.2")),
      "v5": ceil(double("-2013.893823748327284"))
    };
    
  • The expected result is:

    { "v1": 2013, "v2": -4036, "v3": 1.0, "v4": -2013.0, "v5": -2013.0 }
    

cos

  • Syntax:

    cos(numeric_value)
    
  • Computes the cosine value of the argument.

  • Arguments:
    • numeric_value: a tinyint/smallint/integer/bigint/float/double value.
  • Return Value:
    • the double cosine value for the argument,
    • missing if the argument is a missing value,
    • null if the argument is a null value,
    • any other non-numeric input value will cause a type error.
  • Example:

    { "v1": cos(1), "v2": cos(2), "v3": cos(0), "v4": cos(float("0.5")), "v5": cos(double("1000")) };
    
  • The expected result is:

    { "v1": 0.5403023058681398, "v2": -0.4161468365471424, "v3": 1.0, "v4": 0.8775825618903728, "v5": 0.562379076290703 }
    

cosh

  • Syntax:

    cosh(numeric_value)
    
  • Computes the hyperbolic cosine value of the argument.

  • Arguments:
    • numeric_value: a tinyint/smallint/integer/bigint/float/double value.
  • Return Value:
    • the double hyperbolic cosine value for the argument,
    • missing if the argument is a missing value,
    • null if the argument is a null value,
    • any other non-numeric input value will cause a type error.
  • Example:

    { "v1": cosh(1), "v2": cosh(2), "v3": cosh(0), "v4": cosh(float("0.5")), "v5": cosh(double("8")) };
    
  • The expected result is:

    { "v1": 1.5430806348152437, "v2": 3.7621956910836314, "v3": 1.0, "v4": 1.1276259652063807, "v5": 1490.479161252178 }
    

degrees

  • Syntax:

    degrees(numeric_value)
    
  • Converts radians to degrees

  • Arguments:
    • numeric_value: a tinyint/smallint/integer/bigint/float/double value.
  • Return Value:
    • The degrees value for the given radians value. The returned value has type double,
    • missing if the argument is a missing value,
    • null if the argument is a null value,
    • any other non-numeric input value will cause a type error.
  • Example:

    { "v1": degrees(pi()) };
    
  • The expected result is:

    { "v1": 180.0 }
    

e

  • Syntax:

    e()
    
  • Return Value:

    • e (base of the natural logarithm)
  • Example:

    { "v1": e() };
    
  • The expected result is:

    { "v1": 2.718281828459045 }
    

exp

  • Syntax:

    exp(numeric_value)
    
  • Computes enumeric_value.

  • Arguments:
    • numeric_value: a tinyint/smallint/integer/bigint/float/double value.
  • Return Value:
    • enumeric_value,
    • missing if the argument is a missing value,
    • null if the argument is a null value,
    • any other non-numeric input value will cause a type error.
  • Example:

    { "v1": exp(1), "v2": exp(2), "v3": exp(0), "v4": exp(float("0.5")), "v5": exp(double("1000")) };
    
  • The expected result is:

    { "v1": 2.718281828459045, "v2": 7.38905609893065, "v3": 1.0, "v4": 1.6487212707001282, "v5": "Infinity" }
    

floor

  • Syntax:

    floor(numeric_value)
    
  • Computes the largest (closest to positive infinity) number with no fractional part that is not greater than the value. If the argument is already equal to mathematical integer, then the result is the same as the argument.

  • Arguments:
    • numeric_value: a tinyint/smallint/integer/bigint/float/double value.
  • Return Value:
    • The floor value for the given number in the same type as the input argument,
    • missing if the argument is a missing value,
    • null if the argument is a null value,
    • any other non-numeric input value will cause a type error.
  • Example:

    {
      "v1": floor(2013),
      "v2": floor(-4036),
      "v3": floor(0.8),
      "v4": floor(float("-2013.2")),
      "v5": floor(double("-2013.893823748327284"))
    };
    
  • The expected result is:

    { "v1": 2013, "v2": -4036, "v3": 0.0, "v4": -2014.0, "v5": -2014.0 }
    

ln

  • Syntax:

    ln(numeric_value)
    
  • Computes logenumeric_value.

  • Arguments:
    • numeric_value: a tinyint/smallint/integer/bigint/float/double value.
  • Return Value:
    • logenumeric_value,
    • missing if the argument is a missing value,
    • null if the argument is a null value,
    • any other non-numeric input value will cause a type error.
  • Example:

    { "v1": ln(1), "v2": ln(2), "v3": ln(0), "v4": ln(float("0.5")), "v5": ln(double("1000")) };
    
  • The expected result is:

    { "v1": 0.0, "v2": 0.6931471805599453, "v3": "-Infinity", "v4": -0.6931471805599453, "v5": 6.907755278982137 }
    

log

  • Syntax:

    log(numeric_value)
    
  • Computes log10numeric_value.

  • Arguments:
    • numeric_value: a tinyint/smallint/integer/bigint/float/double value.
  • Return Value:
    • log10numeric_value,
    • missing if the argument is a missing value,
    • null if the argument is a null value,
    • any other non-numeric input value will cause a type error.
  • Example:

    { "v1": log(1), "v2": log(2), "v3": log(0), "v4": log(float("0.5")), "v5": log(double("1000")) };
    
  • The expected result is:

    { "v1": 0.0, "v2": 0.3010299956639812, "v3": "-Infinity", "v4": -0.3010299956639812, "v5": 3.0 }
    

pi

  • Syntax:

    pi()
    
  • Return Value:

    • Pi
  • Example:

    { "v1": pi() };
    
  • The expected result is:

    { "v1": 3.141592653589793 }
    

power

  • Syntax:

    power(numeric_value1, numeric_value2)
    
  • Computes numeric_value1numeric_value2.

  • Arguments:
    • numeric_value1: a tinyint/smallint/integer/bigint/float/double value,
    • numeric_value2: a tinyint/smallint/integer/bigint/float/double value.
  • Return Value:
    • numeric_value1numeric_value2,
    • missing if any argument is a missing value,
    • null if any argument is a null value but no argument is a missing value,
    • any other non-numeric input value will cause a type error.
  • Example:

    { "v1": power(1, 2), "v3": power(0, 4), "v4": power(float("0.5"), double("-0.5")) };
    
  • The expected result is:

    { "v1": 1, "v3": 0, "v4": 1.4142135623730951 }
    

radians

  • Syntax:

    radians(numeric_value)
    
  • Converts degrees to radians

  • Arguments:
    • numeric_value: a tinyint/smallint/integer/bigint/float/double value.
  • Return Value:
    • The radians value for the given degrees value. The returned value has type double,
    • missing if the argument is a missing value,
    • null if the argument is a null value,
    • any other non-numeric input value will cause a type error.
  • Example:

    { "v1": radians(180) };
    
  • The expected result is:

    { "v1": 3.141592653589793 }
    

round

  • Syntax:

    round(numeric_value[, round_digit])
    
  • Rounds the value to the given number of integer digits to the right of the decimal point, or to the left of the decimal point if the number of digits is negative.

  • Arguments:

    • numeric_value: a tinyint/smallint/integer/bigint/float/double value that represents the numeric value to be rounded.
    • round_digit: (Optional) a tinyint/smallint/integer/bigint/float/double value that specifies the digit to round to. This argument may be positive or negative; positive indicating that rounding needs to be to the right of the decimal point, and negative indicating that rounding needs to be to the left of the decimal point. Values such as 1.0 and 2.0 are acceptable, but values such as 1.3 and 1.5 result in a null. If omitted, the default is 0.
  • Return Value:
    • The rounded value for the given number. The returned value has the following type:
      • bigint if the input value has type tinyint, smallint, integer or bigint,
      • float if the input value has type float,
      • double if the input value has type double;
    • missing if the input value is a missing value,
    • null if the input value is a null value,
    • any other non-numeric input value will return a null value.
  • Example:

    {
      "v1": round(2013),
      "v2": round(-4036),
      "v3": round(0.8),
      "v4": round(float("-2013.256")),
      "v5": round(double("-2013.893823748327284"))
      "v6": round(123456, -1),
      "v7": round(456.456, 2),
      "v8": round(456.456, -1),
      "v9": round(-456.456, -2)
    };
    
  • The expected result is:

    { "v1": 2013, "v2": -4036, "v3": 1.0, "v4": -2013.0, "v5": -2014.0, "v6": 123460, "v7": 456.46, "v8": 460, "v9": -500 }
    

sign

  • Syntax:

    sign(numeric_value)
    
  • Computes the sign of the argument.

  • Arguments:
    • numeric_value: a tinyint/smallint/integer/bigint/float/double value.
  • Return Value:
    • the sign (a tinyint) of the argument, -1 for negative values, 0 for 0, and 1 for positive values,
    • missing if the argument is a missing value,
    • null if the argument is a null value,
    • any other non-numeric input value will cause a type error.
  • Example:

    { "v1": sign(1), "v2": sign(2), "v3": sign(0), "v4": sign(float("0.5")), "v5": sign(double("-1000")) };
    
  • The expected result is:

    { "v1": 1, "v2": 1, "v3": 0, "v4": 1, "v5": -1 }
    

sin

  • Syntax:

    sin(numeric_value)
    
  • Computes the sine value of the argument.

  • Arguments:
    • numeric_value: a tinyint/smallint/integer/bigint/float/double value.
  • Return Value:
    • the double sine value for the argument,
    • missing if the argument is a missing value,
    • null if the argument is a null value,
    • any other non-numeric input value will cause a type error.
  • Example:

    { "v1": sin(1), "v2": sin(2), "v3": sin(0), "v4": sin(float("0.5")), "v5": sin(double("1000")) };
    
  • The expected result is:

    { "v1": 0.8414709848078965, "v2": 0.9092974268256817, "v3": 0.0, "v4": 0.479425538604203, "v5": 0.8268795405320025 }
    

sinh

  • Syntax:

    sinh(numeric_value)
    
  • Computes the hyperbolic sine value of the argument.

  • Arguments:
    • numeric_value: a tinyint/smallint/integer/bigint/float/double value.
  • Return Value:
    • the double hyperbolic sine value for the argument,
    • missing if the argument is a missing value,
    • null if the argument is a null value,
    • any other non-numeric input value will cause a type error.
  • Example:

    { "v1": sinh(1), "v2": sinh(2), "v3": sinh(0), "v4": sinh(float("0.5")), "v5": sinh(double("8")) };
    
  • The expected result is:

    { "v1": 1.1752011936438014, "v2": 3.626860407847019, "v3": 0.0, "v4": 0.5210953054937474, "v5": 1490.4788257895502 }
    

sqrt

  • Syntax:

    sqrt(numeric_value)
    
  • Computes the square root of the argument.

  • Arguments:
    • numeric_value: a tinyint/smallint/integer/bigint/float/double value.
  • Return Value:
    • the double square root value for the argument,
    • missing if the argument is a missing value,
    • null if the argument is a null value,
    • any other non-numeric input value will cause a type error.
  • Example:

    { "v1": sqrt(1), "v2": sqrt(2), "v3": sqrt(0), "v4": sqrt(float("0.5")), "v5": sqrt(double("1000")) };
    
  • The expected result is:

    { "v1": 1.0, "v2": 1.4142135623730951, "v3": 0.0, "v4": 0.7071067811865476, "v5": 31.622776601683793 }
    

tan

  • Syntax:

    tan(numeric_value)
    
  • Computes the tangent value of the argument.

  • Arguments:
    • numeric_value: a tinyint/smallint/integer/bigint/float/double value.
  • Return Value:
    • the double tangent value for the argument,
    • missing if the argument is a missing value,
    • null if the argument is a null value,
    • any other non-numeric input value will cause a type error.
  • Example:

    { "v1": tan(1), "v2": tan(2), "v3": tan(0), "v4": tan(float("0.5")), "v5": tan(double("1000")) };
    
  • The expected result is:

    { "v1": 1.5574077246549023, "v2": -2.185039863261519, "v3": 0.0, "v4": 0.5463024898437905, "v5": 1.4703241557027185 }
    

tanh

  • Syntax:

    tanh(numeric_value)
    
  • Computes the hyperbolic tangent value of the argument.

  • Arguments:
    • numeric_value: a tinyint/smallint/integer/bigint/float/double value.
  • Return Value:
    • the double hyperbolic tangent value for the argument,
    • missing if the argument is a missing value,
    • null if the argument is a null value,
    • any other non-numeric input value will cause a type error.
  • Example:

    { "v1": tanh(1), "v2": tanh(2), "v3": tanh(0), "v4": tanh(float("0.5")), "v5": tanh(double("8")) };
    
  • The expected result is:

    { "v1": 0.7615941559557649, "v2": 0.964027580075817, "v3": 0.0, "v4": 0.4621171572600098, "v5": 0.999999774929676 }
    

trunc

  • Syntax:

    trunc(numeric_value, number_digits)
    
  • Truncates the number to the given number of integer digits to the right of the decimal point (left if digits is negative). Digits is 0 if not given.

  • Arguments:
    • numeric_value: a tinyint/smallint/integer/bigint/float/double value,
    • number_digits: a tinyint/smallint/integer/bigint value.
  • Return Value:
    • the double tangent value for the argument,
    • missing if any argument is a missing value,
    • null if any argument is a null value but no argument is missing,
    • a type error will be raised if:
      • the first argument is any other non-numeric value,
      • the second argument is any other non-tinyint, non-smallint, non-integer, and non-bigint value.
  • Example:

    { "v1": trunc(1, 1), "v2": trunc(2, -2), "v3": trunc(0.122, 2), "v4": trunc(float("11.52"), -1), "v5": trunc(double("1000.5252"), 3) };
    
  • The expected result is:

    { "v1": 1, "v2": 2, "v3": 0.12, "v4": 10.0, "v5": 1000.525 }
    

round_half_to_even

  • Syntax:

    round_half_to_even(numeric_value, [precision])
    
  • Computes the closest numeric value to numeric_value that is a multiple of ten to the power of minus precision. precision is optional and by default value 0 is used.

  • Arguments:
    • numeric_value: a tinyint/smallint/integer/bigint/float/double value.
    • precision: an optional tinyint/smallint/integer/bigint field representing the number of digits in the fraction of the the result
  • Return Value:
    • The rounded value for the given number in the same type as the input argument,
    • missing if any argument is a missing value,
    • null if any argument is a null value but no argument is a missing value,
    • a type error will be raised if:
      • the first argument is any other non-numeric value,
      • or, the second argument is any other non-tinyint, non-smallint, non-integer, or non-bigint value.
  • Example:

    {
      "v1": round_half_to_even(2013),
      "v2": round_half_to_even(-4036),
      "v3": round_half_to_even(0.8),
      "v4": round_half_to_even(float("-2013.256")),
      "v5": round_half_to_even(double("-2013.893823748327284")),
      "v6": round_half_to_even(double("-2013.893823748327284"), 2),
      "v7": round_half_to_even(2013, 4),
      "v8": round_half_to_even(float("-2013.256"), 5)
    };
    
  • The expected result is:

    { "v1": 2013, "v2": -4036, "v3": 1.0, "v4": -2013.0, "v5": -2014.0, "v6": -2013.89, "v7": 2013, "v8": -2013.256 }
    

String Functions

concat

  • Syntax:

    concat(string1, string2, ...)
    
  • Returns a concatenated string from arguments.

  • Arguments:
    • string1: a string value,
    • string2: a string value,
    • ….
  • Return Value:
    • a concatenated string from arguments,
    • missing if any argument is a missing value,
    • null if any argument is a null value but no argument is a missing value,
    • any other non-string input value will cause a type error.
  • Example:

    concat("test ", "driven ", "development");
    
  • The expected result is:

    "test driven development"
    

contains

  • Syntax:

    contains(string, substring_to_contain)
    
  • Checks whether the string string contains the string substring_to_contain

  • Arguments:
    • string : a string that might contain the given substring,
    • substring_to_contain : a target string that might be contained.
  • Return Value:
    • a boolean value, true if string contains substring_to_contain,
    • missing if any argument is a missing value,
    • null if any argument is a null value but no argument is a missing value,
    • any other non-string input value will cause a type error,
    • false otherwise.
  • Note: an n_gram index can be utilized for this function.

  • Example:
    { "v1": contains("I like x-phone", "phone"), "v2": contains("one", "phone") };
    
  • The expected result is:

    { "v1": true, "v2": false }
    

ends_with

  • Syntax:

    ends_with(string, substring_to_end_with)
    
  • Checks whether the string string ends with the string substring_to_end_with.

  • Arguments:
    • string : a string that might end with the given string,
    • substring_to_end_with : a string that might be contained as the ending substring.
  • Return Value:
    • a boolean value, true if string contains substring_to_contain,
    • missing if any argument is a missing value,
    • null if any argument is a null value but no argument is a missing value,
    • any other non-string input value will cause a type error,
    • false otherwise.
  • Example:

    {
      "v1": ends_with(" love product-b its shortcut_menu is awesome:)", ":)"),
      "v2": ends_with(" awsome:)", ":-)")
    };
    
  • The expected result is:

    { "v1": true, "v2": false }
    

initcap (or title)

  • Syntax:

    initcap(string)
    
  • Converts a given string string so that the first letter of each word is uppercase and every other letter is lowercase. The function has an alias called “title”.

  • Arguments:
    • string : a string to be converted.
  • Return Value:
    • a string as the title form of the given string,
    • missing if the argument is a missing value,
    • null if the argument is a null value,
    • any other non-string input value will cause a type error.
  • Example:

    { "v1": initcap("ASTERIXDB is here!"), "v2": title("ASTERIXDB is here!") };
    
  • The expected result is:

    { "v1": "Asterixdb Is Here!", "v2": "Asterixdb Is Here!" }
    

length

  • Syntax:

    length(string)
    
  • Returns the length of the string string. Note that the length is in the unit of code point. See the following examples for more details.

  • Arguments:
    • string : a string or null that represents the string to be checked.
  • Return Value:
    • an bigint that represents the length of string,
    • missing if the argument is a missing value,
    • null if the argument is a null value,
    • any other non-string input value will cause a type error.
  • Example:

    length("test string");
    
  • The expected result is:

    11
    
  • Example:

    length("👩‍👩‍👧‍👦");
    
  • The expected result is (the emoji character 👩‍👩‍👧‍👦 has 7 code points):

    7
    

lower

  • Syntax:

    lower(string)
    
  • Converts a given string string to its lowercase form.

  • Arguments:
    • string : a string to be converted.
  • Return Value:
    • a string as the lowercase form of the given string,
    • missing if the argument is a missing value,
    • null if the argument is a null value,
    • any other non-string input value will cause a type error.
  • Example:

    lower("ASTERIXDB");
    
  • The expected result is:

    "asterixdb"
    

ltrim

  • Syntax:

    ltrim(string[, chars]);
    
  • Returns a new string with all leading characters that appear in chars removed. By default, white space is the character to trim. Note that here one character means one code point. For example, the emoji 4-people-family notation “👩‍👩‍👧‍👦” contains 7 code points, and it is possible to trim a few code points (such as a 2-people-family “👨‍👦”) from it. See the following example for more details.

  • Arguments:
    • string : a string to be trimmed,
    • chars : a string that contains characters that are used to trim.
  • Return Value:
    • a trimmed, new string,
    • missing if any argument is a missing value,
    • null if any argument is a null value but no argument is a missing value,
    • any other non-string input value will cause a type error.
  • Related functions: see trim(), rtrim()
  • Example:

    ltrim("me like x-phone", "eml");
    
  • The expected result is:

    " like x-phone"
    
  • Example with multi-codepoint notation (trim the man and boy from the family of man, woman, girl and boy):

    ltrim("👨‍👩‍👧‍👦", "👨‍👦")
    
  • The expected result is (only woman, girl and boy are left in the family):

    "👩‍👧‍👦"
    

position

  • Syntax:

    position(string, string_pattern)
    
  • Returns the first position of string_pattern within string. The result is counted in the unit of code points. See the following example for more details.

  • The function returns the 0-based position. Another version of the function returns the 1-based position. Below are the aliases for each version:

    • 0-based: position, pos, position0, pos0.
    • 1-based: position1, pos1.
  • Arguments:

    • string : a string that might contain the pattern.
    • string_pattern : a pattern string to be matched.
  • Return Value:
    • the first position that string_pattern appears within string (starting at 0), or -1 if it does not appear,
    • missing if any argument is a missing value,
    • null if any argument is a null value but no argument is a missing value,
    • any other non-string input value will return a null.
  • Example:

    {
      "v1": position("ppphonepp", "phone"),
      "v2": position("hone", "phone"),
      "v3": position1("ppphonepp", "phone"),
      "v4": position1("hone", "phone")
    };
    
  • The expected result is:

    { "v1": 2, "v2": -1, v3": 3, "v4": -1 }
    
  • Example of multi-code-point character:

    position("👩‍👩‍👧‍👦🏀", "🏀");
    
  • The expected result is (the emoji family character has 7 code points):

    7
    

regexp_contains

  • Syntax:

    regexp_contains(string, string_pattern[, string_flags])
    
  • Checks whether the strings string contains the regular expression pattern string_pattern (a Java regular expression pattern).

  • Aliases:

    • regexp_contains, regex_contains, contains_regexp, contains_regex.
  • Arguments:

    • string : a string that might contain the pattern.
    • string_pattern : a pattern string to be matched.
    • string_flag : (Optional) a string with flags to be used during regular expression matching.
      • The following modes are enabled with these flags: dotall (s), multiline (m), case_insensitive (i), and comments and whitespace (x).
  • Return Value:
    • a boolean, returns true if string contains the pattern string_pattern, false otherwise.
    • missing if any argument is a missing value.
    • null if any argument is a null value but no argument is a missing value.
    • any other non-string input value will return a null.
  • Example:

    {
      "v1": regexp_contains("pphonepp", "p*hone"),
      "v2": regexp_contains("hone", "p+hone")
    };
    
  • The expected result is:

    { "v1": true, "v2": false }
    

regexp_like

  • Syntax:

    regexp_like(string, string_pattern[, string_flags])
    
  • Checks whether the string string exactly matches the regular expression pattern string_pattern (a Java regular expression pattern).

  • Aliases:

    • regexp_like, regex_like.
  • Arguments:

    • string : a string that might contain the pattern.
    • string_pattern : a pattern string that might be contained.
    • string_flag : (Optional) a string with flags to be used during regular expression matching.
      • The following modes are enabled with these flags: dotall (s), multiline (m), case_insensitive (i), and comments and whitespace (x).
  • Return Value:
    • a boolean value, true if string contains the pattern string_pattern, false otherwise.
    • missing if any argument is a missing value.
    • null if any argument is a null value but no argument is a missing value.
    • any other non-string input value will return a null.
  • Example:

    {
      "v1": regexp_like(" can't stand acast the network is horrible:(", ".*acast.*"),
      "v2": regexp_like("acast", ".*acst.*")
    };
    
  • The expected result is:

    { "v1": true, "v2": false }
    

regexp_position

  • Syntax:

    regexp_position(string, string_pattern[, string_flags])
    
  • Returns first position of the regular expression string_pattern (a Java regular expression pattern) within string. The function returns the 0-based position. Another version of the function returns the 1-based position. Below are the aliases for each version:

  • Aliases:

    • 0-Based: regexp_position, regexp_pos, regexp_position0, regexp_pos0, regex_position, regex_pos, regex_position0, regex_pos0.
    • 1-Based: regexp_position1, regexp_pos1, regex_position1 regex_pos1.
  • Arguments:

    • string : a string that might contain the pattern.
    • string_pattern : a pattern string to be matched.
    • string_flag : (Optional) a string with flags to be used during regular expression matching.
      • The following modes are enabled with these flags: dotall (s), multiline (m), case_insensitive (i), and comments and whitespace (x).
  • Return Value:
    • the first position that the regular expression string_pattern appears in string (starting at 0), or -1 if it does not appear.
    • missing if any argument is a missing value,
    • null if any argument is a null value but no argument is a missing value,
    • any other non-string input value will return a null.
  • Example:

    {
      "v1": regexp_position("pphonepp", "p*hone"),
      "v2": regexp_position("hone", "p+hone"),
      "v3": regexp_position1("pphonepp", "p*hone"),
      "v4": regexp_position1("hone", "p+hone")
    };
    
  • The expected result is:

    { "v1": 0, "v2": -1, "v3": 1, "v4": -1 }
    

regexp_replace

  • Syntax:

    regexp_replace(string, string_pattern, string_replacement[, string_flags])
    regexp_replace(string, string_pattern, string_replacement[, replacement_limit])
    
  • Checks whether the string string matches the given regular expression pattern string_pattern (a Java regular expression pattern), and replaces the matched pattern string_pattern with the new pattern string_replacement.

  • Aliases:

    • regexp_replace, regex_replace.
  • Arguments:

    • string : a string that might contain the pattern.
    • string_pattern : a pattern string to be matched.
    • string_replacement : a pattern string to be used as the replacement.
    • string_flag : (Optional) a string with flags to be used during replace.
      • The following modes are enabled with these flags: dotall (s), multiline (m), case_insensitive (i), and comments and whitespace (x).
    • replacement_limit: (Optional) an integer specifying the maximum number of replacements to make (if negative then all occurrences will be replaced)
  • Return Value:
    • Returns a string that is obtained after the replacements.
    • missing if any argument is a missing value.
    • null if any argument is a null value but no argument is a missing value.
    • any other non-string input value will return a null.
  • Example:

    regexp_replace(" like x-phone the voicemail_service is awesome", " like x-phone", "like product-a");
    
  • The expected result is:

    "like product-a the voicemail_service is awesome"
    

repeat

  • Syntax:

    repeat(string, n)
    
  • Returns a string formed by repeating the input string n times.

  • Arguments:
    • string : a string to be repeated,
    • n : an tinyint/smallint/integer/bigint value - how many times the string should be repeated.
  • Return Value:
    • a string that repeats the input string n times,
    • missing if any argument is a missing value,
    • null if any argument is a null value but no argument is a missing value,
    • a type error will be raised if:
      • the first argument is any other non-string value,
      • or, the second argument is not a tinyint, smallint, integer, or bigint.
  • Example:

    repeat("test", 3);
    
  • The expected result is:

    "testtesttest"
    

replace

  • Syntax:

    replace(string, search_string, replacement_string[, limit])
    
  • Finds occurrences of the given substring search_string in the input string string and replaces them with the new substring replacement_string.

  • Arguments:
    • string : an input string,
    • search_string : a string substring to be searched for,
    • replacement_string : a string to be used as the replacement,
    • limit : (Optional) an integer - maximum number of occurrences to be replaced. If not specified or negative then all occurrences will be replaced
  • Return Value:
    • Returns a string that is obtained after the replacements,
    • missing if any argument is a missing value,
    • any other non-string input value or non-integer limit will cause a type error,
    • null if any argument is a null value but no argument is a missing value.
  • Example:

    {
      "v1": replace(" like x-phone the voicemail_service is awesome", " like x-phone", "like product-a"),
      "v2": replace("x-phone and x-phone", "x-phone", "product-a", 1)
    };
    
  • The expected result is:

    {
      "v1": "like product-a the voicemail_service is awesome",
      "v2": "product-a and x-phone"
    }
    

reverse

  • Syntax:

    reverse(string)
    
  • Returns a string formed by reversing characters in the input string. For characters of multiple code points, code point is the minimal unit to reverse. See the following examples for more details.

  • Arguments:
    • string : a string to be reversed
  • Return Value:
    • a string containing characters from the the input string in the reverse order,
    • missing if any argument is a missing value,
    • null if any argument is a null value but no argument is a missing value,
    • a type error will be raised if:
      • the first argument is any other non-string value
  • Example:

    reverse("hello");
    
  • The expected result is:

    "olleh"
    
  • Example of multi-code-point character (Korean):

    reverse("한글");
    
  • The expected result is (the Korean characters are splitted into code points and then the code points are reversed):

    "ᆯᅳᄀᆫᅡᄒ"
    

rtrim

  • Syntax:

    rtrim(string[, chars]);
    
  • Returns a new string with all trailing characters that appear in chars removed. By default, white space is the character to trim. Note that here one character means one code point. For example, the emoji 4-people-family notation “👩‍👩‍👧‍👦” contains 7 code points, and it is possible to trim a few code points (such as a 2-people-family “👨‍👦”) from it. See the following example for more details.

  • Arguments:
    • string : a string to be trimmed,
    • chars : a string that contains characters that are used to trim.
  • Return Value:
    • a trimmed, new string,
    • missing if any argument is a missing value,
    • null if any argument is a null value but no argument is a missing value,
    • any other non-string input value will cause a type error.
  • Related functions: see trim(), ltrim()
  • Example:

    {
      "v1": rtrim("i like x-phone", "x-phone"),
      "v2": rtrim("i like x-phone", "onexph")
    };
    
  • The expected result is:

    { "v1": "i like ", "v2": "i like x-" }
    
  • Example with multi-codepoint notation (trim the man and boy from the family of man, woman, girl and boy):

    rtrim("👨‍👩‍👧‍👦", "👨‍👦")
    
  • The expected result is (only man, woman and girl are left in the family):

    "👨‍👩‍👧"
    

split

  • Syntax:

    split(string, sep)
    
  • Splits the input string into an array of substrings separated by the string sep.

  • Arguments:
    • string : a string to be split.
  • Return Value:
    • an array of substrings by splitting the input string by sep,
    • in case of two consecutive seps in the string, the result of splitting the two consecutive seps will be the empty string "",
    • missing if the argument is a missing value,
    • null if the argument is a null value,
    • any other non-string input value will cause a type error.
  • Example:

    split("test driven development", " ");
    
  • The expected result is:

    [ "test", "driven", "development" ]
    
  • Example with two consecutive seps in the string:

    split("123//456", "/");
    
  • The expected result is:

    [ "123", "", "456" ]
    

starts_with

  • Syntax:

    starts_with(string, substring_to_start_with)
    
  • Checks whether the string string starts with the string substring_to_start_with.

  • Arguments:
    • string : a string that might start with the given string.
    • substring_to_start_with : a string that might be contained as the starting substring.
  • Return Value:
    • a boolean, returns true if string starts with the string substring_to_start_with,
    • missing if any argument is a missing value,
    • null if any argument is a null value but no argument is a missing value,
    • any other non-string input value will cause a type error,
    • false otherwise.
  • Example:

    {
      "v1" : starts_with(" like the plan, amazing", " like"),
      "v2" : starts_with("I like the plan, amazing", " like")
    };
    
  • The expected result is:

    { "v1": true, "v2": false }
    

substr

  • Syntax:

    substr(string, offset[, length])
    
  • Returns the substring from the given string string based on the given start offset offset with the optional length. Note that both of the offset and length are in the unit of code point (e.g. the emoji family 👨‍👩‍👧‍👦 has 7 code points). The function uses the 0-based position. Another version of the function uses the 1-based position. Below are the aliases for each version:

  • Aliases:

    • 0-Based: substring, substr, substring0, substr0.
    • 1-Based: substring1, substr1.
  • Arguments:

    • string : a string to be extracted.
    • offset : an tinyint/smallint/integer/bigint value as the starting offset of the substring in string (starting at 0). If negative then counted from the end of the string.
    • length : (Optional) an an tinyint/smallint/integer/bigint value as the length of the substring.
  • Return Value:
    • a string that represents the substring,
    • missing if any argument is a missing value,
    • null if any argument is a null value but no argument is a missing value, or if the substring could not be obtained because the starting offset is not within string bounds or length is negative.
    • a null will be returned if:
      • the first argument is any other non-string value.
      • the second argument is not a tinyint, smallint, integer, or bigint.
      • the third argument is not a tinyint, smallint, integer, or bigint if the argument is present.
  • Example:

    { "v1": substr("test string", 6, 3), "v2": substr1("test string", 6, 3) };
    
  • The expected result is:

    { "v1": "tri", "v2": "str" }
    

The function has an alias substring.

trim

  • Syntax:

    trim(string[, chars]);
    
  • Returns a new string with all leading and trailing characters that appear in chars removed. By default, white space is the character to trim. Note that here one character means one code point. For example, the emoji 4-people-family notation “👩‍👩‍👧‍👦” contains 7 code points, and it is possible to trim a few code points (such as a 2-people-family “👨‍👦”) from it. See the following example for more details.

  • Arguments:
    • string : a string to be trimmed,
    • chars : a string that contains characters that are used to trim.
  • Return Value:
    • a trimmed, new string,
    • missing if any argument is a missing value,
    • null if any argument is a null value but no argument is a missing value,
    • any other non-string input value will cause a type error.
  • Related functions: see ltrim(), rtrim()
  • Example:

    trim("i like x-phone", "xphoen");
    
  • The expected result is:

    " like "
    
  • Example with multi-codepoint notation (trim the man and boy from the family of man, woman, girl and boy):

    trim(“👨‍👩‍👧‍👦”, “👨‍👦”)

  • The expected result is (only woman and girl are left in the family):

     "👩‍👧"
    

upper

  • Syntax:

    upper(string)
    
  • Converts a given string string to its uppercase form.

  • Arguments:
    • string : a string to be converted.
  • Return Value:
    • a string as the uppercase form of the given string,
    • missing if the argument is a missing value,
    • null if the argument is a null value,
    • any other non-string input value will cause a type error.
  • Example:

    upper("hello")
    
  • The expected result is:

    "HELLO"
    

string_concat

  • Syntax:

    string_concat(array)
    
  • Concatenates an array of strings array into a single string.

  • Arguments:
    • array : an array or multiset of strings (could be null or missing) to be concatenated.
  • Return Value:
    • the concatenated string value,
    • missing if the argument is a missing value,
    • null if any argument is a null value but no argument is a missing value,
    • missing if any element in the input array is missing,
    • null if any element in the input array is null but no element in the input array is missing,
    • any other non-array input value or non-integer element in the input array will cause a type error.
  • Example:

    string_concat(["ASTERIX", " ", "ROCKS!"]);
    
  • The expected result is:

    "ASTERIX ROCKS!"
    

string_join

  • Syntax:

    string_join(array, string)
    
  • Joins an array or multiset of strings array with the given separator string into a single string.

  • Arguments:
    • array : an array or multiset of strings (could be null) to be joined.
    • string : a string to serve as the separator.
  • Return Value:
    • the joined string,
    • missing if any argument is a missing value,
    • null if any argument is a null value but no argument is a missing value,
    • missing if the first argument array contains a missing,
    • null if the first argument array contains a null but does not contain a missing,
    • a type error will be raised if:
      • the first argument is any other non-array value, or contains any other non-string value,
      • or, the second argument is any other non-string value.
  • Example:

    string_join(["ASTERIX", "ROCKS~"], "!! ");
    
  • The expected result is:

    "ASTERIX!! ROCKS~"
    

string_to_codepoint

  • Syntax:

    string_to_codepoint(string)
    
  • Converts the string string to its code_based representation.

  • Arguments:
    • string : a string that will be converted.
  • Return Value:
    • an array of the code points for the string string,
    • missing if the argument is a missing value,
    • null if the argument is a null value,
    • any other non-string input value will cause a type error.
  • Example:

    string_to_codepoint("Hello ASTERIX!");
    
  • The expected result is:

    [ 72, 101, 108, 108, 111, 32, 65, 83, 84, 69, 82, 73, 88, 33 ]
    

codepoint_to_string

  • Syntax:

    codepoint_to_string(array)
    
  • Converts the ordered code_based representation array to the corresponding string.

  • Arguments:
    • array : an array of integer code_points.
  • Return Value:
    • a string representation of array.
    • missing if the argument is a missing value,
    • null if the argument is a null value,
    • missing if any element in the input array is missing,
    • null if any element in the input array is null but no element in the input array is missing,
    • any other non-array input value or non-integer element in the input array will cause a type error.
  • Example:

    codepoint_to_string([72, 101, 108, 108, 111, 32, 65, 83, 84, 69, 82, 73, 88, 33]);
    
  • The expected result is:

    "Hello ASTERIX!"
    

substring_before

  • Syntax:

    substring_before(string, string_pattern)
    
  • Returns the substring from the given string string before the given pattern string_pattern.

  • Arguments:
    • string : a string to be extracted.
    • string_pattern : a string pattern to be searched.
  • Return Value:
    • a string that represents the substring,
    • missing if any argument is a missing value,
    • null if any argument is a null value but no argument is a missing value,
    • any other non-string input value will cause a type error.
  • Example:

    substring_before(" like x-phone", "x-phone");
    
  • The expected result is:

    " like "
    

substring_after

  • Syntax:

    substring_after(string, string_pattern);

  • Returns the substring from the given string string after the given pattern string_pattern.

  • Arguments:
    • string : a string to be extracted.
    • string_pattern : a string pattern to be searched.
  • Return Value:
    • a string that represents the substring,
    • missing if any argument is a missing value,
    • null if any argument is a null value but no argument is a missing value,
    • any other non-string input value will cause a type error.
  • Example:

    substring_after(" like x-phone", "xph");
    
  • The expected result is:

    "one"
    

Binary Functions

parse_binary

  • Syntax:

    parse_binary(string, encoding)

  • Creates a binary from an string encoded in encoding format.

  • Arguments:
    • string : an encoded string,
    • encoding : a string notation specifies the encoding type of the given string. Currently we support hex and base64 format.
  • Return Value:
    • a binary that is decoded from the given string,
    • missing if any argument is a missing value,
    • null if any argument is a null value but no argument is a missing value,
    • any other non-string input value will cause a type error.
  • Example:

    [ parse_binary(“ABCDEF0123456789”,“hex”), parse_binary(“abcdef0123456789”,“HEX”), parse_binary(‘QXN0ZXJpeAE=’,“base64”) ];

  • The expected result is:

    [ hex(“ABCDEF0123456789”), hex(“ABCDEF0123456789”), hex(“4173746572697801”) ]

print_binary

  • Syntax:

    print_binary(binary, encoding)

  • Prints a binary to the required encoding string format.

  • Arguments:
    • binary : a binary data need to be printed.
    • encoding : a string notation specifies the expected encoding type. Currently we support hex and base64 format.
  • Return Value:
    • a string that represents the encoded format of a binary,
    • missing if any argument is a missing value,
    • null if any argument is a null value but no argument is a missing value,
    • any other non-string input value will cause a type error.
  • Example:

    [ print_binary(hex("ABCDEF0123456789"), "base64"), print_binary(base64("q83vASNFZ4k="), "hex") ]
    
  • The expected result are:

    [ "q83vASNFZ4k=", "ABCDEF0123456789" ]
    

binary_length

  • Syntax:

    binary_length(binary)

  • Returns the number of bytes storing the binary data.

  • Arguments:
    • binary : a binary value to be checked.
  • Return Value:
    • an bigint that represents the number of bytes,
    • missing if the argument is a missing value,
    • null if the argument is a null value,
    • any other non-binary input value will cause a type error.
  • Example:

    binary_length(hex("00AA"))
    
  • The expected result is:

    2

sub_binary

  • Syntax:

    sub_binary(binary, offset[, length])

  • Returns the sub binary from the given binary based on the given start offset with the optional length.

  • Arguments:
    • binary : a binary to be extracted,
    • offset : a tinyint, smallint, integer, or bigint value as the starting offset of the sub binary in binary (starting at 0),
    • length : (Optional) a tinyint, smallint, integer, or bigint value as the length of the sub binary.
  • Return Value:
    • a binary that represents the sub binary,
    • missing if any argument is a missing value,
    • null if any argument is a null value but no argument is a missing value,
    • a type error will be raised if:
      • the first argument is any other non-binary value,
      • or, the second argument is any other non-integer value,
      • or, the third argument is any other non-integer value, if it is present.
  • Example:

    sub_binary(hex("AABBCCDD"), 4);
    
  • The expected result is

    hex("DD")
    

binary_concat

  • Syntax:

    binary_concat(array)

  • Concatenates a binary array or multiset into a single binary.

  • Arguments:
    • array : an array or multiset of binaries (could be null or missing) to be concatenated.
  • Return Value :
    • the concatenated binary value,
    • missing if the argument is a missing value,
    • null if the argument is a null value,
    • missing if any element in the input array is missing,
    • null if any element in the input array is null but no element in the input array is missing,
    • any other non-array input value or non-binary element in the input array will cause a type error.
  • Example:

    binary_concat([hex(“42”), hex(""), hex(‘42’)]);

  • The expected result is

    hex(“4242”)

Spatial Functions

create_point

  • Syntax:

    create_point(x, y)
    
  • Creates the primitive type point using an x and y value.

  • Arguments:
  • x : a double that represents the x-coordinate,
  • y : a double that represents the y-coordinate.
  • Return Value:
  • a point representing the ordered pair (x, y),
  • missing if any argument is a missing value,
  • null if any argument is a null value but no argument is a missing value,
  • any other non-double input value will cause a type error.
  • Example:

    { "point": create_point(30.0,70.0) };
    
  • The expected result is:

    { "point": point("30.0,70.0") }
    

create_line

  • Syntax:

    create_line(point1, point2)
    
  • Creates the primitive type line using point1 and point2.

  • Arguments:
    • point1 : a point that represents the start point of the line.
    • point2 : a point that represents the end point of the line.
  • Return Value:
    • a spatial line created using the points provided in point1 and point2,
    • missing if any argument is a missing value,
    • null if any argument is a null value but no argument is a missing value,
    • any other non-point input value will cause a type error.
  • Example:

    { "line": create_line(create_point(30.0,70.0), create_point(50.0,90.0)) };
    
  • The expected result is:

    { "line": line("30.0,70.0 50.0,90.0") }
    

create_rectangle

  • Syntax:

    create_rectangle(point1, point2)
    
  • Creates the primitive type rectangle using point1 and point2.

  • Arguments:
    • point1 : a point that represents the lower_left point of the rectangle.
    • point2 : a point that represents the upper_right point of the rectangle.
  • Return Value:
    • a spatial rectangle created using the points provided in point1 and point2,
    • missing if any argument is a missing value,
    • null if any argument is a null value but no argument is a missing value,
    • any other non-point input value will cause a type error.
  • Example:

    { "rectangle": create_rectangle(create_point(30.0,70.0), create_point(50.0,90.0)) };
    
  • The expected result is:

    { "rectangle": rectangle("30.0,70.0 50.0,90.0") }
    

create_circle

  • Syntax:

    create_circle(point, radius)
    
  • Creates the primitive type circle using point and radius.

  • Arguments:
    • point : a point that represents the center of the circle.
    • radius : a double that represents the radius of the circle.
  • Return Value:
    • a spatial circle created using the center point and the radius provided in point and radius.
    • missing if any argument is a missing value,
    • null if any argument is a null value but no argument is a missing value,
    • a type error will be raised if:
      • the first argument is any other non-point value,
      • or, the second argument is any other non-double value.
  • Example:

    { "circle": create_circle(create_point(30.0,70.0), 5.0) }
    
  • The expected result is:

    { "circle": circle("30.0,70.0 5.0") }
    

create_polygon

  • Syntax:

    create_polygon(array)
    
  • Creates the primitive type polygon using the double values provided in the argument array. Each two consecutive double values represent a point starting from the first double value in the array. Note that at least six double values should be specified, meaning a total of three points.

  • Arguments:
    • array : an array of doubles representing the points of the polygon.
  • Return Value:
    • a polygon, represents a spatial simple polygon created using the points provided in array.
    • missing if the argument is a missing value,
    • null if the argument is a null value,
    • missing if any element in the input array is missing,
    • null if any element in the input array is null but no element in the input array is missing,
    • any other non-array input value or non-double element in the input array will cause a type error.
  • Example:

    { "polygon": create_polygon([1.0,1.0,2.0,2.0,3.0,3.0,4.0,4.0]) };
    
  • The expected result is:

    { "polygon": polygon("1.0,1.0 2.0,2.0 3.0,3.0 4.0,4.0") }
    

get_x/get_y

  • Syntax:

    get_x(point) or get_y(point)
    
  • Returns the x or y coordinates of a point point.

  • Arguments:
    • point : a point.
  • Return Value:
    • a double representing the x or y coordinates of the point point,
    • missing if the argument is a missing value,
    • null if the argument is a null value,
    • any other non-point input value will cause a type error.
  • Example:

    { "x_coordinate": get_x(create_point(2.3,5.0)), "y_coordinate": get_y(create_point(2.3,5.0)) };
    
  • The expected result is:

    { "x_coordinate": 2.3, "y_coordinate": 5.0 }
    

get_points

  • Syntax:

    get_points(spatial_object)
    
  • Returns an ordered array of the points forming the spatial object spatial_object.

  • Arguments:
    • spatial_object : a point, line, rectangle, circle, or polygon.
  • Return Value:
    • an array of the points forming the spatial object spatial_object,
    • missing if the argument is a missing value,
    • null if the argument is a null value,
    • any other non-spatial-object input value will cause a type error.
  • Example:

    get_points(create_polygon([1.0,1.0,2.0,2.0,3.0,3.0,4.0,4.0]))
    
  • The expected result is:

    [ point("1.0,1.0"), point("2.0,2.0"), point("3.0,3.0"), point("4.0,4.0") ]
    

get_center/get_radius

  • Syntax:

    get_center(circle_expression) or get_radius(circle_expression)
    
  • Returns the center and the radius of a circle circle_expression, respectively.

  • Arguments:
    • circle_expression : a circle.
  • Return Value:
    • a point or double, represent the center or radius of the circle circle_expression.
    • missing if the argument is a missing value,
    • null if the argument is a null value,
    • any other non-circle input value will cause a type error.
  • Example:

    {
      "circle_radius": get_radius(create_circle(create_point(6.0,3.0), 1.0)),
      "circle_center": get_center(create_circle(create_point(6.0,3.0), 1.0))
    };
    
  • The expected result is:

    { "circle_radius": 1.0, "circle_center": point("6.0,3.0") }
    

spatial_distance

  • Syntax:

    spatial_distance(point1, point2)
    
  • Returns the Euclidean distance between point1 and point2.

  • Arguments:
    • point1 : a point.
    • point2 : a point.
  • Return Value:
    • a double as the Euclidean distance between point1 and point2.
    • missing if any argument is a missing value,
    • null if any argument is a null value but no argument is a missing value,
    • any other non-point input value will cause a type error.
  • Example:

    spatial_distance(point("47.44,80.65"), create_point(30.0,70.0));
    
  • The expected result is:

    20.434678857275934
    

spatial_area

  • Syntax:

    spatial_area(spatial_2d_expression)
    
  • Returns the spatial area of spatial_2d_expression.

  • Arguments:
    • spatial_2d_expression : a rectangle, circle, or polygon.
  • Return Value:
    • a double representing the area of spatial_2d_expression.
    • missing if the argument is a missing value,
    • null if the argument is a null value,
    • any other non-2d-spatial-object will cause a type error.
  • Example:

    spatial_area(create_circle(create_point(0.0,0.0), 5.0));
    
  • The expected result is:

    78.53981625
    

spatial_intersect

  • Syntax:

    spatial_intersect(spatial_object1, spatial_object2)
    
  • Checks whether @arg1 and @arg2 spatially intersect each other.

  • Arguments:
    • spatial_object1 : a point, line, rectangle, circle, or polygon.
    • spatial_object2 : a point, line, rectangle, circle, or polygon.
  • Return Value:
    • a boolean representing whether spatial_object1 and spatial_object2 spatially overlap with each other,
    • missing if any argument is a missing value,
    • null if any argument is a null value but no argument is a missing value,
    • any other non-spatial-object input value will cause a type error.
  • Example:

    spatial_intersect(point("39.28,70.48"), create_rectangle(create_point(30.0,70.0), create_point(40.0,80.0)));
    
  • The expected result is:

    true
    

spatial_cell

  • Syntax:

    spatial_cell(point1, point2, x_increment, y_increment)
    
  • Returns the grid cell that point1 belongs to.

  • Arguments:
    • point1 : a point representing the point of interest that its grid cell will be returned.
    • point2 : a point representing the origin of the grid.
    • x_increment : a double, represents X increments.
    • y_increment : a double, represents Y increments.
  • Return Value:
    • a rectangle representing the grid cell that point1 belongs to,
    • missing if any argument is a missing value,
    • null if any argument is a null value but no argument is a missing value,
    • a type error will be raised if:
      • the first or second argument is any other non-point value,
      • or, the second or third argument is any other non-double value.
  • Example:

    spatial_cell(point("39.28,70.48"), create_point(20.0,50.0), 5.5, 6.0);
    
  • The expected result is:

    rectangle("36.5,68.0 42.0,74.0");
    

Similarity Functions

AsterixDB supports queries with different similarity functions, including edit distance and Jaccard.

edit_distance

  • Syntax:

    edit_distance(expression1, expression2)
    
  • Returns the edit distance of expression1 and expression2.

  • Arguments:
    • expression1 : a string or a homogeneous array of a comparable item type.
    • expression2 : The same type as expression1.
  • Return Value:
    • an bigint that represents the edit distance between expression1 and expression2,
    • missing if any argument is a missing value,
    • null if any argument is a null value but no argument is a missing value,
    • any other non-string input value will cause a type error.
  • Note: an n_gram index can be utilized for this function.
  • Example:
    edit_distance("SuzannaTillson", "Suzanna Tilson");
    
  • The expected result is:

    2
    

edit_distance_check

  • Syntax:

    edit_distance_check(expression1, expression2, threshold)
    
  • Checks whether the edit distance of expression1 and expression2 is within a given threshold.

  • Arguments:

    • expression1 : a string or a homogeneous array of a comparable item type.
    • expression2 : The same type as expression1.
    • threshold : a bigint that represents the distance threshold.
  • Return Value:
    • an array with two items:
      • The first item contains a boolean value representing whether the edit distance of expression1 and expression2 is within the given threshold.
      • The second item contains an integer that represents the edit distance of expression1 and expression2 if the first item is true.
      • If the first item is false, then the second item is set to 2147483647.
    • missing if any argument is a missing value,
    • null if any argument is a null value but no argument is a missing value,
    • a type error will be raised if:
      • the first or second argument is any other non-string value,
      • or, the third argument is any other non-bigint value.
  • Note: an n_gram index can be utilized for this function.
  • Example:
    edit_distance_check("happy","hapr",2);
    
  • The expected result is:

    [ true, 2 ]
    

edit_distance_contains

  • Syntax:

    edit_distance_contains(expression1, expression2, threshold)
    
  • Checks whether expression1 contains expression2 with an edit distance within a given threshold.

  • Arguments:

    • expression1 : a string or a homogeneous array of a comparable item type.
    • expression2 : The same type as expression1.
    • threshold : a bigint that represents the distance threshold.
  • Return Value:
    • an array with two items:
      • The first item contains a boolean value representing whether expression1 can contain expression2.
      • The second item contains an integer that represents the required edit distance for expression1 to contain expression2 if the first item is true.
    • missing if any argument is a missing value,
    • null if any argument is a null value but no argument is a missing value,
    • a type error will be raised if:
      • the first or second argument is any other non-string value,
      • or, the third argument is any other non-bigint value.
  • Note: an n_gram index can be utilized for this function.
  • Example:
    edit_distance_contains("happy","hapr",2);
    
  • The expected result is:

    [ true, 1 ]
    

similarity_jaccard

  • Syntax:

    similarity_jaccard(array1, array2)
    
  • Returns the Jaccard similarity of array1 and array2.

  • Arguments:
    • array1 : an array or multiset.
    • array2 : an array or multiset.
  • Return Value:
    • a float that represents the Jaccard similarity of array1 and array2,
    • missing if any argument is a missing value,
    • null if any argument is a null value but no argument is a missing value,
    • missing if any element in any input array is missing,
    • null if any element in any input array is null but no element in the input array is missing,
    • any other non-array input value or non-integer element in any input array will cause a type error.
  • Note: a keyword index can be utilized for this function.

  • Example:
    similarity_jaccard([1,5,8,9], [1,5,9,10]);
    
  • The expected result is:

    0.6
    

similarity_jaccard_check

  • Syntax:

    similarity_jaccard_check(array1, array2, threshold)
    
  • Checks whether array1 and array2 have a Jaccard similarity greater than or equal to threshold. Again, the “check” version of Jaccard is faster than the “non_check” version.

  • Arguments:

    • array1 : an array or multiset.
    • array2 : an array or multiset.
    • threshold : a double that represents the similarity threshold.
  • Return Value:
    • an array with two items:
      • The first item contains a boolean value representing whether array1 and array2 are similar.
      • The second item contains a float that represents the Jaccard similarity of array1 and array2 if it is greater than or equal to the threshold, or 0 otherwise.
    • missing if any argument is a missing value,
    • null if any argument is a null value but no argument is a missing value,
    • missing if any element in any input array is missing,
    • null if any element in any input array is null but no element in the input array is missing,
    • a type error will be raised if:
      • the first or second argument is any other non-array value,
        • or, the third argument is any other non-double value.
  • Note: a keyword index can be utilized for this function.

  • Example:
    similarity_jaccard_check([1,5,8,9], [1,5,9,10], 0.6);
    
  • The expected result is:

    [ false, 0.0 ]
    

Tokenizing Functions

word_tokens

  • Syntax:

    word_tokens(string)
    
  • Returns an array of word tokens of string using non_alphanumeric characters as delimiters.

  • Arguments:
    • string : a string that will be tokenized.
  • Return Value:
    • an array of string word tokens,
    • missing if the argument is a missing value,
    • null if the argument is a null value,
    • any other non-string input value will cause a type error.
  • Example:

    word_tokens("I like the phone, awesome!");
    
  • The expected result is:

    [ "i", "like", "the", "phone", "awesome" ]
    

Temporal Functions

get_year/get_month/get_day/get_hour/get_minute/get_second/get_millisecond

  • Syntax:

    get_year/get_month/get_day/get_hour/get_minute/get_second/get_millisecond(temporal_value)
    
  • Accessors for accessing fields in a temporal value

  • Arguments:
    • temporal_value : a temporal value represented as one of the following types: date, datetime, time, and duration.
  • Return Value:
    • an bigint value representing the field to be extracted,
    • missing if the argument is a missing value,
    • null if any argument is a null value but no argument is a missing value,
    • any other non-interval input value will cause a type error.
  • Example:

    {
      "year": get_year(date("2010-10-30")),
      "month": get_month(datetime("1987-11-19T23:49:23.938")),
      "day": get_day(date("2010-10-30")),
      "hour": get_hour(time("12:23:34.930+07:00")),
      "min": get_minute(duration("P3Y73M632DT49H743M3948.94S")),
      "second": get_second(datetime("1987-11-19T23:49:23.938")),
      "ms": get_millisecond(duration("P3Y73M632DT49H743M3948.94S"))
    };
    
  • The expected result is:

    { "year": 2010, "month": 11, "day": 30, "hour": 5, "min": 28, "second": 23, "ms": 94 }
    

adjust_datetime_for_timezone

  • Syntax:

    adjust_datetime_for_timezone(datetime, string)
    
  • Adjusts the given datetime datetime by applying the timezone information string.

  • Arguments:
    • datetime : a datetime value to be adjusted.
    • string : a string representing the timezone information.
  • Return Value:
    • a string value representing the new datetime after being adjusted by the timezone information,
    • missing if any argument is a missing value,
    • null if any argument is a null value but no argument is a missing value,
    • a type error will be raised if:
      • the first argument is any other non-datetime value,
      • or, the second argument is any other non-string value.
  • Example:

    adjust_datetime_for_timezone(datetime("2008-04-26T10:10:00"), "+08:00");
    
  • The expected result is:

    "2008-04-26T18:10:00.000+08:00"
    

adjust_time_for_timezone

  • Syntax:

    adjust_time_for_timezone(time, string)
    
  • Adjusts the given time time by applying the timezone information string.

  • Arguments:
    • time : a time value to be adjusted.
    • string : a string representing the timezone information.
  • Return Value:
    • a string value representing the new time after being adjusted by the timezone information,
    • missing if any argument is a missing value,
    • null if any argument is a null value but no argument is a missing value,
    • a type error will be raised if:
      • the first argument is any other non-time value,
      • or, the second argument is any other non-string value.
  • Example:

    adjust_time_for_timezone(get_time_from_datetime(datetime("2008-04-26T10:10:00")), "+08:00");
    
  • The expected result is:

    "18:10:00.000+08:00"
    

calendar_duration_from_datetime

  • Syntax:

    calendar_duration_from_datetime(datetime, duration_value)
    
  • Gets a user_friendly representation of the duration duration_value based on the given datetime datetime.

  • Arguments:
    • datetime : a datetime value to be used as the reference time point.
    • duration_value : a duration value to be converted.
  • Return Value:
    • a duration value with the duration as duration_value but with a user_friendly representation,
    • missing if any argument is a missing value,
    • null if any argument is a null value but no argument is a missing value,
    • a type error will be raised if:
      • the first argument is any other non-datetime value,
      • or, the second argument is any other non-duration input value.
  • Example:

    calendar_duration_from_datetime(
          datetime("2016-03-26T10:10:00"),
          datetime("2016-03-26T10:10:00") - datetime("2011-01-01T00:00:00")
    );
    
  • The expected result is:

    duration("P5Y2M24DT10H10M")
    

get_year_month_duration/get_day_time_duration

  • Syntax:

    get_year_month_duration/get_day_time_duration(duration_value)
    
  • Extracts the correct duration subtype from duration_value.

  • Arguments:
    • duration_value : a duration value to be converted.
  • Return Value:
    • a year_month_duration value or a day_time_duration value,
    • missing if the argument is a missing value,
    • null if the argument is a null value,
    • any other non-duration input value will cause a type error.
  • Example:

    get_year_month_duration(duration("P12M50DT10H"));
    
  • The expected result is:

    year_month_duration("P1Y")
    

months_from_year_month_duration/ms_from_day_time_duration

  • Syntax:

    months_from_year_month_duration/ms_from_day_time_duration(duration_value)
    
  • Extracts the number of months or the number of milliseconds from the duration subtype.

  • Arguments:
    • duration_value : a duration of the correct subtype.
  • Return Value:
    • a bigint representing the number of months/milliseconds,
    • missing if the argument is a missing value,
    • null if the argument is a null value,
    • any other non-duration input value will cause a type error.
  • Example:

    {
        "months": months_from_year_month_duration(get_year_month_duration(duration("P5Y7MT50M"))),
        "milliseconds": ms_from_day_time_duration(get_day_time_duration(duration("P5Y7MT50M")))
    };
    
  • The expected result is:

    {"months": 67, "milliseconds": 3000000}
    

duration_from_months/duration_from_ms

  • Syntax:

    duration_from_months/duration_from_ms(number_value)
    
  • Creates a duration from number_value.

  • Arguments:
    • number_value : a bigint representing the number of months/milliseconds
  • Return Value:
    • a duration containing number_value value for months/milliseconds,
    • missing if the argument is a missing value,
    • null if the argument is a null value,
    • any other non-duration input value will cause a type error.
  • Example:

    duration_from_months(8);
    
  • The expected result is:

    duration("P8M")
    

duration_from_interval

  • Syntax:

    duration_from_interval(interval_value)
    
  • Creates a duration from interval_value.

  • Arguments:
    • interval_value : an interval value
  • Return Value:
    • a duration representing the time in the interval_value
    • missing if the argument is a missing value,
    • null if the argument is a null value,
    • any other non-duration input value will cause a type error.
  • Example:

    {
      "dr1" : duration_from_interval(interval(date("2010-10-30"), date("2010-12-21"))),
      "dr2" : duration_from_interval(interval(datetime("2012-06-26T01:01:01.111"), datetime("2012-07-27T02:02:02.222"))),
      "dr3" : duration_from_interval(interval(time("12:32:38"), time("20:29:20"))),
      "dr4" : duration_from_interval(null)
    };
    
  • The expected result is:

    {
      "dr1": day_time_duration("P52D"),
      "dr2": day_time_duration("P31DT1H1M1.111S"),
      "dr3": day_time_duration("PT7H56M42S"),
      "dr4": null
    }
    

current_date

  • Syntax:

    current_date()
    
  • Gets the current date.

  • Arguments: None
  • Return Value:
    • a date value of the date when the function is called.

current_time

  • Syntax:

    current_time()
    
  • Get the current time

  • Arguments: None
  • Return Value:
    • a time value of the time when the function is called.

current_datetime

  • Syntax:

    current_datetime()
    
  • Get the current datetime

  • Arguments: None
  • Return Value:
    • a datetime value of the datetime when the function is called.

get_date_from_datetime

  • Syntax:

    get_date_from_datetime(datetime)
    
  • Gets the date value from the given datetime value datetime.

  • Arguments:
    • datetime: a datetime value to be extracted from.
  • Return Value:
    • a date value from the datetime,
    • any other non-datetime input value will cause a type error.

get_time_from_datetime

  • Syntax:

    get_time_from_datetime(datetime)
    
  • Get the time value from the given datetime value datetime

  • Arguments:
    • datetime: a datetime value to be extracted from.
  • Return Value:
    • a time value from the datetime.
    • missing if the argument is a missing value,
    • null if the argument is a null value,
    • any other non-datetime input value will cause a type error.
  • Example:

    get_time_from_datetime(datetime("2016-03-26T10:10:00"));
    
  • The expected result is:

    time("10:10:00.000Z")
    

day_of_week

  • Syntax:

    day_of_week(date)
    
  • Finds the day of the week for a given date (1_7)

  • Arguments:
    • date: a date value (Can also be a datetime)
  • Return Value:
    • an tinyint representing the day of the week (1_7),
    • missing if the argument is a missing value,
    • null if the argument is a null value,
    • any other non-date input value will cause a type error.
  • Example:

    day_of_week(datetime("2012-12-30T12:12:12.039Z"));
    
  • The expected result is:

    7
    

date_from_unix_time_in_days

  • Syntax:

    date_from_unix_time_in_days(numeric_value)
    
  • Gets a date representing the time after numeric_value days since 1970_01_01.

  • Arguments:
    • numeric_value: a tinyint/smallint/integer/bigint value representing the number of days.
  • Return Value:
    • a date value as the time after numeric_value days since 1970-01-01,
    • missing if the argument is a missing value,
    • null if the argument is a null value,
    • any other non-numeric input value will cause a type error.

datetime_from_unix_time_in_ms

  • Syntax:

    datetime_from_unix_time_in_ms(numeric_value)
    
  • Gets a datetime representing the time after numeric_value milliseconds since 1970_01_01T00:00:00Z.

  • Arguments:
    • numeric_value: a tinyint/smallint/integer/bigint value representing the number of milliseconds.
  • Return Value:
    • a datetime value as the time after numeric_value milliseconds since 1970-01-01T00:00:00Z,
    • missing if the argument is a missing value,
    • null if the argument is a null value,
    • any other non-numeric input value will cause a type error.

datetime_from_unix_time_in_secs

  • Syntax:

    datetime_from_unix_time_in_secs(numeric_value)
    
  • Gets a datetime representing the time after numeric_value seconds since 1970_01_01T00:00:00Z.

  • Arguments:
    • numeric_value: a tinyint/smallint/integer/bigint value representing the number of seconds.
  • Return Value:
    • a datetime value as the time after numeric_value seconds since 1970_01_01T00:00:00Z,
    • missing if the argument is a missing value,
    • null if the argument is a null value,
    • any other non-numeric input value will cause a type error.

datetime_from_date_time

  • Syntax:

datetime_from_date_time(date,time)

  • Gets a datetime representing the combination of date and time
    • Arguments:
    • date: a date value
    • time a time value
  • Return Value:
    • a datetime value by combining date and time,
    • missing if any argument is a missing value,
    • null if any argument is a null value but no argument is a missing value,
    • a type error will be raised if
      • the first argument is any other non-date value,
      • or, the second argument is any other non-time value.

time_from_unix_time_in_ms

  • Syntax:

    time_from_unix_time_in_ms(numeric_value)
    
  • Gets a time representing the time after numeric_value milliseconds since 00:00:00.000Z.

  • Arguments:
    • numeric_value: a tinyint/smallint/integer/bigint value representing the number of milliseconds.
  • Return Value:
    • a time value as the time after numeric_value milliseconds since 00:00:00.000Z,
    • missing if the argument is a missing value,
    • null if the argument is a null value,
    • any other non-numeric input value will cause a type error.
  • Example:

    {
      "date": date_from_unix_time_in_days(15800),
      "datetime": datetime_from_unix_time_in_ms(1365139700000),
      "time": time_from_unix_time_in_ms(3748)
    };
    
  • The expected result is:

    { "date": date("2013-04-05"), "datetime": datetime("2013-04-05T05:28:20.000Z"), "time": time("00:00:03.748Z") }
    

unix_time_from_date_in_days

  • Syntax:

    unix_time_from_date_in_days(date_value)
    
  • Gets an integer value representing the number of days since 1970_01_01 for date_value.

  • Arguments:
    • date_value: a date value.
  • Return Value:
    • a bigint value representing the number of days,
    • missing if the argument is a missing value,
    • null if the argument is a null value,
    • any other non-date input value will cause a type error.

unix_time_from_datetime_in_ms

  • Syntax:

    unix_time_from_datetime_in_ms(datetime_value)
    
  • Gets an integer value representing the time in milliseconds since 1970_01_01T00:00:00Z for datetime_value.

  • Arguments:
    • datetime_value : a datetime value.
  • Return Value:
    • a bigint value representing the number of milliseconds,
    • missing if the argument is a missing value,
    • null if the argument is a null value,
    • any other non-datetime input value will cause a type error.

unix_time_from_datetime_in_secs

  • Syntax:

    unix_time_from_datetime_in_secs(datetime_value)
    
  • Gets an integer value representing the time in seconds since 1970_01_01T00:00:00Z for datetime_value.

  • Arguments:
    • datetime_value : a datetime value.
  • Return Value:
    • a bigint value representing the number of seconds,
    • missing if the argument is a missing value,
    • null if the argument is a null value,
    • any other non-datetime input value will cause a type error.

unix_time_from_time_in_ms

  • Syntax:

    unix_time_from_time_in_ms(time_value)
    
  • Gets an integer value representing the time the milliseconds since 00:00:00.000Z for time_value.

  • Arguments:
    • time_value : a time value.
  • Return Value:
    • a bigint value representing the number of milliseconds,
    • missing if the argument is a missing value,
    • null if the argument is a null value,
    • any other non-datetime input value will cause a type error.
  • Example:

    {
      "date": date_from_unix_time_in_days(15800),
      "datetime": datetime_from_unix_time_in_ms(1365139700000),
      "time": time_from_unix_time_in_ms(3748)
    }
    
  • The expected result is:

    { "date": date("2013-04-05"), "datetime": datetime("2013-04-05T05:28:20.000Z"), "time": time("00:00:03.748Z") }
    

parse_date/parse_time/parse_datetime

  • Syntax:

parse_date/parse_time/parse_datetime(date,formatting_expression)

  • Creates a date/time/date_time value by treating date with formatting formatting_expression
  • Arguments:
    • date: a string value representing the date/time/datetime.
    • formatting_expression a string value providing the formatting for date_expression.Characters used to create date expression:
    • h hours
    • m minutes
    • s seconds
    • n milliseconds
    • a am/pm
    • z timezone
    • Y year
    • M month
    • D day
    • W weekday
    • _, ', /, ., ,, T seperators for both time and date
  • Return Value:
    • a date/time/date_time value corresponding to date,
    • missing if any argument is a missing value,
    • null if any argument is a null value but no argument is a missing value,
    • a type error will be raised if:
    • the first argument is any other non-date value,
    • the second argument is any other non-string value.
  • Example:

    parse_time("30:30","m:s");
    
  • The expected result is:

    time("00:30:30.000Z")
    

print_date/print_time/print_datetime

  • Syntax:

    print_date/print_time/print_datetime(date,formatting_expression)
    
  • Creates a string representing a date/time/date_time value of the date using the formatting formatting_expression

  • Arguments:
    • date: a date/time/datetime value.
    • formatting_expression a string value providing the formatting for date_expression. Characters used to create date expression:
    • h hours
    • m minutes
    • s seconds
    • n milliseconds
    • a am/pm
    • z timezone
    • Y year
    • M month
    • D day
    • W weekday
    • _, ', /, ., ,, T seperators for both time and date
  • Return Value:
    • a string value corresponding to date,
    • missing if any argument is a missing value,
    • null if any argument is a null value but no argument is a missing value,
    • a type error will be raised if:
      • the first argument is any other non-date value,
      • the second argument is any other non-string value.
  • Example:

    print_time(time("00:30:30.000Z"),"m:s");
    
  • The expected result is:

    "30:30"
    

get_interval_start, get_interval_end

  • Syntax:

    get_interval_start/get_interval_end(interval)
    
  • Gets the start/end of the given interval.

  • Arguments:
    • interval: the interval to be accessed.
  • Return Value:
    • a time, date, or datetime (depending on the time instances of the interval) representing the starting or ending time,
    • missing if the argument is a missing value,
    • null if the argument is a null value,
    • any other non-interval value will cause a type error.
  • Example:

    {
      "start": get_interval_start(interval_start_from_date("1984-01-01", "P1Y")),
      "end": get_interval_end(interval_start_from_date("1984-01-01", "P1Y"))
    };
    
  • The expected result is:

    { "start": date("1984_01_01"), "end": date("1985_01_01") }
    

get_interval_start_date/get_interval_start_datetimeget_interval_start_time, get_interval_end_date/get_interval_end_datetime/get_interval_end_time

  • Syntax:

    get_interval_start_date/get_interval_start_datetime/get_interval_start_time/get_interval_end_date/get_interval_end_datetime/get_interval_end_time(interval)
    
  • Gets the start/end of the given interval for the specific date/datetime/time type.

  • Arguments:
    • interval: the interval to be accessed.
  • Return Value:
    • a time, date, or datetime (depending on the function) representing the starting or ending time,
    • missing if the argument is a missing value,
    • null if the argument is a null value,
    • any other non-interval value will cause a type error.
  • Example:

    {
      "start1": get_interval_start_date(interval_start_from_date("1984-01-01", "P1Y")),
      "end1": get_interval_end_date(interval_start_from_date("1984-01-01", "P1Y")),
      "start2": get_interval_start_datetime(interval_start_from_datetime("1984-01-01T08:30:00.000", "P1Y1H")),
      "end2": get_interval_end_datetime(interval_start_from_datetime("1984-01-01T08:30:00.000", "P1Y1H")),
      "start3": get_interval_start_time(interval_start_from_time("08:30:00.000", "P1H")),
      "end3": get_interval_end_time(interval_start_from_time("08:30:00.000", "P1H"))
    };
    
  • The expected result is:

    {
      "start1": date("1984-01-01"),
      "end1": date("1985-01-01"),
      "start2": datetime("1984-01-01T08:30:00.000Z"),
      "end2": datetime("1985-01-01T09:30:00.000Z"),
      "start3": time("08:30:00.000Z"),
      "end3": time("09:30:00.000Z")
    }
    

get_overlapping_interval

  • Syntax:

    get_overlapping_interval(interval1, interval2)
    
  • Gets the start/end of the given interval for the specific date/datetime/time type.

  • Arguments:
    • interval1: an interval value
    • interval2: an interval value
  • Return Value:
    • an interval that is overlapping interval1 and interval2. If interval1 and interval2 do not overlap null is returned. Note each interval must be of the same type.
    • missing if any argument is a missing value,
    • null if any argument is a null value but no argument is a missing value,
    • any other non-interval input value will cause a type error.
  • Example:

    { "overlap1": get_overlapping_interval(interval(time("11:23:39"), time("18:27:19")), interval(time("12:23:39"), time("23:18:00"))),
      "overlap2": get_overlapping_interval(interval(time("12:23:39"), time("18:27:19")), interval(time("07:19:39"), time("09:18:00"))),
      "overlap3": get_overlapping_interval(interval(date("1980-11-30"), date("1999-09-09")), interval(date("2013-01-01"), date("2014-01-01"))),
      "overlap4": get_overlapping_interval(interval(date("1980-11-30"), date("2099-09-09")), interval(date("2013-01-01"), date("2014-01-01"))),
      "overlap5": get_overlapping_interval(interval(datetime("1844-03-03T11:19:39"), datetime("2000-10-30T18:27:19")), interval(datetime("1989-03-04T12:23:39"), datetime("2009-10-10T23:18:00"))),
      "overlap6": get_overlapping_interval(interval(datetime("1989-03-04T12:23:39"), datetime("2000-10-30T18:27:19")), interval(datetime("1844-03-03T11:19:39"), datetime("1888-10-10T23:18:00")))
    };
    
  • The expected result is:

    { "overlap1": interval(time("12:23:39.000Z"), time("18:27:19.000Z")),
      "overlap2": null,
      "overlap3": null,
      "overlap4": interval(date("2013-01-01"), date("2014_01_01")),
      "overlap5": interval(datetime("1989-03-04T12:23:39.000Z"), datetime("2000-10-30T18:27:19.000Z")),
      "overlap6": null
    }
    

interval_bin

  • Syntax:

    interval_bin(time_to_bin, time_bin_anchor, duration_bin_size)
    
  • Returns the interval value representing the bin containing the time_to_bin value.

  • Arguments:
    • time_to_bin: a date/time/datetime value representing the time to be binned.
    • time_bin_anchor: a date/time/datetime value representing an anchor of a bin starts. The type of this argument should be the same as the first time_to_bin argument.
    • duration_bin_size: the duration value representing the size of the bin, in the type of year_month_duration or day_time_duration. The type of this duration should be compatible with the type of time_to_bin, so that the arithmetic operation between time_to_bin and duration_bin_size is well_defined. Currently AsterixDB supports the following arithmetic operations:
      • datetime +|_ year_month_duration
      • datetime +|_ day_time_duration
      • date +|_ year_month_duration
      • date +|_ day_time_duration
      • time +|_ day_time_duration
  • Return Value:
    • a interval value representing the bin containing the time_to_bin value. Note that the internal type of this interval value should be the same as the time_to_bin type,
    • missing if any argument is a missing value,
    • null if any argument is a null value but no argument is a missing value,
    • a type error will be raised if:
      • the first argument or the second argument is any other non-date/non-time/non-datetime value,
      • or, the second argument is any other non-year_month_duration/non-day_time_duration value.
  • Example:

    {
      "bin1": interval_bin(date("2010-10-30"), date("1990-01-01"), year_month_duration("P1Y")),
      "bin2": interval_bin(datetime("1987-11-19T23:49:23.938"), datetime("1990-01-01T00:00:00.000Z"), year_month_duration("P6M")),
      "bin3": interval_bin(time("12:23:34.930+07:00"), time("00:00:00"), day_time_duration("PT1M")),
      "bin4": interval_bin(datetime("1987-11-19T23:49:23.938"), datetime("2013-01-01T00:00:00.000"), day_time_duration("PT24H"))
    };
    
  • The expected result is:

    {
      "bin1": interval(date("2010-01-01"),date("2011-01-01")),
      "bin2": interval(datetime("1987-07-01T00:00:00.000Z"), datetime("1988-01-01T00:00:00.000Z")),
      "bin3": interval(time("05:23:00.000Z"), time("05:24:00.000Z")),
      "bin4": interval(datetime("1987-11-19T00:00:00.000Z"), datetime("1987-11-20T00:00:00.000Z"))
    }
    

interval_start_from_date/time/datetime

  • Syntax:

    interval_start_from_date/time/datetime(date/time/datetime, duration)
    
  • Construct an interval value by the given starting date/time/datetime and the duration that the interval lasts.

  • Arguments:
    • date/time/datetime: a string representing a date, time or datetime, or a date/time/datetime value, representing the starting time point.
    • duration: a string or duration value representing the duration of the interval. Note that duration cannot be negative value.
  • Return Value:
    • an interval value representing the interval starting from the given time point with the length of duration,
    • missing if any argument is a missing value,
    • null if any argument is a null value but no argument is a missing value,
    • a type error will be raised if:
      • the first argument or the second argument is any other non-date/non-time/non-datetime value,
      • or, the second argument is any other non-duration value.
  • Example:

    {
      "interval1": interval_start_from_date("1984-01-01", "P1Y"),
      "interval2": interval_start_from_time(time("02:23:28.394"), "PT3H24M"),
      "interval3": interval_start_from_datetime("1999-09-09T09:09:09.999", duration("P2M30D"))
    };
    
  • The expectecd result is:

    {
      "interval1": interval(date("1984-01-01"), date("1985-01-01")),
      "interval2": interval(time("02:23:28.394Z"), time("05:47:28.394Z")),
      "interval3": interval(datetime("1999-09-09T09:09:09.999Z"), datetime("1999-12-09T09:09:09.999Z"))
    }
    

overlap_bins

  • Return Value:

    • a interval value representing the bin containing the time_to_bin value. Note that the internal type of this interval value should be the same as the time_to_bin type.
  • Syntax:

    overlap_bins(interval, time_bin_anchor, duration_bin_size)
    
  • Returns an ordered list of interval values representing each bin that is overlapping the interval.

  • Arguments:
    • interval: an interval value
    • time_bin_anchor: a date/time/datetime value representing an anchor of a bin starts. The type of this argument should be the same as the first time_to_bin argument.
    • duration_bin_size: the duration value representing the size of the bin, in the type of year_month_duration or day_time_duration. The type of this duration should be compatible with the type of time_to_bin, so that the arithmetic operation between time_to_bin and duration_bin_size is well_defined. Currently AsterixDB supports the following arithmetic operations:
      • datetime +|_ year_month_duration
      • datetime +|_ day_time_duration
      • date +|_ year_month_duration
      • date +|_ day_time_duration
      • time +|_ day_time_duration
  • Return Value:
    • a ordered list of interval values representing each bin that is overlapping the interval. Note that the internal type as time_to_bin and duration_bin_size.
    • missing if any argument is a missing value,
    • null if any argument is a null value but no argument is a missing value,
    • a type error will be raised if:
      • the first arugment is any other non-interval value,
      • or, the second argument is any other non-date/non-time/non-datetime value,
      • or, the second argument is any other non-year_month_duration/non-day_time_duration value.
  • Example:

    {
      "timebins": overlap_bins(interval(time("17:23:37"), time("18:30:21")), time("00:00:00"), day_time_duration("PT30M")),
      "datebins": overlap_bins(interval(date("1984-03-17"), date("2013-08-22")), date("1990-01-01"), year_month_duration("P10Y")),
      "datetimebins": overlap_bins(interval(datetime("1800-01-01T23:59:48.938"), datetime("2015-07-26T13:28:30.218")),
                                  datetime("1900-01-01T00:00:00.000"), year_month_duration("P100Y"))
    };
    
  • The expected result is:

    {
      "timebins": [
                    interval(time("17:00:00.000Z"), time("17:30:00.000Z")),
                    interval(time("17:30:00.000Z"), time("18:00:00.000Z")),
                    interval(time("18:00:00.000Z"), time("18:30:00.000Z")),
                    interval(time("18:30:00.000Z"), time("19:00:00.000Z"))
                  ],
      "datebins": [
                    interval(date("1980-01-01"), date("1990-01-01")),
                    interval(date("1990-01-01"), date("2000-01-01")),
                    interval(date("2000-01-01"), date("2010-01-01")),
                    interval(date("2010-01-01"), date("2020-01-01"))
                  ],
      "datetimebins": [
                        interval(datetime("1800-01-01T00:00:00.000Z"), datetime("1900-01-01T00:00:00.000Z")),
                        interval(datetime("1900-01-01T00:00:00.000Z"), datetime("2000-01-01T00:00:00.000Z")),
                        interval(datetime("2000-01-01T00:00:00.000Z"), datetime("2100-01-01T00:00:00.000Z"))
                       ]
    };
    

interval_before, interval_after

  • Syntax:

    interval_before(interval1, interval2)
    interval_after(interval1, interval2)
    
  • These two functions check whether an interval happens before/after another interval.

  • Arguments:
    • interval1, interval2: two intervals to be compared
  • Return Value:
    • a boolean value. Specifically, interval_before(interval1, interval2) is true if and only if interval1.end < interval2.start, and interval_after(interval1, interval2) is true if and only if interval1.start > interval2.end.
    • missing if the argument is a missing value,
    • null if any argument is a null value but no argument is a missing value,
    • any other non-interval input value will cause a type error.
  • Examples:

    {
      "interval_before": interval_before(interval(date("2000-01-01"), date("2005-01-01")),
                                         interval(date("2005-05-01"), date("2012-09-09"))),
      "interval_after": interval_after(interval(date("2005-05-01"), date("2012-09-09")),
                                       interval(date("2000-01-01"), date("2005-01-01")))
    };
    
  • The expected result is:

    { "interval_before": true, "interval_after": true }
    

interval_covers, interval_covered_by

  • Syntax:

    interval_covers(interval1, interval2)
    interval_covered_by(interval1, interval2)
    
  • These two functions check whether one interval covers the other interval.

  • Arguments:
    • interval1, interval2: two intervals to be compared
  • Return Value:
    • a boolean value. Specifically, interval_covers(interval1, interval2) is true if and only if

      interval1.start <= interval2.start AND interval1.end >= interval2.end

      interval_covered_by(interval1, interval2) is true if and only if

      interval2.start <= interval1.start AND interval2.end >= interval1.end

    • missing if the argument is a missing value,

    • null if any argument is a null value but no argument is a missing value,
    • any other non-interval input value will cause a type error.
  • Examples:

    {
      "interval_covers": interval_covers(interval(date("2000-01-01"), date("2005-01-01")),
                                         interval(date("2000-03-01"), date("2004-09-09"))),
      "interval_covered_by": interval_covered_by(interval(date("2006-08-01"), date("2007-03-01")),
                                                 interval(date("2004-09-10"), date("2012-08-01")))
    };
    
  • The expected result is:

    { "interval_covers": true, "interval_covered_by": true }
    

interval_overlaps, interval_overlapped_by

  • Syntax:

    interval_overlaps(interval1, interval2)
    interval_overlapped_by(interval1, interval2)
    
  • These functions check whether two intervals overlap with each other.

  • Arguments:
    • interval1, interval2: two intervals to be compared
  • Return Value:
    • a boolean value. Specifically, interval_overlaps(interval1, interval2) is true if and only if

      interval1.start < interval2.start AND interval2.end > interval1.end AND interval1.end > interval2.start

    interval_overlapped_by(interval1, interval2) is true if and only if

    interval2.start < interval1.start
    AND interval1.end > interval2.end
    AND interval2.end > interval1.start
    
    • missing if the argument is a missing value,
    • null if any argument is a null value but no argument is a missing value,
    • any other non-interval input value will cause a type error.

    Note that interval_overlaps and interval_overlapped_by are following the Allen’s relations on the definition of overlap.

  • Examples:

    {
      "overlaps": interval_overlaps(interval(date("2000-01-01"), date("2005-01-01")),
                                    interval(date("2004-05-01"), date("2012-09-09"))),
      "overlapped_by": interval_overlapped_by(interval(date("2006-08-01"), date("2007-03-01")),
                                              interval(date("2004-05-01"), date("2012-09-09"))))
    };
    
  • The expected result is:

    { "overlaps": true, "overlapped_by": true }
    

interval_overlapping

Note that interval_overlapping is not an Allen’s Relation, but syntactic sugar we added for the case that the intersect of two intervals is not empty. Basically this function returns true if any of these functions return true: interval_overlaps, interval_overlapped_by, interval_covers, or interval_covered_by.

  • Syntax:

    interval_overlapping(interval1, interval2)
    
  • This functions check whether two intervals share any points with each other.

  • Arguments:
    • interval1, interval2: two intervals to be compared
  • Return Value:
    • a boolean value. Specifically, interval_overlapping(interval1, interval2) is true if

      interval1.start < interval2.end AND interval1.end > interval2.start

    • missing if the argument is a missing value,

    • null if any argument is a null value but no argument is a missing value,
    • any other non-interval input value will cause a type error.
  • Examples:

    {
      "overlapping1": interval_overlapping(interval(date("2000-01-01"), date("2005-01-01")),
                                           interval(date("2004-05-01"), date("2012-09-09"))),
      "overlapping2": interval_overlapping(interval(date("2006-08-01"), date("2007-03-01")),
                                           interval(date("2004-09-10"), date("2006-12-31")))
    };
    
  • The expected result is:

    { "overlapping1": true, "overlapping2": true }
    

interval_meets, interval_met_by

  • Syntax:

    interval_meets(interval1, interval2)
    interval_met_by(interval1, interval2)
    
  • These two functions check whether an interval meets with another interval.

  • Arguments:
    • interval1, interval2: two intervals to be compared
  • Return Value:
    • a boolean value. Specifically, interval_meets(interval1, interval2) is true if and only if interval1.end = interval2.start, and interval_met_by(interval1, interval2) is true if and only if interval1.start = interval2.end. If any of the two inputs is null, null is returned.
    • missing if the argument is a missing value,
    • null if any argument is a null value but no argument is a missing value,
    • any other non-interval input value will cause a type error.
  • Examples:

    {
      "meets": interval_meets(interval(date("2000-01-01"), date("2005-01-01")),
                              interval(date("2005-01-01"), date("2012-09-09"))),
      "metby": interval_met_by(interval(date("2006-08-01"), date("2007-03-01")),
                               interval(date("2004-09-10"), date("2006-08-01")))
    };
    
  • The expected result is:

    { "meets": true, "metby": true }
    

interval_starts, interval_started_by

  • Syntax:

    interval_starts(interval1, interval2)
    interval_started_by(interval1, interval2)
    
  • These two functions check whether one interval starts with the other interval.

  • Arguments:
    • interval1, interval2: two intervals to be compared
  • Return Value:
    • a boolean value. Specifically, interval_starts(interval1, interval2) returns true if and only if

      interval1.start = interval2.start AND interval1.end <= interval2.end

    interval_started_by(interval1, interval2) returns true if and only if

    interval1.start = interval2.start
    AND interval2.end <= interval1.end
    
    • missing if the argument is a missing value,
    • null if any argument is a null value but no argument is a missing value,
    • any other non-interval input value will cause a type error.
  • Examples:

    {
      "interval_starts": interval_starts(interval(date("2000-01-01"), date("2005-01-01")),
                                         interval(date("2000-01-01"), date("2012-09-09"))),
      "interval_started_by": interval_started_by(interval(date("2006-08-01"), date("2007-03-01")),
                                                 interval(date("2006-08-01"), date("2006-08-02")))
    };
    
  • The expected result is:

    { "interval_starts": true, "interval_started_by": true }
    

interval_ends, interval_ended_by

  • Syntax:

    interval_ends(interval1, interval2)
    interval_ended_by(interval1, interval2)
    
  • These two functions check whether one interval ends with the other interval.

  • Arguments:
    • interval1, interval2: two intervals to be compared
  • Return Value:
    • a boolean value. Specifically, interval_ends(interval1, interval2) returns true if and only if

      interval1.end = interval2.end AND interval1.start >= interval2.start

      interval_ended_by(interval1, interval2) returns true if and only if

      interval2.end = interval1.end AND interval2.start >= interval1.start

    • missing if the argument is a missing value,

    • null if any argument is a null value but no argument is a missing value,
    • any other non-interval input value will cause a type error.
  • Examples:

    {
      "interval_ends": interval_ends(interval(date("2000-01-01"), date("2005-01-01")),
                                     interval(date("1998-01-01"), date("2005-01-01"))),
      "interval_ended_by": interval_ended_by(interval(date("2006-08-01"), date("2007-03-01")),
                                             interval(date("2006-09-10"), date("2007-03-01")))
    };
    
  • The expected result is:

    { "interval_ends": true, "interval_ended_by": true }
    

Object Functions

get_object_fields

  • Syntax:

    get_object_fields(input_object)
    
  • Access the object field names, type and open status for a given object.

  • Arguments:
    • input_object : a object value.
  • Return Value:
    • an array of object values that include the field_name string, field_type string, is_open boolean (used for debug purposes only: true if field is open and false otherwise), and optional nested orderedList for the values of a nested object,
    • missing if the argument is a missing value,
    • null if the argument is a null value,
    • any other non-object input value will cause a type error.
  • Example:

    get_object_fields(
                      {
                        "id": 1,
                        "project": "AsterixDB",
                        "address": {"city": "Irvine", "state": "CA"},
                        "related": ["Hivestrix", "Preglix", "Apache VXQuery"]
                      }
                     );
    
  • The expected result is:

    [
      { "field-name": "id", "field-type": "INT64", "is-open": false },
      { "field-name": "project", "field-type": "STRING", "is-open": false },
      { "field-name": "address", "field-type": "RECORD", "is-open": false,
        "nested": [
                    { "field-name": "city", "field-type": "STRING", "is-open": false },
                    { "field-name": "state", "field-type": "STRING", "is-open": false }
                  ]
      },
      { "field-name":
            "related",
            "field-type": "ORDEREDLIST",
            "is-open": false,
            "list": [
                      { "field-type": "STRING" },
                      { "field-type": "STRING" },
                      { "field-type": "STRING" }
                    ]
      }
    ]
    

]

get_object_field_value

  • Syntax:

    get_object_field_value(input_object, string)
    
  • Access the field name given in the string_expression from the object_expression.

  • Arguments:
    • input_object : a object value.
    • string : a string representing the top level field name.
  • Return Value:
    • an any value saved in the designated field of the object,
    • missing if any argument is a missing value,
    • null if any argument is a null value but no argument is a missing value,
    • a type error will be raised if:
      • the first argument is any other non-object value,
      • or, the second argument is any other non-string value.
  • Example:

    get_object_field_value({
                             "id": 1,
                             "project": "AsterixDB",
                             "address": {"city": "Irvine", "state": "CA"},
                             "related": ["Hivestrix", "Preglix", "Apache VXQuery"]
                            },
                            "project"
                           );
    
  • The expected result is:

    "AsterixDB"
    

object_remove_fields

  • Syntax:

    object_remove_fields(input_object, field_names)
    
  • Remove indicated fields from a object given a list of field names.

  • Arguments:
    • input_object: a object value.
    • field_names: an array of strings and/or array of array of strings.
  • Return Value:

    • a new object value without the fields listed in the second argument,
    • missing if any argument is a missing value,
    • null if any argument is a null value but no argument is a missing value,
    • a type error will be raised if:
      • the first argument is any other non-object value,
      • or, the second argument is any other non-array value or recursively contains non-string items.
  • Example:

    object_remove_fields(
                           {
                             "id":1,
                             "project":"AsterixDB",
                             "address":{"city":"Irvine", "state":"CA"},
                             "related":["Hivestrix", "Preglix", "Apache VXQuery"]
                           },
                           [["address", "city"], "related"]
                         );
    
  • The expected result is:

    {
      "id":1,
      "project":"AsterixDB",
      "address":{ "state": "CA" }
    }
    

object_add_fields

  • Syntax:

    object_add_fields(input_object, fields)
    
  • Add fields to a object given a list of field names.

  • Arguments:
    • input_object : a object value.
    • fields: an array of field descriptor objects where each object has field_name and field_value.
  • Return Value:
    • a new object value with the new fields included,
    • missing if any argument is a missing value,
    • null if any argument is a null value but no argument is a missing value,
    • a type error will be raised if:
      • the first argument is any other non-object value,
      • the second argument is any other non-array value, or contains non-object items.
  • Example:

    object_add_fields(
                       {
                         "id":1,
                         "project":"AsterixDB",
                         "address":{"city":"Irvine", "state":"CA"},
                         "related":["Hivestrix", "Preglix", "Apache VXQuery"]
                        },
                        [{"field-name":"employment_location", "field-value":create_point(30.0,70.0)}]
                      );
    
  • The expected result is:

    {
       "id":1,
       "project":"AsterixDB",
       "address":{"city":"Irvine", "state":"CA"},
       "related":["Hivestrix", "Preglix", "Apache VXQuery"]
       "employment_location": point("30.0,70.0")
     }
    

object_merge

  • Syntax:

    object_merge(object1, object2)
    
  • Merge two different objects into a new object.

  • Arguments:
    • object1 : a object value.
    • object2 : a object value.
  • Return Value:
    • a new object value with fields from both input objects. If a field’s names in both objects are the same, an exception is issued,
    • missing if any argument is a missing value,
    • null if any argument is a null value but no argument is a missing value,
    • any other non-object input value will cause a type error.
  • Example:

    object_merge(
                  {
                    "id":1,
                    "project":"AsterixDB",
                    "address":{"city":"Irvine", "state":"CA"},
                    "related":["Hivestrix", "Preglix", "Apache VXQuery"]
                  },
                  {
                    "user_id": 22,
                    "employer": "UC Irvine",
                    "employment_type": "visitor"
                  }
                );
    
  • The expected result is:

    {
      "employment_type": "visitor",
      "address": {
        "city": "Irvine",
        "state": "CA"
      },
      "related": [
        "Hivestrix",
        "Preglix",
        "Apache VXQuery"
      ],
      "user_id": 22,
      "project": "AsterixDB",
      "employer": "UC Irvine",
      "id": 1
    }
    

object_length

  • Syntax:

    object_length(input_object)
    
  • Returns number of top-level fields in the given object

  • Arguments:
    • input_object : an object value.
  • Return Value:
    • an integer that represents the number of top-level fields in the given object,
    • missing if the argument is a missing value,
    • null if the argument is a null value or any other non-object value
  • Example:

    object_length(
                   {
                     "id": 1,
                     "project": "AsterixDB",
                     "address": {"city": "Irvine", "state": "CA"},
                   }
                 );
    
  • The expected result is:

    3
    

object_names

  • Syntax:

    object_names(input_object)
    
  • Returns names of top-level fields in the given object

  • Arguments:
    • input_object : an object value.
  • Return Value:
    • an array with top-level field names of the given object,
    • missing if the argument is a missing value,
    • null if the argument is a null value or any other non-object value
  • Example:

    object_names(
                   {
                     "id": 1,
                     "project": "AsterixDB",
                     "address": {"city": "Irvine", "state": "CA"},
                   }
                 );
    
  • The expected result is:

    [ "id", "project", "address" ]
    

object_remove

  • Syntax:

    object_remove(input_object, field_name)
    
  • Returns a new object that has the same fields as the input object except the field to be removed

  • Arguments:
    • input_object : an object value.
    • field_name : a string field name.
  • Return Value:
    • A new object that has the same fields as input_object except the field field_name,
    • missing if the argument input_object or field_name is missing,
    • null if the argument input_object is null or any other non-object value, or the argument field_name is null or any other non-string value.
  • Example:

    object_remove(
                   {
                     "id": 1,
                     "project": "AsterixDB",
                     "address": {"city": "Irvine", "state": "CA"}
                   }
                   , "address"
                 );
    
  • The expected result is:

    {
      "id": 1,
      "project": "AsterixDB",
    }
    

object_rename

  • Syntax:

    object_rename(input_object, old_field, new_field)
    
  • Returns a new object that has the same fields as input_object with field old_field replaced by new_field

  • Arguments:
    • input_object : an object value.
    • old_field : a string representing the old (original) field name inside the object input_object.
    • new_field : a string representing the new field name to replace old_field inside the object input_object.
  • Return Value:
    • A new object that has the same fields as input_object with field old_field replaced by new_field,
    • missing if any argument is a missing value,
    • null if any argument is null or input_object is non-object value, or old_field is non-string value, or new_field is any non-string value.
  • Example:

    object_rename(
                   {
                     "id": 1,
                     "project": "AsterixDB",
                     "address": {"city": "Irvine", "state": "CA"}
                   }
                   , "address"
                   , "location"
                 );
    
  • The expected result is:

    {
      "id": 1,
      "project": "AsterixDB",
      "location": {"city": "Irvine", "state": "CA"}
    }
    

object_unwrap

  • Syntax:

    object_unwrap(input_object)
    
  • Returns the value of the single name-value pair that appears in input_object.

  • Arguments:
    • input_object : an object value that consists of exactly one name-value pair.
  • Return Value:
    • The value of the single name-value pair that appears in input_object,
    • missing if input_object is missing,
    • null if input_object is null, or an empty object, or there is more than one name-value pair in input_object, or any non-object value.
  • Example:

    object_unwrap(
                 {
                   "id": 1
                 }
               );
    
  • The expected result is:

    {
      1
    }
    

object_replace

  • Syntax:

    object_replace(input_object, old_value, new_value)
    
  • Returns a new object that has the same fields as input_object with all occurrences of value old_value replaced by new_value

  • Arguments:
    • input_object : an object value.
    • old_value : a primitive type value to be replaced by new_value.
    • new_value : a value to replace old_value.
  • Return Value:
    • A new object that has the same fields as input_object with all occurrences of value old_value replaced by new_value,
    • missing if any argument is a missing value,
    • null if input_object or old_value is null,
    • a type error will be raised if:
      • old_value is not a primitive type value.
  • Example:

    object_replace(
                   {
                     "id": 1,
                     "project": "AsterixDB",
                     "address": {"city": "Irvine", "state": "CA"}
                   }
                   , "AsterixDB"
                   , "Apache AsterixDB"
                 );
    
  • The expected result is:

    {
      "id": 1,
      "project": "Apache AsterixDB",
      "location": {"city": "Irvine", "state": "CA"}
    }
    

object_add

  • Syntax:

    object_add(input_object, field_name, field_value)
    
  • Returns a new object that has the same fields as input_object as well as the new field field_name.

  • Arguments:
    • input_object : an object value.
    • field_name : a string representing a field name to be added.
    • field_value : a value to be assigned to the new field field_name.
  • Return Value:
    • A new object that has the same fields as input_object as well as the new field field_name,
    • missing if input_object or field_name is missing,
    • null if input_object or field_name is null, or input_object is not an object, or field_name is not a string,
    • input_object if field_namealready exists in input_object or field_value is missing.
  • Example:

    object_add(
                   {
                     "id": 1,
                     "project": "AsterixDB",
                     "address": {"city": "Irvine", "state": "CA"}
                   }
                   , "company"
                   , "Apache"
                 );
    
  • The expected result is:

    {
      "id": 1,
      "project": "AsterixDB",
      "location": {"city": "Irvine", "state": "CA"},
      "company": "Apache"
    }
    

object_put

  • Syntax:

    object_put(input_object, field_name, field_value)
    
  • Adds, modifies, or removes a field of an object.

  • Arguments:
    • input_object : an object value.
    • field_name : a string representing a field name to be added.
    • field_value : a value to be assigned to the new field field_name.
  • Return Value:
    • a new object that has the same fields as input_object as well as the new field field_name, or with updated field_name value to field_value if field_name already exists in input_object, or with field_nameremoved if field_name already exists in input_object and field_value is missing,
    • missing if input_object or field_name is missing,
    • null if input_object or field_name is null, or input_object is not an object, or field_name is not not a string.
  • Example:

    object_put(
                   {
                     "id": 1,
                     "project": "AsterixDB",
                     "address": {"city": "Irvine", "state": "CA"}
                   }
                   , "project"
                   , "Apache AsterixDB"
                 );
    
  • The expected result is:

    {
      "id": 1,
      "project": "Apache AsterixDB",
      "location": {"city": "Irvine", "state": "CA"}
    }
    

object_values

  • Syntax:

    object_values(input_object)
    
  • Returns an array of the values of the fields in input_object.

  • Arguments:
    • input_object : an object value.
  • Return Value:
    • An array of the values of the fields in input_object,
    • missing if input_object is missing,
    • null if input_object is null or any non-object value.
  • Example:

    object_values(
                   {
                     "id": 1,
                     "project": "AsterixDB",
                     "address": {"city": "Irvine", "state": "CA"}
                   }
                 );
    
  • The expected result is:

    [
      1,
      "AsterixDB",
      {"city": "Irvine", "state": "CA"}
    ]
    

object_pairs

  • Syntax:

    object_pairs(input_object)
    
  • Returns an array of objects describing fields of input_object. For each field of the input_object the returned array contains an object with two fields name and value which are set to the input_object’s field name and value.

  • Arguments:

    • input_object : an object value.
  • Return Value:
    • An array of the name/value pairs of the fields in input_object,
    • missing if input_object is missing,
    • null if input_object is null or any non-object value.
  • Example:

    object_pairs(
                  {
                    "id": 1,
                    "project": "AsterixDB",
                    "address": {"city": "Irvine", "state": "CA"}
                  }
                );
    
  • The expected result is:

    [
      { "name": "id", "value": 1 },
      { "name": "project", "value": "AsterixDB" },
      { "name": "address", "value": {"city": "Irvine", "state": "CA"} }
    ]
    

pairs

  • Syntax:

    pairs(input_object)
    
  • Returns an array of arrays describing fields of input_object, including nested fields. For each field of the input_object the returned array contains an array with two elements. The first element is the name and the second one is the value of the input_object’s field. The input object is introspected recursively, so all fields of its nested objects are returned. Nested objects contained in arrays and multisets are also processed by this function.

  • Arguments:

    • input_object : an object value (or an array or a multiset)
  • Return Value:
    • An array of arrays with name, value pairs of the fields in input_object, including nested fields. Each inner array has exactly two items: name and value of the input_object’s field.
    • missing if input_object is missing,
    • null if input_object is null or a value of a primitive data type.
  • Example:

    pairs(
           {
             "id": 1,
             "project": "AsterixDB",
             "address": {"city": "Irvine", "state": "CA"}
           }
         );
    
  • The expected result is:

    [
      [ "id", 1 ],
      [ "project", "AsterixDB" ],
      [ "address", { "city": "Irvine", "state": "CA" } ],
      [ "city", "Irvine" ],
      [ "state", "CA" ]
    ]
    

Aggregate Functions (Array Functions)

This section contains detailed descriptions of the built-in aggregate functions in the query language.

The query language also supports standard SQL aggregate functions (e.g., MIN, MAX, SUM, COUNT, and AVG). Note that these are not real functions in the query language, but just syntactic sugars over corresponding builtin aggregate functions (e.g., ARRAY_MIN, ARRAY_MAX, ARRAY_SUM, ARRAY_COUNT, and ARRAY_AVG). Refer to SQL-92 Aggregation Functions for details.

The DISTINCT keyword may be used with built-in aggregate functions and standard SQL aggregate functions. It may also be used with aggregate functions used as window functions. It determines whether the function aggregates all values in the group, or distinct values only. Refer to Aggregation Functions for details.

Aggregate functions may be used as window functions when they are used with an OVER clause. Refer to OVER Clauses for details.

array_count

  • Syntax:

    array_count(collection)
    
  • Gets the number of non-null and non-missing items in the given collection.

  • Arguments:
    • collection could be:
      • an array or multiset to be counted,
      • or, a null value,
      • or, a missing value.
  • Return Value:
    • a bigint value representing the number of non-null and non-missing items in the given collection,
    • null is returned if the input is null or missing,
    • any other non-array and non-multiset input value will cause an error.
  • Example:

    array_count( ['hello', 'world', 1, 2, 3, null, missing] );
    
  • The expected result is:

    5
    

array_avg

  • Syntax:

    array_avg(num_collection)
    
  • Gets the average value of the non-null and non-missing numeric items in the given collection.

  • Arguments:
    • num_collection could be:
      • an array or multiset containing numeric values, nulls or missings,
      • or, a null value,
      • or, a missing value.
  • Return Value:
    • a double value representing the average of the non-null and non-missing numbers in the given collection,
    • null is returned if the input is null or missing,
    • null is returned if the given collection does not contain any non-null and non-missing items,
    • any other non-array and non-multiset input value will cause a type error,
    • any other non-numeric value in the input collection will cause a type error.
  • Example:

    array_avg( [1.2, 2.3, 3.4, 0, null] );
    
  • The expected result is:

    1.725
    

array_sum

  • Syntax:

    array_sum(num_collection)
    
  • Gets the sum of non-null and non-missing items in the given collection.

  • Arguments:
    • num_collection could be:
      • an array or multiset containing numeric values, nulls or missings,
      • or, a null value,
      • or, a missing value.
  • Return Value:
    • the sum of the non-null and non-missing numbers in the given collection. The returning type is decided by the item type with the highest order in the numeric type promotion order (tinyint-> smallint->integer->bigint->float->double) among items.
    • null is returned if the input is null or missing,
    • null is returned if the given collection does not contain any non-null and non-missing items,
    • any other non-array and non-multiset input value will cause a type error,
    • any other non-numeric value in the input collection will cause a type error.
  • Example:

    array_sum( [1.2, 2.3, 3.4, 0, null, missing] );
    
  • The expected result is:

    6.9
    

array_min

  • Syntax:

    array_min(num_collection)
    
  • Gets the min value of non-null and non-missing comparable items in the given collection.

  • Arguments:
    • num_collection could be:
      • an array or multiset,
      • or, a null value,
      • or, a missing value.
  • Return Value:
    • the min value of non-null and non-missing values in the given collection. The returning type is decided by the item type with the highest order in the type promotion order (tinyint-> smallint->integer->bigint->float->double) among numeric items.
    • null is returned if the input is null or missing,
    • null is returned if the given collection does not contain any non-null and non-missing items,
    • multiple incomparable items in the input array or multiset will cause a type error,
    • any other non-array and non-multiset input value will cause a type error.
  • Example:

    array_min( [1.2, 2.3, 3.4, 0, null, missing] );
    
  • The expected result is:

    0.0
    

array_max

  • Syntax:

    array_max(num_collection)
    
  • Gets the max value of the non-null and non-missing comparable items in the given collection.

  • Arguments:
    • num_collection could be:
      • an array or multiset,
      • or, a null value,
      • or, a missing value.
  • Return Value:
    • the max value of non-null and non-missing numbers in the given collection. The returning type is decided by the item type with the highest order in the type promotion order (tinyint-> smallint->integer->bigint->float->double) among numeric items.
    • null is returned if the input is null or missing,
    • null is returned if the given collection does not contain any non-null and non-missing items,
    • multiple incomparable items in the input array or multiset will cause a type error,
    • any other non-array and non-multiset input value will cause a type error.
  • Example:

    array_max( [1.2, 2.3, 3.4, 0, null, missing] );
    
  • The expected result is:

    3.4
    

array_stddev_samp

  • Syntax:

    array_stddev_samp(num_collection)
    
  • Gets the sample standard deviation value of the non-null and non-missing numeric items in the given collection.

  • Arguments:
    • num_collection could be:
      • an array or multiset containing numeric values, nulls or missings,
      • or, a null value,
      • or, a missing value.
  • Return Value:
    • a double value representing the sample standard deviation of the non-null and non-missing numbers in the given collection,
    • null is returned if the input is null or missing,
    • null is returned if the given collection does not contain any non-null and non-missing items,
    • any other non-array and non-multiset input value will cause a type error,
    • any other non-numeric value in the input collection will cause a type error.
  • Example:

    array_stddev_samp( [1.2, 2.3, 3.4, 0, null] );
    
  • The expected result is:

    1.4591664287073858
    

array_stddev_pop

  • Syntax:

    array_stddev_pop(num_collection)
    
  • Gets the population standard deviation value of the non-null and non-missing numeric items in the given collection.

  • Arguments:
    • num_collection could be:
      • an array or multiset containing numeric values, nulls or missings,
      • or, a null value,
      • or, a missing value.
  • Return Value:
    • a double value representing the population standard deviation of the non-null and non-missing numbers in the given collection,
    • null is returned if the input is null or missing,
    • null is returned if the given collection does not contain any non-null and non-missing items,
    • any other non-array and non-multiset input value will cause a type error,
    • any other non-numeric value in the input collection will cause a type error.
  • Example:

    array_stddev_pop( [1.2, 2.3, 3.4, 0, null] );
    
  • The expected result is:

    1.2636751956100112
    

array_var_samp

  • Syntax:

    array_var_samp(num_collection)
    
  • Gets the sample variance value of the non-null and non-missing numeric items in the given collection.

  • Arguments:
    • num_collection could be:
      • an array or multiset containing numeric values, nulls or missings,
      • or, a null value,
      • or, a missing value.
  • Return Value:
    • a double value representing the sample variance of the non-null and non-missing numbers in the given collection,
    • null is returned if the input is null or missing,
    • null is returned if the given collection does not contain any non-null and non-missing items,
    • any other non-array and non-multiset input value will cause a type error,
    • any other non-numeric value in the input collection will cause a type error.
  • Example:

    array_var_samp( [1.2, 2.3, 3.4, 0, null] );
    
  • The expected result is:

    2.1291666666666664
    

array_var_pop

  • Syntax:

    array_var_pop(num_collection)
    
  • Gets the population variance value of the non-null and non-missing numeric items in the given collection.

  • Arguments:
    • num_collection could be:
      • an array or multiset containing numeric values, nulls or missings,
      • or, a null value,
      • or, a missing value.
  • Return Value:
    • a double value representing the population variance of the non-null and non-missing numbers in the given collection,
    • null is returned if the input is null or missing,
    • null is returned if the given collection does not contain any non-null and non-missing items,
    • any other non-array and non-multiset input value will cause a type error,
    • any other non-numeric value in the input collection will cause a type error.
  • Example:

    array_var_pop( [1.2, 2.3, 3.4, 0, null] );
    
  • The expected result is:

    1.5968749999999998
    

array_skewness

  • Syntax:

    array_skewness(num_collection)
    
  • Gets the skewness value of the non-null and non-missing numeric items in the given collection.

  • Arguments:
    • num_collection could be:
      • an array or multiset containing numeric values, nulls or missings,
      • or, a null value,
      • or, a missing value.
  • Return Value:
    • a double value representing the skewness of the non-null and non-missing numbers in the given collection,
    • null is returned if the input is null or missing,
    • null is returned if the given collection does not contain any non-null and non-missing items,
    • any other non-array and non-multiset input value will cause a type error,
    • any other non-numeric value in the input collection will cause a type error.
  • Example:

    array_skewness( [1.2, 2.3, 3.4, 0, null] );
    
  • The expected result is:

    -0.04808451539164242
    

array_kurtosis

  • Syntax:

    array_kurtosis(num_collection)
    
  • Gets the kurtosis value from the normal distribution of the non-null and non-missing numeric items in the given collection.

  • Arguments:
    • num_collection could be:
      • an array or multiset containing numeric values, nulls or missings,
      • or, a null value,
      • or, a missing value.
  • Return Value:
    • a double value representing the kurtosis from a normal distribution of the non-null and non-missing numbers in the given collection,
    • null is returned if the input is null or missing,
    • null is returned if the given collection does not contain any non-null and non-missing items,
    • any other non-array and non-multiset input value will cause a type error,
    • any other non-numeric value in the input collection will cause a type error.
  • Example:

    array_kurtosis( [1.2, 2.3, 3.4, 0, null] );
    
  • The expected result is:

    -1.342049701096427
    

strict_count

  • Syntax:

    strict_count(collection)
    
  • Gets the number of items in the given collection.

  • Arguments:
    • collection could be:
      • an array or multiset containing the items to be counted,
      • or a null value,
      • or a missing value.
  • Return Value:
    • a bigint value representing the number of items in the given collection,
    • null is returned if the input is null or missing.
  • Example:

    strict_count( [1, 2, null, missing] );
    
  • The expected result is:

    4
    

strict_avg

  • Syntax:

    strict_avg(num_collection)
    
  • Gets the average value of the numeric items in the given collection.

  • Arguments:
    • num_collection could be:
      • an array or multiset containing numeric values, nulls or missings,
      • or, a null value,
      • or, a missing value.
  • Return Value:
    • a double value representing the average of the numbers in the given collection,
    • null is returned if the input is null or missing,
    • null is returned if there is a null or missing in the input collection,
    • any other non-numeric value in the input collection will cause a type error.
  • Example:

    strict_avg( [100, 200, 300] );
    
  • The expected result is:

    200.0
    

strict_sum

  • Syntax:

    strict_sum(num_collection)
    
  • Gets the sum of the items in the given collection.

  • Arguments:
    • num_collection could be:
      • an array or multiset containing numeric values, nulls or missings,
      • or, a null value,
      • or, a missing value.
  • Return Value:
    • the sum of the numbers in the given collection. The returning type is decided by the item type with the highest order in the numeric type promotion order (tinyint-> smallint->integer->bigint->float->double) among items.
    • null is returned if the input is null or missing,
    • null is returned if there is a null or missing in the input collection,
    • any other non-numeric value in the input collection will cause a type error.
  • Example:

    strict_sum( [100, 200, 300] );
    
  • The expected result is:

    600
    

strict_min

  • Syntax:

    strict_min(num_collection)
    
  • Gets the min value of comparable items in the given collection.

  • Arguments:
    • num_collection could be:
      • an array or multiset,
      • or, a null value,
      • or, a missing value.
  • Return Value:
    • the min value of the given collection. The returning type is decided by the item type with the highest order in the type promotion order (tinyint-> smallint->integer->bigint->float->double) among numeric items.
    • null is returned if the input is null or missing,
    • null is returned if there is a null or missing in the input collection,
    • multiple incomparable items in the input array or multiset will cause a type error,
    • any other non-array and non-multiset input value will cause a type error.
  • Example:

    strict_min( [10.2, 100, 5] );
    
  • The expected result is:

    5.0
    

strict_max

  • Syntax:

    strict_max(num_collection)
    
  • Gets the max value of numeric items in the given collection.

  • Arguments:
    • num_collection could be:
      • an array or multiset,
      • or, a null value,
      • or, a missing value.
  • Return Value:
    • The max value of the given collection. The returning type is decided by the item type with the highest order in the type promotion order (tinyint-> smallint->integer->bigint->float->double) among numeric items.
    • null is returned if the input is null or missing,
    • null is returned if there is a null or missing in the input collection,
    • multiple incomparable items in the input array or multiset will cause a type error,
    • any other non-array and non-multiset input value will cause a type error.
  • Example:

    strict_max( [10.2, 100, 5] );
    
  • The expected result is:

    100.0
    

strict_stddev_samp

  • Syntax:

    strict_stddev_samp(num_collection)
    
  • Gets the sample standard deviation value of the numeric items in the given collection.

  • Arguments:
    • num_collection could be:
      • an array or multiset containing numeric values, nulls or missings,
      • or, a null value,
      • or, a missing value.
  • Return Value:
    • a double value representing the sample standard deviation of the numbers in the given collection,
    • null is returned if the input is null or missing,
    • null is returned if there is a null or missing in the input collection,
    • any other non-numeric value in the input collection will cause a type error.
  • Example:

    strict_stddev_samp( [100, 200, 300] );
    
  • The expected result is:

    100.0
    

strict_stddev_pop

  • Syntax:

    strict_stddev_pop(num_collection)
    
  • Gets the population standard deviation value of the numeric items in the given collection.

  • Arguments:
    • num_collection could be:
      • an array or multiset containing numeric values, nulls or missings,
      • or, a null value,
      • or, a missing value.
  • Return Value:
    • a double value representing the population standard deviation of the numbers in the given collection,
    • null is returned if the input is null or missing,
    • null is returned if there is a null or missing in the input collection,
    • any other non-numeric value in the input collection will cause a type error.
  • Example:

    strict_stddev_pop( [100, 200, 300] );
    
  • The expected result is:

    81.64965809277261
    

strict_var_samp

  • Syntax:

    strict_var_samp(num_collection)
    
  • Gets the sample variance value of the numeric items in the given collection.

  • Arguments:
    • num_collection could be:
      • an array or multiset containing numeric values, nulls or missings,
      • or, a null value,
      • or, a missing value.
  • Return Value:
    • a double value representing the sample variance of the numbers in the given collection,
    • null is returned if the input is null or missing,
    • null is returned if there is a null or missing in the input collection,
    • any other non-numeric value in the input collection will cause a type error.
  • Example:

    strict_var_samp( [100, 200, 300] );
    
  • The expected result is:

    10000.0
    

strict_var_pop

  • Syntax:

    strict_var_pop(num_collection)
    
  • Gets the population variance value of the numeric items in the given collection.

  • Arguments:
    • num_collection could be:
      • an array or multiset containing numeric values, nulls or missings,
      • or, a null value,
      • or, a missing value.
  • Return Value:
    • a double value representing the population variance of the numbers in the given collection,
    • null is returned if the input is null or missing,
    • null is returned if there is a null or missing in the input collection,
    • any other non-numeric value in the input collection will cause a type error.
  • Example:

    strict_var_pop( [100, 200, 300] );
    
  • The expected result is:

    6666.666666666667
    

strict_skewness

  • Syntax:

    strict_skewness(num_collection)
    
  • Gets the skewness value of the numeric items in the given collection.

  • Arguments:
    • num_collection could be:
      • an array or multiset containing numeric values, nulls or missings,
      • or, a null value,
      • or, a missing value.
  • Return Value:
    • a double value representing the skewness of the numbers in the given collection,
    • null is returned if the input is null or missing,
    • null is returned if there is a null or missing in the input collection,
    • any other non-numeric value in the input collection will cause a type error.
  • Example:

    strict_skewness( [100, 200, 300] );
    
  • The expected result is:

    0.0
    

strict_kurtosis

  • Syntax:

    strict_kurtosis(num_collection)
    
  • Gets the kurtosis value from the normal distribution of the numeric items in the given collection.

  • Arguments:
    • num_collection could be:
      • an array or multiset containing numeric values, nulls or missings,
      • or, a null value,
      • or, a missing value.
  • Return Value:
    • a double value representing the kurtosis from a normal distribution of the numbers in the given collection,
    • null is returned if the input is null or missing,
    • null is returned if there is a null or missing in the input collection,
    • any other non-numeric value in the input collection will cause a type error.
  • Example:

    strict_kurtosis( [100, 200, 300] );
    
  • The expected result is:

    -1.5
    

Comparison Functions

greatest

  • Syntax:

    greatest(numeric_value1, numeric_value2, ...)
    
  • Computes the greatest value among arguments.

  • Arguments:
    • numeric_value1: a tinyint/smallint/integer/bigint/float/double value,
    • numeric_value2: a tinyint/smallint/integer/bigint/float/double value,
    • ….
  • Return Value:
    • the greatest values among arguments. The returning type is decided by the item type with the highest order in the numeric type promotion order (tinyint-> smallint->integer->bigint->float->double) among items.
    • null if any argument is a missing value or null value,
    • any other non-numeric input value will cause a type error.
  • Example:

    { "v1": greatest(1, 2, 3), "v2": greatest(float("0.5"), double("-0.5"), 5000) };
    
  • The expected result is:

    { "v1": 3, "v2": 5000.0 }
    

least

  • Syntax:

    least(numeric_value1, numeric_value2, ...)
    
  • Computes the least value among arguments.

  • Arguments:
    • numeric_value1: a tinyint/smallint/integer/bigint/float/double value,
    • numeric_value2: a tinyint/smallint/integer/bigint/float/double value,
    • ….
  • Return Value:
    • the least values among arguments. The returning type is decided by the item type with the highest order in the numeric type promotion order (tinyint-> smallint->integer->bigint->float->double) among items.
    • null if any argument is a missing value or null value,
    • any other non-numeric input value will cause a type error.
  • Example:

    { "v1": least(1, 2, 3), "v2": least(float("0.5"), double("-0.5"), 5000) };
    
  • The expected result is:

    { "v1": 1, "v2": -0.5 }
    

Type Functions

get_type

  • Syntax:

    get_type(expr)
    
  • Returns a string describing the type of the given expr. This includes incomplete information types (i.e. missing and null).

  • Arguments:
    • expr : an expression (any type is allowed).
  • Example:

    {
      "a": get_type(true),
      "b": get_type(false),
      "c": get_type(null),
      "d": get_type(missing),
      "e": get_type("d"),
      "f": gettype(4.0),
      "g": gettype(5),
      "h": gettype(["1", 2]),
      "i": gettype({"a":1})
    };
    
  • The expected result is:

    { "a": "boolean", "b": "boolean", "c": "null", "d": "missing", "e": "string", "f": "double", "g": "bigint", "h": "array", "i": "object" }
    

The function has an alias gettype.

is_array

  • Syntax:

    is_array(expr)
    
  • Checks whether the given expression is evaluated to be an array value.

  • Arguments:
    • expr : an expression (any type is allowed).
  • Return Value:
    • a boolean on whether the argument is an array value or not,
    • a missing if the argument is a missing value,
    • a null if the argument is a null value.
  • Example:

    {
      "a": is_array(true),
      "b": is_array(false),
      "c": isarray(null),
      "d": isarray(missing),
      "e": isarray("d"),
      "f": isarray(4.0),
      "g": isarray(5),
      "h": isarray(["1", 2]),
      "i": isarray({"a":1})
    };
    
  • The expected result is:

    { "a": false, "b": false, "c": null, "e": false, "f": false, "g": false, "h": true, "i": false }
    

The function has an alias isarray.

is_multiset

  • Syntax:

    is_multiset(expr)
    
  • Checks whether the given expression is evaluated to be an multiset value.

  • Arguments:
    • expr : an expression (any type is allowed).
  • Return Value:
    • a boolean on whether the argument is an multiset value or not,
    • a missing if the argument is a missing value,
    • a null if the argument is a null value.
  • Example:

    {
      "a": is_multiset(true),
      "b": is_multiset(false),
      "c": is_multiset(null),
      "d": is_multiset(missing),
      "e": is_multiset("d"),
      "f": ismultiset(4.0),
      "g": ismultiset(["1", 2]),
      "h": ismultiset({"a":1}),
      "i": ismultiset({{"hello", 9328, "world", [1, 2, null]}})
    };
    
  • The expected result is:

    { "a": false, "b": false, "c": null, "e": false, "f": false, "g": false, "h": false, "i": true }
    

The function has an alias ismultiset.

is_atomic (is_atom)

  • Syntax:

    is_atomic(expr)
    
  • Checks whether the given expression is evaluated to be a value of a primitive type.

  • Arguments:
    • expr : an expression (any type is allowed).
  • Return Value:
    • a boolean on whether the argument is a primitive type or not,
    • a missing if the argument is a missing value,
    • a null if the argument is a null value.
  • Example:

    {
      "a": is_atomic(true),
      "b": is_atomic(false),
      "c": isatomic(null),
      "d": isatomic(missing),
      "e": isatomic("d"),
      "f": isatom(4.0),
      "g": isatom(5),
      "h": isatom(["1", 2]),
      "i": isatom({"a":1})
    };
    
  • The expected result is:

    { "a": true, "b": true, "c": null, "e": true, "f": true, "g": true, "h": false, "i": false }
    

The function has three aliases: isatomic, is_atom, and isatom.

is_boolean (is_bool)

  • Syntax:

    is_boolean(expr)
    
  • Checks whether the given expression is evaluated to be a boolean value.

  • Arguments:
    • expr : an expression (any type is allowed).
  • Return Value:
    • a boolean on whether the argument is a boolean value or not,
    • a missing if the argument is a missing value,
    • a null if the argument is a null value.
  • Example:

    {
      "a": isboolean(true),
      "b": isboolean(false),
      "c": is_boolean(null),
      "d": is_boolean(missing),
      "e": isbool("d"),
      "f": isbool(4.0),
      "g": isbool(5),
      "h": isbool(["1", 2]),
      "i": isbool({"a":1})
    };
    
  • The expected result is:

    { "a": true, "b": true, "c": null, "e": false, "f": false, "g": false, "h": false, "i": false }
    

The function has three aliases: isboolean, is_bool, and isbool.

is_binary (is_bin)

  • Syntax:

    is_binary(expr)
    
  • Checks whether the given expression is evaluated to be a binary value.

  • Arguments:
    • expr : an expression (any type is allowed).
  • Return Value:
    • a boolean on whether the argument is a binary value or not,
    • a missing if the argument is a missing value,
    • a null if the argument is a null value.
  • Example:

    {
      "a": is_binary(true),
      "b": is_binary(false),
      "c": isbinary(null),
      "d": isbinary(missing),
      "e": isbin(point("1,2")),
      "f": isbin(hex("ABCDEF0123456789")),
      "g": is_bin(sub_binary(hex("AABBCCDD"), 4)),
      "h": is_bin(2),
      "i": is_bin({"a":1})
    };
    
  • The expected result is:

    { "a": false, "b": false, "c": null, "e": false, "f": true, "g": true, "h": false, "i": false }
    

The function has three aliases: isbinary, is_bin, and isbin.

is_number (is_num)

  • Syntax:

    is_number(expr)
    
  • Checks whether the given expression is evaluated to be a numeric value.

  • Arguments:
    • expr : an expression (any type is allowed).
  • Return Value:
    • a boolean on whether the argument is a smallint/tinyint/integer/bigint/float/double value or not,
    • a missing if the argument is a missing value,
    • a null if the argument is a null value.
  • Example:

    {
      "a": is_number(true),
      "b": is_number(false),
      "c": isnumber(null),
      "d": isnumber(missing),
      "e": isnumber("d"),
      "f": isnum(4.0),
      "g": isnum(5),
      "h": isnum(["1", 2]),
      "i": isnum({"a":1})
    };
    
  • The expected result is:

    { "a": false, "b": false, "c": null, "e": false, "f": true, "g": true, "h": false, "i": false }
    

The function has three aliases: isnumber, is_num, and isnum.

is_point

  • Syntax:

    is_point(expr)
    
  • Checks whether the given expression is evaluated to be a point value.

  • Arguments:
    • expr : an expression (any type is allowed).
  • Return Value:
    • a boolean on whether the argument is a point value or not,
    • a missing if the argument is a missing value,
    • a null if the argument is a null value.
  • Example:

    {
      "a": is_point(true),
      "b": is_point(false),
      "c": is_point(null),
      "d": is_point(missing),
      "e": is_point(point("1,2")),
      "f": ispoint(line("30.0,70.0 50.0,90.0")),
      "g": ispoint(rectangle("30.0,70.0 50.0,90.0")),
      "h": ispoint(circle("30.0,70.0 5.0")),
      "i": ispoint(polygon("1.0,1.0 2.0,2.0 3.0,3.0 4.0,4.0")),
      "j": ispoint(3)
    };
    
  • The expected result is:

    { "a": false, "b": false, "c": null, "e": true, "f": false, "g": false, "h": false, "i": false, "j": false }
    

The function has an alias ispoint.

is_line

  • Syntax:

    is_line(expr)
    
  • Checks whether the given expression is evaluated to be a line value.

  • Arguments:
    • expr : an expression (any type is allowed).
  • Return Value:
    • a boolean on whether the argument is a line value or not,
    • a missing if the argument is a missing value,
    • a null if the argument is a null value.
  • Example:

    {
      "a": is_line(true),
      "b": is_line(false),
      "c": is_line(null),
      "d": is_line(missing),
      "e": is_line(point("1,2")),
      "f": isline(line("30.0,70.0 50.0,90.0")),
      "g": isline(rectangle("30.0,70.0 50.0,90.0")),
      "h": isline(circle("30.0,70.0 5.0")),
      "i": isline(polygon("1.0,1.0 2.0,2.0 3.0,3.0 4.0,4.0")),
      "j": isline(3)
    };
    
  • The expected result is:

    { "a": false, "b": false, "c": null, "e": false, "f": true, "g": false, "h": false, "i": false, "j": false }
    

The function has an alias isline.

is_rectangle

  • Syntax:

    is_rectangle(expr)
    
  • Checks whether the given expression is evaluated to be a rectangle value.

  • Arguments:
    • expr : an expression (any type is allowed).
  • Return Value:
    • a boolean on whether the argument is a rectangle value or not,
    • a missing if the argument is a missing value,
    • a null if the argument is a null value.
  • Example:

    {
      "a": is_rectangle(true),
      "b": is_rectangle(false),
      "c": is_rectangle(null),
      "d": is_rectangle(missing),
      "e": is_rectangle(point("1,2")),
      "f": isrectangle(line("30.0,70.0 50.0,90.0")),
      "g": isrectangle(rectangle("30.0,70.0 50.0,90.0")),
      "h": isrectangle(circle("30.0,70.0 5.0")),
      "i": isrectangle(polygon("1.0,1.0 2.0,2.0 3.0,3.0 4.0,4.0")),
      "j": isrectangle(3)
    };
    
  • The expected result is:

    { "a": false, "b": false, "c": null, "e": false, "f": false, "g": true, "h": false, "i": false, "j": false }
    

The function has an alias isrectangle.

is_circle

  • Syntax:

    is_circle(expr)
    
  • Checks whether the given expression is evaluated to be a circle value.

  • Arguments:
    • expr : an expression (any type is allowed).
  • Return Value:
    • a boolean on whether the argument is a circle value or not,
    • a missing if the argument is a missing value,
    • a null if the argument is a null value.
  • Example:

    {
      "a": is_circle(true),
      "b": is_circle(false),
      "c": is_circle(null),
      "d": is_circle(missing),
      "e": is_circle(point("1,2")),
      "f": iscircle(line("30.0,70.0 50.0,90.0")),
      "g": iscircle(rectangle("30.0,70.0 50.0,90.0")),
      "h": iscircle(circle("30.0,70.0 5.0")),
      "i": iscircle(polygon("1.0,1.0 2.0,2.0 3.0,3.0 4.0,4.0")),
      "j": iscircle(3)
    };
    
  • The expected result is:

    { "a": false, "b": false, "c": null, "e": false, "f": false, "g": false, "h": true, "i": false, "j": false }
    

The function has an alias iscircle.

is_polygon

  • Syntax:

    is_polygon(expr)
    
  • Checks whether the given expression is evaluated to be a polygon value.

  • Arguments:
    • expr : an expression (any type is allowed).
  • Return Value:
    • a boolean on whether the argument is a polygon value or not,
    • a missing if the argument is a missing value,
    • a null if the argument is a null value.
  • Example:

    {
      "a": is_polygon(true),
      "b": is_polygon(false),
      "c": is_polygon(null),
      "d": is_polygon(missing),
      "e": is_polygon(point("1,2")),
      "f": ispolygon(line("30.0,70.0 50.0,90.0")),
      "g": ispolygon(rectangle("30.0,70.0 50.0,90.0")),
      "h": ispolygon(circle("30.0,70.0 5.0")),
      "i": ispolygon(polygon("1.0,1.0 2.0,2.0 3.0,3.0 4.0,4.0")),
      "j": ispolygon(3)
    };
    
  • The expected result is:

    { "a": false, "b": false, "c": null, "e": false, "f": false, "g": false, "h": false, "i": true, "j": false }
    

The function has an alias ispolygon.

is_spatial

  • Syntax:

    is_spatial(expr)
    
  • Checks whether the given expression is evaluated to be a spatial value.

  • Arguments:
    • expr : an expression (any type is allowed).
  • Return Value:
    • a boolean on whether the argument is a point/line/rectangle/circle/polygon value or not,
    • a missing if the argument is a missing value,
    • a null if the argument is a null value.
  • Example:

    {
      "a": is_spatial(true),
      "b": is_spatial(false),
      "c": is_spatial(null),
      "d": is_spatial(missing),
      "e": is_spatial(point("1,2")),
      "f": isspatial(line("30.0,70.0 50.0,90.0")),
      "g": isspatial(rectangle("30.0,70.0 50.0,90.0")),
      "h": isspatial(circle("30.0,70.0 5.0")),
      "i": isspatial(polygon("1.0,1.0 2.0,2.0 3.0,3.0 4.0,4.0")),
      "j": isspatial(3)
    };
    
  • The expected result is:

    { "a": false, "b": false, "c": null, "e": true, "f": true, "g": true, "h": true, "i": true, "j": false }
    

The function has an alias isspatial.

is_date

  • Syntax:

    is_date(expr)
    
  • Checks whether the given expression is evaluated to be a date value.

  • Arguments:
    • expr : an expression (any type is allowed).
  • Return Value:
    • a boolean on whether the argument is a date value or not,
    • a missing if the argument is a missing value,
    • a null if the argument is a null value.
  • Example:

    {
      "a": is_date(true),
      "b": is_date(false),
      "c": is_date(null),
      "d": is_date(missing),
      "e": is_date(date("-19700101")),
      "f": isdate(date("2013-01-01")),
      "g": isdate(time("12:12:12.039Z")),
      "h": isdate(datetime("2013-01-01T12:12:12.039Z")),
      "i": isdate(duration("P100Y12MT12M")),
      "j": isdate(interval(date("2013-01-01"), date("20130505"))),
      "k": isdate(3)
    };
    
  • The expected result is:

    { "a": false, "b": false, "c": null, "e": true, "f": true, "g": false, "h": false, "i": false, "j": false, "k": false }
    

The function has an alias isdate.

is_datetime (is_timestamp)

  • Syntax:

    is_datetime(expr)
    
  • Checks whether the given expression is evaluated to be a datetime value.

  • Arguments:
    • expr : an expression (any type is allowed).
  • Return Value:
    • a boolean on whether the argument is a datetime value or not,
    • a missing if the argument is a missing value,
    • a null if the argument is a null value.
  • Example:

    {
      "a": is_datetime(true),
      "b": is_datetime(false),
      "c": is_datetime(null),
      "d": is_datetime(missing),
      "e": is_datetime(datetime("2016-02-02T12:09:22.023Z")),
      "f": isdatetime(datetime("2011-03-03T12:10:42.011Z")),
      "g": isdatetime(time("12:12:12.039Z")),
      "h": is_timestamp(datetime("2013-01-01T12:12:12.039Z")),
      "i": is_timestamp(duration("P100Y12MT12M")),
      "j": istimestamp(interval(date("2013-01-01"), date("20130505"))),
      "k": istimestamp(3)
    };
    
  • The expected result is:

    { "a": false, "b": false, "c": null, "e": true, "f": true, "g": false, "h": true, "i": false, "j": false, "k": false }
    

The function has three aliases: isdatetime, is_timestamp, and istimestamp.

is_time

  • Syntax:

    is_time(expr)
    
  • Checks whether the given expression is evaluated to be a time value.

  • Arguments:
    • expr : an expression (any type is allowed).
  • Return Value:
    • a boolean on whether the argument is a time value or not,
    • a missing if the argument is a missing value,
    • a null if the argument is a null value.
  • Example:

     {
      "a": is_time(true),
      "b": is_time(false),
      "c": is_time(null),
      "d": is_time(missing),
      "e": is_time(time("08:00:00.000Z")),
      "f": istime(date("2013-01-01")),
      "g": istime(time("12:12:12.039Z")),
      "h": istime(datetime("2013-01-01T12:12:12.039Z")),
      "i": istime(duration("P100Y12MT12M")),
      "j": istime(interval(date("2013-01-01"), date("20130505"))),
      "k": istime(3)
    };
    
  • The expected result is:

    { "a": false, "b": false, "c": null, "e": true, "f": false, "g": true, "h": false, "i": false, "j": false, "k": false }
    

The function has an alias istime.

is_duration

  • Syntax:

    is_duration(expr)
    
  • Checks whether the given expression is evaluated to be a duration value.

  • Arguments:
    • expr : an expression (any type is allowed).
  • Return Value:
    • a boolean on whether the argument is a duration/year_month_duration/day_time_duration value or not,
    • a missing if the argument is a missing value,
    • a null if the argument is a null value.
  • Example:

     {
      "a": is_duration(true),
      "b": is_duration(false),
      "c": is_duration(null),
      "d": is_duration(missing),
      "e": is_duration(duration("-PT20.943S")),
      "f": isduration(date("2013-01-01")),
      "g": isduration(time("12:12:12.039Z")),
      "h": isduration(datetime("2013-01-01T12:12:12.039Z")),
      "i": isduration(duration("P100Y12MT12M")),
      "j": isduration(interval(date("2013-01-01"), date("20130505"))),
      "k": isduration(3)
    };
    
  • The expected result is:

    { "a": false, "b": false, "c": null, "e": true, "f": false, "g": false, "h": false, "i": true, "j": false, "k": false }
    

The function has an alias isduration.

is_interval

  • Syntax:

    is_interval(expr)
    
  • Checks whether the given expression is evaluated to be a interval value.

  • Arguments:
    • expr : an expression (any type is allowed).
  • Return Value:
    • a boolean on whether the argument is a interval value or not,
    • a missing if the argument is a missing value,
    • a null if the argument is a null value.
  • Example:

     {
      "a": is_interval(true),
      "b": is_interval(false),
      "c": is_interval(null),
      "d": is_interval(missing),
      "e": is_interval(interval(datetime("2013-01-01T00:01:01.000Z"), datetime("2013-05-05T13:39:01.049Z"))),
      "f": isinterval(date("2013-01-01")),
      "g": isinterval(time("12:12:12.039Z")),
      "h": isinterval(datetime("2013-01-01T12:12:12.039Z")),
      "i": isinterval(duration("P100Y12MT12M")),
      "j": isinterval(interval(date("2013-01-01"), date("20130505"))),
      "k": isinterval(3)
    };
    
  • The expected result is:

    { "a": false, "b": false, "c": null, "e": true, "f": false, "g": false, "h": false, "i": false, "j": true, "k": false }
    

The function has an alias isinterval.

is_temporal

  • Syntax:

    is_temporal(expr)
    
  • Checks whether the given expression is evaluated to be a temporal value.

  • Arguments:
    • expr : an expression (any type is allowed).
  • Return Value:
    • a boolean on whether the argument is a date/datetime/time/duration/year_month_duration/day_time_duration/interval value or not,
    • a missing if the argument is a missing value,
    • a null if the argument is a null value.
  • Example:

     {
      "a": is_temporal(true),
      "b": is_temporal(false),
      "c": is_temporal(null),
      "d": is_temporal(missing),
      "e": is_temporal(duration("-PT20.943S")),
      "f": istemporal(date("2013-01-01")),
      "g": istemporal(time("12:12:12.039Z")),
      "h": istemporal(datetime("2013-01-01T12:12:12.039Z")),
      "i": istemporal(duration("P100Y12MT12M")),
      "j": istemporal(interval(date("2013-01-01"), date("20130505"))),
      "k": istemporal(3)
    };
    
  • The expected result is:

    { "a": false, "b": false, "c": null, "e": true, "f": true, "g": true, "h": true, "i": true, "j": true, "k": false }
    

The function has an alias istemporal.

is_object (is_obj)

  • Syntax:

    is_object(expr)
    
  • Checks whether the given expression is evaluated to be a object value.

  • Arguments:
    • expr : an expression (any type is allowed).
  • Return Value:
    • a boolean on whether the argument is a object value or not,
    • a missing if the argument is a missing value,
    • a null if the argument is a null value.
  • Example:

    {
      "a": is_object(true),
      "b": is_object(false),
      "c": isobject(null),
      "d": isobject(missing),
      "e": isobj("d"),
      "f": isobj(4.0),
      "g": isobj(5),
      "h": isobj(["1", 2]),
      "i": isobj({"a":1})
    };
    
  • The expected result is:

    { “a”: false, “b”: false, “c”: null, “e”: false, “f”: false, “g”: false, “h”: false, “i”: true }

The function has three aliases: isobject, is_obj, and isobj.

is_string (is_str)

  • Syntax:

    is_string(expr)
    
  • Checks whether the given expression is evaluated to be a string value.

  • Arguments:
    • expr : an expression (any type is allowed).
  • Return Value:
    • a boolean on whether the argument is a string value or not,
    • a missing if the argument is a missing value,
    • a null if the argument is a null value.
  • Example:

    {
      "a": is_string(true),
      "b": isstring(false),
      "c": isstring(null),
      "d": isstr(missing),
      "e": isstr("d"),
      "f": isstr(4.0),
      "g": isstr(5),
      "h": isstr(["1", 2]),
      "i": isstr({"a":1})
    };
    
  • The expected result is:

    { "a": false, "b": false, "c": null, "e": true, "f": false, "g": false, "h": false, "i": false }
    

The function has three aliases: isstring, is_str, and isstr.

is_uuid

  • Syntax:

    is_uuid(expr)
    
  • Checks whether the given expression is evaluated to be a uuid value.

  • Arguments:
    • expr : an expression (any type is allowed).
  • Return Value:
    • a boolean on whether the argument is a uuid value or not,
    • a missing if the argument is a missing value,
    • a null if the argument is a null value.
  • Example:

     {
      "a": is_uuid(true),
      "b": is_uuid(false),
      "c": is_uuid(null),
      "d": is_uuid(missing),
      "e": isuuid(4.0),
      "f": isuuid(date("2013-01-01")),
      "g": isuuid(uuid("5c848e5c-6b6a-498f-8452-8847a2957421"))
    };
    
  • The expected result is:

    { "a": false, "b": false, "c": null, "e": false, "f": false, "g": true }
    

The function has an alias isuuid.

is_null

  • Syntax:

    is_null(expr)
    
  • Checks whether the given expression is evaluated to be a null value.

  • Arguments:
    • expr : an expression (any type is allowed).
  • Return Value:
    • a boolean on whether the variable is a null or not,
    • a missing if the input is missing.
  • Example:

    { "v1": is_null(null), "v2": is_null(1), "v3": is_null(missing) };
    
  • The expected result is:

    { "v1": true, "v2": false }
    

The function has an alias isnull.

is_missing

  • Syntax:

    is_missing(expr)
    
  • Checks whether the given expression is evaluated to be a missing value.

  • Arguments:
    • expr : an expression (any type is allowed).
  • Return Value:
    • a boolean on whether the variable is a missing or not.
  • Example:

    { "v1": is_missing(null), "v2": is_missing(1), "v3": is_missing(missing) };
    
  • The expected result is:

    { "v1": false, "v2": false, "v3": true }
    

The function has an alias ismissing.

is_unknown

  • Syntax:

    is_unknown(expr)
    
  • Checks whether the given variable is a null value or a missing value.

  • Arguments:
    • expr : an expression (any type is allowed).
  • Return Value:
    • a boolean on whether the variable is a null/``missingvalue (true) or not (false`).
  • Example:

    { "v1": is_unknown(null), "v2": is_unknown(1), "v3": is_unknown(missing) };
    
  • The expected result is:

    { "v1": true, "v2": false, "v3": true }
    

The function has an alias isunknown.

to_array

  • Syntax:

    to_array(expr)
    
  • Converts input value to an array value

  • Arguments:
    • expr : an expression
  • Return Value:
    • if the argument is missing then missing is returned
    • if the argument is null then null is returned
    • if the argument is of array type then it is returned as is
    • if the argument is of multiset type then it is returned as an array with elements in an undefined order
    • otherwise an array containing the input expression as its single item is returned
  • Example:

    {
      "v1": to_array("asterix"),
      "v2": to_array(["asterix"]),
    };
    
  • The expected result is:

    { "v1": ["asterix"], "v2": ["asterix"] }
    

The function has an alias toarray.

to_atomic (to_atom)

  • Syntax:

    to_atomic(expr)
    
  • Converts input value to a primitive value

  • Arguments:
    • expr : an expression
  • Return Value:
    • if the argument is missing then missing is returned
    • if the argument is null then null is returned
    • if the argument is of primitive type then it is returned as is
    • if the argument is of array or multiset type and has only one element then the result of invoking to_atomic() on that element is returned
    • if the argument is of object type and has only one field then the result of invoking to_atomic() on the value of that field is returned
    • otherwise null is returned
  • Example:

    {
      "v1": to_atomic("asterix"),
      "v2": to_atomic(["asterix"]),
      "v3": to_atomic([0, 1]),
      "v4": to_atomic({"value": "asterix"}),
      "v5": to_number({"x": 1, "y": 2})
    };
    
  • The expected result is:

    { "v1": "asterix", "v2": "asterix", "v3": null, "v4": "asterix", "v5": null }
    

The function has three aliases: toatomic, to_atom, and toatom.

to_boolean (to_bool)

  • Syntax:

    to_boolean(expr)
    
  • Converts input value to a boolean value

  • Arguments:
    • expr : an expression
  • Return Value:
    • if the argument is missing then missing is returned
    • if the argument is null then null is returned
    • if the argument is of boolean type then it is returned as is
    • if the argument is of numeric type then false is returned if it is 0 or NaN, otherwise true
    • if the argument is of string type then false is returned if it’s empty, otherwise true
    • if the argument is of array or multiset type then false is returned if it’s size is 0, otherwise true
    • if the argument is of object type then false is returned if it has no fields, otherwise true
    • type error is raised for all other input types
  • Example:

    {
      "v1": to_boolean(0),
      "v2": to_boolean(1),
      "v3": to_boolean(""),
      "v4": to_boolean("asterix")
    };
    
  • The expected result is:

    { "v1": false, "v2": true, "v3": false, "v4": true }
    

The function has three aliases: toboolean, to_bool, and tobool.

to_bigint

  • Syntax:

    to_bigint(expr)
    
  • Converts input value to an integer value

  • Arguments:
    • expr : an expression
  • Return Value:
    • if the argument is missing then missing is returned
    • if the argument is null then null is returned
    • if the argument is of boolean type then 1 is returned if it is true, 0 if it is false
    • if the argument is of numeric integer type then it is returned as the same value of bigint type
    • if the argument is of numeric float/double type then it is converted to bigint type
    • if the argument is of string type and can be parsed as integer then that integer value is returned, otherwise null is returned
    • if the argument is of array/multiset/object type then null is returned
    • type error is raised for all other input types
  • Example:

    {
      "v1": to_bigint(false),
      "v2": to_bigint(true),
      "v3": to_bigint(10),
      "v4": to_bigint(float("1e100")),
      "v5": to_bigint(double("1e1000")),
      "v6": to_bigint("20")
    };
    
  • The expected result is:

    { "v1": 0, "v2": 1, "v3": 10, "v4": 9223372036854775807, "v5": 9223372036854775807, "v6": 20 }
    

The function has an alias tobigint.

to_double

  • Syntax:

    to_double(expr)
    
  • Converts input value to a double value

  • Arguments:
    • expr : an expression
  • Return Value:
    • if the argument is missing then missing is returned
    • if the argument is null then null is returned
    • if the argument is of boolean type then 1.0 is returned if it is true, 0.0 if it is false
    • if the argument is of numeric type then it is returned as the value of double type
    • if the argument is of string type and can be parsed as double then that double value is returned, otherwise null is returned
    • if the argument is of array/multiset/object type then null is returned
    • type error is raised for all other input types
  • Example:

    {
      "v1": to_double(false),
      "v2": to_double(true),
      "v3": to_double(10),
      "v4": to_double(11.5),
      "v5": to_double("12.5")
    };
    
  • The expected result is:

    { "v1": 0.0, "v2": 1.0, "v3": 10.0, "v4": 11.5, "v5": 12.5 }
    

The function has an alias todouble.

to_number (to_num)

  • Syntax:

    to_number(expr)
    
  • Converts input value to a numeric value

  • Arguments:
    • expr : an expression
  • Return Value:
    • if the argument is missing then missing is returned
    • if the argument is null then null is returned
    • if the argument is of numeric type then it is returned as is
    • if the argument is of boolean type then 1 is returned if it is true, 0 if it is false
    • if the argument is of string type and can be parsed as bigint then that bigint value is returned, otherwise if it can be parsed as double then that double value is returned, otherwise null is returned
    • if the argument is of array/multiset/object type then null is returned
    • type error is raised for all other input types
  • Example:

    {
      "v1": to_number(false),
      "v2": to_number(true),
      "v3": to_number(10),
      "v4": to_number(11.5),
      "v5": to_number("12.5")
    };
    
  • The expected result is:

    { "v1": 0, "v2": 1, "v3": 10, "v4": 11.5, "v5": 12.5 }
    

The function has three aliases: tonumber, to_num, and tonum.

to_object (to_obj)

  • Syntax:

    to_object(expr)
    
  • Converts input value to an object value

  • Arguments:
    • expr : an expression
  • Return Value:
    • if the argument is missing then missing is returned
    • if the argument is null then null is returned
    • if the argument is of object type then it is returned as is
    • otherwise an empty object is returned
  • Example:

    {
      "v1": to_object({"value": "asterix"}),
      "v2": to_object("asterix")
    };
    
  • The expected result is:

    { "v1": {"value": "asterix"}, "v2": {} }
    

The function has three aliases: toobject, to_obj, and toobj.

to_string (to_str)

  • Syntax:

    to_string(expr)
    
  • Converts input value to a string value

  • Arguments:
    • expr : an expression
  • Return Value:
    • if the argument is missing then missing is returned
    • if the argument is null then null is returned
    • if the argument is of boolean type then "true" is returned if it is true, "false" if it is false
    • if the argument is of numeric type then its string representation is returned
    • if the argument is of string type then it is returned as is
    • if the argument is of array/multiset/object type then null is returned
    • type error is raised for all other input types
  • Example:

    {
      "v1": to_string(false),
      "v2": to_string(true),
      "v3": to_string(10),
      "v4": to_string(11.5),
      "v5": to_string("asterix")
    };
    
  • The expected result is:

    { "v1": "false", "v2": "true", "v3": "10", "v4": "11.5", "v5": "asterix" }
    

The function has three aliases: tostring, to_str, and tostr.

Conditional Functions

if_null (ifnull)

  • Syntax:

    if_null(expression1, expression2, ... expressionN)
    
  • Finds first argument which value is not null and returns that value

  • Arguments:
    • expressionI : an expression (any type is allowed).
  • Return Value:
    • a null if all arguments evaluate to null or no arguments specified
    • a value of the first non-null argument otherwise
  • Example:

    {
        "a": if_null(),
        "b": if_null(null),
        "c": if_null(null, "asterixdb"),
        "d": is_missing(if_null(missing))
    };
    
  • The expected result is:

    { "a": null, "b": null, "c": "asterixdb", "d": true }
    

The function has an alias ifnull.

if_missing (ifmissing)

  • Syntax:

    if_missing(expression1, expression2, ... expressionN)
    
  • Finds first argument which value is not missing and returns that value

  • Arguments:
    • expressionI : an expression (any type is allowed).
  • Return Value:
    • a null if all arguments evaluate to missing or no arguments specified
    • a value of the first non-missing argument otherwise
  • Example:

    {
        "a": if_missing(),
        "b": if_missing(missing),
        "c": if_missing(missing, "asterixdb"),
        "d": if_missing(null, "asterixdb")
    };
    
  • The expected result is:

    { "a": null, "b": null, "c": "asterixdb", "d": null }
    

The function has an alias ifmissing.

if_missing_or_null (ifmissingornull, coalesce)

  • Syntax:

    if_missing_or_null(expression1, expression2, ... expressionN)
    
  • Finds first argument which value is not null or missing and returns that value

  • Arguments:
    • expressionI : an expression (any type is allowed).
  • Return Value:
    • a null if all arguments evaluate to either null or missing, or no arguments specified
    • a value of the first non-null, non-missing argument otherwise
  • Example:

    {
        "a": if_missing_or_null(),
        "b": if_missing_or_null(null, missing),
        "c": if_missing_or_null(null, missing, "asterixdb")
    };
    
  • The expected result is:

    { "a": null, "b": null, "c": "asterixdb" }
    

The function has two aliases: ifmissingornull and coalesce.

if_inf (ifinf)

  • Syntax:

    if_inf(expression1, expression2, ... expressionN)
    
  • Finds first argument which is a non-infinite (INF or-INF) number

  • Arguments:
    • expressionI : an expression (any type is allowed).
  • Return Value:
    • a missing if missing argument was encountered before the first non-infinite number argument
    • a null if null argument or any other non-number argument was encountered before the first non-infinite number argument
    • the first non-infinite number argument otherwise
  • Example:

    {
        "a": is_null(if_inf(null)),
        "b": is_missing(if_inf(missing)),
        "c": is_null(if_inf(double("INF"))),
        "d": if_inf(1, null, missing) ],
        "e": is_null(if_inf(null, missing, 1)) ],
        "f": is_missing(if_inf(missing, null, 1)) ],
        "g": if_inf(float("INF"), 1) ],
        "h": to_string(if_inf(float("INF"), double("NaN"), 1)) ]
    };
    
  • The expected result is:

    { "a": true, "b": true, "c": true, "d": 1, "e": true, "f": true, "g": 1, "h": "NaN" }
    

The function has an alias ifinf.

if_nan (ifnan)

  • Syntax:

    if_nan(expression1, expression2, ... expressionN)
    
  • Finds first argument which is a non-NaN number

  • Arguments:
    • expressionI : an expression (any type is allowed).
  • Return Value:
    • a missing if missing argument was encountered before the first non-NaN number argument
    • a null if null argument or any other non-number argument was encountered before the first non-NaN number argument
    • the first non-NaN number argument otherwise
  • Example:

    {
        "a": is_null(if_nan(null)),
        "b": is_missing(if_nan(missing)),
        "c": is_null(if_nan(double("NaN"))),
        "d": if_nan(1, null, missing) ],
        "e": is_null(if_nan(null, missing, 1)) ],
        "f": is_missing(if_nan(missing, null, 1)) ],
        "g": if_nan(float("NaN"), 1) ],
        "h": to_string(if_nan(float("NaN"), double("INF"), 1)) ]
    };
    
  • The expected result is:

    { "a": true, "b": true, "c": true, "d": 1, "e": true, "f": true, "g": 1, "h": "INF" }
    

The function has an alias ifnan.

if_nan_or_inf (ifnanorinf)

  • Syntax:

    if_nan_or_inf(expression1, expression2, ... expressionN)
    
  • Finds first argument which is a non-infinite (INF or-INF) and non-NaN number

  • Arguments:
    • expressionI : an expression (any type is allowed).
  • Return Value:
    • a missing if missing argument was encountered before the first non-infinite and non-NaN number argument
    • a null if null argument or any other non-number argument was encountered before the first non-infinite and non-NaN number argument
    • the first non-infinite and non-NaN number argument otherwise
  • Example:

    {
        "a": is_null(if_nan_or_inf(null)),
        "b": is_missing(if_nan_or_inf(missing)),
        "c": is_null(if_nan_or_inf(double("NaN"), double("INF"))),
        "d": if_nan_or_inf(1, null, missing) ],
        "e": is_null(if_nan_or_inf(null, missing, 1)) ],
        "f": is_missing(if_nan_or_inf(missing, null, 1)) ],
        "g": if_nan_or_inf(float("NaN"), float("INF"), 1) ],
    };
    
  • The expected result is:

    { "a": true, "b": true, "c": true, "d": 1, "e": true, "f": true, "g": 1 }
    

The function has an alias ifnanorinf.

null_if (nullif)

  • Syntax:

    null_if(expression1, expression2)
    
  • Compares two arguments and returns null if they are equal, otherwise returns the first argument.

  • Arguments:
    • expressionI : an expression (any type is allowed).
  • Return Value:
    • missing if any argument is a missing value,
    • null if
      • any argument is a null value but no argument is a missing value, or
      • argument1 = argument2
    • a value of the first argument otherwise
  • Example:

    {
        "a": null_if("asterixdb", "asterixdb"),
        "b": null_if(1, 2)
    };
    
  • The expected result is:

    { "a": null, "b": 1 }
    

The function has an alias nullif.

missing_if (missingif)

  • Syntax:

    missing_if(expression1, expression2)
    
  • Compares two arguments and returns missing if they are equal, otherwise returns the first argument.

  • Arguments:
    • expressionI : an expression (any type is allowed).
  • Return Value:
    • missing if
      • any argument is a missing value, or
      • no argument is a null value and argument1 = argument2
    • null if any argument is a null value but no argument is a missing value
    • a value of the first argument otherwise
  • Example:

    {
        "a": missing_if("asterixdb", "asterixdb")
        "b": missing_if(1, 2),
    };
    
  • The expected result is:

    { "b": 1 }
    

The function has an alias missingif.

nan_if (nanif)

  • Syntax:

    nan_if(expression1, expression2)
    
  • Compares two arguments and returns NaN value if they are equal, otherwise returns the first argument.

  • Arguments:
    • expressionI : an expression (any type is allowed).
  • Return Value:
    • missing if any argument is a missing value,
    • null if any argument is a null value but no argument is a missing value
    • NaN value of type double if argument1 = argument2
    • a value of the first argument otherwise
  • Example:

    {
        "a": to_string(nan_if("asterixdb", "asterixdb")),
        "b": nan_if(1, 2)
    };
    
  • The expected result is:

    { "a": "NaN", "b": 1 }
    

The function has an alias nanif.

posinf_if (posinfif)

  • Syntax:

    posinf_if(expression1, expression2)
    
  • Compares two arguments and returns +INF value if they are equal, otherwise returns the first argument.

  • Arguments:
    • expressionI : an expression (any type is allowed).
  • Return Value:
    • missing if any argument is a missing value,
    • null if any argument is a null value but no argument is a missing value
    • +INF value of type double if argument1 = argument2
    • a value of the first argument otherwise
  • Example:

    {
        "a": to_string(posinf_if("asterixdb", "asterixdb")),
        "b": posinf_if(1, 2)
    };
    
  • The expected result is:

    { "a": "+INF", "b": 1 }
    

The function has an alias posinfif.

neginf_if (neginfif)

  • Syntax:

    neginf_if(expression1, expression2)
    
  • Compares two arguments and returns -INF value if they are equal, otherwise returns the first argument.

  • Arguments:
    • expressionI : an expression (any type is allowed).
  • Return Value:
    • missing if any argument is a missing value,
    • null if any argument is a null value but no argument is a missing value
    • -INF value of type double if argument1 = argument2
    • a value of the first argument otherwise
  • Example:

    {
        "a": to_string(neginf_if("asterixdb", "asterixdb")),
        "b": neginf_if(1, 2)
    };
    
  • The expected result is:

    { "a": "-INF", "b": 1 }
    

The function has an alias neginfif.

Miscellaneous Functions

uuid

  • Syntax:

    uuid()
    
  • Generates a uuid.

  • Arguments:
    • none
  • Return Value:
    • a generated, random uuid.

len

  • Syntax:

    len(array)

  • Returns the length of the array array.

  • Arguments:
    • array : an array, multiset, null, or missing, represents the collection that needs to be checked.
  • Return Value:
    • an integer that represents the length of input array or the size of the input multiset,
    • missing if any argument is a missing value,
    • null if any argument is a null value but no argument is a missing value.
  • Example:

    len(["Hello", "World"])
    
  • The expected result is:

    2
    

not

  • Syntax:

    not(expr)
    
  • Inverts a boolean value

  • Arguments:
    • expr : an expression
  • Return Value:
    • a boolean, the inverse of expr,
    • missing if any argument is a missing value,
    • null if any argument is a null value but no argument is a missing value,
    • other non-boolean argument value will cause a type error.
  • Example:
    { "v1": `not`(true), "v2": `not`(false), "v3": `not`(null), "v4": `not`(missing) };
    
  • The expected result is:

    { "v1": false, "v2": true, "v3": null }
    

random

  • Syntax:

    random( [seed_value] )
    
  • Returns a random number, accepting an optional seed value

  • Arguments:
    • seed_value: an optional tinyint/smallint/integer/bigint/float/double value representing the seed number.
  • Return Value:
    • A random number of type double between 0 and 1,
    • missing if the argument is a missing value,
    • null if the argument is a null value or a non-numeric value.
  • Example:

    {
      "v1": random(),
      "v2": random(unix_time_from_datetime_in_ms(current_datetime()))
    };
    

range

  • Syntax:

    range(start_numeric_value, end_numeric_value)
    
  • Generates a series of bigint values based start the start_numeric_value until the end_numeric_value.

  • Arguments:
  • start_numeric_value: a tinyint/smallint/integer/bigint value representing the start value.
  • end_numeric_value: a tinyint/smallint/integer/bigint value representing the max final value.
  • Return Value:
    • an array that starts with the integer value of start_numeric_value and ends with the integer value of end_numeric_value, where the value of each entry in the array is the integer successor of the value in the preceding entry.
  • Example:
    range(0, 3);
    
  • The expected result is:

    [ 0, 1, 2, 3 ]
    

switch_case

  • Syntax:

    switch_case(
        condition,
        case1, case1_result,
        case2, case2_result,
        ...,
        default, default_result
    )
    
  • Switches amongst a sequence of cases and returns the result of the first matching case. If no match is found, the result of the default case is returned.

  • Arguments:
    • condition: a variable (any type is allowed).
    • caseI/default: a variable (any type is allowed).
    • caseI/default_result: a variable (any type is allowed).
  • Return Value:
    • caseI_result if condition matches caseI, otherwise default_result.
  • Example 1:
    switch_case(
        "a",
        "a", 0,
        "x", 1,
        "y", 2,
        "z", 3
    );
    
  • The expected result is:

    0
    
  • Example 2:

    switch_case(
        "a",
        "x", 1,
        "y", 2,
        "z", 3
    );
    
  • The expected result is:

    3
    

deep_equal

  • Syntax:

    deep_equal(expr1, expr2)
    
  • Assess the equality between two expressions of any type (e.g., object, arrays, or multiset). Two objects are deeply equal iff both their types and values are equal.

  • Arguments:
    • expr1 : an expression,
    • expr2 : an expression.
  • Return Value:
    • true or false depending on the data equality,
    • missing if any argument is a missing value,
    • null if any argument is a null value but no argument is a missing value.
  • Example:

    deep_equal(
               {
                 "id":1,
                 "project":"AsterixDB",
                 "address":{"city":"Irvine", "state":"CA"},
                 "related":["Hivestrix", "Preglix", "Apache VXQuery"]
               },
               {
                 "id":1,
                 "project":"AsterixDB",
                 "address":{"city":"San Diego", "state":"CA"},
                 "related":["Hivestrix", "Preglix", "Apache VXQuery"]
               }
    );
    
  • The expected result is:

    false
    

Bitwise Functions

All Bit/Binary functions can only operate on 64-bit signed integers.

Note: All non-integer numbers and other data types result in null.

Note: The query language uses two’s complement representation.

When looking at the value in binary form, bit 1 is the Least Significant Bit (LSB) and bit 32 is the Most Significant Bit (MSB).

(MSB) Bit 32 → 0000 0000 0000 0000 0000 0000 0000 0000 ← Bit 1 (LSB)

bitand

  • Syntax:

    BITAND(int_value1, int_value2, ... , int_valueN)
    
  • Returns the result of a bitwise AND operation performed on all input integer values.

    The bitwise AND operation compares each bit of int_value1 to the corresponding bit of every other int_value. If all bits are 1, then the corresponding result bit is set to 1; otherwise it is set to 0 (zero).

  • Arguments:

    • int_valueI: Integers, or any valid expressions which evaluate to integers, that are used to compare.
  • Return Value:

    • An integer, representing the bitwise AND between all of the input integers.
  • Limitations:

    • Input values must be integers (such as 1 or 1.0) and cannot contain decimals (such as 1.2).
  • Example 1:

    Compare 3 (0011 in binary) and 6 (0110 in binary).

    { "BitAND": BITAND(3,6) };
    
  • The expected result is:

    { "BitAND": 2 }
    

    This results in 2 (0010 in binary) because only bit 2 is set in both 3 (0011) and 6 (0110).

  • Example 2:

    Compare 4.5 and 3 (0011 in binary).

    { "BitAND": BITAND(4.5,3) };
    
  • The expected result is:

    { "BitAND": null }
    

    The result is null because 4.5 is not an integer.

  • Example 3:

    Compare 4.0 (0100 in binary) and 3 (0011 in binary).

    { "BitAND": BITAND(4.0,3) };
    
  • The expected result is:

    { "BitAND": 0 }
    

    This results in 0 (zero) because 4.0 (0100) and 3 (0011) do not share any bits that are both 1.

  • Example 4:

    Compare 3 (0011 in binary) and 6 (0110 in binary) and 15 (1111 in binary).

    { "BitAND": BITAND(3,6,15) };
    
  • The expected result is:

    { "BitAND": 2 }
    

    This results in 2 (0010 in binary) because only the 2nd bit from the right is 1 in all three numbers.

bitclear

  • Syntax:

    BITCLEAR(int_value, positions)
    
  • Returns the result after clearing the specified bit, or array of bits in int_value using the given positions.

    Note: Specifying a negative or zero bit position makes the function return a null.

  • Arguments:

    • int_value: An integer, or any valid expression which evaluates to an integer, that contains the target bit or bits to clear.

    • positions: An integer or an array of integers specifying the position or positions to be cleared.

  • Return Value:

    • An integer, representing the result after clearing the bit or bits specified.
  • Limitations:

    • Input values must be integers (such as 1 or 1.0) and cannot contain decimals (such as 1.2).
  • Example 1:

    Clear bit 1 from 6 (0110 in binary).

    { "BitCLEAR": BITCLEAR(6,1) };
    
  • The expected result is:

    { "BitCLEAR": 6 }
    

    This results in 6 (0110 in binary) because bit 1 was already zero.

  • Example 2:

    Clear bits 1 and 2 from 6 (0110 in binary).

    { "BitCLEAR": BITCLEAR(6,[1,2]) };
    
  • The expected result is:

    { "BitCLEAR": 4 }
    

    This results in 4 (0100 in binary) because bit 2 changed to zero.

  • Example 3:

    Clear bits 1, 2, 4, and 5 from 31 (011111 in binary).

    { "BitCLEAR": BITCLEAR(31,[1,2,4,5]) };
    
  • The expected result is:

    { "BitCLEAR": 4 }
    

    This results in 4 (000100) because bits 1, 2, 4, and 5 changed to zero.

bitnot

  • Syntax:

    BITNOT(int_value)
    
  • Returns the results of a bitwise logical NOT operation performed on an integer value.

    The bitwise logical NOT operation reverses the bits in the value. For each value bit that is 1, the corresponding result bit will be set to 0 (zero); and for each value bit that is 0 (zero), the corresponding result bit will be set to 1.

    Note: All bits of the integer will be altered by this operation.

  • Arguments:

    • int_value: An integer, or any valid expression which evaluates to an integer, that contains the target bits to reverse.
  • Return Value:

    • An integer, representing the result after performing the logical NOT operation.
  • Limitations:

    • Input values must be integers (such as 1 or 1.0) and cannot contain decimals (such as 1.2).
  • Example 1:

    Perform the NOT operation on 3 (0000 0000 0000 0000 0000 0000 0000 0011 in binary).

    { "BitNOT": BITNOT(3) };
    
  • The expected result is:

    { "BitNOT": -4 }
    

    This results in -4 (1111 1111 1111 1111 1111 1111 1111 1100 in binary) because all bits changed.

bitor

  • Syntax:

    BITOR(int_value1, int_value2, ... , int_valueN)
    
  • Returns the result of a bitwise inclusive OR operation performed on all input integer values.

    The bitwise inclusive OR operation compares each bit of int_value1 to the corresponding bit of every other int_value. If any bit is 1, the corresponding result bit is set to 1; otherwise, it is set to 0 (zero).

  • Arguments:

    • int_valueI: Integers, or any valid expressions which evaluate to integers, that are used to compare.
  • Return Value:

    • An integer, representing the bitwise OR between all of the input integers.
  • Limitations:

    • Input values must be integers (such as 1 or 1.0) and cannot contain decimals (such as 1.2).
  • Example 1:

    Perform OR on 3 (0011 in binary) and 6 (0110 in binary).

    { "BitOR": BITOR(3,6) };
    
  • The expected result is:

    { "BitOR": 7 }
    

    This results in 7 (0111 in binary) because at least 1 bit of each (0011 and 0110) is 1 in bits 1, 2, and 3.

  • Example 2:

    Perform OR on 3 (0011 in binary) and -4 (1000 0000 0000 … 0000 1100 in binary).

    { "BitOR": BITOR(3,-4) };
    
  • The expected result is:

    { "BitOR": -1 }
    

    This results in -1 (1111 1111 1111 … 1111 1111 in binary) because the two 1 bits in 3 fill in the two 0 bits in -4 to turn on all the bits.

  • Example 3:

    Perform OR on 3 (0011 in binary) and 6 (0110 in binary) and 15 (1111 in binary).

    { "BitOR": BITOR(3,6,15) };
    
  • The expected result is:

    { "BitOR": 15 }
    

    This results in 15 (1111 in binary) because there is at least one 1 in each of the four rightmost bits.

bitset

  • Syntax:

    BITSET(int_value, positions)
    
  • Returns the result after setting the specified bit position, or array of bit positions, to 1 in the given int_value.

    Note: Specifying a negative or zero position makes the function return a null.

  • Arguments:

    • int_value: An integer, or any valid expression which evaluates to an integer, that contains the target bit or bits to set.

    • positions: An integer or an array of integers specifying the position or positions to be set.

  • Return Value:

    • An integer, representing the result after setting the bit or bits specified. If the bit is already set, then it stays set.
  • Limitations:

    • Input values must be integers (such as 1 or 1.0) and cannot contain decimals (such as 1.2).
  • Example 1:

    Set bit 1 in the value 6 (0110 in binary).

    { "BitSET": BITSET(6,1) };
    
  • The expected result is:

    { "BitSET": 7 }
    

    This results in 7 (0111 in binary) because bit 1 changed to 1.

  • Example 2:

    Set bits 1 and 2 in the value 6 (0110 in binary).

    { "BitSET": BITSET(6,[1,2]) };
    
  • The expected result is:

    { "BitSET": 7 }
    

    This also results in 7 (0111 in binary) because bit 1 changed while bit 2 remained the same.

  • Example 3:

    Set bits 1 and 4 in the value 6 (0110 in binary).

    { "BitSET": BITSET(6,[1,4]) };
    
  • The expected result is:

    { "BitSET": 15 }
    

    This results in 15 (1111 in binary) because bit 1 and 4 changed to ones.

bitshift

  • Syntax:

    BITSHIFT(int_value, shift_amount[, rotate])
    
  • Returns the result of a bit shift operation performed on the integer value int_value. The shift_amount supports left and right shifts. These are logical shifts. The third parameter rotate supports circular shift. This is similar to the BitROTATE function in Oracle.

  • Arguments:

    • int_value: An integer, or any valid expression which evaluates to an integer, that contains the target bit or bits to shift.

    • shift_amount: An integer, or any valid expression which evaluates to an integer, that contains the number of bits to shift.

      • A positive (+) number means this is a LEFT shift.

      • A negative (-) number means this is a RIGHT shift.

    • rotate: (Optional) A boolean, or any valid expression which evaluates to a boolean, where:

      • FALSE means this is a LOGICAL shift, where bits shifted off the end of a value are considered lost.

      • TRUE means this is a CIRCULAR shift (shift-and-rotate operation), where bits shifted off the end of a value are rotated back onto the value at the other end. In other words, the bits rotate in what might be thought of as a circular pattern; therefore, these bits are not lost.

      If omitted, the default is FALSE.

    For comparison, see the below table.

    Input Shift Result of Logical Shift (Rotate FALSE) Result of Circular Shift (Rotate TRUE)
    6 (0000 0110) 4 96 (0110 0000) 96 (0110 0000)
    6 (0000 0110) 3 48 (0011 0000) 48 (0011 0000)
    6 (0000 0110) 2 24 (0001 1000) 24 (0001 1000)
    6 (0000 0110) 1 12 (0000 1100) 12 (0000 1100)
    6 (0000 0110) 0 6 (0000 0110) 6 (0000 0110)
    6 (0000 0110) -1 3 (0000 0011) 3 (0000 0011)
    6 (0000 0110) -2 1 (0000 0001) -9223372036854775807 (1000 0000 … 0000 0001)
    6 (0000 0110) -3 0 (0000 0000) -4611686018427387904 (1100 0000 … 0000 0000)
    6 (0000 0110) -4 0 (0000 0000) 6917529027641081856 (0110 0000 … 0000 0000)
  • Return Value:

    • An integer, representing the result of either a logical or circular shift of the given integer.
  • Limitations:

    • Input values must be integers (such as 1 or 1.0) and cannot contain decimals (such as 1.2).
  • Example 1:

    Logical left shift of the number 6 (0110 in binary) by one bit.

    { "BitSHIFT": BITSHIFT(6,1,FALSE) };
    
  • The expected result is:

    { "BitSHIFT": 12 }
    

    This results in 12 (1100 in binary) because the 1-bits moved from positions 2 and 3 to positions 3 and 4.

  • Example 2:

    Logical right shift of the number 6 (0110 in binary) by two bits.

    { "BitSHIFT": BITSHIFT(6,-2) };
    
  • The expected result is:

    { "BitSHIFT": 1 }
    

    This results in 1 (0001 in binary) because the 1-bit in position 3 moved to position 1 and the 1-bit in position 2 was dropped.

  • Example 2b:

    Circular right shift of the number 6 (0110 in binary) by two bits.

    { "BitSHIFT": BITSHIFT(6,-2,TRUE) };
    
  • The expected result is:

    { "BitSHIFT": -9223372036854775807 }
    

    This results in -9223372036854775807 (1100 0000 0000 0000 0000 0000 0000 0000 in binary) because the two 1-bits wrapped right, around to the Most Significant Digit position and changed the integer’s sign to negative.

  • Example 3:

    Circular left shift of the number 524288 (1000 0000 0000 0000 0000 in binary) by 45 bits.

    { "BitSHIFT": BITSHIFT(524288,45,TRUE) };
    
  • The expected result is:

    { "BitSHIFT": 1 }
    

    This results in 1 because the 1-bit wrapped left, around to the Least Significant Digit position.

bittest

  • Syntax:

    BITTEST(int_value, positions [, all_set])
    
  • Returns TRUE if the specified bit, or bits, is a 1; otherwise, returns FALSE if the specified bit, or bits, is a 0 (zero).

    Note: Specifying a negative or zero bit position will result in null being returned.

  • Arguments:

    • int_value: An integer, or any valid expression which evaluates to an integer, that contains the target bit or bits to test.

    • positions: An integer or an array of integers specifying the position or positions to be tested.

    • all_set: (Optional) A boolean, or any valid expression which evaluates to a boolean.

      • When all_set is FALSE, then it returns TRUE even if one bit in one of the positions is set.

      • When all_set is TRUE, then it returns TRUE only if all input positions are set.

      If omitted, the default is FALSE.

  • Return Value:

    • A boolean, that follows the below table:
      int_value all_set Return Value
      all specified bits are TRUE FALSE TRUE
      all specified bits are TRUE TRUE TRUE
      some specified bits are TRUE FALSE TRUE
      some specified bits are TRUE TRUE FALSE
  • Limitations:

    • Input values must be integers (such as 1 or 1.0) and cannot contain decimals (such as 1.2).
  • Example 1:

    In the number 6 (0110 in binary), is bit 1 set?

    { "IsBitSET": ISBITSET(6,1) };
    
  • The expected result is:

    { "IsBitSET": false }
    

    This returns FALSE because bit 1 of 6 (0110 in binary) is not set to 1.

  • Example 2:

    In the number 1, is either bit 1 or bit 2 set?

    { "BitTEST": BITTEST(1,[1,2],FALSE) };
    
  • The expected result is:

    { "BitTEST": true }
    

    This returns TRUE because bit 1 of the number 1 (0001 in binary) is set to 1.

  • Example 3:

    In the number 6 (0110 in binary), are both bits 2 and 3 set?

    { "IsBitSET": ISBITSET(6,[2,3],TRUE) };
    
  • The expected result is:

    { "IsBitSET": true }
    

    This returns TRUE because both bits 2 and 3 in the number 6 (0110 in binary) are set to 1.

  • Example 4:

    In the number 6 (0110 in binary), are all the bits in positions 1 through 3 set?

    { "BitTEST": BITTEST(6,[1,3],TRUE) };
    
  • The expected result is:

    { "BitTEST": false }
    

    This returns FALSE because bit 1 in the number 6 (0110 in binary) is set to 0 (zero).

The function has an alias isbitset.

bitxor

  • Syntax:

    BITXOR(int_value1, int_value2, ... , int_valueN)
    
  • Returns the result of a bitwise Exclusive OR operation performed on two or more integer values.

    The bitwise Exclusive OR operation compares each bit of int_value1 to the corresponding bit of int_value2.

    If there are more than two input values, the first two are compared; then their result is compared to the next input value; and so on.

    When the compared bits do not match, the result bit is 1; otherwise, the compared bits do match, and the result bit is 0 (zero), as summarized:

    Bit 1 Bit 2 XOR Result Bit
    0 0 0
    0 1 1
    1 0 1
    1 1 0
  • Arguments:

    • int_valueI: Integers, or any valid expressions which evaluate to integers, that are used to compare.
  • Return Value:

    • An integer, representing the bitwise XOR between the input integers.
  • Limitations:

    • Input values must be integers (such as 1 or 1.0) and cannot contain decimals (such as 1.2).
  • Example 1:

    Perform the XOR operation on 3 (0011 in binary) and 6 (0110 in binary).

    { "BitXOR": BITXOR(3,6) };
    
  • The expected result is:

    { "BitXOR": 5 }
    

    This returns 5 (0101 in binary) because the 1st bit pair and 3rd bit pair are different (resulting in 1) while the 2nd bit pair and 4th bit pair are the same (resulting in 0):

    0011 (3)
    0110 (6)
    ====
    0101 (5)
    
  • Example 2:

    Perform the XOR operation on 3 (0011 in binary) and 6 (0110 in binary) and 15 (1111 in binary).

    { "BitXOR": BITXOR(3,6,15) };
    
  • The expected result is:

    { "BitXOR": 10 }
    

    This returns 10 (1010 in binary) because 3 XOR 6 equals 5 (0101 in binary), and then 5 XOR 15 equals 10 (1010 in binary).

Window Functions

Window functions are used to compute an aggregate or cumulative value, based on a portion of the tuples selected by a query. For each input tuple, a movable window of tuples is defined. The window determines the tuples to be used by the window function.

The tuples are not grouped into a single output tuple — each tuple remains separate in the query output.

All window functions must be used with an OVER clause. Refer to OVER Clauses for details.

Window functions cannot appear in the FROM clause clause or LIMIT clause.

The examples in this section use the GleambookMessages dataset, described in the section on SELECT Statements.

cume_dist

  • Syntax:

    CUME_DIST() OVER ([window-partition-clause] [window-order-clause])
    
  • Returns the percentile rank of the current tuple as part of the cumulative distribution – that is, the number of tuples ranked lower than or equal to the current tuple, including the current tuple, divided by the total number of tuples in the window partition.

    The window order clause determines the sort order of the tuples. If the window order clause is omitted, the function returns the same result (1.0) for each tuple.

  • Arguments:

    • None.
  • Clauses:

  • Return Value:

    • A number greater than 0 and less than or equal to 1. The higher the value, the higher the ranking.
  • Example:

    For each author, find the cumulative distribution of all messages in order of message ID.

    SELECT m.messageId, m.authorId, CUME_DIST() OVER (
      PARTITION BY m.authorId
      ORDER BY m.messageId
    ) AS `rank`
    FROM GleambookMessages AS m;
    
  • The expected result is:

    [
      {
        "rank": 0.2,
        "messageId": 2,
        "authorId": 1
      },
      {
        "rank": 0.4,
        "messageId": 4,
        "authorId": 1
      },
      {
        "rank": 0.6,
        "messageId": 8,
        "authorId": 1
      },
      {
        "rank": 0.8,
        "messageId": 10,
        "authorId": 1
      },
      {
        "rank": 1,
        "messageId": 11,
        "authorId": 1
      },
      {
        "rank": 0.5,
        "messageId": 3,
        "authorId": 2
      },
      {
        "rank": 1,
        "messageId": 6,
        "authorId": 2
      }
    ]
    

dense_rank

  • Syntax:

    DENSE_RANK() OVER ([window-partition-clause] [window-order-clause])
    
  • Returns the dense rank of the current tuple – that is, the number of distinct tuples preceding this tuple in the current window partition, plus one.

    The tuples are ordered by the window order clause. If any tuples are tied, they will have the same rank. If the window order clause is omitted, the function returns the same result (1) for each tuple.

    For this function, when any tuples have the same rank, the rank of the next tuple will be consecutive, so there will not be a gap in the sequence of returned values. For example, if there are three tuples ranked 2, the next dense rank is 3.

  • Arguments:

    • None.
  • Clauses:

  • Return Value:

    • An integer, greater than or equal to 1.
  • Example:

    For each author, find the dense rank of all messages in order of location.

    SELECT m.authorId, m.messageId, m.senderLocation[1] as longitude,
    DENSE_RANK() OVER (
      PARTITION BY m.authorId
      ORDER BY m.senderLocation[1]
    ) AS `rank`
    FROM GleambookMessages AS m;
    
  • The expected result is:

    [
      {
        "rank": 1,
        "authorId": 1,
        "messageId": 10,
        "longitude": 70.01
      },
      {
        "rank": 2,
        "authorId": 1,
        "messageId": 11,
        "longitude": 77.49
      },
      {
        "rank": 3,
        "authorId": 1,
        "messageId": 2,
        "longitude": 80.87
      },
      {
        "rank": 3,
        "authorId": 1,
        "messageId": 8,
        "longitude": 80.87
      },
      {
        "rank": 4,
        "authorId": 1,
        "messageId": 4,
        "longitude": 97.04
      },
      {
        "rank": 1,
        "authorId": 2,
        "messageId": 6,
        "longitude": 75.56
      },
      {
        "rank": 2,
        "authorId": 2,
        "messageId": 3,
        "longitude": 81.01
      }
    ]
    

first_value

  • Syntax:

    FIRST_VALUE(expr) [nulls-treatment] OVER (window-definition)
    
  • Returns the requested value from the first tuple in the current window frame, where the window frame is specified by the window definition.

  • Arguments:

    • expr: The value that you want to return from the first tuple in the window frame. [1]
  • Modifiers:

    • Nulls Treatment: (Optional) Determines how NULL or MISSING values are treated when finding the first value in the window frame.
      • IGNORE NULLS: If the values for any tuples evaluate to NULL or MISSING, those tuples are ignored when finding the first tuple. In this case, the function returns the first non-NULL, non-MISSING value.

      • RESPECT NULLS: If the values for any tuples evaluate to NULL or MISSING, those tuples are included when finding the first tuple.

      If this modifier is omitted, the default is RESPECT NULLS.

  • Clauses:

  • Return Value:

    • The specified value from the first tuple. The order of the tuples is determined by the window order clause.

    • NULL, if the frame was empty or if all values were NULL or MISSING and the IGNORE NULLS modifier was specified.

    • In the following cases, this function may return unpredictable results.

      • If the window order clause is omitted.

      • If the window frame is defined by ROWS, and there are tied tuples in the window frame.

    • To make the function return deterministic results, add a window order clause, or add further ordering terms to the window order clause so that no tuples are tied.

    • If the window frame is defined by RANGE or GROUPS, and there are tied tuples in the window frame, the function returns the first value of the input expression.

  • Example:

    For each author, show the length of each message, including the length of the shortest message from that author.

    SELECT m.authorId, m.messageId,
    LENGTH(m.message) AS message_length,
    FIRST_VALUE(LENGTH(m.message)) OVER (
      PARTITION BY m.authorId
      ORDER BY LENGTH(m.message)
    ) AS shortest_message
    FROM GleambookMessages AS m;
    
  • The expected result is:

    [
      {
        "message_length": 31,
        "shortest_message": 31,
        "authorId": 1,
        "messageId": 8
      },
      {
        "message_length": 39,
        "shortest_message": 31,
        "authorId": 1,
        "messageId": 11
      },
      {
        "message_length": 44,
        "shortest_message": 31,
        "authorId": 1,
        "messageId": 4
      },
      {
        "message_length": 45,
        "shortest_message": 31,
        "authorId": 1,
        "messageId": 2
      },
      {
        "message_length": 51,
        "shortest_message": 31,
        "authorId": 1,
        "messageId": 10
      },
      {
        "message_length": 35,
        "shortest_message": 35,
        "authorId": 2,
        "messageId": 3
      },
      {
        "message_length": 44,
        "shortest_message": 35,
        "authorId": 2,
        "messageId": 6
      }
    ]
    

lag

  • Syntax:

    LAG(expr[, offset[, default]]) [nulls-treatment] OVER ([window-partition-clause] [window-order-clause])
    
  • Returns the value from a tuple at a given offset prior to the current tuple position.

    The window order clause determines the sort order of the tuples. If the window order clause is omitted, the return values may be unpredictable.

  • Arguments:

    • expr: The value that you want to return from the offset tuple. [1]

    • offset: (Optional) A positive integer. If omitted, the default is 1.

    • default: (Optional) The value to return when the offset goes out of partition scope. If omitted, the default is NULL.

  • Modifiers:

    • Nulls Treatment: (Optional) Determines how NULL or MISSING values are treated when finding the offset tuple in the window partition.
      • IGNORE NULLS: If the values for any tuples evaluate to NULL or MISSING, those tuples are ignored when finding the offset tuple.

      • RESPECT NULLS: If the values for any tuples evaluate to NULL or MISSING, those tuples are included when finding the offset tuple.

      If this modifier is omitted, the default is RESPECT NULLS.

  • Clauses:

  • Return Value:

    • The specified value from the offset tuple.

    • If the offset tuple is out of partition scope, it returns the default value, or NULL if no default is specified.

  • Example:

    For each author, show the length of each message, including the length of the next-shortest message.

    SELECT m.authorId, m.messageId,
    LENGTH(m.message) AS message_length,
    LAG(LENGTH(m.message), 1, "No shorter message") OVER (
      PARTITION BY m.authorId
      ORDER BY LENGTH(m.message)
    ) AS next_shortest_message
    FROM GleambookMessages AS m;
    
  • The expected result is:

    [
      {
        "message_length": 31,
        "authorId": 1,
        "messageId": 8,
        "next_shortest_message": "No shorter message"
      },
      {
        "message_length": 39,
        "authorId": 1,
        "messageId": 11,
        "next_shortest_message": 31
      },
      {
        "message_length": 44,
        "authorId": 1,
        "messageId": 4,
        "next_shortest_message": 39
      },
      {
        "message_length": 45,
        "authorId": 1,
        "messageId": 2,
        "next_shortest_message": 44
      },
      {
        "message_length": 51,
        "authorId": 1,
        "messageId": 10,
        "next_shortest_message": 45
      },
      {
        "message_length": 35,
        "authorId": 2,
        "messageId": 3,
        "next_shortest_message": "No shorter message"
      },
      {
        "message_length": 44,
        "authorId": 2,
        "messageId": 6,
        "next_shortest_message": 35
      }
    ]
    

last_value

  • Syntax:

    LAST_VALUE(expr) [nulls-treatment] OVER (window-definition)
    
  • Returns the requested value from the last tuple in the current window frame, where the window frame is specified by the window definition.

  • Arguments:

    • expr: The value that you want to return from the last tuple in the window frame. [1]
  • Modifiers:

    • Nulls Treatment: (Optional) Determines how NULL or MISSING values are treated when finding the last tuple in the window frame.
      • IGNORE NULLS: If the values for any tuples evaluate to NULL or MISSING, those tuples are ignored when finding the last tuple. In this case, the function returns the last non-NULL, non-MISSING value.

      • RESPECT NULLS: If the values for any tuples evaluate to NULL or MISSING, those tuples are included when finding the last tuple.

      If this modifier is omitted, the default is RESPECT NULLS.

  • Clauses:

  • Return Value:

    • The specified value from the last tuple. The order of the tuples is determined by the window order clause.

    • NULL, if the frame was empty or if all values were NULL or MISSING and the IGNORE NULLS modifier was specified.

    • In the following cases, this function may return unpredictable results.

      • If the window order clause is omitted.

      • If the window frame clause is omitted.

      • If the window frame is defined by ROWS, and there are tied tuples in the window frame.

    • To make the function return deterministic results, add a window order clause, or add further ordering terms to the window order clause so that no tuples are tied.

    • If the window frame is defined by RANGE or GROUPS, and there are tied tuples in the window frame, the function returns the last value of the input expression.

  • Example:

    For each author, show the length of each message, including the length of the longest message from that author.

    SELECT m.authorId, m.messageId,
    LENGTH(m.message) AS message_length,
    LAST_VALUE(LENGTH(m.message)) OVER (
      PARTITION BY m.authorId
      ORDER BY LENGTH(m.message)
      ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING -- ➊
    ) AS longest_message
    FROM GleambookMessages AS m;
    
  • The expected result is:

    [
      {
        "message_length": 31,
        "longest_message": 51,
        "authorId": 1,
        "messageId": 8
      },
      {
        "message_length": 39,
        "longest_message": 51,
        "authorId": 1,
        "messageId": 11
      },
      {
        "message_length": 44,
        "longest_message": 51,
        "authorId": 1,
        "messageId": 4
      },
      {
        "message_length": 45,
        "longest_message": 51,
        "authorId": 1,
        "messageId": 2
      },
      {
        "message_length": 51,
        "longest_message": 51,
        "authorId": 1,
        "messageId": 10
      },
      {
        "message_length": 35,
        "longest_message": 44,
        "authorId": 2,
        "messageId": 3
      },
      {
        "message_length": 44,
        "longest_message": 44,
        "authorId": 2,
        "messageId": 6
      }
    ]
    

    ➀ This clause specifies that the window frame should extend to the end of the window partition. Without this clause, the end point of the window frame would always be the current tuple. This would mean that the longest message would always be the same as the current message.

lead

  • Syntax:

    LEAD(expr[, offset[, default]]) [nulls-treatment] OVER ([window-partition-clause] [window-order-clause])
    
  • Returns the value from a tuple at a given offset ahead of the current tuple position.

    The window order clause determines the sort order of the tuples. If the window order clause is omitted, the return values may be unpredictable.

  • Arguments:

    • expr: The value that you want to return from the offset tuple. [1]

    • offset: (Optional) A positive integer. If omitted, the default is 1.

    • default: (Optional) The value to return when the offset goes out of window partition scope. If omitted, the default is NULL.

  • Modifiers:

    • Nulls Treatment: (Optional) Determines how NULL or MISSING values are treated when finding the offset tuple in the window partition.
      • IGNORE NULLS: If the values for any tuples evaluate to NULL or MISSING, those tuples are ignored when finding the offset tuple.

      • RESPECT NULLS: If the values for any tuples evaluate to NULL or MISSING, those tuples are included when finding the offset tuple.

      If this modifier is omitted, the default is RESPECT NULLS.

  • Clauses:

  • Return Value:

    • The specified value from the offset tuple.

    • If the offset tuple is out of partition scope, it returns the default value, or NULL if no default is specified.

  • Example:

    For each author, show the length of each message, including the length of the next-longest message.

    SELECT m.authorId, m.messageId,
    LENGTH(m.message) AS message_length,
    LEAD(LENGTH(m.message), 1, "No longer message") OVER (
      PARTITION BY m.authorId
      ORDER BY LENGTH(m.message)
    ) AS next_longest_message
    FROM GleambookMessages AS m;
    
  • The expected result is:

    [
      {
        "message_length": 31,
        "authorId": 1,
        "messageId": 8,
        "next_longest_message": 39
      },
      {
        "message_length": 39,
        "authorId": 1,
        "messageId": 11,
        "next_longest_message": 44
      },
      {
        "message_length": 44,
        "authorId": 1,
        "messageId": 4,
        "next_longest_message": 45
      },
      {
        "message_length": 45,
        "authorId": 1,
        "messageId": 2,
        "next_longest_message": 51
      },
      {
        "message_length": 51,
        "authorId": 1,
        "messageId": 10,
        "next_longest_message": "No longer message"
      },
      {
        "message_length": 35,
        "authorId": 2,
        "messageId": 3,
        "next_longest_message": 44
      },
      {
        "message_length": 44,
        "authorId": 2,
        "messageId": 6,
        "next_longest_message": "No longer message"
      }
    ]
    

nth_value

  • Syntax:

    NTH_VALUE(expr, offset) [nthval-from] [nulls-treatment] OVER (window-definition)
    
  • Returns the requested value from a tuple in the current window frame, where the window frame is specified by the window definition.

  • Arguments:

    • expr: The value that you want to return from the offset tuple in the window frame. [1]

    • offset: The number of the offset tuple within the window frame, counting from 1.

  • Modifiers:

    • Nth Val From: (Optional) Determines where the function starts counting the offset.

      • FROM FIRST: Counting starts at the first tuple in the window frame. In this case, an offset of 1 is the first tuple in the window frame, 2 is the second tuple, and so on.

      • FROM LAST: Counting starts at the last tuple in the window frame. In this case, an offset of 1 is the last tuple in the window frame, 2 is the second-to-last tuple, and so on.

      The order of the tuples is determined by the window order clause. If this modifier is omitted, the default is FROM FIRST.

    • Nulls Treatment: (Optional) Determines how NULL or MISSING values are treated when finding the offset tuple in the window frame.

      • IGNORE NULLS: If the values for any tuples evaluate to NULL or MISSING, those tuples are ignored when finding the offset tuple.

      • RESPECT NULLS: If the values for any tuples evaluate to NULL or MISSING, those tuples are included when finding the offset tuple.

      If this modifier is omitted, the default is RESPECT NULLS.

  • Clauses:

  • Return Value:

    • The specified value from the offset tuple.

    • In the following cases, this function may return unpredictable results.

      • If the window order clause is omitted.

      • If the window frame is defined by ROWS, and there are tied tuples in the window frame.

    • To make the function return deterministic results, add a window order clause, or add further ordering terms to the window order clause so that no tuples are tied.

    • If the window frame is defined by RANGE or GROUPS, and there are tied tuples in the window frame, the function returns the first value of the input expression when counting FROM FIRST, or the last value of the input expression when counting FROM LAST.

  • Example 1:

    For each author, show the length of each message, including the length of the second shortest message from that author.

    SELECT m.authorId, m.messageId,
    LENGTH(m.message) AS message_length,
    NTH_VALUE(LENGTH(m.message), 2) FROM FIRST OVER (
      PARTITION BY m.authorId
      ORDER BY LENGTH(m.message)
      ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING -- ➊
    ) AS shortest_message_but_1
    FROM GleambookMessages AS m;
    
  • The expected result is:

    [
      {
        "message_length": 31,
        "shortest_message_but_1": 39,
        "authorId": 1,
        "messageId": 8
      },
      {
        "message_length": 39,
        "shortest_message_but_1": 39,
        "authorId": 1,
        "messageId": 11 // ➋
      },
      {
        "message_length": 44,
        "shortest_message_but_1": 39,
        "authorId": 1,
        "messageId": 4
      },
      {
        "message_length": 45,
        "shortest_message_but_1": 39,
        "authorId": 1,
        "messageId": 2
      },
      {
        "message_length": 51,
        "shortest_message_but_1": 39,
        "authorId": 1,
        "messageId": 10
      },
      {
        "message_length": 35,
        "shortest_message_but_1": 44,
        "authorId": 2,
        "messageId": 3
      },
      {
        "message_length": 44,
        "shortest_message_but_1": 44,
        "authorId": 2,
        "messageId": 6 // ➋
      }
    ]
    

    ➀ This clause specifies that the window frame should extend to the end of the window partition. Without this clause, the end point of the window frame would always be the current tuple. This would mean that for the shortest message, the function would be unable to find the route with the second shortest message.

    ➁ The second shortest message from this author.

  • Example 2:

    For each author, show the length of each message, including the length of the second longest message from that author.

    SELECT m.authorId, m.messageId,
    LENGTH(m.message) AS message_length,
    NTH_VALUE(LENGTH(m.message), 2) FROM LAST OVER (
      PARTITION BY m.authorId
      ORDER BY LENGTH(m.message)
      ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING -- ➊
    ) AS longest_message_but_1
    FROM GleambookMessages AS m;
    
  • The expected result is:

    [
      {
        "message_length": 31,
        "longest_message_but_1": 45,
        "authorId": 1,
        "messageId": 8
      },
      {
        "message_length": 39,
        "longest_message_but_1": 45,
        "authorId": 1,
        "messageId": 11
      },
      {
        "message_length": 44,
        "longest_message_but_1": 45,
        "authorId": 1,
        "messageId": 4
      },
      {
        "message_length": 45,
        "longest_message_but_1": 45,
        "authorId": 1,
        "messageId": 2 // ➋
      },
      {
        "message_length": 51,
        "longest_message_but_1": 45,
        "authorId": 1,
        "messageId": 10
      },
      {
        "message_length": 35,
        "longest_message_but_1": 35,
        "authorId": 2,
        "messageId": 3 // ➋
      },
      {
        "message_length": 44,
        "longest_message_but_1": 35,
        "authorId": 2,
        "messageId": 6
      }
    ]
    

    ➀ This clause specifies that the window frame should extend to the end of the window partition. Without this clause, the end point of the window frame would always be the current tuple. This would mean the function would be unable to find the second longest message for shorter messages.

    ➁ The second longest message from this author.

ntile

  • Syntax:

    NTILE(num_tiles) OVER ([window-partition-clause] [window-order-clause])
    
  • Divides the window partition into the specified number of tiles, and allocates each tuple in the window partition to a tile, so that as far as possible each tile has an equal number of tuples. When the set of tuples is not equally divisible by the number of tiles, the function puts more tuples into the lower-numbered tiles. For each tuple, the function returns the number of the tile into which that tuple was placed.

    The window order clause determines the sort order of the tuples. If the window order clause is omitted then the tuples are processed in an undefined order.

  • Arguments:

    • num_tiles: The number of tiles into which you want to divide the window partition. This argument can be an expression and must evaluate to a number. If the number is not an integer, it will be truncated.
  • Clauses:

  • Return Value:

    • An value greater than or equal to 1 and less than or equal to the number of tiles.
  • Example:

    Allocate each message to one of three tiles by length and message ID.

    SELECT m.messageId, LENGTH(m.message) AS `length`,
    NTILE(3) OVER (
      ORDER BY LENGTH(m.message), m.messageId
    ) AS `ntile`
    FROM GleambookMessages AS m;
    
  • The expected result is:

    [
      {
        "length": 31,
        "ntile": 1,
        "messageId": 8
      },
      {
        "length": 35,
        "ntile": 1,
        "messageId": 3
      },
      {
        "length": 39,
        "ntile": 1,
        "messageId": 11
      },
      {
        "length": 44,
        "ntile": 2,
        "messageId": 4
      },
      {
        "length": 44,
        "ntile": 2,
        "messageId": 6
      },
      {
        "length": 45,
        "ntile": 3,
        "messageId": 2
      },
      {
        "length": 51,
        "ntile": 3,
        "messageId": 10
      }
    ]
    

percent_rank

  • Syntax:

    PERCENT_RANK() OVER ([window-partition-clause] [window-order-clause])
    
  • Returns the percentile rank of the current tuple – that is, the rank of the tuples minus one, divided by the total number of tuples in the window partition minus one.

    The window order clause determines the sort order of the tuples. If the window order clause is omitted, the function returns the same result (0) for each tuple.

  • Arguments:

    • None.
  • Clauses:

  • Return Value:

    • A number between 0 and 1. The higher the value, the higher the ranking.
  • Example:

    For each author, find the percentile rank of all messages in order of message ID.

    SELECT m.messageId, m.authorId, PERCENT_RANK() OVER (
      PARTITION BY m.authorId
      ORDER BY m.messageId
    ) AS `rank`
    FROM GleambookMessages AS m;
    
  • The expected result is:

    [
      {
        "rank": 0,
        "messageId": 2,
        "authorId": 1
      },
      {
        "rank": 0.25,
        "messageId": 4,
        "authorId": 1
      },
      {
        "rank": 0.5,
        "messageId": 8,
        "authorId": 1
      },
      {
        "rank": 0.75,
        "messageId": 10,
        "authorId": 1
      },
      {
        "rank": 1,
        "messageId": 11,
        "authorId": 1
      },
      {
        "rank": 0,
        "messageId": 3,
        "authorId": 2
      },
      {
        "rank": 1,
        "messageId": 6,
        "authorId": 2
      }
    ]
    

rank

  • Syntax:

    RANK() OVER ([window-partition-clause] [window-order-clause])
    
  • Returns the rank of the current tuple – that is, the number of distinct tuples preceding this tuple in the current window partition, plus one.

    The tuples are ordered by the window order clause. If any tuples are tied, they will have the same rank. If the window order clause is omitted, the function returns the same result (1) for each tuple.

    When any tuples have the same rank, the rank of the next tuple will include all preceding tuples, so there may be a gap in the sequence of returned values. For example, if there are three tuples ranked 2, the next rank is 5.

    To avoid gaps in the returned values, use the DENSE_RANK() function instead.

  • Arguments:

    • None.
  • Clauses:

  • Return Value:

    • An integer, greater than or equal to 1.
  • Example:

    For each author, find the rank of all messages in order of location.

    SELECT m.authorId, m.messageId, m.senderLocation[1] as longitude,
    RANK() OVER (
      PARTITION BY m.authorId
      ORDER BY m.senderLocation[1]
    ) AS `rank`
    FROM GleambookMessages AS m;
    
  • The expected result is:

    [
      {
        "rank": 1,
        "authorId": 1,
        "messageId": 10,
        "longitude": 70.01
      },
      {
        "rank": 2,
        "authorId": 1,
        "messageId": 11,
        "longitude": 77.49
      },
      {
        "rank": 3,
        "authorId": 1,
        "messageId": 2,
        "longitude": 80.87
      },
      {
        "rank": 3,
        "authorId": 1,
        "messageId": 8,
        "longitude": 80.87
      },
      {
        "rank": 5,
        "authorId": 1,
        "messageId": 4,
        "longitude": 97.04
      },
      {
        "rank": 1,
        "authorId": 2,
        "messageId": 6,
        "longitude": 75.56
      },
      {
        "rank": 2,
        "authorId": 2,
        "messageId": 3,
        "longitude": 81.01
      }
    ]
    

ratio_to_report

  • Syntax:

    RATIO_TO_REPORT(expr) OVER (window-definition)
    
  • Returns the fractional ratio of the specified value for each tuple to the sum of values for all tuples in the window frame.

  • Arguments:

    • expr: The value for which you want to calculate the fractional ratio. [1]
  • Clauses:

  • Return Value:

    • A number between 0 and 1, representing the fractional ratio of the value for the current tuple to the sum of values for all tuples in the current window frame. The sum of returned values for all tuples in the current window frame is 1.

    • If the input expression does not evaluate to a number, or the sum of values for all tuples is zero, it returns NULL.

  • Example:

    For each author, calculate the length of each message as a fraction of the total length of all messages.

    SELECT m.messageId, m.authorId,
    RATIO_TO_REPORT(LENGTH(m.message)) OVER (
      PARTITION BY m.authorId
    ) AS length_ratio
    FROM GleambookMessages AS m;
    
  • The expected result is:

    [
      {
        "length_ratio": 0.21428571428571427,
        "messageId": 2,
        "authorId": 1
      },
      {
        "length_ratio": 0.20952380952380953,
        "messageId": 4,
        "authorId": 1
      },
      {
        "length_ratio": 0.14761904761904762,
        "messageId": 8,
        "authorId": 1
      },
      {
        "length_ratio": 0.24285714285714285,
        "messageId": 10,
        "authorId": 1
      },
      {
        "length_ratio": 0.18571428571428572,
        "messageId": 11,
        "authorId": 1
      },
      {
        "length_ratio": 0.4430379746835443,
        "messageId": 3,
        "authorId": 2
      },
      {
        "length_ratio": 0.5569620253164557,
        "messageId": 6,
        "authorId": 2
      }
    ]
    

row_number

  • Syntax:

    ROW_NUMBER() OVER ([window-partition-clause] [window-order-clause])
    
  • Returns a unique row number for every tuple in every window partition. In each window partition, the row numbering starts at 1.

    The window order clause determines the sort order of the tuples. If the window order clause is omitted, the return values may be unpredictable.

  • Arguments:

    • None.
  • Clauses:

  • Return Value:

    • An integer, greater than or equal to 1.
  • Example:

    For each author, number all messages in order of length.

    SELECT m.messageId, m.authorId,
    ROW_NUMBER() OVER (
      PARTITION BY m.authorId
      ORDER BY LENGTH(m.message)
    ) AS `row`
    FROM GleambookMessages AS m;
    
  • The expected result is:

    [
      {
        "row": 1,
        "messageId": 8,
        "authorId": 1
      },
      {
        "row": 2,
        "messageId": 11,
        "authorId": 1
      },
      {
        "row": 3,
        "messageId": 4,
        "authorId": 1
      },
      {
        "row": 4,
        "messageId": 2,
        "authorId": 1
      },
      {
        "row": 5,
        "messageId": 10,
        "authorId": 1
      },
      {
        "row": 1,
        "messageId": 3,
        "authorId": 2
      },
      {
        "row": 2,
        "messageId": 6,
        "authorId": 2
      }
    ]
    

1. If the query contains the GROUP BY clause or any aggregate functions, this expression must only depend on GROUP BY expressions or aggregate functions.