LAT Reference
LAT Data Transformation with J...
LAT Transformation Functions
data pipelines are now the preferred method for loading data into the ocient system for details, see load data docid\ xq0tg7yph vn62uwufibu lat supports many transformation functions to modify data during loading the functions include standard jmespath functions and {{ocient}} specific transformation functions transformation functions in lat are case sensitive the standard syntax for transformation function calls is all lower case characters standard jmespath functions the following standard jmespath functions are supported in lat transformation expressions for details about using these functions, see jmespath functions https //jmespath org/specification html#functions expressions jmespath function function signature example abs number abs(number $value) abs(num 1) avg number avg(array\[number] $elements) avg(array 1) contains boolean contains(array|string $subject, any $search) contains(array 1, 'test') ceil number ceil(number $value) ceil(num 1) ends with boolean ends with(string $subject, string $prefix) ends with(str 1, 'ing') floor number floor(number $value) floor(num 1) join string join(string $glue, array\[string] $stringsarray) join(',', array 1) keys array keys(object $obj) keys(obj 1) length number length(string|array|object $subject) length(array 1) map array\[any] map(expression→any→any expr, array\[any] elements) map(\&to string(@), array 1) max number max(array\[number]|array\[string] $collection) max(array 1) max by max by(array elements, expression→number|expression→string expr) max by(array 1, \&age) merge object merge(\[object argument, \[, object $…]]) merge(obj 1, obj 2, obj 3) min number min(array\[number]|array\[string] $collection) min(array 1) min by min by(array elements, expression→number|expression→string expr) min by(array 1, \&age) not null any not null(\[any $argument \[, any $…]]) not null(val 1, val 2, val 3) reverse array reverse(string|array $argument) reverse(str 1) sort array sort(array\[number]|array\[string] $list) sort(array 1) sort by sort by(array elements, expression→number|expression→string expr) sort by(array 1, \&age) starts with boolean starts with(string $subject, string $prefix) starts with(str 1, 'chi') sum number sum(array\[number] $collection) sum(array 1) to array array to array(any $arg) to array(val 1) to string string to string(any $arg) to string(val 1) type string type(array|object|string|number|boolean|null $subject) type(val 1) values array values(object $obj) values(obj 1) other transformation functions the following transformation functions are also available in lat transformation expressions add returns the sum of the two arguments 200,535 unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type array cap caps the length of an array to a maximum number of elements 140,604 unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type array cat concatenate two arrays, returning an array accepts an array or null value, returning an empty array if both parameters are null 147,587 unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type concat concatenates two strings or converts arguments into strings and concatenates them into a string 111,632 unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type divide returns the floating point division of the two arguments 194,549 unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type hash code returns an integer hash code for a string, number, or boolean 168,575 unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type if returns one of the other arguments based on the truth value of the first when the first argument is true, return the second argument when false, return the third a null input into the first argument will be evaluated as false both branches are eagerly evaluated 127,606 unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type json parses a string into a json value for use in further jmespath operations all values other than string are returned unmodified strings containing null, boolean, number, array, and object types are supported 258,476 unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type lazy if returns one of the other arguments based on the truth value of the first when the first argument is true, return the second argument when false, return the third the second and third arguments are both expressions branches are lazily evaluated 121,622 unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type left returns the num chars leftmost characters of the string column if null, null is returned if the index is greater than the length of the string, the entire string is returned 146,597 unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type lower case returns lower case version of a string 237,497 unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type lpad pads an input string to a specified length with a padding string added to the left side 111,632 unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type ltrim removes leading contiguous instances of a set of characters in a given string 111,632 unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type map an override of jmespath’s built in map which maps a given expression into elements of an array this override makes the small adjustment that if the array itself is null, then a null is returned 126,617 unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type matches returns boolean for whether this input matches the pattern flags can be s → pattern dotall m → pattern multiline i → pattern case insensitive x → pattern comments q → pattern literal 144,599 unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type millis to timestamp converts an integer number of milliseconds since the epoch to an ocient timestamp column 167,566 unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type multiply returns the floating point multiplication of the two arguments 188,555 unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type nanos to timestamp converts an integer number of nanoseconds since the epoch to an ocient timestamp column 175,568 unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type normalize space removes leading and trailing whitespace and replaces connected whitespace with a single space 205,530 unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type now, current timestamp returns the current timestamp in nanos this will correctly load into an ocient timestamp column 175,560 unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type null if, nullif returns null if the first argument matches any of the following arguments can be applied to multiple columns or to compare a column with a constant can also be used with more than two arguments in which case returns null if a matches any of the subsequent arguments returns the value the first argument if it does not match any other arguments 158,586 unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type parse array given a string, parses it into an array of strings handles nested arrays as well 110,625 unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type record uuid returns string that represents a unique identifier for the record in a given pipeline for a given file group or topic this return value will be consistent over multiple calls in the pipeline's transformation configuration for the same source record can be used as a surrogate key when loading multiple tables from the same source record in a pipeline to facilitate joins when no unique identifier exists on the source record 110,623 unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type replace returns string with text matching the regex pattern replaced by the replacement text flags can be s → pattern dotall m → pattern multiline i → pattern case insensitive x → pattern comments q → pattern literal 110,623 unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type right returns the num chars rightmost characters of the string column if null, null is returned if the index is greater than the length of the string, the entire string is returned 146,588 unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type rpad pads an input string to a specified length with a padding string added to the right side 110,625 unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type rtrim removes trailing contiguous instances of a set of characters in a given string 110,625 unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type sample returns true or false based on the given sample rate the sample rate must be a number commonly used inside of if statements an exception will be thrown if the sample rate is outside of \[0 0, 1 0] 147,587 false unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type st forcepolygonccw forces a polygon to have a counterclockwise rotation of the outer polygon ring and a clockwise rotation of the inner polygon ring you can use this function to convert polygons extracted from systems that interpret polygon rotation differently 110,625 unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type st geomfromewkb returns a geometry from an extended well known binary (ewkb) and well known binary (wkb) representations of a geometry you can load the result of this function into point, linestring, and polygon column types 110,625 unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type st point constructs an point from numeric coordinates that represent the longitude and latitude of the point 135,600 unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type substring returns the substring beginning at start index (with indexes beginning at 1) to the end of the string, or to num chars characters 121,613 unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type substring after returns the string after the first occurrence of the needle 163,571 unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type substring before returns the string before the first occurrence of the needle both parameters are literal strings 160,575 unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type subtract returns the difference of the two arguments 185,550 unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type to array length turns any json object into an array with n copies of that object 165,568 unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type to binary converts a string to binary data for loading into an ocient binary or hash column if mode is 'hex' , data is parsed as a sequence of hexadecimal digits note that the sequence of digits must not begin with 0x otherwise, mode must be the name of a character encodings listed here https //docs oracle com/en/java/javase/17/intl/supported encodings html , and data is encoded in that encoding 120,622 unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type to date converts a string to a date for loading into an ocient date column format string follows these formatting rules https //docs oracle com/en/java/javase/17/docs/api/java base/java/time/format/datetimeformatter html 155,577 unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type to time converts a string into a time of day for loading into an ocient time column format string follows these formatting rules https //docs oracle com/en/java/javase/17/docs/api/java base/java/time/format/datetimeformatter html 150,582 unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type to timestamp converts a string to timestamp for loading into an ocient timestamp column format string follows these formatting rules https //docs oracle com/en/java/javase/17/docs/api/java base/java/time/format/datetimeformatter html 135,600 unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type to tuple works on tuples that get tokenized at the first nesting level by a given delimiter nested tuples, arrays or objects are considered entirely for each tokenized tuple element, a transformation function can be specified that is applied to the tuple element tokenized elements are always treated as strings, requiring additional transformation when being treated as numbers 110,631 unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type tokenize returns array of strings after splitting on the regex pattern flags can be s → pattern dotall m → pattern multiline i → pattern case insensitive x → pattern comments q → pattern literal 135,598 unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type translate replaces characters according to a map of characters if the to chars string is shorter than the from chars string, the function removes the characters that are not mapped 133,601 unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type trim removes leading and trailing contiguous instances of a set of characters in a given string 110,625 unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type truncate truncates a decimal number to have a specified number of digits after its decimal point if decimal is null, this function returns null during the transformation stage, the lat stores numbers as integers or floating point numbers by default however, floating point numbers are inexact for this reason, this function might return unexpected results for example, truncate(`0 29`, `2`) evaluates to 0 28 , not 0 29 (because 0 29 is really 0 28999999 ) 123,611 unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type tuple element transformation works on an element of a given array and applies the specified expression to that element the function returns an array or null if specified 110,625 unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type unique return all unique values in an array 126,608 unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type upper case returns upper case version of string 235,500 unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type width bucket returns the bucket number of the queried value in a histogram starting at min, ending at max, and consisting of num buckets count of buckets the values are inclusive on the lower bound and exclusive of the upper bound so that all buckets are the same width also note that requesting 100 buckets will actually get you 102 possible buckets bucket 0 through 101 where 0 captures all values below the minimum and 101 captures all values at the maximum and beyond 110,625 unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type zip zips n one dimensional arrays into a single two dimensional array with each inner array containing n elements the resulting outer array will have a length equal to the max length of any input array for input arrays smaller than the max length, null elements will be placed in the inner arrays 135,598 unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type unhandled content type special functions explode performs an explosion on all indicated columns, producing one row per value in the exploded array this can be applied to multiple columns in the same table non exploded columns are held constant and exploded columns are zipped together in array order if multiple exploded columns have different array lengths, the record is exploded into n records where n is the max length columns with array lengths less than n will have null values after each of their values are exploded you can explode a maximum of 8,192 array elements for each source record usage explode(array\[any] arr) can be used like any other jmespath function on a json array type with a few limitations explode can only be used as the outermost function in a jmespath expression explode only works on a single dimension multidimensional explode is not supported examples single column explode transform configuration { "transform" { "topics" { "topic0" { "tables" { "table0" { "columns" { "col0" "explode(array0)", "col1" "integer0" } } } } } } } input { "array0" \[0, 1, 2, 3], "integer0" 4 } output \[ { "col0" 0, "col1" 4 }, { "col0" 1, "col1" 4 }, { "col0" 2, "col1" 4 }, { "col0" 3, "col1" 4 } ] multi column explode transform configuration { "transform" { "topics" { "topic0" { "tables" { "table0" { "columns" { "col0" "explode(array0)", "col1" "explode(array1)", "col2" "integer0" } } } } } } } input { "array0" \[0, 1, 2, 3], "array1" \[0, 1, 2], "integer0" 4 } output \[ { "col0" 0, "col1" 0, "col2" 4 }, { "col0" 1, "col1" 1, "col2" 4 }, { "col0" 2, "col1" 2, "col2" 4 }, { "col0" 3, "col1" null, "col2" 4 } ] note that in the 4th record col1 has a null value because array1 only has 3 values while array0 has 4 related links ingest data with legacy lat reference docid\ tt6tfoulap0mt aycm2ka