Loading and Transformation Ref...

Data Transformation with JMESPath

Within a transformation each column is configured with a JMESPath expression describing where to extract the column’s value and any transformations to apply to that value.

For information on common JMESPath functions and filtering tools, see the Standard JMESPATH Functions section.

For a list of functions to modify data during loading, see the Transformation Functions page.

Information on customizing your own transformation functions can be found on the User-Defined Transformations page.

For expressions to select elements from an array matching certain criteria, see the Record Filtering page.

Notes/Warnings

JSON Keys with Numbers and Special Characters

You must escape JSON keys that begin with a number or special character in the pipeline configuration.

  • For a JSON record: { "0_123" : "value" } transform this record as follows to write "value" to "my_col" --- { "my_col" : "\"0_123\"" }
  • For a JSON record: { "a-col" : 123 } transform this record as follows to write "value" to "my_col" --- { "my_col" : "\"a-col\"" }

Case Sensitivity

  • The
    
    system normalizes database objects (e.g., table names, column names, schemas) in LAT Transformations according to the same rules as the database unless explicitly escaped with quotes.
    • The Ocient system normalizes all names to lowercase unless explicitly escaped. For example, the transformation: { "myCol" : "value" } normalizes to mycol and matches a database column mycol that was not escaped at creation time.
    • If you create the database column as an escaped literal such as "myCol" then the LAT transformation must be specified as: { "\"myCol\"" : "value" } to correctly map to the column.
  • Transformation functions are case sensitive, so represent them by using all lower case letters (e.g., st_geomfromewkb(my_column)).

Literals

  • You must escape literal values with a backtick. This applies to numbers, booleans, and empty arrays (otherwise the Ocient system interprets the value as a JSON transformation operator).

Right:

JSON


Wrong:

JSON

  • You can write string literals with a single quote.
JSON


Loading Numbers in Exponential Notation

When you need to load numbers from exponential notation (e.g. 1.20E-05), you might require a conversion to load them into the target Ocient column or to perform transformations on the data. Use the to_number transformation to load this into a column of type decimal or to convert to a numeric value for further transformations like multiplication.

Example:

A source record in a CSV file might have a number like 7.00E-05 that appears as a string in the field named source_field.

JSON


This JSON code converts the string in exponential notation into the double 0.00007 that is now suitable for loading into decimal, double, or float columns, or for use in mathematical transformation functions.

Flatten, Array Access, and Pipes

For general syntax help for arrays, projections, and pipe expressions, see JMESPath website. Flattening arrays, accessing entire arrays, and using pipe expressions are three important operations when you work with nested data in LAT transformations.

  • The flatten operator (e.g., []) flattens an array one level. This also removes NULL values from the array.

This can cause unintended consequences if multiple arrays need to remain aligned for data processing.

  • When you work with nested arrays, you can maintain array hierarchy. For example, you can replace an array index such as * with [*] to maintain the hierarchy.
  • You can use the pipe operator (e.g., a.b[*] | [0]) to stop the projection of an expression into the prior expression.

Related Links

Load Data