Documentation Index
Fetch the complete documentation index at: https://docs.ocient.com/llms.txt
Use this file to discover all available pages before exploring further.
Source Fields
When you load data using a data pipeline, you reference extracted data using a special syntax named a source field reference. These source field references can behave differently depending on the data format. It is important to understand how to reference source field data and also how assigns the type of data you reference.Source Field References
You reference source fields in a pipeline using a named reference such as$my_field or a numeric field reference such as $1. For some data formats like JSON and , Ocient supports nested data access such as $my_field.sub_field.
Learn more about what source field references are supported for each data format in Data Formats for Data Pipelines.
Source Field Data Types
The Ocient System treats references to source fields (e.g.,$my_field) differently depending on the data format FORMAT specified in the data pipeline.
Text-Based Data Formats
For text-based formats such as DELIMITED and JSON, the Ocient System treats source fields as VARCHAR. For JSON data, the system treats source fields strictly as VARCHAR data even if the field value is represented logically in JSON as Boolean, integer, NULL, or double. Learn more about text-based formats in Load JSON Data and Load Delimited and CSV Data.
Binary-Based Data Format
For the binary-based format BINARY, Ocient treats source fields as the BINARY data type. Learn more about binary formats in Load Binary Data.
Parquet-Based Data Format
For the Parquet-based format PARQUET, Ocient encodes information about the data type of the fields during the load.
Supported Data Types
The data pipeline functionality supports all column data types.ST_POINT is not supported in a 3-coordinate format.Automatic Conversion and Casting
Loading pipelines apply automatic casting to the final values of your transformations and source fields where the data type does not match the target column type. The casting function that the Ocient System applies varies based on the SQL type of the data that results from your transformation and the SQL type of the target column. The system applies casting automatically in cases to resolve conflicts between data types:- Promotion of a type to expand to a wider data type (e.g.,
SMALLINTtoBIGINT) - Implicit casting from one family of data type to another (e.g.,
VARCHARtoTIMESTAMP)- Ocient supports implicit casting of all simple data types to
VARCHAR.
- Ocient supports implicit casting of all simple data types to
Auto-casting in Parquet does not support the automatic conversion to
VARCHAR columns. You must explicitly cast data to the CHAR data type when you convert Parquet data that is not string data to a VARCHAR column or VARCHAR function argument.- Implicit downcasting to a narrower data type (e.g.,
BIGINTtoSMALLINT)
Implicit Casting Support List
VARCHAR Source Data The Ocient System treats the source field reference fromJSON and DELIMITED formatted data as VARCHAR regardless of the logical type in the source.
Ocient supports the implicit casting of VARCHAR to all data types.
BINARY Source Data
The system treats the source field reference from BINARY-formatted data as raw bytes with automatic casting functions tailored to the BINARY data type.
Ocient supports implicit casting of BINARY to:
TINYINT,SMALLINT,INT,BIGINTFLOAT,DOUBLETIMESTAMP,DATE,TIME(asVARCHARdata)
For
BINARY or VARCHAR to BOOL:- You can represent
trueas:'t', 'true', 'yes', '1', 'y', 'on' - You can represent
falseas:'f', 'false', 'no', '0', 'n', 'off'
For
BINARY or VARCHAR to TIMESTAMP, DATE, and TIME, the automatic casting assumes that the character data has the format:TIMESTAMP—YYYY-MM-DD HH:mm:ss[.SSSSSSSSS]DATE—YYYY-MM-DDTIME—HH:mm:ss
TIMESTAMPdata implicitly downcasts toDATEandTIMEcolumns, truncating the unused portion.- ℹ️This option is available in version 25 or later.
- The system implicitly casts all simple types to
VARCHARcolumns except:HASHorBINARYDECIMALIPorIPV4
Data Type Promotion List
The data type promotion list describes how the Ocient System can automatically promote the specified type to a target column type. The system achieves the promotion using standard casting functions (e.g.,BIGINT(SMALLINT)). See Implicit Casting Support List for implicit casting, for example casting to VARCHAR.
| Source Data Type | Automatic Casting Precedence for Target Column Type |
|---|---|
BYTE or TINYINT | BYTE TINYINT SMALLINT INT BIGINTFLOAT DOUBLE |
SMALLINT | SMALLINT INT BIGINT FLOAT DOUBLE |
INT | INT BIGINT FLOAT DOUBLE |
BIGINT | BIGINTDOUBLETIMESTAMP TIME |
FLOAT | FLOAT DOUBLE |
DOUBLE | DOUBLE |
DECIMAL | DECIMAL |
BOOL | BOOLEAN |
UUID | UUID |
DATE | DATE TIMESTAMP |
TIME | TIME |
TIMESTAMP | TIMESTAMPSee Implicit Casting Support List for supported downcasting. |
IPV4 | IPV4 |
IP | IP |
BINARY | BINARYSee Implicit Casting Support List for supported cross-casting. |
ARRAY | ARRAY |
TUPLE | TUPLE |
VARCHAR | VARCHARSee Implicit Casting Support List for supported cross-casting. |
ST_POINT | ST_POINT, ST_LINESTRING, ST_POLYGON |
ST_LINESTRING | ST_LINESTRING, ST_POLYGON |
ST_POLYGON | ST_POLYGON |
BIGINTtoTIMEautomatic casting assumes a time value in milliseconds.BIGINTtoTIMESTAMPautomatic casting assumes milliseconds after the epoch. (January 1, 1970 at midnight UTC)- For complex types, the system applies the type precedence for automatic casting to the elements in the type.
- Automatic casting of integral types to DOUBLE and FLOAT are potentially lossy due to limits in precision of floating point representations.
- ℹ️This option is available in version 25 or later.
The system treats
BYTE and TINYINT aliases identically.
