Load Data
Data Types for Data Pipelines
source fields when you load data using a data pipeline, you reference extracted data using a special syntax named a source field reference these source field references can behave differently depending on the data format it is important to understand how to reference source field data and also how {{ocient}} assigns the type of data you reference source field references you reference source fields in a pipeline using a named reference such as $my field or a numeric field reference such as $1 for some data formats like json and {{parquet}} , ocient supports nested data access such as $my field sub field learn more about what source field references are supported for each data format in docid 1 jhzrhblgnqucl6skiaq source field data types the ocient system treats references to source fields (e g , $my field ) differently depending on the data format format specified in the data pipeline text based data formats for text based formats such as delimited and json , the ocient system treats source fields as varchar for json data, the system treats source fields strictly as varchar data even if the field value is represented logically in json as boolean, integer, null, or double learn more about text based formats in docid 1 jhzrhblgnqucl6skiaq and docid 1 jhzrhblgnqucl6skiaq binary based data format for the binary based format binary , ocient treats source fields as the binary data type learn more about binary formats in docid 1 jhzrhblgnqucl6skiaq parquet based data format for the parquet based format parquet , ocient encodes information about the data type of the fields during the load supported data types the data pipeline functionality supports all column data types except tuples of tuples tuples of arrays st point is not supported in a 3 coordinate format when you load data into a column type, in many cases, ocient automatically casts the source data to the target column however, in some cases, you might need to cast data to the target column type explicitly learn more about supported casting functions in docid\ aimcafoydn2xf fgqssys automatic conversion and casting loading pipelines apply automatic casting to the final values of your transformations and source fields where the data type does not match the target column type the casting function that the ocient system applies varies based on the sql type of the data that results from your transformation and the sql type of the target column the system applies casting automatically in cases to resolve conflicts between data types promotion of a type to expand to a wider data type (e g , smallint to bigint ) implicit casting from one family of data type to another (e g , varchar to timestamp ) ocient supports implicit casting of all simple data types to varchar auto casting in parquet does not support the automatic conversion to varchar columns you must explicitly cast data to the char data type when you convert parquet data that is not string data to a varchar column or varchar function argument automatic casting does not apply in some situations where data loss might occur implicit downcasting to a narrower data type (e g , bigint to smallint ) where automatic casting is not supported, you can explicitly cast data or use built in transformation functions according to your requirements see docid\ aimcafoydn2xf fgqssys for more details implicit casting support list varchar source data the ocient system treats the source field reference from json and delimited formatted data as varchar regardless of the logical type in the source ocient supports the implicit casting of varchar to all simple data types these types are not supported geospatial types arrays tuples matrixes binary source data the system treats the source field reference from binary formatted data as raw bytes with automatic casting functions tailored to the binary data type ocient supports implicit casting of binary to tinyint , smallint , int , bigint float , double timestamp , date , time (as varchar data) for binary or varchar to bool you can represent true as 't', 'true', 'yes', '1', 'y', 'on' you can represent false as 'f', 'false', 'no', '0', 'n', 'off' for binary or varchar to timestamp , date , and time , the automatic casting assumes that the character data has the format timestamp — yyyy mm dd hh\ mm\ ss\[ sssssssss] date — yyyy mm dd time — hh\ mm\ ss other implicit casts or downcasts the system also supports some other implicit casts that are not type promotions timestamp data implicitly downcasts to date and time columns, truncating the unused portion ℹ️this option is available in version 25 or later all simple types implicitly cast to varchar columns except hash or binary decimal ip or ipv4 array , tuple , matrix for geospatial data types, see docid 4dwngsvovxrfnjvlzdbcv data type promotion list the data type promotion list describes how the ocient system can automatically promote the specified type to a target column type the system achieves the promotion using standard casting functions (e g , bigint(smallint) ) see /#implicit casting support list for implicit casting, for example casting to varchar source data type automatic casting precedence for target column type byte or tinyint byte tinyint smallint int bigint float double smallint smallint int bigint float double int int bigint float double bigint bigint double timestamp time float float double double double decimal decimal bool boolean uuid uuid date date timestamp time time timestamp timestamp see /#implicit casting support list for supported downcasting ipv4 ipv4 ip ip binary binary see /#implicit casting support list for supported cross casting array array tuple tuple varchar varchar see /#implicit casting support list for supported cross casting st point st point , st linestring , st polygon st linestring st linestring , st polygon st polygon st polygon bigint to time automatic casting assumes a time value in milliseconds bigint to timestamp automatic casting assumes milliseconds after the epoch (january 1, 1970 at midnight utc) for complex types, the system applies the type precedence for automatic casting to the elements in the type automatic casting of integral types to double and float are potentially lossy due to limits in precision of floating point representations ℹ️this option is available in version 25 or later the system treats byte and tinyint aliases identically related links docid\ xq0tg7yph vn62uwufibu docid 1 jhzrhblgnqucl6skiaq docid\ l8tdfpfzzvzeyabc2h7bq docid\ a3n4wkcawrpo1gtefetmm