Understanding Data Types
Ocient supports many common data types, such as INT, SHORT, DATE, and TIMESTAMP, as well as complex types like ARRAY and TUPLE and specialized types like POINT and LINESTRING for geospatial analysis. For detailed information about supported data types, refer to the Data Types page.
All of the following data types can be used as column types in tables and in queries. In queries, Scalar Data Conversion Functions convert values between types, and the functions and aggregates available depend on the data type. For example, numeric types can be summed, averaged, or processed with other statistical functions, and character types can be trimmed, manipulated, or aggregated into a single string.
Basic data types such as INT, BOOLEAN, and VARCHAR behave as they do in other databases. Complex types such as ARRAY and TUPLE can be queried and inspected like any other column, and they also have special operators for interrogating the data they contain.
Numeric and Boolean types represent data with different precision and scale. Integer values range from a 1-byte signed integer up to an 8-byte signed integer. All integers are signed in Ocient. Floating-point values are available in single or double precision, and exact decimal values are available with the DECIMAL type. The BOOLEAN type represents true or false in the database.
All numeric types are fixed-length data types.
- BIGINT — 8-byte signed integer
- INT — 4-byte signed integer
- SMALLINT — 2-byte signed integer
- TINYINT — 1-byte signed integer (alias: BYTE)
- FLOAT — 4-byte single precision floating-point number (alias: REAL or SINGLE PRECISION)
- DOUBLE — 8-byte double precision floating-point number (alias: DOUBLE PRECISION)
- DECIMAL(P,S) — Exact numerical decimal value with precision P and scale S
- BOOLEAN — 1-byte logical Boolean representing true or false
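The value ranges implied by the byte widths above can be checked with a short Python sketch. It is illustrative only and does not call Ocient: the helper names `signed_range` and `round_trip_single` are hypothetical, and Python's `struct` and `decimal` modules stand in for the FLOAT and DECIMAL types to show the difference between approximate and exact values.

```python
import struct
from decimal import Decimal


def signed_range(num_bytes):
    """Return (min, max) for a two's-complement signed integer of num_bytes bytes."""
    bits = num_bytes * 8
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


# Byte widths taken from the list above.
INTEGER_WIDTHS = {"TINYINT": 1, "SMALLINT": 2, "INT": 4, "BIGINT": 8}
INTEGER_RANGES = {name: signed_range(width) for name, width in INTEGER_WIDTHS.items()}


def round_trip_single(value):
    """Store a Python float in 4 bytes (single precision) and read it back."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


# 0.1 has no exact binary representation, so a 4-byte float only approximates it,
# while an exact decimal type keeps the literal value.
approx = round_trip_single(0.1)
exact = Decimal("0.1") + Decimal("0.2")

if __name__ == "__main__":
    for name, (low, high) in INTEGER_RANGES.items():
        print(f"{name}: {low} to {high}")
    print("4-byte float round trip of 0.1:", approx)
    print("Exact decimal 0.1 + 0.2:", exact)
```

The same reasoning suggests choosing DECIMAL over FLOAT or DOUBLE for values such as currency amounts, where exact results matter.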
Character types represent text and character data. Both fixed-length and variable-length options are available. Fixed-length types offer some storage and query-engine benefits where they apply, but indexes that target variable-length character columns are also available. Binary types represent byte array data.
- CHAR(N) — Variable-length character string with length N (alias: CHARACTER(N))
- VARCHAR(N) — Variable-length character string with maximum length of N
- BINARY(N) — Fixed-length binary array with length N (alias: HASH(N))
- VARBINARY(N) — Variable-length binary array with maximum length N (alias: HASH(N))
Date and time types represent time in Ocient. Timestamps take on special capabilities when defined as a Time Key on a table, as described in more detail in Time Keys, Clustering Keys, and Indexes. Ocient provides many functions for operating on time series data.
- DATE — 4-byte calendar date representing year, month, day
- TIMESTAMP — 8-byte date and time with nanosecond precision in UTC time zone
- TIME — 8-byte time of day in nanoseconds
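To give a feel for what an 8-byte, nanosecond-precision value can hold, here is a minimal Python sketch. It assumes the TIMESTAMP is a signed 64-bit count of nanoseconds since 1970-01-01 00:00:00 UTC. The source states only the width, precision, and time zone, so the epoch and signedness are assumptions, and the function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

NANOS_PER_SECOND = 1_000_000_000
NANOS_PER_DAY = 24 * 60 * 60 * NANOS_PER_SECOND
INT64_MAX = 2 ** 63 - 1

# ASSUMPTION: nanoseconds are counted from the Unix epoch in UTC.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_nanos(moment):
    """Convert a timezone-aware datetime to integer nanoseconds since the epoch."""
    delta = moment - EPOCH
    whole_seconds = delta.days * 86_400 + delta.seconds
    return whole_seconds * NANOS_PER_SECOND + delta.microseconds * 1_000


def latest_representable():
    """Latest instant, truncated to whole seconds, that fits in a signed 64-bit nanosecond count."""
    return EPOCH + timedelta(seconds=INT64_MAX // NANOS_PER_SECOND)


if __name__ == "__main__":
    print("One second after the epoch:", to_epoch_nanos(EPOCH + timedelta(seconds=1)))
    print("Latest representable instant:", latest_representable().isoformat())
    print("Nanoseconds in one day:", NANOS_PER_DAY)
```

Under these assumptions, a signed 64-bit nanosecond count covers roughly 292 years on either side of the epoch, and a full day of nanoseconds fits comfortably in the 8 bytes used by TIME.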
Ocient provides data types for IP addresses and UUIDs. These types let schemas take advantage of improved searching and analysis of both kinds of values and make it easier to load structured data into them. The IP type is designed for IPv6 and can also handle IPv4 addresses.
- IPV4 — 4-byte Internet Protocol version 4 (IPv4) address
- IP — 16-byte Internet Protocol version 6 (IPv6) address
- UUID — 16-byte universally unique identifier (e.g., 01234567-89ab-cdef-1357-0123456789ab)
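The byte widths above can be illustrated with Python's standard `ipaddress` and `uuid` modules. This is a sketch, not a description of Ocient's storage format: it assumes that an IPv4 address placed in a 16-byte IPv6 slot uses the standard IPv4-mapped form (`::ffff:a.b.c.d`), and the helper name `to_ipv6` is hypothetical.

```python
import ipaddress
import uuid


def to_ipv6(address):
    """Return a 16-byte IPv6 form of an IPv4 or IPv6 address string.

    ASSUMPTION: IPv4 addresses are widened using the IPv4-mapped
    convention, which places the 32-bit address after a 0xffff marker.
    """
    parsed = ipaddress.ip_address(address)
    if parsed.version == 4:
        return ipaddress.IPv6Address((0xFFFF << 32) | int(parsed))
    return parsed


# The example UUID from the list above.
EXAMPLE_UUID = uuid.UUID("01234567-89ab-cdef-1357-0123456789ab")

if __name__ == "__main__":
    v4 = ipaddress.ip_address("192.0.2.1")
    print("IPv4 bytes:", len(v4.packed))
    print("Widened to IPv6 bytes:", len(to_ipv6("192.0.2.1").packed))
    print("Native IPv6 bytes:", len(to_ipv6("2001:db8::1").packed))
    print("UUID bytes:", len(EXAMPLE_UUID.bytes))
```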
Geospatial analysis is a rich capability in Ocient enabled by special data types and functions. The foundation rests on the POINT, LINESTRING, and POLYGON types, which represent points, lines, and shapes on the Earth. From these types and composites of them, many proximity, intersection, and computational geospatial analytics can be performed at very large scale.
- POINT — Geometric point in 2 dimensions.
- LINESTRING — Geometric type composed of N POINTS.
- POLYGON — Geometric type composed of N closed LINESTRINGS.
Learn more about working with geospatial types in the Geospatial Functions section.
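The way the three geospatial types build on one another can be sketched with plain Python data classes. This models only the composition described in the list (points, linestrings made of points, polygons made of closed linestrings). The class names mirror the type names, but the classes themselves are hypothetical, and the rule that a closed ring needs at least four points is an assumption borrowed from common geometry conventions rather than from the source.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Point:
    """A geometric point in 2 dimensions."""
    x: float
    y: float


@dataclass(frozen=True)
class LineString:
    """An ordered sequence of N points."""
    points: List[Point]

    def is_closed(self):
        # ASSUMPTION: a closed ring has at least 4 points and ends where it starts.
        return len(self.points) >= 4 and self.points[0] == self.points[-1]


@dataclass(frozen=True)
class Polygon:
    """A shape made of N closed linestrings."""
    rings: List[LineString]

    def __post_init__(self):
        for ring in self.rings:
            if not ring.is_closed():
                raise ValueError("every ring of a polygon must be a closed linestring")


# A unit square: one closed ring of five points (the first repeated at the end).
SQUARE_RING = LineString([Point(0, 0), Point(1, 0), Point(1, 1), Point(0, 1), Point(0, 0)])
UNIT_SQUARE = Polygon([SQUARE_RING])
```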
Arrays and Tuples are SQL container types that store other SQL types. They allow complex schemas to be represented as a column in a table instead of as separate, normalized tables. In some situations, using Arrays, Tuples, or Arrays of Tuples can lead to significant performance benefits. Use the Matrix type to store one- or two-dimensional arrays of fixed-size data types for matrix computations.
- ARRAY — An array of any supported type other than ARRAY.
- MATRIX — A one- or two-dimensional mathematical matrix used for matrix calculations.
- TUPLE — A tuple of elements of different types. A tuple is similar to a structure (struct) in programming languages. You can include Tuples in arrays.
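The idea of replacing a separate, normalized table with an array of tuples can be pictured in plain Python. This is a conceptual sketch only: the order and line-item data, the `LineItem` type, and the `nest` helper are all hypothetical, and the code says nothing about how Ocient stores or queries these types.

```python
from typing import NamedTuple


class LineItem(NamedTuple):
    """Plays the role of a TUPLE with two differently typed elements."""
    sku: str
    quantity: int


# Normalized form: two tables linked by order_id.
ORDERS = [(1, "alice"), (2, "bob")]
ORDER_ITEMS = [(1, "A-100", 2), (1, "B-200", 1), (2, "A-100", 5)]


def nest(orders, order_items):
    """Fold the child rows into an array-of-tuples column on each parent row."""
    items_by_order = {}
    for order_id, sku, quantity in order_items:
        items_by_order.setdefault(order_id, []).append(LineItem(sku, quantity))
    return [
        (order_id, customer, items_by_order.get(order_id, []))
        for order_id, customer in orders
    ]


# Nested form: one row per order, with the items held in a single column.
NESTED_ORDERS = nest(ORDERS, ORDER_ITEMS)

if __name__ == "__main__":
    for order_id, customer, items in NESTED_ORDERS:
        total = sum(item.quantity for item in items)
        print(order_id, customer, items, "total quantity:", total)
```

In the nested form, reading an order together with its items touches a single row, which is one way container types can avoid the join that the normalized layout requires.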
Learn more about complex data types in the following sections: