Data Types
The following data types are supported in table columns and queries.
Name | Description | String Representation | Example Usage |
BIGINT | 8-byte signed integer | - | 9876543210 |
BINARY(N) or HASH(N) | Fixed-length binary array with length N | "xxxx…" (a string of 2N hexadecimal digits representing N bytes) | BINARY(4): '01234567', BINARY(3): 'abcdef' |
BOOLEAN | 1-byte logical Boolean (false or true) | ["true", "false", "t", "f", "yes", "no", "on", "off", "y", "n", "1", "0"] | 'true', '0' |
CHARACTER(N) or CHAR(N) | Variable-length character string with length N | - | CHAR(4): 'This is a string' |
DATE | 4-byte calendar date (year, month, day) | ["yyyy:MM:dd", "yyyy-MM-dd"] | '2020-02-02', '2000:01:01' |
DECIMAL(P,S) | Exact numerical with precision P and scale S | - | 123.45 |
DOUBLE PRECISION or DOUBLE | 8-byte double precision floating-point number | - | 3.141592 |
INT | 4-byte signed integer | - | 123456789 |
IPV4 | 4-byte Internet Protocol version 4 | "xxx.xxx.xxx.xxx" | '127.0.0.1' |
IP | 16-byte Internet Protocol version 6 (can also hold IPV4) | ["xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:xxxx", "::x", "xxx.xxx.xxx.xxx"] | '0123:4567:89ab:cdef:0123:4567:89ab:cdef', '1111::0', '127.0.0.1' |
POINT or ST_POINT | Geometric point in 2 dimensions | [POINT("lon lat")] | 'POINT(-87.6410 41.8841)' |
LINESTRING or ST_LINESTRING | Geometric type composed of N st_points | [LINESTRING("lon1 lat1, …, lonN latN")] | 'LINESTRING(0 0,2 0)' |
POLYGON or ST_POLYGON | Geometric type composed of N closed st_linestrings | [POLYGON(("lon1 lat1, …, lonN latN"), ("lon1 lat1, …, lonN latN"))] | 'POLYGON((0 0,2 0,5 5,0 2,0 0), (0 0,1 0,2 2,0 2,0 0))' |
REAL or FLOAT or SINGLE PRECISION | 4-byte single precision floating-point number | - | 2.718 |
SMALLINT | 2-byte signed integer | - | 32767 |
TIME | 8-byte time of day in nanoseconds | "HH:mm:ss.FFFFFFFFF" | '12:34:56.012345678' |
TIMESTAMP | 8-byte date and time in nanoseconds with no associated time zone. | ["yyyy:MM:ddTHH:mm:ss", "x"] | '2000:01:02T12:34:45', 1234567890000000000 |
TINYINT or BYTE | 1-byte signed integer | - | 127 |
TUPLE<<type1, type2, …>> | Tuple of elements of different types | "value1, value2, …" | 1,0xafed,1,1.0,2.0,626.3,2000-05-20 06:43:49.999999999,0xaf1325,'some string',f716:a70d:8319:c4ee:d5f3:711e:b7c0:e0be,1999-12-31,true |
TYPE[] | Array type. TYPE is any supported data type besides ARRAY. (e.g., INT[] for an array of integers) | "type[value1, value2, …]" | 'bigint[]', 'bigint[1,2,3]', 'tuple<<bigint,bigint>>[tuple<<bigint,bigint>>(1,2), tuple<<bigint,bigint>>(3,4)]', [] can be used for empty arrays. |
UUID | 16-byte universally unique identifier | "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" | '01234567-89ab-cdef-1357-0123456789ab' |
VARBINARY(N) | Variable-length binary array with maximum length of N | "xxxx…" (a string of hexadecimal digits representing at most N bytes) | VARBINARY(4): 'aabbccddeeff' |
VARCHAR(N) | Variable-length character string with maximum length of N | - | VARCHAR(4): 'This is a variable length string' |
Integer literals in SQL statements are of type BIGINT unless explicitly cast to another type.
Floating point literals in SQL statements are of type DOUBLE unless explicitly cast to another type.
TRUE and FALSE can be used in SQL statements as valid BOOLEAN literals.
Floating point values must be strictly numeric. Infinity and NaN values are unsupported.
The decimal type supports a maximum precision of 31 and a scale of one less than the precision.
Interval types can only be used within an expression. If the final result of an outermost expression is an interval type, it will automatically be converted to a BIGINT and will lose units information.
- "yyyy" : The year as a four-digit number.
- "MM" : The month, from 01 through 12.
- "dd" : The day of the month, from 01 through 31.
- "HH" : The hour, using a 24-hour clock from 00 to 23.
- "mm" : The minute, from 00 through 59.
- "ss" : The second, from 00 through 59.
- "FFFFFFFFF" : If non-zero, the billionths of a second in a date and time value.
Variable-length columns have limits to how large the value can be.
Data types with a 124 KiB size limit:
- TUPLE<<type1, type2, …>> : Tuple value with a VARCHAR(N) or VARBINARY(N) type
- ST_LINESTRING
- ST_POLYGON
Data types with a 512 MiB size limit:
- TYPE[] : The inner values of the array are subject to the size limits of the inner type.
- VARBINARY(N)
- VARCHAR(N)
When you query data, if the database must analyze a value that exceeds 124 KiB at query execution time, the system increases memory usage. Large computed values are non-freeable during query execution and the system cannot offload to temporary disk. Queries that operate on computed large values with many rows can cause the system to run out of memory and result in the kill of the query.
Array is a SQL type container that stores multiple elements of the same SQL type. Arrays can have zero or any number of values. For instance, an INT array can contain many integers including NULL. Arrays in can store any SQL type; i.e: IPV4, IP, BOOLEAN, BYTE, SHORT, INT, LONG, FLOAT, DOUBLE, TIME, TIMESTAMP, CHAR, DATE, BINARY, HASH, ST_POINT, UUID, DECIMAL and TUPLE.
Arrays have special functions and operators that are described in the following sections:
Tuple is a SQL type container that is meant to store elements of different types. Ocient supports tuple using any of the supported SQL types; i.e: IPV4, IP, BOOLEAN, BYTE, SHORT, INT, BIGINT, FLOAT, DOUBLE, TIME, TIMESTAMP, CHAR, DATE, BINARY, HASH, ST_POINT, UUID and DECIMAL. In Ocient, tuples can either exist as single columns or as SQL array elements. There is currently no support for storing nested tuples or tuples of arrays.
Tuple functions and operators are described in the following sections:
Matrix is a SQL type container. It is a fixed-size, 1 or 2-dimensional array (mathematical matrix) of the same fixed-size SQL datatypes. Currently, Matrix only supports storing doubles. Matrix dimensions must be non-zero.
Matrix functions and operators are described in the following sections:
Ocient supports three different geospatial geographies: POLYGONS, LINESTRINGS, and POINTS.
- A POLYGON can be constructed with a closed LINESTRING or an outer shell and array of inner rings.
- A LINESTRING represents a series of POINTS connected by line segments. It can be constructed with either LINESTRINGS or POINTS.
- A POINT represents a point in space defined by an (x, y)/(long, lat) coordinate pair.
Polygon semantics in the Ocient system follow the OGC standard (ISO 19125-1) for LinearRings, where the exterior and holes can be meaningfully oriented either counter-clockwise or clockwise. To conceptualize this, imagine walking along the exterior of the polygon in the order of its defined points, so everything to your left is the "interior" of the polygon. A counter-clockwise polygon is the typical orientation where, using an example of a state boundary, the interior of the state corresponds to the interior of the polygon. If you define the state boundary using the clockwise orientation, the interior of the polygon represents everything except the state.
If a counter-clockwise exterior polygon has holes, the holes should be clockwise-oriented.
Ensure that both the data and polygon literals are oriented in the way you intend when you use them in queries.
Many other geospatial implementations use an object-oriented approach to geospatial data types by implementing a GEOGRAPHY union type that might contain any combination of these types: POINT, LINESTRING, or POLYGON.
Ocient supports GEOGRAPHY slightly differently by using POLYGON as an implicit union type. This container type means unclosed POLYGON types with no holes are equivalent to LINESTRING types, and that a POLYGON might contain a single POINT. This definition of a POLYGON differs from other implementations, where such POLYGON types would be considered degenerate. During loading, Ocient is able to convert incoming GEOGRAPHY data to the appropriate POLYGON equivalent. Ocient geospatial functions appropriately handle these cases as needed.
Similar to geography union types, many other geospatial implementations define an attribute table of column identifiers and an associated SRID. The system uses the SRID lookup to define semantics of functions such as st_area, which uses different calculations if the specified polygon is planar, defined on an ideal sphere, or on a spheroid. The identifiers also specify units of measure and the number of variables used to define a point in the reference system. However, Ocient only supports the storage of latitude and longitude coordinates for its geography types, and uses only GCS WGS 84 (EPSG code 4326) semantics. Rather than applying a transformation like you might in other implementations to change units or model accuracy, the geospatial measurement functions in the Ocient system allow you to specify the units of measure directly, as well as whether to use a faster idealized sphere model of the Earth, or a more accurate spheroidal model.
Ensure that you set the units and spheroidal model correctly when performing critical measurements. Also, be careful when you export data from other databases to Ocient so that the incoming data is in GCS WGS 84 format. Ensure that you use the Loading and Transformation pipeline to prepend or strip SRID definitions from EWKT or EWKB formatted data. Internally, the Ocient database has no concept of SRIDs, however it does support parsing and emitting SRID-prepended data for compatibility with external tools. When the database parses SRID-prepended data, the system ignores the SRID component and parses the data as a GCS WGS 84 defined geography.
Each Geospatial geography has a number of functions and operators that can be used in SQL queries to perform analyses. Some functions apply to specific geography types. For more function and operator specifics, refer to the following sections: