> ## Documentation Index
> Fetch the complete documentation index at: https://docs.ocient.com/llms.txt
> Use this file to discover all available pages before exploring further.

# CREATE TABLE SQL Statement Examples

export const TimeKey = "TimeKey®";

export const OcientGeo = "OcientGeo™";

export const OcientDataIntelligencePlatform = "OcientAIQ™ Unified Data Platform";

export const Ocient = "Ocient®";

Tables in the {OcientDataIntelligencePlatform} contain various configuration options that can impact performance, reduce storage requirements, or provide extra data safeguards.

As some table configuration options cannot be added after data is already loaded, it is important that you take the time to design `CREATE TABLE` SQL statements with forethought for the queries that you expect to execute frequently. This topic provides examples of `CREATE TABLE` SQL statements designed for different use cases and describes the configuration choices.

For syntax and parameter information, see [CREATE TABLE](/tables#create-table) .

## Create a Table with All Data Types

This example creates a table with all supported data types in {Ocient}. The intent of this example is to show the required formatting for different data type values, which is shown as the `DEFAULT` value for each column.

```sql SQL theme={null}
CREATE TABLE data_type_example (
 	col_bigint BIGINT NOT NULL DEFAULT 9876543210,
 	col_binary BINARY(3) NOT NULL DEFAULT '0xabcdef',
	col_boolean BOOL NOT NULL DEFAULT TRUE,
 	col_char CHAR(4) NOT NULL DEFAULT 'val',
 	col_date DATE NOT NULL DEFAULT '2000-01-01',
 	col_decimal DECIMAL(18,4) NOT NULL DEFAULT 123.45,
  	col_double DOUBLE NOT NULL DEFAULT 3.141592,
 	col_float FLOAT NOT NULL DEFAULT 2.718,
 	col_int INT NOT NULL DEFAULT 123456789,
 	col_int_array INT[] NOT NULL DEFAULT 'INT[0,1,2,3]',
 	col_ipv4 IPV4 NOT NULL DEFAULT '127.0.0.1',
 	col_ip IP NOT NULL DEFAULT '0123:4567:89ab:cdef:0123:4567:89ab:cdef',
 	col_smallint SMALLINT NOT NULL DEFAULT 32767,
  	col_matrix MATRIX[2][3] NOT NULL DEFAULT '{ {0, 0, 0}, {0, 0, 0} }',
	col_point POINT NOT NULL DEFAULT 'POINT(0 0)',
	col_linestring LINESTRING NOT NULL DEFAULT 'LINESTRING(0 0, 1 1)',
	col_polygon POLYGON NOT NULL DEFAULT 'POLYGON((0 0, 0 1, 1 1, 1 0, 0 0))',
	col_time TIME NOT NULL DEFAULT '12:34:56.012345678',
	col_timestamp TIMESTAMP NOT NULL DEFAULT '2000-01-02 12:34:45',
	col_tinyint TINYINT NOT NULL DEFAULT 127,
	col_tuple TUPLE<<INT, INT>> NOT NULL DEFAULT 'tuple<<INT,INT>>(0,0)',
	col_array_of_tuples TUPLE<<INT, INT>>[] NOT NULL DEFAULT 'TUPLE<<INT,INT>>[tuple<<INT,INT>>(1,2),tuple<<int,int>>(3,4)]',
	col_uuid UUID NOT NULL DEFAULT '01234567-89ab-cdef-1357-0123456789ab',
	col_varbinary VARBINARY NOT NULL DEFAULT '0xaabbccddeeff',
	col_varchar VARCHAR NOT NULL DEFAULT 'This is a variable length string'
);
```

## Create a Table for Real-Time Analysis

This example assumes you plan to ingest large quantities of time series data for rapid querying. The most important columns that you plan to reference in queries frequently include:

* `created_at` — A granular timestamp column that represents the primary time reference in the table.
* `user_id` — An integer column to identify each unique user.
* `sell_id` — An integer column representing each transaction by a user.

```sql SQL theme={null}
CREATE TABLE IF NOT EXISTS "transact_data" (
    "created_at" TIMESTAMP TIME KEY BUCKET(1, HOUR) NOT NULL,
    "user_id" BIGINT NOT NULL DEFAULT 0,
    "sell_id" BIGINT NOT NULL,
    "purchase_amount" DOUBLE NOT NULL DEFAULT 0,
    "buyer_name" VARCHAR,
    CLUSTERING KEY "ck" ("sell_id", "user_id")
);

CREATE INDEX "purchase_idx" ON "transact_data" ("purchase_amount");
CREATE INDEX "buyer_idx" ON "transact_data" ("buyer_name")
    USING NGRAM(4);
```

This example table makes use of these Ocient System configurations.

| **Column Name**      | **Configuration Description**                                                                                                                                                                                                                                                    |
| -------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `created_at`         | This column is the table {TimeKey}, which means that the database indexes for filter optimization. <br />The example configures the TimeKey to partition segments by every hour, `(1, HOUR)`, which is the approximate time granularity you expect to use in your query filters. |
| `user_id`, `sell_id` | The example table identifies these two columns as the table Clustering Key. The database pairs these two columns for quick reference, such as filtering one column for row values of the other.                                                                                  |
| `purchase_amount`    | The `CREATE INDEX` SQL statement adds a secondary index to this column. <br />As a `BIGINT` data type, the `purchase_amount` column defaults to the `INVERTED` index.                                                                                                            |
| `buyer_name`         | This `CREATE INDEX` SQL statement adds a secondary index to this column. <br />As a `VARCHAR` column, the `buyer_name` column normally defaults to a `HASH` index, but this example specifies it should be an `NGRAM` index of width `4` instead.                                |

For details about using the TimeKey and Clustering Key functionality, see [TimeKeys and Clustering Keys](/timekeys-and-clustering-keys).

For details about other index types, see [Secondary Indexes](/secondary-indexes).

## Create a Table for Geospatial Data

This example assumes you plan to ingest geospatial data linked to time series data using {OcientGeo}. The most important columns that you plan to reference frequently in queries include:

* `route_time` — A granular timestamp column that represents the primary time reference in the table.
* `start_point`, `end_point` — The geospatial point values for the start and destination locations.
* `trip_id` — An integer column to identify each unique trip.
* `vehicle_id` — An integer column that represents each vehicle.

```sql SQL theme={null}
CREATE TABLE "trip_details" (
    "route_time" TIMESTAMP TIME KEY BUCKET(14, DAY) NOT NULL,
    "start_point" POINT NULL,
    "end_point" POINT NOT NULL DEFAULT 'POINT(0 0)',
    "route_path" LINESTRING NOT NULL
        DEFAULT 'LINESTRING(0 0, 1 1)',
    "route_area" POLYGON NOT NULL
        DEFAULT 'POLYGON((0 0, 0 1, 1 1, 1 0, 0 0))',
    "trip_id" BIGINT NOT NULL,
    "vehicle_id" BIGINT NOT NULL,
    CLUSTERING KEY "ck" ("trip_id", "vehicle_id")
);

CREATE INDEX "start_idx" on "trip_details" ("start_point")
    USING SPATIAL;
CREATE INDEX "end_idx" on "trip_details" ("end_point")
    USING SPATIAL;
```

This example table makes use of these Ocient System configurations.

| **Column Name**                         | **Configuration Description**                                                                                                                                                                                                                                                          |
| --------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `route_time`                            | This column is the table TimeKey, which means that the database indexes for filter optimization.<br />The example configures the TimeKey to partition segments by every two weeks `(14, DAY)`, which is the approximate time granularity that you expect to use in your query filters. |
| `trip_id`, `vehicle_id`                 | The example table identifies these two columns as the table Clustering Key. The database pairs these two columns for quick reference, such as filtering one column for row values of the other.                                                                                        |
| `end_point`, `route_path`, `route_area` | These columns have `POINT`, `LINESTRING`, and `POLYGON` data types, respectively. <br />Each of these geospatial columns includes default values in its respective format.                                                                                                             |
| `start_point`, `end_point`              | The `CREATE INDEX` SQL statements for both of these columns assign `SPATIAL` indexes for quicker reference. This index is the default type for any column with a geospatial data type.                                                                                                 |

For details about OcientGeo functionality, see [Geospatial Functions](/geospatial-functions).

## Create a Table With Compression Options

This example table includes multiple compression options to ensure minimal storage for specific columns and extra performance.

<Info>
  Most data types, except arrays, have `COMPRESSION DYNAMIC` applied by default.
</Info>

```sql SQL theme={null}
CREATE TABLE "call_records" (
    "utc_timestamp" TIMESTAMP TIME KEY BUCKET(1, DAY) NOT NULL,
    "sectorid" VARCHAR
        COMPRESSION ZSTD,
    "address" VARCHAR
        COMPRESSION NONE,
    "call_type" VARCHAR
        COMPRESSION ZSTD
        COMPRESSION GDC(4),
    "access_information" VARCHAR(500)
        COMPRESSION GDC(4),
    "file_name" VARCHAR
        COMPRESSION ZSTD compression_level = 3,
        dictionary_size = 1048576,
    CLUSTERING KEY "ck" ("call_type", "access_information")
);
```

This example table makes use of these Ocient System configurations.

| **Column name**                   | **Configuration Description**                                                                                                                                                                                                                                                                                                                                                                                                               |
| --------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `address`                         | This column explicitly removes compression by specifying `COMPRESSION NONE` in the `CREATE TABLE` SQL statement.                                                                                                                                                                                                                                                                                                                            |
| `call_type`, `access_information` | These columns have `VARCHAR` data types configured with Global Dictionary Compression (GDC). <br />This compression scheme maps the variable-length type rows to integers. Among other benefits, this compression allows both columns to support a Clustering Key for faster querying. <br />The `call_type` column stacks multiple compression schemes by including both GDC and ZSTD compression to provide additional storage reduction. |
| `file_name`                       | This column uses ZSTD compression with extra options. The `compression_level` is set for `3`, meaning the column data is slightly more compressed than the default `0`. The dictionary\_size is set for the maximum value, `1048576`, which means greater compression but higher memory demand.                                                                                                                                             |

For details about Ocient compression schemes, see [Table Compression Options](/table-compression-options).

## Create a Table for Various Container Types

This example table includes complex container data types, including a two-dimensional array, a tuple with variable-length values, and a tuple with various data types.

```sql SQL theme={null}
CREATE TABLE "container_data" (
    "utc_timestamp" TIMESTAMP TIME KEY BUCKET(1, DAY) NOT NULL,
    "user_id" BIGINT NOT NULL DEFAULT 0,
    "sell_id" BIGINT NOT NULL DEFAULT 0,
    "point_array_col" POINT[],
    "array_col" VARCHAR[] COMPRESSION GDC(4),
    "2D_array_col" INT[][],
    "tuple_col1" TUPLE<<
        VARCHAR(255),
        VARCHAR(500)
        >>,
    "tuple_col2" TUPLE<<
        INT,
        INT[],
        VARCHAR COMPRESSION GDC(1) COMPRESSION ZSTD,
        VARCHAR
        >>,
    "matrix_col" MATRIX[5][10],
    CLUSTERING INDEX "ck" ("user_id", "sell_id")
);

CREATE INDEX "point_idx" on "container_data" ("point_array_col")
    USING SPATIAL;
CREATE INDEX "tuple_idx" on "container_data" ("tuple_col1"[1])
    USING HASH;
CREATE INDEX "tuple_idx2" on "container_data" ("tuple_col2"[2])
    USING INVERTED;
```

This example table makes use of these Ocient System configurations.

| **Column name**   | **Configuration Description**                                                                                                                                                                                                                                                                                                                          |
| ----------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| `point_array_col` | This column is an array of `POINT` values that is indexed using a `SPATIAL` index type as specified by the `CREATE INDEX` SQL statement.                                                                                                                                                                                                               |
| `tuple_col1`      | This tuple column contains two variable-length values. <br />Each object in the tuple can support its own compression scheme and index. The `CREATE INDEX` SQL statement assigns a `HASH` index to the first value (`VARCHAR(255)`).                                                                                                                   |
| `tuple_col2`      | This tuple column contains various data types. Each object in the tuple can support its own compression scheme and index. <br />The two `VARCHAR` objects in the tuple column each have their own unique compression scheme. <br />The `CREATE INDEX` SQL statement assigns an `INVERTED` index to the second tuple object, the integer array `INT[]`. |

For details about Ocient container types, see [Array, Tuple, and Matrix Overview](/array-tuple-and-matrix-overview).

## Create a Table With a Retention Policy

This example creates a retention policy that keeps rows only if they are less than one day old, as specified by the last line, `RETENTION POLICY AGE 1 DAY`. The retention policy depends on the `created_at` column because it is the TimeKey column. The TimeKey bucket value `(1, HOUR)` determines how frequently the system checks for any rows to remove.

```sql SQL theme={null}
CREATE TABLE IF NOT EXISTS "transact_data" (
    "created_at" TIMESTAMP TIME KEY BUCKET(1, HOUR) NOT NULL,
    "user_id" INT NOT NULL DEFAULT 0,
    "sell_id" BIGINT NOT NULL,
    "purchase_amount" BIGINT NOT NULL DEFAULT 0,
    "buyer_name" VARCHAR(1048576),
    CLUSTERING KEY "primary_index" ("sell_id", "user_id")
) RETENTION POLICY AGE 1 DAY;
```

For details about configuring table retention, see [Table Retention Policies](/table-retention-policies).

## Related Links

[Data Types](/data-types)

[Data Definition Language (DDL) Statement Reference](/data-definition-language-ddl-statement-reference)

[Table Compression Options](/table-compression-options)

[Secondary Indexes](/secondary-indexes)
