Database Administration
Schema Design
TimeKeys and Clustering Keys
segment keys define how an {{ocient}} system partitions data on disk based on particular columns in a database these keys allow the system to quickly filter rows and skip irrelevant records for queries involving those columns for this reason, segment keys are integral tools for optimizing query performance for pivotal columns although segment keys are optional, deploying them on every database table in your system is a good practice because they require practically no additional processing or storage resources ocient supports two types of segment keys timekeys and clustering keys /#timekeys — a segment key that partitions data based on a single time series data column timekeys and clustering keys /#clustering key — a series of columns that are frequently queried together the system subdivides these segments on disk for faster reference together, these segment keys form the basis for the ocient time cluster storage engine, which subdivides data and metadata for optimal performance attributes of segment keys advantages ocient segment keys share these important advantages improved performance for query filtering and record retrieval minimal resource demand for storage or i/o easy to use without sophisticated system knowledge, you can create segment keys with basic data definition language (ddl) statements limitations there are no significant downsides to using segment keys however, they have some limitations for deployment you can implement segment keys only upon table creation as part of the initial database, tables, views, and indexes docid\ uacarixqhe493vlhudb5b statement you cannot add them later to a table each database table can support only one key of each type for faster performance on additional columns, you can also add clustering indexes and other secondary indexes for details, see timekeys and clustering keys /#add clustering indexes and secondary indexes docid\ ssxi4zc3p mqdtr0b7qdn segment keys are restricted to specific column data types timekeys timekeys partition data based on a single time series column in a database table this column should normally be either a date or timestamp data type for optimal performance, you can configure a timekey column with two arguments that comprise a time bucket the time bucket is a unit of time combined with a number value (e g , hour, 2 or day, 1 ) this value sets the granularity for the partitioning in general, this value should match the level of filtering that you expect your queries to request for example, a database table you expect to query for transactions by minute should probably have a small timekey bucket value, such as an hour if you expect your queries to filter by broader time values, your table can have a timekey bucket of one day or larger the ocient system uses the timekey and the bucket granularity to partition rows into segments that incorporate a minimum and maximum time value for all rows for queries that involve time filtering, the timekey allows an ocient system to immediately skip data segments that are outside the query bounds without any i/o processing this syntax example creates a single timekey column and must be included in the column definition of a create table sql statement for the full syntax, see database, tables, views, and indexes docid\ uacarixqhe493vlhudb5b column name \[ data type ] time key bucket (bucket granularity, bucket value) 215,115,293 false true unhandled content type unhandled content type false unhandled content type false unhandled content type false unhandled content type false unhandled content type false unhandled content type false unhandled content type false unhandled content type false unhandled content type false unhandled content type false unhandled content type unhandled content type unhandled content type false unhandled content type for ddl examples of tables that use timekey columns, see create table sql statement examples docid 0rmcbcyysu 0ej2rmrcqy clustering key every table in an ocient system can support a clustering key to speed up query operations on a series of adjacent columns t he clustering key determines how data is ordered into segments on disk this ordering acts like an index, enabling an ocient system to quickly locate any relevant rows of the clustering key columns every database table should define one clustering key on a sequence of frequently queried columns for best results clustering key columns can be of any fixed length data type except for decimal variable length columns can also support clustering keys if they are configured with global dictionary compression for details, see table compression options docid gac3iwnrtwwnndngn50b in addition to the clustering key, you can also define clustering indexes unlike the clustering key, clustering indexes do not involve ordering data into segments they create separate indexes for each specified column set, which boosts performance but requires some additional storage this syntax example creates a clustering key across multiple columns and must be included in a create table sql statement for the full syntax, see database, tables, views, and indexes docid\ uacarixqhe493vlhudb5b clustering key key name (ck col1, ck col2 \[, ]) \[, index index name (idx col1, idx col2 \[, ]) \[, index ] ] 241,79,293 false true unhandled content type unhandled content type false unhandled content type false unhandled content type false unhandled content type false unhandled content type false unhandled content type false unhandled content type false unhandled content type false unhandled content type false unhandled content type false unhandled content type false unhandled content type false unhandled content type false unhandled content type select clustering key columns you can define a clustering key and any additional clustering indexes as arguments of a database, tables, views, and indexes docid\ uacarixqhe493vlhudb5b sql statement the column ordering of a clustering key is significant you must specify columns in rank order table example this create table sql statement example shows the definition of a clustering key on columns (ck1, ck2, ck3) the data types of these columns are irrelevant to this example create table example table ( , ck1 int, ck2 int, ck3 int, , clustering key ck example (ck1, ck2, ck3), ); the clustering key primes the records for search and filter operations on its columns in descending order of importance this means the first column ck1 is the most important of the subset while the third ck3 is the least important to boost performance with a clustering key, a query must make references to the columns following this key order for a clustering key in the (ck1, ck2, ck3) pattern, these sequences of query column references are effective ck1 (ck1,ck2) (ck1,ck2,ck3) query example this example select query follows the key order for its filter operations select from example table ( where ck1 > 1 and ck2 = 0 and ck3 < 0; for a clustering key in the (ck1, ck2, ck3) pattern, these sequences of query column references are ineffective (ck1, ck3) (ck2, ck3) (ck3, ck2) to improve performance on queries that reference columns in these sequences, you can use clustering indexes for details, see timekeys and clustering keys /#add clustering indexes to check the order of column references in a query, you can test a query using an query analysis docid\ xc5pdlk7orduwvuqcmran sql statement this statement shows the i/o pipeline and how the system references indexes for ddl examples of tables that use clustering key columns, see create table sql statement examples docid 0rmcbcyysu 0ej2rmrcqy add clustering indexes a table can only have one clustering key, but it can also have any number of clustering indexes to boost performance clustering indexes act similarly to the clustering key, but they do not order data into segments instead, each clustering index creates a separate index for the specified column set, which can impact storage and loading overhead clustering indexes can provide flexibility to your tables, allowing your queries to reference the same columns as your clustering key, but in different sequences you must define clustering keys and clustering indexes when you first create a table you cannot add them at a later time example define a clustering key across the first four columns (ck1, ck2, ck3, ck4) in the example table table in this create table sql statement the statement also includes two clustering indexes using the index definition the first creates an index for a subset of just two columns (ck2, ck3) from the clustering key the second index uses all of the key columns (ck2, ck4, ck1, ck3) , but in a different sequence create table example table ( ck1 int not null, ck2 int not null, ck3 int not null, ck4 int not null, col5 int not null, clustering key ck example (ck1, ck2, ck3, ck4), index idx 01 (ck2, ck3), index idx 02 (ck2, ck4, ck1, ck3) ); in the example, col5 is not part of the clustering key, meaning this column cannot be included in any clustering indexes permissions to create timekeys and clustering keys, you must have the create privilege for the current database observability you can query these system catalog tables to get information on the effectiveness of your configurations for segment keys and indexes sys indexes — includes information on the index identifier, type, and other information sys index columns — contains information on indexes by their column position sys segment parts — c ontains data on the segment parts associated with each segment key for examples of querying segment and index metadata tables, see discover insights from system catalog tables docid szkiqnuxled51xsbjf3w related links ingest data with legacy lat reference docid\ tt6tfoulap0mt aycm2ka query ocient docid 4ycq1d8tkfmlsacorynf6 database, tables, views, and indexes docid\ uacarixqhe493vlhudb5b secondary indexes docid\ ssxi4zc3p mqdtr0b7qdn related videos at the whiteboard with ocient indexing at hyperscale