Key Concepts
The Ocient Hyperscale Data Warehouse is a real-time OLAP datastore designed for running analytics against large time-series data sets. The Ocient System operates in compliance with the ANSI standard for SQL.
The Ocient System includes:
- Ultra-fast query speeds for analytics on hyperscale data sets.
- Scalability for cost and performance on a system built with commodity hardware.
- Flexible deployment for running on premises, in the cloud, or as an Ocient-managed service.
- Data movement at scale for ETL and ELT ingestion and operations.
The Ocient System is based on the Compute Adjacent Storage Architecture (CASA), which collocates NVMe drive storage with the system compute resources to optimize performance. This design keeps records near and accessible for computation without a separate storage layer, which avoids many common bottlenecks for database engines, such as limits on network capacity or processing throughput. The design achieves superior query performance when it operates on trillions of rows of data.
To learn more about the design principles behind the Ocient System, see Ocient Architecture.
Ocient is a distributed system consisting of interconnected nodes within clusters, which together form the data warehouse. The Ocient System designates nodes for these system roles:
- SQL Nodes — Parse incoming SQL statements and administer commands throughout the system. These nodes serve as the interface for system and database administration.
- Loader Nodes — Manage ETL ingestion from batch or streaming sources and index data for optimized performance.
- Foundation Nodes — Store data and perform the bulk of query processing. Foundation Nodes perform as much query processing as possible on their stored data before interacting with other nodes.
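The division of labor described for Foundation Nodes, where each node processes its own data before exchanging results, can be illustrated with a generic partial-aggregation sketch. This is not Ocient code: the function names, the `region` column, and the three in-memory "nodes" are hypothetical and exist only to show why shipping small per-node summaries is cheaper than shipping raw rows.

```python
from collections import Counter


def local_partial_count(rows, key):
    """Runs on one hypothetical storage node: count rows per key value locally."""
    return Counter(row[key] for row in rows)


def merge_partials(partials):
    """Runs on the hypothetical coordinating node: combine the per-node summaries."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


# Three hypothetical storage nodes, each holding a slice of the same table.
node_rows = [
    [{"region": "east"}, {"region": "west"}, {"region": "east"}],
    [{"region": "west"}],
    [{"region": "east"}, {"region": "north"}],
]

# Each node reduces its rows to a small summary; only summaries are merged.
partials = [local_partial_count(rows, "region") for rows in node_rows]
result = merge_partials(partials)
print(dict(result))
```

In this sketch, only one small counter per node crosses the network, regardless of how many rows each node stores.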
Understanding nodes and their roles in an Ocient System is most useful for system setup, administration, and maintenance. For details about nodes, see Ocient Architecture. However, the Ocient System makes it easy to load and query data without knowing these details.
The Ocient Hyperscale Data Warehouse, its functions, and its indexes use only native SQL data types to optimize storage and performance. Supported data types include common SQL scalar types and complex types such as IP addresses, arrays, tuples, and geospatial shapes. For details about supported data types, see Understanding Data Types.
The Ocient System can load data in an end-to-end flow from file or streaming sources.
For all source and format options, Ocient uses pipelines to control ingestion. You can control pipelines using DDL commands, an API on Loader Nodes, or a command-line interface. You can transform data during loading using SQL functions.
The Ocient architecture allows ingestion throughput to scale as needed.
For details, see Load Data.
The design of the Ocient architecture minimizes the amount of on-disk data that must be read and processed to execute a query. To do this, the system compiles a custom I/O pipeline for each data segment relevant to a query. These custom pipelines make use of any keys or indexes that can help facilitate faster throughput.
To learn more about query processing, see Query Performance Optimizations.
An Ocient datastore can use multiple layers of indexing to facilitate query performance. Using these indexes is pivotal to optimizing query performance for large data sets.
You can apply indexes using basic DDL statements that do not require deep knowledge of the internal system or data set. You also have some ability to customize indexes for your specific use cases.
Indexes are divided into two main categories.
Segment Keys
These indexes require no additional storage and should usually be deployed on every table. Segment keys operate on the partitioned data segments, which lets the system quickly filter rows without I/O processing.
Segment keys include:
- Time Key — A segment key that partitions data based on a time-series data column.
- Clustering Key — A series of columns that are frequently queried together. The system subdivides these segments on disk for faster reference.
For more information on segment keys, see Time Keys, Clustering Keys, and Indexes.
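To make the idea of filtering by segment concrete, here is a minimal sketch of time-based partitioning with segment pruning. It is a generic illustration rather than Ocient's implementation: the `Segment` class, the `ts` column, and the fixed bucket width are all assumptions chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    """Hypothetical segment: the rows of one time bucket plus min/max metadata."""
    min_ts: int
    max_ts: int
    rows: list = field(default_factory=list)


def build_segments(rows, bucket_seconds):
    """Partition rows into segments by time bucket (the role a time key plays)."""
    buckets = {}
    for row in rows:
        start = row["ts"] - row["ts"] % bucket_seconds
        seg = buckets.setdefault(start, Segment(start, start + bucket_seconds - 1))
        seg.rows.append(row)
    return [buckets[k] for k in sorted(buckets)]


def query_range(segments, lo, hi):
    """Return rows with lo <= ts <= hi, reading only segments that can match."""
    scanned = 0
    out = []
    for seg in segments:
        if seg.max_ts < lo or seg.min_ts > hi:
            continue  # pruned from metadata alone; its rows are never read
        scanned += 1
        out.extend(r for r in seg.rows if lo <= r["ts"] <= hi)
    return out, scanned


rows = [{"ts": t, "value": t * 2} for t in range(0, 1000, 50)]
segments = build_segments(rows, bucket_seconds=100)
matches, scanned = query_range(segments, 220, 380)
print(f"{len(matches)} rows matched; scanned {scanned} of {len(segments)} segments")
```

The pruning decision uses only each segment's time bounds, which is why this kind of key needs no storage beyond the partitioning itself.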
Secondary Indexes
When deployed precisely, Ocient secondary indexes can dramatically reduce the time for queries to run on large data sets.
Ocient supports secondary indexes for these data categories:
- Numeric
- String
- Partial string
- Geospatial
For more information, see Secondary Indexes.
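As an illustration of how a partial-string index can narrow a search, the sketch below builds a generic trigram index in Python. The internal structure of Ocient secondary indexes is not described here, so treat this as one common technique under assumed names (`trigrams`, `build_index`, `search_substring`), not as the system's actual design.

```python
from collections import defaultdict


def trigrams(text):
    """Return the set of 3-character substrings of text."""
    return {text[i:i + 3] for i in range(len(text) - 2)}


def build_index(values):
    """Map each trigram to the set of row ids whose value contains it."""
    index = defaultdict(set)
    for row_id, value in enumerate(values):
        for gram in trigrams(value):
            index[gram].add(row_id)
    return index


def search_substring(index, values, needle):
    """Find rows containing needle; needles under 3 characters use a full scan."""
    grams = trigrams(needle)
    if not grams:
        candidates = range(len(values))
    else:
        candidates = set.intersection(*(index.get(g, set()) for g in grams))
    # Verify each candidate, because sharing trigrams does not guarantee a match.
    return sorted(i for i in candidates if needle in values[i])


values = ["api.example.com", "mail.example.org", "example.net", "internal.lan"]
index = build_index(values)
print(search_substring(index, values, ".org"))
```

Only rows that share every trigram of the search term are checked against the full string, so most rows are skipped without being read.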
Ocient provides SQL access to its database using industry-standard drivers.
End users can query and analyze data in Ocient without understanding the organization, nodes, networking, or other parts of the system architecture.
For more information, see Connect to Ocient.
Ocient supports integration with various third-party tools for database administration, business intelligence, and system monitoring.
For details about supported third-party tools, see Ocient Integrations.
The Ocient System uses erasure coding to organize and compute parity blocks so the system is fault-tolerant and can rebuild missing data. This failsafe does not require redundant copies of data, meaning that storage requirements are minimal.
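The following sketch shows the simplest form of parity-based recovery: a single XOR parity block that can rebuild any one lost block. The erasure code that Ocient actually uses is not specified here, and production systems typically use codes that tolerate more than one loss; the function names and sample blocks are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def encode(data_blocks):
    """Return the data blocks followed by one parity block."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


def rebuild(stripe, missing_index):
    """Recompute one missing block from all the surviving blocks."""
    survivors = [b for i, b in enumerate(stripe) if i != missing_index]
    return xor_blocks(survivors)


data = [b"node", b"AAAA", b"zzzz"]
stripe = encode(data)           # 3 data blocks + 1 parity block
recovered = rebuild(stripe, 1)  # pretend the block at index 1 was lost
print(recovered == data[1])
```

Here three data blocks need only one extra block of parity, whereas keeping a full second copy would need three extra blocks.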
As a unified platform, Ocient helps keep data secure by consolidating security capabilities and responses in one place with a suite of auditing and monitoring tools. Ocient supports these security features:
- Data Encryption: Optional through TLS/SSL protocols.
- Security Compliance: Audited and certified for SOC 2 Type 2.
- Monitoring: Log-level monitoring and alerts on key system information.
- Access Controls: SSO, role-based, and system-level access controls are available.