Key Concepts
Ocient is a real-time OLAP datastore designed for running analytics against large time-series data sets. The system operates in compliance with the ANSI standard for SQL. The Ocient System includes:

- Ultra-fast query speeds for analytics on hyperscale data sets
- Scalability for cost and performance on a system built with commodity hardware
- Flexible deployment for running on-premises, in the cloud, or as an Ocient-managed service
- Robust functionality for DDL, geospatial, and machine learning workloads
- Data movement at scale for ETL and ELT ingestion and operations

The Ocient System

The Ocient System is based on Compute Adjacent Storage Architecture (CASA), which collocates NVMe drive storage with the system compute resources to optimize performance. This design keeps records near and accessible for computation without a separate storage layer, which avoids many common bottlenecks for database engines, such as limits on network capacity or processing throughput. As a result, the design achieves superior query performance when operating on trillions of rows of data.

To learn more about the design principles behind the Ocient System, see Ocient Architecture docid\ plaeqda8ax4tvbnj2rxbc.

Deployment

Ocient is designed for flexible deployment to fit unique use scenarios:

- A cloud deployment managed by Ocient support. This option uses performance-tested hardware and network infrastructure.
- Public cloud — installation on third-party cloud providers.
- On-premises — a self-managed option for hosting Ocient in your own data center.

System Nodes

Ocient is a distributed system consisting of interconnected nodes within clusters, which together form the data warehouse. The Ocient System designates nodes for these system roles:

- SQL Nodes — parse incoming SQL statements and administer commands throughout the system. These nodes serve as the interface for system and database administration.
- Loader Nodes — manage ETL ingestion from batch or streaming sources and index data for optimized performance.
- Foundation Nodes — store data and perform the bulk of query processing. Foundation Nodes perform as much query processing as possible on their stored data before interacting with other nodes.

[Diagram: Ocient architecture showing the relationship between data sources, loading and transformation of the data, and data storage.]

Understanding nodes and their roles in an Ocient System is most useful for system setup, administration, and maintenance. For details about nodes, see Ocient Architecture docid\ plaeqda8ax4tvbnj2rxbc. However, the Ocient System makes it easy to load and query data without knowing these details.

Data Types

The Ocient data intelligence platform, its functions, and its indexes use only native SQL data types to optimize storage and performance. Supported data types include common SQL scalar types as well as complex types such as IP addresses, arrays, tuples, and geospatial types. For details about supported data types, see Understanding Data Types docid\ lkwvkbymwwvjhr60stg56.

Loading

The Ocient System can load data in an end-to-end flow from file or streaming sources, including common sources such as AWS S3. For all of the source and format options, Ocient uses pipelines to control ingestion. You can control pipelines using DDL commands, an API on Loader Nodes, or a command-line interface, and you can transform data during loading using SQL functions. Ingestion throughput can scale as needed based on the Ocient architecture. For details, see Load Data docid\ lk7xyhhwzkwj32rx8p v2.

Query Processing

The design of the Ocient architecture minimizes the amount of on-disk data that must be read and processed to execute a query. To do this, the system compiles a custom I/O pipeline for each data segment relevant to a query. These custom pipelines leverage any available keys or indexes to improve throughput. To learn more about query processing, see Query Performance Optimizations docid\ das3yqqqhangqr7xusanl.

Ocient Indexing

An Ocient datastore can use multiple layers of indexing to facilitate query performance. Using these indexes is pivotal to optimizing query performance for large data sets. You can apply indexes using basic DDL statements that do not require deep knowledge of the internal system or data set, and you can customize indexes for specific use cases. Indexes are divided into two main categories: segment keys and secondary indexes.

Segment Keys

These indexes require no additional storage and usually should be deployed on every table. Segment keys operate by accessing data from the partitioned data segments to quickly filter rows without I/O processing. Segment keys include:

- Time key — a segment key that partitions data based on a time-series data column.
- Clustering key — a series of columns that are frequently queried together. The system subdivides these segments on disk for faster reference.

For more information on segment keys, see Timekeys and Clustering Keys docid\ tfr hznzvabrm8wqf46lm.

Secondary Indexes

When deployed precisely, Ocient secondary indexes can dramatically reduce the time for queries to run on large data sets. Ocient supports secondary indexes for these data categories:

- Numeric
- String
- Partial string
- Geospatial

For more information, see Secondary Indexes docid\ xmmylaxzqfci6ysnff5tg.

Connecting to Ocient

Ocient provides SQL access to its database using industry-standard interfaces:

- JDBC
- pyocient, a Python-based driver for Ocient
- HTTP Query API, a REST-based interface for executing SQL statements

End users can query and analyze data in Ocient without understanding the organization, nodes, networking, or other parts of the system architecture. For more information, see Connect to Ocient docid\ bz8g2ykkd26fmwpywjbuu.

Integrations

Ocient supports integration with various third-party tools for database administration, business intelligence, and system monitoring. For details about supported third-party tools, see Ocient Integrations docid 8yfm g2lgskw sg3zw87v.

Machine Learning

The Ocient System includes functionality for training and executing machine learning models directly within SQL. OcientML builds on native linear algebra support, including first-class matrix data types and operations such as matrix arithmetic, inversion, and eigenvalue decomposition. You can create models using the CREATE MLMODEL statement and execute them as scalar functions in queries. Ocient supports a broad suite of models spanning regression, classification, clustering, dimensionality reduction, ensemble methods, and neural networks to address a wide range of analytical use cases. For details, see Machine Learning in Ocient docid\ oja43vqudjt25sfxl5iij.

Resilience to Hardware Failure

The Ocient System uses erasure coding to organize data and compute parity blocks so that the system is fault tolerant and can rebuild missing data. This failsafe does not require redundant copies of data, which keeps additional storage requirements minimal.

Security

As a unified platform, Ocient helps keep data secure by consolidating security capabilities and responses in one place with a suite of auditing and monitoring tools. Ocient supports these security standards:

- Data encryption — optional encryption through TLS/SSL protocols
- Security compliance — audited and certified for SOC 2 Type 2
- Monitoring — log-level monitoring and alerts on key system information
- Access controls — SSO, role-based, and system-level access controls are available

Related Links

- Core Elements of an Ocient System docid 3t15f0f lwnchdpzvlcx
- Ocient Architecture docid\ plaeqda8ax4tvbnj2rxbc
- Ocient Simulator docid\ z3tcswplhpl c8dwu4vxm
- Connect to Ocient docid\ bz8g2ykkd26fmwpywjbuu

Related Videos

- At the Whiteboard with Ocient: Compute Adjacent Storage Architecture™ — https://www.youtube.com/watch?v=7cv0hr7f1fg
- At the Whiteboard with Ocient: SQL at Scale — https://www.youtube.com/watch?v=g64g7u6tlcq
- At the Whiteboard with Ocient: Indexing at Hyperscale — https://youtu.be/0m90mussilo
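To make the time-key idea from the indexing discussion above concrete, here is a minimal Python sketch of segment pruning. It is purely illustrative: the names (`Segment`, `query_range`) and structures are hypothetical, not Ocient's implementation. It only shows how per-segment min/max metadata lets a query skip whole segments without reading any of their rows.

```python
from dataclasses import dataclass

@dataclass
class Segment:
    """A contiguous slice of rows with time-key metadata (min/max timestamp)."""
    min_ts: int
    max_ts: int
    rows: list  # (timestamp, value) pairs

def query_range(segments, start, end):
    """Scan only segments whose [min_ts, max_ts] overlaps [start, end].

    Non-overlapping segments are skipped using metadata alone, mimicking
    how a time key prunes partitioned segments before any row I/O.
    """
    hits, scanned = [], 0
    for seg in segments:
        if seg.max_ts < start or seg.min_ts > end:
            continue  # pruned: metadata check only, no row access
        scanned += 1
        hits.extend(r for r in seg.rows if start <= r[0] <= end)
    return hits, scanned

# Three one-hour segments of (timestamp, value) rows.
segments = [
    Segment(0, 3599, [(0, "a"), (1800, "b")]),
    Segment(3600, 7199, [(3600, "c"), (5000, "d")]),
    Segment(7200, 10799, [(9000, "e")]),
]
rows, scanned = query_range(segments, 4000, 6000)
# Only the middle segment is scanned; one row matches.
```

A real time key additionally governs how rows are assigned to segments at load time; here the segments are simply given precomputed bounds.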
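The effect of a secondary index, answering exact and partial-string lookups without scanning every row, can be sketched the same way. The inverted-index and prefix-search structures below are illustrative assumptions, not Ocient's on-disk index format.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict

rows = ["chicago", "boston", "chicago", "denver", "chelsea"]

# Build an inverted (secondary) index: value -> list of row ids.
index = defaultdict(list)
for rid, city in enumerate(rows):
    index[city].append(rid)

def lookup(value):
    """Exact-match lookup via the secondary index: no scan of `rows`."""
    return index.get(value, [])

# Partial-string (prefix) matching over the sorted key list.
keys = sorted(index)

def prefix_lookup(prefix):
    """Find row ids whose value starts with `prefix` via binary search."""
    lo = bisect_left(keys, prefix)
    hi = bisect_right(keys, prefix + "\uffff")
    return sorted(rid for k in keys[lo:hi] for rid in index[k])
```

The design point the sketch captures: both lookups touch only the index structures, so their cost grows with the number of matches rather than the size of the table.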
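The machine learning material above says that OcientML models build on linear algebra and execute as scalar functions in queries. As a rough analogy only, the sketch below fits a one-variable linear regression by solving its normal equations and then wraps the fitted model as a scalar function; the helper names are hypothetical and the math is the standard least-squares derivation, not OcientML internals.

```python
def fit_linear(xs, ys):
    """Fit y = w*x + b by solving the 2x2 normal equations directly,
    reducing model training to plain matrix arithmetic."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    det = n * sxx - sx * sx          # determinant of the 2x2 system
    w = (n * sxy - sx * sy) / det
    b = (sxx * sy - sx * sxy) / det
    return w, b

def make_scalar_fn(w, b):
    """Wrap the fitted model as a scalar function, analogous to executing
    a trained model as a scalar function inside a query."""
    return lambda x: w * x + b

w, b = fit_linear([0, 1, 2, 3], [1, 3, 5, 7])  # data lies exactly on y = 2x + 1
predict = make_scalar_fn(w, b)
# predict(10) -> 21.0
```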
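Finally, the rebuild-from-parity principle behind the resilience section can be demonstrated with the simplest erasure code: a single XOR parity block. Production systems tolerate multiple simultaneous failures with more general codes, and Ocient's actual coding scheme is not specified here; this sketch only shows how a parity block lets you reconstruct one lost block without storing a full replica.

```python
from functools import reduce

def xor_parity(blocks):
    """Compute a parity block as the bytewise XOR of equal-sized data blocks."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

def rebuild(surviving_blocks, parity):
    """Rebuild the single missing block: XOR of the survivors and the parity.

    Works because x ^ x = 0, so XOR-ing everything except the lost block
    back into the parity leaves exactly the lost block.
    """
    return xor_parity(surviving_blocks + [parity])

data = [b"abcd", b"efgh", b"ijkl"]   # three data blocks
parity = xor_parity(data)            # one extra block, not a full copy
lost = data[1]                       # simulate a failed drive
restored = rebuild([data[0], data[2]], parity)
assert restored == lost
```

Note the storage trade-off the section describes: fault tolerance costs one parity block here, rather than a second full copy of the data.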