Ocient Hyperscale Data Warehouse Release Notes
Release Highlights
- Better SQL Errors: Improved error messages for poorly constructed SQL queries.
- Improved Tracing: Enabled using the TRACE keyword in SQL queries to profile query performance. For details, see the TRACE keyword.
- Large Blobstore Support: Enabled more than 2TiB of data to spill per disk for high-density drive situations.
- Large Drive Support: Added support for drives up to 15.36TB in size per node.
- Rebalance System: Added the REBALANCE task, which enables optimization of query efficiency by transferring data around the system until nodes are roughly balanced in terms of data volume per node. For details, see Rebalance System.
- System Storage Space: Added support for multiple storage spaces and enabled the creation of a system storage space for internal Ocient System data.
- Workload Management Usability:
- Added support for assigning service classes to queries based on query text. For details, see the CREATE SERVICE CLASS SQL statement.
- Added support for changing the priority for queries. For details, see the ALTER QUERY SQL statement.
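The Rebalance System highlight above describes moving data until nodes hold roughly the same volume. As a rough illustration of that idea only, the following Python sketch greedily moves segments from the fullest node to the emptiest one. The function name, the tolerance parameter, and the greedy strategy are all assumptions made for this example; they do not describe how the REBALANCE task is implemented.

```python
def rebalance(nodes, tolerance=0.1):
    """Greedily move segments until node volumes are roughly equal.

    nodes maps a node name to a list of segment sizes (any consistent unit)
    and must contain at least one node. The lists are modified in place.
    Returns the (segment_size, source, destination) moves that were made.
    """
    moves = []
    while True:
        loads = {name: sum(segments) for name, segments in nodes.items()}
        src = max(loads, key=loads.get)  # fullest node
        dst = min(loads, key=loads.get)  # emptiest node
        gap = loads[src] - loads[dst]
        mean = sum(loads.values()) / len(loads)
        if gap <= tolerance * mean:
            break  # roughly balanced (hypothetical stopping rule)
        # Only a segment smaller than the gap shrinks the imbalance.
        candidates = [size for size in nodes[src] if 0 < size < gap]
        if not candidates:
            break  # no single move can improve the balance
        # Prefer the segment that brings the two nodes closest together.
        segment = min(candidates, key=lambda size: abs(gap - 2 * size))
        nodes[src].remove(segment)
        nodes[dst].append(segment)
        moves.append((segment, src, dst))
    return moves


cluster = {"node1": [4, 4, 4, 4, 4, 4], "node2": [], "node3": []}
for segment, src, dst in rebalance(cluster):
    print(f"move segment of size {segment} from {src} to {dst}")
```

Each move strictly reduces the sum of squared node volumes, so the loop always terminates.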
Features
- [DB-19266]: Network Configuration —
- All nodes must now belong to a connectivity pool. Manage connectivity pools using these new SQL statements. For details, see CONNECTIVITY POOL.
- CREATE CONNECTIVITY_POOL to create a connectivity pool.
- DROP CONNECTIVITY_POOL to drop a connectivity pool.
- ALTER CONNECTIVITY_POOL SET to set the metadata of a connectivity pool.
- ALTER CONNECTIVITY_POOL RENAME TO to rename a connectivity pool.
- ALTER CONNECTIVITY_POOL ADD PARTICIPANTS to add nodes to a connectivity pool.
- ALTER CONNECTIVITY_POOL DROP PARTICIPANTS to remove nodes from a connectivity pool.
- The ALTER NODE SET ADDRESS SQL statement changes the internal IP address for a node. For details, see ALTER NODE SET ADDRESS.
- You can now configure the network of an Ocient System. For details, see Manage the Network Configuration of an Ocient System.
- Redirects now occur only within connectivity pools. When you upgrade an Ocient System, you must first configure a connectivity pool.
- [DB-28986]: System Catalog Table Updates —
- Renamed client_version to protocol_version in the sys.queries and sys.completed_queries system catalog tables.
- Added driver_version column to the sys.queries and sys.completed_queries tables.
- [DB-29788]: Regular Expression Functions —
- Added new functions that use regular expression search patterns. The new functions are:
Version Compatibility
- The bootstrap.conf no longer supports highspeedAddress as an advanced system configuration option. For other options, see Node Bootstrapping Reference.
- The default behavior for the UNNEST function no longer uses the NULL_INPUT clause for a multi-item SELECT list. To restore the default behavior from version 23.0 and earlier, run ALTER SYSTEM ALTER CONFIG SET sql.unnestLegacySelectListBehavior = 'true'.
Feature Removal
- Removed COMPRESSION LZ4 from the compression options. This compression scheme remains in use as part of COMPRESSION DYNAMIC for variable-length columns.
- Removed the table-valued function REPLACEMENT_JOIN. For creating compressed lookup tables, see Global Dictionary Compression.
- Removed particle swarm optimization functionality from the Ocient System.
- Removed ODBC connection support from the Ocient System.
Release Highlights
- All machine learning functionality is available to use. For details, see Machine Learning Model Functions and Machine Learning in Ocient to get started.
- Delete Syntax: Enabled the deletion of individual rows in the database.
- Integrations: Added drivers and support for the following third-party applications:
Features
- [DB-13607]: Delete Syntax — Added the SQL DELETE statement syntax that enables the deletion of individual rows in the database. For details, see DELETE FROM TABLE.
- [DB-18020]: Large Geospatial Types — Increased the size of LINESTRING and POLYGON geospatial data types to 512 MB. For details, see Load Geospatial Data.
- [DB-19048]: Geospatial Index — Added the SPATIAL index type for indexing geospatial data. For details, see SPATIAL Index Type.
- [DB-18280]: Connectors Refresh — Added integration with DBeaver and Tableau. For details, see DBeaver Integration and Tableau Integration.
- [DB-20412]: Multi-Cluster Loading and Cluster of Clusters — Added support for loading and working with multiple clusters. For details, see Multiple Storage Clusters for Loading Data.
- [DB-21609]: Machine Learning Model Updates —
Version Compatibility
- Large Geospatial Types are not backwards compatible with earlier releases. For details, see Version Compatibility.
- The data control language (DCL) now uses the DELETE keyword instead of TRUNCATE to denote the user role privilege to remove data.
Release Highlights
- HyperLogLog (HLL): Added HLL sketch functionality.
- Information Schema: Added the information_schema schema that shows system metadata.
- Integrations: Added drivers and support for the following third-party applications:
Features
- [DB-13603]: Information Schema — Added the information_schema schema that shows system metadata in an accessible format.
- [DB-20484]: SQLAlchemy Integration — Published sqlalchemy-ocient driver to PyPI.
- [DB-21011]: EXCEPT Clause — Added EXCEPT clause so that SELECT * queries can explicitly omit columns from results.
- [DB-21769]: Metabase Integration — Added Ocient as a Metabase partner driver, allowing Metabase to access Ocient databases out-of-the-box.
- [DB-23030]: JDBC Packaging — Removed OpenJump dependency from ocient-jdbc4.
- [DB-23175]: Time Zone Adjustment Support — Added various improvements to time zone functionality, including:
- Support for daylight saving time adjustment based on time zone.
- Added time zone functions CONVERT_UTC_TIMESTAMP_TO_LOCAL and CONVERT_LOCAL_TIMESTAMP_TO_UTC. For more information, see Time Zone Functions.
- Enhanced performance for time zone conversion.
- [DB-23177]: Push-Down Aggregation to the I/O Layer — Under certain conditions, the system pushes aggregation to the I/O operator for better efficiency and performance.
- [DB-23299]: HLL Sketch Functionality — Added support for variable log2k HLL sketch algorithm and associated functions. For details, see the HLL Functions page.
- [DB-23745]: Implement Evacuate Node — Evacuate node is a tool that moves all segments off a node to the other nodes in the cluster when the system is overprovisioned. This tool is useful when you replace drives or a node.
- [DB-19888]: Machine Learning Model Updates —
- The Ocient System scopes machine learning models to schemas. The system assigns the pre_v22_mlmodel schema to any model you created prior to version 22.0.
- The sys.multiple_linear_regression_slopes system catalog table has been removed.
- Rename machine learning models using ALTER MLMODEL.
- New DDL commands:
- CREATE OR REPLACE
- REFRESH
- EXPORT
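For readers unfamiliar with the HLL sketch functionality listed in [DB-23299] above, the following Python sketch shows the general HyperLogLog technique with a configurable log2k (the base-2 logarithm of the number of registers). It is a minimal illustration under stated assumptions, not Ocient's implementation: the class name, the SHA-256 hash, and the supported log2k range are choices made for this example.

```python
import hashlib
import math


class HLLSketch:
    """Minimal HyperLogLog sketch with 2**log2k registers (illustrative only)."""

    def __init__(self, log2k=12):
        # The range limit belongs to this example, not to any product.
        if not 7 <= log2k <= 16:
            raise ValueError("this example supports log2k between 7 and 16")
        self.log2k = log2k
        self.k = 1 << log2k
        self.registers = [0] * self.k

    def add(self, item):
        digest = hashlib.sha256(repr(item).encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")  # 64-bit hash value
        tail_bits = 64 - self.log2k
        index = h >> tail_bits                 # top log2k bits pick a register
        tail = h & ((1 << tail_bits) - 1)      # remaining bits
        # Rank is the position of the leftmost 1 bit in the tail.
        rank = tail_bits - tail.bit_length() + 1
        if rank > self.registers[index]:
            self.registers[index] = rank

    def merge(self, other):
        """Fold another sketch with the same log2k into this one."""
        if other.log2k != self.log2k:
            raise ValueError("sketches must use the same log2k")
        self.registers = [max(a, b) for a, b in zip(self.registers, other.registers)]

    def estimate(self):
        k = self.k
        alpha = 0.7213 / (1 + 1.079 / k)       # bias constant, valid for k >= 128
        raw = alpha * k * k / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * k and zeros:
            return k * math.log(k / zeros)     # small-range (linear counting) correction
        return raw


sketch = HLLSketch(log2k=12)
for user_id in range(10_000):
    sketch.add(user_id)
print(round(sketch.estimate()))  # approximate distinct count
```

A larger log2k uses more memory and gives a tighter estimate; the relative standard error of the technique is about 1.04 divided by the square root of the register count.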
LAT Features
- [LAT-1475]: Enablement of Stopping Load Processing During Error Condition — Added default behavior to stop processing during file loading in the event of an unrecoverable error when the system extracts records from a file. For details, see continue_on_unrecoverable_error.
- [LAT-1476]: Enablement of LAT Service in Installation — Enabled LAT Service in systemd by default upon installation completion. This update reflects a change in the default behavior during installation.
- [LAT-1477]: LAT Version for Metrics — Exposed lat_version in the metrics.
- [LAT-1557]: Support for Loading Multiple S3 Buckets — Added LAT functionality to load data from multiple S3 buckets simultaneously within the same pipeline.
Version Compatibility
- Information Schema — Views created prior to Version 22.0 do not have column data appearing in the information_schema. You can drop and recreate these views to populate column data.
- LAT — Version 3.0.0 and greater is only compatible with Version 22.0 and greater of the Ocient system. For details, see Version Compatibility.
Release Highlights
The Ocient System now supports the following operating systems:
- Ubuntu® 20.04
- Debian 11
- RHEL 8
Other highlights include:
- Whole column compression: Added Zstandard (ZSTD) compression for fixed and variable length columns.
- Check system configuration: Added precheck and postcheck commands to check system configuration before and after installation.
- Workload management dynamic priority: Enabled the adjustment of the query priority dynamically at the session, service class, and query levels.
- Ability to quiesce node: Added process for graceful node shutdown.
Features
- [DB-18636]: ZSTD Compression - Added a new whole-column compression scheme (ZSTD) that can be enabled for fixed and variable length columns.
- [DB-18990]: Improved Stats Storage And Usage - Various improvements have been added to speed up the fetching of statistics by the optimizer and ensure it gets up-to-date statistics. These changes primarily center around probability density functions being stored as pre-aggregated stats files instead of on a per-segment basis.
- [DB-20190]: Distributed Tasks - Added check_disk task type and new vtables sys.subtasks, sys.tasks, and sys.rebuild_tasks for monitoring tasks. Removed the CHECK DATA command.
- [DB-19117]: Metadata - Added participating_nodes to the sys.queries and sys.completed_queries virtual tables.
- [DB-18633]: Graceful Node Shutdown - Added quiesce process for graceful node shutdown.
- [DB-18061]: LCK Deprecation - Added a new disk data format that is smaller and also improves performance of some index-based queries.
- [DB-19414]: Range Query Improvement - Improved performance of range queries by utilizing the inverted secondary index.
- [DB-20168]: Geospatial Function Expansion - Added these geospatial scalar functions:
- Measurement Functions
- ST_ANGLE
- ST_DISTANCESPHERE
- ST_DISTANCESPHEROID
- ST_LENGTH2D
- ST_HAUSDORFFDISTANCE
- Analytic and Property Functions
- ST_DIMENSION
- ST_GEOHASH
- ST_SRID
- ST_ISPOLYGONCW
- ST_ISPOLYGONCCW
- To String and Binary Functions
- ST_ASWKT
- ST_ASWKB
- ST_ASEWKT
- Geography Simplification Function
- ST_SIMPLIFY
- Constructor Functions
- ST_POINTFROMGEOHASH
- ST_GEOGPOINT
- ST_MAKEPOLYGONORIENTED
- ST_POINT_FROMEWKT
- ST_LINESTRING_FROMEWKT
- ST_POLYGON_FROMEWKT
- ST_MAKEENVELOPE
- Additionally, you can construct ST_POLYGON types directly from a POINT[] without going through an intermediate ST_LINESTRING.
Keywords
Added these new keywords as reserved words in the Ocient system.
- ANALYSIS
- AUTOREGRESSION
- BAYES
- CANCEL
- COMPONENT
- DECISION
- DISABLE
- DISABLE_STATS_FILE_UPDATES
- ENABLE
- FEEDFORWARD
- INSERT
- KMEANS
- KNN
- LOGISTIC
- MACHINE
- MOVE
- NAIVE
- NETWORK
- NONLINEAR
- PRINCIPAL
- REPLACE
- SOURCE
- SUPPORT
- TREE
- VECTOR
- ZSTD
Release Highlights
- CREATE TABLE AS SELECT SQL Statement: Extract, load, and transform (ELT) workflow functionality to extract data and load it into a new database table by using the query results from a SELECT SQL statement. The tables you create using the CREATE TABLE AS SELECT SQL statement have some indexing limitations in version 20.0. For details, see the "About Create Table As Select (CTAS)" section of the Ocient user documentation.
- INSERT INTO SQL Statement: ELT workflow functionality to extract data and insert it into an existing database table using the INSERT INTO SQL statement.
- N-gram Indexes: Full index on VARCHAR, VARCHAR arrays, and VARCHAR tuple components for efficient queries using the LIKE SQL statement.
- Large VARCHAR [DB-16142]: Support VARCHAR columns up to 1GB in size.
- Ocient Simulator: An instance of the Ocient system for data loading and functional testing.
- Single Sign-On (SSO): Authenticate access to Ocient through an external SSO server and assign SSO users to groups in Ocient.
Feature Removal
The ALTER ROLE DDL command has been removed. You can make all changes using the ALTER CONFIG SQL statement. To alter a role, prefix the key with the role name followed by a dot.
The following system tables have been added:
- average_bb_sizes
- linear_combination_regression_models
- node_config
- node_status
- sso_connections
- storage_device_status
The following system tables have been removed:
- hugepage_configurations
- memory_module_models
- node_memory_modules
- oidc_integrations
- oidc_sessions
- polynomial_regression_models
- security_integrations
- sessions
Features
- [DB-14527] - Adaptive Water Mark Feature - The Indexer Node dynamically increases and reduces batch size without manual tuning
- [DB-14656] - Added a REST endpoint to expose a node’s configuration parameters (:9090/v1/configparams)
- [DB-15123] - Expose cluster total storage space and storage usage through virtual tables
- [DB-15515] - Add support for expr::dtype cast notation
- [DB-16289] - Remove the web UI and YAML service role configuration
- [DB-16904] - Allow any predicate type to be used in conjunction with the values in arrays
- [DB-17889] - Improve ability to continue data loading when a foundation node is down
- [DB-18393] - Leverage hyperthreading in query execution
- In v19, the ALTER … ALTER ROLE/CONFIG … DDL command replaces the service role configuration previously set through the web UI. The web UI is still available in v19 but will be removed in a subsequent release. Use the ALTER … ALTER ROLE/CONFIG … command, rather than the web UI, to change system configuration. For details, see the Upgrade Ocient Software section of the user documentation.
- [DB-12747] - Add support for lateral joins.
- [DB-13924] - Add support for multi-column subqueries.
- [DB-14990] - Add support for native right joins.
- [DB-15996] - Add support for array_to_string function.
- [DB-16231] - Improve GIS function performance and introduce expanded support for GIS functions. Please refer to the User Documentation for details.
- [DB-17037] - Add new scalar functions and operators for GIS types (POINT, LINESTRING, and POLYGON). Please refer to the User Documentation for details.
- [DB-17892] - Add support for right lateral joins.
- [DB-16061] - Secondary indexes can now be created on VARCHAR and VARCHAR[] columns. Please refer to the User Documentation for details.
Features
- [DB-17635] - Remove query log properties timestamp_optimizationcomplete and time_optimizationcomplete and add new properties timestamp_optimizationstart and time_optimizationstart
- [DB-16567] - Make error messages clearer for queries that are missing a GROUP BY clause
- [DB-17316] - Change array_length(empty array) to return 0
- [DB-16417] - Allow integral types for integer fields in GIS functions
- [DB-16200] - Make Explains more convenient for the user.
- [DB-16092] - Distributed Result Set Caching
- [DB-15623] - Add support for rebuilding individual nodes via DDL
- [DB-15375] - ALTER CLUSTER ADD PARTICIPANTS DDL
- [DB-14720] - Provide a way to kill long-running optimizations
- [DB-14017] - Support for CLI command history across sessions
Features
- [DB-12888] - Add support for array values larger than 128 KB. The new maximum value of an array is 512 MB.
Features
- [DB-10329] - Add support for full disk encryption of Opal drives. Disk encryption will be automatically enabled when Opal support is detected.
Features
- [DB-14159] - Default hex values for binary or varbinary columns must contain a leading 0x
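The rule in [DB-14159] can be pictured as a simple validation step. The following Python sketch is a hypothetical validator, not Ocient's parser: the function name is invented for this example, and the requirements of a lowercase 0x prefix and an even number of hex digits are assumptions.

```python
import re

# Assumption for this example: "0x" followed by one or more whole bytes.
_HEX_LITERAL = re.compile(r"0x(?:[0-9A-Fa-f]{2})+")


def parse_hex_default(literal):
    """Return the bytes for a hex default value, rejecting a missing 0x prefix."""
    if not _HEX_LITERAL.fullmatch(literal):
        raise ValueError(f"hex default must look like 0x followed by hex byte pairs: {literal!r}")
    return bytes.fromhex(literal[2:])


print(parse_hex_default("0xDEADBEEF").hex())
```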
Features
- [DB-14330] - Remove last dependencies on PostgreSQL from the database
Features
- [DB-13334] - Add support for zip unnest, which unnests multiple arrays in parallel
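To illustrate what unnesting multiple arrays in parallel means, the following Python sketch pairs elements by position instead of producing every combination. The function name is hypothetical, and padding shorter arrays with None (standing in for NULL) is an assumption borrowed from how PostgreSQL's multi-array unnest behaves; the Ocient behavior for arrays of unequal length is not stated here.

```python
from itertools import product, zip_longest


def zip_unnest(*arrays):
    """Unnest several arrays in parallel: one output row per element position."""
    # Assumption: shorter arrays are padded with None (NULL).
    return list(zip_longest(*arrays, fillvalue=None))


ids = [1, 2, 3]
labels = ["a", "b"]

# Parallel unnest: rows line up by position.
for row in zip_unnest(ids, labels):
    print(row)

# For contrast, unnesting each array independently yields a cross product.
print(len(list(product(ids, labels))), "rows from the cross product")
```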
Features
- [DB-12887] - Add support for the array of tuples. Users can create array columns containing tuple SQL types. Please refer to the User Documentation for the latest information on supported data types.
- [DB-12885] - Add support for unnest(), which expands array elements from input array columns out to individual output rows
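As a rough picture of the row expansion that unnest() performs, the following Python sketch turns each input row with an array column into one output row per array element. The function and column names are hypothetical, and producing no output rows for an empty or missing array is an assumption of this example rather than documented Ocient behavior.

```python
def unnest(rows, array_column):
    """Yield one output row per element of row[array_column]."""
    for row in rows:
        # Assumption: an empty or None array contributes no output rows.
        for element in row[array_column] or []:
            yield {**row, array_column: element}


table = [
    {"id": 1, "tags": ["red", "blue"]},
    {"id": 2, "tags": []},
    {"id": 3, "tags": ["green"]},
]
for out_row in unnest(table, "tags"):
    print(out_row)
```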
Features
- [DB-12394] - Added support for running on CentOS 8.
Features
- [DB-13162] - Added support for Tasks to the System Catalog
- [DB-10332] - Implemented access controls on system and database-level objects. Improved users and groups, and added new roles within Ocient.
- [DB-12829] - Optionally enforce encrypted connections for JDBC and ODBC.
Features
- [DB-10330] - External Network Security. SSL/TLS support in ODBC and JDBC, SSL support for the web interface.
Features
- [DB-10921] - Adds support for multi-dimensional arrays and the ability to do joins, windows, sorts and aggregations that involve arrays.
- [DB-10927] - Adds support for global dictionary compression (GDC) on VARCHAR array columns and the ability to do replacement joins.
- [DB-10282] - Adds support for DROP COLUMN DDL to remove columns from a table.
- [DB-11472] - Adds support for skipping failed rows for CSV loading up to some specified threshold.
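The threshold behavior described in [DB-11472] can be sketched as follows. This Python example is a generic illustration, not the Ocient loader: the function name, the max_failed_rows parameter, and the exception type are hypothetical, and a row counts as failed here when it has the wrong number of columns or a value that does not convert.

```python
import csv
import io


class TooManyFailedRows(Exception):
    """Raised when more rows fail than the configured threshold allows."""


def load_csv(text, converters, max_failed_rows):
    """Parse CSV text, skipping bad rows until the failure threshold is exceeded.

    converters holds one callable per column. Returns (loaded_rows, failures),
    where failures is a list of (record_number, error_message) pairs.
    """
    loaded, failures = [], []
    for record_number, record in enumerate(csv.reader(io.StringIO(text)), start=1):
        try:
            if len(record) != len(converters):
                raise ValueError(f"expected {len(converters)} columns, got {len(record)}")
            loaded.append(tuple(convert(value) for convert, value in zip(converters, record)))
        except ValueError as error:
            failures.append((record_number, str(error)))
            if len(failures) > max_failed_rows:
                raise TooManyFailedRows(
                    f"{len(failures)} failed rows exceeds the threshold of {max_failed_rows}"
                ) from error
    return loaded, failures


rows, skipped = load_csv("1,a\nx,b\n3,c\n", (int, str), max_failed_rows=1)
print(rows, skipped)
```

With max_failed_rows=0, the same input stops at the first bad row instead of skipping it.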
Features
- [DB-9707] - Scriptable Bulk Load Essentials: Allows users to create translations and launch bulk load tasks via DDL
- [DB-10479] - Adds support for Tableau through Ocient’s JDBC Custom Connector. Users can find Ocient’s connector and the installation instructions on Tableau’s extension gallery. Please refer to Tableau for more information.
Features
- [DB-9477] - Adds support for the array data type. Users can create single-dimensional array columns from any other supported data type. Please refer to the User Documentation for the latest information on supported data types.
- [DB-9656] - Add column support. The engine now supports the add column DDL statement with the ability to add columns to an existing table. Existing data that was loaded without the new column uses the configured default values when queried. Please refer to User Documentation for information on the DDL syntax and default values.
Features
- [DB-6386] - Availability of the storage engine, allowing queries to run with a node or drive failure
- [DB-6588] - Bulk loading of CSV files from HDFS or an S3 endpoint
- [DB-7221] - Delta compression in the TKT engine for timestamp columns
- [DB-7623] - Virtual tables to retrieve information from the storage cluster state
- [DB-7247] - OS Upgrade functionality
- [DB-6125] - AWS initial support
- [DB-6383] - Data Definition Language (DDL) operations
- [DB-6940] - All system configuration in the System Catalog
- [DB-6362] - Stats Virtual Tables
- [DB-7098] - External Window Operator support
- [DB-7497] - List Running Tasks Page
- [DB-7097] - Segment Group Deletion
- [DB-7139] - Cancel Query and Cancel Task support
- Numerous stability and performance improvements
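The delta compression for timestamp columns mentioned in [DB-7221] above rests on a simple idea: store the first value and then only the differences between consecutive values, which are usually small for time-ordered data. The following Python sketch shows that general idea only. It assumes integer timestamps and says nothing about the actual on-disk format of the TKT engine; the function names are hypothetical.

```python
from itertools import accumulate


def delta_encode(timestamps):
    """Return the first timestamp followed by successive differences."""
    if not timestamps:
        return []
    deltas = [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]
    return [timestamps[0]] + deltas


def delta_decode(encoded):
    """Rebuild the original timestamps with a running sum."""
    return list(accumulate(encoded))


# Hypothetical epoch-second timestamps.
column = [1700000000, 1700000001, 1700000003, 1700000003]
encoded = delta_encode(column)
print(encoded)
print(delta_decode(encoded) == column)
```

The small deltas need far fewer bits than the full values, which is where the space saving comes from once they are packed or further compressed.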