Discover Insights From System Catalog Tables
You can query the System catalog tables to discover important information on system objects such as databases, segments, tables, and indexes. These statistics can be important for monitoring your system to ensure it operates as intended.
However, querying these tables can be challenging because generating insights about your system can involve writing complex queries that involve merging, grouping, and filtering multiple system catalog tables.
Here, you can find examples of useful queries that you can use as templates for discovering insights about your system.
Most of these example queries use the data_type_coverage table available with the Ocient Simulator. For details, see Ocient Simulator.
For specific column definitions of the system catalog tables referenced in these examples, see the Storage section of the System Catalog page.
Check the size of segments and return insights about the segment storage allocation and the number of table rows contained in each segment.
Output
For information on the system catalog tables queried in this example, see sys.segments and sys.tables.
Check the details for pages and segments in the system.
Output
Check all storage spaces on your system.
Output
For specific column definitions of the system catalog tables referenced in these examples, see the Monitoring section of the System Catalog page.
Check all queries currently running on the system.
The output of the sys.queries table depends on what database you are logged into and your system role. For details, see Query Visibility.
Output
Check the most recent queries completed by the system.
The output of the sys.completed_queries table depends on what database you are logged into and your system role. For details, see Query Visibility.
Output
For specific column definitions of the system catalog tables used in these examples, see the Databases section of the System Catalog page.
Check the total size of all databases on your system.
Output
Check the total size of all tables on the system.
Output
This query provides a breakdown of table statistics and partition information.
Output
For information on the system catalog tables queried in this example, see sys.segment_groups and sys.segment_parts.
Check the information on the compression for each column in the specified table.
Output
For information on the system catalog table queried in this example, see sys.columns_compression_info.
This query summarizes the total compression of the specified table. In the example, replace the table name and schema name accordingly.
Output
For information on the system catalog table queried in this example, see sys.columns_compression_info.
Check the total size of each index across the different segment groups.
In this example, the table has three secondary indexes added before loading data.
Output
For information on the system catalog tables queried in this example, see sys.indexes and sys.segment_parts.
This query finds the size of all columns for the specified table. Before running this query on your system, replace table_name in the example.
Output
This output is based on a query using the table data_type_coverage.
This query finds the cardinality of each column in the specified table. Knowing the cardinality is useful for determining which compression scheme to use for a column.
Output
For information on the system catalog table queried in this example, see sys.column_cardinalities.
For specific column definitions of the system catalog tables referenced in these examples, see the System section of the System Catalog page.
Check the status of all nodes in the system.
Output
Check the roles of each node on the system.
Output
Check the operational status, software versions, and assigned roles of your nodes. Possible status values include ACTIVE, STARTING, STOPPING, ERROR, UNKNOWN, or UNREACHABLE.
Output
Check for all Foundation Nodes on your system clusters.
Output
Check the status of all drives assigned to nodes in the system.
Output
For specific column definitions of the system catalog tables referenced in these examples, see the User Management section of the System Catalog page.
Check for all users on the system and their corresponding assigned group.
Output
Check for all users on the system and their assigned role.
Output
Check the privileges for the specific table name. Before running on your system, replace username in the example.
Output
Check the granted privileges across all database objects of the specific type, such as GROUP, TABLE, or DATABASE. In this example, the query checks for TABLE privileges.
Output
Check for the service class assignment for all groups.
Output
Check for the settings for the specified service class. Before running on your system, replace the high priority service class name in the example.
For details about service class settings, see Workload Management and Service Classes.
Output
Check for the service class of the specific user. Before running on your system, replace the jmack@test username in this example with the specific username that you want to query.
Output
While your pipeline is running, the pipeline generates events in the sys.pipeline_events system catalog table to mark significant checkpoints when something has occurred. In this example, you run the CREATE PIPELINE and START PIPELINE SQL statements on the pipeline named my_pipeline and let the pipeline complete. As expected, the CREATED, STARTED, and COMPLETED events appear. For details about the lists of files, see the sys.pipeline_files system catalog table.
Pipeline events include messages from many different tasks. These events are the background processes that execute across different Loader Nodes during pipeline operation. For details about tasks, you can query the sys.tasks and sys.subtasks system catalog tables.
Output
Find metrics that measure the performance and activity of a pipeline in the sys.pipeline_metrics system catalog table. This table contains one row for each metric at a specified point in time. A common use of this data is to plot metric values over time to examine the behavior of a pipeline.
Metric Types
Metrics are either instantaneous or incremental:
- Instantaneous metrics reflect the current value of a counter at the time of the metric collection. These values can increase or decrease based on the current state or rate of the metric.
- Incremental metrics increment over time and reflect the cumulative value for a specified counter.
Example
The first snapshot of a metric appears when it is initialized. Afterward, the metric is updated every 10 seconds. The updated_at column shows the metric collection time.
For example, if you are interested in the number of the transformed record bytes, you can run this query. The value increases over time as the pipeline runs.
Output
Metric Scope
Each metric has a scope that defines its uniqueness. Be careful not to aggregate metrics across different scopes.
The scope columns are pipeline_id, extractor_task_id, partition_id, and sink_index. If the value of any of these columns is NULL, the scope applies to all values in that dimension.
For example, because the partition_id and sink_index columns are NULL for the metric count.record.bytes.transformed, its scope applies to the pipeline indicated by the pipeline_id and extractor_task_id across all sinks and partitions.
An extractor_task_id is the internal identifier used when the Ocient System executes a pipeline process on a Loader Node. A pipeline might have many of these processes across time and loaders.
Example
Retrieve the most recent snapshot for each metric and aggregate the values to produce a single value for a pipeline.
Output
This output contains a single row for each metric for each pipeline. If you run a pipeline with 64 files and 100,000 records, the output might look like this. This table explains each column in the output.