# Configure Storage Spaces
When you set up an {{ocient}} system, a storage cluster represents a set of Foundation Nodes in the system. A storage space is a set of parameters within a storage cluster that defines how the system administers data and fault tolerance.

Creating a storage space is an important step for launching an Ocient system. You must create it after installing hardware and bootstrapping nodes, but before you start loading and querying data. To see how creating a storage space fits into the Ocient installation process, see docid\nneedy7yn8g1pennmamng.

This tutorial explains how to set up and configure storage spaces to meet the needs of your Ocient system.

The Ocient Hyperscale Data Warehouse organizes data on Foundation Nodes into segments; segments are grouped into segment groups, which in turn are grouped into storage clusters. Tables, along with their metadata, are stored in segments.

When an Ocient system starts, the system automatically creates a default system storage space for persistent metadata, such as system catalog tables. This default storage space is immutable, meaning you cannot delete or modify it. The name of the metadata storage space is `systemstoragespace` in the [system catalog](https://docs.ocient.com/system-catalog#li63a) table.

## User-Defined Storage Spaces

For storage other than metadata, you must create one or more separate storage spaces. You must create these storage spaces before loading data or performing other operations. User-defined storage spaces are configurable for extra resiliency or storage, based on your system needs.

A storage space represents how the storage cluster spreads data across Foundation Nodes to balance storage and fault tolerance. In general terms, the storage space configuration defines how the storage cluster behaves for these attributes:

- Regular data storage operations (see [Segment Group Width](#segment-group-width))
- Query resiliency (see [Parity Width](#parity-width))
- Load resiliency (see [Overprovision](#overprovision))

You can configure a storage space to assign nodes to these job roles by using DDL statements (see docid\xga0pas8wadtq33 a x7v).

Plan your storage space carefully, because you cannot change a storage space configuration after creation.

## Segment Group Width

Storage space parameter: `WIDTH`

The segment group width determines the number of segments that comprise a segment group. Functionally, this setting defines how many Foundation Nodes perform read and write operations for a segment group. Within a segment group, the system assigns each segment to a different node; hence, the segment group width cannot exceed the number of nodes in your system. In most circumstances, the segment group width should comprise most of your Foundation Nodes. On system startup, the default segment group width is three, which is the minimum number of nodes required for an Ocient system.

## Parity Width

Storage space parameter: `PARITY WIDTH`

You can assign a subset of the segment group width to parity width. The parity width of a storage space determines its fault tolerance by defining the number of parity coding bits to use for each segment group. This number determines how many nodes can fail before the cluster is disabled.

Parity width protects system querying capabilities. If any nodes fail, the system can execute and complete queries as long as the number of failed nodes is less than or equal to the parity width. When deployed in conjunction with overprovisioned nodes, parity width also provides fault tolerance for loading operations.

Parity width requires additional storage overhead, which is calculated by this formula:

(parity width) / (segment group width - parity width)
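As a quick sanity check before creating a storage space, you can evaluate this formula for a candidate configuration. The snippet below is plain SQL arithmetic that runs in most SQL clients; it is illustrative only, not Ocient-specific syntax, and the values mirror the example configurations later in this tutorial:

```sql
-- Parity storage overhead per the formula above:
-- (parity width) / (segment group width - parity width).
-- Values mirror the example configurations in the next section.
SELECT 10 AS width,
       2  AS parity_width,
       CAST(2 AS DECIMAL(10, 4)) / (10 - 2) AS overhead   -- 0.25
UNION ALL
SELECT 10,
       3,
       CAST(3 AS DECIMAL(10, 4)) / (10 - 3);              -- ~0.4286
```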
## Overprovision

By default, the Ocient system overprovisions any Foundation Nodes in the cluster in excess of the segment group width. These nodes can include any nodes not configured in the storage space or any nodes added later.

Overprovisioned nodes protect loading operations from node failure, although they must be used in conjunction with parity width. In the event of node failures, loading operations can continue as long as the number of failed nodes exceeds neither the parity width nor the number of overprovisioned nodes. For this reason, providing fault tolerance for loading requires a balance of parity width and overprovisioned nodes.

Unlike segment group width and parity width, you do not explicitly define a value for overprovisioning. Instead, any Foundation Nodes in the cluster in excess of the number allocated toward segment group width or parity width become overprovisioned by default.

## Storage Space Configuration Examples

This section demonstrates different configurations for storage spaces. Each example walks through different node failure scenarios to show the fault tolerance of the setup. For information on the syntax for storage spaces, see docid\xga0pas8wadtq33 a x7v.

### 10-Node Cluster

This example assumes you have a storage cluster of 10 Foundation Nodes.

```sql
CREATE STORAGESPACE ocient WIDTH = 10, PARITY WIDTH = 2;
```

The `WIDTH = 10` parameter means the system stores a segment group on 10 different nodes. The `PARITY WIDTH = 2` parameter means each segment in the group contains enough parity bits to restore lost data for up to two nodes. In effect, this means parity bits comprise 20 percent of storage.

#### Tolerance Scenarios

This storage space setup has no overprovisioning, making loading operations less resilient. This setup results in these outcomes if nodes become disabled:

- If one node fails, loading fails, and querying can continue.
- If two nodes fail, loading fails, and querying can continue.
- If three nodes fail, loading and querying both fail.

### 12-Node Cluster

This example assumes a storage cluster of 12 Foundation Nodes.

```sql
CREATE STORAGESPACE ocient WIDTH = 10, PARITY WIDTH = 2;
```

Note that this DDL statement is the same as the [10-Node Cluster](#10-node-cluster) example, but the cluster in this example has two additional nodes. The `WIDTH = 10` parameter means the system stores a segment group on 10 of the 12 nodes. The two remaining nodes are overprovisioned. The `PARITY WIDTH = 2` parameter means each segment in the group contains enough parity bits to restore lost data for up to two nodes. In effect, this means parity bits comprise 16.6 percent of storage.

#### Tolerance Scenarios

This setup provides a level of resiliency for both querying and loading. The setup results in these outcomes if nodes become disabled (see the sketch after this list):

- If one node fails, loading and querying continue.
- If two nodes fail, loading and querying continue.
- If three nodes fail, loading and querying both fail.
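These outcomes follow the two rules stated earlier: querying continues while the number of failed nodes is at most the parity width, and loading continues only while the failed-node count also stays within the overprovisioned node count. The following sketch encodes both rules for the 12-node configuration; it is generic SQL for illustration only, and the `config` and `failures` names are hypothetical, not Ocient system tables:

```sql
-- Illustrative only: generic SQL encoding the tolerance rules above for
-- the 12-node example (WIDTH 10, PARITY WIDTH 2, 2 overprovisioned nodes).
WITH config AS (
    SELECT 2 AS parity_width, 2 AS overprovisioned
),
failures (failed_nodes) AS (
    VALUES (1), (2), (3)
)
SELECT f.failed_nodes,
       -- Querying survives while failed nodes <= parity width.
       CASE WHEN f.failed_nodes <= c.parity_width
            THEN 'continues' ELSE 'fails' END AS querying,
       -- Loading also requires failed nodes <= overprovisioned nodes.
       CASE WHEN f.failed_nodes <= c.parity_width
             AND f.failed_nodes <= c.overprovisioned
            THEN 'continues' ELSE 'fails' END AS loading
  FROM failures AS f
 CROSS JOIN config AS c;
```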
### 12-Node Cluster with More Parity Width

This example assumes a storage cluster of 12 Foundation Nodes with more nodes allocated to parity.

```sql
CREATE STORAGESPACE ocient WIDTH = 10, PARITY WIDTH = 3;
```

The `WIDTH = 10` parameter means the system stores a segment group on 10 of the 12 nodes. The two remaining nodes are overprovisioned. The `PARITY WIDTH = 3` parameter means each segment in the group contains enough parity bits to restore lost data for up to three nodes. In effect, this means parity bits comprise 42 percent of storage.

#### Tolerance Scenarios

This setup provides resiliency for loading and especially for querying. The setup results in these outcomes if nodes become disabled:

- If one node fails, both loading and querying continue.
- If two nodes fail, both loading and querying continue.
- If three nodes fail, loading fails, and querying continues.
- If four nodes fail, both loading and querying fail.

### 15-Node Cluster

This example assumes a cluster of 15 Foundation Nodes. The configuration in this setup includes five overprovisioned nodes, exceeding the parity width. This many overprovisioned nodes would be inefficient and unable to provide the full benefit in a real system, but this example demonstrates what happens if you add extra nodes to a cluster at a later time, which the storage space recognizes as overprovisioned.

```sql
CREATE STORAGESPACE ocient WIDTH = 10, PARITY WIDTH = 3;
```

The `WIDTH = 10` parameter means the system stores a segment group on 10 of the 15 nodes. The five remaining nodes are overprovisioned. The `PARITY WIDTH = 3` parameter means each segment in the group contains enough parity bits to restore lost data for up to three nodes. In effect, this means parity bits comprise 20 percent of storage.

#### Tolerance Scenarios

This setup provides resiliency for both querying and loading, but having more than three overprovisioned nodes provides no benefit because it exceeds the parity width of three. This setup results in these outcomes if nodes become disabled:

- If one node fails, loading and querying continue.
- If two nodes fail, loading and querying continue.
- If three nodes fail, loading and querying continue.
- If four nodes fail, loading and querying both fail because this breaches the parity width tolerance.

## Related Links

- docid\adp4amtxi3djdrsq2khdb
- docid\xga0pas8wadtq33 a x7v