Power Safety Recommendations
Like any distributed data processing system, Ocient depends on power safety to ensure data integrity. Data integrity in Ocient covers newly loaded data that is in the process of being written to storage and metadata files that contain information about the state of the system. Ensure that the NVMe drives include power loss protection. The Ocient system does provide some protection against power loss. Even so, follow these best practices for power management.

NVMe SSD Power Loss Protection (PLP)

The Ocient Hyperscale Data Warehouse uses NVMe solid state drives (SSDs) to store data. To improve I/O performance, SSDs cache data in a temporary volatile memory buffer before flushing it to non-volatile flash memory. During a normal shutdown, the host system notifies the SSD that the system is shutting down so that the SSD can flush data to non-volatile storage. However, unexpected power loss can result in loss or corruption of this data if the contents of volatile memory are not preserved.

To protect against this data loss, NVMe drives validated for use with Ocient systems must include power loss protection (PLP) mechanisms. PLP is standard in enterprise and data center grade NVMe drives. It is typically implemented using supercapacitors that hold enough power to write cached data safely to non-volatile memory in the event of a power loss.

Ocient Distributed System Protections

While data is loaded, it temporarily resides in memory (RAM) or disk caches. Data stored in either location is safely persisted to disk during a planned shutdown of nodes. During an unplanned power loss or system crash, the operating system or software might not be able to persist this data to disk, so Ocient is designed to handle this scenario gracefully.

Ocient provides redundancy mechanisms using clustering to prevent metadata loss due to a sudden power outage on a single node. The Administrator cluster consists of multiple nodes performing the Administrator role that maintain system metadata. This prevents metadata loss when a single node is lost or damaged.

Foundation storage clusters use a parity configuration to protect data in the event of sudden power loss to a Foundation node. In such an event, in-flight queries might fail, but without any permanent loss of data. Dynamic rebuilding of missing data allows a configurable number of Foundation nodes to lose power while Ocient continues to serve queries. If power loss damages a Foundation node or causes data corruption, the data on the damaged node can be rebuilt from the parity bits available on other nodes to fully recover the lost data.

These protections might not hold for simultaneous power loss on more than one Administrator node, or on more Foundation nodes than the number of parity bits in the storage cluster. Such coordinated power loss could leave an Ocient system in a state where Administrator metadata is corrupted or damaged data on Foundation nodes cannot be fully rebuilt. To prevent simultaneous power loss, follow the best practices in the next section.

Power Safety Best Practices

Avoiding sudden loss of power to an Ocient system greatly reduces the risk of complete system loss. Best practices to reduce this risk are:

- NVMe drives must include PLP.
- Each Ocient node should be configured with redundant power supplies, where each power supply is serviced by an independent electrical circuit.
- A rack-level or data center (DC) level uninterruptible power system (UPS) should be available to provide constant power to Ocient nodes during a sudden power outage.
- When power loss occurs and the UPS cannot provide power for an extended period of time, you must properly shut down Ocient nodes by using the Startup and Shutdown Procedure.
- Ocient systems should be hosted in Tier III or Tier IV data centers. These data centers provide redundant power and cooling systems and guarantee minimal downtime.

Related Links

Install an Ocient System
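
The limits of the built-in protections described above (no more than one Administrator node lost at once, and no more Foundation nodes lost than there are parity bits in the storage cluster) can be expressed as a simple check. The following Python sketch is illustrative only and is not part of any Ocient tool or API. The names `PowerLossEvent`, `is_within_protection`, `parity_width`, and `max_admin_nodes_lost` are hypothetical, and the single-node Administrator limit is an assumption taken from the wording of this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PowerLossEvent:
    """Hypothetical description of a simultaneous power loss."""
    admin_nodes_lost: int       # Administrator nodes that lost power together
    foundation_nodes_lost: int  # Foundation nodes that lost power together


def is_within_protection(event: PowerLossEvent,
                         parity_width: int,
                         max_admin_nodes_lost: int = 1) -> bool:
    """Return True if the event stays inside the limits stated on this page.

    parity_width: number of parity bits configured for the storage cluster.
    max_admin_nodes_lost: assumed limit of one Administrator node, based on
    the text; it is not a documented Ocient setting.
    """
    if min(event.admin_nodes_lost, event.foundation_nodes_lost,
           parity_width, max_admin_nodes_lost) < 0:
        raise ValueError("node counts and parity width must be non-negative")
    admin_ok = event.admin_nodes_lost <= max_admin_nodes_lost
    foundation_ok = event.foundation_nodes_lost <= parity_width
    return admin_ok and foundation_ok


if __name__ == "__main__":
    # One Administrator node and two Foundation nodes lost, parity width 2.
    print(is_within_protection(PowerLossEvent(1, 2), parity_width=2))  # True
    # Three Foundation nodes lost exceeds a parity width of 2.
    print(is_within_protection(PowerLossEvent(0, 3), parity_width=2))  # False
```

A check like this could help when planning rack and circuit layouts: if a single circuit or UPS failure would take down more nodes than the limits allow, the power design puts the system outside the protections described here.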