Sink Configuration
The Sink Configuration controls the destination of data in the LAT Pipeline.
A sink configuration object.
Required keys:
Type of sink to use in the pipeline.
Type: | string |
Required: | Yes |
Default: | |
Allowed values:
Required keys:
Array of one or more Ocient Loader Nodes, in host:port,... format
Type: | string[] |
Required: | Yes |
Default: | |
Number of records to buffer per partition before flushing them to Ocient.
Type: | int |
Required: | No |
Default: | 1000 |
Time-based flushing parameter, in milliseconds. Records flush to Ocient after this duration has elapsed with no new activity, even if fewer than batch_records records have been buffered.
Type: | int |
Required: | Yes |
Default: | 30000 |
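The interaction between the count-based and time-based flush settings can be sketched as follows. This is a minimal illustration, not the LAT implementation: the class `PartitionBuffer`, the method names, and the name `batch_timeout_ms` (standing in for the time-based flushing parameter) are all hypothetical, and time is passed in explicitly so the logic is easy to follow.

```python
from dataclasses import dataclass, field


@dataclass
class PartitionBuffer:
    """Hypothetical per-partition buffer illustrating the two flush triggers."""
    batch_records: int = 1000       # documented default record threshold
    batch_timeout_ms: int = 30000   # hypothetical name; documented default is 30000 ms
    records: list = field(default_factory=list)
    last_activity_ms: int = 0

    def add(self, record, now_ms):
        # Every new record counts as activity and resets the idle timer.
        self.records.append(record)
        self.last_activity_ms = now_ms

    def should_flush(self, now_ms):
        if not self.records:
            return False
        # Count-based trigger: enough records are buffered.
        if len(self.records) >= self.batch_records:
            return True
        # Time-based trigger: no new activity for the configured duration.
        return now_ms - self.last_activity_ms >= self.batch_timeout_ms
```

Under these assumptions, a partition that receives a steady trickle of records flushes only when the record threshold is reached, while a partition that goes quiet flushes once the idle duration has passed.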
Time-based polling parameter, in milliseconds. The sink periodically polls the remote for progress on write durability for idle partitions.
Type: | int |
Required: | No |
Default: | 60000 |
Request timeout when communicating with Ocient remotes, in milliseconds.
Type: | int |
Required: | No |
Default: | 300000 |
Duration to delay after a failed request to an Ocient remote prior to retrying, in milliseconds.
Type: | int |
Required: | No |
Default: | 1000 |
Additional duration to delay after a failed request to an Ocient remote prior to retrying, in milliseconds. The total delay incurred prior to a given retry is request_backoff + rand(0, request_jitter).
Type: | int |
Required: | No |
Default: | 5000 |
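The retry delay formula above, request_backoff + rand(0, request_jitter), can be written out as a short sketch. The function name `retry_delay_ms` is hypothetical, and a uniform distribution is assumed for rand, which the text does not state explicitly.

```python
import random


def retry_delay_ms(request_backoff=1000, request_jitter=5000, rng=random):
    """Return the delay before a retry, in milliseconds.

    Defaults mirror the documented defaults. The jitter is assumed to be
    drawn uniformly from [0, request_jitter].
    """
    return request_backoff + rng.uniform(0, request_jitter)
```

With the default values, each retry waits between 1000 and 6000 milliseconds. The random component spreads retries out so that several clients failing at once do not all retry at the same instant.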
High watermark memory point, in bytes. When buffered memory reaches this level, the LAT stops pushing new rows to memory buffers and does not resume until low_watermark is reached.
Type: | int |
Required: | No |
Default: | 1000000000 |
Low watermark memory point, in bytes. After reaching high_watermark, the LAT will begin pushing rows to memory buffers again when this memory level is reached.
Type: | int |
Required: | No |
Default: | 500000000 |
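The two watermarks together describe a hysteresis: intake pauses at the high mark and resumes only once usage has dropped to the low mark. The sketch below illustrates that behavior under stated assumptions. The class `WatermarkGate` and its `update` method are hypothetical, and the exact comparison operators (at or above the high mark, at or below the low mark) are an assumption.

```python
class WatermarkGate:
    """Hypothetical gate that decides whether new rows may be buffered."""

    def __init__(self, high_watermark=1_000_000_000, low_watermark=500_000_000):
        if low_watermark > high_watermark:
            raise ValueError("low_watermark must not exceed high_watermark")
        self.high_watermark = high_watermark
        self.low_watermark = low_watermark
        self.paused = False

    def update(self, buffered_bytes):
        """Record current memory use; return True if rows may be pushed."""
        if self.paused:
            # Once paused, stay paused until usage drains to the low mark.
            if buffered_bytes <= self.low_watermark:
                self.paused = False
        elif buffered_bytes >= self.high_watermark:
            # Reaching the high mark pauses intake.
            self.paused = True
        return not self.paused
```

The gap between the two values prevents rapid toggling: with the defaults, intake that pauses at 1 GB of buffered rows stays paused until usage drains to 500 MB.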
UUID of the storage scope that rows will be associated with. The scope with the given UUID must already exist in the target cluster.
Type: | string |
Required: | No |
Default: | null |
A Boolean value to determine whether to omit page replicas for the specified storage scope. This is ignored if sink.storage_scope_id is not specified or has already been seen by the remotes.
Type: | boolean |
Required: | No |
Default: | false |
The number of threads in the Netty event loop group used to communicate with remotes.
Type: | int |
Required: | No |
Default: | 1 |
A Sink type for testing LAT pipelines that writes the transformed data to local JSONL files.
Required keys:
An absolute or relative path to the location that the sink should write files to.
Type: | string |
Required: | Yes |
Default: | |
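For readers unfamiliar with JSONL, the format is one JSON object per line. The sketch below shows what writing transformed records in that format looks like in general. It is not the LAT file sink itself: the function `write_jsonl`, the file name, and the record fields are hypothetical.

```python
import json
from pathlib import Path


def write_jsonl(records, path):
    """Write each record as one JSON object per line (JSONL)."""
    path = Path(path)
    # Create the target directory if needed; the path may be absolute or relative.
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record) + "\n")
    return path
```

Because each line is an independent JSON document, the output can be inspected with ordinary line-oriented tools or read back one line at a time with `json.loads`.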
Rather than including a sink directly within the pipeline, it is also possible to configure a pipeline to use a sink that is specified externally. Sinks can be managed (created, deleted, and more) using the LAT Client Command Line Interface. A sink must exist before a pipeline can use it.
There are three ways to configure a pipeline to use a sink.
- Specify a sink directly within the pipeline.
- If a sink is not specified within the pipeline, you can specify a sink_name that corresponds to a sink previously created using the LAT Client.
- If neither sink nor sink_name is specified in a pipeline, the default sink will be used. If a default sink has not been created using the LAT Client, a pipeline must specify either a sink or a sink_name.
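The selection rules above can be summarized in a short sketch. The function `resolve_sink` and the dictionary shapes are hypothetical and do not reflect the LAT's internals. Giving an inline sink precedence when both sink and sink_name are present is also an assumption.

```python
def resolve_sink(pipeline, named_sinks, default_sink=None):
    """Pick the sink configuration a pipeline would use.

    pipeline:     dict that may contain "sink" or "sink_name"
    named_sinks:  dict of sinks previously created with the LAT Client
    default_sink: the default sink, or None if none has been created
    """
    # 1. A sink included directly in the pipeline.
    if "sink" in pipeline:
        return pipeline["sink"]
    # 2. A reference to an externally managed sink, which must already exist.
    if "sink_name" in pipeline:
        name = pipeline["sink_name"]
        if name not in named_sinks:
            raise KeyError(f"sink {name!r} does not exist")
        return named_sinks[name]
    # 3. Fall back to the default sink, if one has been created.
    if default_sink is not None:
        return default_sink
    raise ValueError("pipeline must specify either sink or sink_name")
```

For example, `resolve_sink({"sink_name": "prod"}, {"prod": {...}})` returns the stored configuration for the hypothetical sink named "prod", while a pipeline with neither key and no default sink is rejected.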