sink
A sink configuration object.
Required keys:
sink.type [#sink-type]
Type of sink to use in the pipeline.
| Type: | string |
|---|---|
| Required: | Yes |
| Default: | |

- ocient: see Ocient Sink for additional configuration.
- file: see File Sink for additional configuration.
Ocient Sink
The Ocient Sink allows LAT to connect to an Ocient cluster to write rows to one or more tables.
Required keys:
sink.remotes [#sink-remotes]
Array of one or more Ocient Loader Nodes, in host:port,... format
| Type: | string[] |
|---|---|
| Required: | Yes |
| Default: | |
sink.batch_records
Number of records to buffer per partition before flushing them to Ocient.
| Type: | int |
|---|---|
| Required: | No |
| Default: | 1000 |
sink.batch_duration
Time-based flushing parameter, in milliseconds. Records flush to Ocient after this duration has elapsed with no new activity, even if fewer than batch_records records have been processed.
| Type: | int |
|---|---|
| Required: | Yes |
| Default: | 30000 |
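The interaction between batch_records and batch_duration can be sketched as follows. This is a minimal illustration, not the LAT implementation: the PartitionBuffer class, its method names, and the caller-supplied clock (now_ms) are all hypothetical, and it assumes that "no new activity" means no record has arrived for the partition since the last one was buffered.

```python
class PartitionBuffer:
    """Hypothetical per-partition buffer illustrating the two flush triggers."""

    def __init__(self, batch_records=1000, batch_duration=30000):
        self.batch_records = batch_records    # count-based trigger
        self.batch_duration = batch_duration  # idle-time trigger, in milliseconds
        self.records = []
        self.last_activity_ms = None

    def add(self, record, now_ms):
        """Buffer a record; return the flushed batch if the count trigger fires."""
        self.records.append(record)
        self.last_activity_ms = now_ms  # any new record resets the idle timer
        if len(self.records) >= self.batch_records:
            return self._flush()
        return []

    def tick(self, now_ms):
        """Called periodically; flush a partial batch once it has been idle long enough."""
        if self.records and now_ms - self.last_activity_ms >= self.batch_duration:
            return self._flush()
        return []

    def _flush(self):
        flushed, self.records = self.records, []
        return flushed
```

With a small batch_records value and a long batch_duration, flushes are driven mostly by record count; with a large batch_records value, a quiet partition still flushes once the idle period elapses.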
sink.idle_partition_polling_period
Time-based polling parameter, in milliseconds. The sink periodically polls the remotes for progress on write durability for idle partitions.
| Type: | int |
|---|---|
| Required: | No |
| Default: | 60000 |
sink.request_timeout
Request timeout when communicating with Ocient remotes, in milliseconds.
| Type: | int |
|---|---|
| Required: | No |
| Default: | 300000 |
sink.request_backoff
Duration to delay after a failed request to an Ocient remote prior to retrying, in milliseconds.
| Type: | int |
|---|---|
| Required: | No |
| Default: | 1000 |
sink.request_jitter
Additional duration to delay after a failed request to an Ocient remote prior to retrying, in milliseconds. The total delay incurred prior to a given retry is request_backoff + rand(0, request_jitter).
| Type: | int |
|---|---|
| Required: | No |
| Default: | 5000 |
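The retry delay formula for request_backoff and request_jitter can be sketched in a few lines. This is an illustration under assumptions, not the LAT source: the function name retry_delay_ms is hypothetical, and rand(0, request_jitter) is assumed to be a uniform draw, since the documentation does not state the distribution or whether the upper bound is inclusive.

```python
import random


def retry_delay_ms(request_backoff=1000, request_jitter=5000, rng=random):
    """Total delay before a retry: request_backoff + rand(0, request_jitter).

    Assumption: rand(0, request_jitter) is modeled as a uniform draw.
    """
    return request_backoff + rng.uniform(0, request_jitter)
```

With the defaults, each retry waits between 1000 and 6000 milliseconds. Setting request_jitter to 0 makes the delay a fixed request_backoff.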
sink.high_watermark
High watermark memory point, in bytes. When memory use reaches this level, the LAT stops pushing new rows to memory buffers and does not resume until low_watermark is reached.
| Type: | int |
|---|---|
| Required: | No |
| Default: | 1000000000 |
sink.low_watermark
Low watermark memory point, in bytes. After reaching high_watermark, the LAT will begin pushing rows to memory buffers again when this memory level is reached.
| Type: | int |
|---|---|
| Required: | No |
| Default: | 500000000 |
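The high_watermark and low_watermark pair describes a hysteresis: pushing stops at the high mark and resumes only at the low mark, which avoids rapid toggling around a single threshold. The sketch below illustrates that behavior under assumptions. The WatermarkGate class is hypothetical, the comparisons (at or above high, at or below low) are a guess at the boundary semantics, and the check that low_watermark does not exceed high_watermark is added for the example rather than taken from the documentation.

```python
class WatermarkGate:
    """Hypothetical gate showing high/low watermark hysteresis (values in bytes)."""

    def __init__(self, high_watermark=1_000_000_000, low_watermark=500_000_000):
        # Assumption: a low mark above the high mark is treated as a config error.
        if low_watermark > high_watermark:
            raise ValueError("low_watermark must not exceed high_watermark")
        self.high = high_watermark
        self.low = low_watermark
        self.paused = False

    def update(self, buffered_bytes):
        """Record current buffer usage; return True if new rows may be pushed."""
        if not self.paused and buffered_bytes >= self.high:
            self.paused = True   # reached the high mark: stop pushing rows
        elif self.paused and buffered_bytes <= self.low:
            self.paused = False  # drained to the low mark: resume pushing rows
        return not self.paused
```

Note that a usage level between the two marks gives a different answer depending on which mark was crossed last.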
sink.storage_scope_id
UUID of the storage scope that rows will be associated with. The scope with the given UUID must already exist in the target cluster.
| Type: | string |
|---|---|
| Required: | No |
| Default: | null |
sink.skip_page_replication
A Boolean value to determine whether to omit page replicas for the specified storage scope. This is ignored if sink.storage_scope_id is not specified or has already been seen by the remotes.
| Type: | boolean |
|---|---|
| Required: | No |
| Default: | false |
sink.netty_event_loop_group_threads
The number of threads in the Netty event loop group used to communicate with remotes.
| Type: | int |
|---|---|
| Required: | No |
| Default: | 1 |
Example Ocient Sink Configuration
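As a hedged illustration, the snippet below builds an Ocient sink configuration as a Python dictionary and prints it as JSON. The nesting under a top-level sink object is inferred from the dotted key names above, the host names and port are hypothetical placeholders, and each array element is assumed to hold one host:port entry.

```python
import json

ocient_sink = {
    "sink": {
        "type": "ocient",
        # Hypothetical Loader Node addresses; replace with real host:port values.
        "remotes": ["loader-1.example.com:5050", "loader-2.example.com:5050"],
        # The remaining keys restate the documented defaults for visibility.
        "batch_records": 1000,
        "batch_duration": 30000,
        "request_backoff": 1000,
        "request_jitter": 5000,
    }
}

print(json.dumps(ocient_sink, indent=2))
```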
File Sink
A sink type for testing LAT pipelines that writes the transformed data to local JSONL files.
Required keys:
sink.location [#sink-location]
An absolute or relative path to the location that the sink should write files to.
| Type: | string |
|---|---|
| Required: | Yes |
| Default: | |
Example File Sink Configuration
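As a hedged illustration, a File Sink needs only the two required keys. The snippet below builds the configuration as a Python dictionary and prints it as JSON; the nesting under a top-level sink object is inferred from the dotted key names, and the output directory is a hypothetical relative path.

```python
import json

file_sink = {
    "sink": {
        "type": "file",
        # Hypothetical relative output directory for the JSONL files.
        "location": "./lat_output",
    }
}

print(json.dumps(file_sink, indent=2))
```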
External Sink Configuration
Rather than including a sink directly within the pipeline, it is also possible to configure a pipeline to use a sink that is specified externally. Sinks can be managed (created, deleted, and more) using the LAT Client Command Line Interface. A sink must exist before a pipeline can use it. There are three ways to configure a pipeline to use a sink.

- If a sink is included directly within a pipeline (using the LAT Sink Configuration), it will be used.
- If a sink is not specified within the pipeline, you can specify a sink_name that corresponds to a sink previously created using the LAT Client.
- If neither sink nor sink_name is specified in a pipeline, the default sink will be used. If a default sink has not been created using the LAT Client, a pipeline must specify either a sink or a sink_name.
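The precedence among the three options can be sketched as a small lookup function. This is an illustration only: resolve_sink is a hypothetical name, the named_sinks dictionary stands in for sinks previously created with the LAT Client, and the error types are choices made for the example rather than documented LAT behavior.

```python
def resolve_sink(pipeline, named_sinks, default_sink=None):
    """Pick the sink for a pipeline: inline sink, then sink_name, then default."""
    # 1. A sink included directly within the pipeline is used.
    if "sink" in pipeline:
        return pipeline["sink"]
    # 2. Otherwise, sink_name must refer to a sink that already exists.
    if "sink_name" in pipeline:
        name = pipeline["sink_name"]
        if name not in named_sinks:
            raise KeyError(f"sink {name!r} has not been created")
        return named_sinks[name]
    # 3. Otherwise, fall back to the default sink, if one has been created.
    if default_sink is not None:
        return default_sink
    raise ValueError("pipeline must specify either a sink or a sink_name")
```

For example, a pipeline that contains both keys resolves to its inline sink under this sketch, because the inline check runs first.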

