Transform Configuration

transform

A transform configuration object.

Required keys:

transform.topics

A collection of topics and their associated configuration. Keys are topic names and values are topic configuration objects. Each <topic> set as a key in this object represents a topic defined in Kafka.

When loading from an S3 or local file Source type, use transform.file_groups instead of transform.topics. Each key in transform.file_groups must match a file_group_name defined in the Source section of the pipeline configuration.

Type:

object

Required:

Yes

Default:



transform.database

A database to be used to fully qualify any table names that are not fully qualified.

For example, if transform.database is set to myDatabase, a table name of the form schema.table will become myDatabase.schema.table.

Type:

string

Required:

No

Default:

null

transform.schema

A schema to be used to fully qualify any table names that are not fully qualified. Setting this property requires that transform.database also be set.

For example, if transform.database is set to myDatabase and transform.schema is set to mySchema:

  • A table name of the form table will become myDatabase.mySchema.table
  • A table name of the form schema.table will become myDatabase.schema.table
  • A table name of the form database.schema.table will stay as database.schema.table
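The three rules above can be sketched as a small function. This is only an illustration of the documented behavior, not the loader's actual implementation: the function name qualify_table_name is hypothetical, names are split naively on "." (quoted identifiers are not handled), and the fallback for a bare table name when no schema is configured is an assumption, since the text does not cover that case.

```python
def qualify_table_name(name, database=None, schema=None):
    """Qualify a table name using transform.database / transform.schema defaults."""
    if schema is not None and database is None:
        # The text states that transform.schema requires transform.database.
        raise ValueError("transform.schema requires transform.database")

    parts = name.split(".")  # naive split; quoted identifiers are not handled
    if len(parts) >= 3:
        # database.schema.table is already fully qualified: leave it alone.
        return name
    if len(parts) == 2:
        # schema.table: only the database is missing.
        return f"{database}.{name}" if database is not None else name
    if database is not None and schema is not None:
        # Bare table name: prepend both defaults.
        return f"{database}.{schema}.{name}"
    # Assumption: without a configured schema, a bare name is left unchanged.
    return name


for table in ("table", "schema.table", "database.schema.table"):
    print(table, "->", qualify_table_name(table, "myDatabase", "mySchema"))
```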

Type:

string

Required:

No

Default:

null

Kafka Load Transform Example
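
A minimal sketch of the shape a Kafka transform section might take, assembled only from the keys documented on this page and printed as JSON. The topic name orders, the table name order_lines, the column names, and the record fields are all hypothetical, and the filter strings are placeholders because filter syntax is described under Record Filtering rather than here.

```python
import json

# Hypothetical names throughout: "orders", "order_lines", and the column and
# record field names are illustrative, not taken from a real pipeline.
kafka_transform = {
    "database": "myDatabase",
    "schema": "mySchema",
    "topics": {
        "orders": {
            # Placeholder: see Record Filtering for the real filter syntax.
            "filter": "<record filter>",
            "tables": {
                "order_lines": {
                    "filter": "<record filter>",
                    "columns": {
                        # Each value is a JMESPath-style expression on the record.
                        "order_id": "order.id",
                        "customer_name": "order.customer.name",
                    },
                },
            },
        },
    },
}

# Render the section under its "transform" key as it would sit in a pipeline file.
kafka_transform_json = json.dumps({"transform": kafka_transform}, indent=2)
print(kafka_transform_json)
```

With transform.database and transform.schema set as shown, the unqualified table name order_lines would resolve to myDatabase.mySchema.order_lines.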


File Based Load Transform Example

Unlike Kafka loads, file-based loads define file groups in the Source section of the pipeline configuration. The file group names defined in the Source section must match the keys of transform.file_groups.
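
The matching requirement can be checked before a pipeline is submitted. The sketch below is an assumption-laden illustration: check_file_groups is a hypothetical helper, the file group names are invented, and the Source section is represented only by a list of its file group names because its structure is not described on this page.

```python
def check_file_groups(source_file_group_names, transform):
    """Compare Source file group names with the keys of transform.file_groups."""
    source_names = set(source_file_group_names)
    transform_names = set(transform.get("file_groups", {}))
    return {
        "missing_in_transform": sorted(source_names - transform_names),
        "missing_in_source": sorted(transform_names - source_names),
    }


# Hypothetical file-based transform: file_groups takes the place of topics.
file_transform = {
    "file_groups": {
        "daily_orders": {
            # Placeholder: see Record Filtering for the real filter syntax.
            "filter": "<record filter>",
            "tables": {
                "order_lines": {
                    "filter": "<record filter>",
                    "columns": {"order_id": "order.id"},
                },
            },
        },
    },
}

# "returns" is declared in the (hypothetical) Source section but has no transform.
report = check_file_groups(["daily_orders", "returns"], file_transform)
print(report)
```

Both lists in the report should be empty for a consistent configuration.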


Topics

Topic configuration objects. For file-based loads, topics are replaced by file_groups, but all other settings are equivalent.

Required keys:

transform.topics.<topic>.filter

A record filter to apply at the topic level. See Record Filtering for details.

Type:

string

Required:

Yes

Default:



transform.topics.<topic>.tables

A collection of tables and their associated configuration. Keys are table names and values are table configuration objects.

Type:

object

Required:

Yes

Default:



Tables

Table configuration objects.

Required keys:

transform.topics.<topic>.tables.<table>.filter

A record filter to apply at the table level. See Record Filtering for details.

Type:

string

Required:

Yes

Default:



transform.topics.<topic>.tables.<table>.columns

A collection of columns and their associated configurations. Keys are column names and values are column transformation expressions.

Columns

Column transformation configurations.

Required keys:

transform.topics.<topic>.tables.<table>.columns.<column>

A column transformation keyed by a column name. A column’s value is defined as a transformation expression. The expression will query the record and return a value that is loaded into the associated column. The grammar of these expressions uses JMESPath enhanced with some custom transformations and User Defined Transformations (UDTs).

Type:

string

Required:

Yes

Default:



Complex Transform Example
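
A sketch of a larger configuration with two topics, one of which feeds two tables, together with a small structural check that walks the topic, table, and column levels described above. Every topic, table, column, and field name is hypothetical, the filter strings are placeholders, and find_problems is an illustrative helper that treats the keys marked as required on this page as mandatory; it is not part of any loader API.

```python
def find_problems(transform):
    """Return a list of structural problems; an empty list means none were found."""
    problems = []
    if transform.get("schema") is not None and transform.get("database") is None:
        problems.append("transform.schema requires transform.database")

    topics = transform.get("topics")
    if not isinstance(topics, dict):
        problems.append("transform.topics must be an object")
        return problems

    for topic, topic_cfg in topics.items():
        topic_path = f"transform.topics.{topic}"
        for key in ("filter", "tables"):
            if key not in topic_cfg:
                problems.append(f"{topic_path}.{key} is missing")
        for table, table_cfg in topic_cfg.get("tables", {}).items():
            table_path = f"{topic_path}.tables.{table}"
            for key in ("filter", "columns"):
                if key not in table_cfg:
                    problems.append(f"{table_path}.{key} is missing")
            for column, expression in table_cfg.get("columns", {}).items():
                if not isinstance(expression, str):
                    problems.append(
                        f"{table_path}.columns.{column} must be a string expression"
                    )
    return problems


# Hypothetical configuration; "<record filter>" is a placeholder value.
complex_transform = {
    "database": "myDatabase",
    "schema": "mySchema",
    "topics": {
        "orders": {
            "filter": "<record filter>",
            "tables": {
                # Unqualified: would resolve to myDatabase.mySchema.order_header.
                "order_header": {
                    "filter": "<record filter>",
                    "columns": {"order_id": "order.id", "placed_at": "order.placed_at"},
                },
                # Schema-qualified: would resolve to myDatabase.reporting.order_lines.
                "reporting.order_lines": {
                    "filter": "<record filter>",
                    "columns": {"order_id": "order.id", "sku": "line.sku"},
                },
            },
        },
        "customers": {
            "filter": "<record filter>",
            "tables": {
                "customer": {
                    "filter": "<record filter>",
                    "columns": {"customer_id": "customer.id", "name": "customer.name"},
                },
            },
        },
    },
}

print(find_problems(complex_transform))
```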
