Avro Format – Etlworks Support

Overview

Avro is a binary format with built-in schema support. It is commonly used in streaming and Hadoop ecosystems.

Etlworks can read and write Avro files, including nested Avro files.

Use the Avro Format when configuring a source-to-destination transformation that reads or writes Avro documents. Files in the Avro Format can also be used to load data into Snowflake.

To create a new Avro Format, go to Connections, select the Formats tab, click Add Format, and type avro in the Search field.

The following parameters are available when configuring the Avro Format:

Compression Codec: the compression algorithm used when creating Avro files. You don't need to select an algorithm if all your files are uncompressed or if you are only reading Avro files.
Normalize nested records with one field: if this option is enabled (it is disabled by default), the Avro parser creates less deeply nested datasets when reading nested Avro files with an array containing only one field.
Column names compatible with SQL: this option converts column names to SQL-compatible names by removing all characters except alphanumeric characters and spaces.
Treat 'null' as null: if this option is enabled, Etlworks will treat string values equal to 'null' as actual nulls (no value).
Trim Strings: if this option is enabled, Etlworks will trim leading and trailing whitespace from string values.
Schema: the Avro schema used when creating Avro files. You can leave this field empty if you are only reading Avro files.
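
As a rough illustration of what the Schema field might contain, the sketch below builds an Avro schema in Python and prints it as JSON. It assumes the field accepts a standard Avro schema in JSON form, as defined by the Avro specification. The record name Customer and the field names id, email, and signup_ts are hypothetical and are not taken from Etlworks.

```python
import json

# Hypothetical record layout: "Customer", "id", "email", and "signup_ts"
# are illustrative names only.
schema = {
    "type": "record",
    "name": "Customer",
    "fields": [
        # A required 64-bit integer field.
        {"name": "id", "type": "long"},
        # A union with "null" makes the field optional; the default value
        # must match the first branch of the union, here null.
        {"name": "email", "type": ["null", "string"], "default": None},
        # A timestamp stored as a long with the timestamp-millis logical type.
        {
            "name": "signup_ts",
            "type": {"type": "long", "logicalType": "timestamp-millis"},
        },
    ],
}

# Serialize to the JSON text that would be pasted into the Schema field.
schema_json = json.dumps(schema, indent=2)
print(schema_json)
```

Building the schema as a Python dictionary and serializing it with json.dumps guarantees the pasted text is well-formed JSON. It does not check that the schema is valid Avro, so a dedicated Avro library would be needed for that.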

Read how to configure the Connection when reading messages in the Avro Format from, and writing them to, message queues.