When to use this Format
Etlworks can read and write Parquet files, including nested Parquet files.
To create a new Parquet Format, go to Connections, select the Formats tab, click Add Format, and type parquet in the search field.
The following parameters are available when configuring the Parquet Format:
Compression Codec: the compression algorithm used when creating Parquet files. You don't need to select an algorithm if all your files are uncompressed or if you are only reading Parquet files.
Normalize nested records with one field: if this option is enabled (it is disabled by default), the Parquet parser creates less deeply nested datasets when reading nested Parquet files in which an array contains only one field.
Column names compatible with SQL: if this option is enabled, column names are converted to SQL-compatible names by removing all characters except alphanumeric characters and spaces.
Treat 'null' as null: if this option is enabled, Etlworks Integrator will treat string values equal to 'null' as actual nulls (no value).
Trim Strings: if this option is enabled, Etlworks Integrator will trim leading and trailing whitespace from string values.
Schema: the schema used to create Parquet files. You can leave this field empty if you are only reading Parquet files.
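The effect of the Column names compatible with SQL, Treat 'null' as null, and Trim Strings options can be illustrated with a short Python sketch. This is not the Etlworks implementation. The function names are hypothetical, and the sketch assumes that trimming happens before the 'null' comparison and that the comparison is case-sensitive; the actual product may behave differently on both points.

```python
import re


def sql_compatible_name(name: str) -> str:
    """Remove every character that is not alphanumeric or a space."""
    return re.sub(r"[^A-Za-z0-9 ]", "", name)


def clean_value(value, treat_null_string_as_null=True, trim_strings=True):
    """Apply the trim and 'null' options to a single value (hypothetical helper)."""
    if isinstance(value, str):
        if trim_strings:
            # Assumption: trimming is applied before the 'null' check.
            value = value.strip()
        if treat_null_string_as_null and value == "null":
            # Assumption: only the exact lowercase string 'null' matches.
            return None
    return value


def clean_record(record: dict) -> dict:
    """Clean both the column names and the values of one record."""
    return {sql_compatible_name(k): clean_value(v) for k, v in record.items()}


row = {"order-id#": " 42 ", "customer name": "null", "total($)": 19.5}
print(clean_record(row))
# prints {'orderid': '42', 'customer name': None, 'total': 19.5}
```

Non-string values such as the number 19.5 pass through unchanged, because trimming and the 'null' comparison only make sense for strings.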