When to use this Format
In fixed-length text files, each field starts and ends at the same place in every record. In all other respects, they are the same as CSV files.
Use the fixed-length Format when configuring a source-to-destination transformation that reads or writes fixed-length files.
Create Format
To create a new fixed-length Format, go to Connections, select the Formats tab, click Add Format, and type fixed into the Search field.
Available parameters:

Fields: this parameter configures a constant length for each field. For example, 5,20,10,7 defines 4 fields with lengths of 5, 20, 10, and 7 characters, respectively. If the value of this parameter is set to auto:
- when reading the file, the connector will parse the header and set the length of each field automatically;
- when writing the file, the connector will use the actual length of each field and, if it is not known, will set it to the value of the parameter Default field's length.
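To make the positional logic concrete, here is a minimal sketch of how the field lengths 5,20,10,7 slice a single record. It is illustrative only and not the connector's internal code.

```javascript
// Sketch only: split one fixed-length record using the field lengths 5,20,10,7
// from the example above. This mirrors the positional logic; it is not the connector's code.
function splitFixedLengthLine(line, lengths) {
  var fields = [];
  var pos = 0;
  for (var i = 0; i < lengths.length; i++) {
    fields.push(line.substring(pos, pos + lengths[i]).trim());
    pos += lengths[i];
  }
  return fields;
}

// Build a 42-character record (5 + 20 + 10 + 7) by right-padding each value to its length.
var record = 'John'.padEnd(5) + 'Smith'.padEnd(20) + '1985-06-01'.padEnd(10) + 'NYC'.padEnd(7);
splitFixedLengthLine(record, [5, 20, 10, 7]); // ["John", "Smith", "1985-06-01", "NYC"]
```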
Default field's length: this parameter defines the default length of a field in the fixed-length file if the field's length is not set explicitly. It is only used when creating fixed-length text files and the Fields parameter is set to auto.

Enclosure Character: a character used to enclose fields in. By default, only the fields that contain the Delimiter will be enclosed. Enable Always enclose to enclose all fields.

Always enclose: if this option is enabled, the system will always enclose fields in quotes (assuming that the Enclosure Character is configured). The default behavior is to enclose fields only if they contain the delimiter character.

Enclose header: if this option is enabled, the system will enclose field names in the header in quotes (assuming that the Enclosure Character is configured). This option is disabled by default.

Escape double-quotes: if double-quotes are used to enclose fields, then a double-quote appearing inside a field will be escaped by preceding it with another double-quote.

Escape Character: this character is used to parse files with escaped enclosing quotes. If not entered, a double-quote appearing inside a field is assumed to be escaped by preceding it with another double-quote.
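For illustration, here is how a value with an embedded double-quote is stored and read back under the doubled-quote convention described above. The sample value is made up.

```javascript
// Illustration of doubled-quote escaping (the sample value is hypothetical).
// On disk the enclosed field looks like:   "He said ""hello"""
var raw = '"He said ""hello"""';
// Strip the enclosing quotes, then collapse each doubled quote back to a single one.
var parsed = raw.slice(1, -1).replace(/""/g, '"'); // He said "hello"
```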
Line Separator: a character used as a separator between lines.

Default Extension: the default extension is used when the file name doesn't have an extension. If not entered, dat is the default extension.

Ignore Byte order mark character (BOM): uncheck this option if you do not want the parser to ignore the byte order mark character (BOM). The default is to ignore it.

BOM size: the size of the BOM character. If Ignore Byte order mark character (BOM) is enabled for the Format, the connector will attempt to detect the BOM character in the first line and, if it exists, remove it by executing line.substring(bom_char_size). The actual size of the BOM character depends on the file encoding, but in some cases the wrong BOM is used with an otherwise correctly encoded file (very common for UTF-8 encoded files), so the connector truncates more than it should. If you know the size of the BOM character, set BOM size to the positive value equal to the size of the BOM. The most common size for UTF-8 encoded files is 1.
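A rough sketch of the truncation behavior described above, assuming a BOM size of 1. It is illustrative only, not the connector's code.

```javascript
// Sketch only: remove a leading BOM from the first line using the configured size.
var bomSize = 1; // the value of the BOM size parameter
function stripBom(firstLine) {
  // U+FEFF is how a BOM appears in a decoded string
  return firstLine.charCodeAt(0) === 0xFEFF ? firstLine.substring(bomSize) : firstLine;
}
```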
Template: this is a template in the fixed-length text Format. If this field is not empty, Etlworks will use it to populate column names and data types. Template is an optional field. For example: first-name last-name dob.

Use First Row for Data: if checked, it is assumed that the file doesn't have a header row with field names.

Column names compatible with SQL: convert column names to SQL-compatible column names by removing all characters except alphanumeric characters and spaces.

Noname Column: the name of the column when the file does not have a header row. The column name will be the value of Noname Column + column index. Example: column1, column2 or field1, field2.

Skip First Row: if this option is enabled, the system skips the first row in a file. Typically, it is used together with Use First Row for Data.

Skip not Properly Formatted Rows: if this option is enabled, the system skips rows that do not conform to the Format, for example, rows with a different number of columns.

Skip Empty Rows: sometimes, files contain completely empty rows with no values or delimiters. Etlworks can be configured to skip those rows; otherwise, it will generate an exception when reading such a file.

Document has extra data columns: if this option is enabled, Etlworks will read documents even if the number of header columns is less than the number of data columns.

Start row: if this value is not empty, the system will start reading the file from the specified 1-based row and ignore previous rows.

End row: if this value is not empty, the system will stop reading the file after the specified 1-based row.

Transformation type: the default is filter, which allows you to write JavaScript code to filter out some of the rows in the file, but you can also select preprocessor or header. Read more about using a preprocessor to modify the contents of the source document. If header is selected and Filter or Preprocessor or Header is not empty, the header will be added at the beginning of the file, followed by an end-of-line character, followed by the actual data.

Filter or Preprocessor or Header: JavaScript code that filters out lines or changes a line or the entire document. Read more about filters. If header is selected for Transformation type, the specified header will be added at the beginning of the file, followed by an end-of-line character, followed by the actual data.
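Below is a minimal sketch of the kind of filtering logic such a script might express. The function name and the way the line is passed in are illustrative assumptions, not the documented API; the exact variables exposed to the filter script and the expected return convention are described in the filters documentation referenced above.

```javascript
// Hypothetical sketch only: keep a row when its first 5-character field is not blank.
// keepRow and its argument are assumptions for illustration, not Etlworks' filter API.
function keepRow(line) {
  var firstField = line.substring(0, 5).trim();
  return firstField.length > 0; // true keeps the row, false filters it out
}
```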
All fields are strings: if this option is enabled (it is disabled by default), the system will create all fields with a string data type. Otherwise, it will parse the field's value and attempt to detect the data type.

Date and Time Format: a Format for timestamps (date + time).

Date Format: a Format for dates (date only, no time).

Time Format: a Format for time (time only, no date).

Parse Dates: if this option is enabled and a date or time value is not recognized as one of the Formats defined above, Etlworks will try to parse it using one of the well-known date+time Formats.

Trim Strings: if this option is enabled, Etlworks will trim the leading and trailing white spaces from the value.

Treat 'null' as null: if this option is enabled, Etlworks will treat string values equal to null as actual nulls (no value).

Encode CLOB fields using Base64: if this option is enabled (default), Etlworks will encode fields with the CLOB data type (large TEXT fields) using the Base64 algorithm.

Remove EOL characters: if this option is enabled (default), the system will remove end-of-line (EOL) characters (\n, \r) from the field's value when creating a file.

Remove non-printable characters: if this option is enabled (it is disabled by default), the system will strip non-printable characters, such as the null character (\0), from the data row when creating a file.

Encoding: character encoding used when reading and writing files. No encoding means there will be no additional encoding.
Global find and replace when creating files
The Etlworks fixed-length text connector generates files line by line, with each column value stored as a string. You can perform a global find and replace operation within these string values by configuring the following parameters:
Find regex: the regular expression, in Java format, used to match the string to replace when creating a file.

Replacement: the string used to replace the text matched by the find regex. If this parameter is not set but the find regex is configured, the connector sets it to an empty string ("").

For example, to remove all single quotes that are not at the beginning and not at the end of the string, set the find regex to an expression that matches only the inner quotes and keep the replacement parameter empty; every matched quote is then removed from the value when the file is created.
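The snippet below shows one way this example could look. The lookaround-based pattern is an assumption used for illustration, not necessarily the exact expression from the original example; Java's regex engine supports lookbehind and lookahead, so such a pattern is valid in the find regex parameter.

```javascript
// Illustration with an assumed Java-compatible regex: match a single quote that is
// neither the first nor the last character of the value.
// Find regex:    (?<=.)'(?=.)
// Replacement:   (left empty, so matches are removed)
var value = "'O'Brien'";
var result = value.replace(/(?<=.)'(?=.)/g, ''); // "'OBrien'"
```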