Once you’re familiar with the Mapping Editor, this article goes deeper into field-level behavior, rules, and flow-wide options that affect how mappings work across your transformations.

Overview

In Etlworks, each source-to-destination transformation can include a mapping that controls how data moves from one structure to another.

This mapping can include:

Direct field-to-field transfers
Calculated fields
Excluded fields
New fields that don’t exist in the source
Nested fields for hierarchical formats (JSON, XML)

While the Mapping Editor provides the interface, this article explains how mappings behave, how defaults work, and how to control field handling at the flow level.

Default Mapping Behavior

If you don’t explicitly map a field, Etlworks still tries to help.

When no mapping is defined at all

If the Mapping grid is empty:

All source fields are copied to the destination as-is.
Field names must match for this to work properly.
Destination must support auto-mapping (e.g., file, API, most databases).

This is ideal for flat, consistent schemas.

When some mappings are defined

If you’ve added a few field mappings manually:

Unmapped source fields are still automatically added to the destination.
You must explicitly exclude any fields you don’t want to include.

This behavior ensures schema evolution doesn’t silently drop fields — unless you ask it to.

Exceptions

Default mapping does not apply when:

You’re working with nested data structures (JSON, XML).
You’ve configured the flow to ignore or error on unmapped fields (see below).

Flow-Level Field Handling Settings

Global options in the Flow’s Parameters tab let you control how all mappings behave across transformations.

Error if Mapped Field is Missing in Source

If enabled: the flow fails if a mapped field doesn’t exist in the source.
If disabled (default): a warning is logged, and the field is skipped.

Use this when:

You want to enforce a strict source schema.
You expect all mapped fields to be present.

Handling Extra Fields in Source

Controls how source fields not present in the Mapping are handled.

Option	Behavior
Automatic (or not set)	Unmapped fields are included
Ignore extra fields	Unmapped fields are skipped entirely.
Error on extra fields	Flow fails before execution if unexpected fields are found.

Use this to:

Detect schema drift early.
Enforce clean, predictable transformations.
Avoid downstream issues from unexpected data.

Manual Field Adjustments

Beyond the field pairings you define in the Mapping Editor, you can:

Add fields to the destination that don’t exist in the source

Enable the Add Field checkbox
Set a default or calculated value using the Field Function Editor
Useful for constants, metadata, and calculated columns

Exclude fields

Use the Exclude checkbox to remove a field from the destination
Applies to default-mapped and explicitly mapped fields

Change the field data type

By default, the destination field gets its data type from the source field. To change the destination data type, click the Data type button, select the Data type from the drop-down, and if needed, enter a Database specific type (for example, varchar(255)). You can also specify if the field is nullable.

Apply transformations

Use JavaScript, Python, or SQL (if supported by the destination) to define field value logic

Example:

{"Last Name"} + ", " + {"First Name"}

Using scripting languages to calculate field's value

Column name sanitization

Three per-connector options control how column names are sanitized when the engine reads source data. All three are off by default — turn them on when the destination requires it. Settings are per-format (for file based connections and APIs) or per connection (for database connections), so a single flow can apply different rules on each side.

Column names compatible with SQL

Replaces characters that are not valid in SQL identifiers — anything other than alphanumerics and underscores — with safe substitutes. Use this when the destination is a database and the source columns may contain spaces, dashes, or other special characters that would otherwise need quoting in SQL.

Remove dots from column names

Some downstream systems (for example, certain databases or warehouses) cannot accept column names containing the dot character (.). For flows that target those systems, enable the Remove dots setting on the source format to strip dots from column names automatically.

Default: off — dotted names are preserved. This matters when the dots are meaningful (for example, when flattening a nested field into a single column like orders.items.price).
When on: dots are removed during metadata extraction. The sanitized names appear in the source tree of the mapping editor.

The setting is per-format, so a single flow can preserve dots on one side and strip them on the other.

Change column name using JavaScript

For renames beyond the deterministic options above — pattern-based replacement, prefix or suffix, case conversion, or lookups — use a JavaScript expression to compute column names. The script runs once per column at metadata-extraction time.

Bulk Column Name Transformation →

Reorder fields

Drag rows up/down to reorder fields in the output.

Selecting Source and Destination Fields

If the source or destination is connected to a live Connection:

The field dropdowns are auto-populated with metadata.
If fields don’t appear, click Clear Cache at the bottom of the screen to reload.

You can always manually type in field names if:

The metadata is not available
You’re using a template
The object is dynamic

Tip: You can also define field templates for faster reuse.

Working with nested datasets

When the source, the destination, or both are hierarchical (JSON, XML, etc.), you have two paths:

Option 1: Continue with flat mapping

Stay in Flat view mode and use flattening techniques on the source side. Flat mapping keeps streaming enabled, runs at load time, and is significantly faster on large volumes. Best when the destination is flat and the source can be flattened with one of the supported techniques.

Mapping when the source is nested and destination is flat — the supported use cases for flat mapping with a nested source.
Techniques for working with complex nested datasets — flattening patterns, normalization tasks, and other approaches.

Option 2: Switch to nested mapping

Toggle the mapping editor into Nested view mode. This is the unified mechanism for any transformation that requires explicit control over hierarchy:

Flat → Nested — build a nested document from flat records.
Nested → Flat — flatten a nested document with explicit field-level control.
Nested → Nested — restructure a nested document into a different nested shape, including denormalization.

Nested mode renders both source and destination as expandable trees and unlocks indent/outdent, add child / add sibling, drag-to-reorder within the hierarchy, and per-row data-type modifiers for JSON and XML. Streaming is disabled in this mode — enable Force Streaming when the destination is a database.

Special Case: Duplicate Destination Fields

In file formats like CSV or Excel, you can map multiple fields to destination columns with the same name. This is useful for:

Flattening structures
Repeating fields
Structuring tabular outputs

Note: Not supported for relational databases or cloud data warehouses.

UI-driven lookups

When a field's value needs to come from a reference table (customer tier from a customers table, tax rate from a region lookup, product code from a partner-supplied CSV), use the Lookup Builder instead of writing the JavaScript by hand.

See UI-driven lookups in Working with the Mapping Editor for the configuration walkthrough. The builder works the same way in flat, nested, and X12 mappings.

AI-assisted mapping

Etlworks supports two AI-assisted paths for building a mapping. The newer and more flexible path is conversational: open Simba chat from inside the mapping editor and ask Simba to propose a mapping. You can iterate — ask Simba to refine, exclude fields, add calculated fields — and when the mapping looks right, ask Simba to "apply this mapping". Simba writes the result into the editor as a draft; you review and save.

The older one-shot path is the Create Mapping button in the editor; it invokes AI only when both source and destination are flat.

See the dedicated article AI-assisted mapping with Simba for the full workflow, what Simba sees, and the cost model.

Handling null, empty strings, and the literal 'null'

Overview

This article explains how to consistently handle null values (null), empty strings (""), and the string "null" across ETL, CDC, and bulk load flows in Etlworks.

It covers:

How different connectors interpret and write nulls
Configuration options available in file-based and CDC connectors
Special considerations for cloud data warehouses like Snowflake and Redshift
Best practices for achieving consistent behavior end-to-end

When This Applies

These settings apply to flows that:

Read from files (CSV, JSON, fixed-width, etc.)
Write to files
Create intermediate files for bulk loading into cloud warehouses (Snowflake, Redshift, etc.)
Use CDC connectors that stream changes to files or cloud stages

They do not apply to direct transformations where data is transferred from source to destination (e.g., database to database) without any intermediate file.

File-Based and CDC Connector Options

Most file-based connectors (CSV, JSON, fixed-width, Excel) and CDC connectors support the following configuration options for handling nulls:

Convert 'null' to null when reading

If enabled, the string 'null' (case-sensitive) in input files is interpreted as an actual null.
Available in: file-based connectors (CSV, JSON, fixed-width, etc.) and CDC connectors (when wring to file).
Default: false

Convert empty string to null when reading

If enabled, "" (empty string) is converted to null when reading.
Also: when writing, actual nulls will be written as "".
Available in: file-based connectors (CSV, JSON, fixed-width, etc.)
Default: true (CSV), false (others)

Value for null when reading

Accepts comma-separated values (e.g., NULL,null,-)
If specified, any of these values will be treated as null.
Overrides other null-handling options during reading.
Available in: file-based connectors (CSV, JSON, fixed-width, etc.)

Value for null when writing

If specified, all nulls will be written using the given value.
Example: writing NULL or n/a instead of an empty field.
Overrides all other write-time null-handling options.
Available in: file-based connectors (CSV, JSON, fixed-width, etc.) and CDC connectors (when writing to file)

Do not enclose null in double quotes (CDC only)

If enabled, the string 'null' will not be enclosed in double quotes when written to file (i.e., null instead of "null").
Important:
- This setting is ignored if Value for null when writing is specified.
- Also ignored when CDC writes directly to a destination (e.g., database, message queue).
Use case:
- Particularly useful when loading into Greenplum and other databases that treat "null" as a literal string rather than SQL null.

Null Handling in Bulk Load Pipelines

When using bulk load pipelines (e.g., database → Snowflake), Etlworks generates intermediate files before loading data. This means:

All null-handling options above apply during:

File generation
File loading

However, some warehouses require additional or specific settings at load time:

Snowflake-Specific Settings

Snowflake’s bulk loader supports a setting called: String used to convert to and from SQL NULL

This setting must match the value used in Etlworks for Value for null when writing to ensure correct conversion.

Example:

'DATABASE_NULL_VALUE'

Redshift-Specific Settings

Redshift's bulk loader supports a setting called: String used to convert to SQL NULL

This must also be aligned with Etlworks null writing logic. If mismatched, Redshift may load "NULL" or "" as literal strings instead of actual nulls.

Example:

DATABASE_NULL_VALUE

Other Warehouses and Formats

Some systems (BigQuery, Synapse, etc.) may require explicit handling of null markers such as:

Setting "\\N" for nulls
Avoiding empty strings for numeric/date fields
Escaping special null indicators in delimited files

Always consult the target system’s file format requirements and configure Etlworks accordingly using the Value for null when writing setting.

Best Practices

Be consistent across read and write: use the same null indicators both ways.
Use Value for null when reading to handle legacy or non-standard data (e.g., n/a, -).
Use Value for null when writing when the target system expects a specific marker.
Use the warehouse-specific null settings (Snowflake, Redshift, etc.) to align with your file configuration.
Avoid relying on defaults when working with bulk pipelines — always set null behavior explicitly.

Articles in this section

Overview

Default Mapping Behavior

When no mapping is defined at all

When some mappings are defined

Exceptions

Flow-Level Field Handling Settings

Error if Mapped Field is Missing in Source

Handling Extra Fields in Source

Manual Field Adjustments

Add fields to the destination that don’t exist in the source

Exclude fields

Change the field data type

Apply transformations

Using scripting languages to calculate field's value

Column name sanitization

Column names compatible with SQL

Remove dots from column names

Change column name using JavaScript

Reorder fields

Selecting Source and Destination Fields

Working with nested datasets

Option 1: Continue with flat mapping

Option 2: Switch to nested mapping

Special Case: Duplicate Destination Fields

UI-driven lookups

AI-assisted mapping

Overview

When This Applies

File-Based and CDC Connector Options

Convert 'null' to null when reading

Convert empty string to null when reading

Value for null when reading

Value for null when writing

Do not enclose null in double quotes (CDC only)

Null Handling in Bulk Load Pipelines

Snowflake-Specific Settings

Redshift-Specific Settings

Other Warehouses and Formats

Best Practices

Related articles