Filter, sort, and join transformations – Etlworks Support

Row-level source-side operations that reshape the rows entering a source-to-destination transformation. Configure each one under Transformation → MAPPING → Additional Transformations.

Filter

Reject source rows that don't match an expression. Available when the source is a file or an API endpoint.

Etlworks runs the filter once per row in the source dataset and drops rows where the expression returns false. Write the expression in JavaScript or SQL.

Configure under Transformation → MAPPING → Additional Transformations → Filter.

Alternative: when the source is a relational database, put the predicate in a WHERE clause on the Source query instead — the database does the filtering and Etlworks never sees the rejected rows.

SQL source query

Use SQL to extract or transform data on the source side — for both relational and non-relational sources.

Enter the SQL in the Source query field on the transformation's Mappings tab.

Relational databases.

Use the source database's native SQL. SELECT statements work, and you can also call stored procedures and functions that return a cursor.

Enable Execute as Script in Source Query settings to execute anonymous blocks.

Non-relational sources (files, API responses)

Etlworks supports SQL over CSV, JSON, XML, Excel, Parquet, Avro, and similar — including nested structures. You get filtering, aggregation, and joins on data that doesn't natively live in a database.

Read about using SQL with no-relational data sources →

Order-by

Reorder the source dataset by one or more fields. Supports SQL-like ORDER BY expressions. Available when the source is a file or an API endpoint.

Configure under Transformation → MAPPING → Additional Transformations → Order by.

Alternative: when the source is a relational database, add the ORDER BY to the Source query — the database does the sort.

Set operations — JOIN, UNION, MINUS, INTERSECT, EXCEPT

Combine the results of multiple queries into a single result set.

Same database server. Use the SQL source query with the database's native set-operation syntax.

Different / heterogeneous sources (e.g., SQL Server + Postgres, or database + file + API). Stage everything into a temporary database first, then run the set operation against the staging copy:

Create connections for each data source.
Create the destination connection.
Create a connection to a temporary database (Etlworks auto-manages the tables).
Create an Any to Any ETL flow.
Add one source-to-destination transformation per data source; the destination of each is the temporary database (different table per source). Tables are created and dropped automatically.
Add a final source-to-destination transformation: source = the temporary database, destination = the real destination from step 2. Put the set-operation SQL (using SQLite query syntax) in the Source query field.

Articles in this section