Overview
Etlworks can read a single source dataset and load it into multiple database tables in one flow. This article shows the common pattern: normalizing a denormalized source into a parent / child schema.
Use case: split a denormalized source into linked tables
The source is a flat dataset where the first columns repeat across rows for related entities (for example, orders and items):
The values in the first six columns repeat for groups of rows. The values in the last two columns are unique per row. The dataset came from flattening a nested document.
The target schema is two linked tables: Order and Item, with DocumentXRef as the primary key in Order and a foreign key in Item. One order has many items.
If Item.ItemId is an auto-generated primary key, you don't need to populate it — the database does it for you.
How do I set this up?
The technique: read the same source twice, with two transformations that share the source and split the columns into the two destination tables. No scripting required.
- Create the source connection and format.
- Create the destination database connection.
- In Flows, click + and select File to database (or whatever matches your source — any flow where the source is a data object and the destination is a database).
- Create the first transformation, which loads the Order table. Pick the source connection and format and the destination connection. Enter the source filename (for example, test.xml) in FROM. Enter order in TO.
- Click MAPPING and create a mapping that includes only the Order columns and excludes the Item columns.
- Open the Additional Transformations tab, scroll to Remove Duplicates, and enter DocumentXRef. This drops duplicate rows so each order is inserted once.
- Uncheck Parallel and save.
- Add a second transformation that loads the Item table. Use the same source connection, format, and filename. Enter item in TO.
- Click MAPPING and create a mapping that includes only the Item columns and excludes the Order columns. Only DocumentXRef appears in both.
- On the Parameters tab, uncheck Parallel and save.
- Save the flow.