Overview
In Etlworks, you can MERGE (UPSERT) records extracted from the source with records in the destination table when working with relational databases.
MERGE
When creating a Flow that loads data into a relational database, use the built-in MERGE SQL action to UPSERT (INSERT new, UPDATE existing) records extracted from the source with records in the destination table. Just like INSERT, MERGE is executed once for each record in the source dataset.
MERGE can be slow, so consider using BULK MERGE or BULK DELETE/INSERT when possible.
Etlworks supports MERGE for Oracle, MS SQL Server, PostgreSQL, MySQL, DB2, Informix, and Sybase databases.
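Conceptually, a per-record MERGE behaves like an upsert executed once per source row. The sketch below illustrates this with Python and SQLite (a stand-in, not Etlworks internals); SQLite's INSERT ... ON CONFLICT plays the role of the database-specific MERGE statement, and the table and column names are hypothetical.

```python
import sqlite3

# Illustrative sketch of per-record MERGE (UPSERT) semantics.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE dest (id INTEGER PRIMARY KEY, name TEXT)")
con.execute("INSERT INTO dest VALUES (1, 'old')")

source = [(1, "updated"), (2, "new")]  # hypothetical source dataset
for row in source:  # one statement per source record, like MERGE
    con.execute(
        "INSERT INTO dest (id, name) VALUES (?, ?) "
        "ON CONFLICT(id) DO UPDATE SET name = excluded.name",
        row,
    )
print(sorted(con.execute("SELECT * FROM dest")))  # [(1, 'updated'), (2, 'new')]
```

Existing row 1 is updated and row 2 is inserted; running the loop again would be a no-op update, which is what makes the action idempotent.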
Configure MERGE
Follow the steps below to configure MERGE:
Step 1. Create a Flow with at least one transformation where the database is a destination.
Step 2. When configuring a transformation, click MAPPING, then select the Parameters tab.
Step 3. Select MERGE as the Action.
Step 4. Enter a comma-separated list of field names in the Lookup Fields; these fields are used to identify a unique record.
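To see why the Lookup Fields matter, here is a simplified, hypothetical sketch of how a comma-separated Lookup Fields value could drive the generated upsert statement (the function and SQL dialect are illustrative, not Etlworks code):

```python
# Hypothetical sketch: turn a "Lookup Fields" string into the key clause
# of a generated upsert statement (SQLite/Postgres-style syntax).
def upsert_sql(table, columns, lookup_fields):
    keys = [f.strip() for f in lookup_fields.split(",")]
    updates = ", ".join(f"{c} = excluded.{c}" for c in columns if c not in keys)
    return (
        f"INSERT INTO {table} ({', '.join(columns)}) "
        f"VALUES ({', '.join('?' * len(columns))}) "
        f"ON CONFLICT({', '.join(keys)}) DO UPDATE SET {updates}"
    )

print(upsert_sql("dest", ["id", "name"], "id"))
```

The lookup fields become the conflict key; all remaining columns are updated when a matching record already exists.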
Predict Lookup Fields with MERGE
You can enable the Predict Lookup Fields option as an alternative to configuring the Lookup Fields manually. It is only recommended when you are extracting data from database objects matching a wildcard name.
Alternative to Predict Lookup Fields with MERGE when processing by a wildcard
When using wildcard processing with MERGE, and enabling the Predict Lookup Fields option (which is not always accurate) is not an option, you can specify a list of table=fields pairs in the Lookup Fields. Use fully-qualified table names and ; as a separator between table=fields pairs.
Example:
test1.inventory=inventory_id,database_name;
test1.payment=payment_id,database_name;
test1.rental=rental_id,database_name;
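The table=fields format above can be read as a map from fully-qualified table name to its list of lookup fields. This small sketch (a hypothetical parser, not Etlworks code) shows how the example value breaks down:

```python
# Illustrative sketch: parse "table=field1,field2;table2=..." into a dict.
def parse_lookup_fields(value):
    result = {}
    for pair in value.split(";"):
        pair = pair.strip()
        if not pair:
            continue  # tolerate a trailing ";"
        table, fields = pair.split("=", 1)
        result[table] = [f.strip() for f in fields.split(",")]
    return result

value = ("test1.inventory=inventory_id,database_name;"
         "test1.payment=payment_id,database_name;"
         "test1.rental=rental_id,database_name;")
print(parse_lookup_fields(value))
```

Each table listed in the wildcard match gets its own set of lookup fields, so records in test1.payment are matched on payment_id and database_name, independently of the other tables.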
MERGE with Postgres and MySQL
PostgreSQL before 15.0 and MySQL require a unique index on the Lookup Fields. PostgreSQL 15+ supports native MERGE and does not require a unique index.
With some versions of PostgreSQL (15, 16), native MERGE can be slower than INSERT ON CONFLICT. You can disable native MERGE for PostgreSQL 15+ by enabling the Use INSERT ON CONFLICT with Postgres 15+ option when configuring the PostgreSQL connection.
BULK MERGE
Unlike the per-record MERGE (see above), BULK MERGE creates a temporary table in the destination database, inserts the records into the temporary table, and finally merges the data in the temporary table into the actual table.
Pros of BULK MERGE
- Can be much faster (up to 100x) than the per-record MERGE
Cons of BULK MERGE
- The database must support temporary tables
- BULK MERGE is not supported for PostgreSQL before 15.0 and MySQL.
For any database that does not support a native MERGE, use BULK DELETE/INSERT instead.
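The temp-table strategy can be sketched as follows, again using SQLite as a stand-in (the table names are hypothetical and the final set-based upsert replaces the database-specific MERGE statement):

```python
import sqlite3

# Illustrative sketch of BULK MERGE: bulk-load a temp table, then merge it
# into the actual table with a single set-based statement.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE dest (id INTEGER PRIMARY KEY, name TEXT)")
con.execute("INSERT INTO dest VALUES (1, 'old')")

con.execute("CREATE TEMP TABLE staging (id INTEGER, name TEXT)")
con.executemany("INSERT INTO staging VALUES (?, ?)",
                [(1, "updated"), (2, "new")])  # fast bulk insert

# One statement merges the whole temp table (WHERE true disambiguates
# SQLite's INSERT ... SELECT ... ON CONFLICT parsing).
con.execute(
    "INSERT INTO dest (id, name) SELECT id, name FROM staging WHERE true "
    "ON CONFLICT(id) DO UPDATE SET name = excluded.name"
)
print(sorted(con.execute("SELECT * FROM dest")))
```

The speedup comes from replacing one round trip per record with a bulk load plus a single merge statement.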
Configure BULK MERGE
To configure BULK MERGE, set the Action to BULK MERGE. All other settings, including the settings for lookup fields, are the same as for MERGE.
BULK DELETE/INSERT
BULK DELETE/INSERT is similar to BULK MERGE, but instead of a native MERGE statement it executes DELETE FROM ACTUAL_TABLE WHERE EXISTS (SELECT 1 FROM TEMP_TABLE WHERE CONDITIONS) followed by INSERT INTO ACTUAL_TABLE SELECT FROM TEMP_TABLE.
Use BULK DELETE/INSERT for databases that do not support a native MERGE statement, for example PostgreSQL before 15.0 and MySQL.
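The two statements can be demonstrated end to end with SQLite (an illustrative stand-in with hypothetical table names, not Etlworks internals):

```python
import sqlite3

# Illustrative sketch of BULK DELETE/INSERT: delete the matching rows,
# then insert everything from the temp table.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE dest (id INTEGER, name TEXT)")
con.execute("INSERT INTO dest VALUES (1, 'old')")

con.execute("CREATE TEMP TABLE staging (id INTEGER, name TEXT)")
con.executemany("INSERT INTO staging VALUES (?, ?)",
                [(1, "updated"), (2, "new")])

# DELETE FROM ACTUAL_TABLE WHERE EXISTS (SELECT 1 FROM TEMP_TABLE ...)
con.execute(
    "DELETE FROM dest WHERE EXISTS "
    "(SELECT 1 FROM staging WHERE staging.id = dest.id)"
)
# INSERT INTO ACTUAL_TABLE SELECT FROM TEMP_TABLE
con.execute("INSERT INTO dest SELECT * FROM staging")
print(sorted(con.execute("SELECT * FROM dest")))
```

The net effect is the same as a merge, which is why it works as the fallback when the database has no native MERGE.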
Configure BULK DELETE/INSERT
To configure BULK DELETE/INSERT, set the Action to BULK DELETE/INSERT. All other settings, including the settings for lookup fields, are the same as for MERGE.
IfExists
The alternative to MERGE/BULK MERGE/BULK DELETE/INSERT is the IfExists action. Etlworks executes an SQL statement to check whether the record already exists; if it does, Etlworks executes an UPDATE. If the record does not exist, Etlworks executes an INSERT.
IfExists is much slower than the alternatives, so use MERGE/BULK MERGE/BULK DELETE/INSERT when possible.
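The check-then-write pattern looks like this in a simplified SQLite sketch (hypothetical table names, not Etlworks internals); the extra lookup query per record is why it is the slowest option:

```python
import sqlite3

# Illustrative sketch of IfExists: SELECT per record, then UPDATE or INSERT.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE dest (id INTEGER, name TEXT)")
con.execute("INSERT INTO dest VALUES (1, 'old')")

for rid, name in [(1, "updated"), (2, "new")]:
    exists = con.execute("SELECT 1 FROM dest WHERE id = ?", (rid,)).fetchone()
    if exists:
        con.execute("UPDATE dest SET name = ? WHERE id = ?", (name, rid))
    else:
        con.execute("INSERT INTO dest VALUES (?, ?)", (rid, name))
print(sorted(con.execute("SELECT * FROM dest")))
```

Every source record costs at least two statements, versus one for MERGE and a single set-based statement for the bulk actions.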
Configure IfExists
To configure IfExists, set the Action to IfExists. All other settings, including the settings for lookup fields, are the same as for MERGE.
CUSTOM UPSERT
Some database connectors support the UPSERT SQL clause, which looks very much like INSERT: UPSERT INTO table (columns) VALUES (values). However, the actual syntax of UPSERT differs depending on the driver. With the CUSTOM action, the developer can provide a template for the UPSERT, which will be used at runtime to generate the actual SQL statement.
At this time, only some of our premium connectors support the UPSERT SQL clause, for example, our Salesforce with OAuth2 connector.
Configure CUSTOM UPSERT
To configure CUSTOM UPSERT, apply the following steps:
Step 1. Create a Flow with at least one transformation where the database is a destination.
Step 2. When configuring a transformation, click MAPPING, then select the Parameters tab.
Step 3. Select CUSTOM as the Action.
Step 4. Enter a comma-separated list of field names in the Lookup Fields; these fields are used to identify a unique record. Alternatively, you can enable the Predict Lookup Fields option.
Step 5. Enter the template for the UPSERT into the Destination query field.
Example:
UPSERT INTO LocationToUse ({COLUMNS}, ExternalIdColumn)
VALUES ({VALUES}, '{KEYS}')
The following tokens can be used in the Destination query:
{COLUMNS}: the comma-separated list of columns
{VALUES}: the values for the columns
{KEYS}: the lookup fields
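At runtime, the tokens are replaced with the actual columns, values, and lookup fields. This simplified sketch (hypothetical column values; not Etlworks code) shows how the example template above could expand:

```python
# Illustrative sketch: expand {COLUMNS}/{VALUES}/{KEYS} in a CUSTOM
# upsert template into a concrete SQL statement.
def render(template, columns, values, keys):
    return (template
            .replace("{COLUMNS}", ", ".join(columns))
            .replace("{VALUES}", ", ".join(values))
            .replace("{KEYS}", ",".join(keys)))

template = ("UPSERT INTO LocationToUse ({COLUMNS}, ExternalIdColumn) "
            "VALUES ({VALUES}, '{KEYS}')")
print(render(template, ["Name", "City"], ["'Acme'", "'Boston'"], ["Id"]))
```

For the inputs above, the rendered statement is UPSERT INTO LocationToUse (Name, City, ExternalIdColumn) VALUES ('Acme', 'Boston', 'Id').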