Use this flow to extract data from any supported source, transform it, and load it into Oracle using the SQL*Loader utility. Etlworks stages CSV files locally, generates a SQL*Loader control file, and runs sql*loader against your Oracle database.
Which Oracle bulk-load flow should I use?
| Flow | Use when |
|---|---|
|
ETL into Oracle using SQL*Loader (this article) Database / File / Queue / Web service / Well-known API → Oracle |
You need to extract from any source, optionally transform, and load into Oracle. |
| Bulk load files into Oracle without transformation using SQL*Loader | The files already exist in server storage. No transformation needed. |
How does this flow work?
- Extracts data from the source.
- Writes CSV files to server storage.
- Generates a SQL*Loader control file and runs sql*loader against your Oracle database.
- Optionally cleans up staged files.
What do I need before I start?
- Oracle is reachable from your Etlworks instance.
- The sql*loader utility is installed on the same VM as Etlworks. Contact support@etlworks.com if you need help installing it.
- The Oracle user has INSERT on the target tables (and CREATE TABLE if Etlworks should auto-create tables).
Install and configure sql*loader
By default, Etlworks invokes sql*loader as:
sqlldr userid={USERID}/{PASSWORD}@{HOST}:{PORT}/{SERVICE} control={CONTROL_FILE}
You can override the command per flow under Flow → Parameters, or globally by editing TOMCAT_HOME/application.properties and restarting Etlworks:
oracle.load.command=/opt/oracle/instantclient_19_8/sqlldr userid={USERID}/{PASSWORD}@{HOST}:{PORT}/{SERVICE} control={CONTROL_FILE}
Note: If oracle.load.command is set, the per-flow override is ignored.
Step-by-step setup
Step 1. Create connections
- A source connection.
- A server storage connection for the stage.
- An Oracle connection.
Step 2. Create a CSV format
Create a CSV format.
Step 3. Create the flow
Open Flows, click Add flow, type Oracle in the gallery search, and pick the flow type that matches your source.
Step 4. Set the Oracle connection on the flow
Open the Connections tab and pick the Oracle connection from Step 1.
Step 5. Set source (FROM) and destination (TO)
FROM (varies by flow type):
- Database: table or view name. Wildcards supported, e.g. public.*.
- Well-known API: endpoint or file name.
- Web service: endpoint name.
- File: file name or wildcard pattern.
- Queue: queue name.
TO: the fully qualified Oracle table name. If FROM uses a wildcard, use a matching wildcard in TO.
The server storage connection is the destination connection on the transformation. Etlworks treats it as the stage.
Step 6. Set the CSV format
Select the format from Step 2 as the destination format.
Step 7. Set parameters
Load action
| Parameter | What it does |
|---|---|
| Load Mode | INSERT loads new rows. APPEND adds to an existing table. REPLACE deletes existing rows then inserts. TRUNCATE truncates the table then inserts. |
| Lookup Fields | Comma-separated list of columns that uniquely identify a record. Or enable Predict Lookup Fields for auto-detection. |
SQL*Loader control file
| Parameter | What it does |
|---|---|
| Content of the Control File | Hardcoded sql*loader parameters. Etlworks generates a control file automatically by default. Tokens you can use: {FILE_TO_LOAD}, {TABLE}, {MODE} (INSERT / APPEND / REPLACE / TRUNCATE), {COLUMNS}, {DELIMITER}, {QUOTE}, {UPDATE_COLUMNS}, {DIRECT} (DIRECT / CONVENTIONAL), {SKIP}, {ERRORS}. |
Auto-generated control-file options
| Parameter | What it does |
|---|---|
| Load Path | Conventional: uses SQL INSERT through the SQL engine; validates triggers, constraints, etc. Direct: bypasses the SQL engine and writes straight to Oracle data files. Much faster for bulk; recommended for large datasets. |
| Max # of Errors to Ignore | Maximum number of errors before the load aborts. |
| NULLIF clause | Treat specific values as NULL on load. Format: column=condition;column=condition. Column names can be in double quotes. Example: first_name = 'null';last_name = 'null'. |
ELT — SQL before and after load
| Parameter | What it does |
|---|---|
| Before LOAD SQL / Ignore errors / is a script | SQL to run on the Oracle connection before the load. |
| After LOAD SQL / Ignore errors / is a script | Same, but after the load. |
Source-specific parameters
- Database source: see Extract, Transform, and Load (ETL) data.
-
File or web-service source:
- Ignore Transformations: copies files directly to server storage. Faster; transformations are skipped.
- Delete loaded source files: deletes the source file after a successful load.
- Delete source files on error: deletes the source file if the load fails.
- Well-known API source: see the well-known API options.
Step 8. Handle source / destination schema differences
| Option | What it does |
|---|---|
| Reorder CSV Columns During Extract | Reorders columns to match the Oracle target. Ignored if the target doesn't exist. |
| Ignore Fields in Source Not in Target | Drops source columns the target doesn't have. |
| Alter Target Table If Source Has Extra Columns | Adds the extra columns to the Oracle table. |
| Recreate Target Table If Source Has Extra Columns | Drops and recreates the target. Destructive. |
| Insert NULL Into Target Fields Not in Source | Inserts NULL for any target column the source doesn't have. |
Step 9. Configure mapping (optional)
If you need explicit field-to-field mapping, set it up in Mapping. Auto-mapping by field name works for most flows.
Incremental change replication (high watermark)
Change replication using HWM is supported. Set the High Watermark Field, enable Change Replication, and optionally configure a calculated HWM value.
Load multiple tables with a wildcard
FROM
Create a single source-to-destination transformation and set FROM to a wildcard pattern, e.g. public.*.
TO
Set TO to schema.*, e.g. scott.*.
Include / exclude objects
- Exclude Objects: comma-separated list of objects to skip.
- Include Objects: comma-separated list of objects to keep.
Source query, before/after SQL
Use the {TABLE} token in Source Query, Before LOAD SQL, and After LOAD SQL.
MERGE with wildcards
Leave Lookup Fields empty and enable Predict Lookup Fields, or specify per-table lookup fields as semicolon-separated pairs.
HWM with wildcards
Use the {TABLE} token in the High Watermark Field Value.
Flow variables
Open the Parameters tab and add user-defined flow variables as key-value pairs. Use uppercase names without spaces.
Maximum number of parallel threads
Multiple source-to-destination transformations can run in parallel threads. Default 5; max 99. Change via Maximum Number of Parallel Threads on the Parameters tab.
Troubleshooting
See Common issues when loading data into cloud data warehouses for common errors and fixes that also apply to Oracle bulk loads.