Bulk load files into Oracle without transformation using SQL*Loader – Etlworks Support

If your CSV files are already in server storage and you don't need any transformation, use this flow. Etlworks reads the file list, optionally creates destination tables, generates a SQL*Loader control file, and runs sql*loader.

When to use this flow

You want to load CSV files that already exist in server storage into Oracle — no transformation needed.
You want to load CSV files produced by a CDC flow into Oracle.

For a flow that also extracts and transforms data on the way in, use ETL into Oracle using SQL*Loader instead.

How does this flow work?

Reads the names of all files matching the wildcard in server storage. Can recurse into subfolders.
Derives destination Oracle table names from the source file names.
Spawns one thread per destination table (up to a configurable limit) and loads them in parallel.
Generates a SQL*Loader control file and runs sql*loader for each table.
Optionally cleans up staged files.

Features

Feature	What it does
Monitor source schema changes	Auto-CREATE and ALTER destination tables as source schemas evolve.
Delete loaded source files	Remove successfully loaded source files after the load.
Load data in parallel	Multiple destination tables load in parallel threads.
Process files in subfolders	Walk nested subdirectories during the file scan.

What do I need before I start?

Oracle is reachable from your Etlworks instance.
The sql*loader utility is installed on the same VM as Etlworks. Contact support@etlworks.com if you need help installing it.
The Oracle user has INSERT on the target tables (and CREATE TABLE if Etlworks should auto-create tables).

Install and configure sql*loader

By default, Etlworks invokes sql*loader as:

sqlldr userid={USERID}/{PASSWORD}@{HOST}:{PORT}/{SERVICE} control={CONTROL_FILE}

You can override the command per flow under Flow → Parameters, or globally by editing TOMCAT_HOME/application.properties and restarting Etlworks:

oracle.load.command=/opt/oracle/instantclient_19_8/sqlldr userid={USERID}/{PASSWORD}@{HOST}:{PORT}/{SERVICE} control={CONTROL_FILE}

Note: If oracle.load.command is set, the per-flow override is ignored.

Step-by-step setup

Step 1. Create the server storage connection

Create a server storage connection. The path points to the directory containing the CSV files. sql*loader reads from the local file system, so the files must be on the same VM as Etlworks.

Step 2. Create the Oracle connection

See the database connector reference.

Step 3. Create a CSV format

Create a CSV format matching the file you're loading.

Step 4. Create the flow

Open Flows, click Add flow, type bulk load oracle in the gallery search, and select Bulk load files into Oracle without transformation using SQL*Loader.

Step 5. Configure the load transformation

Add a source-to-destination transformation. Source connection = the server storage from Step 1; destination connection = the Oracle connection from Step 2.

FROM: the wildcard pattern for the staged files (e.g. folder/*.csv).

TO: the destination table pattern. Use a wildcard like schema.* to derive the table name from each file.

Step 6. Set parameters

Source files and destination tables

Parameter	What it does
Calculate Oracle table name	How the destination table name is derived from each source filename.
Include files in subfolders	Recurse into nested directories.
Maximum number of files to process	Cap on files processed per run.
Maximum number of parallel threads	Default 5; max 99.

Load action

Parameter	What it does
Load Mode	INSERT, APPEND, REPLACE, or TRUNCATE.
Lookup Fields	Comma-separated list of columns that uniquely identify a record. Or enable Predict Lookup Fields.

SQL*Loader control file

Parameter	What it does
Content of the Control File	Hardcoded sqlloader parameters. Etlworks generates a control file automatically by default. Tokens you can use: {FILE_TO_LOAD}, {TABLE}, {MODE}* (INSERT / APPEND / REPLACE / TRUNCATE), {COLUMNS}, {DELIMITER}, {QUOTE}, {UPDATE_COLUMNS}, {DIRECT} (DIRECT / CONVENTIONAL), {SKIP}, {ERRORS}.

Auto-generated control-file options

Parameter	What it does
Load Path	Conventional: uses SQL INSERT through the SQL engine; validates triggers, constraints, etc. Direct: bypasses the SQL engine and writes straight to Oracle data files. Much faster for bulk; recommended for large datasets.
Max # of Errors to Ignore	Maximum number of errors before the load aborts.
NULLIF clause	Treat specific values as NULL on load. Format: column=condition;column=condition. Column names can be in double quotes. Example: first_name = 'null';last_name = 'null'.

ELT — SQL before and after load

Parameter	What it does
Before LOAD SQL / Ignore errors / is a script	SQL to run on the Oracle connection before the load.
After LOAD SQL / Ignore errors / is a script	Same, but after the load.

Handling source schema changes

Option	What it does
Alter target table if source has extra columns	Adds missing columns to the Oracle table.
Recreate target table if source has extra columns	Drops and recreates. Destructive.
Insert NULL into target fields not in source	NULL for any target column the source file doesn't have.
Ignore fields in source not in target	Drops source columns the destination doesn't have.

Debug and error recovery

Parameter	What it does
Log each executed SQL statement	Logs every generated and executed statement.
Purge File(s) if Error	On by default. Deletes staged files on load failure.

Step 7. Configure mapping (optional)

If you need explicit field-to-field mapping, set it up in Mapping.

Step 8. Add more transformations (optional)

Multiple source-to-destination transformations in one flow can load different file patterns into different table sets.

Handling processed files

Delete processed files

Enable Delete loaded source files. Source files are removed after a successful load.

Skip already-processed files

Enable Do not process files that have been already processed. Etlworks tracks loaded filenames internally.

Move processed files

Set Move files after load and pick a target folder to archive instead of deleting.

Troubleshooting

See Common issues when loading data into cloud data warehouses for common errors and fixes that also apply to Oracle bulk loads.

Articles in this section