Overview

Lookups are a fundamental part of ETL operations. They allow you to retrieve specific values or entire datasets based on input parameters.

For example, if you have a user_id, a lookup can return all associated phone numbers for that user by querying a database or another data source.

In Etlworks, you can perform lookups in multiple ways:

Extract and Lookup – Query a data source and retrieve values dynamically.
Lookup Using SQL on an Existing Dataset – Perform lookups on pre-loaded datasets using SQL.
Lookup Using Filters on an Existing Dataset – Apply filter conditions to in-memory datasets for faster lookups.

Extract and Lookup

Use this method if you need to extract data from a database, apply SQL, and return either a specific field’s value or an entire dataset.

When to Use?

Your lookup data is stored in a relational database.
You need to retrieve data dynamically while processing records.
The lookup results are used in calculations or data enrichment.

Example Scenario

Imagine two tables in different databases:

movies (contains director_id)
directors (stores director details like first_name, last_name)

Your goal is to combine these tables and load them into a single table in a data warehouse.

How to Implement?

Step 1. Use movies as the source table.

Step 2. Add calculated fields (e.g., director_first_name, director_last_name).

Step 3. Use a lookup method to retrieve the director’s details from the directors table.

Lookup Methods Available

Calculate the Field’s Value Using SQL: Directly apply SQL to calculate a field’s value.
Lookup Returning a Field’s Value: Retrieve a single field from a lookup.
Lookup Returning a Dataset: Fetch multiple rows matching the lookup criteria.

Calculate the Field’s Value Using SQL

This approach allows you to dynamically calculate a field’s value using SQL.

Looking Up Data using Source connection

If each record contains a director_id, use SQL to fetch the corresponding first name:

Step 1. Add a Calculated Field named first_name.

Step 2. Enter SQL Query in the Field Value Function:

SELECT first_name director_first_name FROM directors WHERE id = {director_id}

🔹 {director_id} is dynamically replaced with the value of the column director_id in the source dataset.

Looking Up Data using Destination connection

If the lookup data is in the destination connection, use prefix destination:

destination.SELECT first_name director_first_name FROM directors WHERE id = {director_id}

Looking Up Data using connection other than Source and Destination

If the lookup data is in connection other then source and destination, use prefix connection name:

temp.SELECT first_name director_first_name FROM directors WHERE id = {director_id}

Lookup Returning a Field’s Value

This method extracts data from a specific connection, runs a query, and returns a single field’s value.

Example: Fetch Director’s First Name from a Different Database

value = com.toolsverse.etl.core.engine.Extractor.lookup(
    etlConfig, scenario,
    "connection_for_directors_db", /* name of the connection for lookup */
    "directors", /* name of the dataset, cab be any */
    "first_name", /* name of the column */
    "SELECT first_name FROM directors WHERE id = {ID}", /* sql, can be null */
    new com.toolsverse.util.TypedKeyValue("ID", 
       dataSet.getFieldValue(currentRow, 'director_id') /* optional parameters */
    )
);

🔹 {ID} is dynamically replaced with the value of the column director_id in the source dataset.

Notes

SQL can be null
You can pass any numbers of parameters using new com.toolsverse.util.TypedKeyValue(key, value), including no parameters at all

Handling Cached and Non-Cached Lookups which return Field's value

Use Extractor.lookupNoCache if data changes between calls.
Use Extractor.lookup for performance optimization with cached results.

Using Callable SQL with lookup which returns Field's value

A callable SQL statement for executing blocks of code refers to executing multi-statement procedural logic directly in the database. These blocks typically support control flow constructs (e.g., loops, conditionals), variable declarations, error handling, and dynamic SQL execution.

Use: Extractor.lookupWithCallable for executing callable SQL statements during lookup.

Example: Lookup with callable SQL statement

var sql = 'begin\n' +
' select country from country where country_id = {country_id}\n' +
'end';

value = com.toolsverse.etl.core.engine.Extractor.lookupWithCallable(
   etlConfig, scenario, 
   'SQL Server', 
   'country', 
   'country', 
   sql, 
   new com.toolsverse.util.TypedKeyValue('country_id', 
dataSet.getFieldValue(currentRow, 'country_id')));

Lookup Returning a Dataset

This method retrieves an entire dataset instead of just one field.

Example: Fetch All Images Linked to a Property

var images = com.toolsverse.etl.core.engine.Extractor.lookup(
etlConfig, scenario, 
"connection for lookup", /* name of the connection for lookup */
"image", /* name of the dataset, can be any */
"SELECT filename AS url FROM property_pictures WHERE id = {ID}", /* sql, can be null */ 
new com.toolsverse.util.TypedKeyValue("ID", dataSet.getFieldValue(currentRow, 
'picture_id') /* optional parameters */)
);

🔹 {ID} is dynamically replaced with the value of the column picture_id in the source dataset.

Notes

SQL can be null
You can pass any numbers of parameters using new com.toolsverse.util.TypedKeyValue(key, value), including no parameters at all

Handling Cached and Non-Cached Lookups which return entire Dataset

Use Extractor.lookupNoCache if data changes between calls.
Use Extractor.lookup for performance optimization with cached results.

Using Callable SQL with lookup which returns entire Dataset

Use:Extractor.lookupWithCallable for executing callable SQL statements during lookup.

Example: Lookup with callable SQL statement

var sql = 'begin\n' +
' select * from country where country_id = {country_id}\n' +
'end';

var country = com.toolsverse.etl.core.engine.Extractor.lookupWithCallable(
   etlConfig, scenario, 
   'SQL Server', 
   'country', 
   sql, 
   new com.toolsverse.util.TypedKeyValue('country_id', 
dataSet.getFieldValue(currentRow, 'country_id')));

Lookup Using SQL on an Existing Dataset

Use this method to run SQL queries on an in-memory dataset, instead of querying an external database.

When to Use?

Lookup data stored in a nested JSON, XML, or API response.
Improve performance by avoiding repeated external queries.

Example: Lookup in a JSON Dataset which returns a Field's value

value = com.toolsverse.etl.common.CommonEtlUtils.getFieldValue(
scenario.getSources().get("DIRECTORS").getDataSet(), /* dataset to lookup in */
"select last_name where id = {ID}", /* SQL, can be null */
"last_name", /* field's name */
new com.toolsverse.util.TypedKeyValue("ID", /* optional parameters */
dataSet.getFieldValue(currentRow, 'director_id')));

🔹 {ID} is dynamically replaced with the value of the column director_id in the source dataset.

Notes

SQL can be null
You can pass any numbers of parameters using new com.toolsverse.util.TypedKeyValue(key, value), including no parameters at all

Example: Lookup in a JSON Dataset which returns entire dataset

var phones = com.toolsverse.etl.common.CommonEtlUtils.executeSql(
scenario.getSources().get("PHONES").getDataSet(), /* dataset to lookup in */
"select * where area_code=412" /* SQL, can be null */
);

Lookup Using Filters on an Existing Dataset

Instead of using SQL, this method applies filter conditions to an in-memory dataset.

When to Use?

Lookup data stored in CSV or Excel files.
Faster than SQL-based lookups but works only on flat datasets.

Example: Lookup in a CSV Dataset which returns Field's value

value = com.toolsverse.etl.common.CommonEtlUtils.lookup(
scenario.getSources().get("DIRECTORS").getDataSet(), /* dataset to lookup in */
"id=" + {director_id}, /* filter */
"last_name" /* field's name */
);

🔹 {director_id} is replaced dynamically at runtime with the value of the column director_id in the source dataset.

Example: Lookup in a CSV to Retrieve All Phone Numbers for an Area Code

var phones = com.toolsverse.etl.common.CommonEtlUtils.filter(
scenario.getSources().get("PHONES").getDataSet(),  /* dataset to lookup in */
"area_code=412" /* filter */
);

Articles in this section

Overview

Extract and Lookup

Calculate the Field’s Value Using SQL

Looking Up Data using Source connection

Looking Up Data using Destination connection

Looking Up Data using connection other than Source and Destination

Lookup Returning a Field’s Value

Example: Fetch Director’s First Name from a Different Database

Handling Cached and Non-Cached Lookups which return Field's value

Using Callable SQL with lookup which returns Field's value

Lookup Returning a Dataset

Example: Fetch All Images Linked to a Property

Handling Cached and Non-Cached Lookups which return entire Dataset

Using Callable SQL with lookup which returns entire Dataset

Lookup Using SQL on an Existing Dataset

Example: Lookup in a JSON Dataset which returns a Field's value

Example: Lookup in a JSON Dataset which returns entire dataset

Lookup Using Filters on an Existing Dataset

Example: Lookup in a CSV Dataset which returns Field's value

Example: Lookup in a CSV to Retrieve All Phone Numbers for an Area Code

Related articles