Scope

Variables in Etlworks are defined within the scope of a single execution unit, such as a running flow and any of its sub-flows. There are three distinct types of variables:

Global variables

Global variables are key/value pairs, where both the key and value are strings. Developers can reference a global variable in connections and transformations by using the syntax {variable_name}.

Global variables are accessible throughout the entire nested flow hierarchy, meaning they become available to all flows within that hierarchy immediately after they are set.

Common Use Cases for Global Variables

Global variables can be used in various scenarios to make your workflows more dynamic and flexible. Common use cases include:

Parameterization of Connections: Use global variables to dynamically set connection parameters such as URLs, usernames, or passwords.
Parameterization of Transformations: Global variables can be used to customize transformation source and destination based on dynamic values.
Parameterization of File Operations: Apply global variables in file operations (e.g., file paths, file names) to manage files more efficiently.
Parameterization of SQL Queries: Incorporate global variables into SQL queries to enable dynamic query conditions and values.

Programmatically Set and Get Global Variables

The developer can set a global variable programmatically as follows:

com.toolsverse.config.SystemConfig.instance().getProperties().put('name', 'value')

The value must be a string.

The value of a global variable can be accessed programmatically as follows:

com.toolsverse.config.SystemConfig.instance().getProperties().get('name')

This expression returns a String, or null if the variable is not set.

Set and get Global variables in a parallel loop

This technique sets global variables in parallel loops and parallel transformations using a thread-safe container. These variables are visible only to the current ETL thread, so do not use this technique for variables that must be read by other threads.

com.toolsverse.config.SystemConfig.instance().addPropertyInCurrentEtlThread(key, value);

This technique is used to access the value of the variable set in a parallel loop or parallel transformation. If the variable is not found in the current ETL thread this method will return the value set in the main thread.

// props is a java.util.HashMap<String, String>
var props = com.toolsverse.config.SystemConfig.instance().getContextProperties();

// use the same key that was passed to addPropertyInCurrentEtlThread(key, value)
var someValue = props.get("unique key");
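The lookup order described above (current ETL thread first, then the main thread) can be illustrated with a minimal plain-JavaScript sketch. The two maps and the helper functions below are hypothetical stand-ins that only mimic the described behavior; they are not the Etlworks implementation.

```javascript
var hasOwn = Object.prototype.hasOwnProperty;

// Stand-ins for the two containers (hypothetical data)
var mainThreadGlobals = { region: 'us-east', batch_id: '0' };
var currentThreadGlobals = {};

// Mimics addPropertyInCurrentEtlThread: writes only to the current thread's container
function addInCurrentThread(key, value) {
    currentThreadGlobals[key] = value;
}

// Mimics reading through getContextProperties(): thread value first, then main thread
function getInCurrentThread(key) {
    if (hasOwn.call(currentThreadGlobals, key)) { return currentThreadGlobals[key]; }
    if (hasOwn.call(mainThreadGlobals, key)) { return mainThreadGlobals[key]; }
    return null;
}

addInCurrentThread('batch_id', '7');

var batchId = getInCurrentThread('batch_id'); // '7', set in this thread
var region = getInCurrentThread('region');    // 'us-east', falls back to the main thread
var missing = getInCurrentThread('unknown');  // null, not set anywhere
```

Note that the value set in the main thread is not overwritten: mainThreadGlobals.batch_id is still '0' after the call.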

Log global variables

Here is an example of JavaScript code that logs all global variables. You can use this fragment anywhere JavaScript is supported.

// Get the properties map
var properties = com.toolsverse.config.SystemConfig.instance().getProperties();

// Iterate over the entry set (for each...in is a Nashorn/Rhino extension)
for each (var entry in properties.entrySet()) {
    var key = entry.getKey();
    var value = entry.getValue();

    // Log each key-value pair
    etlConfig.log(key + '=' + value);
}

Set Global Variables

Global variables can be set programmatically or through configuration:

Set global variables in a Script: Define and assign values to global variables using a script.
Database Loop Parameters: Set variables based on the results of a database query.
File Loop Parameters: Loop through the files matching the wildcard name and set variables to the file name.
Flow variables set at the flow level: Set specific variables within the flow configuration.
Variables set when executing flow manually: When running a flow manually, variables can be entered directly before the flow starts.
Variables set when configuring the schedule: When setting up a scheduled flow, variables can be defined as part of the schedule configuration.
Variables set when configuring the flow to be executed by Integration Agent: If the flow is executed by an Integration Agent, variables can be set in the agent’s flow configuration.
URL parameters in call-flow-by-name API: Pass variables through URL parameters when triggering a flow by name.
Payload in call-flow-by-ID API: Provide variables via the payload when triggering a flow by ID.
User-defined API URL parameters: Use parameters defined in your custom API endpoints.
HTTP Preprocessor: Set variables dynamically using an HTTP preprocessor script.

Flow variables

Flow variables are key/value pairs, where both the key and value are strings. Developers can reference flow variables in SQL statements, JavaScript and Python scripts using the syntax {variable_name}.

Unlike Global variables, Flow variables are individually set for each flow within a nested flow hierarchy. However, there is one exception: when the main (parent) flow in a nested hierarchy starts, all flows within the hierarchy inherit the value of any flow variable set in the main flow.
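The inheritance rule can be pictured with a small plain-JavaScript model. The Flow constructor and startHierarchy function are hypothetical, and copying the values at start time is an assumption about the mechanism; the sketch only shows the observable effect described above.

```javascript
// Hypothetical model of a flow that owns its own set of flow variables
function Flow(name) {
    this.name = name;
    this.variables = {};
}

// Assumed behavior: when the main flow starts, each nested flow receives
// a copy of the main flow's variables
function startHierarchy(mainFlow, nestedFlows) {
    for (var i = 0; i < nestedFlows.length; i++) {
        for (var key in mainFlow.variables) {
            if (Object.prototype.hasOwnProperty.call(mainFlow.variables, key)) {
                nestedFlows[i].variables[key] = mainFlow.variables[key];
            }
        }
    }
}

var main = new Flow('main');
main.variables.start_date = '2024-01-01';

var extract = new Flow('extract');
var load = new Flow('load');
startHierarchy(main, [extract, load]);

// After the start, a change made in one flow stays in that flow
extract.variables.start_date = '2024-02-01';

// extract.variables.start_date is '2024-02-01'
// load.variables.start_date is still '2024-01-01'
// main.variables.start_date is still '2024-01-01'
```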

Common Use Cases for Flow Variables

Parameterizing SQL Statements

Flow variables are often used to dynamically parameterize SQL queries. Instead of hardcoding values, you can use flow variables to inject dynamic data into your SQL statements, such as filter conditions, limits, or values for insert and update operations. This makes SQL queries flexible and adaptable to varying inputs during flow execution.

Example:

SELECT * FROM orders WHERE order_date >= '{start_date}' AND order_date <= '{end_date}'

Parameterizing Scripts

Flow variables can be embedded into scripts across different scripting languages supported by Etlworks, allowing you to customize and control flow behavior programmatically. These variables enable dynamic adjustments in your scripts depending on flow parameters or conditions at runtime.

Example JavaScript:

var recordCount = {RECORD_COUNT};
if (recordCount > {A_FEW}) {
    // code
}

Example Python:

recordCount = {RECORD_COUNT}
if recordCount > {A_FEW}:
    # code
    pass

Example bash:

record_count={RECORD_COUNT}
if [ "$record_count" -gt {A_FEW} ]; then
    : # code
fi

Programmatically Set and Get Flow Variables

The developer can set a flow variable programmatically as follows:

scenario.setVariable('name', value)

The value must be a string.

The flow variable can be accessed programmatically as

scenario.getVariable('name')

This expression returns a Variable object, or null if the variable is not set.

To get the value the developer can use the following code:

var variable = scenario.getVariable('name');
var value = variable != null ? variable.getValue() : null;

Log flow variables

Here is an example of JavaScript code that logs all flow variables:

// Get variables
var vars = scenario.getVariables();

// check if there are variables
if (vars != null && !vars.isEmpty()) {
   // iterate the variables by index
   for (var index = 0; index < vars.size(); index++) {
      // get the variable 
      var theVar = vars.getList().get(index);

      // log name and value
      etlConfig.log(theVar.getName() + '=' + theVar.getValue());
   } 
}

Set Flow Variables

Here are the different ways to set and use flow variables:

Flow Variables Passed as URL Parameters to User-Defined API Endpoints: Flow variables can be passed as URL parameters to any API endpoint created by the user, allowing external systems to control or customize the behavior of flows based on the parameters sent via the URL.
Flow Variables Passed as URL Parameters to the run Flow by name API: When triggering a flow using the run Flow by name API, flow variables can be passed as URL parameters. These parameters are dynamically injected into the flow execution context, providing flexibility in how the flow operates.
Flow Variables Passed as Payload to the run Flow by ID API: When invoking a flow using the run Flow by ID API, flow variables are passed in the API request payload. This method is particularly useful when complex or larger data sets need to be sent to the flow.
Flow Variables Defined in Nested Flows: Flow variables can be defined and passed when creating nested flows. This allows you to pass context or configuration values from the parent flow to the child flow, facilitating complex orchestration and control.
Flow Variables Set and Accessed Programmatically: Flow variables can be programmatically set and accessed during the execution of the flow. This is often done through scripts that dynamically adjust the flow behavior based on real-time conditions or data.
Flow Variables Set as Database Loop Parameters: Flow variables can be populated by looping through database records. These loop parameters are dynamically set during flow execution, allowing the flow to operate on specific data sets retrieved from a database query.
Flow Variables Set as File Loop Parameters: Similar to database loop parameters, flow variables can be set by looping through files matching a wildcard.
Flow Variables Set When Executing the Flow Manually: When running a flow manually, users can enter flow variables directly before execution. These variables allow for on-the-fly customization of flow behavior without needing to modify the flow definition.
Flow Variables Set When Configuring the Schedule: When setting up a scheduled flow, flow variables can be configured as part of the scheduling process. This allows for automated flows to adjust their execution dynamically based on the scheduled variables.
Flow Variables Set When Configuring the Flow to Be Executed by an Integration Agent: Flow variables can also be defined when setting up a flow to be executed by an Integration Agent. These variables allow agents to adjust the flow execution dynamically based on local conditions or parameters defined at the agent level.

Key/value pairs in a flow storage

Key/value pairs in flow storage are a map of objects where the key is a string and the value can be any type of object, such as a number, boolean, or other data types.

These key/value pairs become available throughout the entire nested flow hierarchy as soon as they are set. However, unlike global or flow variables, they cannot be referenced using the {variable_name} syntax.

Common Use Cases for Key/value pairs in a flow storage

One of the most common use cases for key/value pairs in flow storage is to use them in Conditions—to execute a step in a nested flow conditionally—or in Loop scripts—to repeatedly execute a step in a nested flow until the loop script returns a non-null value.

Another common use case is using key/value pairs as a cache. Developers can store any object or variable (such as a record counter) in the key/value storage and later retrieve and use it within the same flow or across other flows in the nested hierarchy. This allows for efficient reuse of data without reprocessing or recalculating values, improving flow performance and logic.

Programmatically Set and Get Key/value pairs

The key/value pairs can be set programmatically as

etlConfig.setValue('name', value)

The value can be an object with any data type, including number, boolean, Java class, etc.

The value of the key/value pair can be accessed programmatically as

etlConfig.getValue('name')
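As an illustration of the cache use case, the following sketch keeps a running record counter in flow storage and then uses it to decide whether a later step should run. So that it runs outside Etlworks, it defines a stand-in etlConfig object; inside a flow the built-in etlConfig would be used instead. The key name record_count, the helper addToRecordCount, and the assumption that getValue returns null for a missing key are all hypothetical.

```javascript
// Stand-in for etlConfig so the sketch runs outside Etlworks (assumption:
// getValue returns null when the key has not been set)
var etlConfig = (function () {
    var storage = {};
    return {
        setValue: function (key, value) { storage[key] = value; },
        getValue: function (key) {
            return Object.prototype.hasOwnProperty.call(storage, key) ? storage[key] : null;
        }
    };
})();

// Hypothetical helper: adds to a numeric counter kept in flow storage
function addToRecordCount(processed) {
    var current = etlConfig.getValue('record_count');
    var total = (current == null ? 0 : current) + processed;
    etlConfig.setValue('record_count', total);
    return total;
}

addToRecordCount(120); // called from one step
addToRecordCount(30);  // called from a later step

// A condition script could then test the cached value
var shouldRunNextStep = etlConfig.getValue('record_count') > 100;
// record_count is 150, so shouldRunNextStep is true
```

Because the stored value is a number rather than a string, no parsing is needed before the comparison.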

Access to etlConfig

The etlConfig variable is directly accessible within:

• Mapping transformations

• Scripting transformations

• The dedicated Scripting Flow

var param = etlConfig.getValue('name');

Preprocessors (e.g., Format Preprocessor, HTTP Connection Preprocessor, etc.) do not have access to etlConfig. In these cases, use:

var etlConfig = com.toolsverse.config.SystemConfig.instance().getEtlThreadContext().getData();

var param = etlConfig.getValue('name');

How Etlworks Resolves {tokens} at Runtime

When Etlworks encounters placeholders in the form {varname}, it resolves them based on the execution context. The resolution rules differ slightly depending on where the tokens appear.

In connections and transformations

When replacing {tokens} inside connection properties, format settings, or transformation configuration values, Etlworks evaluates them in the following order:

Replace tokens using static variables ({app.data}).
If tokens remain, replace them using global variables bound to the current ETL thread (thread-local globals).
If tokens remain, replace them using global variables not bound to the current thread (regular globals).

Flow variables and key-value pairs stored in flow storage are not used for token resolution in this context.

In SQL statements and code scripts (JavaScript, Python, etc.)

When replacing {tokens} inside SQL scripts or code scripts:

Replace tokens using flow variables.
If tokens remain, replace them using global variables bound to the current ETL thread.
If tokens remain, replace them using global variables not bound to the current thread.
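The two resolution orders in this section can be compared side by side with a short plain-JavaScript sketch. The resolveTokens helper and all sample values are hypothetical. The sketch collapses the sequential replacement passes into a single "first source that defines the name wins" lookup, and it assumes that tokens with no matching variable are left unchanged.

```javascript
// Hypothetical helper: resolves {name} tokens by checking each source in order;
// the first source that defines the name wins
function resolveTokens(text, sources) {
    return text.replace(/\{([^{}]+)\}/g, function (match, name) {
        for (var i = 0; i < sources.length; i++) {
            if (Object.prototype.hasOwnProperty.call(sources[i], name)) {
                return sources[i][name];
            }
        }
        return match; // assumption: unresolved tokens are left as they are
    });
}

// Sample data (all hypothetical)
var staticVars = { 'app.data': '/opt/etl/data' };
var threadGlobals = { region: 'eu' };
var regularGlobals = { region: 'us', bucket: 'archive' };
var flowVars = { region: 'apac', start_date: '2024-01-01' };

// Connections and transformations: static, thread-local globals, regular globals
var path = resolveTokens('{app.data}/{bucket}/{region}/{start_date}',
    [staticVars, threadGlobals, regularGlobals]);
// path is '/opt/etl/data/archive/eu/{start_date}' (flow variables are not consulted)

// SQL and scripts: flow variables, thread-local globals, regular globals
var sql = resolveTokens(
    "SELECT * FROM orders_{region} WHERE order_date >= '{start_date}' AND bucket = '{bucket}'",
    [flowVars, threadGlobals, regularGlobals]);
// sql is "SELECT * FROM orders_apac WHERE order_date >= '2024-01-01' AND bucket = 'archive'"
```

The same name (region) resolves to three different values depending on which source is consulted first, which is why it helps to avoid reusing a name across variable types.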

Thread Safety of Variables in Parallel Execution

Etlworks can run many parts of a flow in parallel, including parallel steps in the nested flow, parallel loops, parallel source-to-destination transformations, parallel wildcard transformations, and parallel file operations. Each parallel execution type uses its own thread pool. The default limit is 5 active threads per pool, though this can be changed.

Because flows may run in parallel threads, different types of variables behave differently when accessed and modified at runtime.

Thread safety of Global variables

Etlworks maintains two containers for global variables: a regular container and a thread-safe container.

When setting global variables programmatically inside code that may run in parallel, developers must use the thread-safe container:

com.toolsverse.config.SystemConfig.instance().addPropertyInCurrentEtlThread(key, value);

Parallel loops use this automatically.

When reading global variables programmatically, developers must be aware of whether they want thread-local values:

var props = com.toolsverse.config.SystemConfig.instance().getContextProperties();

var someValue = props.get('name');

or the shared non-thread-safe values:

var someValue = com.toolsverse.config.SystemConfig.instance().getProperties().get('name');

Thread safety of Flow variables

Flow variables are inherently thread-safe. Each parallel execution receives its own copy of the flow object, and therefore its own independent set of flow variables. Updates inside one thread do not affect flow variables in another thread.

Thread safety of Key-value pairs in flow storage

Key-value pairs set through etlConfig.setValue are thread-safe in terms of concurrent access, but they are shared across all threads. Only one instance of each key exists for the entire nested flow hierarchy. Multiple threads can read and write these values, but developers should not expect isolation. A value changed in one thread may be visible in another, which may or may not be desirable depending on the flow design.
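The lack of isolation matters most for read-modify-write updates such as counters. The sketch below uses a plain JavaScript object as a hypothetical stand-in for flow storage and simulates two parallel branches by interleaving their statements by hand (JavaScript itself runs them sequentially). It is meant only to show how an update can be lost when both branches share one key.

```javascript
// Stand-in for flow storage: one shared map for the whole hierarchy (hypothetical)
var storage = {};
function setValue(key, value) { storage[key] = value; }
function getValue(key) {
    return Object.prototype.hasOwnProperty.call(storage, key) ? storage[key] : null;
}

setValue('processed', 0);

// Both branches read the counter before either one writes it back
var readByBranchA = getValue('processed'); // 0
var readByBranchB = getValue('processed'); // 0

setValue('processed', readByBranchA + 10); // branch A adds 10
setValue('processed', readByBranchB + 5);  // branch B adds 5, overwriting A's result

var finalCount = getValue('processed');
// finalCount is 5, not 15: branch A's update was lost
```

One possible way to sidestep this, under the same assumptions, is to have each branch write to its own key and combine the values after the parallel section completes.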
