When to use this Flow type
In Etlworks, a nested Flow allows you to combine multiple Flows of different types—such as those interacting with databases, files, web services, or executing scripts—into a single, cohesive workflow. This capability enables the creation of complex data integration scenarios where Flows can be executed sequentially or in parallel, based on specific conditions or loops.
Create a Nested Flow
Step 1: Initiate a Nested Flow
• Navigate to the Flows window.
• Click Add Flow.
• In the Select Flow Type field, enter Nested Flow.
Step 2: Add Child Flows
• Within the nested Flow, click Add Flow to incorporate existing Flows as steps in your pipeline.
Step 3: Manage Child Flows
• To remove a Flow from the pipeline, use the delete option next to the respective Flow.
• To disable a Flow without removing it, toggle the disable option accordingly.
Configure Child Flow Parameters
For each child Flow within the nested Flow, you can define specific parameters by clicking the pen icon adjacent to the Flow name:
Available child flow parameters
Parallel Execution:
Enable the Run this flow in parallel with other child flows option to execute the child Flow concurrently in its own thread.
Note: Use this feature cautiously, as parallel execution can lead to errors if the Flows have interdependencies or require sequential processing.
Conditional Execution:
In the Condition field, input a JavaScript expression that returns true or false.
The child Flow will execute only if the condition evaluates to true.
Loop Execution:
Set the Loop Type to define how the child Flow should repeat:
• Script: Executes while a JavaScript or Python script returns a non-null value.
• SQL: Executes for each row in the result set of a specified SQL query.
• Files by Wildcard: Executes for each file matching a specified wildcard pattern.
Specify the relevant Connection and Loop Script or File Path based on the selected loop type.
Define the Maximum Number of Iterations to prevent infinite loops, with a default of 100,000 iterations.
Execute Flow conditionally
Nested Flows can be configured to execute certain steps based on specific conditions.
This is useful for scenarios where subsequent actions depend on the outcomes of previous steps.
For detailed guidance, refer to the article on Executing a Flow Conditionally.
Read more about executing Flows conditionally without using a nested Flow.
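As an illustration of the kind of logic a condition can encode, the sketch below decides whether a child Flow should run based on two runtime values. It is plain JavaScript that runs outside Etlworks: `flowVariables`, `FILES_FOUND`, `ENVIRONMENT`, and `shouldRunChildFlow` are hypothetical names invented for this example, and the exact way the result is handed to the Condition field is described in the linked article rather than here.

```javascript
// Hypothetical stand-in for values known at runtime (not an Etlworks API).
var flowVariables = { FILES_FOUND: "3", ENVIRONMENT: "prod" };

// Returns true only when at least one file exists and the run targets production.
function shouldRunChildFlow(vars) {
    var filesFound = parseInt(vars.FILES_FOUND, 10);
    if (isNaN(filesFound)) {
        return false; // a missing or non-numeric count never triggers the step
    }
    return filesFound > 0 && vars.ENVIRONMENT === "prod";
}

var runStep = shouldRunChildFlow(flowVariables);
```

Keeping the decision in one small function that always returns a strict boolean makes the condition easy to reason about when several steps share similar rules.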
Execute Flow in a loop
You can configure child Flows within a nested Flow to execute in a loop, which is beneficial for tasks such as:
- Extracting data from HTTP endpoints with varying parameters.
- Configuring transformations and connections dynamically based on database parameters.
- Setting credentials for connections at runtime.
- Repeating a Flow a specified number of times.
For comprehensive instructions, consult the article on Executing a Flow in a Loop.
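To make the Script loop type more concrete, here is a conceptual model of its behavior as described above: the loop script is evaluated before each iteration, the child Flow runs while the script returns a non-null value, and the maximum number of iterations acts as a safety cap. This is a sketch, not how Etlworks implements loops; `runScriptLoop`, `loopScript`, and `childFlow` are hypothetical names, and treating `undefined` like `null` is an assumption of this example.

```javascript
// Conceptual model only: the names below are hypothetical, not Etlworks APIs.
function runScriptLoop(loopScript, childFlow, maxIterations) {
    var iterations = 0;
    while (iterations < maxIterations) {
        var item = loopScript(iterations);
        if (item === null || item === undefined) {
            break; // the loop script signals the end of the loop
        }
        childFlow(item);
        iterations++;
    }
    return iterations;
}

// Example: run a child Flow once per HTTP query parameter.
var pages = ["page=1", "page=2", "page=3"];
var processed = [];
var count = runScriptLoop(
    function (i) { return i < pages.length ? pages[i] : null; },
    function (param) { processed.push(param); },
    100000 // the default cap mentioned earlier
);
```

The iteration cap matters when a loop script never returns null: in this model the loop simply stops once the cap is reached.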
Execute Flow in parallel with other child flows (steps)
In Etlworks, you can configure any child flow (step) within a nested flow to execute in parallel with other child flows. To enable this functionality, use the “Run this flow in parallel with other child flows” parameter. This option is particularly useful when multiple child flows are independent and can run simultaneously to improve performance.
Use with Caution
While this parameter can significantly improve performance, it should be used with caution. If the child flows (steps) need to execute in a specific order or have dependencies on the completion of other steps, enabling this parameter may cause random errors or unpredictable results. Carefully review the relationships and dependencies between child flows before enabling this setting.
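One possible way to review dependencies before enabling the setting is to write down what each child flow reads and writes, then flag pairs that touch the same resource. The sketch below does this for three made-up steps. Everything in it (the step names, the `reads` and `writes` lists, and the `findConflicts` helper) is hypothetical and exists only to illustrate the review; Etlworks does not expose child flows in this form.

```javascript
// Hypothetical description of child flows: what each one reads and writes.
var steps = [
    { name: "load_customers", reads: ["crm_api"], writes: ["customers"] },
    { name: "load_orders", reads: ["orders_api"], writes: ["orders"] },
    { name: "build_report", reads: ["customers", "orders"], writes: ["report"] }
];

// True when the two lists share at least one resource name.
function overlaps(a, b) {
    return a.some(function (x) { return b.indexOf(x) !== -1; });
}

// Returns pairs of step names that should not run in parallel with each other.
function findConflicts(list) {
    var conflicts = [];
    for (var i = 0; i < list.length; i++) {
        for (var j = i + 1; j < list.length; j++) {
            var a = list[i], b = list[j];
            if (overlaps(a.writes, b.reads) || overlaps(b.writes, a.reads) ||
                overlaps(a.writes, b.writes)) {
                conflicts.push([a.name, b.name]);
            }
        }
    }
    return conflicts;
}

var conflicts = findConflicts(steps);
```

In this example the two load steps share nothing and are candidates for parallel execution, while `build_report` depends on both and should stay sequential.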
Difference Between Parallel Child Flows and Parallel Loop Iterations
Parallel Child Flows:
When you enable “Run this flow in parallel with other child flows,” Etlworks executes this specific child flow alongside other child flows in the nested flow concurrently. This setup is ideal for workflows where each child flow operates independently.
Parallel Loop Iterations:
If a flow is configured to run in a loop, the “Loop Threads” parameter controls whether iterations of that loop execute in parallel. This is distinct from running child flows in parallel and applies to repeated executions of the same flow within a loop structure.
For more details about running flows in parallel loops, see Executing Iterations of a Loop in Parallel.
Flow Variables
Within a nested Flow, you can define Flow Variables as key-value pairs, which can be referenced in SQL and JavaScript code using the {VARIABLE} syntax. This feature facilitates dynamic parameterization and enhances the flexibility of your workflows.
Use Flow variables
A Flow variable can be referenced in the Source query, as demonstrated below:
select * from connection
where lower(name) like '%{NAME}%'
A Flow variable can also be referenced in a source-to-destination transformation and accessed within JavaScript code.
Read about variables in Etlworks.
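To show what the {VARIABLE} syntax amounts to, the following sketch replaces tokens in a query with values from a key-value map. Etlworks performs this substitution itself at runtime; `substituteVariables` is a hypothetical helper written for illustration, and leaving unknown tokens untouched is an assumption of this example rather than documented behavior.

```javascript
// Illustrative only: Etlworks performs the substitution itself at runtime.
function substituteVariables(template, variables) {
    return template.replace(/\{([A-Za-z0-9_]+)\}/g, function (token, key) {
        // Leave unknown tokens as-is (an assumption made for this sketch).
        return Object.prototype.hasOwnProperty.call(variables, key)
            ? String(variables[key])
            : token;
    });
}

var sql = "select * from connection where lower(name) like '%{NAME}%'";
var resolved = substituteVariables(sql, { NAME: "postgres" });
```

Because the value is inserted as plain text, anything placed in a Flow variable becomes part of the SQL statement, so values that come from outside sources deserve validation before use.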
Named Connections
Typically, a nested Flow inherits all the Connections from its inner Flows. However, you can add additional Named Connections under the Connections tab. These named connections can be referenced in scripts and used dynamically within the workflow. For detailed guidance, refer to Use Named Connections in JavaScript.
Use Named Connections
You can then use the named Connections in inner Flows, for example, in JavaScript.
Handling exceptions in nested flow
Etlworks allows you to configure exception handlers for nested Flows using JavaScript. This feature enables you to:
- Conditionally Ignore Exceptions: Decide whether to continue execution despite certain errors.
- Log Errors: Record errors into external logs for further analysis.
What happens when the exception is ignored
- If the nested flow with an exception handler is the main flow, it will finish and its status will be set to 'success'. The ignored error will be added to the flow log.
- If the nested flow with an exception handler is part of another nested flow, execution will continue and the ignored error will be added to the flow log.
- If the nested flow with an exception handler is executed in a loop, the next iteration of the flow will start and the ignored error will be added to the flow log.
Creating an exception handler
To create an exception handler for the nested flow, go to the On Exception tab and enter JavaScript code in the Execute on Exception field.
Ignoring exception
Set the value variable to 2 (ignore) in the last line of the JavaScript code to configure the flow to ignore the exception.
value = 2; // 2 = ignore, 1 = raise
Available variables
The following variables can be referenced by name from JavaScript code:
Name | Class name / JavaDoc | Package |
---|---|---|
etlConfig | com.toolsverse.etl.core.config.EtlConfig | com.toolsverse.etl.core.config |
scenario | com.toolsverse.etl.core.engine.Scenario | com.toolsverse.etl.core.engine |
exception | java.lang.Exception | java.lang |
Example of the exception handler
// Import the packages that provide SqlUtils, SystemConfig, and Utils
importPackage(com.toolsverse.etl.sql.util);
importPackage(com.toolsverse.config);
importPackage(com.toolsverse.util);
// Get the connection named "log"
var con = etlConfig.getConnection("log");
var props = SystemConfig.instance().getProperties();
// Read the file ID from the system properties, defaulting to 0
var fileId = Utils.str2Int(props.get('fileid'), 0);
// Capture the stack trace of the exception, if there is one
var error = exception != null ? Utils.getStackTraceAsString(exception) : null;
var fileSql = "update etl_feed_file_log set ended = CURRENT_TIMESTAMP, " +
    "error = ? where fileid = ?";
try {
    SqlUtils.executeSql(con, fileSql, error, fileId);
} catch (err) {
    // already logged
}
// 2 = ignore the exception
value = 2;
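The handler above always ignores the exception. A handler that ignores only some errors could branch on the exception message instead. The sketch below assumes that `exception` exposes `getMessage()`, which is standard for `java.lang.Exception`. The "No files found" text and the `decide` helper are hypothetical, and the stub `exception` object exists only so the snippet can run outside Etlworks.

```javascript
// Stub so the sketch runs outside Etlworks; inside a handler, `exception` is provided.
var exception = { getMessage: function () { return "No files found in /inbound"; } };

var IGNORE = 2; // per the text above: 2 = ignore
var RAISE = 1;  // 1 = raise

// Hypothetical helper: ignore only errors whose message marks an empty input.
function decide(ex) {
    var message = ex != null && ex.getMessage() != null ? String(ex.getMessage()) : "";
    return message.indexOf("No files found") !== -1 ? IGNORE : RAISE;
}

// As in the example above, the final assignment to `value` decides the outcome.
var value = decide(exception);
```

Defaulting to "raise" for anything unrecognized keeps unexpected failures visible instead of silently marking the flow as successful.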
Dynamic Flow
The Dynamic Flow feature allows you to execute any Flow by its name, determined at runtime. This is particularly useful for creating flexible and reusable workflows.
Create Dynamic flow
Step 1. In the Flows window, click Add flow and type dynamic flow into the Select Flow Type box.
Step 2. Enter an actual flow name or a {VARIABLE_NAME} in the Flow Name field.
Step 3. Add Dynamic Flow as a step to any nested flow.
Use Dynamic Flow
Once the Dynamic Flow is created and added as a step to the nested flow, it can be used by dynamically passing a value for {VARIABLE_NAME}.
Below are some examples:
When executing the Flow manually
Step 1. Click the Run flow button.
Step 2. Click the Add parameters link.
Step 3. Add {VARIABLE_NAME} in KEY and the Flow Name in VALUE.
When scheduling the Flow
Open the schedule, then add {VARIABLE_NAME} in KEY and the Flow Name in VALUE under Flow parameters.
When using Remote Integration Agent
Step 1. Click Edit flow parameters.
Step 2. Add {VARIABLE_NAME} in KEY and the Flow Name in VALUE under Flow parameters.
Error pipeline
The nested flow can be configured to be executed if there is an error anywhere within the nested pipeline.
A single nested on-error flow can be added to multiple nested flows and executed simultaneously.
To configure the nested flow to be executed in case of error, follow these steps:
Step 1. Create a nested flow to be executed on error.
Step 2. Add steps (other flows) to the nested flow.
Step 3. Select the Flow Control tab, then select Execute if Error.
Step 4. Add the flow created in steps 1-3 to the main nested flow.
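The behavior these steps configure can be pictured with a small conceptual model: regular steps run in order, and the steps marked Execute if Error run only when one of them fails. This is a sketch for intuition, not Etlworks internals. `runPipeline` and the step objects are hypothetical, and the model deliberately says nothing about the final status of the main flow.

```javascript
// Conceptual model only; `runPipeline` and the step objects are hypothetical.
function runPipeline(steps, onErrorSteps) {
    var log = [];
    try {
        steps.forEach(function (step) {
            step.run();
            log.push("ok: " + step.name);
        });
    } catch (err) {
        log.push("error: " + err.message);
        // Steps flagged "Execute if Error" run only on this path.
        onErrorSteps.forEach(function (step) {
            step.run();
            log.push("on-error: " + step.name);
        });
    }
    return log;
}

var log = runPipeline(
    [
        { name: "extract", run: function () {} },
        { name: "load", run: function () { throw new Error("target unavailable"); } },
        { name: "archive", run: function () {} }
    ],
    [{ name: "notify", run: function () {} }]
);
```

In this run the `archive` step never executes because `load` fails first, and the `notify` step runs in its place.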
Stop the flow programmatically without raising an exception
Problem
There are situations when you may want to stop the nested flow early based on conditions. For example, if there are no files to process. At this point, there could be multiple steps that have not been executed yet. You can add a condition to each step, but it requires a lot of manual labor.
Solution
Anywhere in the JavaScript or Python code executed within the loop, add the following line of code. This call is thread-safe.
etlConfig.setRequestFlowStop(true)
Example
if (dataSet.getRecordCount() == 0) {
    etlConfig.log("Empty dataset, hence exiting the flow");
    etlConfig.setRequestFlowStop(true);
}