When to use this Flow type
In Etlworks, it is possible to create Flows of different types:
- Flows working with databases.
- Flows working with files.
- Flows working with web services.
- Flows executing SQL and JavaScript.
- Flows sending emails.
- Many other Flow types.
There is an easy way to combine Flows of different types into a single nested Flow (pipeline), add conditions, loops, and parallel execution.
Using this technique, you can create very complex data integration scenarios where one Flow is used as input for another, where a Flow is executed multiple times based on conditions, or where a Flow is not executed at all when conditions are not met.
Create nested Flow
Assuming that you have already created Flows that you want to combine into a single nested Flow:
Step 1. Start creating a nested Flow by opening the Flows window, clicking Add flow, and typing Nested Flow into the Select Flow Type box.
Step 2. Continue by adding Flows to the nested Flow by clicking Add flow.
Remove Flows from the pipeline
Disable Flows
Flow parameters
For each Flow in a pipeline (inner Flow), you can specify parameters by clicking the pen icon.
Available parameters
- Parallel: if enabled (disabled by default), the inner Flow will be executed in parallel, in its own thread. Disable it if you want to execute Flows sequentially.
- Condition: a JavaScript condition, which, if false is returned, disables execution of the inner Flow.
- Loop type: the available loop types are: Script (execute the inner Flow while JavaScript or Python code returns a not null value), SQL (execute SQL, execute the inner Flow for each row of the result set), and Files by wildcard (execute the inner Flow for all files matching a wildcard).
- Connection: an optional field, which is available when SQL or Files by wildcard is selected for the Loop type.
- Loop Script: a JavaScript, Python, SQL, or wildcard loop condition.
- Maximum number of iterations: this parameter is designed to prevent the system from creating infinite loops when the loop conditions are not properly defined. The default value is 100000. By setting the value to <= 0 (for example, to -1), you can disable the prevention mechanism altogether, so be careful.
- Loop Threads: a maximum allowed number of parallel loop executions.
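The interaction between the Script loop type and Maximum number of iterations can be pictured with a small standalone model. This is only a sketch of the behavior described above, not Etlworks code: `runScriptLoop`, `loopScript`, and `innerFlow` are hypothetical names, and the model simply stops when the limit is reached (how the real engine reports that case is not covered here).

```javascript
// Hypothetical model of a Script loop; none of these names are Etlworks APIs.
// loopScript: returns a non-null value to continue, null to stop.
// innerFlow: called once per iteration with the value returned by loopScript.
// maxIterations: guard against runaway loops; a value <= 0 disables the guard.
function runScriptLoop(loopScript, innerFlow, maxIterations = 100000) {
  let iterations = 0;
  while (true) {
    if (maxIterations > 0 && iterations >= maxIterations) {
      break; // guard tripped: stop even though the script may still return a value
    }
    const value = loopScript(iterations);
    if (value === null || value === undefined) {
      break; // loop condition no longer holds
    }
    innerFlow(value);
    iterations += 1;
  }
  return iterations;
}

// A loop script that asks for exactly three iterations.
const processed = [];
const ranThreeTimes = runScriptLoop(
  (i) => (i < 3 ? "batch-" + i : null),
  (value) => processed.push(value)
);

// A badly written loop script that never returns null is cut off by the guard.
const cappedRuns = runScriptLoop(() => "again", () => {}, 5);

console.log(ranThreeTimes, processed, cappedRuns);
```

In the model, passing `-1` as `maxIterations` together with a script that never returns null would loop forever, which is the situation the default limit is meant to prevent.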
Execute Flow conditionally
You can create a nested Flow where some of the steps (inner Flows) are configured to be executed conditionally.
Read how to execute steps in the nested Flow conditionally.
Alternatively, some Flows, such as the Flows below, can be executed conditionally even when they are not part of the nested Flow.
- Send email
- Execute SQL
- Merge data with template
- Send payload to the HTTP endpoint
- Execute shell (command line) script
Read more about executing Flows conditionally without nested flow.
Execute Flow in a loop
You can create a nested Flow where some of the steps (inner Flows) are configured to be executed in a loop.
Typical use cases for executing Flows in a loop:
- Extract data from the HTTP endpoints with parameters. The parameters can be provided at runtime by using the loop.
- Configure the transformations and Connections by reading configuration parameters from the database.
- Dynamically set up the credentials for the Connections.
- Execute a Flow a fixed number of times.
Read how to execute Flow in a loop.
Flow Variables
When creating a nested Flow, you can add Flow Variables, KEY / VALUE pairs, which can be referenced in SQL and JavaScript code as a {VARIABLE}. For example, there is a Flow variable NAME, defined for the nested Flow:
Use Flow variables
A Flow variable can be referenced in the Source query as demonstrated below:
select * from connection
where lower(name) like '%{NAME}%'
A Flow variable can also be accessed within the JavaScript.
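To make the {VARIABLE} notation concrete, the sketch below shows the kind of token substitution that turns the query above into executable SQL. It is an illustration only, not the Etlworks implementation: `resolveVariables` is a hypothetical helper, and the value `oracle` for NAME is an example chosen here.

```javascript
// Hypothetical helper: replaces {KEY} tokens with values from a KEY/VALUE map.
// Tokens with no matching key are left untouched.
function resolveVariables(template, variables) {
  return template.replace(/\{([A-Za-z0-9_]+)\}/g, (token, key) =>
    Object.prototype.hasOwnProperty.call(variables, key)
      ? String(variables[key])
      : token
  );
}

const flowVariables = { NAME: "oracle" }; // example value, not from the original
const sourceQuery =
  "select * from connection\nwhere lower(name) like '%{NAME}%'";

const resolvedQuery = resolveVariables(sourceQuery, flowVariables);
console.log(resolvedQuery);
```

Because the value is inserted as plain text, a variable that comes from an untrusted source should be validated before it is used in a query.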
Named Connections
Typically, a nested Flow inherits all the Connections from the inner Flows. You can add additional named Connections under the Connections tab.
Use named Connections
You can then use the named Connections in inner Flows, for example, in JavaScript.
Handling exceptions in nested flow
It is possible to configure a JavaScript exception handler for a nested flow.
When to use exception handler
- To conditionally ignore exceptions in any flow that is included in the nested flow.
- To record the error in an external log.
What happens when the exception is ignored
- If the nested flow with an exception handler is the main flow, it will finish and the status will be set to 'success'. The ignored error will be added to the flow log.
- If the nested flow with an exception handler is a part of another nested flow, the execution will continue and the ignored error will be added to the flow log.
- If the nested flow with an exception handler is executed in a loop, the next iteration of the flow will start and the ignored error will be added to the flow log.
Creating an exception handler
To create an exception handler for the nested flow, go to the On Exception tab and enter JavaScript code in the Execute on Exception field.
Ignoring exception
Set the value variable to 2 (ignore) in the last line of the JavaScript code to configure the flow to ignore the exception.
value = 2; // 2 = ignore, 1 = raise
Available variables
The following variables can be referenced by name from JavaScript code:
Name | Class name / JavaDoc | Package |
---|---|---|
etlConfig | com.toolsverse.etl.core.config.EtlConfig | com.toolsverse.etl.core.config |
scenario | com.toolsverse.etl.core.engine.Scenario | com.toolsverse.etl.core.engine |
exception | java.lang.Exception | java.lang |
Example of the exception handler
importPackage(com.toolsverse.etl.sql.util);
importPackage(com.toolsverse.config);
importPackage(com.toolsverse.util);
var con = etlConfig.getConnectionFactory().getConnection("log");
var props = SystemConfig.instance().getProperties();
var fileId = Utils.str2Int(props.get('fileid'), 0);
var error = exception != null ? Utils.getStackTraceAsString(exception) : null;
// record the end time and the error for the current file in the log table
var fileSql = "update etl_feed_file_log set ended = CURRENT_TIMESTAMP, " +
    "error = ? where fileid = ?";
try {
    SqlUtils.executeSql(con, fileSql, error, fileId);
} catch (err) {
    // already logged
}
value = 2;
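The handler above always ignores the exception. For the "conditionally ignore" use case, one possible approach is to decide based on the error message. The following is a minimal sketch under stated assumptions: `decideOnException` and `IGNORABLE_MESSAGES` are hypothetical names, the message fragments are invented examples, and the two stand-in objects exist only so the function can be tried outside Etlworks. The only call made on the exception is `getMessage()`, which `java.lang.Exception` provides.

```javascript
// Sketch: decide whether to ignore (2) or raise (1) based on the error message.
// IGNORABLE_MESSAGES is a hypothetical list; adjust it to your own errors.
var IGNORABLE_MESSAGES = ["no files to process", "connection reset"];

function decideOnException(exception) {
  // exception is expected to behave like java.lang.Exception (getMessage()).
  var message = exception != null && exception.getMessage() != null
    ? String(exception.getMessage()).toLowerCase()
    : "";
  for (var i = 0; i < IGNORABLE_MESSAGES.length; i++) {
    if (message.indexOf(IGNORABLE_MESSAGES[i]) >= 0) {
      return 2; // ignore
    }
  }
  return 1; // raise
}

// Stand-ins for the engine-provided exception, used only to try the function out.
var ignorable = { getMessage: function () { return "No files to process in /inbox"; } };
var fatal = { getMessage: function () { return "ORA-00942: table or view does not exist"; } };

console.log(decideOnException(ignorable), decideOnException(fatal), decideOnException(null));
```

Inside the Execute on Exception field, the stand-ins would be dropped and the last line would become `value = decideOnException(exception);`.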
Dynamic Flow
Dynamic flow allows users to configure a dynamic workflow step that can execute any flow by name.
Create Dynamic flow
Step 1. In the Flows window, click Add flow and type dynamic flow into the Select Flow Type box.
Step 2. Enter an actual flow name or a {VARIABLE_NAME} in the Flow Name field.
Step 3. Add Dynamic Flow as a step to any nested flow.
Use Dynamic Flow
Once the Dynamic Flow is created and added as a step to the nested flow, it can be used by dynamically passing the value of the {VARIABLE_NAME}.
Below are some of the examples:
When executing the Flow manually
Step 1. Click the Run flow button.
Step 2. Click the Add parameters link.
Step 3. Add {VARIABLE_NAME} in KEY and the Flow name in VALUE.
When scheduling the Flow
Open the schedule and, in Flow parameters, add {VARIABLE_NAME} in KEY and the Flow name in VALUE.
When using Remote Integration Agent
Step 1. Click Edit flow parameters.
Step 2. In Flow parameters, add {VARIABLE_NAME} in KEY and the Flow name in VALUE.
Error pipeline
The nested flow can be configured to be executed if there is an error anywhere within the nested pipeline.
A single nested on-error flow can be added to multiple nested flows and executed simultaneously.
To configure the nested flow to be executed in case of error, follow these steps:
Step 1. Create a nested flow to be executed on error.
Step 2. Add steps (other flows) to the nested flow.
Step 3. Select the Flow Control tab, then select Execute if Error.
Step 4. Add the flow created in steps 1-3 to the main nested flow.
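The control flow set up by the steps above can be pictured with a small standalone model. It is a sketch under assumptions, not a description of the engine: `runNestedFlow` and the step objects are hypothetical, and the model assumes that regular steps after the failing one are skipped and that the on-error flow receives the error.

```javascript
// Hypothetical model of a nested flow with an "Execute if Error" step.
// steps: array of { name, run, executeIfError } objects; not an Etlworks structure.
function runNestedFlow(steps) {
  const log = [];
  let failure = null;
  for (const step of steps.filter((s) => !s.executeIfError)) {
    try {
      step.run();
      log.push(step.name + ": ok");
    } catch (err) {
      failure = err;
      log.push(step.name + ": failed (" + err.message + ")");
      break; // assumption: remaining regular steps are skipped
    }
  }
  if (failure !== null) {
    // Only now do the steps marked "Execute if Error" run.
    for (const step of steps.filter((s) => s.executeIfError)) {
      step.run(failure);
      log.push(step.name + ": ran on error");
    }
  }
  return log;
}

const notifications = [];
const flowLog = runNestedFlow([
  { name: "extract", run: () => {} },
  { name: "load", run: () => { throw new Error("target unavailable"); } },
  { name: "archive", run: () => {} },
  { name: "on-error", executeIfError: true, run: (err) => notifications.push(err.message) },
]);

console.log(flowLog, notifications);
```

When no step fails, the on-error step in the model never runs, which mirrors the purpose of the Execute if Error setting.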
Stop the flow programmatically without raising an exception
Problem
There are situations when you may want to stop the nested flow early based on conditions, for example, when there are no files to process. At that point, there could be multiple steps that have not been executed yet. You could add a condition to each step, but that requires a lot of manual work.
Solution
Anywhere in the JavaScript or Python code executed within the loop, add the following line of code. This call is thread-safe.
etlConfig.setRequestFlowStop(true)
Example
if (dataSet.getRecordCount() == 0) {
    etlConfig.log("Empty dataset, hence exiting the flow");
    etlConfig.setRequestFlowStop(true);
}
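The effect of the stop request on the remaining steps can be illustrated with a standalone model. This is a sketch, not the engine: `createConfig`, `stopWasRequested`, and `runSteps` are hypothetical names, and only `setRequestFlowStop` mirrors the call shown above. The model assumes the flag is checked before each step starts.

```javascript
// Hypothetical model: the config object stands in for etlConfig,
// and the step objects stand in for inner flows.
function createConfig() {
  let stopRequested = false;
  return {
    setRequestFlowStop(flag) { stopRequested = flag; },
    stopWasRequested() { return stopRequested; }, // model-only helper
  };
}

function runSteps(steps, config) {
  const executed = [];
  for (const step of steps) {
    if (config.stopWasRequested()) {
      break; // no exception is raised; the remaining steps are just not started
    }
    step.run(config);
    executed.push(step.name);
  }
  return executed;
}

const executedSteps = runSteps(
  [
    {
      name: "list files",
      run: (config) => {
        const fileCount = 0; // pretend nothing was found
        if (fileCount === 0) config.setRequestFlowStop(true);
      },
    },
    { name: "load files", run: () => {} },
    { name: "send report", run: () => {} },
  ],
  createConfig()
);

console.log(executedSteps);
```

A single call in the first step replaces the per-step conditions that would otherwise be needed on "load files" and "send report".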