Overview
In Etlworks, most ETL, streaming, and file-based flows require selecting an existing source or destination connection, or both. There are hundreds of connection and format types to choose from.
However, in many complex data integration pipelines, you may not know at design time which source or destination connection will be used when the flow runs. This is especially true when:
- The actual connection and/or format is determined dynamically at runtime based on input parameters.
- The connection type itself (e.g., S3 vs. SFTP) is unknown at design time.
In such cases, the solution is a dynamic connection, which allows developers to configure pipelines where the actual connection (and optionally format) is resolved dynamically at runtime.
Types of Dynamic Connections
There are two types of dynamic connections in Etlworks: Dynamic Connection and Legacy Dynamic Connection. For new integrations, we recommend the Dynamic Connection type, as it provides greater flexibility and eliminates the need to define all connections at design time.
Dynamic Connection (Recommended)
- Resolved before the flow starts.
- The names of connections and formats can be passed to the flow as parameters, which can be configured through the run flow dialog, the scheduler settings, or the Integration Agent configuration.
- The system automatically finds the actual connection and format that matches the configured name.
- There is no need to add an actual connection as a named connection to the parent flow.
- More flexible, as connections do not have to be predefined.
Legacy Dynamic Connection (For Backward Compatibility)
- Requires adding actual connections as named connections to the parent nested flow.
- For legacy dynamic connections, the names of connections are passed as global variables.
- The system resolves the dynamic connection by looking up predefined named connections at runtime.
- Less flexible, as all possible connections must be known at design time.
- Still available but not recommended for new integrations.
When to use dynamic connections
Dynamic connections are particularly useful in the following scenarios:
Customer-Specific Data Storage
When dealing with data owned by various customers, each having its own storage preferences, but with otherwise similar pipelines. You can configure a common pipeline with a dynamic source connection, enabling it to be shared across multiple customers. The actual connection is set at runtime when the pipeline runs.
Multiple Data Destinations
When sending data to various destinations (e.g., HTTP, SFTP, S3) after extracting and transforming it from the source. Instead of creating a separate source-to-destination transformation for each target, you can use a dynamic connection for the destination and set it at runtime as a parameter.
Diverse Source and Destination Handling
When reading data from various sources, transforming it, and sending it to different destinations. This can be efficiently implemented as a single flow with dynamic source and destination connections, where both connections are set at runtime.
Flows which support dynamic connections
Flow types which support Dynamic connection and Format
- All flow types where the user can select a connection and/or format.
Flow types which support Legacy Dynamic connection
- Any ETL flow where the source or the destination is a file storage (local, networked or cloud)
- Any ETL flow where the source or the destination is a web service
- Any-to-Any ETL flow
- Any file-based flow (COPY/MOVE/DELETE/etc.)
- Any flow which works with email servers (send and receive)
- Merge data with template
- Call HTTP endpoint
- Apply XSL style sheet to XML files
- Split files
- Merge files
- Any flow which works with HL7 messages
- Any flow which works with EDI (X12, EDIFACT, etc.) messages
Using Dynamic Connections
Create dynamic connection
Step 1. Create new dynamic connection.
Step 2. Set the Connection Name. The Connection Name can include {tokens}. These tokens will be dynamically replaced with the values of parameters with the same name when the flow runs.
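The token replacement described above can be pictured with a minimal sketch. This is only an illustration of the idea, not Etlworks code: the function name resolve_name, the token pattern, and the decision to raise an error for a missing parameter are all assumptions made for the example.

```python
import re

# Assumption: a token is any text wrapped in single curly braces, e.g. {customer}.
TOKEN = re.compile(r"\{([^{}]+)\}")


def resolve_name(template: str, params: dict[str, str]) -> str:
    """Replace each {token} with the value of the parameter of the same name."""

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in params:
            # Hypothetical behavior: the real product may handle this differently.
            raise KeyError(f"no value supplied for token {{{name}}}")
        return params[name]

    return TOKEN.sub(substitute, template)


# A Connection Name of "{customer}_storage" with the parameter customer=acme
# resolves to the connection named "acme_storage".
print(resolve_name("{customer}_storage", {"customer": "acme"}))
```

With this picture in mind, the same flow can serve many customers as long as a connection whose name matches the resolved value exists.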
Optionally create Dynamic Format
Step 3. Create new dynamic format.
Step 4. Set the Format Name. The Format Name can include {tokens}. These tokens will be dynamically replaced with the values of parameters with the same name when the flow runs.
Create flow with dynamic connection
Step 5. Create a new flow that requires a connection and/or format. All flows support dynamic connections and formats.
Step 6. Configure a transformation where the source and/or the destination is dynamic.
Set value of the Connection Name and/or Format Name property
Step 7. Use any of the methods below to set the values of the parameters used in the Connection Name set in step 2 and/or the Format Name set in step 4:
- Variables set when executing flow manually: When running a flow manually, variables can be entered directly before the flow starts.
- Variables set when configuring the schedule: When setting up a scheduled flow, variables can be defined as part of the schedule configuration.
- Variables set when configuring the flow to be executed by Integration Agent: If the flow is executed by an Integration Agent, variables can be set in the agent’s flow configuration.
- URL parameters in call-flow-by-name API: Pass variables through URL parameters when triggering a flow by name.
- Payload in call-flow-by-ID API: Provide variables via the payload when triggering a flow by ID.
- User-defined API URL parameters: Use parameters defined in your custom API endpoints.
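For the URL-parameter options, the variables travel as ordinary query-string pairs whose names match the {tokens}. The sketch below only shows how such a query string might be assembled. The endpoint URL is a made-up placeholder, not the actual Etlworks API path, and the helper name with_flow_parameters and the parameter names are hypothetical.

```python
from urllib.parse import urlencode


def with_flow_parameters(endpoint_url: str, params: dict[str, str]) -> str:
    """Append flow parameters to an endpoint URL as a query string."""
    # Use "&" if the URL already carries a query string, otherwise start one.
    separator = "&" if "?" in endpoint_url else "?"
    return endpoint_url + separator + urlencode(params)


# Placeholder URL: substitute the real endpoint from your own instance.
url = with_flow_parameters(
    "https://etl.example.invalid/run-my-flow",
    {"connection": "acme_sftp", "format": "acme_csv"},
)
print(url)
```

Here a Connection Name of {connection} and a Format Name of {format} would receive the values acme_sftp and acme_csv.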
Here is an example when running flow manually:
Using Legacy dynamic connections
Create legacy dynamic connection
Step 1. Create new legacy dynamic connection.
Step 2. Set the Connection Name. The Connection Name can include {tokens}. These tokens will be dynamically replaced with the values of global variables with the same name when the flow runs.
Optionally configure legacy Dynamic Format
Step 4. Optionally set Format Source to Use Format configured in the named connection. By default, the flow uses the Format configured in the flow, applying the format specified directly in the flow configuration. When you select Use Format configured in the named connection, the format defined in the named connection is dynamically applied during execution.
Here is an example:
The format in the named connection is set to JSON:
The format in the flow is set to Any:
If Format Source is set to Use Format configured in the named connection, the flow will use the JSON format.
If Format Source is set to Use Format configured in the flow, the flow will use the Any format (or another format specified directly in the flow).
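The effect of the Format Source setting can be summarized as a small decision rule. The sketch below is only a model of the behavior described above. The enum and function names are hypothetical, and formats are represented as plain strings for simplicity.

```python
from enum import Enum


class FormatSource(Enum):
    # Labels taken from the two options described in the text.
    FLOW = "Use Format configured in the flow"
    NAMED_CONNECTION = "Use Format configured in the named connection"


def effective_format(source: FormatSource, flow_format: str,
                     named_connection_format: str) -> str:
    """Return the format the flow would use for the given Format Source."""
    if source is FormatSource.NAMED_CONNECTION:
        return named_connection_format
    # Default: the format specified directly in the flow configuration.
    return flow_format


# The example from the text: JSON in the named connection, Any in the flow.
print(effective_format(FormatSource.NAMED_CONNECTION, "Any", "JSON"))
print(effective_format(FormatSource.FLOW, "Any", "JSON"))
```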
Create flow with legacy dynamic connection
Step 5. Create a flow that uses dynamic connections. You can choose any flow type that supports this feature.
Create named connections
Step 6. Create a nested flow and add flow created in step 5.
Step 7. Add as many named connections as needed, each pointing to an actual connection such as S3, SFTP, or HTTP. Any of these connections can be used in place of the dynamic connection. The Connection Name property set in step 2 will be used to look up the actual connection by name.
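The legacy lookup can be thought of as two steps: fill the {tokens} from global variables, then search only among the named connections attached to the nested flow. The sketch below models that idea under stated assumptions. The function name, the dictionary representation of named connections, and the error raised for an unknown name are all hypothetical.

```python
def lookup_named_connection(name_template: str,
                            global_variables: dict[str, str],
                            named_connections: dict[str, str]) -> str:
    """Resolve a legacy Connection Name and find it among named connections."""
    name = name_template
    # Replace each {token} with the global variable of the same name.
    for variable, value in global_variables.items():
        name = name.replace("{" + variable + "}", value)
    try:
        return named_connections[name]
    except KeyError:
        # Hypothetical behavior: only connections added at design time are found.
        raise LookupError(
            f"'{name}' is not a named connection of the nested flow"
        ) from None


# Named connections added to the nested flow at design time (made-up names).
named_connections = {
    "acme_s3": "Amazon S3 connection",
    "acme_sftp": "SFTP connection",
}
print(lookup_named_connection("acme_{target}", {"target": "sftp"}, named_connections))
```

The failure branch illustrates why the legacy type is less flexible: a name that was not added as a named connection cannot be resolved.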
Set value of the legacy Connection Name property at runtime
Step 9. Use any of these methods to set the values of the global variables used in the Connection Name property set in step 2.
Using dynamic flow together with legacy dynamic connection
This applies to legacy dynamic connections only.
The example above assumes that while the connection is dynamic, you know all possible connections that can be used in place of the dynamic connection at design time. If this is not the case, or if you want to create a common pipeline that can be extended dynamically by adding new steps while keeping the rest of the pipeline intact, you can use dynamic flows together with dynamic connections.
In this scenario, you still need to create the dynamic connections and flows that use them. However, instead of adding all possible actual connections as named connections to the nested flow, you can include a step that executes the dynamic flow by name. This dynamic flow can run any other flow, such as one that adds dynamic connections as named connections at runtime. You can create multiple such flows for various use cases without ever updating the main flow that utilizes them. The main flow will have a single dynamic step that executes any of these auxiliary flows.