Overview
In Etlworks, most ETL, streaming, and file-based flows require selecting an existing source connection, destination connection, or both. There are hundreds of connection types to choose from, including over 20 connection types for local and cloud file storage systems such as AWS S3, Box, and SFTP.
When you build complex data integration pipelines with numerous input parameters, you may not know at design time which source or destination connection the flow will use. That decision is made at runtime based on certain conditions. If you know the type of the connection, such as SFTP, you can always add parameters when configuring the connection. However, if the connection type is not known at design time (e.g., it could be either S3 or SFTP), the solution is a dynamic connection.
Dynamic connections allow developers to configure pipelines where the actual source or destination connection can be dynamically set at runtime.
When to use dynamic connections
Dynamic connections are particularly useful in the following scenarios:
Customer-Specific Data Storage
Use this when data is owned by various customers who each have their own storage preferences but otherwise similar pipelines. You can configure a common pipeline with a dynamic source connection, enabling it to be shared across multiple customers. The actual connection is set at runtime when the pipeline runs.
Multiple Data Destinations
Use this when data is sent to various destinations (e.g., HTTP, SFTP, S3) after it is extracted from the source and transformed. Instead of creating a separate source-to-destination transformation for each target, you can use a dynamic connection for the destination and set it at runtime as a parameter.
Diverse Source and Destination Handling
Use this when data is read from various sources, transformed, and sent to different destinations. This can be implemented efficiently as a single flow with dynamic source and destination connections, where both connections are set at runtime.
Flows which support dynamic connections
- Any ETL flow where the source or the destination is a file storage (local, networked or cloud)
- Any ETL flow where the source or the destination is a web service
- Any-to-Any ETL flow
- Any file-based flow (COPY/MOVE/DELETE/etc.)
- Any flow which works with email servers (send and receive)
- Merge data with template
- Call HTTP endpoint
- Apply XSL style sheet to XML files
- Split files
- Merge files
- Any flow which works with HL7 messages
- Any flow which works with EDI (X12, EDIFACT, etc.) messages
Step-by-step process
Step 1. Create a new dynamic connection.
Step 2. Set the Connection Name. The Connection Name can include {tokens}, which are replaced at runtime with the values of global variables, which is what makes the connection dynamic.
Step 3. Create a flow that uses dynamic connections. You can choose any flow type that supports this feature.
Step 4. Create a nested flow and add the flow created in step 3.
Step 5. Add as many named connections as needed, each pointing to an actual connection such as S3, SFTP, or HTTP. Any of these connections can be used in place of the dynamic connection. The Connection Name property set in step 2 is used to look up the actual connection by name.
Step 6. Use any of the available methods to set the values of the global variables used in the Connection Name property set in step 2.
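The lookup described in steps 2, 5, and 6 can be illustrated with a small Python sketch. This is a conceptual model only, not Etlworks code or its API: the function names, the `{customer}_storage` template, and the connection dictionaries are all hypothetical and exist just to show how a tokenized name resolves to one of several named connections.

```python
import re

# Matches {token} placeholders in a connection name template.
TOKEN = re.compile(r"\{([^{}]+)\}")


def resolve_connection_name(template, global_vars):
    """Replace each {token} with the value of the matching global variable."""
    def substitute(match):
        key = match.group(1)
        if key not in global_vars:
            raise KeyError(f"global variable not set: {key}")
        return str(global_vars[key])
    return TOKEN.sub(substitute, template)


def lookup_connection(template, global_vars, named_connections):
    """Resolve the template, then find the actual connection by name."""
    name = resolve_connection_name(template, global_vars)
    if name not in named_connections:
        raise LookupError(f"no named connection called {name!r}")
    return named_connections[name]


# Hypothetical named connections attached to the nested flow (step 5).
named_connections = {
    "acme_storage": {"type": "S3", "bucket": "acme-data"},
    "globex_storage": {"type": "SFTP", "host": "sftp.globex.example"},
}

# Hypothetical global variable set at runtime (step 6).
conn = lookup_connection("{customer}_storage", {"customer": "globex"}, named_connections)
print(conn["type"])  # prints "SFTP"
```

In this model, a token whose global variable is missing, or a resolved name with no matching named connection, fails immediately rather than falling back to a default. How Etlworks itself handles those cases is not covered here.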
Using dynamic flows together with dynamic connections
The example above assumes that while the connection is dynamic, you know all possible connections that can be used in place of the dynamic connection at design time. If this is not the case, or if you want to create a common pipeline that can be extended dynamically by adding new steps while keeping the rest of the pipeline intact, you can use dynamic flows together with dynamic connections.
In this scenario, you still need to create the dynamic connections and flows that use them. However, instead of adding all possible actual connections as named connections to the nested flow, you can include a step that executes the dynamic flow by name. This dynamic flow can run any other flow, such as one that adds dynamic connections as named connections at runtime. You can create multiple such flows for various use cases without ever updating the main flow that utilizes them. The main flow will have a single dynamic step that executes any of these auxiliary flows.
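As a rough illustration of this pattern, the Python sketch below models a main flow with a single dynamic step that runs an auxiliary flow chosen by name, where the auxiliary flow registers the actual connection at runtime. Everything here is hypothetical (the flow names, the `setup_flow` and `connection_name` variables, and the registry dictionary); it is a conceptual model, not how Etlworks is implemented or scripted.

```python
def add_s3_connection(named_connections):
    # Hypothetical auxiliary flow: registers an S3 connection for this run.
    named_connections["customer_storage"] = {"type": "S3", "bucket": "example-bucket"}


def add_sftp_connection(named_connections):
    # Hypothetical auxiliary flow: registers an SFTP connection for this run.
    named_connections["customer_storage"] = {"type": "SFTP", "host": "sftp.example.com"}


# Registry of auxiliary flows. New entries can be added without touching
# run_main_flow, mirroring "without ever updating the main flow".
AUXILIARY_FLOWS = {
    "add_s3_connection": add_s3_connection,
    "add_sftp_connection": add_sftp_connection,
}


def run_main_flow(global_vars):
    """Run the single dynamic step, then resolve the dynamic connection."""
    named_connections = {}  # starts empty: nothing is fixed at design time

    # Dynamic step: execute whichever auxiliary flow is named at runtime.
    flow_name = global_vars["setup_flow"]
    if flow_name not in AUXILIARY_FLOWS:
        raise LookupError(f"no auxiliary flow called {flow_name!r}")
    AUXILIARY_FLOWS[flow_name](named_connections)

    # Rest of the pipeline: look up the actual connection by name.
    return named_connections[global_vars["connection_name"]]


result = run_main_flow({
    "setup_flow": "add_sftp_connection",
    "connection_name": "customer_storage",
})
print(result["type"])  # prints "SFTP"
```

Supporting a new storage type in this model means adding one more auxiliary flow to the registry; the main flow body stays unchanged.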