Continuous Delivery Model
Etlworks uses a continuous delivery model. With this approach, bug fixes, new features, and enhancements are released as soon as they are ready.
Updates are installed on an as-needed basis, while numbered releases are automatically deployed to individual Etlworks instances on a rolling schedule, ensuring that all users receive the latest improvements without delay.
What's New?
Version: 7.1.4
Etlworks is now available on the AWS Marketplace:
- Learn More and Get Started: Etlworks on AWS Marketplace
Improvements for Integration Agents:
- Auto-Retry for Scheduled Flows: Integration Agents now support automatic retries for scheduled flows, enhancing reliability in case of transient issues. Read about configuring auto-retry for flows scheduled on Integration Agent.
- Reupload Flow Execution Stats When Host Instance Is Back Online: Integration Agents will reupload statistics and logs for flows executed during host instance downtime. Additionally, webhook notifications will be triggered if they are configured. Read about configuring the reupload.
New Functionality:
- Shopify and Airtable Connectors:
  - Shopify connector now supports GraphQL APIs.
  - Airtable connector now supports authentication with auth tokens.
- Improvements for ZOHO Creator Connector:
  - Enhanced metadata retrieval.
  - Full support for all data types available in ZOHO Creator.
- Human-Readable Field Names in X12 Connector: When converting X12 to JSON, human-readable field names are now preserved. Previously, this functionality was limited to X12 to XML conversions.
Important Changes Under the Hood:
- Optimized Large File Processing for Azure Storage Connector: Significant performance improvements when handling large files in Azure Blob Storage.
- Improved X12 Processing Efficiency: Memory usage for X12 processing has been optimized, improving scalability and stability.
- S3 Connector Auto-Retry for Metadata Service Errors: Auto-retry now covers scenarios where the connector fails to load credentials from the EC2 instance metadata service.
- Default Schedule Update: The default schedule is now set to one hour. This addresses issues caused when users saved the default one-minute schedule without adjusting it to a more reasonable cadence.
Important Bug Fixes:
- Fixed "Exclude Column(s)" Property in MongoDB CDC Connector: Resolved an issue where the property to exclude specific fields during data extraction was not functioning as intended.
- Resolved Scientific Notation Issues in CDC Connectors: CDC connectors now handle DOUBLE data type columns more accurately by removing scientific notation, ensuring precise and consistent data representation.
Version: 7.0.6
New Functionality:
- Automatic Partitioning: When configuring ETL flows, Etlworks now enables the use of Partition SQL. This feature allows you to define custom partitioning conditions via an SQL query, automatically creating separate ETL transformations for each partition generated. These transformations can then execute in parallel, significantly improving efficiency. Learn more about Automatic Partitioning. See the sketch after this list.
- 25 New Connectors: This update introduces 25 new connectors. It also expands support across the entire ZOHO suite of applications, including ZOHO CRM, ZOHO Creator, ZOHO Projects, and ZOHO Inventory.
- HTML and PDF Read Connectors: The HTML and PDF connectors can now read as well as write documents. When reading, the connectors attempt to identify and parse tables within the document. For a single table, a flat dataset is generated; if multiple tables are detected, a nested document is created for easier handling. More on the HTML and PDF read connectors.
- SFTP2 Connector: Users can now choose between the SFTP and SFTP2 connectors. The new SFTP2 connector, currently in beta, utilizes a modern non-blocking IO library, enabling faster performance, especially for larger data transfers. Check out the details on the SFTP2 Connector.
- Automatic ISO Date Conversion in MongoDB: With this release, timestamps in MongoDB can be automatically converted to ISODate format. Storing timestamps as ISODate objects enables more efficient date-based queries and indexing, while allowing flexible configuration for optional conversion. Learn more about ISO Date Conversion.
- Notifications for Interrupted Flows: To maintain consistency, Etlworks now sends notifications for flows that were running when the Etlworks instance restarted without a graceful shutdown. Upon reboot, previously running flows are marked as 'failed', triggering notifications and webhooks, if configured.
- Core Dump API Endpoint: The new Core Dump API enables users to capture the JVM's current state by generating a core dump file for diagnosing system issues. This API saves the core dump in {app.data}/errors and provides the file path. This file can be analyzed with tools like jmap, jstack, or JProfiler for debugging. Check out the Core Dump API.
- X12 Connector Enhancements: The X12 connector now supports converting X12 documents to both JSON and XML formats, expanding integration options for users. Learn more about the X12 Connector. We also improved the resource allocation efficiency of the X12 connector when processing large messages.
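For illustration, here is a minimal sketch of the kind of Partition SQL query that could drive Automatic Partitioning, assuming each row returned by the query becomes a separate partition; the table and column names are hypothetical, and the exact contract is described in the Automatic Partitioning documentation.

    -- one partition per calendar month found in the (hypothetical) source table
    SELECT DISTINCT DATE_TRUNC('month', created_at) AS partition_value
    FROM sales.orders;

Each returned value would then drive its own ETL transformation, and the resulting partitions can be processed in parallel.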
Important Changes Under the Hood:
- Improved Scripting Engine: The scripting engine is now 20% more efficient, which enhances the performance of flows utilizing JavaScript or Python to calculate column values on a per-record basis.
- CDC Flow Commit Synchronization: CDC flows that write to a database now synchronize offset commits with destination commits, ensuring that offset (source database log position) isn't committed until all changes reach the destination.
- User Identification in Integration Agent: The Integration Agent now captures the name of users who manually execute flows, which is also available in webhook payloads. This improvement enables differentiation between manually triggered and scheduled flows executed by Integration Agent.
- Enhanced Exception Handling for Table Creation: The user-configurable exception handler now extends to the "create table" SQL command, which is triggered when the flow determines that a table does not exist by attempting to retrieve its metadata. This enhancement allows users to better manage exceptions that occur during automatic table creation, providing greater control over error handling and flow execution. Learn more about exception handling in table ETL flows.
Important Bug Fixes:
- Excel XLSX Column Alignment: We fixed an edge case causing column misalignment when reading XLSX files created by certain third-party tools.
- AS400 CDC Memory Leak: A memory leak in the AS400 CDC connector has been fixed, resulting in improved stability.
Version: 6.8.2
Improvements for Integration Agents:
- Fully Automated Update with Restart: Agents now support fully automated updates through a new Update service, ensuring that agents are kept up to date with minimal manual intervention. This includes automatically restarting the agent after updates.
- Remote Start, Stop, and Log Retrieval: Agents can now be started and stopped remotely, with the added ability to download log files directly from the UI for improved troubleshooting and management.
- Remote Configuration of JVM Parameters: You can now remotely configure JVM parameters for agents, offering better control over performance tuning without requiring direct access to the server running the agent.
- Linux Installer: In addition to the Windows installer, a Linux installer is now available, supporting a broader range of operating systems and environments.
- Improved Communication Protocol: The communication between agents and the central platform (mothership) has been optimized, resulting in a 5x reduction in outbound traffic. This leads to improved performance and lower bandwidth usage.
New Functionality:
- Connector for MQTT Brokers: The new MQTT connector enables seamless integration with MQTT brokers, allowing for the efficient sending and receiving of messages across MQTT topics. This connector works similarly to other message queue connectors in Etlworks and provides flexible configuration options for reliable message exchange.
- Flow Analysis for Performance Optimization and Structural Integrity: The Flow Inspection service analyzes executed flows for performance bottlenecks and structural issues. It generates detailed reports identifying critical, major, minor, and informational issues, helping users optimize performance and improve flow integrity.
- ETL Threads Info API: This new API provides visibility into the resource consumption (CPU and memory) of ETL processes across nodes. It’s a valuable tool for identifying performance bottlenecks in specific flows and for overall system monitoring.
- Using Credential Manager APIs with Connections: Secure and scalable credential management is now easier with the ability to use external credential managers (like AWS Secrets Manager or STS) to dynamically manage credentials for connections. The Script to run before connect parameter enables dynamic credential retrieval, enhancing security and ease of management.
- Chunked Download for S3 Connector: The S3 connector now supports chunked downloads, improving the performance and reliability of transferring large files.
- Optimized MongoDB Loading: New options are available to improve performance when loading documents into MongoDB, particularly when using batch operations and the new Insert method. Using Insert is faster than Update or Replace and is ideal for initial data loads.
- Byte Array Format: This new format is perfect for scenarios where message content must be transferred between systems (e.g., MQTT to ActiveMQ) without any modification. The raw byte data remains intact, preserving its original structure during transit between systems.
- Flow and Agent Scheduler Enhancements: Schedulers now support activation and deactivation dates, allowing for automatic enablement or disablement of scheduled flows based on these settings.
- Webhooks Enhancements: Webhooks now support the ability to include or exclude specific flows, providing more granular control over which flows trigger the webhook.
UI/UX Improvements:
- Version Control for Integration Agents: The UI now includes support for version control, making it easier to manage different agent versions across environments.
- Agent UI Enhancements: The Agent UI now includes the ability to duplicate agents and associated flows, and displays additional details such as version, OS, and IP address.
- Improved Tagging with Partial Matching: You can now add tags with partial matching, making it easier to categorize and filter items based on similar tags.
- Flow Execution Duration Display: Flow execution duration is now displayed across all screens that show flow statistics, in addition to the previously available start and end timestamps.
- Last Modified Date for Files in Explorer: The Explorer view now shows the last modified date for files, providing better visibility into file updates.
- Column Filters on Mapping Screen: A new column filter has been added to the mapping screen, allowing for more efficient data manipulation and mapping configuration.
Important Changes Under the Hood:
- Message Queue Connectors: Test Connection Support: All message queue connectors now support testing the connection, making it easier to verify configurations before running.
- Message Queue Connectors: Streaming Support: Streaming support has been expanded to all message queue connectors (previously limited to Kafka, Azure Event Hubs, and Google Pub/Sub).
- Improved Logging Clarity: Enhancements to logging provide clearer, more informative messages, especially during error conditions.
- Improved Exception Handling: Exception messages now include more detailed information, such as the source of the error (column name, transformation, etc.), improving troubleshooting.
- Parquet Schema Generation: Automatic schema generation for Parquet files has been improved, ensuring correct schema creation when files are generated without user-defined schema.
- Linux Installer for On-Premise Multi-Node Deployment: The Linux installer now supports on-premise multi-node deployments, previously only available for AWS, and includes support for Red Hat 8.
- Excel Connector for XLSX Files: The Excel connector now supports reading XLSX files with XML namespaces, which are commonly created by .NET applications.
- Oracle Connector Enhancements: The Oracle connector now supports additional data types, including REFCURSOR and STRUCT, providing broader compatibility.
- Auto-Retry for AWS Connectors: The auto-retry functionality for AWS connectors (S3, Kinesis, SQS) now covers a broader range of exceptions, improving fault tolerance and error recovery.
- Acknowledged Messages for Kafka and Azure Event Hubs: Messages sent to Kafka and Azure Event Hubs now use acknowledgments, ensuring that message delivery is confirmed rather than relying on a fire-and-forget approach.
Important Bug Fixes:
- Scientific Notation in Field Calculations: Calculations performed using field functions will no longer generate results in scientific notation, ensuring more accurate numeric output.
- Excel Files with Large Values: Fixed the error “read data but the maximum length for this record type is 100,000,000” when reading Excel files with extremely large values in certain cells.
- NPE in Message Queue Source Query: Resolved an issue causing a NullPointerException (NPE) when no records were available to stream from message queues using Source query.
- SFTP File Deletion: Fixed a bug causing errors when deleting files from SFTP if the full path was specified in the transformation instead of the connection.
- Flow Statistics Timezone Discrepancy: Fixed a bug that caused discrepancies in flow statistics display when automatically adjusting to the browser’s timezone.
Version: 6.0.3
New functionality
In this release, we have introduced several new functionalities to enhance the capabilities of Etlworks, making it more powerful and versatile. Here are the key additions:
- AI Assistant. Etlworks now includes an AI Assistant powered by the GPT-4 model. The AI Assistant is trained on Etlworks' knowledge base and other web assets. Unlike the standard ChatGPT, it is continuously retrained on Etlworks data, ensuring accurate and reliable responses without hallucination. Learn more.
- Oracle bulk load. In this release we have added new flows optimized for loading data into Oracle using the SQL*Loader utility. Learn about ETL and bulk file load flows which leverage the fast data load capabilities of the Oracle SQL*Loader utility.
- Dynamic connections. In this update, we have added the ability to set connections for transformations and flows dynamically at runtime based on conditions. Learn more.
- Execute script on a remote host via SSH connection. We have added the ability to execute shell scripts on a remote host via an SSH connection. Learn more.
- More metrics. In this update, we have added the ability to track metrics for each executed step in a nested flow and to independently track metrics for each iteration in a loop. Previously, metrics were recorded only for ETL transformations and file-based operations, and metrics for loops were aggregated.
- Improvements for the scheduler. In this update we have added the ability to configure the schedule to not run on specific weeks of the month and days of the week. It is also now possible to automatically deactivate the schedule on a specific date.
- Auto-retry for native AWS connectors. We have added the ability to auto-retry individual requests to the AWS API across all native AWS connectors: S3, SQS, Kinesis.
- API keys for all roles. In this update we have added the ability to create API Keys for all types of roles. Previously, this feature was available only for the API-user role. The most common use case for an API key is when Etlworks is configured to use SSO, and you want to avoid frequent password changes enforced by the SSO identity provider. The API key will remain valid until it is revoked, even if the password is changed.
- Nested Error pipeline. The nested flow can now be configured to execute if there is an error anywhere within the pipeline. Previously, this feature was available only for single flows, such as sending emails or deleting files. Learn more.
- Improvements for high-watermark change replication:
  - Save HWM on success only. With high watermark change replication (HWM), there is an edge case to consider: when the HWM replication flow is part of a nested flow with many steps, the developer may want to save the HWM value only if all steps are executed successfully. In this release, we have added the ability to save the HWM value on success only, ensuring that if the entire nested flow fails, it will restart from the last known "good" position. Learn more.
  - HWM column with token. It is now possible to set the name of the HWM column in the high watermark change replication transformation using {token}, which is replaced with the value of the global variable at runtime.
- Escape character for double quotes in CSV format. Some CSV files may use a non-standard character, for example \, to escape enclosing double quotes. In this update we have added the ability to configure the escape character for the CSV format.
- Improvements for webhooks. It is now possible to configure exclusions and inclusions for webhooks. This allows developers to fine-tune the conditions under which webhook events will be triggered.
- New API. In this release we have added the new /threads API which can be used to get a list of the currently running flows.
- Improved handling of MERGE (UPSERT) for PostgreSQL 15+. In this release we re-added the ability to execute INSERT ... ON CONFLICT DO UPDATE for PostgreSQL version 15 and up. Learn more.
- Improvements for loops. It is now possible to stop the loop if it exceeds a certain time limit. Learn more.
- More configuration options for logging in CDC flows. It is now possible to set the property max.poll.log.period to control how often the CDC flow logs the number of processed records. This property controls the maximum interval between logging attempts. Previously, it was hardcoded to once every two hours.
- Improvements for parametrization. Both global and flow variables can now be used as {parameters} in connections, transformations, and scripts, including SQL scripts. Previously, global variables could only be used in connections and transformations, while flow variables were limited to scripts. See the example query after this list.
- Order of columns when creating a table. By default, when a flow creates a table and there is user-defined mapping, it keeps the original order of columns in the source and adds columns that do not exist in the source to the end, ignoring the order of columns in the defined mapping. In this update, we have added the ability to maintain the order of columns as defined in the mapping when creating new tables and executing the Test transformation. Learn more.
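As an illustration of the improved parametrization, the following hypothetical Source query references variables as {parameters}; the variable, schema, and table names are made up for the example and are resolved at runtime.

    SELECT order_id, status, updated_at
    FROM {schema_name}.orders
    WHERE updated_at > '{last_processed_timestamp}'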
UI/UX improvements
In this release, we have introduced several UI/UX improvements to enhance the user experience and make interacting with Etlworks more intuitive and efficient. Here are the key updates:
- Stop the flow from the flow statistics dashboard. In this release we have added the ability to stop the flow from the statistics dashboard.
- Improved navigation. It is now possible to navigate directly from the flow statistics page to the flow editor and vice versa.
- More information and filters. The Flow Statistics dashboard now includes the Execution ID and Started By labels and filters. This allows users to filter flow executions by who or what started them.
- Duplicate the schedule and flow. It is now possible to duplicate the schedule and flow from the editor window.
New functions for Scripting
In this release, we have introduced new functions that can be used in scripting to enhance the flexibility and functionality of your flows.
Important changes under the hood
In this release, we’ve made several significant improvements and optimizations to enhance the performance, stability, and functionality of Etlworks. Here are the key changes:
- We have upgraded the AWS SDK to the latest version, improving the performance and stability of AWS-native connectors such as S3, Kinesis, and SQS.
- We have resolved issues related to running Etlworks in a multi-node setup when specific nodes or infrastructure elements, such as the load balancer, are terminated without a graceful shutdown or in the wrong order.
- We have improved the handling of manual flow cancellation, specifically when the flow is cancelled right after it started.
- We now allow super admin users to call a listener in any tenant.
- We have improved the error handling when a dynamic flow points to an actual flow that does not exist.
Important bug fixes
In this release, we have addressed several bugs to improve the functionality and reliability of Etlworks. Here are the key fixes:
- We have fixed the bug that prevented the Redshift bulk load flow from loading files from subfolders in an S3 bucket.
- We have fixed the issue in the any-to-any ETL flow that prevented it from using multiple streaming and non-streaming transformations within the same flow.
- We have fixed issues with handling Oracle XMLTYPE and NUMBER (without precision) and Postgres JSONB data types in various scenarios.
- We have fixed an issue with the AS400 CDC connector that was causing data loss in certain scenarios when an error occurred later in the pipeline, such as during the transformation of the CDC event and loading it into the destination.
New how-to articles in documentation
In this release, we have added several new how-to articles to our documentation to help you get the most out of Etlworks. These articles provide step-by-step guides for common tasks and advanced features, ensuring you can leverage all the capabilities of Etlworks effectively. Here are the new additions:
- Collecting flow stats and parameters in real time and integrating with third-party data collection tools such as Grafana.
- Monitoring Etlworks instance health.
- Variables in Etlworks.
- How to programmatically stop the flow without raising an exception.
Version: 5.9.0
New connectors
In this update, we added the following new connectors:
- Google Pub/Sub connector. It supports ETL, streaming, and Change Data Capture use cases. Read more.
- Microsoft Dataverse connector. Read about Dataverse.
New functionality
Integration Agents. Integration Agents now feature automatic updates and the ability to update to a specific version. Learn more.
Beginning with version 5.9.0-SNAPSHOT, the Agent's version is now aligned with that of the application. The automated Agent update process ensures compatibility between the Agent and the main application.
When working with Integration Agents, it is now possible to set Agent in Context in Connection, Flow Builder, and Explorer. When the Agent in Context is set, you can seamlessly work with on-premise data, including data on your local laptop, as if it were in the cloud:
- Test connections configured to work with on-premise data.
- Upload private and public keys for the on-premise connections.
- Test transformations and perform interactive mapping for the source and destination connections configured to work with on-premise data.
- View data and metadata, execute SQL statements, upload and download files for the on-premise connections.
Native Postgres MERGE. Our Postgres connector now supports native PostgreSQL MERGE for PostgreSQL 15 and newer. Learn more about MERGE.
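For reference, a minimal PostgreSQL 15 MERGE statement of the kind this feature relies on might look like the sketch below; the table and column names are illustrative only.

    MERGE INTO public.customers AS t
    USING staging.customers AS s
    ON t.customer_id = s.customer_id
    WHEN MATCHED THEN
      UPDATE SET name = s.name, updated_at = s.updated_at
    WHEN NOT MATCHED THEN
      INSERT (customer_id, name, updated_at)
      VALUES (s.customer_id, s.name, s.updated_at);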
CDC. We added a new inline CDC function cdc_boolean_soft_delete(true). Unlike the same function without the parameter, which returns true or null, this function returns true or false for the CDC delete events.
Formats. We added Function to calculate Filename for all supported Formats. This JavaScript function can be used to change the name of the file to read and/or the name of the file created by the ETL transformation.
CSV and Fixed-length formats now support Starting Index, used as a suffix for files generated when splitting the file into chunks with a fixed number of records. The previous behavior was to always start with the suffix _0.
Important changes under the hood
CDC. We have updated the binlog reader used by MySQL CDC connector from 0.28.2 to the latest version 0.29.2. Change log.
Parametrization. It is now possible to {reference} global variables in SQL queries and scripts. Previously, it was only possible with flow variables.
Important bug fixes
Connectors. We fixed an issue in which the database connectors were returning values for columns with very large decimal numbers in scientific notation.
The SFTP connector now supports URLs that start with the sftp. prefix but do not include sftp://, for example sftp.test.com.
ETL. IfExists SQL action is now case insensitive: the names of the Lookup Fields can be in any CaSe.
Loops. We fixed an issue where the flow executed in a loop was failing to start if it contained nested mapping with calculated fields.
Version: 5.6.1
New functionality
BULK MERGE. In this release we added new SQL Actions: BULK MERGE and BULK DELETE/INSERT. BULK actions significantly improve the performance of MERGE (UPSERT) in the flows which load data into relational databases. Read more.
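As a rough conceptual illustration (not the exact SQL Etlworks generates), a DELETE/INSERT-style merge assumes a staging table holding the incoming batch and replaces the matching rows in the target:

    -- delete rows that are about to be replaced, then insert the new versions
    DELETE FROM target_table
    WHERE id IN (SELECT id FROM staging_table);
    INSERT INTO target_table (id, name, updated_at)
    SELECT id, name, updated_at FROM staging_table;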
Important changes under the hood
User request: Zip/Unzip with subfolders. Zip and Unzip flows now support folders in FROM and TO. It is now possible to reuse the same source and destination connection and set folders in the transformation.
Extra logging. We added extra logging for the case when a user manually stops the flow. The following line is added to TOMCAT_HOME/logs/etlprocess.log: User requested to stop the flow with audit id %s. In the past we only logged the failed requests to stop the flow. Note that we always record the stopFlow action in the Audit Trail so it can be used to trace who requested to stop the running flow and when.
Important bug fixes
We fixed a bug introduced by the support for duplicated column names added in the previous release (5.5.29). Behavior before the fix: if the Source query contains intentionally duplicated column names, for example SELECT ID, '1' NAME, '2' NAME, '3' NAME FROM TABLE, the flow returns duplicated column names: ID, NAME, NAME, NAME. Behavior after the fix (restored to pre-5.5.29): the flow returns column names with an incremented counter: ID, NAME, NAME1, NAME2.
Version: 5.5.29
New functionality
Inline unzip. In this release we added the ability to directly read files in .gz and .zip archives when performing an ETL transformation where the file is a source. There is no need to extract the file from the archive before processing it. Read more.
Support for duplicated column names. When the destination is not a database, the mapping editor now allows the creation of a destination document (for example an Excel file) which has multiple columns with the same name. Read more.
For Each Row transformation in streaming flows. CDC to Database and Message queue to Database flows now support a scripting transformation, which is executed when the row is added to the destination table. The typical use cases are:
- Trigger: execute an SQL statement on another table or tables when a row is added/updated/deleted in a destination table (see the sketch after this list).
- Per-row logging.
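As an example of the trigger use case above, the scripting transformation could execute an audit-style SQL statement similar to the hypothetical one below; the table, columns, and the {ORDER_ID} placeholder are illustrative only, and the exact way column values are referenced is described in the documentation.

    INSERT INTO audit.order_changes (order_id, changed_at)
    VALUES ({ORDER_ID}, CURRENT_TIMESTAMP);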
CDC to Redshift flow doesn't require configuring S3 storage in the CDC connector. The CDC to Redshift flow now uses the settings of the S3 stage connection and no longer requires configuring S3 storage in the CDC connector. Previously it was required to configure it in two places: the Stage connection and the CDC connection.
Important changes under the hood
AS400 CDC connector limits retries to errors.max.retries. When streaming CDC changes, the AS400 CDC connector now limits auto-retries to the value of the property errors.max.retries. The default value is -1, which means no auto-retries. The previous behavior was to auto-retry forever in case of certain errors.
Important bug fixes
Fixed bug in single thread bulk load. In this release, we fixed a bug causing the bulk load flow to fail if the flow is configured to load data files in a single thread.
Version: 5.5.18
New functionality
We now automatically set the HTTP method (GET/POST/PUT/DELETE) when a user selects the Etlworks API Endpoint path from the dropdown. Read more.
It is now possible to override the default Change Tracking (CT) SQL generated by the flow, which supports CT data replication. Read more.
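For context, the default CT SQL is typically built around SQL Server's CHANGETABLE function, so an override might look roughly like the hypothetical query below; the table, key column, and the {last_sync_version} placeholder are illustrative only.

    SELECT ct.SYS_CHANGE_VERSION, ct.SYS_CHANGE_OPERATION, o.*
    FROM CHANGETABLE(CHANGES dbo.Orders, {last_sync_version}) AS ct
    LEFT JOIN dbo.Orders o ON o.OrderID = ct.OrderID;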
Execute SQL flow now supports opening a new database connection on each run instead of reusing the existing connection. It is specifically useful if you want to execute the SQL flow in a loop that updates database connection parameters (for example, database name) on each iteration. Read more.
CDC connectors now allow configuring the level of logging when the connection is set to capture transaction markers. Read more.
We added a new publicly available Etlworks API endpoint: Run flow by ID. It supports the same set of parameters as the previously available API endpoint Run flow by name. Unlike Run flow by name, Run flow by ID uses the flow ID, which never changes after the flow is created.
UX/UI improvements
Data grids in Explorer now display the row number.
Important changes under the hood
Our generic EDI format is now free for all users. This connector previously required an annual subscription. The connector supports the following EDI dialects: X12, EDIFACT, NCPDP, VDA, and HL7 2.x.
We extended the HTTP timeout for all connectors built on top of MS Graph API (for example, connectors for Sharepoint and OneDrive for Business) from the default 100 seconds to one hour. It allows executing long-running requests to the API without timing out, for example, when the flow creates a file with billions of rows.
We fixed an issue in environments with multiple symmetrical nodes where a disabled schedule remained active for 60 seconds.
Version: 5.5.6
New functionality
User request. In this release, we added nested mapping. It allows developers to create complex nested documents in all supported data exchange formats using a drag-and-drop mapping editor without writing a single line of code in scripting languages or SQL. Read about nested mapping.
User request. We have improved the Account Dashboard by adding the ability to see the aggregated stats and flow executions across multiple selected (or all) tenants. Read more. We also added an ability to filter flow executions by status and name.
Important changes under the hood
We have improved the recovery process in multi-node environments when the load balancer goes down or raises a timeout exception for long-running synchronous HTTP requests which trigger the flow execution.
In SQL Server to BigQuery ETL flows, we now automatically map SQL Server TIMESTAMP columns to INTEGER columns in BigQuery.
Important bug fixes
We have fixed a bug in the Postgres CDC connector introduced in the previous release (5.4.14). It was causing the connector to create the replication slot with the name calculated by appending the names of the monitored tables.
We have fixed a bug in the streaming flows (CDC to database and Queue to the database), which was causing the flow to execute the code for calculated fields twice when the flow is configured to MERGE data into the destination table.
Version: 5.4.14
New functionality
In this release, we have added the ability to patch flows. Flow patching allows you to copy changes from one flow to others across tenants. The primary use case for Patch functionality is when you want multiple copies of the same flow across different tenants and want to propagate changes made in any of them to other instances to keep them in sync. Read more about patching.
We have added an ability to change the order of files in the file loop. Read more about sorting order modifiers.
Connectors
We have added a new HTTP connection type for calling built-in and user-defined Etlworks APIs from the Etlworks Integrator. Read about Etlworks API connector.
We have added support for SSH-2 encryption keys and new key exchange (kex) algorithms to SFTP and SSH connectors. The connectors still support SSH-1 encryption keys and deprecated key algorithms.
The Excel XLSX connector now supports setting the starting column (in addition to the previously available starting row) when creating/updating Excel worksheets.
Noticeable new features of the CDC engine
Oracle CDC connector now supports the embedded Infinispan cache for the Log Mining buffer. Enabling this feature significantly reduces the memory footprint of the connector when processing long-running and large transactions. Read more.
It is now possible to preserve the structure of CDC events, including before/after/source and other elements when serializing CDC events as JSON files. Read more.
We have added data conversion functions to the CDC function processor. Read more.
UX/UI improvements
The code editor font family switched to monospace. The code editor now includes a font size selector.
Important changes under the hood
Improved Parquet schema auto-generation when creating Parquet files.
Added support for subfolders in file operations and source-to-destination transformations for SFTP, S3, Azure Storage, and Google Cloud Storage connectors: the TO part of the transformation can now include subfolders. Previously, subfolders would need to be defined at the connection level.
Reaching the Maximum number of iterations when executing the loop now stops the loop instead of throwing an exception.
Important bug fixes
Fixed an exception when processing a result set generated by executing an SQL query that contains multiple columns with the same name.
Fixed converting true/false values to 1/0 by CDC connector. Previously (if the option to convert is enabled), all columns with true/false values were converted to 1/0. The new behavior is to convert only columns with a boolean or bit data type.
Version: 5.3.2
UX/UI improvements
We have updated the UX/UI to make the experience more consistent across the board and to highlight the important UI elements, such as icons and navigational buttons. For example, all grids now have sticky headers. The scope of changes is relatively large, so we bumped the minor version of the release to 5.3.2. Note: Etlworks uses semantic versioning.
All CDC flows now support Select All/Deselect All when selecting source tables for CDC.
Connectors
We have updated the SQL Server JDBC driver to 12.4, which is the latest version available at the time of the release.
IBMI (AS400) CDC connector has been graduated from beta to stable release.
We have updated the SAP ERP connector.
New functionality
We have added a new configuration option, Drop staging tables in first loop iteration, to the flow that normalizes nested datasets as flat staging tables. Read more.
Important bug fixes
We fixed the race condition bug in MySQL CDC connector, which was causing issues when triggering ad-hoc snapshots using signaling collection and adding new tables.
This update is required for all self-hosted customers who use MySQL CDC connector.
Important changes under the hood
We have removed an option to configure legacy implementation for MySQL CDC connector. We kept it for a while to maintain backward compatibility with older Etlworks CDC implementations. We finally dropped it in this release.
Version: 5.2.5
Built-in change data capture (CDC) engine was upgraded to the latest Debezium 2.4. All improvements and bug fixes implemented since Debezium 2.2 final and Debezium 2.3 final were ported to Etlworks CDC engine.
Noticeable new features of the CDC engine:
- Parallel snapshots. Enabling parallel snapshots can improve the performance of the initial load and re-load by a factor of 10x.
- Ad-hoc snapshots switched to a blocking mechanism, which significantly improves performance.
- MongoDB connector now uses cluster URL instead of the list of individual replica sets or shards.
- MongoDB connector now supports ad-hoc snapshots.
- User request. Added CDC connector for AS400. Etlworks AS400 CDC connector is based on the community-driven open source project Debezium connector for IBM i. It uses the IBM i journal as a source of CDC events. The connector is currently in beta.
We added a configurable Force execution by the scheduler option. Read more. The previous behavior was to always force execution, which is still the case for schedules created prior to the update. All schedules created after the update disable the flag by default.
Added file path modifiers for the file loop. Read more.
All CDC pipelines where the destination is a cloud data warehouse (such as Snowflake, Redshift) now support a Stage connection. The Stage connection is used to set the location of the files created by the connector and loaded by the pipeline into the cloud data warehouse. Previously it was required to configure the storage and location in the CDC connector. Supported Stage locations:
Version: 5.1.1
This is a major release. All self-hosted customers are encouraged to update.
Improved upgrade process (Linux installer only)
In this update we introduced the ability to upgrade to the latest (default) or a selected version. Read more.
We also added a new CLI command which allows users to check which version is currently installed. Read more.
Important changes under the hood
We have upgraded multiple internal Java libraries to the versions which include important security fixes.
We have added the ability to enforce SSL encryption for Redis. It is specifically important when configuring Etlworks to run in a multi-node AWS environment with AWS ElastiCache with in-transit encryption (TLS). Read more.
We have added an ability to decrypt signed PGP messages.
It is now possible to use importPackage without a performance penalty. We deprecated importPackage in the previous release but have un-deprecated it in 5.1.1. It is now implemented as an inline JavaScript function instead of as part of a larger dynamically loaded package (which was causing slowdowns).
New functionality
Excel XLSX connector now supports updating existing spreadsheets. Read more.
We have added OAuth authentication for Dropbox connector. Read more.
We have added an ability to use JavaScript to programmatically set variables for Before/After SQL in bulk load flows. Read more (the link is for Snowflake bulk load flow but it works identically for all other destinations).
We have added Dimension and Metrics filters for Google Analytics 4 connector. Read more.
We have added an ability to backup CDC history and offset files. Read more.
We have added OAuth authentication for OData connector. Read more.
We have added an ability to troubleshoot sending and receiving emails using inbound and outbound email connections. Read more.
Built-in email sender which is used for sending notifications now supports authentication with Microsoft (Office 365) and Google (Gmail). Read more.
We have added an ability to programmatically handle exceptions in nested flows. Read more.
Source-to-destination transformation where the destination is a database now supports the MERGE on Error exception handler. It allows scenarios where the user wants to update the record if the insert failed.
Important bug fixes
We have fixed a bug which was causing duplicates when using ETL with bulk load flows configured to insert data into a table which does not exist.
We have fixed a bug which was preventing the system from recording flow and file metrics when executing highly nested flows.
We have fixed a bug which was causing the Snowflake bulk load flow not to alter the table under specific edge conditions.
We have fixed a bug in the inbound and outbound email connectors which was stripping the character "=" from the password.
We have fixed path calculation in the SMB Share connector.
We have fixed an error caused by using a string when updating a Postgres UUID field.
In this update we introduced major changes to embedded scripting engines (JavaScript and Python). Read more.
New functionality
MongoDB connectors now support SQL for extracting data. Read more.
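As a simple illustration, a SQL-style query against a MongoDB collection could look like the hypothetical statement below; the collection and field names are made up, and the supported SQL dialect is described in the documentation.

    SELECT customer_id, status, total
    FROM orders
    WHERE status = 'shipped' AND total > 100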
It is now possible to configure email notifications when creating or modifying a schedule for the flow executed by Integration Agent. The email notifications also trigger the webhooks, if configured. The Agent must be updated to support this feature.
Connectors
In this update we added HTTP connectors for Google and Microsoft services which require or support an interactive OAuth2 authentication. Read more.
We added two new premium connectors:
- Google Ads
- Trello
Improvements under the hood
We modified the logic of the Snowflake bulk load flow when the Direct Load is enabled and Use COPY INTO with a wildcard pattern is disabled. The change is designed to better handle the situation when the source and the destination schemas are different.
Connectors
We have added free inbound (IMAP) and outbound (SMTP) email connectors for Office 365 (Exchange Online) and Gmail. All connectors use OAuth2 for authentication.
Improvements
We have improved the global search based on feedback from users. It now uses an algorithm that produces more accurate results based on a sequence of words in a search string. We also now display the relevancy score in the search results.
MySQL, SQL Server, Oracle and DB2 CDC connectors now automatically enable the property schema.history.internal.store.only.captured.tables.ddl. It reduces the startup time when the CDC connector needs to read a schema or database with a large number of tables (thousands). We were previously enabling the property database.history.store.only.monitored.tables.ddl, which has been removed in Debezium 2.0.
New functionality
All flows optimized for Amazon Redshift now support native MERGE, which is currently in preview.
Improvements under the hood
Flows optimized for Snowflake and BigQuery now use native MERGE when Action is set to MERGE or CDC MERGE. Before this update, How to Merge was set to DELETE/INSERT by default.
Flows optimized for Snowflake no longer query Snowflake to look up the primary key in the Snowflake table when Predict Lookup Fields is enabled. It should improve the performance of flows which execute MERGE when the source table doesn't have a primary key or a unique index.
MongoDB extract flow now supports extracting by a wildcard.
Changes
The MongoDB connector has been renamed to MongoDB document. The MongoDB streaming connector has been renamed to MongoDB. Read more.
New functionality
We have significantly improved support for SQL Server Change Tracking. It is fully automated and no longer requires manually configuring SQL to capture changes. Read more.
The new configuration option String used to convert to SQL NULL has been added to all Redshift flows. It allows configuring a string that is used for SQL NULL values. Setting this option is highly recommended for flows that stream CDC events into Redshift.
UX improvements
User request. We have added a global search. Read more.
Searchable attributes
- Object title
- Description
- Tags
- All user-editable fields
- Code
- Macros
Objects included in the Search results
Improvements under the hood
We improved the handling of the global variables, referenced as {tokens} in connections and transformations. Specifically, it is now possible to use multiple global vars in the same attribute. Example: {folder}/{filename}.
Security patch
We have fixed two security vulnerabilities found in third-party libraries used by Etlworks. Read more.
Platforms
Etlworks now runs in Docker. Here is a link to the Etlworks image in Docker Hub. Our Docker image natively supports Intel and ARM64 architectures, meaning Etlworks now officially runs on Macs with M1 and M2 processors and on Windows and Linux computers with ARM-based processors.
Our Windows installers for Etlworks and Integration Agent are now signed by the new software signing certificate, which should prevent the "Unverified vendor" warning when installing or updating the software.
Connectors
We have added a new read/write EDI connector that supports the majority of the EDI dialects, including X12, EDIFACT, NCPDP, HL7, and VDA. Read more.
Our Excel XLS and XLSX connectors now support reading the worksheet names. It works in flows and Explorer. We also added an ability to read Excel worksheets that don't have a dedicated row for column names.
Important bug fixes and improvements under the hood
We have fixed a conversion issue when CDC connectors read zoned timestamps serialized as yyyy-MM-dd'T'HH:mm:ss.SS'Z'.
We have fixed an issue with multipart upload into S3 when the software runs on Apple silicon.
We have fixed an issue with the "Database to Snowflake" flow type, preventing loading data from the S3 stage with subfolders.
We have fixed an issue preventing a DropBox connector from deleting files in folders other than the root folder.
We have improved error handling by CDC connectors.
February 28, 2023
Connectors
In this release, we have added two new connectors:
We also added the ability to connect to the sandbox account when using Salesforce with the OAuth2 connector.
Streaming from message queues
In this release, we have added the ability to stream real-time data from Kafka and Azure Event Hubs to practically any destination. Before this update, Etlworks only supported extracting data from queues in micro-batches.
New functionality
We significantly improved the streaming of the CDC events from the message queues to any supported destination:
- Streaming CDC events that were ingested by Etlworks CDC connector.
- Streaming CDC events that were ingested by standalone Debezium.
We have added preprocessors to Kafka and Azure Event Hubs connectors:
- Consumer preprocessor - use this preprocessor to change the message streamed from the queue.
- Producer preprocessor - use this preprocessor to modify the message added to the topic.
We have added the Postprocessor to the HTTP connector. The Postprocessor can be used to change the response content programmatically.
We have added a new configuration option to the CSV format, which allows reading CSV files with non-standard BOM characters.
We have added an option to automatically add an Excel worksheet name to the file name in Explorer and Mapping. It simplifies working with Excel files which include multiple worksheets. Read more.
We have added an option that allows the Excel connector to read data from worksheets that don't have a formal "columns" row.
We have added the ability to start the flow as a Daemon. Read more.
Important bug fixes and improvements under the hood
We improved the performance of the S3 SDK connector when reading the list of files available in the bucket.
We have fixed the edge case when the CDC Binlog reader was not disconnecting on error.
We have fixed the issue with using bind variables for columns with UUID data type.
Self-managed on-premises installers
Etlworks is a cloud-native application that works perfectly well when installed on-premises. Prior to this update, you would need to contact Etlworks support in order to receive a link to download an installer and a unique license generated for your organization.
In this update, we introduced a self-managed web flow that allows you to create a company account and download a fully automated installer for Linux and/or Windows. The installer includes a unique license generated for your organization. You can use the same installer to upgrade Etlworks Integrator to the latest version.
Supported operating systems are Amazon Linux 2, Ubuntu 18.04, Ubuntu 20.04, CentOS 7, Red Hat 7, Red Hat 9, Windows Server (2012-2022), and all editions of Windows 10 and Windows 11.
Windows installer
It was always possible to run Etlworks on Windows. Still, unlike running Etlworks on Linux, it required manual installation of all components needed to run Etlworks, such as Java, Tomcat, Postgres, and Redis. In this release, we have added official support for all modern server and desktop versions of Windows.
- Install Etlworks Integrator on Windows.
- Automatically update Etlworks Integrator installed on Windows.
Connectors
In this release, we have added eleven new connectors:
- Clickhouse.
- Excel as database (premium).
- Microsoft Dynamics 365 (premium). This connector supports the following Dynamics editions: Sales, Customer Service, Field Service, Fin Ops, Human Resources, Marketing, Project Operations.
- Oracle Cloud CSM (premium).
- Oracle Cloud ERP (premium).
- Oracle Cloud HCM (premium).
- Oracle Cloud Sales (premium).
- Monday (premium).
- JDBC-ODBC bridge (premium). This connector allows you to access ODBC data sources from Etlworks.
- FHIR as database (premium).
- GraphQL (premium).
We have also updated the following existing connectors:
- Upgraded Google BigQuery JDBC driver to the latest available from Google.
- Added the ability to log in with an Azure guest user to the SQL Server connector. Read more.
- Added API Token and Basic Authentication to premium Jira and Jira Service Desk connectors.
Upgraded CDC engine
We have upgraded our CDC engine (Debezium) from 1.9 to the latest 2.1.
New functionality
- User request. It is now possible to add named connections to all source-to-destination flows. Read more.
- It is now possible to override the command which executes the Greenplum gpload utility. Read more.
- It is now possible to connect to a read-only Oracle database when streaming data using CDC. Read more.
- We have added more configuration options for CSV and JSON files created by CDC flows. Read more.
- We have improved logging for CDC connectors when capturing transaction markers is enabled. Read more.
- We have improved logging for loops by adding begin/end markers.
Important bug fixes
- SSO JWT expiration is now the same as regular JWT expiration (which is configurable by end-users). Before this fix, customers with enabled SSO were experiencing frequent logouts under certain conditions.
- We fixed an issue with the FTPS connector, which was unable to connect if the FTPS server was running behind a load balancer or proxy.
- We fixed an edge case when AWS credentials were exposed in the flow log when the Snowflake flow failed to create the Snowflake stage automatically.
UX improvements
It is now possible to quickly create Connections, Listeners, Formats, Flows, Schedules, Agents, Users, Tenants, and Webhooks from anywhere within the Etlworks UI without switching to a different window. Read more.
New functionality
We significantly improved support for PGP encryption:
- It is now possible to generate a pair of PGP keys using a designated flow type. Read more.
- All Etlworks file storage connectors now support automatic decryption of the encrypted files during ETL operations. Read more.
We improved the mapping when working with nested datasets. It now supports the case when the source is a nested document, but you only need data from the specific dimension. Read more.
Connectors
We added OAuth authentication (Sign in with Microsoft) to our Sharepoint storage and OneDrive for Business connectors.
We added a Stripe premium connector.
We upgraded the built-in SQLite database from version 3.34.0.0 to the latest version 3.40.0.0. SQLite is used as a temporary staging db. Read more about SQLite releases.
Documentation
We completely rewrote a section of the documentation related to working with nested datasets.
UX improvements
It is now possible to create Connections, Formats, and Listeners right in the Flow editor without switching to the Connections window. Read more.
New functionality
Etlworks now supports Vertica as a first-class destination and as a source. Read more.
We have added point-to-point Change Data Capture (CDC) flows for multiple destinations. After this update, you can create a CDC pipeline using a single flow instead of separate extract and load flows.
- Change Data Capture (CDC) data into Snowflake.
- Change Data Capture (CDC) data into Amazon Redshift.
- Change Data Capture (CDC) data into BigQuery.
- Change Data Capture (CDC) data into Synapse Analytics.
- Change Data Capture (CDC) data into Vertica.
- Change Data Capture (CDC) data into Greenplum.
- Change Data Capture (CDC) data into any relational databases.
- Change Data Capture (CDC) data into any relational databases using bulk load.
We have added bulk load flows for several analytical databases:
- Bulk load files in Google Cloud Storage into BigQuery
- Bulk load files into Vertica
- Bulk load files in server storage into Greenplum
All flows optimized for Snowflake now support the automatic creation of the internal stage and external stage on AWS S3 and Azure Blob. Read more.
Changes under the hood
We improved the reliability of the message queue in the multi-node environment.
This is a required update.
We have fixed the memory leak in the Amazon S3 SDK connector. We also fixed a similar memory leak in AWS-specific connectors (Kinesis, SQS, RabbitMQ, ActiveMQ) which use IAM role authentication. All instances managed by Etlworks have been upgraded. Self-hosted customers are highly advised to upgrade as soon as possible. Customers who have Integration Agents are encouraged to update the agents as well.
UX improvements
It is now possible to manage the Etlworks billing account and subscriptions from the Etlworks app. Read more.
It is now possible to access this changelog from the Etlworks app. Read more.
It is now possible to search for information in the Documentation and submit support requests from the Etlworks app. Read more.
It is now possible to resize all split panels (such as in Explorer, Connections, etc.) further to the right. It allows users to see long(er) filenames, connection names, and other objects.
We have added links to the running flows in the Suspend flow executions window.
We now display a warning message when a user is trying to create a non-optimized flow when the destination is Snowflake, Amazon Redshift, Synapse Analytics, Google BigQuery, or Greenplum. The warning message includes a link to the relevant article in the documentation.
New functionality
We have added a new flow type that can be used to create dynamic workflows which change based on user-provided parameters. Read more.
It is now possible to enter secure parameters (passwords, auth tokens, etc.) when adding parameters for running flows manually, by the scheduler, and by Integration Agent.
It is now possible to split CSV files using a user-defined delimiter or regular expression. Read more.
We have added the following new configuration options for the SMB share connector: SMB Dialect, DSF Namespace, Multi Protocol Negotiation, and Signing Required. Read more.
CDC flows now never stop automatically unless they are stopped manually or fail. Read more. Note that the behavior of the previously created CDC flows did not change.
We have improved the algorithm for creating the transaction markers by CDC flows. They now use the actual start/commit/rollback events emitted by the source database. Previously we were using the change of the transaction id as a trigger. It was creating a situation where the flow was waiting for a new transaction to start before creating an "end of previous transaction" event.
It is now possible to use flow variables as {parameters} in transformations and connections. Previously only global variables could be used to parameterize transformations and connections.
It is now possible to change the automatically generated wildcard pattern when bulk-loading files by a wildcard into Snowflake and Synapse Analytics.
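For orientation, in Snowflake the wildcard pattern typically ends up in a COPY INTO statement similar to the hypothetical sketch below; the stage, table, and pattern are illustrative only.

    COPY INTO analytics.orders
    FROM @etl_stage/orders/
    PATTERN = '.*orders_.*[.]csv'
    FILE_FORMAT = (TYPE = 'CSV' SKIP_HEADER = 1);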
User request. A new option has been added to the Synapse Analytics bulk load flow which allows creating a new database connection for loading data into each Synapse table. Read more.
We have updated the HubSpot connector which now supports new authorization scopes introduced by HubSpot in August.
Various bug fixes and performance improvements under the hood.
Documentation
We have added new tutorials for creating CDC pipelines for loading data into Snowflake, Amazon Redshift, Azure Synapse Analytics, Google BigQuery and Greenplum. Read more.
Our redesigned main website (https://etlworks.com) went live.
MySQL CDC connector now supports reading data from the compressed binlog. Read more.
It is now possible to disable flashback queries when configuring the Oracle CDC connection. This could greatly improve the performance of the snapshot in some environments. Read more.
CDC connectors can now be configured to capture NOT NULL constraints. Read more.
Legacy S3 and Azure Storage connectors have been deprecated. The existing legacy connections will continue to work indefinitely but new connections can only be created using S3 SDK and Azure Storage SDK connectors.
Bulk load flows are now ignoring the empty data files.
Bulk load flows which are loading data from Azure Storage now support traversing all subfolders under the root folder.
The flows that extract data from nested datasets and create staging tables or files can now be configured not to create tables for dimensions converted to strings. Read more.
User request. It is now possible to add record headers when configuring Kafka and Azure Events Hubs connections. Record headers are key-value pairs that give you the ability to add some metadata about the record, without adding any extra information to the record itself.
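For context, record headers are a standard Kafka feature. The sketch below shows what a header looks like on a plain Kafka producer record; the broker address, topic, and header name are hypothetical, and in Etlworks the headers are defined on the connection rather than in code.

```java
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

public class RecordHeadersSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // hypothetical broker
        props.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            ProducerRecord<String, String> record =
                    new ProducerRecord<>("orders", "order-1", "{\"amount\": 42}");
            // Key-value metadata attached to the record without changing its payload.
            record.headers().add("source-system",
                    "etlworks".getBytes(StandardCharsets.UTF_8));
            producer.send(record);
        }
    }
}
```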
The BigQuery connector now maps the ARRAY data type in the source database (for example Postgres) to STRING in BigQuery.
Fixed a bug that was causing a recoverable NullPointerException (NPE) when saving flow execution metrics.
Various bug fixes and performance improvements under the hood.
Single Sign On (SSO) is now available to all Etlworks Enterprise and On-Premise customers. Read more.
We have added a bulk load flow for loading CSV and Parquet files in Azure Storage into Azure Synapse Analytics. It provides the most efficient way of loading files into Synapse Analytics. Read more.
We have optimized loading data from MongoDB into relational databases and data warehouses such as Snowflake, Amazon Redshift, and Azure Synapse Analytics. It is now possible to preserve the nested nodes in the documents stored in MongoDB in the stringified JSON format. Read more.
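Conceptually, preserving a nested node means serializing the sub-document to a JSON string and loading it into a single column (for example, a Snowflake VARIANT). A minimal sketch using the MongoDB Java driver is shown below; the database, collection, and field names are hypothetical, and Etlworks performs this step automatically when the option is enabled.

```java
import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import org.bson.Document;

public class StringifyNestedNode {
    public static void main(String[] args) {
        try (MongoClient client = MongoClients.create("mongodb://localhost:27017")) {
            MongoCollection<Document> orders =
                    client.getDatabase("shop").getCollection("orders");
            for (Document doc : orders.find()) {
                // A nested node such as "shippingAddress" is kept as a JSON string
                // instead of being flattened into separate columns.
                Document address = doc.get("shippingAddress", Document.class);
                String addressJson = address == null ? null : address.toJson();
                System.out.println(doc.get("_id") + " -> " + addressJson);
            }
        }
    }
}
```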
The Flow Bulk load files into Snowflake now supports loading data in JSON, Parquet, and Avro files directly into the Variant column in Snowflake. Read more.
The Override CREATE TABLE using JavaScript now supports ALTER TABLE as well. Read more.
It is now possible to connect to Snowflake using External OAuth with Azure Active Directory. Read more.
The Azure Events Hubs connector now supports compression. Read more.
The Flows Executions Dashboard now displays the aggregated number of records processed by a specific flow on the selected day. It can be useful when monitoring the number of records processed by a CDC pipeline, which typically includes 2 independent flows (each with its own record tracking capabilities): extract and load.
We have improved the Flow which creates staging tables or flat files for each dimension of the nested dataset. It is now possible to alter the staging tables on the fly to compensate for the variable number of columns in the source. We have also added the ability to add a column to each staging table/file that contains the parent node name. Read more.
It is now possible to authenticate with SAS token and Client Secret when connecting to Azure Storage using the new Azure Storage SDK connector. Note that the legacy Azure Storage connector also supports authentication with SAS token but does not support Client Secret.
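For reference, both options map to standard Azure SDK authentication mechanisms. The sketch below shows what they look like with the plain Azure Storage SDK for Java; all account names, tokens, and credentials are hypothetical placeholders, and in Etlworks they are simply entered on the connection screen.

```java
import com.azure.identity.ClientSecretCredentialBuilder;
import com.azure.storage.blob.BlobServiceClient;
import com.azure.storage.blob.BlobServiceClientBuilder;

public class AzureStorageAuthSketch {
    public static void main(String[] args) {
        // Option 1: SAS token (hypothetical account and token).
        BlobServiceClient sasClient = new BlobServiceClientBuilder()
                .endpoint("https://myaccount.blob.core.windows.net")
                .sasToken("sv=2021-08-06&ss=b&srt=sco&sig=placeholder")
                .buildClient();

        // Option 2: Client Secret (Azure AD service principal).
        BlobServiceClient spClient = new BlobServiceClientBuilder()
                .endpoint("https://myaccount.blob.core.windows.net")
                .credential(new ClientSecretCredentialBuilder()
                        .tenantId("tenant-id")
                        .clientId("client-id")
                        .clientSecret("client-secret")
                        .build())
                .buildClient();

        System.out.println(sasClient.getAccountName() + " / " + spClient.getAccountName());
    }
}
```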
We have updated the Sybase JDBC driver to the latest version.
It is now possible to use global variables when configuring parameters for split file flows.
We have fixed soft deletes with CDC. This functionality was broken in one of the previous builds.
User request. It is now possible to override the default Create Table SQL generated by the flow.
User request. The Flow Executions dashboard under the Account dashboard now includes stats for flows executed by the Integration Agent.
User request. It is now possible to use global and flow variables in the native SQL used to calculate the field's value in the mapping.
It is now possible to filter flows associated with the Agent by name, description, and tags.
It is now possible to configure and send email notifications from the Etlworks instance for flows executed by the Agent.
It is now possible to bulk load CSV files into Snowflake from server (local) storage. Previously, it was only possible to bulk load files into Snowflake from S3, Azure Blob, or Google Cloud storage. The flow Load files in cloud storage into Snowflake was renamed to Bulk load files into Snowflake. Note that it was always possible to ETL files into Snowflake from server (local) storage.
The flow Bulk load CSV files into Snowflake now supports loading files by a wildcard pattern in COPY INTO and can handle explicit CDC updates when the CDC stream includes only updated columns.
MySQL CDC connector now supports the useCursorFetch property. When this property is enabled, the connector uses a cursor-based result set when performing the initial snapshot. The property is disabled by default.
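For reference, useCursorFetch is a standard MySQL Connector/J URL property. The sketch below shows what enabling it means at the plain JDBC level; the host, credentials, table, and fetch size are hypothetical, and in Etlworks the property is simply enabled on the CDC connection.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class CursorFetchSketch {
    public static void main(String[] args) throws Exception {
        // useCursorFetch=true makes the driver stream rows with a server-side cursor
        // instead of materializing the whole result set in memory.
        String url = "jdbc:mysql://localhost:3306/sales?useCursorFetch=true";
        try (Connection conn = DriverManager.getConnection(url, "user", "password");
             Statement stmt = conn.createStatement()) {
            stmt.setFetchSize(1_000); // rows fetched per round trip (hypothetical)
            try (ResultSet rs = stmt.executeQuery("SELECT * FROM orders")) {
                while (rs.next()) {
                    // process each row of the initial snapshot
                }
            }
        }
    }
}
```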
All CDC connectors now test the destination cloud storage connection before attempting to stream the data. If the connection is not properly configured, the CDC flow stops with an error.
Debezium has been upgraded to the latest 1.9 release.
It is now possible to add a description and flow variables to the flows scheduled to run by the Integration Agent. Read about parameterization of the flows executed by Integration Agent.
We have added a new premium Box API connector.
Snowflake, DB2, and AS400 JDBC drivers have been updated to the latest and greatest.
We introduced two major improvements for Change Data Capture (CDC) flows. The previously available mechanism for ad-hoc snapshots, which uses a read/write signal table in the monitored schema, has been completely rewritten:
- It is now possible to add new tables to monitor and snapshot by simply modifying the list of the included tables. Read more.
- It is now possible to trigger the ad-hoc snapshot at runtime using a table in any database (including a completely different database than a database monitored by CDC flow) or a file in any of the supported file storage systems: local, remote, and cloud.
Webhooks now support custom payload templates. The templates can be used to configure integration with many third-party systems, for example, Slack.
We added a ready-to-use integration with Slack. It is now possible to send notifications about various Etlworks events, such as flow executed, flow failed, etc., directly to a Slack channel.
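As an illustration, a Slack incoming webhook accepts a small JSON payload. A hypothetical payload template such as {"text": "Flow {flow_name} finished with status {status}"} could render to the request below; the webhook URL and placeholder names are made up, and the actual template syntax is described in the linked documentation.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class SlackWebhookSketch {
    public static void main(String[] args) throws Exception {
        // Hypothetical rendered payload: the template placeholders have already
        // been substituted with values from the flow execution.
        String payload = "{\"text\": \"Flow load_orders finished with status SUCCESS\"}";

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("https://hooks.slack.com/services/T000/B000/XXXX"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(payload))
                .build();
        HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
    }
}
```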
The S3 SDK connector now supports automatic pagination when reading file names by a wildcard.
Amazon Marketplace connector now supports Sign in with Amazon and Selling Partner API (SP-API). MWS API has been deprecated and is no longer available when creating a new connection.
Magento connector now supports authentication with Access token.
Etlworks Integrator now supports Randomization and Anonymization for various domains, such as names, addresses, Internet (including email), IDs, and many others.
We added a new flow type: Bulk load files in S3 into Redshift. Use the Bulk Load Flow when you need to load files in S3 directly into Redshift. This Flow is extremely fast as it does not transform the data.
The Redshift driver now automatically maps columns with SMALLINT and TINYINT data types to INTEGER. This fixes an issue where Redshift was unable to load data into a SMALLINT column if the value is larger than 32767.
CSV connector can now read gzipped files. It works in Explorer as well.
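For context, reading a gzipped CSV simply means decompressing the stream before parsing it. A minimal sketch of the general technique (the file name and naive comma split are for illustration only):

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.GZIPInputStream;

public class ReadGzippedCsv {
    public static void main(String[] args) throws Exception {
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                new GZIPInputStream(Files.newInputStream(Path.of("orders.csv.gz"))),
                StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                String[] fields = line.split(","); // naive split, for illustration
                // process the record
            }
        }
    }
}
```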
The connector for fixed-length format can now parse the header and set the length of each field in the file automatically. Read more.
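One plausible way to derive field lengths from the header of a fixed-length file is to treat each field name plus its trailing padding as one field. The sketch below illustrates that idea; it is an assumption made for illustration, not the connector's actual implementation, and the header and row values are made up.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FixedLengthHeaderSketch {
    public static void main(String[] args) {
        // Hypothetical header: each field name is padded with spaces to its width.
        String header = "ID    NAME            AMOUNT    ";

        // A field = a non-space token plus the padding that follows it.
        Matcher m = Pattern.compile("\\S+\\s*").matcher(header);
        List<int[]> fields = new ArrayList<>(); // {start, length} per field
        while (m.find()) {
            fields.add(new int[] {m.start(), m.end() - m.start()});
        }

        // Apply the derived widths to a (hypothetical) data row.
        String row = "42    Acme Corp       199.99    ";
        for (int[] f : fields) {
            int end = Math.min(f[0] + f[1], row.length());
            System.out.println("'" + row.substring(f[0], end).trim() + "'");
        }
    }
}
```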
It is now possible to override the default key used for encryption and decryption of the export files.
Users with the operator role can now browse data and files in Explorer.
It is now possible to override the storage type, the location, the format, whether the files should be gzipped, and the CDC Key set in the CDC connection using TO-parameters in source-to-destination transformation. Read more.
We added a new S3 connector created using the latest AWS SDK. It is now the recommended connector for S3. The old S3 connector was renamed to Legacy. We will keep the Legacy connector forever for backward compatibility reasons.
It is now possible to see and cancel actions triggered by the end-user to be executed in Integration Agent. When the user triggers any action, such as Run Flow, Stop Flow, or Stop Agent, the action is added to the queue. The actions in the queue are executed in order during the next communication session between the Agent and the Etlworks Integrator. Read more.