In Etlworks, you can read and write data in numerous data exchange Formats, such as CSV, JSON, XML, and Excel. The files may reside in file storage, cloud storage, key-value storage, a NoSQL database, or an email server.
What can you do with files in Etlworks
- Copy, move, rename, delete, zip, and unzip files, create folders: Etlworks can work with files directly.
- ETL with files: Extract, transform, and load data when the file is a source or destination.
- Apply XSL style sheet to XML files: Start creating the Flow by typing
- Split/Merge files: Start creating the Flow that splits files by opening the
- Work with HL7 messages: Etlworks fully supports all existing HL7 Formats and protocols.
- Work with EDI documents: Etlworks supports X12, EDIFACT, and other EDI formats.
Videos
- Working with Files in Etlworks: Watch how to create ETL flows for files in various formats and how to copy, move, delete, and rename files.
- How to extract data from any file and load it into any database
- How to work with files without transforming data: Etlworks supports all sorts of file operations, which can be performed on almost all types of Connections, except relational databases.
- How to convert any file to any Format
Related resources
- Automatically encrypt files in server storage: ETL Flows that create files in the server storage can automatically encrypt all created files using the PGP algorithm.
- Create XML and JSON documents using JavaScript: Create a Flow where the destination is a file (JSON or XML).
- Filter and modify rows in a source CSV file: You can filter out or modify some of the rows in the CSV file before passing the data to the transformation.
- Web/HTML scraping with JavaScript and Python: For Web/HTML scraping, Etlworks includes a Java library jsoup. It is one of the best HTML parsers around.
Related case studies
Integrate data from CSV files, SQL Server databases, and APIs into a centralized cloud system, leveraging remote agents and reusable templates for rapid deployment across hundreds of customer environments.
OpenGov, a leading provider of cloud software for local governments, needed a scalable solution to integrate data from customer environments into their centralized cloud system. With Etlworks, OpenGov implemented hundreds of highly complex pipelines featuring multi-step workflows, robust error handling, and integration with Grafana for real-time monitoring. Leveraging Etlworks’ flexible architecture and professional services, they developed reusable templates that reduced deployment time for new customers by 90%.
Process XML and JSON files into MySQL, copy images from XML to S3 dynamically, and connect to OData data sources.
Leading Real Estate Companies of the World (LeadingRE) is a global network of premier real estate firms. With hundreds of partners providing data in various formats, LeadingRE needed a robust, flexible, and scalable solution to manage data integration, process images, and connect to external data sources efficiently.
Integrate cloud storage, APIs, Salesforce, SharePoint, Exchange, Google BigQuery, and more into Snowflake.
Capital Rx, a leading healthcare technology company revolutionizing pharmacy benefits, leverages Etlworks to integrate diverse data sources into Snowflake. By seamlessly connecting cloud storage systems, APIs, Excel, Salesforce, SharePoint, Exchange, Google BigQuery, and more, Etlworks enables Capital Rx to maintain a unified data ecosystem, supporting advanced analytics and decision-making.
Connect to a data storage, NoSQL database, or email server
- Amazon S3
- Google Cloud Storage
- Microsoft Azure Storage
- Server Storage
- FTP
- FTPS
- SFTP
- Box
- Dropbox
- Google Drive
- OneDrive for Business
- SharePoint
- WebDAV
- SMB Share
- Inbound email
- Outbound email
- Redis
- MongoDB
Test a Connection
To test a Connection, click Test Connection on the Connection screen. This is only available for cloud and file storage.
In addition to actually connecting to the storage, Etlworks attempts to read file names using a configured folder and file name. Etlworks supports wildcard file names.
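As a rough illustration of what a wildcard file name does, the sketch below filters a list of file names with a glob-style pattern using Python's standard `fnmatch` module. It is not Etlworks code, and the exact wildcard syntax Etlworks accepts may differ; the helper `match_files` and the sample file names are hypothetical.

```python
from fnmatch import fnmatchcase


def match_files(names, pattern):
    """Return the file names that match a glob-style wildcard pattern.

    Assumption: '*' matches any run of characters and '?' matches one
    character, as in common shell globbing. Matching is case-sensitive.
    """
    return [name for name in names if fnmatchcase(name, pattern)]


# Hypothetical folder listing.
files = ["orders_2024.csv", "orders_2025.csv", "customers.json"]
print(match_files(files, "orders_*.csv"))
```

With a pattern such as `orders_*.csv`, only the two order files are selected, while `customers.json` is skipped.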
Formats
When working with files, you must describe the data exchange Format. Etlworks supports the most commonly used Formats:
- CSV
- Fixed Length Text
- JSON
- JSON dataset
- XML
- XML dataset
- Excel
- HL7 2.x
- HL7 FHIR
- X12 and EDIFACT Formats
- Key=Value Format
- Avro
- Parquet
- CLOB Format
- HTML
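To show why a Format description matters, here is a minimal sketch that reads the same records from CSV and writes them as JSON using only the Python standard library. The reader has to be told the delimiter and that the first row is a header, which is the kind of detail a Format captures. This is an illustration only, not how Etlworks implements Formats; `csv_to_json` is a hypothetical helper.

```python
import csv
import io
import json


def csv_to_json(text, delimiter=","):
    """Parse CSV text (first row is the header) and return a JSON array.

    All values stay strings because CSV carries no type information.
    """
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    return json.dumps(list(reader))


# The same data described with two different delimiters.
print(csv_to_json("id,name\n1,Ann\n2,Bob\n"))
print(csv_to_json("id|name\n1|Ann\n2|Bob\n", delimiter="|"))
```

Both calls produce the same JSON document, because the delimiter passed to the reader matches the way each input was written.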
Browse files and view data
Use the Etlworks Explorer to browse files and view data.
Step 1. Create a file Connection.
Step 2. Choose a Format.
Step 3. Open the Etlworks Explorer, select the Connection created in Step 1 and link it to the Format chosen in Step 2.
Step 4. Explore the metadata (files and fields), view data in a grid, query data, and discover dependencies using SQL.
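The idea of querying file data with SQL can be sketched outside Etlworks as well. The example below loads CSV text into an in-memory SQLite table and runs a query against it. It only illustrates the concept; the Explorer's own SQL dialect and table naming are not shown in this article, and the helper `query_csv` and the table name `data` are assumptions made for the example.

```python
import csv
import io
import sqlite3


def query_csv(text, sql):
    """Load CSV text into an in-memory table named 'data' and run a query.

    Assumptions: the first row is a header, and every value is stored
    as text, so comparisons in the query should use string literals.
    """
    rows = list(csv.reader(io.StringIO(text)))
    header, data = rows[0], rows[1:]
    conn = sqlite3.connect(":memory:")
    try:
        columns = ", ".join(f'"{name}"' for name in header)
        conn.execute(f"CREATE TABLE data ({columns})")
        placeholders = ", ".join("?" for _ in header)
        conn.executemany(f"INSERT INTO data VALUES ({placeholders})", data)
        return conn.execute(sql).fetchall()
    finally:
        conn.close()


sample = "id,city\n1,Paris\n2,Rome\n"
print(query_csv(sample, "SELECT city FROM data WHERE id = '2'"))
```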
File operations
- Copy, move, rename, delete, zip, and unzip files.
- Transform XML files using XSLT.
- Read how to split files.
- Read about merging files.
ETL with files
Expose file as an API endpoint
- Read how to expose a dataset as an API endpoint.
Tips and tricks when working with files
Here are some tips and tricks when working with files.