In the Etlworks Integrator, it is possible to read and write data in numerous data exchange Formats, such as CSV, JSON, XML, Excel, etc. The actual files may be in file storage, cloud storage, key-value storage, a NoSQL database, or an email server.
What can you do with files in Etlworks Integrator
Copy, move, rename, delete, zip, and unzip files, create folders
The Etlworks Integrator can work with files directly.
Start creating the Flow that splits files by opening the
Apply XSL style sheet to XML files
Start creating the Flow by typing
ETL with files
Extract, transform and load data when the file is a source or destination.
How to extract data from any file and load it into any database
How to work with files without transforming data
The Etlworks Integrator supports all sorts of file operations, which can be performed on almost all types of Connections, except relational databases.
How to convert any file to any Format
Automatically encrypt files in server storage
Create a Flow where the destination is a file (JSON or XML).
Filter and modify rows in a source CSV file
You can filter out or modify some of the rows in the CSV file before passing the data to the transformation.
For Web/HTML scraping, Etlworks Integrator includes a Java library jsoup. It is one of the best HTML parsers around.
Connect to a data storage, NoSQL database, or email server
- Amazon S3
- Google Cloud Storage
- Microsoft Azure Storage
- Server Storage
- Google Drive
- OneDrive for Business
- Inbound email
- Outbound email
Test a Connection
To test a Connection, click
Test Connection on the
Connection screen. This is only available for cloud and file storage.
In addition to actually connecting to the storage, the Etlworks Integrator attempts to read file names using a configured folder and file name. The Etlworks Integrator supports wildcard file names.
When working with files, it is required that you describe the data exchange Format. The Etlworks Integrator supports the most commonly used Formats:
- Fixed Length Text
- JSON dataset
- XML dataset
- HL7 2.x
- HL7 FHIR
- X12 and EDIFACT Formats
- Key=Value Format
- CLOB Format
Browse files and view data
Use the the Etlworks Explorer to browse files and view data.
Step 1. Create a file Connection.
Step 2. Choose a Format.
Step 3. Open the Etlworks Explorer, select the Connection created in Step 1 and link it to the Format chosen in Step 2.
Step 4. Explore the metadata (files and fields), view data in a grid, query data, and discover dependencies using SQL.
- Copy, move, rename, delete, zip, and unzip files.
- Transform XML files using XSLT.
- Read how to split files.
- Read about merging files.
ETL with files
Expose file as an API endpoint
- Read how to expose a dataset as an API endpoint.
Tips and tricks when working with files
- Working with local files
- Using server storage as a staging area
- Processing files in a folder by a wildcard filename
- Processing the file and then deleting it
- Reading and writing files from/to specific folder using the same Connection
- Reading from and writing into different Excel worksheets using the same Excel Format
- Reading from and writing into different worksheets using the same Google Sheets Connection
- Filtering rows in a flat file
- Disabling automatic setting of the field data type when working with files
- Working with very large files
- Merging files
- Reading non-standard CSV files
- Splitting a CSV file when creating it
- Adding a custom header to a CSV file
- Disabling encoding CLOB fields in CSV files
- Configuring XML parser
- Reading an Excel worksheet with a header, not in the first line
- Using an Excel worksheet index when reading Excel workbooks
- Working with region-specific S3 buckets
- Archiving files before copying them to a cloud storage
- Adding metadata when creating files in S3 and Google Storage