In Etlworks Integrator, it is possible to read and write data in numerous data exchange formats, such as CSV, JSON, XML, Excel, etc. The actual files may be in file storage, cloud storage, key-value storage, a NoSQL database or on an email server.
Connecting to data storage
- Amazon S3
- Google Cloud Storage
- Microsoft Azure Storage
- Server Storage
- Inbound email
- Outbound email
Testing a connection
Test Connection button on the connection screen to test the connection.
Test Connection is only available for Cloud and File storage. In addition to actually connecting to the storage, Integrator attempts to read file names using a configured folder and file name. Integrator supports wildcard file names.
When working with files it is required that you describe the data exchange format. Etlworks Integrator supports the most commonly used formats:
Browsing files and viewing data
Use the Explorer to browse files and view data.
Step 1. Create a file connection.
Step 2. Choose the format.
Step 3. Open Explorer, select the connection created in Step 1 and link it to the format chosen in Step 2.
Step 4. Explore the metadata (files and fields), view data in a grid, query data and discover dependencies using SQL.
- Copy, move, rename, delete, zip, and unzip files
- Transforming XML files using XSLT
- Read how to split files.
- Read about merging files.
Extracting, transforming and creating files
Expose file as an API endpoint
- Read how to expose a dataset as an API endpoint.
Tips and tricks when working with files
- Modifying the source before passing it to the transformation
- Working with local files
- Using server storage as a staging area
- Process the file and then delete it
- Using a field value as a filename
- Send a file as a payload for an HTTP endpoint
- Using templates to define fields
- Handling field names with special characters
- Handling columns with a blank name
- Working with duplicated column names
- Calculating field value
- Reading non-standard CSV files
- Splitting the datasets into multiple files based on the value of a column
- Adding a custom header to a CSV file
- Disabling encoding CLOB fields in CSV files
- Configuring XML parser
- Using XSLT when reading and writing XML documents
- Reading an Excel worksheet with a header not in the first line
- Using an Excel worksheet index when reading Excel workbooks
- Using an Excel worksheet name as a field value
- Working with region-specific S3 buckets
- Archive files before copying them to a cloud storage
- Performance tips when working with files