Amazon Redshift is a fast, fully managed data warehouse that makes it simple and cost-effective to analyze all your data using standard SQL and your existing Business Intelligence (BI) tools. Read more about Amazon Redshift. Etlworks Integrator fully supports Amazon Redshift.
What can you do with Amazon Redshift in Etlworks Integrator
- Load data into Amazon Redshift: Set the Flow type and the source-to-destination transformation parameters.
- Load multiple tables by a wildcard name: You can load multiple database objects (tables and views) by a wildcard name, without creating individual source-to-destination transformations.
- Set up incremental change replication using change data capture (CDC): Once CDC is configured for the source database, you can create a CDC pipeline where that database is the source and Amazon Redshift is the destination.
- Set up change replication using a high watermark: As in any other Flow type, it is possible to configure change replication using a high watermark.
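As a rough illustration of the high-watermark idea mentioned above, the sketch below selects only the rows whose watermark column is greater than the value saved by the previous run. It is a generic example, not how Etlworks Integrator implements it: the function name `fetch_changes`, the `orders` table, and the `updated_at` column are all hypothetical, and an in-memory SQLite database stands in for the real source.

```python
import sqlite3


def fetch_changes(conn, table, watermark_column, last_watermark):
    """Return (rows, new_watermark) for rows changed since last_watermark.

    The table and column names are hypothetical and assumed to be trusted;
    only the watermark value is passed as a bound parameter.
    """
    query = (
        f"SELECT * FROM {table} "
        f"WHERE {watermark_column} > ? "
        f"ORDER BY {watermark_column}"
    )
    cursor = conn.execute(query, (last_watermark,))
    columns = [description[0] for description in cursor.description]
    rows = cursor.fetchall()
    # The highest value seen becomes the watermark for the next run.
    # If nothing changed, the previous watermark is kept.
    position = columns.index(watermark_column)
    new_watermark = rows[-1][position] if rows else last_watermark
    return rows, new_watermark


if __name__ == "__main__":
    # Stand-in source database with a hypothetical "orders" table.
    source = sqlite3.connect(":memory:")
    source.execute("CREATE TABLE orders (id INTEGER, updated_at INTEGER)")
    source.executemany(
        "INSERT INTO orders VALUES (?, ?)", [(1, 10), (2, 20), (3, 30)]
    )
    changed, watermark = fetch_changes(source, "orders", "updated_at", 10)
    print(changed, watermark)
```

Each run would persist the returned watermark and pass it back in on the next run, so only new or updated rows are extracted.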
Related resources
- Extract, transform, and load data in Amazon Redshift: Using Redshift-optimized Flows, you can extract data from any supported source and load it directly into Redshift.
- Directly load files in Redshift: This Flow loads CSV files directly into Redshift. The files can be anywhere, as long as Etlworks Integrator can read them.
- ELT with Amazon Redshift: Etlworks Integrator supports executing complex ELT scripts directly in Redshift, which greatly improves the performance and reliability of data ingestion.
- Work with Amazon Redshift as a relational database: You can use any
- Data type mapping for Redshift: It is important to understand how JDBC data types are mapped to Redshift data types.
Related case study
- Professional social network: Load data into Amazon Redshift from multiple sources
"A typical CDC Flow can extract data from multiple tables in multiple databases, but having a single Flow pulling data from 55000+ tables would be a major bottleneck as it would be limited to a single blocking queue with a limited capacity. It would also create a single point of failure."
Configure Redshift
Configure the Firewall
Typically, TCP port 5439 is used to access Amazon Redshift. If Amazon Redshift and Etlworks Integrator are running on different networks, it is important to enable inbound traffic on port 5439.
Learn how to open an Amazon Redshift port for inbound traffic.
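One way to confirm that the port is reachable from the machine running Etlworks Integrator is a plain TCP connection test. The sketch below is a minimal example using only the Python standard library; the function name `can_reach` and the endpoint in the usage section are hypothetical, and a successful TCP connection only shows that the firewall allows traffic, not that the credentials are valid.

```python
import socket

REDSHIFT_PORT = 5439  # the typical Redshift port mentioned above


def can_reach(host, port=REDSHIFT_PORT, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


if __name__ == "__main__":
    # Hypothetical endpoint; replace with your cluster's endpoint.
    endpoint = "example-cluster.example.us-east-1.redshift.amazonaws.com"
    print("reachable" if can_reach(endpoint) else "not reachable")
```

If the check reports that the port is not reachable, inbound traffic on port 5439 is a likely place to look first.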
Configure permissions for Redshift
- The user used to access Redshift must have the INSERT privilege for the Amazon Redshift table.
- If you are loading data into Redshift from an Amazon S3 bucket, you will need to grant access to S3 for the Redshift user.
- For access to Amazon S3 with the ability to use COPY and UNLOAD, choose either the AmazonS3ReadOnlyAccess policy or the AmazonS3FullAccess policy. Read-only access is sufficient for COPY, while UNLOAD writes files to S3 and therefore needs write access.
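The permissions above map to a couple of SQL statements that are run against Redshift. The sketch below only builds the statement text so that it can be reviewed before being executed by an administrator. The helper names, the `public.orders` table, the `etl_user` user, the bucket, and the IAM role ARN are all hypothetical placeholders, and it assumes the IAM role with S3 access is already attached to the cluster.

```python
import re

# Plain (unquoted) identifiers only, optionally schema-qualified.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)?")


def _checked(name):
    """Reject anything that is not a plain identifier."""
    if not _IDENTIFIER.fullmatch(name):
        raise ValueError(f"unexpected identifier: {name!r}")
    return name


def grant_insert_sql(table, user):
    """Build a GRANT statement giving the user INSERT on the table."""
    return f"GRANT INSERT ON TABLE {_checked(table)} TO {_checked(user)};"


def copy_csv_sql(table, s3_uri, iam_role_arn):
    """Build a COPY statement that loads CSV files from S3 into the table."""
    if not s3_uri.startswith("s3://") or "'" in s3_uri or "'" in iam_role_arn:
        raise ValueError("unexpected S3 URI or IAM role ARN")
    return (
        f"COPY {_checked(table)} FROM '{s3_uri}' "
        f"IAM_ROLE '{iam_role_arn}' FORMAT AS CSV;"
    )


if __name__ == "__main__":
    # All values below are placeholders for illustration.
    print(grant_insert_sql("public.orders", "etl_user"))
    print(
        copy_csv_sql(
            "public.orders",
            "s3://example-bucket/orders/",
            "arn:aws:iam::123456789012:role/ExampleRedshiftRole",
        )
    )
```

Building the statements as text keeps the example runnable without a cluster; in practice they would be run through whichever SQL client is used to administer Redshift.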
Connect to Redshift
To work with Redshift, you will need to create a Connection to Amazon Redshift.