Overview
Amazon Redshift is a fast, fully managed data warehouse that makes it simple and cost-effective to analyze all your data using standard SQL and your existing Business Intelligence (BI) tools. Read more about Amazon Redshift. Etlworks includes several flows optimized for Amazon Redshift.
Flows optimized for Redshift
| Flow type | When to use |
| --- | --- |
| Extract, transform, and load data into Redshift | When you need to extract data from any source, transform it, and load it into Redshift. |
| Bulk load files in S3 into Redshift | When you need to bulk-load files that already exist in S3 without applying any transformations. The flow automatically generates the `COPY` command and merges (MERGEs) data into the destination. |
| Stream CDC events into Redshift | When you need to stream updates from a database that supports Change Data Capture (CDC) into Redshift in real time. |
| Stream messages from queue into Redshift | When you need to stream messages from a queue that supports streaming into Redshift in real time. |
| COPY files into Redshift | When you need to bulk-load data from file-based or cloud storage, an API, or a NoSQL database into Redshift without applying any transformations. This flow requires a user-defined `COPY` command. Unlike Bulk load files in S3 into Redshift, this flow does not support automatic MERGE. |
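To illustrate what the bulk-load flow automates, here is a minimal sketch of how a Redshift `COPY` command for files staged in S3 can be assembled. The function name, table, bucket path, and IAM role ARN are hypothetical examples, not Etlworks internals.

```python
# Hedged sketch: builds the kind of COPY statement the bulk-load
# flow generates automatically. All names below are hypothetical.

def build_copy_command(table, s3_path, iam_role, file_format="CSV"):
    """Return a Redshift COPY statement for files staged in S3."""
    return (
        f"COPY {table} "
        f"FROM '{s3_path}' "
        f"IAM_ROLE '{iam_role}' "
        f"FORMAT AS {file_format}"
    )

print(build_copy_command(
    "public.orders",
    "s3://example-bucket/staging/orders/",
    "arn:aws:iam::123456789012:role/RedshiftCopyRole",
))
```

The generated statement would then be executed against the cluster; the user-defined variant of the flow (COPY files into Redshift) expects you to supply a statement like this yourself.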
Videos
ETL and CDC data into Redshift: watch how to create flows to ETL and CDC data into Amazon Redshift.
Related resources
- Reverse ETL with Amazon Redshift
- ELT with Amazon Redshift: Etlworks supports executing complex ELT scripts directly in Redshift, which greatly improves the performance and reliability of data ingestion.
- Data type mapping for Redshift: it is important to understand how Etlworks maps JDBC data types to Redshift data types.
- Configure Redshift: configure permissions and the firewall.
- Connect to Redshift: create a connection for the Redshift cluster.
- Load multiple tables by a wildcard name: you can ETL data from multiple database objects (tables and views) into Redshift by a wildcard name, without creating individual source-to-destination transformations.
- Set up change replication using a high watermark (HWM): using HWM replication, you can load only new and updated records into Redshift.
Related case studies
Collect half a billion records daily from SQL Server, Marketo, Salesforce, and Smartsheet into Redshift in real time.

Sermo, a global medical community platform, needed to aggregate massive volumes of data from diverse sources, including SQL Server databases and SaaS platforms such as Marketo, Salesforce, and Smartsheet, and load it into Amazon Redshift in near real time. Within two weeks of subscribing to Etlworks, Sermo was collecting and processing over half a billion records daily, an achievement made possible by Etlworks' advanced features and unparalleled support.
Convert X12 messages to JSON and Parquet, load them into Redshift, and integrate Salesforce.

XSOLIS, a leader in healthcare data analytics, leverages Etlworks to manage its complex data integration needs. Etlworks enables XSOLIS to seamlessly process hundreds of thousands of massive X12 messages, each tens of megabytes in size, by converting them into JSON and Parquet formats and loading them into Amazon Redshift. Additionally, XSOLIS uses Etlworks to streamline integrations with Salesforce, enabling a unified data ecosystem.
Configure Redshift
Configure the Firewall
Typically, TCP port 5439 is used to access Amazon Redshift. If Amazon Redshift and Etlworks are running on different networks, you must enable inbound traffic on port 5439.
Learn how to open an Amazon Redshift port for inbound traffic.
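If your cluster runs in a VPC, opening the port typically means adding an inbound rule to the cluster's security group. As a sketch, this can be done with the AWS CLI; the security-group ID and CIDR range below are hypothetical placeholders for your own values:

```shell
# Hypothetical security-group ID and CIDR; replace with your own.
aws ec2 authorize-security-group-ingress \
  --group-id sg-0123456789abcdef0 \
  --protocol tcp \
  --port 5439 \
  --cidr 203.0.113.0/24
```

Restrict the CIDR range to the addresses Etlworks connects from rather than opening the port to the whole internet.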
Configure permissions for Redshift
- The user used to access Redshift must have the `INSERT` privilege on the target Amazon Redshift table.
- If you are loading data into Redshift from an Amazon S3 bucket, you will need to grant the Redshift user access to S3.
- For access to Amazon S3 with the ability to use `COPY` and `UNLOAD`, choose either the `AmazonS3ReadOnlyAccess` or the `AmazonS3FullAccess` managed policy.
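Granting the table-level privilege is a single SQL statement. A minimal example, assuming a hypothetical `etl_user` and target table `public.orders`:

```sql
-- Hypothetical user and table names; adjust to your schema.
GRANT INSERT ON public.orders TO etl_user;
```

If the flow also merges data, the user may additionally need `SELECT`, `UPDATE`, and `DELETE` on the same table.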
Connect to Redshift
To work with Redshift, you will need to create a Connection to Amazon Redshift.
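The connection points at your cluster's endpoint. A Redshift JDBC URL has the form `jdbc:redshift://<endpoint>:<port>/<database>`; the cluster endpoint and database name below are hypothetical examples:

```
jdbc:redshift://examplecluster.abc123.us-east-1.redshift.amazonaws.com:5439/dev
```

You can copy the actual endpoint from the cluster's details page in the Amazon Redshift console.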