Redshift is a columnar database optimized for data retrieval, so loading data into it with DML statements (INSERT/UPDATE/DELETE) can be quite slow, especially for large datasets.
While it is possible to load data into Redshift using regular Flows, such as Database to database, it is highly recommended to use a Redshift-optimized Flow.
A typical Redshift Flow performs the following operations:
- Extracts data from the source.
- Creates CSV files.
- Compresses files using the gzip algorithm.
- Copies the files into an Amazon S3 bucket.
- Checks to see if the destination Redshift table exists, and if it does not, creates the table using metadata from the source.
- Dynamically generates and executes the Redshift COPY command.
- Cleans up the remaining files, if needed.
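To make the steps above more concrete, here is a minimal Python sketch of three of them: writing rows to CSV and compressing them with gzip, generating a CREATE TABLE statement when the destination table may not exist, and generating the COPY command. It is an illustration under stated assumptions, not the code the Flow actually runs. The function names, the table name, the bucket path, and the IAM role ARN are all hypothetical, and the upload to S3 and the execution of the SQL are left out.

```python
import csv
import gzip
import io


def rows_to_gzipped_csv(header, rows):
    """Serialize rows to CSV text with a header row, then gzip the bytes."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return gzip.compress(buffer.getvalue().encode("utf-8"))


def build_create_table(table, columns):
    """Build a CREATE TABLE IF NOT EXISTS statement.

    `columns` is a list of (name, redshift_type) pairs. Mapping source
    metadata to Redshift types is assumed to have happened already.
    """
    column_sql = ", ".join(f"{name} {col_type}" for name, col_type in columns)
    return f"CREATE TABLE IF NOT EXISTS {table} ({column_sql});"


def build_copy_statement(table, s3_uri, iam_role):
    """Build a COPY statement for a gzipped CSV file with one header row.

    Values are interpolated without escaping, so this is only suitable
    for trusted, illustrative input.
    """
    return (
        f"COPY {table} "
        f"FROM '{s3_uri}' "
        f"IAM_ROLE '{iam_role}' "
        "FORMAT AS CSV GZIP IGNOREHEADER 1;"
    )


if __name__ == "__main__":
    # All names below are placeholders chosen for the example.
    payload = rows_to_gzipped_csv(["id", "name"], [(1, "alice"), (2, "bob")])
    print(len(payload), "compressed bytes ready to upload to S3")
    print(build_create_table("public.users", [("id", "INTEGER"), ("name", "VARCHAR(256)")]))
    print(build_copy_statement(
        "public.users",
        "s3://example-bucket/staging/users.csv.gz",
        "arn:aws:iam::123456789012:role/ExampleRedshiftRole",
    ))
```

The COPY command is what makes this approach fast: Redshift reads the staged file from S3 in bulk instead of processing one INSERT per row.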
Read how to efficiently load large datasets into Redshift with the Etlworks Integrator.