Etlworks uses the Snowflake
COPY INTO command to load data into Snowflake tables.
COPY INTO requires a named internal or external stage. Stage refers to the location where your data files are stored for loading into Snowflake.
Etlworks flows optimized for Snowflake can automatically create an
internal named stage and
external named stage in AWS S3 and Azure Blob.
The flow cannot automatically create a stage in Google Cloud storage. You will need to create a stage in Google Cloud storage manually.
Step 1. Set the
or in the source-to-destination Transformation when configuring the Flow which loads data in Snowflake:
The Stage name set at the transformation level overrides the Stage name set for the connection. Snowflake flow cannot be created if the Stage name is not set in either of these two places.
Step 2. Make sure that the parameter
Check if stage exists set for the Snowflake connection is enabled. Note that it is enabled by default.
How it works
Check if stage exists is enabled the flow performs two extra steps before loading data into Snowflake:
- The flow checks if the stage with the given name and parameters exists. When checking parameters the flow verifies the type of the stage (internal, external), the location of the stage (bucket or blob), and the credentials used to create the stage.
- If the stage with the given name and parameters does not exist the flow automatically creates the stage.
We recommend keeping the
Check if stage exists enabled. The Snowflake
COPY INTO command will not fail if the stage exists but is not configured properly, for example not pointed to the right bucket. In this case, the flow will not load any data.
Handling errors and edge cases
Create a stage manually and slightly improve the performance of the load
You can slightly improve the performance of the load by creating the stage manually and disabling
Check if stage exists.
Working with stages in Google Cloud Storage
Etlworks flows, optimized for Snowflake cannot automatically create a stage in Google Cloud storage. You will need to create it manually.
Ignore errors when the Flow cannot create the stage
Check if stage exists is enabled and the stage with the given name and parameters does not exist the flow will attempt to create it. If an attempt to create the stage fails the flow fails as well. In some rare cases when the SQL for loading data is provided as a part of the flow (instead of using the automatically generated
COPY INTO command) you can configure the Snowflake connection to ignore errors generated when the flow cannot create the stage by disabling
Fail load if flow cannot create stage.
Create stage manually
Read how to manually create a named Snowflake stage: