Overview
With Etlworks, it is possible to create a Connection to a temporary, in-memory database and use it as a staging area for the data. The SQLite engine is used for these temporary databases.
Cleaning up or dropping the tables in the temp database is not required. They will be automatically dropped when the Flow is executed, and the connection to the temp database is closed.
Use cases
Store the source document in the staging database: Transform using SQL
You can store the source file or a response from a web service in the staging database and then use SQL, running against the temp database, to transform the data.
Step 1. Create a Connection to a temporary database.
Step 2. Create a Flow where the source is a file or a web service, and the destination is a Connection created in step 1. Let the system automatically create the staging table.
Step 3. Combine the Flow created in Step 2 with other Flows to transform the source.
Store multiple source documents: Create a single output file in a different Format
To process multiple source data sets, follow the steps below.
Step 1. Create a Connection to a temporary database.
Step 2. Create a Flow that reads multiple source files and inserts data into the staging database using the Connection created in Step 1.
Step 3. Create a Flow that reads data from the staging database and creates an output file.
Step 4 Create a nested Flow, which combines the Flows created in Steps 2 and 3.
Use a temporary database in a database loop
Read about the database loop.
Step 1. Create a Connection to a temporary database.
Step 2. Create a Flow that loads data into the staging database (for example, read multiple source files and insert data into the staging database) using the Connection created in Step 1.
Step 3. Create a database loop using the Connection created in Step 1.
Switching between temporary database and staging database with data retained between runs
One of the common use cases when developing flows is to use a SQLite connection for staging data instead of temporary DB connection. Reason: unlike temp db connection, with SQLite connection as a destination or source it is possible to see the data in the database after the flow is executed. When the flow is ready for production the developer needs to change all staging connections back to temp db. Which can be a lot of places if the flow is complex.
Here is a recommended approach which greatly simplifies the process:
Step 1. Instead of using temp db connection start with a staging SQLite connection. Use it everywhere you would normally use temp db connection.
Use unique name for the connection and for the database so it won't interfere with other flows and developers.
Step 2. If the staging connection is used as a destination you can optionally add a step which deletes the database file in first step. It will simulate the behavior of then temp db connection when the flow starts. You can use the Delete files flow type to delete the database file. For example, if the database for SQLite connection is configured as {app.data}/test-staging.dat
you can create a Delete files flow where the Connection is a Server storage connection pointed to app.data (default) and the From file name is test-staging.dat
:
Step 3. When you are ready to move the flow to production modify the SQLite connection to make it temporary and in-memory: set URL to jdbc:sqlite:
and leave the Database empty:
You can always switch back to staging db by modifying the SQLite connection again.
Comments
0 comments
Please sign in to leave a comment.