ELT with Databricks – Etlworks Support

Once data is loaded into Databricks, the heavy transformation work usually runs inside Databricks itself — Spark SQL is fast, Delta tables support efficient UPDATEs and MERGEs, and the data never leaves the lakehouse. Etlworks supports two ELT patterns: Before / After SQL on a regular load flow, and a dedicated SQL flow.

ELT using Before and After SQL

Create a flow to load data into Databricks. Let Etlworks auto-create the staging Delta table.
On the transformation, use Before COPY SQL and After COPY SQL to run Spark SQL in the destination Databricks workspace. Multiple ;-separated statements are supported.

Typical post-load steps:
```
OPTIMIZE analytics.gold.orders ZORDER BY (customer_id);
VACUUM analytics.gold.orders RETAIN 168 HOURS;
ANALYZE TABLE analytics.gold.orders COMPUTE STATISTICS FOR ALL COLUMNS;
```
Optionally configure the flow to ignore errors when running Before / After SQL.
Schedule the flow to run periodically.

ELT using an SQL flow

For multi-step transformations that don't belong to a single load flow, build a dedicated SQL flow targeting the Databricks connection. Multiple ;-separated statements are supported. Combine the load flow and the SQL flow into a single nested flow.

This pattern works well for medallion architectures (bronze → silver → gold) where each stage is a separate flow chained in order.

Articles in this section

ELT using Before and After SQL

ELT using an SQL flow

Related articles