Etlworks Terminologies – Etlworks Support

A

Admin: a type of user that has full control over the data and can create as well as manage Flows, Connections, and Formats.

Amazon Redshift: a fast, fully managed data warehouse on AWS.

API: a set of functions and procedures allowing the creation of applications that access the features or data of an operating system, application, or another service.

API endpoint: the point at which an application programming interface (API), the code that allows two software programs to communicate with each other, connects with the software program.

Audit Trail: for security and audit purposes, Etlworks records and preserves all important events in the database.

C

Change data capture (CDC): an approach to data integration that is based on the identification, capture, and delivery of the changes made to the source database and stored in the database redo log (also called the transaction log).

Change replication: in Etlworks, it is possible to create a Flow that tracks changes in a data source and loads the changed records into the destination. Another name for change replication is data synchronization.

Connection: a configuration required for connecting to external data sources. Data integration operations cannot be performed without it.

Connector: a type of Connection, such as a Database Connector, Twitter Connector, etc.

D

Database loop: you can create a pipeline (nested Flow) where some of the steps (inner Flows) are configured to be executed in a loop. The idea is that you can write a SELECT SQL statement which, when executed, creates a cursor driving the loop. The inner Flow in a database loop is executed as many times as the number of records returned by the cursor.
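The idea of a cursor driving a loop can be illustrated outside Etlworks. The following is a minimal sketch in Python using an in-memory SQLite database; the `regions` table, the `run_database_loop` helper, and the `inner_flow` callback are hypothetical names chosen for illustration and are not part of Etlworks.

```python
import sqlite3

def run_database_loop(conn, driving_sql, inner_flow):
    """Run inner_flow once per record returned by driving_sql."""
    cursor = conn.execute(driving_sql)
    columns = [d[0] for d in cursor.description]
    runs = 0
    for row in cursor:
        # Each record is handed to the inner step as a dict of column values.
        inner_flow(dict(zip(columns, row)))
        runs += 1
    return runs

# Hypothetical driving table with three records.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE regions (region_id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO regions VALUES (?, ?)",
    [(1, "east"), (2, "west"), (3, "north")],
)

processed = []
loop_runs = run_database_loop(
    conn,
    "SELECT region_id, name FROM regions ORDER BY region_id",
    processed.append,  # stand-in for the inner Flow
)
```

Here the inner step runs three times, once for each record the SELECT statement returns.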

Data exchange Format: same as Format.

DataSet: when Etlworks reads the source, it creates an internal data structure called a DataSet. A DataSet is similar to a database table. It has fields, and the rows are stored in DataSetData. Each value in a row can be either a generic value (integer, string, date, etc.) or a DataSet itself. This allows datasets to be nested to any depth, which supports complex real-life data structures.
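A nested structure of this kind can be sketched as follows. This is not the Etlworks implementation: the `DataSet` class below and its `depth` method are hypothetical and only show how a row value can itself be a dataset.

```python
from dataclasses import dataclass, field
from typing import Any, List

@dataclass
class DataSet:
    """Illustrative table-like structure: named fields plus rows of values."""
    fields: List[str]
    rows: List[List[Any]] = field(default_factory=list)

    def depth(self) -> int:
        """Return 1 for a flat dataset, plus 1 for each level of nesting."""
        nested = [
            value.depth()
            for row in self.rows
            for value in row
            if isinstance(value, DataSet)
        ]
        return 1 + max(nested, default=0)

# The inner dataset holds order items; the outer dataset holds orders.
items = DataSet(fields=["sku", "qty"], rows=[["A-1", 2], ["B-7", 1]])
orders = DataSet(fields=["order_id", "items"], rows=[[1001, items]])
```

In this sketch `items` plays the role of an inner dataset stored as a single value in a row of `orders`.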

Destination: the TO part of a source-to-destination transformation.

Destination Connection: the final destination Connection in all bulk load Flows. It is different from the TO Connection, which is used to stage the files that are loaded into the database.

Dimension: the inner dataset in a complex nested data object called DataSet.

E

ELT: Extract-Load-Transform (ELT) is a technique in which the transformation step is moved to the end of the workflow, and data is immediately loaded to a destination upon extraction.

Etlworks Integrator: an all-in-one cloud-native ETL tool for all data integration projects, regardless of the data's location, Format, or volume. It is a completely online solution and does not require any local software installations, except for a web browser.

Event-driven Flow: a Flow that is executed when a Listener receives a certain event.

F

Filters: a way to quickly filter Connections, Flows, Formats, Schedules, etc., based on the name, type, and other attributes.

Format: a data exchange Format, such as CSV, JSON, Excel, etc. When data objects are involved, just like when working with web services or files, you need to describe all the Formats required for the data exchange.

Flow: a data integration Flow, which combines one or more transformations.

Flow variables: key-value pairs passed as URL parameters in the user-created API endpoints or added by the user as parameters in the nested Flow.

G

Global variables: key-value pairs that are set either by JavaScript code or are automatically set when running Flows.

H

High-level transformation: a transformation of the entire dataset, which is performed as a single operation — for example, deduplication.
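As an illustration of a whole-dataset operation, here is a minimal deduplication sketch. The `deduplicate` function and the sample rows are hypothetical; the sketch keeps the first row seen for each key, which is one of several reasonable policies.

```python
def deduplicate(rows, key_fields):
    """Keep the first row for each distinct combination of key_fields."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row[f] for f in key_fields)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique

rows = [
    {"email": "a@example.com", "name": "Ann"},
    {"email": "b@example.com", "name": "Bob"},
    {"email": "a@example.com", "name": "Ann B."},
]
deduped = deduplicate(rows, ["email"])
```

The operation needs to see every row before its result is complete, which is what distinguishes it from a per-field transformation.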

High Watermark: the highest peak in value that a field has reached.

HTTP GET: the HTTP method for requesting data from the server. Requests using the HTTP GET method should only fetch data.

HTTP POST: the HTTP method for sending data to the server, typically to create or update a resource.

HWM query: a SQL query that returns the current maximum value of the High Watermark field. Example:

SELECT max(audit_trail_id) FROM audit_trail_updates
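The way a high watermark can drive change replication is sketched below with an in-memory SQLite database. The table and column names come from the example query; the `event` column and the helper functions `current_hwm` and `changed_since` are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE audit_trail_updates (audit_trail_id INTEGER, event TEXT)")
conn.executemany(
    "INSERT INTO audit_trail_updates VALUES (?, ?)",
    [(1, "login"), (2, "edit")],
)

def current_hwm(conn):
    """Run the HWM query; treat an empty table as a watermark of 0."""
    value = conn.execute(
        "SELECT max(audit_trail_id) FROM audit_trail_updates"
    ).fetchone()[0]
    return value if value is not None else 0

def changed_since(conn, hwm):
    """Return only the records added after the stored watermark."""
    return conn.execute(
        "SELECT audit_trail_id, event FROM audit_trail_updates "
        "WHERE audit_trail_id > ? ORDER BY audit_trail_id",
        (hwm,),
    ).fetchall()

hwm = current_hwm(conn)  # watermark recorded after the first load
conn.execute("INSERT INTO audit_trail_updates VALUES (3, 'delete')")
new_rows = changed_since(conn, hwm)  # only the record added since then
```

After each load the watermark would be refreshed with the same query, so the next run picks up only newer records.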

I

Instance: a physical box or VM running Etlworks.

J

JSON: a lightweight data-interchange Format.

K

Key/value storage: a common facility for objects, programmatically stored as key/value pairs and accessible from anywhere within JavaScript. Unlike global variables and Flow variables, you can add any object to the key/value storage.

L

Listener: a special type of Connection that is "listening" for a specific event. For example, an inbound HTTP request, which then triggers the execution of data integration Flows. Listeners allow the user to create event-driven Flows.

Lookup: a common ETL operation, typically performed to calculate a field's value. Given a list of input parameters, the system returns a field's value or an entire dataset by querying a database or other data source.
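A lookup against a small in-memory dictionary can be sketched as follows. The `build_lookup` and `lookup` helpers and the sample country data are hypothetical and only illustrate the general technique.

```python
def build_lookup(rows, key_field):
    """Index a list of rows by the value of key_field."""
    return {row[key_field]: row for row in rows}

def lookup(index, key, field, default=None):
    """Return one field of the matching row, or default if there is no match."""
    row = index.get(key)
    return default if row is None else row.get(field, default)

countries = [
    {"code": "US", "name": "United States"},
    {"code": "DE", "name": "Germany"},
]
index = build_lookup(countries, "code")

# Calculate a destination field from a source field via the lookup.
order = {"order_id": 7, "country_code": "DE"}
order["country_name"] = lookup(index, order["country_code"], "name")
```

Building the index once and reusing it avoids querying the data source for every row.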

M

Mapping: the Mapping between fields in the source and fields in the destination.
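In its simplest form, a mapping can be pictured as a table of source field names and the destination field names they feed. The sketch below is illustrative only; the `apply_mapping` function and the field names are hypothetical.

```python
def apply_mapping(source_row, mapping):
    """Build a destination row from the source fields listed in mapping."""
    return {dest: source_row[src] for src, dest in mapping.items()}

# Source field name -> destination field name.
mapping = {"first_name": "FirstName", "last_name": "LastName"}
source_row = {"first_name": "Ann", "last_name": "Lee", "internal_id": 42}
destination_row = apply_mapping(source_row, mapping)
```

Source fields that are not listed in the mapping, such as `internal_id` here, are left out of the destination row in this sketch.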

Memory Connection: a Connection for storing an extracted dataset in memory. It is typically used for storing dictionaries that can later be used in lookup operations.

Metadata: the metadata associated with the Connection: objects (files and tables) and columns.

N

Nested Flow: a Flow that includes other (inner) Flows. The inner Flows can be executed sequentially, in parallel, conditionally, and in a loop.

P

Parametrization: a way to dynamically configure Connections and transformations based on input parameters. The input parameters can be Global variables or Flow variables.
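The general technique can be sketched as token substitution in a configuration string. The `{name}` token syntax, the `apply_parameters` function, and the example URL below are assumptions made for illustration, not a description of how Etlworks resolves parameters.

```python
import re

def apply_parameters(template, variables):
    """Replace each {name} token with its value; leave unknown tokens as-is."""
    def replace(match):
        name = match.group(1)
        return str(variables.get(name, match.group(0)))
    return re.sub(r"\{(\w+)\}", replace, template)

url_template = "https://example.com/api/orders?region={region}&since={since}"
url = apply_parameters(url_template, {"region": "east", "since": "2024-01-01"})
```

Leaving unknown tokens untouched makes a missing variable easy to spot in the resulting string.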

Payload: the actual data that is sent in the body of an HTTP request made with the POST or PUT method.

R

Redshift Connection: the final destination Connection in all Redshift-specific Flows. It is different from the TO Connection, which is used to stage the files that are loaded into Redshift.

S

Schedule: a way to automatically execute Flows based on pre-configured time intervals.

Scripts: code written in JavaScript, Python, or SQL.

Snowflake Connection: the final destination Connection in all Snowflake-specific Flows. It is different from the TO Connection, which is used to stage the files that are loaded into Snowflake. The idea is the same as for the Redshift Connection, but the two are used in different Flow types.

Source-to-destination transformation: the most common transformation. Etlworks extracts data from the source, transforms it, and loads it into the destination.

Source: the FROM part of a source-to-destination transformation.

SQL Server Change Tracking: also known as CT, a lightweight tracking mechanism first introduced in SQL Server 2008. It can be used to track the DML changes performed in SQL Server database tables.

Stage: temporary storage for data that can be used to load data into the final destination, for example, cloud data warehouses such as Snowflake.

Sticky filter: Etlworks can be configured to remember all your active filters so that the next time you open Connections, Formats, Flows, or the Scheduler, all the previously configured filters will be preset and active.

SuperAdmin: a role under the main account. Users with the SuperAdmin role have unrestricted system access.

T

Tag: end-user-defined metadata which can be assigned to Connections, Formats, Listeners, Flows, Schedules, and Macros. Tags enable users to categorize resources in different ways, for example, by purpose, owner, or environment.

Tag filter: the ability to display only the Connections, Formats, Flows, Listeners, and Schedules that have specific tags assigned to them.

Tenant: a sub-account under the main account. Each tenant has a separate list of users, Flows, Connections, Formats, and Listeners. Tenants are completely isolated from each other.

Throughput: the number of requests to the user-defined API that can be processed per minute.

Transformation: a series of functions and sets of rules applied to the extracted data in the source to convert it into the required Format in the destination.

U

UUID: universally unique identifier, a 128-bit value used to identify a record or object without a central registry.
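For illustration, a random (version 4) identifier can be generated with the Python standard library. The variable names are arbitrary, and this sketch says nothing about how Etlworks itself generates identifiers.

```python
import uuid

record_id = uuid.uuid4()     # random version 4 identifier
text = str(record_id)        # canonical 36-character form with hyphens
parsed = uuid.UUID(text)     # parsing the text yields an equal identifier
```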

W

Well-known API: the business application or API for which Etlworks has a native connector — for example, Google Analytics, Google Sheets, etc.

X

XML: a markup language that defines a set of rules for encoding documents in a data exchange Format with the same name.
