Overview
CDC storage connectors are used when creating and managing Change Data Capture (CDC) pipelines in Etlworks. There are two of them, CDC Events and CDC Offset and History, and both are based on the Server Storage connector.
CDC Events Connector
This connector is used as a destination for a CDC extract flow that creates files in the local or attached network storage. It can also be used as a source for flows that load CDC events from the local or attached network storage into a destination such as another database, data warehouse, or data lake. CDC events are created by the CDC connector when it performs the initial or ad-hoc snapshot and when it streams data from the transaction log. The events are stored in CDC files (multiple events per file) in CSV or JSON format.
Advantages of the CDC Events connector vs. the Server Storage connector
The CDC Events connector is, for all intents and purposes, a standard Server Storage connector. It does, however, provide a few advantages when used in CDC pipelines:
1. There is no need to remember or configure the standard location of the CDC events. By default, the connector points to {app.data}/debezium_data/events, which is the standard location.
2. When used as a source in flows that load CDC files into databases, data warehouses, and data lakes, it is possible to select one of the default wildcard templates in FROM.
The following wildcard templates are available:
CDC events in CSV format: *_cdc_stream_*.csv
CDC events in JSON format: *_cdc_stream_*.json
CDC events archived using GZip: *_cdc_stream_*.gz
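To illustrate which file names the three wildcard templates would select, here is a minimal Python sketch. It assumes the templates behave like ordinary shell-style wildcards, which is how Python's fnmatch module interprets them; the actual matching is done by Etlworks and may differ in details. The sample file names are hypothetical and are not taken from a real CDC pipeline.

```python
from fnmatch import fnmatchcase

# The three wildcard templates listed above, keyed by a short label.
CDC_TEMPLATES = {
    "csv": "*_cdc_stream_*.csv",
    "json": "*_cdc_stream_*.json",
    "gzip": "*_cdc_stream_*.gz",
}


def matching_templates(file_name: str) -> list[str]:
    """Return the labels of every template that matches file_name."""
    return [
        label
        for label, pattern in CDC_TEMPLATES.items()
        if fnmatchcase(file_name, pattern)  # case-sensitive shell-style match
    ]


# Hypothetical file names, for illustration only.
sample_files = [
    "inventory_customers_cdc_stream_0001.csv",
    "inventory_orders_cdc_stream_0002.json",
    "inventory_orders_cdc_stream_0003.csv.gz",
    "readme.txt",
]

for name in sample_files:
    print(name, matching_templates(name))
```

Under this assumption, a gzipped CSV file is picked up only by the GZip template, because each pattern must match the whole file name, including its final extension.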
How to create a CDC Events connection
Step 1. In the Connections window, click +, and select cdc events.
Step 2. Select CDC Events.
Step 3. In most cases, you can just click Save and let the system create a connection with default parameters. It is unlikely that you will ever need to make any changes, but here is a list of available parameters that you can modify.
CDC Offset and History
This connector is used to reset the CDC extract flow (restart the CDC process from scratch). The CDC history file includes the DDL statements that are applied to the database and is used to identify the structure of the tables at the time of each insert, update, or delete operation. The CDC offset file includes the current position in the transaction log and is used to identify where the connector should resume reading after restarting.
Advantage of the CDC Offset and History connector vs. the Server Storage connector
The CDC Offset and History connector is, for all intents and purposes, a standard Server Storage connector.
It does, however, provide an advantage when used in CDC pipelines: there is no need to remember or configure the standard location of the offset and history files. By default, the connector points to {app.data}/debezium_data, which is the standard location.
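The two default locations mentioned in this article are related: the events folder sits directly under the folder that holds the offset and history files. The sketch below shows this by expanding the {app.data} token in both defaults. The value used for app.data is a hypothetical example, and the expand helper is purely illustrative; Etlworks resolves the token itself.

```python
from pathlib import PurePosixPath

# Default locations quoted in this article. {app.data} is a token that
# Etlworks resolves; here it is replaced by hand for illustration.
EVENTS_DEFAULT = "{app.data}/debezium_data/events"
OFFSET_HISTORY_DEFAULT = "{app.data}/debezium_data"


def expand(template: str, app_data: str) -> PurePosixPath:
    """Replace the {app.data} token with a concrete directory (illustrative helper)."""
    return PurePosixPath(template.replace("{app.data}", app_data))


app_data = "/opt/etlworks/data"  # hypothetical value, not a documented default

events_dir = expand(EVENTS_DEFAULT, app_data)
offset_history_dir = expand(OFFSET_HISTORY_DEFAULT, app_data)

print(events_dir)
print(offset_history_dir)
# The events folder is a direct child of the offset and history folder.
print(events_dir.parent == offset_history_dir)
```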
How to create a CDC Offset and History connection
Step 1. In the Connections window, click +, and select cdc offset.
Step 2. Select CDC Offset and History.
Step 3. In most cases, you can just click Save and let the system create a connection with default parameters. It is unlikely that you will ever need to make any changes, but here is a list of available parameters that you can modify.