Overview
This article explains how to configure lifecycle settings for CDC flows, manage offset and history files, monitor active CDC streams, and access runtime metrics via the UI and API.
Automatically Stopping a CDC Flow
By default, CDC flows run continuously and do not stop on their own — this is the recommended and most reliable configuration. However, in certain scenarios, you may want to automatically stop the CDC flow when no new CDC events have been detected for an extended period.
Auto-stop options (configurable in the CDC Connection):
- Number of retries before giving up: Stops the flow after this many retries if no new CDC events are found.
- Retry N minutes before giving up: Stops the flow after waiting this number of minutes without receiving new CDC events.
- Always stop after N minutes: Stops the flow after this many minutes, regardless of whether events are still arriving.
NOTE: Use auto-stop settings with caution. CDC connectors may take time to initialize and begin streaming events. Prematurely stopping the flow could prevent data from being captured.
If none of these options are set, the CDC flow continues running until it is manually stopped or encounters an error.
To manually stop a running CDC flow, click Stop / Cancel in the UI.
Resetting a CDC Flow
Need a smaller change than a full reset? If the goal is to move the connector past one specific bad position or to roll back to a slightly earlier known-good position, you can edit the offset file in place using the in-UI editor — see Edit CDC Offset Files. A full reset (described below) drops all progress and re-snapshots every monitored table; an offset edit only changes where the connector resumes from.
Etlworks CDC connectors track progress using two files:
- Offset file: records the current position in the transaction log.
- History file: stores the history of DDL changes for monitored tables.
Typically, a CDC flow begins by:
- Taking a snapshot of the monitored tables (if enabled),
- Or starting from the oldest known transaction log position,
- Then continuously streaming change events.
If the flow is restarted, it will resume from the last recorded position.
However, you may want to reset the CDC flow, forcing it to reprocess the data from the beginning.
Resetting the CDC flow:
- Create a new Offset and History connection. If this connection type is not available in your Etlworks version, simply create a Server Storage connection that points to {app.data}/debezium_data.
- Stop the CDC flow if it is currently running.
- Open the Offset and History connection in Explorer and delete the .dat files associated with this CDC connection.
- If you want to fully re-snapshot the monitored tables, set the snapshot type in the source CDC connection to ad-hoc initial.
- Restart the CDC flow.
  - If snapshotting is enabled, the flow will take a fresh snapshot.
  - Otherwise, it will start from the beginning of the transaction log.
Automatically Backing Up Offset and History Files
You can configure automatic backups of offset and history files for recovery or audit purposes. Backups are stored as .zip files in the designated folder (default is {app.data}/debezium_data/backup).
Backup settings (in CDC Connection → Backup Offset):
- Automatically backup offset every (minutes): Enables periodic backup every N minutes.
- Delete offset backups older than (minutes): Automatically deletes old backups to save storage.
- Backup file timestamp format: Customize the filename timestamp. Default is MMddyyyyHHmmss.
Available format tokens:
- MM: month
- dd: day
- yyyy: year
- HH: hour (0–23)
- mm: minutes
- ss: seconds
- SSS: milliseconds
Backup filenames
Backup filenames follow the pattern: connectionName_timestamp.zip
Example: cdc_prod_09262025124501.zip
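To make the naming pattern concrete, here is a minimal Python sketch that builds and parses a backup filename. The helper names and the mapping of the default format MMddyyyyHHmmss to the strftime pattern %m%d%Y%H%M%S are assumptions made for illustration; Etlworks generates these names itself, and a customized timestamp format would need a different pattern.

```python
from datetime import datetime

# Assumed strftime equivalent of the default format MMddyyyyHHmmss.
DEFAULT_STRFTIME = "%m%d%Y%H%M%S"


def backup_filename(connection_name: str, when: datetime,
                    fmt: str = DEFAULT_STRFTIME) -> str:
    """Hypothetical helper: build a name shaped like connectionName_timestamp.zip."""
    return f"{connection_name}_{when.strftime(fmt)}.zip"


def parse_backup_timestamp(filename: str, connection_name: str,
                           fmt: str = DEFAULT_STRFTIME) -> datetime:
    """Hypothetical helper: recover the timestamp from a backup filename."""
    prefix = f"{connection_name}_"
    if not (filename.startswith(prefix) and filename.endswith(".zip")):
        raise ValueError(f"not a backup of {connection_name}: {filename}")
    stamp = filename[len(prefix):-len(".zip")]
    return datetime.strptime(stamp, fmt)


name = backup_filename("cdc_prod", datetime(2025, 9, 26, 12, 45, 1))
taken_at = parse_backup_timestamp(name, "cdc_prod")
```

Because the default format starts with the month rather than the year, sorting these filenames alphabetically does not sort them chronologically across years; parse the timestamp when the order matters.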
Changing the destination for backup files
By default, backups are stored in {app.data}/debezium_data/backup.
You can change this by going to the Flow → Connections → Offset Backup tab and selecting a different storage connection.
Recovery and Resumability
CDC flows in Etlworks recover from interruptions automatically in the vast majority of cases. When the integrator restarts, the flow fails and retries, or a host is replaced and the storage is restored, the connector resumes from the last committed position in the source change feed. The remainder of this section explains how automatic recovery works on the happy path, and the small set of edge cases where manual intervention is required.
How automatic recovery works
Every CDC connector keeps two small files on disk while it runs:
- The offset file records the connector's current position in the source change feed (binlog file and position for MySQL, LSN for SQL Server and PostgreSQL, SCN for Oracle, journal sequence for AS/400, oplog timestamp for MongoDB, and so on).
- The history file records the schema of every monitored table at the time each DDL change was observed, so that records emitted under earlier schema versions can still be interpreted correctly.
Both files live under {app.data}/debezium_data by default and are flushed on a regular interval. When the flow stops — whether by a graceful stop, a JVM crash, a node failover, or anything in between — the last flushed offset is what the connector reads on restart, and streaming resumes from that point.
Etlworks CDC delivers events at least once. After a recovery the connector replays from the last flushed offset, not from the actual last-processed event, so it is normal to see a small number of duplicate events at the seam. The standard pattern is to use the MERGE action on the load step (or the equivalent upsert on NoSQL destinations) so that duplicates are absorbed idempotently. The recipe articles for each destination type already use this pattern.
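The effect of an idempotent load on replayed events can be illustrated with a small Python sketch. It models the destination as a dictionary keyed by primary key; the event shape (id, op, row) is a hypothetical simplification for this example and is not the format Etlworks uses internally.

```python
def apply_events(table: dict, events: list) -> dict:
    """Apply change events as upserts and deletes keyed by primary key.

    Replaying a trailing run of events a second time leaves the table
    unchanged, which is why at-least-once delivery is safe with MERGE.
    """
    for event in events:
        key = event["id"]
        if event["op"] == "delete":
            table.pop(key, None)       # deleting a missing row is a no-op
        else:
            table[key] = event["row"]  # insert or update: keep the latest image
    return table


events = [
    {"id": 1, "op": "insert", "row": {"name": "a"}},
    {"id": 2, "op": "insert", "row": {"name": "b"}},
    {"id": 1, "op": "update", "row": {"name": "a2"}},
    {"id": 2, "op": "delete", "row": None},
]

table = apply_events({}, events)
before_replay = dict(table)

# Simulate a recovery that replays the last two events from the flushed offset.
apply_events(table, events[-2:])
```

An append-only load would instead write the replayed events as extra rows, which is the duplication that MERGE avoids.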
While the connector is recovering it may show up in the UI as Running without producing any throughput — the flow is alive, the snapshot or replay is in progress, but no records have been forwarded yet. This is sometimes called the “fake running” state. It resolves on its own once the recovery catches up. If it persists for longer than expected, check the flow log for the source database or storage error described in the edge cases below.
Edge cases where manual intervention is needed
The cases below break automatic recovery. Each one has a recognizable symptom in the flow log and a defined manual procedure.
Offset or history file was deleted
Symptom. On restart the connector behaves as if it has never run: it begins a fresh snapshot of every monitored table, or starts streaming from the oldest available position depending on snapshot mode. In the log you will see a message indicating that no offset was found.
Cause. Either the file was deleted manually (most common), the storage backing {app.data}/debezium_data was wiped or replaced without restoring its contents, or the file was kept outside the auto-backup tree and the host that held it was replaced. See CDC settings reference → Offset File Name and DDL History File Name for why keeping files outside {app.data}/debezium_data is unsafe.
Resolution. If you have an offset backup from before the loss, restore the most recent backup into {app.data}/debezium_data and restart the flow. If you do not, the only option is a full reset (see Resetting a CDC Flow) followed by a re-snapshot. Whether the destination needs to be cleared first depends on whether your load step is idempotent — MERGE-based loads can absorb the re-snapshot without manual cleanup.
Offset or history file is corrupted
Symptom. The flow fails at startup with a deserialization or parse error referencing the offset or history file. Successive restarts produce the same error.
Cause. Usually a power loss or kernel panic that interrupted a flush. Less commonly, an external process modified the file.
Resolution. Restore the most recent good backup of the file from {app.data}/debezium_data/backup. If automatic backup was not configured, the only option is a full reset and re-snapshot.
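For hosts where you have direct filesystem access, the restore step can be sketched in Python as follows. This is an illustration under several assumptions: the backup uses the default timestamp format MMddyyyyHHmmss, the archive holds the offset and history files at its root, and the function names are hypothetical. Stop the flow before restoring, and inspect a backup archive before relying on this layout.

```python
import zipfile
from datetime import datetime
from pathlib import Path

ASSUMED_FORMAT = "%m%d%Y%H%M%S"  # assumed equivalent of MMddyyyyHHmmss


def latest_backup(backup_dir: Path, connection_name: str):
    """Return the newest backup for a connection, or None if there is none."""
    prefix = f"{connection_name}_"
    candidates = []
    for path in backup_dir.iterdir():
        if path.suffix != ".zip" or not path.name.startswith(prefix):
            continue
        try:
            taken_at = datetime.strptime(path.stem[len(prefix):], ASSUMED_FORMAT)
        except ValueError:
            continue  # name does not carry a timestamp in the assumed format
        candidates.append((taken_at, path))
    if not candidates:
        return None
    return max(candidates, key=lambda item: item[0])[1]


def restore_latest_backup(backup_dir: Path, data_dir: Path,
                          connection_name: str) -> Path:
    """Extract the newest backup into data_dir and return the archive used."""
    newest = latest_backup(backup_dir, connection_name)
    if newest is None:
        raise FileNotFoundError(f"no backup found for {connection_name}")
    with zipfile.ZipFile(newest) as archive:
        archive.extractall(data_dir)  # overwrites files with the same name
    return newest
```

The sketch picks the newest archive by parsing the timestamp rather than by sorting filenames, because a month-first format does not sort chronologically across years.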
Storage backing {app.data}/debezium_data is unavailable
Symptom. The flow fails at startup with an I/O error before it can read the offset. Or, if it does start, the first checkpoint attempt fails with a permission or path error.
Cause. The volume that holds {app.data} is unmounted, the storage backend (NFS, EFS, cloud disk) is unreachable, or the directory permissions changed.
Resolution. Fix the underlying storage problem — remount the volume, restart the storage service, restore permissions, or fail over the host. Once the connector can read and write {app.data}/debezium_data the flow resumes automatically from the last good offset. Do not attempt to work around the problem by relocating the offset and history files; the auto-backup machinery only covers the default location.
Storage volume is read-only
Symptom. The flow starts and runs, but offset flushes silently fail or log warnings. On the next restart the connector resumes from a much older position than expected, sometimes triggering a re-snapshot.
Cause. The volume backing {app.data} is mounted read-only. We have seen this happen after an emergency boot, a misconfigured cloud disk attachment, or after a filesystem error caused the kernel to remount read-only.
Resolution. Stop the flow, fix the mount (remount read-write, replace the disk, or recover from an outright filesystem failure), then restart the flow. The connector will resume from the last successful flush. Some duplicate events at the seam are expected and absorbed by MERGE on the load step.
Position in the offset file refers to a record the connector cannot process
Symptom. The flow starts, the connector resumes from the offset, and the first event triggers an unrecoverable parse error or schema mismatch. Successive restarts reproduce the error at the same position.
Cause. Rare. Possible causes include a source database change that the schema history did not capture (manual DDL with replication disabled, then re-enabled), a binary event that the upstream Debezium engine cannot interpret in the current configuration, or a known-bad event that needs to be skipped past.
Resolution. Two options:
- If the bad event is well-understood and can be safely skipped, edit the offset file to point just past it, then restart. Use the in-UI editor described in Edit CDC Offset Files — it backs up the current file before writing and validates the new JSON. If you are unsure what value to write, contact Etlworks support with the flow log and the current offset content (the editor displays both safely).
- If skipping is not safe, perform a full reset (see Resetting a CDC Flow) and re-snapshot. The bad event will not reappear because the snapshot reads the source's current state, not the historical change feed.
Recovery readiness checklist
To keep recovery automatic in as many situations as possible:
- Keep Offset File Name and DDL History File Name on the CDC connection empty (or, if you must override them, set them to paths under {app.data}/debezium_data). See CDC settings reference.
- Configure automatic offset backup with a retention window that covers the longest credible outage you need to recover from.
- Make sure the destination load step uses an idempotent action (MERGE or upsert) so that the small number of replayed events at recovery seams do not produce duplicates downstream.
- Monitor the flow with the metrics described below. A persistent “Running, no throughput” state is the most useful indicator that recovery is in progress or stuck.
Monitoring CDC Flows (Real-Time Metrics)
Etlworks captures real-time performance metrics for each active CDC stream. Metrics are updated every 60 seconds.
Available metrics
- Last Update (timezone): Timestamp of the last internal checkpoint (in server timezone).
- Last Record Processed (time ago): Human-readable time since the last CDC event was processed.
- Messages/sec: Rate of change events processed per second during the last interval.
- Last Record Latency: Difference (in ms) between the source event timestamp and the time it was processed.
- Max Latency / Min Latency: Highest/lowest latency recorded since last restart.
- Avg Latency: Average latency since last restart.
- Total Records Processed: Total number of events processed since last restart.
- Records Processed Since Last Update: Number of events processed in the last 60 seconds.
Accessing CDC Metrics via the UI
To view CDC metrics in the Etlworks UI:
- Go to Flows or Schedules and click Running next to the CDC flow.
- Click View Running Tasks.
- Scroll to the bottom of the Running Tasks window to see metrics.
You can also:
- Click Refresh to update the metrics manually.
- Enable Auto Refresh for real-time updates.
Accessing CDC Metrics via API
Get CDC Metrics for a Specific Flow
Endpoint:
GET /etl/rest/v1/tasks/{audit_id}/?type=CDC%20stream
Example:
https://app.etlworks.com/etl/rest/v1/tasks/80610/?type=CDC%20stream
Headers:
Authorization: Bearer <access-token>
Response format (JSON):
[{
  "requestId": "string",
  "owner": "string",
  "name": "string",
  "started": 1661534958477,
  "duration": 1023,
  "code": "CDC metrics",
  "type": "CDC",
  "tenant": "cdc_prod",
  "messagesPerSecond": 240.7,
  "latency": 150,
  "maxLatency": 300,
  "minLatency": 100,
  "avgLatency": 180,
  "maxLatencyDate": 1661534999999,
  "recordsProcessed": 14500,
  "recordsProcessedSinceLastCheck": 950,
  "lastCheck": 1661535040000,
  "lastTimeRecordsReceivedDate": 1661535038000
}]
Response codes:
- 200: Success
- 401: Not authorized
- 403: Forbidden
- 500: Internal server error
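A call to this endpoint can be sketched in Python using only the standard library. The function names are hypothetical, and the sketch assumes that the timestamp fields are epoch milliseconds and the latency fields are milliseconds, which matches the sample values above but should be verified against your instance.

```python
import json
import urllib.request
from datetime import datetime, timezone


def build_request(base_url: str, audit_id: int, token: str) -> urllib.request.Request:
    """Build the GET request for the CDC metrics of one flow."""
    url = f"{base_url}/etl/rest/v1/tasks/{audit_id}/?type=CDC%20stream"
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


def fetch_cdc_metrics(base_url: str, audit_id: int, token: str,
                      timeout: float = 30.0) -> list:
    """Call the endpoint and return the decoded JSON list.

    urlopen raises urllib.error.HTTPError for 401, 403, and 500 responses.
    """
    request = build_request(base_url, audit_id, token)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def summarize(metric: dict) -> dict:
    """Pick a few fields and convert the assumed epoch-millisecond timestamp."""
    last_received = datetime.fromtimestamp(
        metric["lastTimeRecordsReceivedDate"] / 1000, tz=timezone.utc)
    return {
        "name": metric["name"],
        "messages_per_second": metric["messagesPerSecond"],
        "avg_latency_ms": metric["avgLatency"],
        "last_record_received_utc": last_received,
    }
```

A typical use would be `summarize(fetch_cdc_metrics("https://app.etlworks.com", 80610, token)[0])`, where `token` is an access token you have already obtained.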
Get CDC Metrics for All Running Flows
Endpoint:
GET /etl/rest/v1/tasks/?type=CDC%20stream
Example:
https://app.etlworks.com/etl/rest/v1/tasks/?type=CDC%20stream
Headers are the same as for the individual flow endpoint.
Response format:
A list of CDC metrics objects, one per running flow. Each includes the same fields as above.
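The list response lends itself to a simple check for flows that are running without throughput. The following Python sketch flags such flows; the function name and the 15-minute idle threshold are hypothetical, and the field semantics are inferred from the field names in the sample response.

```python
FIFTEEN_MINUTES_MS = 15 * 60 * 1000  # hypothetical threshold; tune per flow


def find_stalled_flows(metrics: list, now_ms: int,
                       max_idle_ms: int = FIFTEEN_MINUTES_MS) -> list:
    """Return (name, idle_ms) for flows with no recent records.

    A flow is reported when it processed nothing in the last interval and
    the last record arrived more than max_idle_ms ago.
    """
    stalled = []
    for metric in metrics:
        idle_ms = now_ms - metric["lastTimeRecordsReceivedDate"]
        if metric["recordsProcessedSinceLastCheck"] == 0 and idle_ms > max_idle_ms:
            stalled.append((metric["name"], idle_ms))
    return stalled


sample = [
    {"name": "orders_cdc", "recordsProcessedSinceLastCheck": 950,
     "lastTimeRecordsReceivedDate": 1661535038000},
    {"name": "audit_cdc", "recordsProcessedSinceLastCheck": 0,
     "lastTimeRecordsReceivedDate": 1661531438000},
]
stalled = find_stalled_flows(sample, now_ms=1661535040000)
```

A quiet source database also produces no events, so a flagged flow is a prompt to check the flow log rather than proof of a failure.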