Overview
The CDC offset file records the position from which a Debezium-based CDC flow resumes reading the source database's change feed — for example, the MySQL binlog file and position, the PostgreSQL or SQL Server log sequence number, the Oracle SCN, or the MongoDB oplog timestamp. The connector writes to this file as it streams events, and reads from it on every restart.
In normal operation, users do not need to look at offset files. Recovery is automatic: when a flow restarts after a stop, a JVM crash, or a node failover, the connector picks up from the last flushed position and replays from there. See Recovery and Resumability for how automatic recovery works.
A small number of edge cases require manual editing of the offset file — most often because the recorded position refers to data the connector can no longer read, or because a source-side recovery has moved the consumable window past the connector's last known position. In the past these edits required server-side filesystem access. Etlworks now provides an in-UI editor that does the same work safely, with automatic backup and JSON validation. This article documents that editor.
Two ways to reach the editor
The same offset editor is reachable from two places:
From the CDC connection editor
Click the Offset File button in the bottom toolbar. The editor loads the offset file associated with that specific connection.
From Explorer
Browse to a CDC Offset and History connection, right-click an offset file, and choose Edit CDC Offset. The editor loads that specific file.
Both paths open the same modal and write through the same backend with the same safety checks.
When to use this feature
Editing the offset file is an advanced troubleshooting operation. Use it when one of the following applies:
- The recorded position points at data the source can no longer provide. For example, MySQL has purged the binlog file referenced in the offset, PostgreSQL no longer retains the WAL at the recorded LSN (typically because the connector's replication slot was dropped or invalidated), or the SQL Server CDC cleanup job has removed the change-table rows for the recorded LSN range. The flow fails on startup with a "position not found" or equivalent error.
- The recorded position contains a record the connector cannot process. The flow fails repeatedly at the same position with a deserialization error, a schema mismatch, or an unrecoverable parse failure. After confirming that the offending record can be safely skipped, you can move the position past it.
- A source-side recovery, restore, or migration has changed the log position you should resume from. For example, the source database was restored from a backup, point-in-time recovery moved the consumable window, or the database was failed over to a replica with a different log position.
- Support is guiding you through a recovery. Etlworks support sometimes asks customers to confirm or edit the current offset content. The editor lets you do that without granting filesystem access to the integrator host.
- You need to inspect the current resume position without writing to the file. Open the editor and use Close to leave the file unchanged.
If the symptom is “the flow keeps re-snapshotting on every restart” or “the flow shows Running but no events are produced,” review Recovery and Resumability first — the cause is often elsewhere (deleted offset file, unsafe offset path, read-only volume) and editing the offset will not help.
Warnings and prerequisites
Read this section before opening the editor.
- Stop the CDC flow first. Editing the offset while the connector is running can leave the file in an inconsistent state and the connector may overwrite your edit on its next flush. Stop the flow from the UI before making any changes.
- An incorrect edit can cause skipped events or duplicates. If the new position is ahead of the actual last-processed event, events between the old and new positions will not be delivered to the destination. If the new position is behind, events between the new and old positions will be replayed. Replayed events are absorbed by destinations that load with MERGE or upsert, but appear as duplicates in destinations that only append. Skipped events are not recoverable without re-snapshotting.
- An invalid position can fail the flow. A position the source database cannot resolve will fail the flow on the next start with the same kind of error you may already be debugging. The editor validates that the JSON is well-formed, but it does not validate that the position is meaningful to the source.
- Understand the source's position format before editing. The position is a JSON object whose keys depend on the source database — for example, MySQL uses file and pos, PostgreSQL uses lsn, SQL Server uses commit_lsn and change_lsn, Oracle uses scn, MongoDB uses sec and ord. The editor preserves whatever shape the connector already wrote; do not change keys, only their values.
- Etlworks creates a backup before saving, but you are responsible for validating the edit. The backup makes a bad edit reversible. It does not make the edit correct.
When in doubt, contact Etlworks support before saving. The editor's read-only display of the current offset is safe to share; it does not contain credentials.
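The ahead/behind distinction above can be checked before saving. The following is a minimal sketch for a MySQL-style position only, assuming the binlog file name ends in a numeric sequence (such as mysql-bin.000003) and that the position uses the file and pos keys. The function names and sample values are hypothetical and are not part of Etlworks.

```python
import re


def binlog_sort_key(position):
    """Order a MySQL-style position by binlog file sequence, then byte offset."""
    match = re.search(r"(\d+)$", position["file"])
    if match is None:
        raise ValueError(f"no numeric suffix in binlog file name: {position['file']!r}")
    return int(match.group(1)), int(position["pos"])


def classify_edit(old, new):
    """Return 'skip' if the edit moves forward, 'replay' if it moves backward."""
    old_key, new_key = binlog_sort_key(old), binlog_sort_key(new)
    if new_key > old_key:
        return "skip"      # events between old and new are never delivered
    if new_key < old_key:
        return "replay"    # events between new and old are delivered again
    return "unchanged"


if __name__ == "__main__":
    old = {"file": "mysql-bin.000003", "pos": 154}
    new = {"file": "mysql-bin.000004", "pos": 4}
    print(classify_edit(old, new))
```

Other sources order their positions differently (LSNs, SCNs, timestamps), so the comparison has to be adapted per source.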
Edit offsets from a CDC connection
Use this path when you are working on a specific CDC connection and want the editor to load the offset file Etlworks would use at runtime for that connection.
- Open the connection from Connections in the Etlworks UI.
- Stop the CDC flow that uses this connection, if it is currently running.
- In the connection editor's bottom toolbar, click the Offset File button.
- The CDC offset file modal opens with the offset file for this connection loaded. The path shown at the top of the modal is the resolved location, which is either the value of the connection's Offset File Name field if set, or the default name Etlworks derives from the connection name. In both cases the file lives under the tenant-aware {app.data}/debezium_data directory.
- If the offset file does not exist, the modal still opens, but Status reads as not found and there is nothing to edit. This means either that the flow has never run or that the offset file has been deleted. For the second case, see Offset or history file was deleted for the recovery path.
The rest of the workflow — reading the entries, editing the position, validating and saving — is the same as the Explorer path and is described in The CDC offset editor below.
Edit offsets from Explorer
Use this path when you are working in Explorer, browsing a CDC Offset and History storage connection that may contain offset files for multiple CDC connections side by side.
- Open Explorer in the Etlworks UI.
- Expand the CDC Offset and History connection that holds the offset file you want to edit. See CDC Storage Connectors for how this connection type is configured.
- Stop the CDC flow whose offset file you are about to edit, if it is currently running.
- Right-click the offset file in the file list and choose Edit CDC Offset. The action is enabled only for offset files. History files (and any other file in the same folder) do not show the action — see Limitations and safety checks.
- The CDC offset file modal opens with the chosen file loaded.
An offset file in this view is recognized by its name: the filename contains offset and does not contain history. For a CDC connection whose name produces files like orders_pipeline_offset.dat and orders_pipeline_history.dat, only the first is editable through this menu.
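The naming rule can be expressed in a few lines. This is an illustrative sketch, not the Etlworks implementation; it assumes a case-sensitive substring match, which the rule as stated does not specify.

```python
def is_offset_file(filename):
    """True when the name contains 'offset' and does not contain 'history'."""
    return "offset" in filename and "history" not in filename


if __name__ == "__main__":
    for name in ("orders_pipeline_offset.dat", "orders_pipeline_history.dat"):
        print(name, is_offset_file(name))
```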
The CDC offset editor
Both entry points open the same modal, titled CDC offset file. The modal shows the resolved offset file metadata at the top and one editable section per offset entry below it.
File metadata
The header shows the following read-only values:
- Offset file — the resolved absolute path of the file on the integrator host, under the tenant-aware {app.data}/debezium_data directory.
- Status — either Found (the file exists and was read successfully) or a not-found state when the connection has never run or the file has been removed.
- Size and Last modified — current file size in bytes and the timestamp of the last write. These are useful for confirming that the file is the one the connector is actually using.
- Backup folder — the directory Etlworks will write a backup copy to before applying your edit. This is always under the tenant-aware {app.data}/debezium_data/backup directory.
Offset entries
Below the header the modal shows one Offset entry N block for each entry in the file. Most CDC connections have a single entry; configurations that capture multiple databases or named partitions can have more than one. For each entry the modal displays:
- Key — the entry's identifying key, shown as read-only context. This typically includes the logical connector name and a server descriptor. Do not attempt to change the key — the connector matches entries by key on read.
- Position — the editable JSON value that describes where to resume. The format depends on the source database. This is the field you change. The editor is a standard Etlworks code editor with JSON syntax highlighting and dark-mode support.
The Position JSON keys are connector-specific. Typical examples:
| Source | Position keys you will see |
|---|---|
| MySQL | file, pos, ts_sec, transaction_id |
| PostgreSQL | lsn, txId, ts_usec |
| SQL Server | commit_lsn, change_lsn, event_serial_no |
| Oracle | scn, commit_scn |
| DB2 / AS/400 | change_lsn (DB2), seq (AS/400 journal sequence) |
| MongoDB | sec, ord, h |
Edit only the values that need to change. Leave any keys you do not understand at their current values. The connector ignores unrecognized keys, but missing keys it expects will cause it to fail on restart.
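One way to confirm that an edit changed values only is to compare the key sets of the loaded and edited positions before pasting the result into the editor. The sketch below assumes the position is a single JSON object; the helper name and the sample MySQL values are hypothetical.

```python
import json


def key_changes(loaded_text, edited_text):
    """Return (missing_keys, added_keys) between the loaded and edited position."""
    loaded = json.loads(loaded_text)
    edited = json.loads(edited_text)
    if not isinstance(loaded, dict) or not isinstance(edited, dict):
        raise ValueError("position must be a JSON object")
    missing = sorted(loaded.keys() - edited.keys())  # keys the connector may expect
    added = sorted(edited.keys() - loaded.keys())    # keys the connector did not write
    return missing, added


if __name__ == "__main__":
    loaded = '{"file": "mysql-bin.000003", "pos": 154, "ts_sec": 1700000000}'
    edited = '{"file": "mysql-bin.000004", "pos": 4, "ts_sec": 1700000000}'
    print(key_changes(loaded, edited))
```

An edit that changes values only returns two empty lists. Any entry in the first list is a key that was removed and may cause the connector to fail on restart.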
JSON validation and saving
The Save Offset button at the bottom of the modal is disabled until two conditions hold:
- The position JSON parses successfully (well-formed JSON).
- The position JSON differs from the value that was loaded (you have actually made an edit).
If the JSON is malformed — missing a closing brace, an unquoted key, a trailing comma — the editor highlights the error and the button stays disabled.
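The two conditions can be sketched as a single predicate. This is an approximation of the behavior described above, not the editor's actual code. It assumes the comparison is made on the parsed values, so a whitespace-only change does not count as an edit; the editor may compare differently.

```python
import json


def can_save(loaded_text, edited_text):
    """True when the edited text is well-formed JSON and differs from the loaded value."""
    try:
        edited = json.loads(edited_text)
    except json.JSONDecodeError:
        return False  # malformed: missing brace, unquoted key, trailing comma, ...
    # Assumes the loaded text was valid JSON when it was read from disk.
    return edited != json.loads(loaded_text)


if __name__ == "__main__":
    print(can_save('{"pos": 154}', '{"pos": 200}'))
    print(can_save('{"pos": 154}', '{"pos": 200,}'))
```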
The Reset button at the bottom-left restores the position to the value that was loaded from disk, discarding any edits made in this session. Close dismisses the modal without saving.
Clicking Save Offset opens a confirmation dialog that summarizes the action and reminds you that a backup will be created in the debezium_data/backup folder. Confirm to write the new value. On success the UI reports the path of the backup copy — record it in case you need to revert.
Backup and recovery
Etlworks writes a backup of the current offset file before applying any edit. The backup is a copy of the file as it stood at the moment of save, written into the {app.data}/debezium_data/backup directory shown in the modal header. The save operation itself is atomic: the new file is written to a temporary location, validated as readable, and then swapped into place. If any step fails, the original file is left unchanged.
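The backup-then-swap sequence follows a common pattern, sketched below. This is not the Etlworks implementation: it treats the offset file as plain JSON text for simplicity, and the backup file naming and the function name are hypothetical.

```python
import json
import os
import shutil
import tempfile
import time
from pathlib import Path


def save_with_backup(live_file, backup_dir, new_text):
    """Back up live_file, then replace it atomically with new_text."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    backup_path = backup_dir / f"{live_file.name}.{stamp}.bak"  # hypothetical naming
    shutil.copy2(live_file, backup_path)

    # Write to a temporary file in the same directory so the final rename
    # stays on one filesystem and is atomic.
    fd, tmp_name = tempfile.mkstemp(dir=live_file.parent, prefix=live_file.name + ".")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            tmp.write(new_text)
            tmp.flush()
            os.fsync(tmp.fileno())
        # Validate that the temporary file reads back as well-formed JSON.
        json.loads(Path(tmp_name).read_text(encoding="utf-8"))
        os.replace(tmp_name, live_file)  # atomic swap into place
    except Exception:
        # Any failure leaves the original file untouched.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
    return backup_path
```

Writing the temporary file next to the live file, rather than in a system temp directory, is what keeps the final rename atomic.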
To revert an edit, replace the live offset file with the corresponding backup. You can do this through any path that has access to the CDC Offset and History connection: copy the backup over the live file in Explorer, or, if backup retention has already rotated that copy into a backup zip, restore it from the zip. See Automatically Backing Up Offset and History Files for the broader backup configuration.
The same retention policy that applies to scheduled offset backups also applies to backups created by the editor.
Limitations and safety checks
The editor enforces several rules to prevent accidental damage. Both the connection-editor entry point and the Explorer entry point share the same backend, so the same rules apply regardless of how the editor was opened.
- History files are not editable. The Edit CDC Offset action does not appear for history files in Explorer, and the backend rejects any attempt to open one. History files have a different on-disk format and a different purpose — they record the schema of every monitored table at each DDL change so that historical events can still be interpreted — and they are not safe to edit through a simple JSON editor.
- The Explorer action is only available on connections of type CDC Offset and History. Other file-storage connection types do not show the action, and the backend rejects requests originating from any other connection type.
- File names with path separators are rejected. Blank filenames, filenames containing / or \, and anything that looks like a path-traversal attempt (such as .. segments) are rejected by the backend before the file is opened.
- Files outside the tenant's debezium data folder are rejected. The backend resolves the file inside the tenant-aware {app.data}/debezium_data directory and rejects any request that would resolve outside it. This applies to both entry points.
- Files that do not look like offset files are rejected. The filename must contain offset and must not contain history. This rule is enforced server-side, not just in the UI.
- The file must exist before it can be edited. If the connection has never run, the offset file has not been created yet, and there is nothing for the editor to load.
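The file-level rules above can be approximated in one function. This is a sketch under assumptions, not the backend code: base_dir stands in for the tenant-aware debezium_data directory, the function name is hypothetical, and rejecting any name that contains .. is stricter than rejecting .. segments only.

```python
from pathlib import Path


def resolve_offset_file(base_dir, filename):
    """Validate filename and return its resolved path inside base_dir."""
    if not filename.strip():
        raise ValueError("blank filename")
    if "/" in filename or "\\" in filename or ".." in filename:
        raise ValueError("filename must not contain path separators or '..'")
    if "offset" not in filename or "history" in filename:
        raise ValueError("not an offset file name")

    base = base_dir.resolve()
    candidate = (base / filename).resolve()  # also follows symlinks
    if base not in candidate.parents:
        raise ValueError("file resolves outside the data directory")
    if not candidate.is_file():
        raise FileNotFoundError(candidate)
    return candidate
```

Resolving the candidate path before the containment check means a symlink that points outside the directory is rejected as well.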
Troubleshooting notes
The Offset File button is missing on a connection
The button appears in the bottom toolbar of the connection editor for Debezium-backed CDC connections. If you do not see it on what you believe is a CDC connection, the connection type is probably something else — for example, a regular database connection that was created without the CDC flow type. Check the connection's connector class in Connections.
Edit CDC Offset is missing from the Explorer right-click menu
Three possible reasons:
- The connection is not a CDC Offset and History connection. The action only appears for that connection type.
- The file is not an offset file. The action only appears when the filename contains offset and does not contain history.
- The right-clicked item is a folder, not a file.
Save Offset stays disabled
The button is disabled until the JSON is well-formed and it differs from what was loaded. Check both conditions. The most common cause is an unbalanced brace or a trailing comma in the edited position.
The flow fails on restart after an edit
If the flow fails at the position you set, the source database almost certainly cannot resolve it: for example, the binlog file you named has been purged, the PostgreSQL LSN is outside the replication slot's window, or the Oracle SCN is too old. Either move the position to one the source can still serve, or restore the backup and contact support. To restore the backup, stop the flow if it is running, replace the live offset file with the corresponding backup from {app.data}/debezium_data/backup, and restart the flow.
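For administrators who have filesystem access to the integrator host, the restore step amounts to copying the recorded backup over the live file. The sketch below is one possible way to do that with the flow stopped; the function name and temporary-file suffix are hypothetical, and the paths passed in must be the ones shown in the modal header and reported by the UI on save.

```python
import os
import shutil
from pathlib import Path


def restore_offset(backup_path, live_file):
    """Replace live_file with a copy of backup_path, keeping the backup intact."""
    if not backup_path.is_file():
        raise FileNotFoundError(backup_path)
    # Copy next to the live file first, then swap, so a failed copy
    # never leaves a half-written offset file in place.
    tmp = live_file.with_name(live_file.name + ".restore.tmp")
    shutil.copy2(backup_path, tmp)
    os.replace(tmp, live_file)
```

Copying in Explorer, as described in Backup and recovery, achieves the same result without filesystem access.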
The modal opens but there are no offset entries
The file exists but is empty or contains no recognized entries. This usually means the connector started, created the file, and then failed before flushing any position. Stop the flow, delete the empty offset file (or rename it out of the way), and restart the flow — the connector will re-create it according to the configured snapshot mode. See Resetting a CDC Flow for the procedure.
See also
- Recovery and Resumability — how the connector resumes automatically and the edge cases that require intervention.
- Automatically Backing Up Offset and History Files — how scheduled backups complement the per-edit backup created by this editor.
- CDC settings reference — Offset File Name and DDL History File Name — the connection-level settings that control where these files live.
- CDC Storage Connectors — the connection type used by the Explorer entry point.