When to use MongoDB connectors
MongoDB is a document-oriented database designed for scalability, flexible schemas, and efficient querying and indexing.
Etlworks provides native MongoDB connectors that allow you to:
- Read data from MongoDB
- Write data into MongoDB
- Capture document-level changes using Change Data Capture (CDC)
Depending on your use case, Etlworks offers multiple MongoDB connectors optimized for different access patterns.
Related articles: MongoDB connectors
MongoDB connector
Use this connector when working with large collections where documents have a consistent structure.
Typical use cases include ETL and streaming pipelines that process data record by record.
Behavior:
- When loading data into MongoDB: each source record is written as a separate MongoDB document.
- When extracting data from MongoDB: documents are streamed one by one to the destination.
Because documents are processed one at a time, this approach is memory-efficient and suitable for large collections.
MongoDB document connector
Use this connector when working with specific documents or when you need to filter documents and process them as a single logical payload.
Behavior:
- When loading data into MongoDB: the connector creates one MongoDB document that contains all records from the source dataset.
- When extracting data from MongoDB: all matching documents are read, combined into a JSON array, and sent to the destination as a single payload.
This connector is typically used for document-level operations, configuration data, or file-style document management.
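The difference between the two connectors can be pictured with a minimal Python sketch. It only models the shape of the data; it does not call Etlworks or a MongoDB driver. The function names and the wrapping field name `records` are hypothetical, since the exact layout of the single document is not specified here.

```python
import json

def write_per_record(records):
    """MongoDB connector style: one document per source record."""
    return [dict(record) for record in records]

def write_as_single_document(records, field="records"):
    """MongoDB document connector style: one document holding all records.

    The wrapping field name is hypothetical; the actual layout may differ.
    """
    return {field: [dict(record) for record in records]}

def extract_as_single_payload(documents):
    """Document connector extract: matching documents combined into one JSON array."""
    return json.dumps(list(documents))

# Hypothetical source dataset.
records = [
    {"order_id": 1, "status": "new"},
    {"order_id": 2, "status": "paid"},
]

per_record = write_per_record(records)      # two separate documents
single = write_as_single_document(records)  # one document with both records
payload = extract_as_single_payload(per_record)  # one JSON array string
```

With the first style, memory use stays roughly constant per record; with the second, the whole dataset is held in one payload, which is why it suits small, document-style data.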
Change Data Capture MongoDB connector
The MongoDB CDC connector monitors MongoDB for document-level changes and emits them as events.
Supported MongoDB deployments:
- Replica sets
- Sharded clusters
CDC events can be streamed to databases, data warehouses, message queues, or other destinations.
Read how to create and configure a MongoDB CDC Connection.
Creating a MongoDB Connection
Create a MongoDB Connection from the Connections window:
- Click +.
- Type mongo.
- Depending on your use case, select one of:
  - MongoDB CDC
  - MongoDB
  - MongoDB Document
The selected connection type determines how documents are read and written.
Connection parameters
Core Parameters
- URL: Required. MongoDB connection string. See the MongoDB documentation for supported connection string formats.
- Database: Required. Name of the MongoDB database.
- Collection: Required. Name of the MongoDB collection within the database.
- User: Optional. MongoDB username.
- Password: Optional. MongoDB password.
Object ID Handling When Reading
What to do with the Object ID (_id) when reading
Controls how the _id field is handled when documents are read.
Options:
remove (default)
The _id field is removed from the dataset.
keep
The _id field is kept unchanged.
flatten
Converts ObjectId to a string value.
Example:
"_id": ObjectId("54759eb3c090d83494e2d804")
becomes:
"_id": "54759eb3c090d83494e2d804"
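The three options can be sketched in Python as follows. This is an illustration of the described behavior, not the connector's implementation. The `ObjectId` class is a hypothetical stand-in for the driver's type so that the example runs without any MongoDB library, and `handle_object_id` is an assumed name.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ObjectId:
    """Hypothetical stand-in for a driver ObjectId; holds the 24-character hex value."""
    hex: str

def handle_object_id(doc, mode="remove"):
    """Return a copy of doc with _id handled according to the chosen option."""
    if mode not in ("remove", "keep", "flatten"):
        raise ValueError(f"unknown mode: {mode}")
    result = dict(doc)
    if mode == "remove":
        # Drop the _id field entirely (the default option).
        result.pop("_id", None)
    elif mode == "flatten" and isinstance(result.get("_id"), ObjectId):
        # Replace the ObjectId with its plain string value.
        result["_id"] = result["_id"].hex
    # mode == "keep": leave _id unchanged.
    return result

doc = {"_id": ObjectId("54759eb3c090d83494e2d804"), "first": "Simba"}
removed = handle_object_id(doc)
kept = handle_object_id(doc, "keep")
flattened = handle_object_id(doc, "flatten")
```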
Writing Behavior
What to do with the existing document when writing
Controls how documents are handled when a matching document already exists.
Options:
insert
Always insert new documents.
Fastest option. No lookup is performed.
update
Update only fields provided in the payload.
Existing fields not present in the payload remain unchanged.
replace
Replace the entire existing document with the new payload.
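A minimal Python sketch of the three write modes, using an in-memory list in place of a collection. Matching on `_id` and appending the document when nothing matches are both assumptions made for this sketch; the connector's actual matching rules may differ.

```python
def write(collection, doc, mode="insert"):
    """Apply one document to an in-memory stand-in for a MongoDB collection.

    collection is a list of dicts. Matching is done on _id (an assumption).
    """
    if mode not in ("insert", "update", "replace"):
        raise ValueError(f"unknown mode: {mode}")
    if mode == "insert":
        # No lookup: the document is always added.
        collection.append(dict(doc))
        return
    existing = next((d for d in collection if d.get("_id") == doc.get("_id")), None)
    if existing is None:
        # Assumption of this sketch: add the document when nothing matches.
        collection.append(dict(doc))
    elif mode == "update":
        # Only fields present in the payload change; the rest are kept.
        existing.update(doc)
    else:
        # replace: the stored document becomes exactly the payload.
        existing.clear()
        existing.update(doc)

collection = [{"_id": 1, "name": "Simba", "age": 3}]
write(collection, {"_id": 1, "age": 4}, mode="update")
after_update = [dict(d) for d in collection]
write(collection, {"_id": 1, "age": 5}, mode="replace")
after_replace = [dict(d) for d in collection]
write(collection, {"_id": 1, "age": 6}, mode="insert")
```

After the update, `name` is still present; after the replace, it is gone. The final insert adds a second entry without checking for a match, which is why insert is the fastest option.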
CDC Delete Handling
Perform delete on matching MongoDB document for CDC delete operation
When enabled, the connector performs a delete operation when a CDC event with operation type d (delete) is received.
- Enabled: delete events remove matching MongoDB documents
- Disabled (default): delete events are ignored
Create (c) and update (u) CDC operations are always processed.
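The rule above can be sketched in Python. The event layout (`op`, `id`, `doc` keys) and the function name are hypothetical and chosen only for illustration; real CDC events carry more fields.

```python
def apply_cdc_event(store, event, delete_enabled=False):
    """Apply one CDC event to store, a dict keyed by document id.

    The event shape {"op", "id", "doc"} is an assumption of this sketch.
    Returns a short label describing what happened.
    """
    op = event["op"]
    if op in ("c", "u"):
        # Create and update events are always processed.
        store[event["id"]] = event["doc"]
        return "applied"
    if op == "d":
        if delete_enabled:
            store.pop(event["id"], None)
            return "deleted"
        # Default behavior: delete events are ignored.
        return "ignored"
    return "unsupported"

store = {}
r1 = apply_cdc_event(store, {"op": "c", "id": 1, "doc": {"first": "Simba"}})
r2 = apply_cdc_event(store, {"op": "d", "id": 1})
still_present = 1 in store
r3 = apply_cdc_event(store, {"op": "d", "id": 1}, delete_enabled=True)
```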
Performance and Reliability Settings
- Batch Size: Number of documents processed per batch. If not set, or set to 1 or less, batching is disabled.
- Write Concern: Controls the acknowledgment level for write operations. Available options:
  - Unacknowledged
  - Acknowledged
  - Journaled
  - Replica Acknowledged
  - Majority
  Higher levels increase durability but may reduce performance.
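To illustrate the Batch Size rule, here is a small Python sketch that groups documents into batches. It is a generic chunking helper under the stated rule, not the connector's code, and the name `batches` is hypothetical.

```python
def batches(docs, batch_size=None):
    """Yield lists of documents.

    If batch_size is None or <= 1, batching is disabled and every
    document is yielded on its own.
    """
    if batch_size is None or batch_size <= 1:
        for doc in docs:
            yield [doc]
        return
    batch = []
    for doc in docs:
        batch.append(doc)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        # Final, possibly smaller, batch.
        yield batch

batched = list(batches(range(5), batch_size=2))
unbatched = list(batches(range(3)))
```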
Timestamp Conversion
Enable Timestamp Conversion
- Enable: Converts timestamp strings that match a Java regex to MongoDB ISODate objects.
- Disable: Leaves timestamp values as strings.
Timestamp Conversion Regex Pattern
If conversion is enabled and no pattern is provided, the default ISO 8601 pattern is used:
^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d{3})?(Z|[+-]\d{2}:\d{2})$
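The default pattern can be tried out with a short Python sketch. The connector uses a Java regex; this particular pattern behaves the same way in Python's `re` module for the inputs shown. The helper `convert_timestamp` is hypothetical, and it returns a Python `datetime` in place of a MongoDB ISODate.

```python
import re
from datetime import datetime

# The default ISO 8601 pattern, copied verbatim.
ISO_8601 = re.compile(
    r"^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d{3})?(Z|[+-]\d{2}:\d{2})$"
)

def convert_timestamp(value):
    """Return a datetime for strings matching the pattern; otherwise return value unchanged."""
    if isinstance(value, str) and ISO_8601.match(value):
        try:
            # Older Python versions do not accept a trailing "Z", so normalize it.
            return datetime.fromisoformat(value.replace("Z", "+00:00"))
        except ValueError:
            # Matches the pattern but is not a real date (for example, month 13).
            return value
    return value

utc = convert_timestamp("2024-01-15T10:30:00Z")
with_millis = convert_timestamp("2024-01-15T10:30:00.123+02:00")
no_zone = convert_timestamp("2024-01-15T10:30:00")
```

Note that the pattern requires a time zone designator (`Z` or an offset such as `+02:00`), so `2024-01-15T10:30:00` is left as a string.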
Explorer Settings
Number of Documents in Explorer
Maximum number of documents displayed in the Etlworks Explorer.
- Default: 1000
- Maximum: 9999
This setting affects Explorer only and has no impact on flow execution.
Filter (Deprecated)
Specifies which documents to retrieve.
Options:
- Fully qualified document name (document _id)
- Wildcard document name. Example: sales_orders*
- MongoDB query in JSON format. Example: {"first":"Simba"}
This parameter is deprecated.
Encoding
Optional encoding applied when creating MongoDB documents.
Use an SSH tunnel when MongoDB is not directly accessible due to firewall restrictions.
SSH Tunnel Parameters
- SSH Host: Hostname or IP address accepting SSH connections.
- SSH Port: Default is 22.
- SSH User: SSH username.
- SSH Password: Optional SSH password.
- Private Key File: Optional PEM or PPK file for key-based authentication. Keys can be uploaded via the UI.
- SSH Passphrase: Optional passphrase for the private key.
Configuration Notes
- Use the actual MongoDB hostname and port in the connection URL.
- Etlworks automatically maps the connection through localhost and an available local port.
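Conceptually, the mapping described in the notes above resembles the following Python sketch, which rewrites a connection URL to point at a local tunnel endpoint. This is only an illustration of the idea; how Etlworks performs the mapping internally is not documented here. The function name, hostname, and port number are hypothetical, and the sketch handles a single-host URL only.

```python
from urllib.parse import urlsplit, urlunsplit

def tunnel_url(url, local_port):
    """Return url with its host and port replaced by localhost:local_port.

    Any user information, path, and query options are preserved.
    """
    parts = urlsplit(url)
    userinfo, _, _ = parts.netloc.rpartition("@")
    netloc = f"localhost:{local_port}"
    if userinfo:
        netloc = f"{userinfo}@{netloc}"
    return urlunsplit(parts._replace(netloc=netloc))

# Hypothetical internal host and local port.
original = "mongodb://db.internal:27017/sales?replicaSet=rs0"
tunneled = tunnel_url(original, 50123)
```

You still enter the original URL (`db.internal:27017` in this example) in the connection; the rewritten form is what a tunneled client would effectively connect to.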