Apache Kafka is a distributed streaming platform. Read about Kafka.
When to use this connector
- to read messages from and write messages to one or more Kafka topics.
- to implement log-based CDC with a message queue.
- to implement real-time change replication with Kafka and Debezium.
Creating a connection
Step 1. In the Connections window, click the + button and type in kafka.
Step 2. Select Kafka
Step 3. Enter the connection parameters
- Bootstrap server(s) - a connection string in the form host1:port1,host2:port2,... Since these servers are used only for the initial connection to discover the full cluster membership (which may change dynamically), this list need not contain the full set of servers (you may want more than one, though, in case a server is down).
- Topic(s) - a topic to read messages from or write messages to. For reading, wildcard topic names (for example, inbound.*) and comma-separated topic names (for example, topic1,topic2) are supported.
- Host and Port - as an alternative to the list of bootstrap servers, you can enter a single Kafka server host and port.
- Simple Authentication and Security Layer Mechanisms - choose between scram (default) and plain.
- User - a user name used for PLAIN authentication. Read about Kafka security.
- Password - a password used for PLAIN authentication.
- Properties - additional properties for the Kafka consumer, Kafka producer, and Kafka security. The properties must be in the format key1=value1;key2=value2.
- Auto Commit - if enabled, the Kafka consumer will periodically commit the offset when reading the messages from the queue. It is recommended to keep it disabled so the system can commit the offset right after the messages have been processed.
- Starting Offset - a starting offset at which to begin the fetch.
- Key Deserializer - the deserializer for the key.
- Value Deserializer - the deserializer for the value. When the value is a document in Avro format, use either Avro (when processing messages enqueued by Etlworks Integrator) or Avro Record (when processing messages enqueued by a third-party application). The latter requires an Avro Schema.
- Max number of records to read - the total maximum number of records to read in one micro-batch. The default limit is 1000000.
- Poll duration - how long (in milliseconds) the consumer should wait while fetching the data from the queue. The default is 1000 milliseconds.
- Max number of records to poll - the maximum number of records which can be fetched from the queue in a single poll call.
- Number of retries before stop polling - the number of times to retry before polling stops when a poll returns no records. The default is 5.
- Integration with CDC providers - select the CDC provider(s) if you are planning to use this connection for capturing and processing CDC events. Currently only Debezium is supported.
- Key Serializer - the serializer for the key.
- Value Serializer - the serializer for the value. Use Avro when writing messages in Avro format.
- Compression - the compression algorithm used when writing messages.
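The string formats expected by the Bootstrap server(s) and Properties fields can be illustrated with a small Python sketch. The function names are hypothetical and this is not the connector's own parsing code; it only shows the shape of valid input. The property keys in the example (session.timeout.ms, client.id) are standard Kafka client settings chosen for illustration.

```python
def parse_bootstrap_servers(value):
    """Split 'host1:port1,host2:port2' into a list of (host, port) tuples."""
    servers = []
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate a trailing comma
        host, _, port = item.rpartition(":")
        if not host or not port.isdigit():
            raise ValueError(f"expected host:port, got {item!r}")
        servers.append((host, int(port)))
    return servers


def parse_properties(value):
    """Split 'key1=value1;key2=value2' into a dict of strings."""
    props = {}
    for pair in value.split(";"):
        pair = pair.strip()
        if not pair:
            continue  # tolerate a trailing semicolon
        key, sep, val = pair.partition("=")
        if not sep or not key.strip():
            raise ValueError(f"expected key=value, got {pair!r}")
        props[key.strip()] = val.strip()
    return props


servers = parse_bootstrap_servers("host1:9092,host2:9092")
props = parse_properties("session.timeout.ms=30000;client.id=etl-reader")
print(servers)
print(props)
```

Note that a repeated key, as in key1=value1;key1=value2, would silently keep only the last value in this sketch, which is one reason to give each property a distinct key.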
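One way to picture how the four read-side limits (Max number of records to read, Poll duration, Max number of records to poll, Number of retries before stop polling) might interact is the simulation below. It is a sketch under assumptions, not the connector's actual logic: poll_fn is a hypothetical stand-in for a Kafka consumer poll, the per-poll limit of 500 is an illustrative value, and the empty-poll counter is assumed to reset after any poll that returns records.

```python
def read_micro_batch(poll_fn, max_records=1_000_000, poll_duration_ms=1000,
                     max_poll_records=500, retries=5):
    """Collect one micro-batch by polling until a limit is reached.

    poll_fn(timeout_ms, limit) is a hypothetical callable returning a list
    of at most `limit` records, or an empty list if nothing arrived in time.
    """
    batch = []
    empty_polls = 0
    while len(batch) < max_records and empty_polls < retries:
        # Never ask for more than the batch still has room for.
        limit = min(max_poll_records, max_records - len(batch))
        records = poll_fn(poll_duration_ms, limit)
        if not records:
            empty_polls += 1  # count consecutive empty polls
            continue
        empty_polls = 0  # assumption: a non-empty poll resets the counter
        batch.extend(records[:limit])
    return batch


# Usage with an in-memory queue standing in for a topic.
queue = list(range(12))

def fake_poll(timeout_ms, limit):
    taken = queue[:limit]
    del queue[:limit]
    return taken

batch = read_micro_batch(fake_poll, max_records=10, max_poll_records=4)
print(len(batch))  # 10
```

In this run the batch fills in three polls (4, 4, and 2 records) and the remaining two records stay in the queue for the next micro-batch. With an empty queue, the loop would give up after the configured number of empty polls.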