Collect Apache Kafka data into your data warehouse or ours. Matatika pipelines take care of data collection and preparation for your analytics and BI tools.
Apache Kafka is a distributed streaming platform.
Apache Kafka enables real-time processing of data streams, transferring large volumes of data between systems and applications in a scalable, fault-tolerant way. It provides a publish-subscribe messaging system: producers send messages to a topic, which one or more consumers then read. Kafka also offers data replication, fault tolerance, and horizontal scalability, making it a popular choice for building real-time data pipelines and streaming applications.
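The publish-subscribe flow can be illustrated with a minimal in-memory sketch (plain Python, not the Kafka client API): producers append records to a topic's log, and each consumer group tracks its own offset into that log, so groups consume independently.

```python
class Topic:
    """A minimal in-memory stand-in for a Kafka topic: an append-only log."""

    def __init__(self):
        self.log = []      # ordered sequence of messages
        self.offsets = {}  # consumer group -> next offset to read

    def produce(self, message):
        """Producers append to the end of the log."""
        self.log.append(message)

    def consume(self, group):
        """Each consumer group reads from its own committed offset."""
        offset = self.offsets.get(group, 0)
        records = self.log[offset:]
        self.offsets[group] = len(self.log)  # commit the new offset
        return records


topic = Topic()
topic.produce("order-created")
topic.produce("order-shipped")

print(topic.consume("billing"))    # ['order-created', 'order-shipped']
print(topic.consume("billing"))    # [] -- this group already consumed everything
print(topic.consume("analytics"))  # an independent group sees the full log
```

Real Kafka adds partitioning, replication, and durable storage on top of this model, but the offset-per-consumer-group idea is the same.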
This data source can be configured with the following settings:

- Topic: The name of the Kafka topic to connect to.
- Group ID: The ID of the consumer group to which the client belongs.
- Kafka Brokers: The list of Kafka brokers to connect to.
- Primary Keys: The field(s) used as the primary key for the messages.
- Use Message Key: Whether to use the message key as the primary key.
- Initial Start Time: The timestamp from which to start consuming messages.
- Max Runtime: The maximum amount of time the client will run before shutting down.
- Commit Interval: The interval at which the client commits offsets.
- Consumer Timeout: The maximum amount of time the client will wait for new messages.
- Session Timeout: The maximum amount of time a consumer session can be inactive before it is considered dead.
- Heartbeat Interval: The interval at which the client sends heartbeat messages to the broker.
- Max Poll Records: The maximum number of records to fetch in a single poll.
- Max Poll Interval: The maximum amount of time to wait for new records in a single poll.
- Message Format: The format of the messages being consumed.
- Protobuf Schema: The schema for the Protobuf messages being consumed.
- Protobuf Classes Directory: The directory containing the generated Protobuf classes.
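The settings above can be sketched as a pipeline configuration. The property names and values below are illustrative only (modeled on the commonly used tap-kafka connector) and may differ in your environment; check your connector's documentation before copying.

```yaml
# Illustrative Kafka extractor configuration -- property names are assumptions.
config:
  bootstrap_servers: "broker1:9092,broker2:9092"  # list of Kafka brokers
  topic: "orders"                                 # Kafka topic to connect to
  group_id: "matatika-pipeline"                   # consumer group ID
  primary_keys:                                   # field(s) used as the primary key
    id: "/payload/id"
  use_message_key: true           # use the message key as the primary key
  initial_start_time: "earliest"  # timestamp to start consuming from
  max_runtime_ms: 300000          # max time the client runs before shutting down
  commit_interval_ms: 5000        # how often offsets are committed
  consumer_timeout_ms: 10000      # max wait for new messages
  session_timeout_ms: 30000       # inactivity before a session is considered dead
  heartbeat_interval_ms: 10000    # heartbeat frequency to the broker
  max_poll_records: 500           # max records fetched in a single poll
  max_poll_interval_ms: 300000    # max wait for new records in a single poll
  message_format: "protobuf"      # format of the consumed messages
  proto_schema: "orders.proto"    # schema for the Protobuf messages
  proto_classes_dir: "/tmp/proto" # directory of generated Protobuf classes
```

Timeout and interval values are in milliseconds here; reasonable defaults depend on your broker setup and message volume.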
Extract, Transform, and Load Apache Kafka data into your data warehouse or ours.