Kafka Explained: Architecture, Producers, Consumers and Best Practices

2 minute read

Kafka Streams

Photo by Caspar Camille Rubin on Unsplash

🎯 What is Kafka?

Official definition:

A distributed streaming platform for publishing, subscribing to, storing, and processing streams of events in real time.

In practical terms:

Kafka is a distributed messaging system that:

  • Handles millions of events per second
  • Stores events durably
  • Allows multiple independent consumers
  • Guarantees ordering within partitions
  • Scales horizontally

πŸ“ Core Architecture

Main Components

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚                  Kafka Cluster                  β”‚
β”‚                                                 β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”       β”‚
β”‚  β”‚ Broker 1 β”‚  β”‚ Broker 2 β”‚  β”‚ Broker 3 β”‚       β”‚
β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜       β”‚
β”‚                                                 β”‚
β”‚  Topic: flight-bookings                         β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”       β”‚
β”‚  β”‚Partition 0 β”‚Partition 1 β”‚Partition 2 β”‚       β”‚
β”‚  β”‚  Leader:B1 β”‚  Leader:B2 β”‚  Leader:B3 β”‚       β”‚
β”‚  β”‚Replicas:   β”‚Replicas:   β”‚Replicas:   β”‚       β”‚
β”‚  β”‚  B1,B2,B3  β”‚  B2,B3,B1  β”‚  B3,B1,B2  β”‚       β”‚
β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜       β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
         ↑                           ↓
    Producers                   Consumers
     (publish)                   (consume)

Topics and Partitions

A topic is a category or feed of messages.

Example:

Topic: "flight-bookings"

β”œβ”€β”€ Partition 0: [msg0, msg3, msg6, msg9]
β”œβ”€β”€ Partition 1: [msg1, msg4, msg7, msg10]
└── Partition 2: [msg2, msg5, msg8, msg11]

Characteristics:

  • A topic is divided into partitions
  • Each partition is an immutable ordered sequence
  • Messages are appended to the end (append-only log)
  • Each message has a unique offset

Partitioning Key

The producer decides the partition using a key.

producer.send(new ProducerRecord<>(
    "flight-bookings",
    bookingId,
    bookingData
));

Important rule:

Messages with the same key β†’ same partition β†’ ordering guaranteed

Example:

booking-123 β†’ Partition 0
booking-456 β†’ Partition 1
booking-123 β†’ Partition 0

Replication

Kafka replicates data for fault tolerance.

Topic: payments (replication-factor: 3)

Partition 0

Broker 1 (LEADER)
[msg0, msg1, msg2]

Broker 2 (FOLLOWER - ISR)
[msg0, msg1, msg2]

Broker 3 (FOLLOWER - ISR)
[msg0, msg1, msg2]

Key concepts:

Leader

  • Handles all reads and writes
  • Only one leader per partition

Followers

  • Replicate data from the leader
  • Can become leader if the current one fails

ISR (In-Sync Replicas)

Set of replicas that are fully synchronized.


Durability with acks

props.put("acks", "all");
props.put("acks", "1");
props.put("acks", "0");

Producers

Basic Configuration

Properties props = new Properties();

props.put("bootstrap.servers", "kafka1:9092,kafka2:9092");
props.put("key.serializer",
    "org.apache.kafka.common.serialization.StringSerializer");
props.put("value.serializer",
    "org.apache.kafka.common.serialization.StringSerializer");

props.put("acks", "all");
props.put("retries", 3);
props.put("compression.type", "snappy");

KafkaProducer<String, String> producer =
    new KafkaProducer<>(props);

Asynchronous Send

producer.send(record, (metadata, exception) -> {

    if (exception != null) {
        logger.error("Error sending message", exception);
    } else {
        logger.info(
            "partition={} offset={}",
            metadata.partition(),
            metadata.offset()
        );
    }

});

Synchronous Send

RecordMetadata metadata = producer.send(record).get();

Consumers

Basic Configuration

Properties props = new Properties();

props.put("bootstrap.servers", "kafka1:9092");
props.put("group.id", "booking-processors");

props.put("key.deserializer",
    "org.apache.kafka.common.serialization.StringDeserializer");

props.put("value.deserializer",
    "org.apache.kafka.common.serialization.StringDeserializer");

props.put("enable.auto.commit", "false");
props.put("auto.offset.reset", "earliest");

KafkaConsumer<String, String> consumer =
    new KafkaConsumer<>(props);

Poll Loop

while (true) {

    ConsumerRecords<String,String> records =
        consumer.poll(Duration.ofMillis(100));

    for (ConsumerRecord record : records) {

        process(record.value());

    }

    consumer.commitSync();
}

Consumer Groups

Example with three partitions:

Topic: flight-bookings

Consumer Group: booking-processors

Consumer 1 β†’ Partition 0
Consumer 2 β†’ Partition 1
Consumer 3 β†’ Partition 2

Offset Management

Manual commit example:

enable.auto.commit=false

consumer.commitSync();

Monitoring

Producer Metrics

record-send-rate
record-error-rate
request-latency-avg
buffer-available-bytes

Consumer Metrics

records-consumed-rate
fetch-latency-avg
commit-latency-avg
records-lag-max

Most Important Metric: Consumer Lag

Lag = Latest offset - Current offset

Example:

GROUP              TOPIC           PARTITION  CURRENT  LOG-END  LAG
booking-processors flight-bookings 0          150      150      0
booking-processors flight-bookings 1          200      250      50

Conclusion

Kafka is a core building block of modern event-driven architectures.

Key strengths:

  • Horizontal scalability
  • High durability
  • Massive throughput
  • Strong ecosystem

Used heavily in:

  • fintech
  • ecommerce
  • streaming platforms
  • data platforms
I won't give your address to anyone else, won't send you any spam, and you can unsubscribe at any time.
Disclaimer: Opinions are my own and not the views of my employer

Updated:

Comments