
Apache Kafka notes

Getting started

Documentation is really awesome, and the quickstart works.

Advice: don’t use the root ZooKeeper path; give Kafka its own path instead: --zookeeper localhost:2181/kafka works.

What if we have a ZK cluster? The ZooKeeper connection string syntax is --zookeeper localhost:2181,localhost:2182/kafka (the path is mentioned only once, at the very end).
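A minimal sketch of building such a connection string in a script. The zk_connect helper and the zk1..zk3 host names are made up for illustration; only the format itself (comma-separated host:port pairs, path appended once) comes from ZooKeeper:

```shell
# Join host:port pairs with commas and append the path once, at the end.
# zk_connect is a hypothetical helper, not part of Kafka or ZooKeeper.
zk_connect() {
    local chroot="$1"; shift
    local IFS=,
    printf '%s%s\n' "$*" "$chroot"
}

ZK="$(zk_connect /kafka zk1:2181 zk2:2181 zk3:2181)"
echo "$ZK"

# The result can then be passed to the Kafka tools, for example:
#   bin/kafka-topics.sh --zookeeper "$ZK" --list
```

Keeping the string in one variable makes it harder to accidentally point one tool at the root path and another at /kafka.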

Deployment considerations

Kafka and Zookeeper

Netflix says scary things: for example, that a Kafka cluster is not really good at surviving a ZK ensemble breakdown. They deploy a separate ZK cluster per Kafka cluster.

Offsets replication factor

You should increase the replication factor of the __consumer_offsets topic to at least 3. For example:
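One way to do this is the broker setting offsets.topic.replication.factor, which applies when the topic is first created. A minimal sketch for setting it in a properties file follows. The set_prop helper is hypothetical, and the config/server.properties path in the usage comment is an assumption about where your broker config lives:

```shell
# Set key=value in a properties file: replace the line if the key is
# already present, append it otherwise. A .bak copy of the file is kept.
# set_prop is a hypothetical helper written for this note.
set_prop() {
    local file="$1" key="$2" value="$3"
    if grep -q "^${key}=" "$file"; then
        sed -i.bak "s|^${key}=.*|${key}=${value}|" "$file"
    else
        printf '%s=%s\n' "$key" "$value" >> "$file"
    fi
}

# Usage (assumed path; run on every broker, then restart the broker):
#   set_prop config/server.properties offsets.topic.replication.factor 3
```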


If we do, we need to make sure that when this topic is created (e.g. when the first consumer connects) we have 3 live brokers. Otherwise, we may need to manually increase the replication factor for this topic.
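A rough sketch of such a check, built on the zookeeper-shell.sh trick from the operations notes below. It assumes the last output line of ls /brokers/ids looks like [0, 1, 2]; the count_brokers helper and the sample line are made up for illustration:

```shell
# Count the broker ids in a line such as "[0, 1, 2]" (assumed format).
# count_brokers is a hypothetical helper written for this note.
count_brokers() {
    printf '%s\n' "$1" | grep -Eo '[[:digit:]]+' | wc -l | tr -d ' '
}

# Sample line for illustration; in practice it would come from something like:
#   ids="$(bin/zookeeper-shell.sh localhost:2181/kafka ls /brokers/ids | tail -n1)"
ids='[0, 1, 2]'

if [ "$(count_brokers "$ids")" -ge 3 ]; then
    echo "enough brokers for replication factor 3"
else
    echo "fewer than 3 brokers registered"
fi
```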

Monitoring and management UIs



The Kafka REST Proxy can also be used.

Operations notes

Getting the list of active brokers

Somewhat ugly, but it works when you need it and don’t have a way to compile Java code handy:

# List the registered broker ids, one per line (egrep is deprecated, so use grep -E)
bin/zookeeper-shell.sh localhost:2181/kafka/local ls /brokers/ids \
    | tail -n1 | grep -Eo '[[:digit:]]+'

Managing partitions

Partitions can be reassigned manually:

  • if you need to change the replication factor of an existing topic;
  • if, with brokers leaving and joining the cluster, we ended up with uneven load distribution on brokers;
  • and more!

The reassignment is done in several steps:

  1. you decide what you want to end up with: which brokers should hold which partitions;
  2. you generate the JSON that describes the change: for each partition, it should have a list of brokers that replicate it, and the first one is the preferred one – the one you want to become a leader;
  3. you apply this change with bin/kafka-reassign-partitions.sh --execute;
  4. you wait for this change to complete, checking the progress with bin/kafka-reassign-partitions.sh --verify;
  5. you ask the cluster to actually move partition leadership to the preferred replicas with bin/kafka-preferred-replica-election.sh.

A simple example of putting it all together is in this gist. It uses bin/kafka-reassign-partitions.sh to generate the plan. If we need some special adjustments, or we want to change the replication factor (this is done by adding more brokers to each partition’s replica list), then we’ll have to edit this plan, either manually or with some kind of script, or just generate the plan with a script right away.
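For the "generate the plan with a script" route, here is a minimal sketch (not the gist itself). The make_plan and apply_plan helpers and the topic name events are hypothetical. The JSON layout and the --reassignment-json-file, --execute and --verify flags are those of the ZooKeeper-based tools used in these notes. The "in progress" text matched in the polling loop is an assumption about what --verify prints for unfinished partitions:

```shell
# Emit a reassignment plan for one topic. Each argument after the topic is
# "partition:broker,broker,..."; the first broker is the preferred leader.
make_plan() {
    local topic="$1"; shift
    local sep="" spec
    printf '{"version":1,"partitions":['
    for spec in "$@"; do
        printf '%s{"topic":"%s","partition":%s,"replicas":[%s]}' \
            "$sep" "$topic" "${spec%%:*}" "${spec#*:}"
        sep=","
    done
    printf ']}\n'
}

# Steps 3 to 5: execute the plan, wait for it to finish, elect preferred leaders.
apply_plan() {
    local zk="$1" plan="$2"
    bin/kafka-reassign-partitions.sh --zookeeper "$zk" \
        --reassignment-json-file "$plan" --execute
    # Assumption: --verify reports "in progress" until a partition is done.
    while bin/kafka-reassign-partitions.sh --zookeeper "$zk" \
            --reassignment-json-file "$plan" --verify | grep -q 'in progress'; do
        sleep 10
    done
    bin/kafka-preferred-replica-election.sh --zookeeper "$zk"
}

# Hypothetical topic with two partitions, three replicas each.
PLAN="$(make_plan events 0:1,2,3 1:2,3,1)"
echo "$PLAN"

# Usage against a real cluster:
#   echo "$PLAN" > plan.json
#   apply_plan localhost:2181/kafka plan.json
```

Listing three brokers per partition is what raises the replication factor to 3, and rotating the first broker across partitions spreads the preferred leaders.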