
Apache Kafka notes

Getting started

The documentation is really awesome, and the quickstart works.

Advice: don’t use the root ZooKeeper path; a chroot such as --zookeeper localhost:2181/kafka works.

With a ZK cluster, list every host and append the path only once, at the very end: --zookeeper localhost:2181,localhost:2182/kafka.
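A tiny helper for building such a connection string, as a sketch. The function name zk_connect and the zk1..zk3 host names are made up for illustration; all it encodes is that the hosts are comma-separated and the chroot path is appended once, after the last host.

```shell
# Build a ZooKeeper connection string with a chroot path.
# Usage: zk_connect CHROOT HOST:PORT [HOST:PORT ...]
# (zk_connect is a hypothetical helper, not part of Kafka.)
zk_connect() (
    chroot=$1
    shift
    hosts=""
    for h in "$@"; do
        # Join the hosts with commas, without a leading comma.
        hosts="${hosts:+$hosts,}$h"
    done
    # The chroot goes once, at the very end.
    printf '%s%s\n' "$hosts" "$chroot"
)

# Example (hypothetical host names):
#   bin/kafka-topics.sh --list \
#       --zookeeper "$(zk_connect /kafka zk1:2181 zk2:2181 zk3:2181)"
```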

Deployment considerations

Kafka and Zookeeper

Netflix says scary things: a Kafka cluster is not really good at surviving a ZK ensemble breakdown. They deploy a separate ZK cluster per Kafka cluster.

Offsets replication factor

You should increase the replication factor of the __consumer_offsets topic to at least 3. For example:

offsets.topic.replication.factor=3

If we do, we need to make sure that 3 brokers are alive when this topic is created (e.g. when the first consumer connects). Otherwise, we may need to increase the replication factor of this topic manually.
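One way to guard against this is to check the number of registered brokers before starting the first consumer. This is only a sketch: count_brokers and enough_brokers are made-up helper names, and it assumes that the last line of the zookeeper-shell.sh output looks like "[0, 1, 2]" and that the chroot is /kafka (adjust to your setup).

```shell
# Count the broker ids in the last line of
# `zookeeper-shell.sh ... ls /brokers/ids` output, read from stdin.
# Assumption: that line looks like "[0, 1, 2]".
count_brokers() {
    tail -n1 | grep -Eo '[[:digit:]]+' | wc -l | tr -d ' '
}

# Succeed only if at least $1 brokers are listed on stdin.
enough_brokers() {
    [ "$(count_brokers)" -ge "$1" ]
}

# Example (not run here, needs a live ZooKeeper):
#   bin/zookeeper-shell.sh localhost:2181/kafka ls /brokers/ids \
#       | enough_brokers 3 && echo "ok to start the first consumer"
```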

Monitoring and management UIs

Offsets/consumers

Management

The Kafka REST Proxy can also be used.

Operations notes

Getting the list of active brokers

Somewhat ugly, but works when you need it, and you don’t have a way to compile Java code handy:

bin/zookeeper-shell.sh localhost:2181/kafka/local ls /brokers/ids \
    | tail -n1 | grep -Eo '[[:digit:]]+'

Managing partitions

Partitions can be reassigned manually:

  • if you need to change the replication factor of an existing topic;
  • if brokers leaving and joining the cluster have left the load unevenly distributed across brokers;
  • and more!

The reassignment is done in several steps:

  1. you collect the data, in whatever way you like;
  2. you generate the JSON that describes the change: for each partition, it lists the brokers that replicate it, and the first one is the preferred replica, the one you want to become the leader;
  3. you apply this change with bin/kafka-reassign-partitions.sh --execute;
  4. you wait for this change to complete, checking the progress with bin/kafka-reassign-partitions.sh --verify;
  5. you ask the cluster to actually move partition leadership to the preferred replicas with bin/kafka-preferred-replica-election.sh.

A simple example of putting it all together is in this gist. It uses bin/kafka-reassign-partitions.sh to generate the plan. If we need some special adjustments, or we want to change the replication factor (this is done by adding more brokers to each partition's replica list), then we’ll have to edit this plan, either manually or with some kind of script, or just generate the plan with a script right away.
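Generating the plan with a script can be as small as the sketch below. It is not what the gist or Kafka's own generator does: make_plan is a made-up helper, the topic name "events" is hypothetical, and the layout is a plain round-robin that I chose so that the first (preferred) replica rotates across the brokers. The output follows the JSON format that bin/kafka-reassign-partitions.sh reads.

```shell
# Print a reassignment plan for one topic.
# Usage: make_plan TOPIC PARTITIONS REPLICATION_FACTOR BROKER_ID [BROKER_ID ...]
# (make_plan is a hypothetical helper; round-robin is an assumed layout.)
make_plan() (
    topic=$1 partitions=$2 rf=$3
    shift 3
    nbrokers=$#
    if [ "$rf" -gt "$nbrokers" ]; then
        echo "replication factor $rf exceeds broker count $nbrokers" >&2
        exit 1
    fi
    printf '{"version":1,"partitions":['
    p=0
    while [ "$p" -lt "$partitions" ]; do
        replicas=""
        r=0
        while [ "$r" -lt "$rf" ]; do
            # Positional parameters are 1-based; shift the start by the
            # partition number so the preferred leader rotates.
            idx=$(( (p + r) % nbrokers + 1 ))
            eval "broker=\${$idx}"
            replicas="${replicas:+$replicas,}$broker"
            r=$((r + 1))
        done
        if [ "$p" -gt 0 ]; then printf ','; fi
        printf '\n  {"topic":"%s","partition":%d,"replicas":[%s]}' \
            "$topic" "$p" "$replicas"
        p=$((p + 1))
    done
    printf '\n]}\n'
)

# Example: 3 partitions, replication factor 2, brokers 1, 2 and 3.
#   make_plan events 3 2 1 2 3 > plan.json
```

With these arguments, partition 0 gets replicas [1,2], partition 1 gets [2,3] and partition 2 gets [3,1]. Raising the replication factor is then a matter of rerunning with a larger third argument and applying the new plan with --execute.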