HACKER Q&A
📣 asomethings

If a Kafka topic has X partitions, should I deploy X consumers?


A few documents say that the consumer count must equal the partition count, or else consumption will slow down. I understand why it slows down, so I did some simple math.

If Kafka throughput is 100K messages/s and each message takes 100 ms to process, it will need 10,000 partitions according to https://eventsizer.io/partitions .
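
For reference, the 10,000 figure is consistent with multiplying throughput by per-message processing time. The snippet below is only a sketch of that arithmetic, assuming each partition is consumed serially, one message at a time; whether the linked calculator uses exactly this formula is an assumption, and `required_partitions` is a hypothetical helper name.

```python
def required_partitions(messages_per_second: int, processing_ms: int) -> int:
    """Partitions needed to keep up, assuming strictly serial,
    one-message-at-a-time processing per partition (an assumption)."""
    # Total processing work arriving each second, in milliseconds.
    work_ms_per_second = messages_per_second * processing_ms
    # Each partition can absorb 1000 ms of work per second; round up.
    return -(-work_ms_per_second // 1000)


print(required_partitions(100_000, 100))  # 10000
print(required_partitions(100_000, 10))   # 1000
```

Under this model, the partition count scales linearly with processing time, so cutting per-message time from 100 ms to 10 ms drops the estimate from 10,000 to 1,000.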

Then should I deploy 10K consumers to process it without lag or latency? I think a real-time chat app like `Discord` or `Slack` should have more throughput than 100K/s, but I don't think they have 10K consumers up and running.

What am I missing here?


  👤 z9e Accepted Answer ✓
In theory, yes: a 1:1 partition-to-consumer ratio gives you the best throughput. But it’s okay to have fewer consumers than partitions; each consumer will just do more work.
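
To make "consumers doing more work" concrete, here is a rough sketch of how partitions might be spread across a consumer group. It is a simplified round-robin illustration, not Kafka's actual assignor, and `assign_partitions` is a hypothetical name. It does rely on one real property: within a consumer group, each partition is read by at most one consumer.

```python
from typing import Dict, List


def assign_partitions(num_partitions: int, num_consumers: int) -> Dict[int, List[int]]:
    """Spread partition ids over consumer ids round-robin (illustrative only)."""
    assignment: Dict[int, List[int]] = {c: [] for c in range(num_consumers)}
    for partition in range(num_partitions):
        # Each partition goes to exactly one consumer in the group.
        assignment[partition % num_consumers].append(partition)
    return assignment


# Fewer consumers than partitions: each consumer handles several partitions.
print(assign_partitions(6, 3))  # {0: [0, 3], 1: [1, 4], 2: [2, 5]}
# More consumers than partitions: the extra consumer gets nothing.
print(assign_partitions(2, 3))  # {0: [0], 1: [1], 2: []}
```

So the partition count caps the useful parallelism of a group, while running fewer consumers just means each one reads from more partitions.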

Keep in mind you shouldn’t really go over 20k partitions in a Kafka cluster (recommended by Confluent), as that’s when things start to get unstable. 10k is a lot of partitions; we have a hundred topics in our main cluster and are only at 15k partitions total. But if you’re running a high-scale, low-latency environment, then that sounds like what you’d need.