
[System Design] How LinkedIn Solved Its Log Aggregation Problem: A Case Study Of Kafka

Nishant Tanwar 🕉️


In this article, we will go over the essential parts of the research paper “Kafka: a Distributed Messaging System for Log Processing”.

Kafka was developed at LinkedIn for collecting and delivering high volumes of log data with low latency.

1) The Start

Back in 2011, large amounts of “log” data were being generated at internet-scale companies. Two types of data were generated:

  1. User Activity Events : This includes logins, page views, clicks, “likes”, sharing, comments and search queries.
  2. Operational Metrics : This includes service call stack, call latency, errors, and system metrics such as CPU, memory, network and disk utilization on each machine.
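To make the two categories concrete, here is a minimal sketch of what such log records might look like as structured messages. The field names and values are hypothetical, chosen for illustration; Kafka itself treats each message as opaque bytes, so producers typically serialize records like these before publishing.

```python
import json

# Hypothetical user activity event (field names are assumptions,
# not from the paper).
user_activity_event = {
    "type": "page_view",
    "user_id": "u123",
    "page": "/jobs",
    "timestamp_ms": 1300000000000,
}

# Hypothetical operational metric record.
operational_metric = {
    "type": "system_metric",
    "host": "app-01",
    "cpu_util": 0.42,
    "timestamp_ms": 1300000000000,
}

# Kafka messages are byte payloads, so a producer would serialize
# the record before sending it to a topic.
payload = json.dumps(user_activity_event).encode("utf-8")
```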

This data can then be utilized for:

  1. Search Relevance
  2. Recommendations, which may be driven by item popularity.
  3. Ad targeting and reporting.
  4. Security applications that protect against abusive behaviors such as spam or unauthorized data…

