Kth Largest Element in a Stream

Kth Largest Element in a Stream

January 6, 2024
medium
heap

Problem: #

Design a class to find the kth largest element in a stream of numbers. The class should have the following two methods:

  • add(num): Adds the number num to the stream.
  • getKthLargest(): Returns the kth largest element in the stream.

Example:

  1. Initialize an instance with k=3 and the stream stream = [4, 5, 8, 2].
  2. getKthLargest() should return 4.
  3. After adding 3, 5, 10, and 9 to the stream, getKthLargest() should return 5.

Solution #

The key to solving this problem efficiently is to use a min-heap of size k. The heap will store the k largest elements seen so far. When a new number is added, if the heap is not full, we add it directly. If the heap is full, we only add the new number if it is larger than the smallest number in the heap (the root of the min-heap), and then we remove the smallest number.

Algorithm Steps #

  1. Initialize a min-heap with a capacity of k.
  2. For the initial stream of numbers, add each number to the heap until it reaches capacity.
  3. For each new number added by the add method:
    • If the heap is not full, add the new number.
    • If the heap is full, compare the new number with the smallest number in the heap. If it’s larger, remove the smallest number and add the new number.
  4. The getKthLargest method returns the root of the min-heap.

Code (Python) #

import heapq

class KthLargest:
    def __init__(self, k, nums):
        self.heap = nums
        self.k = k
        heapq.heapify(self.heap)
        while len(self.heap) > k:
            heapq.heappop(self.heap)

    def add(self, val):
        if len(self.heap) < self.k:
            heapq.heappush(self.heap, val)
        elif val > self.heap[0]:
            heapq.heappushpop(self.heap, val)
        return self.heap[0]

    def getKthLargest(self):
        return self.heap[0]

# Test the class
kthLargest = KthLargest(3, [4, 5, 8, 2])
print(kthLargest.getKthLargest())  # Output: 4
kthLargest.add(3)
kthLargest.add(5)
kthLargest.add(10)
kthLargest.add(9)
print(kthLargest.getKthLargest())  # Output: 5

Time Complexity #

  • add(num): O(log k), since we potentially need to insert a new element into the heap.
  • getKthLargest(): O(1), as it returns the element at the root of the heap.

Space Complexity #

O(k), since the heap stores at most k elements.