Kth Largest Element in K Sorted Arrays

Kth Largest Element in K Sorted Arrays

December 19, 2023
medium
Merge Sort, Heap Sort, PriorityQueue

Problem #

Given K sorted arrays of integers, find the Kth largest element among all the elements in the arrays.

Solution #

Here’s the Python code for finding the Kth largest element among all elements in given sorted arrays:

import heapq

def kth_largest_in_sorted_arrays(arrays, k):
    # Create a min heap
    min_heap = []
    
    # Initialize heap with the last element of each array
    for i, array in enumerate(arrays):
        if array:
            heapq.heappush(min_heap, (array[-1], i, len(array) - 1))

    # Pop elements from the heap k times
    while k > 0 and min_heap:
        val, arr_idx, ele_idx = heapq.heappop(min_heap)
        k -= 1

        # If there are more elements in the array, push the next largest element into the heap
        if ele_idx > 0:
            heapq.heappush(min_heap, (arrays[arr_idx][ele_idx - 1], arr_idx, ele_idx - 1))

    # Return the Kth largest element
    return val if k == 0 else None

# Example usage
arrays = [[1, 4, 7], [2, 5, 8], [3, 6, 9]]
k = 5
print(kth_largest_in_sorted_arrays(arrays, k))

This code uses a min heap to efficiently find the Kth largest element. It first initializes the heap with the last element of each array (assuming the arrays are sorted in ascending order) and then iteratively pops the smallest element from the heap until it reaches the Kth iteration. The function returns the Kth largest element. In the provided example, the function will return 5 for k = 5.

Solution Analysis: Step by Step #

  1. Importing the heapq module:

    import heapq
    

    The heapq module is a part of Python’s standard library and provides an implementation of the heap queue algorithm, also known as the priority queue algorithm. Heaps are binary trees for which every parent node has a value less than or equal to any of its children. This implementation uses a min-heap.

  2. Function Definition:

    def kth_largest_in_sorted_arrays(arrays, k):
    

    This defines a function named kth_largest_in_sorted_arrays that takes two parameters: arrays, a list of sorted arrays, and k, the index of the largest element we want to find.

  3. Creating a Min Heap:

    min_heap = []
    

    A min heap is initialized as an empty list. This heap will store tuples, where each tuple contains an element from the arrays, the index of the array it came from, and the index of the element in its array.

  4. Initializing the Heap:

    for i, array in enumerate(arrays):
        if array:
            heapq.heappush(min_heap, (array[-1], i, len(array) - 1))
    

    The heap is initialized with the last element (which is the largest due to sorting) of each array. The heapq.heappush function adds an element to the heap while maintaining the heap property. Each element pushed onto the heap is a tuple containing the value of the element, the index of the array it belongs to (i), and its index in its own array.

  5. Finding the Kth Largest Element:

    while k > 0 and min_heap:
        val, arr_idx, ele_idx = heapq.heappop(min_heap)
        k -= 1
        if ele_idx > 0:
            heapq.heappush(min_heap, (arrays[arr_idx][ele_idx - 1], arr_idx, ele_idx - 1))
    

    The algorithm repeatedly removes the smallest element from the heap (heapq.heappop) until it has done so k times. For each popped element, it checks if there are more elements in the source array (using ele_idx > 0). If so, it pushes the next largest element from that array onto the heap. This way, the heap always contains at most one element from each array, and these are the largest remaining elements.

  6. Returning the Result:

    return val if k == 0 else None
    

    After k iterations, the variable val holds the Kth largest element. If k is 0, the function returns val. If k is not zero (which could happen if k is larger than the total number of elements in all arrays), the function returns None, indicating that the Kth largest element does not exist.

  7. Example Usage:

    arrays = [[1, 4, 7], [2, 5, 8], [3, 6, 9]]
    k = 5
    print(kth_largest_in_sorted_arrays(arrays, k))
    

    This is an example to demonstrate the usage of the function. It defines a list of sorted arrays and a value for k, then calls the function and prints the result.

The key idea behind this approach is to use a min heap to efficiently find the Kth largest element across multiple sorted arrays. By always maintaining the largest elements in the heap, the algorithm ensures that the Kth element popped from the heap is the Kth largest overall.