See all coding puzzles and patterns here.

Problem Definition

Given an array and a positive integer $k$, find the $k$ largest elements in the array. The results do not need to be in any specific order.

A variant of this problem is Find the Kth Largest Element (Leetcode 215), which asks to return the $k^{th}$ largest element.


Examples

Some test cases to verify our solution.

assert largest_k([1, 3, 5, 8, 1, 2], 3) == [3, 5, 8]
assert largest_k([5, 3, 4, 5, 8, 2, 4, 5], 4) == [5, 5, 8, 5]
assert largest_k([5, 3, 4, 5, 8, 2, 4, 5], 3) == [5, 5, 8]

Sliding Window Solution

We implement a sliding window solution. The idea is to maintain a sorted list of the top $k$ elements seen so far during a traversal through the array.

def largest_k(arr, k):
    if k <= len(arr):
        return arr
    res = sorted(arr[:k])
    for idx in range(k, len(arr)):
        curr = arr[idx]
        if curr < res[0]:
            continue
        res[0] = curr
        for res_idx in range(1, k):
            if res[res_idx-1] < res[res_idx]:
                break
            tmp = res[res_idx]
            res[res_idx] = res[res_idx-1]
            res[res_idx-1] = tmp
    return res

Sliding Window Solution Complexity

Let $n = $ len(arr).

The initial sort should take $k \cdot log(k)$ steps. Each subsequent insert should take at most $k$ steps and is done $n$ times, for a runtime complexity of $\Theta(k \cdot log(k) + k \cdot n)$.

Auxiliary space complexity should be $\Theta(1)$ since we are only maintaining our return value of the list of the $k$ largest elements at each iteration.


Heap Solution

We can improve the runtime by using a [[Heap|min-heap]] of size $k$ instead of a sorted array, which should decrease the runtime of inserts from $k$ to $log(k)$.

def kth_largest(arr, k):
    min_heap = arr[:k]
    heapq.heapify(min_heap)
    for n in arr[k:]:
        if n > min_heap[0]:
            heapq.heappop(min_heap)
            heapq.heappush(min_heap, n)
    return min_heap

Heap Solution Complexity

Let $n = $ len(arr).

Runtime complexity should be $\Theta(k \cdot log(k) + n \cdot log(k))$.

Auxiliary space complexity should be $\Theta(1)$ since, other than our return value of min_heap, we are only storing a few scalars.