- Leetcode 1570 (Premium)
- Difficulty: Medium
- Patterns:
Problem Definition
Given 2 sparse vectors, which mostly have 0 values, compute their dot product.
Do this by implementing the class SparseVector
, which supports:
- an initialization method that takes in an array of numbers representing the sparse vector, and
- a
dot_product
method that takes anotherSparseVector
and computes the dot product of the two vectors.
Examples
Some test cases to verify our solution.
v1 = SparseVector([1, 0, 0, 2, 3])
v2 = SparseVector([0, 3, 0, 4, 0])
assert v1.dot_product(v2) == 8
assert v2.dot_product(v1) == 8
l1 = [1] + [0] * 1_000_000 + [1]
l2 = [5, 4, 3, 2, 1, 0]
v1 = SparseVector(l1)
v2 = SparseVector(l2)
assert v1.dot_product(v2) == 5
assert v2.dot_product(v1) == 5
An example computation for the first case: $$ [1,0,0,2,3] \cdot [0,3,0,4,0] = 1\cdot0 + 0\cdot3 + 0\cdot0 + 2\cdot4 + 3\cdot0 = 2\cdot4 = 8 $$
Dictionary Solution
Since the vectors will be sparse, we will store only the populated indexes in a dictionary. To optimize traversal, we'll iterate over the smaller of the two dictionaries during multiplication.
class SparseVector:
def __init__(self, nums: list[int]):
self.data = {i: v for i, v in enumerate(nums) if v}
def dot_product(self, other_vector: 'SparseVector') -> int:
res = 0
d, other_d = self.data, other_vector.data
if len(other_d) < len(d):
d, other_d = other_d, d
for i in d:
if i in other_d:
res += d[i] * other_d[i]
return res
Dictionary Solution Complexity
Let $n = $ the number of non-zeros in the first vector, $m = $ the number of non-zeros in the second vector.
Runtime complexity for the dot_produt
method should be $\Theta(min(n, m))$.
Auxiliary space complexity for the two vectors should be $\Theta(n, m)$. In general, for a system of vectors ${v_i}$ with number of non-zeros ${n_i}$, the auxiliary space complexity should be $\Theta(\sum(n_i))$.
Binary Search Solution
While the dictionary solution works very well for very sparse vectors, it works less well for less sparse vectors due to memory usage.
For such cases, we can store our sparse vectors as tuples of (index, value)
.
For the dot product, we'll iterate over the smaller of the two and do a
binary search for the current index in the larger array.
class SparseVector2:
def __init__(self, nums: list[int]):
self.data = [(i, v) for i, v in enumerate(nums)]
def dot_product(self, other: 'SparseVector') -> int:
res = 0
d, other_d = self.data, other.data
if len(d) > len(other_d):
d, other_d = other_d, d
other_idx = 0
other_max = len(other_d) - 1
def search_other(left: int, target: int) -> int:
l, r = left, other_max
while l < r:
mid = (l + r) // 2
if other_d[mid][0] == target:
return mid
if other_d[mid][0] < target:
l = mid + 1
else:
r = mid
return l
for idx, val in d:
other_idx = search_other(other_idx, idx)
if other_d[other_idx][0] == idx:
res += other_d[other_idx][1] * val
return res
Binary Search Solution Complexity
Let $n = $ the number of non-zeros in the more sparse vectors, $m = $ the number of non-zeros in the less sparse vector.
Runtime complexity of dot_product
should be $\Theta(n \cdot log(m))$.
Auxiliary space complexity should be the same as the dictionary solution above.