Day 75 — Dot Product of Sparse Vectors
Coding problem
| Problem | Dot Product of Sparse Vectors |
| LeetCode ID(s) | LeetCode #1570 |
| Difficulty | Medium |
| Pattern | Hash / Two Pointer |
| Company tags | Meta |
| Suggested time | 15m |
Solution outline (coding)
- Store sparse vectors as index → value maps for nonzeros.
- Iterate smaller map and lookup in the other; accumulate
a[i]*b[i].
Time complexity: O(k + min(m,n)) where k = overlapping nonzeros.
Space complexity: O(k) — no dense arrays.
Show Python solution
from typing import List
class SparseVector:
def __init__(self, nums: List[int]):
self.m = {i: v for i, v in enumerate(nums) if v}
def dotProduct(self, vec: 'SparseVector') -> int:
a, b = self.m, vec.m
if len(a) > len(b):
a, b = b, a
return sum(v * b.get(i, 0) for i, v in a.items())
# Input: SparseVector([1,0,0,2,3]).dotProduct(SparseVector([0,3,0,4,0]))
# Output: 8SQL interview practice
1. Interview question
Companies / track: Meta
Meta: expect event logging, ads / integrity, and social product analytics—deduping, sessionization, and window functions are frequent themes.
What you are asked to write (SQL prompt):
Frame this as metrics work for **Meta**-scale surfaces (ads, product, or engagement — as the tables suggest).
sparse_vectors(vector_id, index, value). BigQuery: implement dot product between two sparse vectors and aggregate similarities across pairs above a threshold.
Tables implied by the prompt:
sparse_vectors(vector_id, index, value)
Engine: BigQuery — use its date, array, and approximate functions as documented.
2. Solution outline
- Clarify out loud: result grain (one row per what?), join keys, time zone, and any
ORDER BY/LIMIT/ tie-breakers. - Map Hash / Two Pointer to SQL: say the relational equivalent (e.g. hash map →
GROUP BY+ key; two pointers → ordered window + filter). - Aggregate: fix
GROUP BYkeys, thenHAVINGfor thresholds (e.g. min volume). - Structure: CTEs (
WITH) — one step per CTE; validate on a tiny slice (counts, nulls, duplicates).
Show SQL solution (BigQuery)
Main query
WITH e AS (
SELECT vector_id, idx, val FROM sparse_vectors
)
SELECT a.vector_id AS v1, b.vector_id AS v2, SUM(a.val * b.val) AS dot_product
FROM e AS a
JOIN e AS b ON a.idx = b.idx AND a.vector_id < b.vector_id
GROUP BY v1, v2
HAVING dot_product > @threshold;