Dot Product of Sparse Vectors

Day 75 — Dot Product of Sparse Vectors

Coding problem

Problem	Dot Product of Sparse Vectors
LeetCode ID(s)	LeetCode #1570
Difficulty	Medium
Pattern	Hash / Two Pointer
Company tags	Meta
Suggested time	15m

Solution outline (coding)

Store sparse vectors as index → value maps for nonzeros.
Iterate smaller map and lookup in the other; accumulate a[i]*b[i].

Time complexity: O(k + min(m,n)) where k = overlapping nonzeros.

Space complexity: O(k) — no dense arrays.

Show Python solution

from typing import List


class SparseVector:
  def __init__(self, nums: List[int]):
    self.m = {i: v for i, v in enumerate(nums) if v}

  def dotProduct(self, vec: 'SparseVector') -> int:
    a, b = self.m, vec.m
    if len(a) > len(b):
      a, b = b, a
    return sum(v * b.get(i, 0) for i, v in a.items())


# Input:  SparseVector([1,0,0,2,3]).dotProduct(SparseVector([0,3,0,4,0]))
# Output: 8

SQL interview practice

1. Interview question

Companies / track: Meta

Meta: expect event logging, ads / integrity, and social product analytics—deduping, sessionization, and window functions are frequent themes.

What you are asked to write (SQL prompt):

Frame this as metrics work for **Meta**-scale surfaces (ads, product, or engagement — as the tables suggest).
sparse_vectors(vector_id, index, value). BigQuery: implement dot product between two sparse vectors and aggregate similarities across pairs above a threshold.

Tables implied by the prompt:

sparse_vectors(vector_id, index, value)

Engine: BigQuery — use its date, array, and approximate functions as documented.

2. Solution outline

Clarify out loud: result grain (one row per what?), join keys, time zone, and any ORDER BY / LIMIT / tie-breakers.
Map Hash / Two Pointer to SQL: say the relational equivalent (e.g. hash map → GROUP BY + key; two pointers → ordered window + filter).
Aggregate: fix GROUP BY keys, then HAVING for thresholds (e.g. min volume).
Structure: CTEs (WITH) — one step per CTE; validate on a tiny slice (counts, nulls, duplicates).

Show SQL solution (BigQuery)

Main query

WITH e AS (
  SELECT vector_id, idx, val FROM sparse_vectors
)
SELECT a.vector_id AS v1, b.vector_id AS v2, SUM(a.val * b.val) AS dot_product
FROM e AS a
JOIN e AS b ON a.idx = b.idx AND a.vector_id < b.vector_id
GROUP BY v1, v2
HAVING dot_product > @threshold;

Day 75: Dot Product of Sparse Vectors