Day 66 — Median of Two Sorted Arrays
Coding problem
| Problem | Median of Two Sorted Arrays |
| LeetCode ID(s) | LeetCode #4 |
| Difficulty | Hard |
| Pattern | Binary Search |
| Company tags | |
| Suggested time | 25m |
Solution outline (coding)
- Binary search the partition of the two arrays so left halves contain half the total elements.
- Ensure left max ≤ right min on both sides; derive median from boundary values.
- Handle odd/even total length carefully.
Time complexity: O(log(min(m,n))).
Space complexity: O(1).
Show Python solution
from typing import List
class Solution:
def findMedianSortedArrays(self, nums1: List[int], nums2: List[int]) -> float:
if len(nums1) > len(nums2):
nums1, nums2 = nums2, nums1
m, n = len(nums1), len(nums2)
lo, hi = 0, m
half = (m + n + 1) // 2
while lo <= hi:
i = (lo + hi) // 2
j = half - i
left1 = float('-inf') if i == 0 else nums1[i - 1]
right1 = float('inf') if i == m else nums1[i]
left2 = float('-inf') if j == 0 else nums2[j - 1]
right2 = float('inf') if j == n else nums2[j]
if left1 <= right2 and left2 <= right1:
if (m + n) % 2:
return float(max(left1, left2))
return (max(left1, left2) + min(right1, right2)) / 2.0
if left1 > right2:
hi = i - 1
else:
lo = i + 1
return 0.0
# Input: nums1 = [1,3], nums2 = [2]
# Output: 2.0SQL interview practice
1. Interview question
Companies / track: Google
Google: SQL screens usually assume BigQuery, Ads / Search / YouTube-style fact tables, and talking through bytes processed and partition pruning.
What you are asked to write (SQL prompt):
Frame this as metrics work for **Google**-scale surfaces (ads, product, or engagement — as the tables suggest).
two_sorted_series(a_index, a_value, b_index, b_value). BigQuery: join and compute median of union; ensure scalability for large series via partitioned processing.
Tables implied by the prompt:
two_sorted_series(a_index, a_value, b_index, b_value)
Engine: BigQuery — use its date, array, and approximate functions as documented.
2. Solution outline
- Clarify out loud: result grain (one row per what?), join keys, time zone, and any
ORDER BY/LIMIT/ tie-breakers. - Map Binary Search to SQL: say the relational equivalent (e.g. hash map →
GROUP BY+ key; two pointers → ordered window + filter). - Filter time first: predicate on
DATE(ts)/ partition column before heavy joins; state the window in plain English. - Joins: name keys explicitly; watch fan-out from duplicate IDs.
- Percentiles: approx vs exact; say so if you use
APPROX_QUANTILES. - Structure: CTEs (
WITH) — one step per CTE; validate on a tiny slice (counts, nulls, duplicates).
Show SQL solution (BigQuery)
Main query
WITH u AS (
SELECT a_value AS v, 'a' AS src FROM two_sorted_series
UNION ALL
SELECT b_value, 'b' FROM two_sorted_series
),
ranked AS (
SELECT v, ROW_NUMBER() OVER (ORDER BY v) AS rn, COUNT(*) OVER () AS cnt FROM u
)
SELECT APPROX_QUANTILES(v, 100)[OFFSET(50)] AS median_union FROM ranked;