Day 53 — Graph Valid Tree
Coding problem
| Problem | Graph Valid Tree |
| LeetCode ID(s) | LeetCode #261 |
| Difficulty | Medium |
| Pattern | Union-Find |
| Company tags | |
| Suggested time | 20m |
Solution outline (coding)
- Valid tree: exactly
n-1edges and fully connected undirected graph. - Use Union-Find: if union finds already-connected nodes before n-1 unions, cycle; after, check one component.
- Or DFS from 0: must visit
nnodes.
Time complexity: O(n α(n)) with UF.
Space complexity: O(n).
Show Python solution
from typing import List
class Solution:
def validTree(self, n: int, edges: List[List[int]]) -> bool:
if len(edges) != n - 1:
return False
parent = list(range(n))
def find(x: int) -> int:
while parent[x] != x:
parent[x] = parent[parent[x]]
x = parent[x]
return x
def union(a: int, b: int) -> bool:
ra, rb = find(a), find(b)
if ra == rb:
return False
parent[rb] = ra
return True
for a, b in edges:
if not union(a, b):
return False
return len({find(i) for i in range(n)}) == 1
# Input: n = 5, edges = [[0,1],[0,2],[0,3],[1,4]]
# Output: TrueSQL interview practice
1. Interview question
Companies / track: Google
Google: SQL screens usually assume BigQuery, Ads / Search / YouTube-style fact tables, and talking through bytes processed and partition pruning.
What you are asked to write (SQL prompt):
Frame this as metrics work for **Google**-scale surfaces (ads, product, or engagement — as the tables suggest).
network_edges(a,b). BigQuery: detect if network is a valid tree (n nodes, n-1 edges, single component) and output diagnostics.
Tables implied by the prompt:
network_edges(a, b)tree(n nodes, n-1 edges, single component)
Engine: BigQuery — use its date, array, and approximate functions as documented.
2. Solution outline
- Clarify out loud: result grain (one row per what?), join keys, time zone, and any
ORDER BY/LIMIT/ tie-breakers. - Map Union-Find to SQL: say the relational equivalent (e.g. hash map →
GROUP BY+ key; two pointers → ordered window + filter). - Structure: CTEs (
WITH) — one step per CTE; validate on a tiny slice (counts, nulls, duplicates).
Show SQL solution (BigQuery)
Main query
WITH c AS (SELECT COUNT(*) AS n FROM nodes),
e AS (SELECT COUNT(*) AS m FROM network_edges),
comps AS (SELECT APPROX_TOP_COUNT(a, 1)[OFFSET(0)].value AS approx_root FROM network_edges)
SELECT n, m, m = n - 1 AS edge_count_ok FROM c, e;