Day 3 — Group Anagrams
Coding problem
| Problem | Group Anagrams |
| LeetCode ID(s) | LeetCode #49 |
| Difficulty | Medium |
| Pattern | Hash Map Grouping |
| Company tags | Meta, Google |
| Suggested time | 20m |
Solution outline (coding)
- For each word, build a canonical key: sorted characters, or a 26-length count signature.
- Use that key to bucket words in a map:
key -> list of words. - Return all groups with more than one word (or all values, depending on prompt).
Time complexity: O(n · k log k) if sorting each word of length k; O(n · k) with count vectors.
Space complexity: O(n · k) — storing all strings plus the map.
Show Python solution
from collections import defaultdict
from typing import List
class Solution:
def groupAnagrams(self, strs: List[str]) -> List[List[str]]:
groups: dict[tuple[str, ...], List[str]] = defaultdict(list)
for s in strs:
key = tuple(sorted(s))
groups[key].append(s)
return list(groups.values())
# Input: strs = ["eat","tea","tan","ate","nat","bat"]
# Output: [["bat"],["nat","tan"],["ate","eat","tea"]] (order of groups/within groups may vary)SQL interview practice
1. Interview question
Companies / track: Meta, Google
Meta and Google both lean on ads and product surfaces: event streams, CTR-style rates, windowed metrics, and careful handling of identity joins.
What you are asked to write (SQL prompt):
Frame this as metrics work for **Meta, Google**-scale surfaces (ads, product, or engagement — as the tables suggest).
queries(id, text, created_at). BigQuery: derive an anagram_key (lowercase, remove spaces, sort chars) using SPLIT/ARRAY/STRING and group keys with count >= 5.
Tables implied by the prompt:
queries(id, text, created_at)anagram_key(lowercase, remove spaces, sort chars)
Engine: BigQuery — use its date, array, and approximate functions as documented.
2. Solution outline
- Clarify out loud: result grain (one row per what?), join keys, time zone, and any
ORDER BY/LIMIT/ tie-breakers. - Map Hash Map Grouping to SQL: say the relational equivalent (e.g. hash map →
GROUP BY+ key; two pointers → ordered window + filter). - Arrays / UDFs:
UNNESTand offsets for index; say when logic belongs in SQL vs a UDF. - Structure: CTEs (
WITH) — one step per CTE; validate on a tiny slice (counts, nulls, duplicates).
Show SQL solution (BigQuery)
WITH normalized AS (
SELECT
id,
text,
LOWER(REGEXP_REPLACE(text, r'\s+', '')) AS text_norm
FROM
queries
),
with_chars AS (
SELECT
id,
text,
ARRAY(SELECT ch FROM UNNEST(SPLIT(text_norm, '')) AS ch ORDER BY ch) AS chars_sorted
FROM
normalized
),
anagram_keys AS (
SELECT
id,
text,
ARRAY_TO_STRING(chars_sorted, '') AS anagram_key
FROM
with_chars
),
counts AS (
SELECT
anagram_key,
COUNT(*) AS group_count
FROM
anagram_keys
GROUP BY
anagram_key
HAVING
group_count >= 5
)
SELECT
q.id,
q.text,
q.anagram_key,
c.group_count
FROM
anagram_keys AS q
JOIN
counts AS c
USING (anagram_key)
ORDER BY
anagram_key,
q.id;