8 questions using this pattern
Track disjoint sets with efficient union and find operations. Union-Find (also called Disjoint Set Union) excels at dynamically grouping elements and answering "are these two elements in the same group?" queries.
Imagine a social network where you want to know if two people are connected (directly or through friends of friends). Instead of searching the entire network each time, everyone in a connected group points to a group leader. To check if two people are connected, just check if they have the same leader. When groups merge (someone bridges two groups), you just update one leader to point to the other.
Another analogy: corporate acquisitions. Each company has a parent company (possibly itself). When companies merge, one becomes a subsidiary of the other. To find the ultimate parent, you follow the chain of ownership.
Union-Find maintains a forest of trees where each tree represents a set. Each element points to its parent, and the root of the tree is the set's representative.
Two key operations:
Two key optimizations:
With both optimizations, operations run in nearly O(1) time—technically O(α(n)) where α is the inverse Ackermann function, which is ≤ 4 for any practical input size.
def findRedundantConnection(edges): n = len(edges) parent = list(range(n + 1)) rank = [0] * (n + 1) def find(x): if parent[x] != x: parent[x] = find(parent[x]) # Path compression return parent[x] def union(x, y): px, py = find(x), find(y) if px == py: return False # Already connected! # Union by rank if rank[px] < rank[py]: px, py = py, px parent[py] = px if rank[px] == rank[py]: rank[px] += 1 return True for u, v in edges: if not union(u, v): return [u, v] # Redundant edge found!We have a graph that started as a tree with n nodes, but one extra edge was added, creating a cycle. Our task is to find and return the edge that can be removed to restore the tree structure.
Initial state (each element is its own set):
parent: [0, 1, 2, 3, 4] (each points to itself)
Sets: {0}, {1}, {2}, {3}, {4}
Union(0, 1):
parent: [0, 0, 2, 3, 4] (1 now points to 0)
0
|
1
Sets: {0, 1}, {2}, {3}, {4}
Union(2, 3) then Union(3, 4):
parent: [0, 0, 2, 2, 3]
0 2
| / \
1 3 (direct)
|
4
Sets: {0, 1}, {2, 3, 4}
Union(1, 4) — merges the two trees:
Find(1) = 0, Find(4) = 2
Union by rank: attach smaller under larger
parent: [0, 0, 0, 2, 3]
0
/|
1 2
/ \
3 (direct)
|
4
Sets: {0, 1, 2, 3, 4}
Path compression during Find(4):
Find(4): 4 → 3 → 2 → 0 (found root)
Compress: make 4, 3, 2 all point directly to 0
parent: [0, 0, 0, 0, 0]
0
/|\ \
1 2 3 4
Now Find(4) is O(1)!
Use this skeleton as a starting point for problems using this pattern:
class UnionFind:
"""Union-Find with path compression and union by rank."""
def __init__(self, n: int):
self.parent = list(range(n))
self.rank = [0] * n
self.count = n # Number of disjoint sets
def find(self, x: int) -> int:
"""Find root with path compression."""
if self.parent[x] != x:
self.parent[x] = self.find(self.parent[x])
return self.parent[x]
def union(self, x: int, y: int) -> bool:
"""Union by rank. Returns True if x and y were in different sets."""
root_x, root_y = self.find(x), self.find(y)
if root_x == root_y:
return False # Already in same set
# Union by rank
if self.rank[root_x] < self.rank[root_y]:
root_x, root_y = root_y, root_x
self.parent[root_y] = root_x
if self.rank[root_x] == self.rank[root_y]:
self.rank[root_x] += 1
self.count -= 1
return True
def connected(self, x: int, y: int) -> bool:
"""Check if x and y are in the same set."""
return self.find(x) == self.find(y)
def count_components(n: int, edges: list[list[int]]) -> int:
"""Count connected components in undirected graph."""
uf = UnionFind(n)
for u, v in edges:
uf.union(u, v)
return uf.count
def has_cycle(n: int, edges: list[list[int]]) -> bool:
"""Detect cycle in undirected graph."""
uf = UnionFind(n)
for u, v in edges:
if uf.connected(u, v):
return True # Adding edge creates cycle
uf.union(u, v)
return False
def kruskal_mst(n: int, edges: list[tuple[int, int, int]]) -> int:
"""Kruskal's MST algorithm using Union-Find."""
# edges are (weight, u, v)
edges.sort() # Sort by weight
uf = UnionFind(n)
mst_weight = 0
edges_used = 0
for weight, u, v in edges:
if uf.union(u, v):
mst_weight += weight
edges_used += 1
if edges_used == n - 1:
break
return mst_weight if edges_used == n - 1 else -1
def accounts_merge(accounts: list[list[str]]) -> list[list[str]]:
"""Merge accounts with common emails."""
email_to_id = {}
email_to_name = {}
uf = UnionFind(len(accounts))
# Map emails to account indices
for i, account in enumerate(accounts):
name = account[0]
for email in account[1:]:
email_to_name[email] = name
if email in email_to_id:
uf.union(i, email_to_id[email])
else:
email_to_id[email] = i
# Group emails by root account
from collections import defaultdict
root_to_emails = defaultdict(set)
for email, idx in email_to_id.items():
root = uf.find(idx)
root_to_emails[root].add(email)
# Build result
return [[email_to_name[next(iter(emails))]] + sorted(emails)
for emails in root_to_emails.values()]Look for these keywords and patterns in problem descriptions:
Without path compression, repeated Find operations can be O(n) each, degrading overall performance.
Fix:
Always compress paths during Find:
if self.parent[x] != x:
self.parent[x] = self.find(self.parent[x])
Union-Find assumes undirected connections. For directed graphs, cycles mean something different (back edges in DFS).
Fix:
Use DFS with coloring (WHITE/GRAY/BLACK) for cycle detection in directed graphs. Union-Find is for undirected connectivity.
Track whether elements are in the same connected component.
Examples: Number of Connected Components, Friend Circles
If union is called on two already-connected elements, adding that edge would create a cycle.
Examples: Redundant Connection, Graph Valid Tree
Sort edges by weight, greedily add edges that don't create cycles (checked via Union-Find).
Examples: Min Cost to Connect All Points, Connecting Cities With Minimum Cost
Handle streaming edge insertions while answering connectivity queries.
Examples: Evaluate Division, Accounts Merge
Track relative weights/distances between elements and their roots. Used in problems with equivalence relationships.
Examples: Evaluate Division (weighted paths)
Master the pattern with these representative problems.
Test your mastery with complex variations.
Explore similar techniques:
Level-by-level traversal using a queue, exploring all neighbors at the current depth before moving deeper. This guarantees the shortest path in unweighted graphs and is ideal for problems requiring layer-by-layer processing.
Explore as deep as possible along each branch before backtracking, using recursion or an explicit stack. DFS is memory-efficient and naturally suited for problems involving paths, connectivity, and exhaustive exploration.
For problems asking "how many components," manually counting at the end is inefficient.
Fix:
Decrement count in union when merging two different sets:
if root_x != root_y:
self.count -= 1
Some solutions need to know if a union actually merged two sets or if they were already connected.
Fix:
Return boolean from union indicating if merge happened:
if root_x == root_y:
return False # Already same set
# ... do union ...
return True # Merged