The O(n) Club: Find Duplicate Subtrees — Java, HashMaps, and Tree Déjà Vu
⚡ TL;DR
If you want to spot all duplicate subtrees in a binary tree and still have time for lunch, serialize each subtree using DFS, count the patterns with a HashMap, and return a root for each duplicate. Please don’t brute force—nobody likes O(n^2) drama. Here’s a Java preview:
// DFS + HashMap approach Map<String, Integer> freq = new HashMap<>(); List<TreeNode> result = new ArrayList<>(); serialize(root); String serialize(TreeNode node) { if (node == null) return "#"; String key = node.val + "," + serialize(node.left) + "," + serialize(node.right); freq.put(key, freq.getOrDefault(key, 0) + 1); if (freq.get(key) == 2) result.add(node); return key; }
🧠 How Devs Usually Mess This Up
Ever compared every subtree to every other subtree directly, hoping for a match? Welcome to time complexity regret. Or maybe you skip encoding null nodes in serialization—turns out, that’s just giving your bug a free pass. And let’s not forget those who hand back all duplicate nodes, not one per kind. Your interviewer is now facepalming.
🔍 What’s Actually Going On
Imagine your tree’s subtrees are all little robots; they look similar but you need a unique blueprint for each. Serialization creates that blueprint: same structure and values yield identical signatures. Organize all detected blueprints in a big HashMap. Spot duplicates whenever a signature gets popular (appears more than once). Dump each kind’s root node into your winning list. Boom, you’re done—instead of endless, error-prone comparison loops.
🛠️ PseudoCode
- DFS Traversal: Visit every node bottom-up (because you want the subtrees intact before you serialize the parent).
- Serialize Each Subtree: For a given node, make a string: node value, serialize left, serialize right. For nulls, use something iconic (like ‘#’).
String serialize(TreeNode node) { if (node == null) return "#"; return node.val + ',' + serialize(node.left) + ',' + serialize(node.right); }
- Track Occurrences: For each serialization, up the count in your HashMap.
freq.put(serial, freq.getOrDefault(serial, 0) + 1);
- Collect Roots: Only when a serialization appears exactly twice do you add the node’s root to your result list. Not before, not after.
- Return List: Roll credits with your result list of unique duplicate roots!
💻 The Code
// Definition for tree node
class TreeNode {
int val;
TreeNode left;
TreeNode right;
TreeNode(int x) { val = x; }
}
import java.util.*;
public class Solution {
Map<String, Integer> freq = new HashMap<>();
List<TreeNode> result = new ArrayList<>();
public List<TreeNode> findDuplicateSubtrees(TreeNode root) {
serialize(root);
return result;
}
private String serialize(TreeNode node) {
if (node == null) return "#";
String serial = node.val + "," + serialize(node.left) + "," + serialize(node.right);
freq.put(serial, freq.getOrDefault(serial, 0) + 1);
if (freq.get(serial) == 2) result.add(node); // Add only once
return serial;
}
}
⚠️ Pitfalls, Traps & Runtime Smacks
- Skipped Nulls: Not encoding null children? You’ll match up different shapes and cause tree identity theft. Always serialize nulls.
- Too Many Outputs: Output one root per duplicate kind, not every repeat you see. Less is more, trust me.
- Memory Gluttony: If the tree is a “clone factory,” your HashMap and strings may have a field day. Pro tip: Interviewers love integer-ized serial IDs if you really want to wow them.
- Complexity Blues: Yes, it’s basically O(n)—unless your tree is a Russian nesting doll of clones, then RAM might grumble.
🧠 Interviewer Brain-Tattoo
“Comparing subtrees node-by-node is like trying to spot twins by DNA sequencing during a 10-minute speed date. Don’t. Serialize and HashMap—it’s the shortcut to tree enlightenment.”