The O(n) Club: Find Duplicate Subtrees and Outsmart Copy-Paste Branches

The O(n) Club: Find Duplicate Subtrees and Outsmart Copy-Paste Branches

⚡ TL;DR

Cut through the noise: Want to spot all duplicate subtrees in your binary tree? Serialize each subtree’s structure and values, hash those babies, and only report each copy-paste once. No O(N²) headaches. Exhibit A (Java version):

// Quick sketch:
Map<String, Integer> serialCount = new HashMap<>();
List<TreeNode> result = new ArrayList<>();
String serialize(TreeNode node) {
    if (node == null) return "#";
    String rep = serialize(node.left) + ',' + node.val + ',' + serialize(node.right);
    serialCount.put(rep, serialCount.getOrDefault(rep, 0) + 1);
    if (serialCount.get(rep) == 2) result.add(node);
    return rep;
}

🧠 How Devs Usually Mess This Up

If your go-to move is comparing every subtree to every other subtree, that’s… bold and also criminally inefficient. O(N²) is for caffeine-induced nightmares, not for interviews. Common newbie traps:

  • Comparing by value only (structure matters, not just numbers. A palm tree and a coat rack are both vertical, but…)
  • Returning every duplicate you see (the question asks for each pattern exactly once)
  • Skipping serialization altogether (get comfy with recursion, it’s here to stay)

🔍 What’s Actually Going On

Imagine every subtree in your binary tree hands you a business card: “Here’s exactly what I look like, left and right kids included.” If two subtrees have identical business cards, congrats—you’ve got a copy-paste job. Serialization (turning a subtree into a string like left,val,right, with # for nulls) gives each subtree a unique fingerprint, structure and all. Traverse post-order so every child knows its business before the parent claims the glory. When a serialization shows up twice, snag that root node—once!

🛠️ PseudoCode

  • Traverse the binary tree with post-order (left, right, root):
    • Serialize left subtree
    • Serialize right subtree
    • Merge: left + ‘,’ + node value + ‘,’ + right
  • Use a hash map to count how many times each serialization pops up:
    map.put(rep, map.getOrDefault(rep,0) + 1);
  • When a serialization hits count 2, add the node to results (no need to celebrate every subsequent copy—strict monogamy, please).
  • Return the glorious list of root nodes with duplicate subtrees when done.

💻 The Code

// Fully caffeinated Java solution
class Solution {
    private Map<String, Integer> map = new HashMap<>();
    private List<TreeNode> results = new ArrayList<>();
     public List<TreeNode> findDuplicateSubtrees(TreeNode root) {
        serialize(root);
        return results;
    }
     private String serialize(TreeNode node) {
        if (node == null) return "#";
        String rep = serialize(node.left) + "," + node.val + "," + serialize(node.right);
        map.put(rep, map.getOrDefault(rep, 0) + 1);
        if (map.get(rep) == 2) {
            results.add(node);
        }
        return rep;
    }
}
 // Classic TreeNode
class TreeNode {
    int val;
    TreeNode left, right;
    TreeNode(int x) {
        val = x;
    }
}

⚠️ Pitfalls, Traps & Runtime Smacks

  • Structure is the law: [1,null,2] is not [1,2,null]. Serialization must include nulls (think: # for nowhere).
  • Don’t collect every matching pattern. For each unique copy-paste, add one root node—first duplicate only. Interviewer tears saved.
  • Comparing nodes by value alone misses all the real structural twins.
  • Space/Time: O(N). Brag-worthy, until your tree is a thousand levels deep—then your strings get long and memory gets cranky.

🧠 Interviewer Brain-Tattoo

When trees start handing in identical resumes, let serialization and your hash map do the HR work—no more binary detective nonsense.

Previous Article

The Mutex Club: Thread-Local Sleuthing with Java MDC

Next Article

The Mutex Club: Taming JUnit Parallel Tests