The O(n) Club: Find Duplicate Subtrees: Java, HashMaps, and Tree Déjà Vu

The O(n) Club: Find Duplicate Subtrees — Java, HashMaps, and Tree Déjà Vu

⚡ TL;DR

If you want to spot all duplicate subtrees in a binary tree and still have time for lunch, serialize each subtree using DFS, count the patterns with a HashMap, and return a root for each duplicate. Please don’t brute force—nobody likes O(n^2) drama. Here’s a Java preview:

// DFS + HashMap approach
Map<String, Integer> freq = new HashMap<>();
List<TreeNode> result = new ArrayList<>();
serialize(root);
 String serialize(TreeNode node) {
    if (node == null) return "#";
    String key = node.val + "," + serialize(node.left) + "," + serialize(node.right);
    freq.put(key, freq.getOrDefault(key, 0) + 1);
    if (freq.get(key) == 2) result.add(node);
    return key;
}

🧠 How Devs Usually Mess This Up

Ever compared every subtree to every other subtree directly, hoping for a match? Welcome to time complexity regret. Or maybe you skip encoding null nodes in serialization—turns out, that’s just giving your bug a free pass. And let’s not forget those who hand back all duplicate nodes, not one per kind. Your interviewer is now facepalming.

🔍 What’s Actually Going On

Imagine your tree’s subtrees are all little robots; they look similar but you need a unique blueprint for each. Serialization creates that blueprint: same structure and values yield identical signatures. Organize all detected blueprints in a big HashMap. Spot duplicates whenever a signature gets popular (appears more than once). Dump each kind’s root node into your winning list. Boom, you’re done—instead of endless, error-prone comparison loops.

🛠️ PseudoCode

  • DFS Traversal: Visit every node bottom-up (because you want the subtrees intact before you serialize the parent).
  • Serialize Each Subtree: For a given node, make a string: node value, serialize left, serialize right. For nulls, use something iconic (like ‘#’).
    String serialize(TreeNode node) {
        if (node == null) return "#";
        return node.val + ',' + serialize(node.left) + ',' + serialize(node.right);
    }
  • Track Occurrences: For each serialization, up the count in your HashMap.
    freq.put(serial, freq.getOrDefault(serial, 0) + 1);
  • Collect Roots: Only when a serialization appears exactly twice do you add the node’s root to your result list. Not before, not after.
  • Return List: Roll credits with your result list of unique duplicate roots!

💻 The Code

// Definition for tree node
class TreeNode {
    int val;
    TreeNode left;
    TreeNode right;
    TreeNode(int x) { val = x; }
}
 import java.util.*;
 public class Solution {
    Map<String, Integer> freq = new HashMap<>();
    List<TreeNode> result = new ArrayList<>();
     public List<TreeNode> findDuplicateSubtrees(TreeNode root) {
        serialize(root);
        return result;
    }
     private String serialize(TreeNode node) {
        if (node == null) return "#";
        String serial = node.val + "," + serialize(node.left) + "," + serialize(node.right);
        freq.put(serial, freq.getOrDefault(serial, 0) + 1);
        if (freq.get(serial) == 2) result.add(node); // Add only once
        return serial;
    }
}

⚠️ Pitfalls, Traps & Runtime Smacks

  • Skipped Nulls: Not encoding null children? You’ll match up different shapes and cause tree identity theft. Always serialize nulls.
  • Too Many Outputs: Output one root per duplicate kind, not every repeat you see. Less is more, trust me.
  • Memory Gluttony: If the tree is a “clone factory,” your HashMap and strings may have a field day. Pro tip: Interviewers love integer-ized serial IDs if you really want to wow them.
  • Complexity Blues: Yes, it’s basically O(n)—unless your tree is a Russian nesting doll of clones, then RAM might grumble.

🧠 Interviewer Brain-Tattoo

“Comparing subtrees node-by-node is like trying to spot twins by DNA sequencing during a 10-minute speed date. Don’t. Serialize and HashMap—it’s the shortcut to tree enlightenment.”

Previous Article

The Mutex Club: Thread-Local Sleuthing with Java MDC

Next Article

The Mutex Club: Taming JUnit Parallel Tests