Skip to main content

Cracking Longest Common Subsequence: A Journey from Confusion to Clarity

Β· 8 min read
Mahmut Salman
Software Developer

I thought I had DP figured out. I'd solved house robber with that sweet max(rob_this_house + dp[i-2], skip_this_house) logic. I'd reversed linked lists by imagining "the rest is already solved." But when I faced Longest Common Subsequence, all my mental models fell apart.

No decision tree. No "imagine the rest" logic. No clever way to use previous answers. Just... confusion.

Here's how I finally cracked it.

The Problem That Broke My Pattern Recognition​

Longest Common Subsequence (LCS): Given two strings, find the length of their longest common subsequence. A subsequence is characters you can extract by deleting some (or zero) elements without changing the order.

Example:

  • Input: s1 = "acabac", s2 = "aebab"
  • Output: 3
  • Why? The LCS is "aba" (or "aab" or "acb" - multiple solutions exist)

Visual representation (from the problem):

    a   c   a   b   a   c           a   c   a   b   a   c           a   c   a   b   a   c
β•± OR β•± β•² β•± β•² OR β•± β•² β•²
a e b a b a e b a b a e b a b

Input: s1 = "acabac", s2 = "aebab"
Output: 3

My Initial Mental Models (All Failed)​

Coming from previous DP problems, I had three strategies:

1. Decision Tree Approach (House Robber Style)​

In house robber, each house had a clear decision: rob it or skip it. The states were obvious.

For LCS? I couldn't figure out what the "decision" even was.

2. "Imagine the Rest" (Linked List Reversal Style)​

When reversing a linked list, I'd think: "Imagine nodes 2β†’3β†’4β†’5 are already reversed. Now what do I do with node 1?"

For LCS? What does "the rest is solved" even mean here?

3. Using Previous Answers (Classic DP)​

I knew DP meant building solutions from subproblems. But I couldn't see how solving smaller strings helped me.

All I could manage:

int func(String s1, String s2, int index1, int index2) {
// ... now what?
}

And one random idea: "Maybe use HashMap for character frequencies?" (Spoiler: that doesn't preserve order, so it's useless for subsequences.)

The Breakthrough Conversation​

Confusion #1: The Visualization​

Me: "Looking at the example, I see we're skipping the first 'a' in string 1 but taking the 'a' at index 2. So even when characters match, we're making a choice to skip or take?"

Mentor: "Wait, let me clarify. Those diagonal lines show characters we ARE including. But here's the key insight you're missing..."

Key Insight #1

When characters match (s1[i] == s2[j]), taking them is never worse than skipping them for finding the maximum length. Even though multiple valid LCS sequences exist, we can always take matching characters greedily when computing length.

Me: "Oh! So when they match, there's actually no decision to make?"

Mentor: "Exactly. The real decision happens when they DON'T match."

Confusion #2: What Happens When Characters Don't Match?​

Me: "Okay, so when s1[i] β‰  s2[j], we need to... increase both indices?"

Mentor: "Think about it differently. If they don't match, at least one of these characters can't be in our common subsequence at this position. But we don't know which one to skip!"

The Real Decision

When characters don't match, we have two options:

  1. Skip s1[i] β†’ try func(i+1, j)
  2. Skip s2[j] β†’ try func(i, j+1)

We take the maximum of both choices.

This was the decision tree I'd been looking for!

Confusion #3: Return Values vs Parameters​

This is where everything almost fell apart for me. I wrote:

int func(String s1, String s2, int index1, int index2, int answer) {
if (s1.charAt(index1) == s2.charAt(index2)) {
func(s1, s2, index1++, index2++, answer++);
}
if (s1.charAt(index1) != s2.charAt(index2)) {
answer += Math.max(
func(s1, s2, index1++, index2, answer),
func(s1, s2, index1, index2++, answer)
);
}
}

Me: "I get the logic, but how do I handle what these functions return? Do I accumulate the answer in a parameter?"

Mentor: "Stop! You're thinking about this backwards."

Key Insight #2

The function should RETURN the length of LCS, not accumulate it in a parameter.

Think of it this way: func(s1, s2, 0, 0) asks "What's the length of LCS for the entire strings?"

The function returns that number!

Me: "Oh... so it's like asking 'What is the answer?' not 'Add this to the answer.'"

Mentor: "Exactly! Now rewrite it."

int func(String s1, String s2, int index1, int index2) {
if (s1.charAt(index1) == s2.charAt(index2)) {
// We found 1 matching character!
// Total LCS = 1 + LCS of remaining strings
return 1 + func(s1, s2, index1+1, index2+1);
}

// Characters don't match - try both options
return Math.max(
func(s1, s2, index1+1, index2), // Skip s1[i]
func(s1, s2, index1, index2+1) // Skip s2[j]
);
}

Confusion #4: Base Cases​

Me: "For void recursive functions, I'd just use return; to stop. But here, what I return affects the overall answer. Should it be 1? Or 0?"

Mentor: "Think about what it means. If index1 >= s1.length(), we've run out of characters in s1. How many characters can be common between an empty string and whatever remains?"

Me: "Zero! An empty string has no common subsequence with anything."

Key Insight #3

Base cases return 0 because:

  • If we run out of characters in either string, there are no more possible matches
  • The LCS length of an empty string with anything is 0
  • This zero propagates up through the recursive calls

The Complete Solution​

public int longestCommonSubsequence(String s1, String s2) {
return func(s1, s2, 0, 0);
}

private int func(String s1, String s2, int index1, int index2) {
// Base case: ran out of characters in either string
if (index1 >= s1.length() || index2 >= s2.length()) {
return 0; // No more matches possible
}

// Case 1: Characters match - take them!
if (s1.charAt(index1) == s2.charAt(index2)) {
return 1 + func(s1, s2, index1 + 1, index2 + 1);
}

// Case 2: Characters don't match - try both options
return Math.max(
func(s1, s2, index1 + 1, index2), // Skip character from s1
func(s1, s2, index1, index2 + 1) // Skip character from s2
);
}

Trace Example: s1 = "AB", s2 = "A"​

Let me trace how this works:

func(0,0): 'A'=='A'
β†’ return 1 + func(1,0)

func(1,0): 'B'β‰ 'A'
β†’ return max(func(2,0), func(1,1))

func(2,0): index1 out of bounds
β†’ return 0

func(1,1): index2 out of bounds
β†’ return 0

So func(1,0) returns max(0,0) = 0
So func(0,0) returns 1 + 0 = 1 βœ“

The answer bubbles up from the base cases!

What Finally Made This Click​

1. Understanding What the Function Returns​

This was huge. I was stuck thinking about accumulating an answer. Once I understood the function answers a question ("What is the LCS length from this point?"), everything fell into place.

2. Recognizing the Real Decision Point​

The decision isn't "take or skip matching characters." It's "when they DON'T match, which one do we skip?" That's the branching.

3. Trusting the Recursion​

Like with linked list reversal, I had to trust that func(i+1, j+1) correctly returns the LCS of the remaining substrings. I don't need to see how - I just use it.

4. Base Cases as Boundary Conditions​

Base cases aren't just "stop the recursion." They're the foundation that builds up the answer. Every recursive call eventually hits a base case, and those zeros/values propagate back up.

The Pattern I Missed Before​

Looking back at my previous DP problems:

House Robber: max(rob_this + dp[i-2], skip_this)

  • Decision: Rob or skip current house

LCS: 1 + func(i+1,j+1) when match, else max(func(i+1,j), func(i,j+1))

  • Decision: When no match, which character to skip

Both have decisions, but LCS's decision point is less obvious because matching characters have no choice - they're always taken.

Next Steps: Optimizing with DP​

This recursive solution works but has overlapping subproblems (we compute the same func(i, j) multiple times). The next evolution:

  1. Memoization (Top-Down DP): Cache results in a 2D array
  2. Tabulation (Bottom-Up DP): Build the solution iteratively
  3. Space Optimization: Reduce 2D array to 1D or even O(1)

But those optimizations only make sense once you understand the recursive structure. And now, I finally do.

Key Takeaways​

βœ… What the function returns matters more than what it receives - Don't accumulate in parameters; return the answer

βœ… Decision points aren't always obvious - Sometimes the "decision" is only visible when something goes wrong (characters don't match)

βœ… Matching greedily works for LCS - When characters match, always take them for maximum length

βœ… Base cases are the foundation - They're not just stoppers; they're the values that build up the answer

βœ… Trust the recursion - If func(i,j) correctly returns the LCS from that point, use it. Don't overthink the mechanics.


Looking back, my initial confusion makes sense. LCS doesn't fit neatly into the "make a choice at every step" pattern like house robber. But once I understood that the choice happens only when characters don't match, and that the function returns an answer rather than accumulates one, everything clicked.

Sometimes the breakthrough isn't finding a new pattern - it's understanding why your old patterns don't apply, and adjusting your mental model accordingly.