#392 - Is Subsequence
Is Subsequence
- Difficulty: Easy
- Topics: Two Pointers, String, Dynamic Programming
- Link: https://leetcode.com/problems/is-subsequence/
Problem Description
Given two strings s
and t
, return true
if s
is a subsequence of t
, or false
otherwise.
A subsequence of a string is a new string that is formed from the original string by deleting some (can be none) of the characters without disturbing the relative positions of the remaining characters. (i.e., "ace"
is a subsequence of "abcde"
while "aec"
is not).
Example 1:
Input: s = "abc", t = "ahbgdc"
Output: true
Example 2:
Input: s = "axc", t = "ahbgdc"
Output: false
Constraints:
0 <= s.length <= 100
0 <= t.length <= 104
s
andt
consist only of lowercase English letters.
Follow up: Suppose there are lots of incoming s
, say s1, s2, ..., sk
where k >= 109
, and you want to check one by one to see if t
has its subsequence. In this scenario, how would you change your code?
Solution
1. Problem Deconstruction (500+ words)
Technical Version: Given two strings and , determine if there exists an increasing sequence of indices in such that for all from to , where denotes the length of string . The algorithm must return a boolean value indicating whether such a sequence exists, with time and space complexity optimized for the given constraints.
Beginner Version: Imagine you have two strings. The first string (s) is like a pattern you’re trying to find in the second string (t). You’re allowed to remove some characters from t, but you must keep the remaining characters in the same order. If you can get exactly s by doing this, then s is a “subsequence” of t. Your task is to write a program that checks if this is possible.
Mathematical Version: Let and be strings over alphabet . We say is a subsequence of if there exists a strictly increasing function such that for all , we have .
Constraint Analysis:
- : This relatively small upper bound means we can consider algorithms with time complexity without performance issues.
- : While larger than , this is still manageable for linear or near-linear algorithms. An solution would handle the maximum input size efficiently.
- Both strings consist only of lowercase English letters: This 26-character alphabet allows for potential optimizations using character indexing or precomputation.
- Hidden edge cases: Empty string (always a subsequence), longer than (impossible), equals (trivially true), and all characters of appearing in but in wrong order.
2. Intuition Scaffolding
Real-World Metaphor: Imagine you’re a detective trying to verify if a suspect’s alibi matches security camera footage. The alibi (string s) lists specific events in order: “entered store → bought coffee → left store.” The security footage (string t) shows many events throughout the day. You need to check if you can find those specific events in the exact same order in the footage, even if other unrelated events happened in between.
Gaming Analogy: Think of a rhythm game where you must hit specific notes (string s) in the correct sequence during a song (string t). Many other notes appear on screen, but you only care about hitting the target notes in order. If you can tap all the target notes as they appear (in sequence), you succeed. Missing any target note or hitting them out of order means failure.
Math Analogy: Consider two sequences of numbers where we want to determine if the first sequence appears as an “ordered subset” of the second. This is equivalent to checking if we can find an order-preserving injection from the indices of s to the indices of t that preserves element equality.
Common Pitfalls:
- Checking character counts only - Verifying that all characters in s appear in t is necessary but not sufficient (order matters)
- Using sliding window incorrectly - A fixed window won’t work since matching characters can be arbitrarily far apart
- Greedy matching from the end - This can miss valid subsequences that use earlier occurrences
- Two-pointer confusion - Moving the wrong pointer when characters don’t match
- Forgetting empty string case - The empty string is always a subsequence of any string
- Assuming sorted strings - The problem doesn’t require strings to be sorted
3. Approach Encyclopedia
Approach 1: Two-Pointer Iteration (Optimal)
What: Use two pointers to traverse both strings simultaneously, matching characters in order.
Why: Leverages the natural ordering constraint of subsequences while maintaining time complexity.
How:
Initialize s_index = 0, t_index = 0
While s_index < |s| AND t_index < |t|:
If s[s_index] equals t[t_index]:
Move s_index forward (found match)
Always move t_index forward
Return TRUE if s_index reached end of s, else FALSE
Complexity Proof:
- Time: - Each character in t is visited at most once
- Space: - Only two integer variables needed
Visualization:
s = "abc", t = "ahbgdc"
Step 0: s_ptr↑(a) | t_ptr↑(a) → MATCH → s_ptr=1, t_ptr=1
Step 1: s_ptr↑(b) | t_ptr↑(h) → NO MATCH → t_ptr=2
Step 2: s_ptr↑(b) | t_ptr↑(b) → MATCH → s_ptr=2, t_ptr=3
Step 3: s_ptr↑(c) | t_ptr↑(g) → NO MATCH → t_ptr=4
Step 4: s_ptr↑(c) | t_ptr↑(d) → NO MATCH → t_ptr=5
Step 5: s_ptr↑(c) | t_ptr↑(c) → MATCH → s_ptr=3 DONE
Approach 2: Dynamic Programming (For Follow-up)
What: Preprocess t to create a next-character table for efficient repeated queries.
Why: Optimizes for the follow-up scenario with many s queries.
How:
Preprocessing:
Create dp[i][c] = next position of char c in t starting from index i
Fill table from right to left
Query:
For each char in s, use table to find next occurrence
If any char not found, return FALSE
Complexity Proof:
- Preprocessing:
- Query: per string
- Space:
4. Code Deep Dive
def isSubsequence(s: str, t: str) -> bool:
# Edge case: empty s is always subsequence
if not s:
return True
# Initialize pointers for both strings
s_ptr, t_ptr = 0, 0
# Traverse both strings
while s_ptr < len(s) and t_ptr < len(t):
# Current characters match - move s pointer
if s[s_ptr] == t[t_ptr]:
s_ptr += 1
# Always move t pointer to search for next match
t_ptr += 1
# Success if we matched all characters in s
return s_ptr == len(s)
Edge Case Handling:
- Empty s (line 3-4): Returns True immediately
- s longer than t: Loop terminates by t_ptr condition, returns False
- No matches found: s_ptr never reaches end, returns False
- Partial matches: Only returns True if ALL characters matched
5. Complexity War Room
Hardware-Aware Analysis:
- At maximum |t| = 10,000, the two-pointer approach uses ~80KB for string storage + 24 bytes for pointers
- Fits comfortably in L1 cache (typically 32-64KB)
- Branch prediction benefits from sequential memory access pattern
Industry Comparison Table:
Approach | Time | Space | Readability | Interview Viability |
---|---|---|---|---|
Two-Pointer | $O( | t | )$ | |
DP Preprocessing | $O( | t | + k | s |
Recursive DFS | $O(2^{ | t | })$ | $O( |
Binary Search* | $O( | s | \log | t |
*After preprocessing character positions
6. Pro Mode Extras
Follow-up Solution for Multiple Queries:
class SubsequenceChecker:
def __init__(self, t: str):
self.t = t
self.n = len(t)
# Precompute next position for each character from each index
self.dp = [[-1] * 26 for _ in range(self.n + 1)]
# Fill DP table from right to left
for i in range(self.n - 1, -1, -1):
for c in range(26):
self.dp[i][c] = self.dp[i + 1][c]
self.dp[i][ord(t[i]) - ord('a')] = i
def is_subsequence(self, s: str) -> bool:
pos = 0
for char in s:
idx = ord(char) - ord('a')
if pos > self.n or self.dp[pos][idx] == -1:
return False
pos = self.dp[pos][idx] + 1
return True
Variants Section:
- Number of Distinct Subsequences: Count how many distinct ways s appears as subsequence in t
- Longest Common Subsequence: Find the longest string that is subsequence of both strings
- Shortest Uncommon Subsequence: Find shortest string that is subsequence of s but not t
Interview Cheat Sheet:
- First Mention: Always state time/space complexity before coding
- Edge Cases: Empty strings, s longer than t, duplicate characters
- Optimization Path: Two-pointer → Binary search → DP preprocessing
- Testing Strategy: Test with examples + empty + full match + no match cases