← ~/lc-grind/posts

#128 - Longest Consecutive Sequence

May 30, 2025

Longest Consecutive Sequence

Problem Description

Given an unsorted array of integers nums, return the length of the longest consecutive elements sequence.

You must write an algorithm that runs in O(n) time.

Example 1:


Input: nums = [100,4,200,1,3,2]
Output: 4
Explanation: The longest consecutive elements sequence is [1, 2, 3, 4]. Therefore its length is 4.

Example 2:


Input: nums = [0,3,7,2,5,8,4,6,0,1]
Output: 9

Example 3:


Input: nums = [1,0,1,2]
Output: 3

Constraints:

  • 0 <= nums.length <= 105
  • -109 <= nums[i] <= 109

Solution

1. Problem Deconstruction

Technical Description:
Given an unsorted array of integers, compute the maximum length of a contiguous interval of integers that appear in the array. A contiguous interval is a set of integers {k,k+1,k+2,,k+m1}\{k, k+1, k+2, \dots, k+m-1\} where each element exists in the input. The solution must run in O(n)O(n) time with nn as the input size.

Beginner Description:
Imagine you have a bag of numbered tiles. Find the longest unbroken chain of tiles where each number is exactly one more than the previous (like 5, 6, 7). The chain doesn’t need to appear in order in the bag. Solve this efficiently even for huge bags (100,000+ tiles).

Mathematical Description:
Let SS be the set of input integers. Find:

max{kaZ such that {a,a+1,,a+k1}S}\max \left\{ k \mid \exists a \in \mathbb{Z} \text{ such that } \{a, a+1, \dots, a+k-1\} \subseteq S \right\}

Constraint Analysis:

  • 0nums.length1050 \leq \text{nums.length} \leq 10^5:
    • Empty input returns 0.
    • O(n2)O(n^2) solutions fail (1e10 operations at n=105n=10^5). Forces O(n)O(n) design.
  • 109nums[i]109-10^9 \leq \text{nums}[i] \leq 10^9:
    • Range too large for array indexing (2e9+1 slots). Requires hash-based lookups.
    • Negative numbers and zero handled naturally via sets.

Hidden Edge Cases:

  • Duplicates: Ignore duplicates (consecutive sequences require distinct integers).
  • Single-element sequences: Minimum sequence length is 1.
  • Disjoint sequences: e.g., [1,3,5,2,4][1, 3, 5, 2, 4] has {1,2,3,4,5}\{1,2,3,4,5\} (length 5).
  • Overflow risks: Large integers don’t affect set lookups (hash collisions negligible).

2. Intuition Scaffolding

Real-World Metaphor:
Like finding the longest unbroken strand of pearls where each pearl is numbered consecutively. Instead of sorting all pearls (slow), quickly check neighbors by dumping pearls into labeled buckets (hash set).

Gaming Analogy:
In a dungeon crawler, teleporters are placed at random integer coordinates. Find the longest path where you can hop between adjacent teleporters (xx+1x \to x+1). Optimize by only starting searches at teleporters with no left neighbor.

Math Analogy:
Maximize I|I| where II is an interval [a,b][a, b] with step size 1 and ISI \subseteq S. Requires efficient membership queries without sorting SS.

Common Pitfalls:

  1. Naive brute force: For each number, scan forward until break. Worst-case O(n2)O(n^2) (e.g., [1,2,,n][1,2,\dots,n]).
  2. Sorting: O(nlogn)O(n \log n) violates O(n)O(n) requirement.
  3. Global min/max focus: Longest sequence may not start at global min (e.g., [5,1,2,3][-5, 1, 2, 3]).
  4. Duplicate mishandling: Counting duplicates inflates sequence length (e.g., [1,1,2][1,1,2] is length 2, not 3).
  5. Over-engineering with union-find: Unnecessary for linear sequences (simpler set method exists).

3. Approach Encyclopedia

Approach 1: Brute Force (Naive)

Pseudocode:

max_len = 0
for num in nums:
    current = num
    while current + 1 in nums_set:
        current += 1
    max_len = max(max_len, current - num + 1)

Complexity Proof:

  • Time: O(nL)O(n \cdot L) where LL is max sequence length. Worst-case O(n2)O(n^2) (single sequence).
  • Space: O(n)O(n) for the set.
    Visualization:
nums = [1, 2, 3]  
1 → 2 → 3 (length=3)  
2 → 3     (length=2)  
3 →       (length=1)  

Approach 2: Sorting

Pseudocode:

sort(nums)
remove_duplicates_in_sorted()  # Or use set first
max_len, curr_len = 0, 1
for i in range(1, len(nums)):
    if nums[i] == nums[i-1] + 1:
        curr_len += 1
    else:
        max_len = max(max_len, curr_len)
        curr_len = 1
return max(max_len, curr_len)

Complexity Proof:

  • Time: O(nlogn)O(n \log n) for sorting + O(n)O(n) scan. Dominated by sort.
  • Space: O(1)O(1) (in-place sort) or O(n)O(n) (if copying).
    Visualization:
Sorted: [1, 2, 3, 100, 200]
Scan: 1→2→3 (break at 100), curr_len=3 → max_len=3.

Approach 3: Set with Sequence Heads (Optimal)

Pseudocode:

nums_set = set(nums)
max_len = 0
for num in nums_set:
    if num - 1 not in nums_set:  # Head of sequence
        curr = num
        while curr + 1 in nums_set:
            curr += 1
        max_len = max(max_len, curr - num + 1)

Complexity Proof:

  • Time: O(n)O(n) — Each element processed twice max (once in loop, once in sequence).
  • Space: O(n)O(n) for the set.
    Visualization:
Set: {100, 4, 200, 1, 3, 2}
Heads: 100 (no 99), 200 (no 199), 1 (no 0)
1 → 2 → 3 → 4 (length=4)

Approach 4: Boundary Mapping with Dictionary

Pseudocode:

boundary = {}
max_len = 0
for num in nums:
    if num not in boundary:
        L = boundary.get(num-1, 0)
        R = boundary.get(num+1, 0)
        length = 1 + L + R
        boundary[num] = length
        boundary[num - L] = length  # Update left boundary
        boundary[num + R] = length  # Update right boundary
        max_len = max(max_len, length)

Complexity Proof:

  • Time: O(n)O(n) — Each num processed once, dictionary ops O(1)O(1).
  • Space: O(n)O(n) — Store boundaries for sequences.
    Visualization:
nums = [1, 2, 0, 3]
Step 0: num=1 → L=0, R=0 → length=1 → boundaries{1:1}
Step 1: num=2 → L=boundary[1]=1, R=0 → length=2 → update boundaries{1:2, 2:2}
Step 2: num=0 → L=0, R=boundary[1]=2 → length=3 → update boundaries{0:3, 1:2 → 3? NO, update 0 and 2:}
   boundary[0]=3, boundary[2]=3 (and boundary[1] remains, but unused)
Step 3: num=3 → L=boundary[2]=3, R=0 → length=4 → update boundaries{3:4, 0:4, 3:4}

4. Code Deep Dive

Optimal Solution (Set with Heads):

def longestConsecutive(nums):
    if not nums:
        return 0
    num_set = set(nums)          # Remove duplicates, O(1) lookups
    max_length = 0
    for num in num_set:          # Iterate unique numbers
        if num - 1 not in num_set:  # Identify sequence head
            current_num = num
            current_length = 1
            while current_num + 1 in num_set:  # Extend sequence
                current_num += 1
                current_length += 1
            max_length = max(max_length, current_length)
    return max_length

Edge Case Handling:

  • Empty Input: if not nums returns 0.
  • Duplicates: Set removes them (e.g., [0,0][0,0] → set {0}\{0\}).
  • Example 1: [100,4,200,1,3,2][100,4,200,1,3,2]
    • Heads: 100100, 200200, 11
    • Sequence from 11: 12341→2→3→4 (length 4).
  • Example 2: [0,3,7,2,5,8,4,6,0,1][0,3,7,2,5,8,4,6,0,1]
    • Heads: 00 (no 1-1), others skipped.
    • Sequence: 0180→1→\dots→8 (length 9).
  • Example 3: [1,0,1,2][1,0,1,2]
    • Heads: 00 (no 1-1), sequence 0120→1→2 (length 3).

5. Complexity War Room

Hardware-Aware Analysis:

  • At n=105n=10^5, set uses 100,000×30 bytes=3 MB\approx 100,000 \times 30 \text{ bytes} = 3 \text{ MB} (fits in L3 cache).
  • Single pass with O(1)O(1) lookups: 105\approx 10^5 operations (fast on modern CPUs).

Industry Comparison Table:

Approach Time Space Readability Interview Viability
Brute Force O(n2)O(n^2) O(n)O(n) 9/10 ❌ Fails large nn
Sorting O(nlogn)O(n \log n) O(1)O(1) 8/10 ⚠️ Not O(n)O(n)
Set (Optimal) O(n)O(n) O(n)O(n) 10/10 ✅ Clear & efficient
Boundary Mapping O(n)O(n) O(n)O(n) 7/10 ✅ Clever but tricky

6. Pro Mode Extras

Variants:

  1. Longest Consecutive Sequence in Binary Tree (Leetcode 298):
    • Consecutive path along parent-child edges. DFS with parent value tracking.
  2. Multiple Transactions (Leetcode 123):
    • Extend to kk transactions (dynamic programming with 3D state).
  3. Longest Arithmetic Subsequence (Leetcode 1027):
    • Generalize to arbitrary difference (not just 1).

Interview Cheat Sheet:

  • First Step: Clarify “consecutive” = value difference 1 (not array position).
  • Key Insight: Only start sequences at heads (no left neighbor) for O(n)O(n).
  • Mention: Set for O(1)O(1) lookups and duplicate removal.
  • Test: Empty input, duplicates, disjoint sequences, all duplicates.
  • Tradeoffs: Set uses O(n)O(n) space but enables O(n)O(n) time.