← ~/lc-grind/posts

#219 - Contains Duplicate II

October 15, 2025

Contains Duplicate II

Problem Description

Given an integer array nums and an integer k, return true if there are two distinct indices i and j in the array such that nums[i] == nums[j] and abs(i - j) <= k.

Example 1:


Input: nums = [1,2,3,1], k = 3
Output: true

Example 2:


Input: nums = [1,0,1,1], k = 1
Output: true

Example 3:


Input: nums = [1,2,3,1,2,3], k = 2
Output: false

Constraints:

  • 1 <= nums.length <= 105
  • -109 <= nums[i] <= 109
  • 0 <= k <= 105

Solution

1. Problem Deconstruction (500+ words)

Technical Restatement: Given a sequence SS of length nn where S[i]ZS[i] \in \mathbb{Z}, and an integer parameter k0k \geq 0, determine whether there exists a pair of indices (i,j)(i, j) such that:

  1. iji \neq j (distinct positions)
  2. S[i]=S[j]S[i] = S[j] (equivalent elements)
  3. ijk|i - j| \leq k (proximity constraint)

Beginner-Friendly Explanation: We’re looking for two positions in an array that have the same number, where these positions are no more than k spots apart from each other. For example, if k = 3, we’re checking if any number repeats within a 4-element window (since distance ≤ 3 means they could be 0, 1, 2, or 3 positions apart).

Mathematical Formulation:

i,j[0,n1]ijS[i]=S[j]ijk\exists i,j \in [0,n-1] \mid i \neq j \land S[i] = S[j] \land |i-j| \leq k

Constraint Analysis:

  • 1 <= nums.length <= 105: Rules out O(n²) brute force approaches (would require ~10¹⁰ operations)
  • -109 <= nums[i] <= 109: Values span large range, making array indexing impractical; requires hash-based solutions
  • 0 <= k <= 105: k can be larger than array length, requiring bounds checking
  • Hidden edge cases:
    • k = 0 → always false (no two distinct indices can have distance ≤ 0)
    • k ≥ n-1 → reduces to checking for any duplicate in entire array
    • Large k values may exceed array bounds in sliding window implementations

2. Intuition Scaffolding

Real-World Metaphor (Security System): Imagine a museum security system tracking visitor badges. If the same badge number appears at two different checkpoints within k minutes, an alert triggers. The system maintains a rolling window of recent visitors to efficiently detect proximity duplicates.

Gaming Analogy (Item Spawn Detection): In a game where items spawn at map positions, detect if the same item type appears within k units of distance. Players could exploit closely-spawned identical items, so the game engine needs efficient duplicate detection in spatial neighborhoods.

Math Analogy (Modular Arithmetic): This is equivalent to finding collisions in a sliding hash window, where we’re looking for value collisions within a constrained index difference, similar to detecting hash collisions in a rolling checksum with limited history.

Common Pitfalls:

  1. Global duplicate checking: Simply finding duplicates ignores proximity constraint
  2. Fixed window starts: Assuming window always starts at index 0
  3. Index subtraction errors: Forgetting absolute value in distance calculation
  4. Early termination: Not continuing search after finding first non-qualifying duplicate
  5. Boundary conditions: Handling cases where k exceeds array length

3. Approach Encyclopedia

Brute Force Approach:

def containsNearbyDuplicateBrute(nums, k):
    n = len(nums)
    # Check all possible pairs
    for i in range(n):
        for j in range(i + 1, min(i + k + 1, n)):
            if nums[i] == nums[j]:
                return True
    return False

Complexity Proof:

  • Time: i=0n1min(k,ni1)=O(nmin(k,n))\sum_{i=0}^{n-1} \min(k, n-i-1) = O(n \cdot \min(k, n))
  • Space: O(1)O(1) auxiliary space
  • Worst case: k=nk = nO(n2)O(n^2) operations

Visualization:

nums = [1, 2, 3, 1], k = 3
i=0: compare with j=1,2,3 ✓ match at j=3 (distance=3)
    1 ── 2 ── 3 ── 1
    ↑              ↑
    i=0           j=3 (distance=3 ≤ k=3)

Hash Map with Last Seen Index (Optimal):

def containsNearbyDuplicateOptimal(nums, k):
    last_seen = {}
    for i, num in enumerate(nums):
        if num in last_seen and i - last_seen[num] <= k:
            return True
        last_seen[num] = i  # Update to most recent position
    return False

Complexity Proof:

  • Time: O(n)O(n) single pass with O(1)O(1) dictionary operations
  • Space: O(min(n,k))O(\min(n, k)) in worst case, but O(n)O(n) theoretical upper bound

Visualization:

nums = [1, 0, 1, 1], k = 1
Step 0: last_seen={}, num=1 → store {1:0}
Step 1: last_seen={1:0}, num=0 → store {1:0, 0:1}  
Step 2: last_seen={1:0, 0:1}, num=1 → found at index 0, distance=2 > 1 → update {1:2, 0:1}
Step 3: last_seen={1:2, 0:1}, num=1 → found at index 2, distance=1 ≤ 1 ✓ return True

Sliding Window + HashSet Approach:

def containsNearbyDuplicateSliding(nums, k):
    window = set()
    for i, num in enumerate(nums):
        if num in window:
            return True
        window.add(num)
        if i >= k:  # Maintain window size ≤ k
            window.remove(nums[i - k])
    return False

Complexity Proof:

  • Time: O(n)O(n) with each operation O(1)O(1) on average
  • Space: O(min(n,k))O(\min(n, k)) for the sliding window

4. Code Deep Dive

Optimal Solution (Hash Map):

def containsNearbyDuplicate(nums, k):
    # Storage for most recent occurrence of each value
    last_occurrence = {}
    
    # Iterate through array with index tracking
    for current_index, value in enumerate(nums):
        # Check if we've seen this value before within allowed distance
        if value in last_occurrence and current_index - last_occurrence[value] <= k:
            return True  # Found valid duplicate pair
        
        # Update to track most recent occurrence
        last_occurrence[value] = current_index
    
    return False  # No valid duplicates found

Line-by-Line Analysis:

  • Line 2: Dictionary initialization - O(1) time, will store value → last_index mappings
  • Line 5: Single pass iteration - O(n) time complexity guaranteed
  • Line 7: Critical check - verifies both existence and proximity constraint in O(1)
  • Line 10: Strategic update - maintains the most recent position for future checks
  • Line 12: Default return - only reached after exhaustive search

Edge Case Handling:

  • Example 1 ([1,2,3,1], k=3):
    • Index 3: 1 found in map at index 0, distance=3 ≤ 3 ✓
  • Example 2 ([1,0,1,1], k=1):
    • Index 2: 1 found but distance=2 > 1 → update position
    • Index 3: 1 found at index 2, distance=1 ≤ 1 ✓
  • Example 3 ([1,2,3,1,2,3], k=2):
    • All duplicates occur at distance=3 > 2 → return False
  • k = 0: Condition i - j ≤ 0 never satisfied for distinct indices → always False
  • Empty array: Loop never executes → returns False correctly

5. Complexity War Room

Hardware-Aware Analysis:

  • At n=105n = 10^5 elements, optimal solution uses ~200KB for dictionary (assuming 4B integers)
  • Fits comfortably in L3 cache (typically 8-32MB on modern CPUs)
  • Memory access pattern exhibits good locality (sequential array traversal)

Industry Comparison Table:

Approach Time Space Readability Interview Viability Real-World Usage
Brute Force O(n·min(k,n)) O(1) 10/10 ❌ Fails large N Never in production
Sliding Window O(n) O(min(n,k)) 8/10 ✅ Good alternative Common in streaming
Hash Map O(n) O(n) 9/10 ✅ Preferred Industry standard

Performance Characteristics:

  • Best case: Ω(1) - duplicate found in first k+1 elements
  • Worst case: O(n) - no duplicates or only distant duplicates
  • Average case: Θ(n) - linear scaling with input size

6. Pro Mode Extras

Algorithm Variants:

  1. Contains Duplicate III: Find indices where |nums[i] - nums[j]| ≤ t and |i - j| ≤ k
    • Requires sorted data structures or bucketing approaches
  2. Multiple Transactions: Find all pairs satisfying the condition rather than just existence
  3. Sliding Window Maximum: Adapt technique to find max in each window of size k

Interview Cheat Sheet:

  • First Mention: Always state time/space complexity upfront
  • Clarify: Ask about input constraints and edge cases
  • Progress: Start with brute force, then optimize
  • Test: Walk through examples with your solution
  • Optimize: Hash-based solutions preferred for O(n) problems

Advanced Optimization:

def containsNearbyDuplicateOptimized(nums, k):
    if k == 0: return False  # Early termination for edge case
    seen = {}
    for i, num in enumerate(nums):
        if num in seen and i - seen[num] <= k:
            return True
        seen[num] = i
    return False

Key Insights:

  • The problem reduces to maintaining a “memory” of recent occurrences
  • Hash tables provide optimal O(1) lookups for this memory pattern
  • Sliding window approach minimizes space usage when k is small
  • Most recent position is always sufficient - older positions cannot yield better results