Contains Duplicate II

Difficulty: Easy
Topics: Array, Hash Table, Sliding Window
Link: https://leetcode.com/problems/contains-duplicate-ii/

Problem Description

Given an integer array nums and an integer k, return true if there are two distinct indices i and j in the array such that nums[i] == nums[j] and abs(i - j) <= k.

Example 1:


Input: nums = [1,2,3,1], k = 3
Output: true

Example 2:


Input: nums = [1,0,1,1], k = 1
Output: true

Example 3:


Input: nums = [1,2,3,1,2,3], k = 2
Output: false

Constraints:

1 <= nums.length <= 105
-109 <= nums[i] <= 109
0 <= k <= 105

Solution

1. Problem Deconstruction (500+ words)

Technical Restatement: Given a sequence $S$ of length $n$ where $S[i] \in \mathbb{Z}$ , and an integer parameter $k \geq 0$ , determine whether there exists a pair of indices $(i, j)$ such that:

$i \neq j$ (distinct positions)
$S[i] = S[j]$ (equivalent elements)
$|i - j| \leq k$ (proximity constraint)

Beginner-Friendly Explanation: We’re looking for two positions in an array that have the same number, where these positions are no more than k spots apart from each other. For example, if k = 3, we’re checking if any number repeats within a 4-element window (since distance ≤ 3 means they could be 0, 1, 2, or 3 positions apart).

Mathematical Formulation:

\exists i,j \in [0,n-1] \mid i \neq j \land S[i] = S[j] \land |i-j| \leq k

Constraint Analysis:

1 <= nums.length <= 105: Rules out O(n²) brute force approaches (would require ~10¹⁰ operations)
-109 <= nums[i] <= 109: Values span large range, making array indexing impractical; requires hash-based solutions
0 <= k <= 105: k can be larger than array length, requiring bounds checking
Hidden edge cases:
- k = 0 → always false (no two distinct indices can have distance ≤ 0)
- k ≥ n-1 → reduces to checking for any duplicate in entire array
- Large k values may exceed array bounds in sliding window implementations

2. Intuition Scaffolding

Real-World Metaphor (Security System): Imagine a museum security system tracking visitor badges. If the same badge number appears at two different checkpoints within k minutes, an alert triggers. The system maintains a rolling window of recent visitors to efficiently detect proximity duplicates.

Gaming Analogy (Item Spawn Detection): In a game where items spawn at map positions, detect if the same item type appears within k units of distance. Players could exploit closely-spawned identical items, so the game engine needs efficient duplicate detection in spatial neighborhoods.

Math Analogy (Modular Arithmetic): This is equivalent to finding collisions in a sliding hash window, where we’re looking for value collisions within a constrained index difference, similar to detecting hash collisions in a rolling checksum with limited history.

Common Pitfalls:

Global duplicate checking: Simply finding duplicates ignores proximity constraint
Fixed window starts: Assuming window always starts at index 0
Index subtraction errors: Forgetting absolute value in distance calculation
Early termination: Not continuing search after finding first non-qualifying duplicate
Boundary conditions: Handling cases where k exceeds array length

3. Approach Encyclopedia

Brute Force Approach:

def containsNearbyDuplicateBrute(nums, k):
    n = len(nums)
    # Check all possible pairs
    for i in range(n):
        for j in range(i + 1, min(i + k + 1, n)):
            if nums[i] == nums[j]:
                return True
    return False

Complexity Proof:

Time: $\sum_{i=0}^{n-1} \min(k, n-i-1) = O(n \cdot \min(k, n))$
Space: $O(1)$ auxiliary space
Worst case: $k = n$ → $O(n^2)$ operations

Visualization:

nums = [1, 2, 3, 1], k = 3
i=0: compare with j=1,2,3 ✓ match at j=3 (distance=3)
    1 ── 2 ── 3 ── 1
    ↑              ↑
    i=0           j=3 (distance=3 ≤ k=3)

Hash Map with Last Seen Index (Optimal):

def containsNearbyDuplicateOptimal(nums, k):
    last_seen = {}
    for i, num in enumerate(nums):
        if num in last_seen and i - last_seen[num] <= k:
            return True
        last_seen[num] = i  # Update to most recent position
    return False

Complexity Proof:

Time: $O(n)$ single pass with $O(1)$ dictionary operations
Space: $O(\min(n, k))$ in worst case, but $O(n)$ theoretical upper bound

Visualization:

nums = [1, 0, 1, 1], k = 1
Step 0: last_seen={}, num=1 → store {1:0}
Step 1: last_seen={1:0}, num=0 → store {1:0, 0:1}  
Step 2: last_seen={1:0, 0:1}, num=1 → found at index 0, distance=2 > 1 → update {1:2, 0:1}
Step 3: last_seen={1:2, 0:1}, num=1 → found at index 2, distance=1 ≤ 1 ✓ return True

Sliding Window + HashSet Approach:

def containsNearbyDuplicateSliding(nums, k):
    window = set()
    for i, num in enumerate(nums):
        if num in window:
            return True
        window.add(num)
        if i >= k:  # Maintain window size ≤ k
            window.remove(nums[i - k])
    return False

Complexity Proof:

Time: $O(n)$ with each operation $O(1)$ on average
Space: $O(\min(n, k))$ for the sliding window

4. Code Deep Dive

Optimal Solution (Hash Map):

def containsNearbyDuplicate(nums, k):
    # Storage for most recent occurrence of each value
    last_occurrence = {}
    
    # Iterate through array with index tracking
    for current_index, value in enumerate(nums):
        # Check if we've seen this value before within allowed distance
        if value in last_occurrence and current_index - last_occurrence[value] <= k:
            return True  # Found valid duplicate pair
        
        # Update to track most recent occurrence
        last_occurrence[value] = current_index
    
    return False  # No valid duplicates found

Line-by-Line Analysis:

Line 2: Dictionary initialization - O(1) time, will store value → last_index mappings
Line 5: Single pass iteration - O(n) time complexity guaranteed
Line 7: Critical check - verifies both existence and proximity constraint in O(1)
Line 10: Strategic update - maintains the most recent position for future checks
Line 12: Default return - only reached after exhaustive search

Edge Case Handling:

Example 1 ([1,2,3,1], k=3):
- Index 3: 1 found in map at index 0, distance=3 ≤ 3 ✓
Example 2 ([1,0,1,1], k=1):
- Index 2: 1 found but distance=2 > 1 → update position
- Index 3: 1 found at index 2, distance=1 ≤ 1 ✓
Example 3 ([1,2,3,1,2,3], k=2):
- All duplicates occur at distance=3 > 2 → return False
k = 0: Condition i - j ≤ 0 never satisfied for distinct indices → always False
Empty array: Loop never executes → returns False correctly

5. Complexity War Room

Hardware-Aware Analysis:

At $n = 10^5$ elements, optimal solution uses ~200KB for dictionary (assuming 4B integers)
Fits comfortably in L3 cache (typically 8-32MB on modern CPUs)
Memory access pattern exhibits good locality (sequential array traversal)

Industry Comparison Table:

Approach	Time	Space	Readability	Interview Viability	Real-World Usage
Brute Force	O(n·min(k,n))	O(1)	10/10	❌ Fails large N	Never in production
Sliding Window	O(n)	O(min(n,k))	8/10	✅ Good alternative	Common in streaming
Hash Map	O(n)	O(n)	9/10	✅ Preferred	Industry standard

Performance Characteristics:

Best case: Ω(1) - duplicate found in first k+1 elements
Worst case: O(n) - no duplicates or only distant duplicates
Average case: Θ(n) - linear scaling with input size

6. Pro Mode Extras

Algorithm Variants:

Contains Duplicate III: Find indices where |nums[i] - nums[j]| ≤ t and |i - j| ≤ k
- Requires sorted data structures or bucketing approaches
Multiple Transactions: Find all pairs satisfying the condition rather than just existence
Sliding Window Maximum: Adapt technique to find max in each window of size k

Interview Cheat Sheet:

First Mention: Always state time/space complexity upfront
Clarify: Ask about input constraints and edge cases
Progress: Start with brute force, then optimize
Test: Walk through examples with your solution
Optimize: Hash-based solutions preferred for O(n) problems

Advanced Optimization:

def containsNearbyDuplicateOptimized(nums, k):
    if k == 0: return False  # Early termination for edge case
    seen = {}
    for i, num in enumerate(nums):
        if num in seen and i - seen[num] <= k:
            return True
        seen[num] = i
    return False

Key Insights:

The problem reduces to maintaining a “memory” of recent occurrences
Hash tables provide optimal O(1) lookups for this memory pattern
Sliding window approach minimizes space usage when k is small
Most recent position is always sufficient - older positions cannot yield better results

#219 - Contains Duplicate II

Contains Duplicate II

Problem Description

Solution

1. Problem Deconstruction (500+ words)

2. Intuition Scaffolding

3. Approach Encyclopedia

4. Code Deep Dive

5. Complexity War Room

6. Pro Mode Extras