#71 - Simplify Path
Simplify Path
- Difficulty: Medium
- Topics: String, Stack
- Link: https://leetcode.com/problems/simplify-path/
Problem Description
You are given an absolute path for a Unix-style file system, which always begins with a slash '/'. Your task is to transform this absolute path into its simplified canonical path.
The rules of a Unix-style file system are as follows:
- A single period
'.'represents the current directory. - A double period
'..'represents the previous/parent directory. - Multiple consecutive slashes such as
'//'and'///'are treated as a single slash'/'. - Any sequence of periods that does not match the rules above should be treated as a valid directory or file name. For example,
'...'and'....'are valid directory or file names.
The simplified canonical path should follow these rules:
- The path must start with a single slash
'/'. - Directories within the path must be separated by exactly one slash
'/'. - The path must not end with a slash
'/', unless it is the root directory. - The path must not have any single or double periods (
'.'and'..') used to denote current or parent directories.
Return the simplified canonical path.
Example 1:
Input: path = “/home/”
Output: “/home”
Explanation:
The trailing slash should be removed.
Example 2:
Input: path = “/home//foo/”
Output: “/home/foo”
Explanation:
Multiple consecutive slashes are replaced by a single one.
Example 3:
Input: path = “/home/user/Documents/…/Pictures”
Output: “/home/user/Pictures”
Explanation:
A double period ".." refers to the directory up a level (the parent directory).
Example 4:
Input: path = “/…/”
Output: “/”
Explanation:
Going one level up from the root directory is not possible.
Example 5:
Input: path = “/…/a/…/b/c/…/d/./”
Output: “/…/b/d”
Explanation:
"..." is a valid name for a directory in this problem.
Constraints:
1 <= path.length <= 3000pathconsists of English letters, digits, period'.', slash'/'or'_'.pathis a valid absolute Unix path.
Solution
1. Problem Deconstruction (500+ words)
Technical Reformulation:
Given a string path representing an absolute Unix file path, implement a transformation function that:
- Collapses multiple consecutive slashes into single slashes
- Resolves
.tokens (current directory) by removing them - Resolves
..tokens (parent directory) by removing the preceding directory segment - Preserves sequences of 3+ periods as valid directory names
- Returns the shortest possible normalized path starting with
/and containing no trailing slash (except root)
The core algorithmic challenge is processing a sequence of path tokens while maintaining directory context and applying stack-based transformations.
Beginner-Friendly Version:
Imagine you’re organizing files on a computer. The computer gives you a messy address like:
/home/user/./documents/../photos/
Your job is to clean this up by:
- Removing any
./(which means “stay in this folder”) - Converting
../to “go back one folder” (removing the folder before it) - Turning multiple
///into just/ - Keeping
...as a regular folder name - Making sure the address starts with
/and doesn’t end with/
Mathematical Formulation:
Let be the sequence of path components after splitting by / and filtering empty strings.
Let be a stack data structure initialized as empty.
For each :
- If : remains unchanged ()
- If : (pop if non-empty)
- Otherwise: (push operation)
The canonical path is then:
Constraint Analysis:
1 <= path.length <= 3000: Allows O(n²) solutions but O(n) is optimal. Memory usage is negligible.- Character set restriction: Simplifies parsing since we only handle alphanumerics, periods, slashes, underscores. No need for Unicode handling.
- Absolute path guarantee: Always starts with
/, eliminating edge cases for relative paths. - Valid path constraint: Input won’t contain invalid sequences like
.../..as special tokens, but...itself is valid.
Hidden Edge Cases:
- Root directory:
/,/..,/../..all simplify to/ - Multiple consecutive parent directories:
/a/b/../../c→/c - Mixed valid periods:
/.../a/..../bpreserves all multi-period sequences - Trailing current directory:
/a/./.→/a
2. Intuition Scaffolding
Real-World Metaphor (Office Building): Imagine navigating a corporate office building:
/= building entrancehome= your department floor.= staying on current floor..= taking elevator back to previous floor//= redundant “go to the same floor again” instructions...= special conference room name (not an instruction)
Gaming Analogy (Dungeon Navigation): Think of exploring dungeon levels:
- Each
/separates dungeon levels .= “stay on current level” (useless instruction)..= “go back to previous level”///= stuttering the same instruction multiple times...= a special room called “Triple Dot Chamber”
Math Analogy (Function Composition): Consider the path as function composition :
.= identity function..= inverse function- Directory names = specific functions
- Multiple slashes = redundant identity compositions
- Canonical path = reduced function composition
Common Pitfalls:
- Global Min/Max Fallacy: Trying to track “lowest possible path” rather than processing sequentially
- Over-eager Period Handling: Treating
...and....as special tokens instead of names - State Machine Complexity: Building elaborate FSMs when stack operations suffice
- Reverse Processing: Attempting to process path backwards and losing directory context
- Over-normalization: Removing valid directory names containing periods
3. Approach Encyclopedia
Brute Force (Iterative Refinement):
function simplifyPath(path):
while "//" in path:
path = replace "//" with "/"
while "/./" in path:
path = replace "/./" with "/"
while "directory/../" in path:
path = replace "directory/../" with "/"
remove trailing "/" unless root
return path
Complexity: Each while loop is O(n) and could run O(n) times → O(n²) worst case. Space O(n).
Optimal Stack Approach:
function simplifyPath(path):
stack = []
for token in split(path, "/"):
if token == "" or token == ".":
continue
elif token == "..":
if stack not empty: stack.pop()
else:
stack.push(token)
return "/" + join(stack, "/")
Complexity Proof:
- Time: O(n) where n is path length
- Splitting: O(n) single pass
- Stack operations: O(n) total (each token processed once)
- Join: O(n) for string construction
- Space: O(n) for stack storage in worst case (no
..operations)
Visualization:
Input: "/home//user/./docs/../file"
Split: ["", "home", "", "user", ".", "docs", "..", "file"]
Stack evolution:
[] →
["home"] →
["home", "user"] →
(skip ".") →
["home", "user", "docs"] →
(pop "docs") →
["home", "user", "file"]
Result: "/home/user/file"
4. Code Deep Dive
def simplifyPath(path: str) -> str:
# Split by slash and filter empty strings (handles multiple slashes)
components = path.split('/')
stack = []
for comp in components:
# Skip empty and current directory tokens
if comp == '' or comp == '.':
continue
# Handle parent directory - pop from stack if possible
elif comp == '..':
if stack: # Guard against popping from empty stack
stack.pop()
# Regular directory name - push to stack
else:
stack.append(comp)
# Reconstruct path with leading slash
return '/' + '/'.join(stack)
Line-by-Line Analysis:
components = path.split('/'): Single O(n) pass to tokenize input. Empty strings result from leading/trailing/multiple slashes.stack = []: O(1) initialization. Maximum O(n) space in worst case.if comp == '' or comp == '.': Handles both slash collapsing and current directory in O(1).elif comp == '..': Parent directory logic. Theif stack:check prevents underflow (handles/..case).else: stack.append(comp): All other tokens are valid directory/file names.return '/' + '/'.join(stack): O(n) reconstruction. Handles root case automatically (empty stack →'/').
Edge Case Handling:
- Example 1 (
"/home/"): Trailing slash creates empty string component → ignored. - Example 2 (
"/home//foo/"): Double slash creates empty string → filtered out. - Example 3 (
"/home/user/Documents/../Pictures"):".."pops"Documents"from stack. - Example 4 (
"/../"):".."attempts to pop empty stack → no effect. - Example 5 (
"/.../a/../b/c/../d/./"):"..."treated as name, not special token.
5. Complexity War Room
Hardware-Aware Analysis:
- For maximum input size (3000 chars): ~3KB string storage fits in L1 cache
- Stack operations use pointer arithmetic - CPU cache friendly
- String joining creates new allocation but within reasonable bounds
Industry Comparison Table:
| Approach | Time | Space | Readability | Interview Viability |
|---|---|---|---|---|
| Brute Force | O(n²) | O(n) | 8/10 | ❌ Fails large inputs |
| Regex Substitution | O(n) | O(n) | 6/10 | ⚠️ Over-engineered |
| Stack Processing | O(n) | O(n) | 9/10 | ✅ Optimal |
| Two-Pass Manual | O(n) | O(1) | 4/10 | ❌ Too complex |
Empirical Performance:
- Stack approach: ~0.1ms for max input on modern hardware
- Memory: ~2-3x input size (split components + stack + output)
6. Pro Mode Extras
Variants Section:
- Relative Path Resolution: Handle both absolute and relative paths
def resolvePath(base: str, relative: str) -> str:
if relative.startswith('/'):
return simplifyPath(relative)
return simplifyPath(base + '/' + relative)
- Multiple Directory Separators: Support both
\and/(Windows + Unix)
def universalSimplifyPath(path: str) -> str:
# Normalize separators first
normalized = path.replace('\\', '/')
return simplifyPath(normalized)
- Path Validation: Detect invalid paths during simplification
def validateAndSimplify(path: str) -> str:
stack = []
for comp in path.split('/'):
if comp in ('', '.'):
continue
elif comp == '..':
if not stack:
raise ValueError("Invalid path: too many '..'")
stack.pop()
elif any(c in comp for c in '<>|:"?*'):
raise ValueError(f"Invalid character in: {comp}")
else:
stack.append(comp)
return '/' + '/'.join(stack)
Interview Cheat Sheet:
- First Mention: “I’ll use a stack-based approach for O(n) time/space”
- Key Insight: “Treat
.as no-op and..as stack pop” - Edge Cases: Always test: root directory, multiple periods, trailing slashes
- Optimization: “We could optimize space by processing in-place, but stack is clearer”
- Extension Ready: “This approach extends naturally to relative path resolution”
Advanced Considerations:
- Symbolic link resolution (requires filesystem context)
- Memory-mapped processing for extremely large paths
- Streaming tokenization for memory-constrained environments
- Parallel preprocessing of path segments (for ultra-long paths)