#399 - Evaluate Division
Evaluate Division
- Difficulty: Medium
- Topics: Array, String, Depth-First Search, Breadth-First Search, Union Find, Graph, Shortest Path
- Link: https://leetcode.com/problems/evaluate-division/
Problem Description
You are given an array of variable pairs equations and an array of real numbers values, where equations[i] = [Ai, Bi] and values[i] represent the equation Ai / Bi = values[i]. Each Ai or Bi is a string that represents a single variable.
You are also given some queries, where queries[j] = [Cj, Dj] represents the jth query where you must find the answer for Cj / Dj = ?.
Return the answers to all queries. If a single answer cannot be determined, return -1.0.
Note: The input is always valid. You may assume that evaluating the queries will not result in division by zero and that there is no contradiction.
**Note:**The variables that do not occur in the list of equations are undefined, so the answer cannot be determined for them.
Example 1:
Input: equations = [["a","b"],["b","c"]], values = [2.0,3.0], queries = [["a","c"],["b","a"],["a","e"],["a","a"],["x","x"]]
Output: [6.00000,0.50000,-1.00000,1.00000,-1.00000]
Explanation:
Given: a / b = 2.0, b / c = 3.0
queries are: a / c = ?, b / a = ?, a / e = ?, a / a = ?, x / x = ?
return: [6.0, 0.5, -1.0, 1.0, -1.0 ]
note: x is undefined => -1.0
Example 2:
Input: equations = [["a","b"],["b","c"],["bc","cd"]], values = [1.5,2.5,5.0], queries = [["a","c"],["c","b"],["bc","cd"],["cd","bc"]]
Output: [3.75000,0.40000,5.00000,0.20000]
Example 3:
Input: equations = [["a","b"]], values = [0.5], queries = [["a","b"],["b","a"],["a","c"],["x","y"]]
Output: [0.50000,2.00000,-1.00000,-1.00000]
Constraints:
1 <= equations.length <= 20equations[i].length == 21 <= Ai.length, Bi.length <= 5values.length == equations.length0.0 < values[i] <= 20.01 <= queries.length <= 20queries[i].length == 21 <= Cj.length, Dj.length <= 5Ai, Bi, Cj, Djconsist of lower case English letters and digits.
Solution
1. Problem Deconstruction
Technical Restatement
Given a directed weighted graph representation of variable relationships where:
- Nodes represent string variables
- Directed edges
A → Bhave weightvalues[i]representingA/B = values[i] - Implicit reverse edges
B → Aexist with weight1/values[i] - Must compute division results for query pairs by finding valid paths and multiplying edge weights
- Return
-1.0for undefined variables or disconnected components
Beginner-Friendly Version
Imagine you have a list of conversion rates between different currencies. For example:
- “2 dollars = 1 euro” means dollar/euro = 2.0
- “3 euros = 1 pound” means euro/pound = 3.0
If someone asks “How many dollars per pound?”, you’d multiply: dollars/euro × euros/pound = 2.0 × 3.0 = 6.0
Some currencies might not be connected (like “yen” not mentioned), so you can’t convert between them.
Mathematical Formulation
Given:
- Set of equations: with values where
- Set of queries:
Find: where are edge weights along path
If no path exists or variables undefined:
Constraint Analysis
| Constraint | Impact Analysis | Edge Cases |
|---|---|---|
1 <= equations.length <= 20 |
Small enough for exponential algorithms | Minimal equations (1) vs maximum (20) |
equations[i].length == 2 |
Consistent bipartite structure | No self-loops in input data |
1 <= Ai.length, Bi.length <= 5 |
Short variable names, hash efficiency | Single character vs 5-character names |
values.length == equations.length |
1:1 mapping guaranteed | No missing or extra conversion factors |
0.0 < values[i] <= 20.0 |
No zero/negative division issues | Very small (>0) vs large (20.0) values |
1 <= queries.length <= 20 |
Query count manageable | Single query optimization possible |
Cj.length, Dj.length <= 5 |
Consistent variable naming | Unknown variables easily detectable |
| Hidden Constraints | Algorithmic Implications | Test Considerations |
| No division by zero | All denominators valid in queries | Self-query (a/a) always = 1.0 |
| No contradiction | Graph is consistent | All paths between nodes give same result |
| Variables may be undefined | Must track existence | Queries with undefined variables |
2. Intuition Scaffolding
Real-World Metaphor: Currency Exchange Network
Think of variables as different currencies and values as exchange rates. If:
- USD/EUR = 0.8 and EUR/GBP = 0.9
- Then USD/GBP = USD/EUR × EUR/GBP = 0.8 × 0.9 = 0.72
The graph represents this currency conversion network where finding exchange rates becomes path finding.
Gaming Analogy: Crafting Recipe Progression
In games like Minecraft or Factorio, recipes form dependency chains:
- 2 Wood → 1 Plank
- 4 Planks → 1 Crafting Table
- Query: “How much Wood per Crafting Table?” = 2 × 4 = 8
Unknown materials represent disconnected components in the crafting graph.
Math Analogy: Matrix of Multiplicative Relationships
Each equation creates relationships in a multiplicative group. The problem reduces to solving:
where M represents the coefficient matrix of the ratio relationships.
Common Pitfalls
- Global Min/Max Fallacy: Trying to find “cheapest conversion” rather than exact multiplicative path
- Bidirectional Edge Confusion: Forgetting that A/B = v implies B/A = 1/v
- Path Existence Assumption: Assuming all variables are connected when they form separate components
- Self-Query Oversight: Missing that a/a = 1.0 regardless of equations
- Floating Point Precision: Accumulating rounding errors in long multiplication chains
- Early Termination: Stopping BFS/DFS at first solution when multiple paths should give same result
- Variable Existence Check: Not verifying both query variables exist in the equation set
3. Approach Encyclopedia
Approach 1: Brute Force Path Exploration
Algorithm: BruteForceEvaluateDivision
Input: equations, values, queries
Output: results array
1. Build adjacency list from equations and values
- For each (A, B, value): add A→B:value and B→A:1/value
2. For each query (C, D):
a. If C or D not in graph: result ← -1.0
b. Else if C == D: result ← 1.0
c. Else:
visited ← set()
result ← DFS(C, D, 1.0, visited)
If result not found: return -1.0
3. DFS(current, target, product, visited):
a. If current == target: return product
b. Mark current visited
c. For each neighbor with weight:
If neighbor not visited:
result ← DFS(neighbor, target, product × weight, visited)
If result ≠ -1: return result
d. Return -1.0
Complexity Proof:
- Time: where D = maximum depth
- Worst case: with 20 queries and chain length 20
- Space: for graph + recursion depth
Visualization:
Equations: a/b=2, b/c=3
Graph:
a → b (2.0) b → a (0.5)
b → c (3.0) c → b (0.333)
Query a/c:
Start: a (product=1.0)
→ b (1.0×2.0=2.0)
→ c (2.0×3.0=6.0) ✓
Approach 2: BFS with Path Tracking
Algorithm: BFSEvaluateDivision
Input: equations, values, queries
Output: results array
1. Build graph as adjacency list
2. For each query (C, D):
a. If C or D not in graph: -1.0
b. If C == D: 1.0
c. Else:
queue ← [(C, 1.0)], visited ← {C}
While queue not empty:
(node, product) ← dequeue
If node == D: return product
For each neighbor with weight:
If neighbor not visited:
visited.add(neighbor)
enqueue(neighbor, product × weight)
Return -1.0
Complexity Proof:
- Time: = = worst case
- Space: for graph + for BFS queue
Visualization:
Query a/c:
Queue: [(a,1.0)]
Process a: neighbors b(2.0) → Queue: [(b,2.0)]
Process b: neighbors a(0.5), c(3.0)
Skip a (visited), Add c → Queue: [(c,6.0)]
Process c: Found target! Return 6.0
Approach 3: Floyd-Warshall All-Pairs (Optimal)
Algorithm: FloydWarshallEvaluateDivision
Input: equations, values, queries
Output: results array
1. Collect all unique variables → list nodes
2. Create distance matrix dist[n][n] with:
- dist[i][i] = 1.0
- dist[i][j] = known value if edge i→j exists
- Else dist[i][j] = -1.0 (unknown)
3. For k from 0 to n-1:
For i from 0 to n-1:
For j from 0 to n-1:
If dist[i][k] ≠ -1 and dist[k][j] ≠ -1:
If dist[i][j] == -1:
dist[i][j] = dist[i][k] × dist[k][j]
4. For each query (C, D):
i = indexOf(C), j = indexOf(D)
Return dist[i][j] if both exist, else -1.0
Complexity Proof:
- Time: where V ≤ 40 → =
- Space: = for distance matrix
Mathematical Derivation: Floyd-Warshall works because division relationships are transitive: If and then
The algorithm computes: but since all paths give same result (no contradictions), it’s just path finding.
4. Code Deep Dive
def calcEquation(equations, values, queries):
"""
Optimal Solution using Floyd-Warshall All-Pairs Shortest Path
Time: O(V^3 + Q), Space: O(V^2) where V = unique variables
"""
# Step 1: Collect all variables and create index mapping
variables = set()
for A, B in equations:
variables.add(A)
variables.add(B)
n = len(variables)
node_to_idx = {node: i for i, node in enumerate(variables)}
# Step 2: Initialize distance matrix
dist = [[-1.0] * n for _ in range(n)]
# Self-relationships: a/a = 1.0
for i in range(n):
dist[i][i] = 1.0
# Step 3: Populate direct relationships from equations
for (A, B), value in zip(equations, values):
i, j = node_to_idx[A], node_to_idx[B]
dist[i][j] = value # A/B = value
dist[j][i] = 1.0 / value # B/A = 1/value
# Step 4: Floyd-Warshall to compute all pairs
for k in range(n): # Intermediate node
for i in range(n): # Source node
for j in range(n): # Target node
# If path i→k and k→j exists, compute i→j
if dist[i][k] != -1.0 and dist[k][j] != -1.0:
if dist[i][j] == -1.0:
dist[i][j] = dist[i][k] * dist[k][j]
# Step 5: Process queries
results = []
for C, D in queries:
if C not in node_to_idx or D not in node_to_idx:
results.append(-1.0) # Undefined variable
else:
i, j = node_to_idx[C], node_to_idx[D]
results.append(dist[i][j])
return results
Line-by-Line Analysis:
| Line | Purpose | Edge Case Handling |
|---|---|---|
| 5-9 | Collect unique variables | Handles duplicate variables automatically |
| 12-14 | Initialize matrix | Sets unknown relationships to -1.0 |
| 17-18 | Self-relationships | Ensures a/a = 1.0 for all variables |
| 21-24 | Direct edges | Creates bidirectional edges A/B and B/A |
| 27-32 | Floyd-Warshall | Only updates unknown paths, preserves known |
| 37-41 | Query processing | Checks variable existence before lookup |
Edge Case Mapping:
- Example 1:
[["a","a"],["x","x"]]→ Handled by existence check (line 38) - Example 2:
[["bc","cd"]]→ Multi-character names handled naturally - Example 3:
[["a","b"],["b","a"]]→ Bidirectional edges ensure consistency
5. Complexity War Room
Hardware-Aware Analysis
- Memory Usage: 40 variables → 40×40×8 bytes = 12.8KB (fits in L1 cache)
- Cache Performance: Floyd-Warshall has poor locality but small dataset
- Parallel Potential: Inner loops parallelizable for larger datasets
Industry Comparison Table
| Approach | Time Complexity | Space Complexity | Readability | Interview Viability |
|---|---|---|---|---|
| Brute Force DFS | 9/10 | ❌ Exponential worst-case | ||
| BFS Path Finding | 8/10 | ✅ Good for clarity | ||
| Union-Find with Weights | 6/10 | ✅ Optimal but complex | ||
| Floyd-Warshall | 7/10 | ✅ Recommended |
Theoretical Bounds
- Lower Bound: must store all pairs relationships
- Upper Bound: for Floyd-Warshall is optimal for dense graphs
- Practical Limit: With V=40, easily handles maximum input size
6. Pro Mode Extras
Algorithm Variants
Variant 1: Multiple Transactions (Weighted Union-Find)
# For problems with frequent updates and queries
parent = {}; weight = {}
def find(x):
if x not in parent:
parent[x] = x; weight[x] = 1.0
return x
if parent[x] != x:
root = find(parent[x])
weight[x] *= weight[parent[x]]
parent[x] = root
return parent[x]
def union(x, y, value):
rootX, rootY = find(x), find(y)
if rootX != rootY:
parent[rootX] = rootY
weight[rootX] = weight[y] * value / weight[x]
Variant 2: Online Queries with Caching
# For streaming queries with memoization
cache = {}
def query(C, D):
if (C, D) in cache: return cache[(C, D)]
# Compute and cache result
# Invalidate cache on new equations
Interview Cheat Sheet
Essential Talking Points:
- “This is essentially a graph path finding problem with multiplicative weights”
- “Floyd-Warshall gives us O(1) queries after O(V³) preprocessing”
- “We must handle bidirectional relationships and disconnected components”
- “The no-contradiction guarantee means all paths give the same result”
Common Follow-ups:
- “How would you handle contradictions in the equations?”
- “What if we could add new equations dynamically?”
- “How to minimize floating point error accumulation?”
Optimization Priority:
- ✅ Check variable existence first
- ✅ Handle self-queries (a/a = 1.0)
- ✅ Use Floyd-Warshall for small V
- ✅ Consider Union-Find for dynamic updates
Red Flags to Avoid:
- ❌ Not checking for undefined variables
- ❌ Forgetting bidirectional edges
- ❌ Using DFS without cycle prevention
- ❌ Not handling self-query case