Lecture 31 - Big O Notation & Algorithmic Complexity
Logistics
- Welcome to the algorithms & data structures unit!
- Retest scores out, corrections tomorrow in discussion
- HW6 due Friday / HW7 released Friday (maybe Saturday)
- HW5 grades will be released tomorrow / corrections due a week from tomorrow
- Readings shift focus: Python DS book + videos (concepts, not syntax!)
- DECKS OF CARDS?
Learning objectives
By the end of today, you should be able to:
- Use Big O notation to describe time and space complexity
- Analyze code to determine its Big O complexity (loops, nested loops, logarithmic patterns)
- Recognize common complexity classes: O(1), O(log n), O(n), O(n^2), O(2^n)
- Apply key rules: drop constants, keep dominant terms
Part 1: Big O notation - The math of "about how fast?"
Motivation: When does speed matter?
Think about:
- Sorting 10 items vs. sorting 1 million items
- Searching through 100 names vs. searching Facebook's 3 billion users
- A game processing 60 frames per second
Our intuition says a task that's twice as big should take about twice as long
In practice it's often not that simple - it depends on the algorithm
Think-pair-share: Counting operations
Part 1: Given this code:
```rust
fn sum_array(arr: &[i32]) -> i32 {
    let mut total = 0;
    for &num in arr {
        total += num;
    }
    total
}
```
Question: If the array has n elements, how many addition operations happen?
Part 2: Now consider this code:
```rust
fn count_pairs(n: usize) -> usize {
    let mut count = 0;
    for i in 1..n {
        for j in i..n {
            count += 1;
        }
    }
    count
}
```
Question: If we call count_pairs(n), how many times does the inner loop execute in total?
- Try with a small value like n=4 to trace through it
- Can you find a pattern or formula?
What is Big O?
Big O notation describes how runtime/memory grows as input size grows.
Key idea: We ignore:
- Exact number of operations
- Constants and performance on small inputs
- Hardware / OS dependent values
We focus on: The growth rate as n goes to infinity
Example: Linear growth
```rust
fn print_all(arr: &[i32]) {
    for &item in arr {    // n iterations
        println!("{}", item);
    }
}
```
- Array of size 10: ~10 operations
- Array of size 100: ~100 operations
- Array of size n: ~n operations
This is O(n) - "linear time"
Example: Quadratic growth
```rust
fn print_all_pairs(arr: &[i32]) {
    for &i in arr {        // n iterations
        for &j in arr {    // n iterations for EACH i
            println!("{}, {}", i, j);
        }
    }
}
```
- Array of size 10: ~100 operations (10 × 10)
- Array of size 100: ~10,000 operations (100 × 100)
- Array of size n: ~n^2 operations
This is O(n^2) - "quadratic time"
Example: Logarithmic growth
```rust
fn binary_search(arr: &[i32], target: i32) -> Option<usize> {
    let mut low = 0;
    let mut high = arr.len();
    while low < high {
        let mid = (low + high) / 2;
        if arr[mid] == target {
            return Some(mid);
        } else if arr[mid] < target {
            low = mid + 1;    // discard the lower half
        } else {
            high = mid;       // discard the upper half
        }
    }
    None
}
```
- Array of size 10: ~3-4 operations (log_2 10 ≈ 3.3)
- Array of size 100: ~6-7 operations (log_2 100 ≈ 6.6)
- Array of size 1,000,000: ~20 operations! (log_2 1,000,000 ≈ 20)
This is O(log n) - "logarithmic time" (very fast!)
Example: Exponential growth
```rust
fn print_all_subsets(arr: &[i32], index: usize, current: &mut Vec<i32>) {
    if index == arr.len() {
        println!("{:?}", current);    // Print one subset
        return;
    }
    // Don't include arr[index]
    print_all_subsets(arr, index + 1, current);
    // Include arr[index]
    current.push(arr[index]);
    print_all_subsets(arr, index + 1, current);
    current.pop();
}
```
- Array of size 3: 8 subsets (2³)
- Array of size 10: 1,024 subsets (2¹⁰)
- Array of size 20: 1,048,576 subsets (2²⁰)
- Array of size n: 2^n subsets
This is O(2^n) - "exponential time" (explodes quickly!)
Example: Constant time
```rust
fn get_first(arr: &[i32]) -> Option<i32> {
    arr.first().copied()
}
```
- Array of size 10: 1 operation
- Array of size 1000: 1 operation
- Array of size n: still 1 operation!
This is O(1) - "constant time" (doesn't depend on n)
Think about: What's the complexity?
```rust
fn find_range(arr: &[i32]) -> Option<i32> {
    let mut min = *arr.first()?;
    for &item in arr {
        if item < min {
            min = item;
        }
    }
    let mut max = *arr.first()?;
    for &item in arr {
        if item > max {
            max = item;
        }
    }
    Some(max - min)
}
```
[PAUSE - think-pair-share]
Common complexity classes (from best to worst)
| Notation | Name | Example |
|---|---|---|
| O(1) | Constant | Array access by index |
| O(log n) | Logarithmic | Binary search |
| O(n) | Linear | Loop through array once |
| O(n log n) | Linearithmic | Good sorting algorithms |
| O(n^2) | Quadratic | Nested loops |
| O(2^n) | Exponential | Trying all subsets |
| O(n!) | Factorial | Trying all permutations |
Rule of thumb: Each step down this list is MUCH slower!
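To make "MUCH slower" concrete, here's a rough sketch (my own illustration; the numbers are just approximate operation counts) printing what each class implies at n = 20:

```rust
fn main() {
    let n: u32 = 20;
    let nf = n as f64;

    // Approximate "number of operations" for each class at n = 20
    println!("O(1)       -> {}", 1);
    println!("O(log n)   -> {:.1}", nf.log2());
    println!("O(n)       -> {}", n);
    println!("O(n log n) -> {:.1}", nf * nf.log2());
    println!("O(n^2)     -> {}", n * n);
    println!("O(2^n)     -> {}", 2u64.pow(n));
    println!("O(n!)      -> {}", (1..=n as u64).product::<u64>());
}
```

Even at n = 20, the jump from n^2 = 400 to 2^n ≈ 10^6 to n! ≈ 2.4 × 10^18 is already enormous.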
Rules for analyzing code
- Loops: Multiply complexity by the number of iterations
  - Loop n times doing O(1) work = O(n)
  - Loop n times doing O(n) work = O(n^2)
  - Outer loop n times, inner loop m times = O(n × m)
- Drop constants and lower-order terms:
  - O(3n) -> O(n)
  - O(n^2 + n) -> O(n^2)
  - O(5) -> O(1)
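For instance, a small sketch of the second rule (function invented just for illustration): two separate passes over the array plus a bit of constant work is O(2n + 1), which we still write as O(n):

```rust
// O(n) + O(n) + O(1) = O(2n + 1)  ->  drop constants  ->  O(n)
fn sum_and_max(arr: &[i32]) -> (i32, i32) {
    let mut sum = 0;
    for &x in arr {          // first pass: n iterations
        sum += x;
    }

    let mut max = i32::MIN;
    for &x in arr {          // second pass: another n iterations
        if x > max {
            max = x;
        }
    }

    (sum, max)               // constant work to build the result
}
```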
Let's do this one together
```rust
fn mystery_function(arr: &[i32]) -> i32 {
    let n = arr.len();
    let mut count = 0;

    for i in 0..n {
        count += arr[i];
    }

    for i in 0..10 {
        count += 1;
    }

    for i in 0..n {
        for j in 0..n {
            if arr[i] == arr[j] {
                count += 1;
            }
        }
    }

    count
}
```
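One way to break it down (for reference): the first loop is O(n), the second loop always runs exactly 10 times so it's O(1), and the nested loops are O(n^2). That's O(n^2 + n + 1) total, and keeping only the dominant term gives O(n^2).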
Space complexity too!
Big O also applies to memory usage.
```rust
fn make_doubles(arr: &[i32]) -> Vec<i32> {
    let mut result = Vec::new();
    for &item in arr {
        result.push(item * 2);
    }
    result
}
```
- Time complexity: O(n) - one loop
- Space complexity: O(n) - create new vector of size n
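For contrast, here's a sketch (not from the original example) that doubles the elements in place - the time is still O(n), but the extra space is only O(1) because no new vector is created:

```rust
fn double_in_place(arr: &mut [i32]) {
    for item in arr.iter_mut() {
        *item *= 2;    // overwrite each slot - no new allocation
    }
}
```

- Time complexity: O(n) - still one loop
- Space complexity: O(1) - extra memory doesn't grow with n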
Best case vs. worst case vs. average case
Example: Linear search
```rust
fn find_position(arr: &[i32], target: i32) -> Option<usize> {
    for (i, &item) in arr.iter().enumerate() {
        if item == target {
            return Some(i);
        }
    }
    None
}
```
- Best case: O(1) - target is first element
- Worst case: O(n) - target not in array (must check all)
- Average case: O(n) - on average, check half the array
Usually we care most about worst case!
Complexity of Rust Operations
Vec operations: What's the complexity?
Let's think about standard Vec operations:
| Operation | Big O | Why |
|---|---|---|
| vec[i] (indexing) | O(1) | Direct memory access |
| vec.push(x) | O(1)* | Usually just increment (amortized*) |
| vec.pop() | O(1) | Just decrement |
| vec.insert(0, x) | O(n) | Must shift all elements |
| vec.remove(i) | O(n) | Must shift elements after i |
| vec.contains(&x) | O(n) | Must check each element |
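A quick sketch exercising each operation from the table (the particular values are arbitrary):

```rust
fn main() {
    let mut v = vec![10, 20, 30];

    println!("{}", v[1]);            // O(1): direct index into the buffer
    v.push(40);                      // O(1) amortized: append at the end
    v.pop();                         // O(1): remove from the end
    v.insert(0, 5);                  // O(n): every element shifts right
    v.remove(1);                     // O(n): elements after index 1 shift left
    println!("{}", v.contains(&30)); // O(n): linear scan
    println!("{:?}", v);             // [5, 20, 30]
}
```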
How Vec::push() is clever
Problem: A Vec's underlying buffer has a fixed capacity. What happens when it fills up?
Solution: When full, allocate double the space and copy everything over.
Example growth: capacity goes 4 -> 8 -> 16 -> 32 -> 64...
Cost analysis:
- Most pushes: O(1) - just add to end
- Occasional push: O(n) - must copy everything
- Amortized over many operations: O(1)!
Example: Implementing a simple dynamic array
Here's a simplified version showing the core idea:
```rust
struct SimpleVec {
    data: Vec<i32>,
    len: usize,
    capacity: usize,
}

impl SimpleVec {
    fn new() -> Self {
        SimpleVec {
            data: Vec::new(),
            len: 0,
            capacity: 0,
        }
    }

    fn push(&mut self, value: i32) {
        // Check if we need to grow
        if self.len == self.capacity {
            // Double capacity (or start with 4)
            let new_capacity = if self.capacity == 0 { 4 } else { self.capacity * 2 };

            // Allocate new space and copy
            let mut new_data = Vec::with_capacity(new_capacity);
            for i in 0..self.len {
                new_data.push(self.data[i]);
            }
            self.data = new_data;
            self.capacity = new_capacity;

            println!("Resized! New capacity: {}", new_capacity);
        }

        // Add the new element
        self.data.push(value);
        self.len += 1;
    }
}

fn main() {
    let mut v = SimpleVec::new();
    for i in 0..10 {
        println!("Pushing {}", i);
        v.push(i);
    }
}
```
What you'll see:
```
Pushing 0
Resized! New capacity: 4
Pushing 1
Pushing 2
Pushing 3
Pushing 4
Resized! New capacity: 8
Pushing 5
...
```
Key insight: Most pushes don't resize. The occasional expensive resize is amortized across many cheap pushes!
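Rough arithmetic for why the average works out: with doubling, the copies done across all resizes form a geometric series (4 + 8 + 16 + ...) that sums to at most about twice the final capacity. So pushing n elements costs O(n) copying in total plus n ordinary writes - an O(1) cost per push on average.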
Bonus: Why "Big-O"? The notation family
You might wonder: Is there a "little-o"? Why "Big"?
Big-O is part of a family of asymptotic notations:
Big-O (O): Upper bound - "at most this fast"
- Both O(n) and O(n^2) algorithms are also O(n^3) - an upper bound doesn't have to be tight
- Most common - used for worst-case analysis
Big-Theta (Θ): Tight bound - "exactly this fast"
- More precise than Big-O
Big-Omega (Ω): Lower bound - "at least this fast"
- E.g. any sorting algorithm is Ω(n) because you must look at all elements
- Used for best-case or impossibility results
Little-o (o): Strict upper bound - "strictly slower than"
- Example: n is o(n^2), but n is not o(n)
- Rarely used in practice
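For reference (not needed for this course), the formal versions of these bounds:
- f(n) is O(g(n)) if f(n) <= c · g(n) for some constant c > 0 and all sufficiently large n
- f(n) is Ω(g(n)) if f(n) >= c · g(n) for some constant c > 0 and all sufficiently large n
- f(n) is Θ(g(n)) if it is both O(g(n)) and Ω(g(n))
- f(n) is o(g(n)) if f(n) / g(n) -> 0 as n -> infinity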
When you'll see the others:
- Θ: Advanced algorithms courses, research papers
- Ω: Proving lower bounds, impossibility results
- o: Theoretical CS, mathematical proofs
Bonus - P vs NP and computational complexity
What is P?
P = Problems solvable in Polynomial time
Polynomial time means O(n^k) for some constant k:
- O(n), O(n^2), O(n^3), O(n^10) are all polynomial
- O(2^n), O(n!), O(n^n) are NOT polynomial
Examples of P problems:
- Sorting: O(n log n)
- Finding max: O(n)
- Matrix multiplication (in the activity!)
- Shortest path (Dijkstra): O(E log V)
Key idea: Problems in P are considered "efficiently solvable"
What is NP?
NP = Nondeterministic Polynomial time
Definition: Problems where:
- Solutions can be verified in polynomial time
- But finding solutions might be harder
Example: Sudoku
- Verifying a solution: O(n^2) - just check rows, columns, boxes
- Finding a solution: Unknown - might need to try many possibilities
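To make the "cheap to verify" claim concrete, here's a minimal sketch of checking a completed 9×9 grid (the grid encoding and function names are my own, just for illustration) - three passes over the cells, each polynomial in the grid size:

```rust
// True if `cells` contains each digit 1-9 exactly once.
fn all_digits_once(cells: &[u8]) -> bool {
    let mut seen = [false; 10];
    for &d in cells {
        if d < 1 || d > 9 || seen[d as usize] {
            return false;
        }
        seen[d as usize] = true;
    }
    true
}

// True if a completed 9x9 grid is a valid Sudoku solution.
fn is_valid_sudoku(grid: &[[u8; 9]; 9]) -> bool {
    // Rows
    for row in grid {
        if !all_digits_once(row) {
            return false;
        }
    }
    // Columns
    for c in 0..9 {
        let col: Vec<u8> = (0..9).map(|r| grid[r][c]).collect();
        if !all_digits_once(&col) {
            return false;
        }
    }
    // 3x3 boxes
    for br in (0..9).step_by(3) {
        for bc in (0..9).step_by(3) {
            let cells: Vec<u8> = (0..3)
                .flat_map(|dr| (0..3).map(move |dc| grid[br + dr][bc + dc]))
                .collect();
            if !all_digits_once(&cells) {
                return false;
            }
        }
    }
    true
}
```

Verifying touches each of the 81 cells a constant number of times; finding a solution from a partially filled grid has no known polynomial-time algorithm (for the generalized n² × n² version).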
All P problems are in NP:
- If you can solve it fast, you can verify it fast too
- P is a subset of NP
The million-dollar question: P vs NP
Question: Does P = NP?
In other words: If we can quickly verify a solution, can we quickly find it too?
Most believe: P != NP (there are problems where verifying is easier than solving)
Why it matters:
- If P = NP: Many "hard" problems become easy (cryptography breaks!)
- If P != NP: Some problems are fundamentally hard
Prize: Solve this and win $1 million (Clay Mathematics Institute)
NP-complete problems
NP-Complete: The "hardest" problems in NP
Examples:
- Traveling Salesman Problem (TSP)
- Boolean satisfiability (SAT)
- Knapsack problem
- Graph coloring
- Sudoku solving
Special property: If you can solve ANY NP-complete problem in polynomial time, then P = NP!
Why should you care?
In practice:
- Recognize when a problem is NP-complete
- Don't waste time looking for fast exact solutions
- Use approximations or heuristics instead
Example:
- Finding the best route through ALL cities (TSP): NP-complete, so settle for approximations
- Finding the shortest path between TWO points (Dijkstra): in P, solvable exactly
Remember: Not all hard-looking problems are NP-complete!
- Some can be solved efficiently with clever algorithms
- Learning algorithms helps you recognize which is which
Complexity cheat sheet
Fast to Slow:
- O(1) - Instant, no matter the size
- O(log n) - Doubling the input size adds only one more step
- O(n) - Proportional to size
- O(n log n) - The best we can do for comparison-based sorting
- O(n^2) - Nested loops, gets bad quickly
- O(2^n) - Explodes! Avoid if possible
Remember: The difference between O(n) and O(n^2) can be seconds vs. hours!