Lecture 31 - Big O Notation & Algorithmic Complexity

Logistics

  • Welcome to the algorithms & data structures unit!
  • Retest scores out, corrections tomorrow in discussion
  • HW6 due Friday / HW7 released Friday (maybe Saturday)
  • HW5 grades will be released tomorrow / corrections due a week from tomorrow
  • Readings shift focus: Python DS book + videos (concepts, not syntax!)
  • DECKS OF CARDS?

Learning objectives

By the end of today, you should be able to:

  • Use Big O notation to describe time and space complexity
  • Analyze code to determine its Big O complexity (loops, nested loops, logarithmic patterns)
  • Recognize common complexity classes: O(1), O(log n), O(n), O(n^2), O(2^n)
  • Apply key rules: drop constants, keep dominant terms

Part 1: Big O notation - The math of "about how fast?"

Motivation: When does speed matter?

Think about:

  • Sorting 10 items vs. sorting 1 million items
  • Searching through 100 names vs. searching Facebook's 3 billion users
  • A game processing 60 frames per second

Our intuition says a task that's twice as big should take twice as long.

It's often not that simple: it depends on the algorithm.

Think-pair-share: Counting operations

Part 1: Given this code:

#![allow(unused)]
fn main() {
fn sum_array(arr: &[i32]) -> i32 {
    let mut total = 0;
    for &num in arr {
        total += num;
    }
    total
}
}

Question: If the array has n elements, how many addition operations happen?

Part 2: Now consider this code:

#![allow(unused)]
fn main() {
fn count_pairs(n: usize) -> usize {
    let mut count = 0;
    for i in 1..n {
        for j in i..n {
            count += 1;
        }
    }
    count
}
}

Question: If we call count_pairs(n), how many times does the inner loop execute in total?

  • Try with a small value like n=4 to trace through it
  • Can you find a pattern or formula?
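
(Worked answer: for n = 4, i = 1 runs the inner loop 3 times, i = 2 runs it twice, and i = 3 runs it once: 6 total. In general the count is (n-1) + (n-2) + ... + 1 = n(n-1)/2, which grows like n^2.)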

What is Big O?

Big O notation describes how runtime/memory grows as input size grows.

Key idea: We ignore:

  • Exact number of operations
  • Constants and performance on small inputs
  • Hardware / OS dependent values

We focus on: The growth rate as n goes to infinity

Example: Linear growth

#![allow(unused)]
fn main() {
fn print_all(arr: &[i32]) {
    for &item in arr {  // n iterations
        println!("{}", item);
    }
}
}
  • Array of size 10: ~10 operations
  • Array of size 100: ~100 operations
  • Array of size n: ~n operations

This is O(n) - "linear time"

Example: Quadratic growth

#![allow(unused)]
fn main() {
fn print_all_pairs(arr: &[i32]) {
    for &i in arr {           // n iterations
        for &j in arr {       // n iterations for EACH i
            println!("{}, {}", i, j);
        }
    }
}
}
  • Array of size 10: ~100 operations (10 × 10)
  • Array of size 100: ~10,000 operations (100 × 100)
  • Array of size n: ~n^2 operations

This is O(n^2) - "quadratic time"

Example: Logarithmic growth

#![allow(unused)]
fn main() {
fn binary_search(arr: &[i32], target: i32) -> Option<usize> {
    let mut low = 0;
    let mut high = arr.len();

    while low < high {
        let mid = (low + high) / 2;
        if arr[mid] == target {
            return Some(mid);
        } else if arr[mid] < target {
            low = mid + 1;     
        } else {
            high = mid;
        }
    }
    None
}
}
  • Array of size 10: ~3-4 operations (log_2 10 ≈ 3.3)
  • Array of size 100: ~6-7 operations (log_2 100 ≈ 6.6)
  • Array of size 1,000,000: ~20 operations! (log_2 1,000,000 ≈ 20)

This is O(log n) - "logarithmic time" (very fast!)
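
A quick usage sketch (assuming the binary_search function above is in scope; note that the array must already be sorted):

fn main() {
    let arr = [1, 3, 5, 7, 9, 11];                 // must be sorted!
    assert_eq!(binary_search(&arr, 7), Some(3));   // found at index 3
    assert_eq!(binary_search(&arr, 4), None);      // not present
}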

Example: Exponential growth

#![allow(unused)]
fn main() {
fn print_all_subsets(arr: &[i32], index: usize, current: &mut Vec<i32>) {
    if index == arr.len() {
        println!("{:?}", current);  // Print one subset
        return;
    }

    // Don't include arr[index]
    print_all_subsets(arr, index + 1, current);

    // Include arr[index]
    current.push(arr[index]);
    print_all_subsets(arr, index + 1, current);
    current.pop();
}
}
  • Array of size 3: 8 subsets (2³)
  • Array of size 10: 1,024 subsets (2¹⁰)
  • Array of size 20: 1,048,576 subsets (2²⁰)
  • Array of size n: 2^n subsets

This is O(2^n) - "exponential time" (explodes quickly!)
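
To see the blow-up concretely (assuming the print_all_subsets function above is in scope):

fn main() {
    let arr = [1, 2, 3];
    let mut current = Vec::new();
    print_all_subsets(&arr, 0, &mut current);  // prints all 2^3 = 8 subsets
    // One more element would double the output to 16 subsets.
}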

Example: Constant time

#![allow(unused)]
fn main() {
fn get_first(arr: &[i32]) -> Option<i32> {
    arr.first().copied()
}
}
  • Array of size 10: 1 operation
  • Array of size 1000: 1 operation
  • Array of size n: still 1 operation!

This is O(1) - "constant time" (doesn't depend on n)

Think about: What's the complexity?

#![allow(unused)]
fn main() {
fn find_range(arr: &[i32]) -> Option<i32> {
    let mut min = *arr.first()?;
    for &item in arr {
        if item < min {
            min = item;
        }
    }

    let mut max = *arr.first()?;
    for &item in arr {
        if item > max {
            max = item;
        }
    }

    Some(max - min)
}
}

[PAUSE - think-pair-share]
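
(Answer: two sequential passes over the array, so O(n) + O(n) = O(2n) = O(n). Sequential loops add; they don't multiply.)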

Common complexity classes (from best to worst)

| Notation   | Name         | Example                  |
|------------|--------------|--------------------------|
| O(1)       | Constant     | Array access by index    |
| O(log n)   | Logarithmic  | Binary search            |
| O(n)       | Linear       | Loop through array once  |
| O(n log n) | Linearithmic | Good sorting algorithms  |
| O(n^2)     | Quadratic    | Nested loops             |
| O(2^n)     | Exponential  | Trying all subsets       |
| O(n!)      | Factorial    | Trying all permutations  |

Rule of thumb: Each step down this list is MUCH slower!

Rules for analyzing code

  1. Loops: Multiply complexity by number of iterations

    • Loop n times doing O(1) work = O(n)
    • Loop n times doing O(n) work = O(n^2)
    • Outer loop n times, inner loop m times = O(n·m)
  2. Drop constants and lower-order terms:

    • O(3n) -> O(n)
    • O(n^2 + n) -> O(n^2)
    • O(5) -> O(1)
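
For instance, here's a minimal sketch of the constant-dropping rule (the function is hypothetical, just for illustration): three separate passes over the array do roughly 3n operations, but the growth rate is still linear.

#![allow(unused)]
fn main() {
fn three_passes(arr: &[i32]) -> i32 {
    let mut sum = 0;
    for &x in arr { sum += x; }   // n operations
    for &x in arr { sum += x; }   // another n
    for &x in arr { sum += x; }   // another n
    sum                           // ~3n total -> O(3n) -> O(n)
}
}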

Let's do this one together

#![allow(unused)]
fn main() {
fn mystery_function(arr: &[i32]) -> i32 {
    let n = arr.len();
    let mut count = 0;

    for i in 0..n {
        count += arr[i];
    }

    for i in 0..10 {
        count += 1;
    }

    for i in 0..n {
        for j in 0..n {
            if arr[i] == arr[j] {
                count += 1;
            }
        }
    }

    count
}
}
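
Working through it with the rules above: the first loop is O(n), the second loop always runs exactly 10 times so it's O(1), and the nested loops are O(n^2). Keep the dominant term: O(n^2).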

Space complexity too!

Big O also applies to memory usage.

#![allow(unused)]
fn main() {
fn make_doubles(arr: &[i32]) -> Vec<i32> {
    let mut result = Vec::new();
    for &item in arr {
        result.push(item * 2);
    }
    result
}
}
  • Time complexity: O(n) - one loop
  • Space complexity: O(n) - create new vector of size n
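
For contrast, sum_array from the start of lecture runs in O(n) time but uses only O(1) extra space: it allocates a single total variable no matter how large the array is. (By convention, the input itself doesn't count toward space complexity.)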

Best case vs. worst case vs. average case

Example: Linear search

#![allow(unused)]
fn main() {
fn find_position(arr: &[i32], target: i32) -> Option<usize> {
    for (i, &item) in arr.iter().enumerate() {
        if item == target {
            return Some(i);
        }
    }
    None
}
}
  • Best case: O(1) - target is first element
  • Worst case: O(n) - target not in array (must check all)
  • Average case: O(n) - on average, check half the array

Usually we care most about worst case!

Complexity of Rust Operations

Vec operations: What's the complexity?

Let's think about standard Vec operations:

| Operation          | Big O | Why                               |
|--------------------|-------|-----------------------------------|
| vec[i] (indexing)  | O(1)  | Direct memory access              |
| vec.push(x)        | O(1)* | Usually just appends (*amortized) |
| vec.pop()          | O(1)  | Just decrements the length        |
| vec.insert(0, x)   | O(n)  | Must shift all elements           |
| vec.remove(i)      | O(n)  | Must shift elements after i       |
| vec.contains(&x)   | O(n)  | Must check each element           |
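
A small sketch exercising these operations (all standard Vec methods):

fn main() {
    let mut v = vec![2, 3, 4];
    v.push(5);                   // O(1) amortized: appends at the end
    v.insert(0, 1);              // O(n): shifts 2, 3, 4, 5 right by one
    v.remove(1);                 // O(n): removes the 2, shifts the rest left
    assert_eq!(v, vec![1, 3, 4, 5]);
    assert!(v.contains(&4));     // O(n): linear scan
    assert_eq!(v[2], 4);         // O(1): direct indexing
}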

How Vec::push() is clever

Problem: a Vec's underlying buffer has a fixed capacity. What happens when it fills up?

Solution: When full, allocate double the space and copy everything over.

Example growth: capacity goes 4 -> 8 -> 16 -> 32 -> 64...

Cost analysis:

  • Most pushes: O(1) - just add to end
  • Occasional push: O(n) - must copy everything
  • Amortized over many operations: O(1)!
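
Quick back-of-the-envelope check, assuming capacity doubles starting from 4: a resize to capacity C copies the C/2 elements already stored, so growing up to capacity C costs 4 + 8 + 16 + ... + C/2 < C copies in total. After n pushes the capacity is less than 2n, so n pushes do fewer than n writes plus 2n copies, roughly 3n operations: O(1) per push on average.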

Example: Implementing a simple dynamic array

Here's a simplified version showing the core idea:

struct SimpleVec {
    data: Vec<i32>,
    len: usize,
    capacity: usize,
}

impl SimpleVec {
    fn new() -> Self {
        SimpleVec {
            data: Vec::new(),
            len: 0,
            capacity: 0,
        }
    }

    fn push(&mut self, value: i32) {
        // Check if we need to grow
        if self.len == self.capacity {
            // Double capacity (or start with 4)
            let new_capacity = if self.capacity == 0 { 4 } else { self.capacity * 2 };

            // Allocate new space and copy
            let mut new_data = Vec::with_capacity(new_capacity);
            for i in 0..self.len {
                new_data.push(self.data[i]);
            }

            self.data = new_data;
            self.capacity = new_capacity;
            println!("Resized! New capacity: {}", new_capacity);
        }

        // Add the new element
        self.data.push(value);
        self.len += 1;
    }
}

fn main() {
    let mut v = SimpleVec::new();
    for i in 0..10 {
        println!("Pushing {}", i);
        v.push(i);
    }
}

What you'll see:

Pushing 0
Resized! New capacity: 4
Pushing 1
Pushing 2
Pushing 3
Pushing 4
Resized! New capacity: 8
Pushing 5
...

Key insight: Most pushes don't resize. The occasional expensive resize is amortized across many cheap pushes!

Bonus: Why "Big-O"? The notation family

You might wonder: Is there a "little-o"? Why "Big"?

Big-O is part of a family of asymptotic notations:

Big-O (O): Upper bound - "at most this fast"

  • Both O(n) and O(n^2) algorithms are also O(n^3), since an upper bound doesn't have to be tight
  • Most common - used for worst-case analysis

Big-Theta (Θ): Tight bound - "exactly this fast"

  • More precise than Big-O
  • Example: merge sort is Θ(n log n): it takes about n log n steps on every input

Big-Omega (Ω): Lower bound - "at least this fast"

  • E.g. any sorting algorithm is Ω(n), because it must look at every element
  • Used for best-case or impossibility results

Little-o (o): Strict upper bound - "strictly slower than"

  • Example: n is o(n^2), but n is not o(n)
  • Rarely used in practice
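
One function through all four lenses: f(n) = 3n + 5 is O(n) and also O(n^2) (upper bounds can be loose), Θ(n) (the tight bound), Ω(n) (and trivially Ω(1)), and o(n^2) but not o(n).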

When you'll see the others:

  • Θ: Advanced algorithms courses, research papers
  • Ω: Proving lower bounds, impossibility results
  • o: Theoretical CS, mathematical proofs

Bonus - P vs NP and computational complexity

What is P?

P = Problems solvable in Polynomial time

Polynomial time means O(n^k) for some constant k:

  • O(n), O(n^2), O(n^3), O(n^10) are all polynomial
  • O(2^n), O(n!), O(n^n) are NOT polynomial

Examples of P problems:

  • Sorting: O(n log n)
  • Finding max: O(n)
  • Matrix multiplication (in the activity!)
  • Shortest path (Dijkstra): O(E log V)

Key idea: Problems in P are considered "efficiently solvable"

What is NP?

NP = Nondeterministic Polynomial time

Definition: Problems where:

  • Solutions can be verified in polynomial time
  • But finding solutions might be harder

Example: Sudoku

  • Verifying a solution: O(n^2) - just check rows, columns, boxes
  • Finding a solution: Unknown - might need to try many possibilities

All P problems are in NP:

  • If you can solve it fast, you can verify it fast too
  • P is a subset of NP

The million-dollar question: P vs NP

Question: Does P = NP?

In other words: If we can quickly verify a solution, can we quickly find it too?

Most researchers believe: P != NP (there are problems where verifying is easier than solving)

Why it matters:

  • If P = NP: Many "hard" problems become easy (cryptography breaks!)
  • If P != NP: Some problems are fundamentally hard

Prize: Solve this and win $1 million (Clay Mathematics Institute)

NP-complete problems

NP-Complete: The "hardest" problems in NP

Examples:

  • Traveling Salesman Problem (TSP)
  • Boolean satisfiability (SAT)
  • Knapsack problem
  • Graph coloring
  • Sudoku solving

Special property: If you can solve ANY NP-complete problem in polynomial time, then P = NP!

Why should you care?

In practice:

  • Recognize when a problem is NP-complete
  • Don't waste time looking for fast exact solutions
  • Use approximations or heuristics instead

Example:

  • Finding THE shortest tour through ALL cities (TSP): NP-complete, use approximations
  • Finding the shortest path between TWO points (Dijkstra): in P, can solve exactly

Remember: Not all hard-looking problems are NP-complete!

  • Some can be solved efficiently with clever algorithms
  • Learning algorithms helps you recognize which is which

Complexity cheat sheet

Fast to Slow:

  1. O(1) - Instant, no matter the size
  2. O(log n) - Doubling the input adds just one step
  3. O(n) - Proportional to size
  4. O(n log n) - The best possible for comparison-based sorting
  5. O(n^2) - Nested loops, gets bad quickly
  6. O(2^n) - Explodes! Avoid if possible

Remember: The difference between O(n) and O(n^2) can be seconds vs. hours!
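
To make that concrete: at roughly 10^8 simple operations per second, n = 1,000,000 means about 20 steps for O(log n), 0.01 seconds for O(n), 0.2 seconds for O(n log n), and about 10^12 operations for O(n^2), close to three hours.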

Activity Time (on paper, then Gradescope)