Lecture 32 - Sorting Algorithms
Logistics
- HW6 due Friday / HW7 released shortly after
- HW5 grades out soon; corrections will be regraded within a week
Quick note - clarifying space complexity
Learning objectives
By the end of today, you should be able to:
- Describe how different sorting algorithms work (bubble, insertion, merge, quick)
- Analyze the time complexity of each sorting algorithm
- Explain the merge sort and quicksort algorithms in detail
Motivation: Why do we care about sorting?
Sorting is everywhere:
- Search results (Google, Amazon)
- Leaderboards and rankings
- File systems (sort by date, name, size)
- Finding median, percentiles
- Preparing data for efficient search
Many problems become easier with sorted data
Think-pair-share: What makes a sorting algorithm good?
Question: If you have two sorting algorithms with the same time complexity, why might you prefer one over the other?
Bubble Sort: The simplest (and slowest!)
Idea: Repeatedly swap adjacent elements if they're in wrong order
Demo on the board
Algorithm:
- Compare arr[0] and arr[1], swap if needed
- Compare arr[1] and arr[2], swap if needed
- Continue to end of array
- Repeat until no swaps needed
Example: Sort [5, 2, 8, 1, 9]
Pass 1: [5,2,8,1,9] -> [2,5,8,1,9] -> [2,5,8,1,9] -> [2,5,1,8,9] -> [2,5,1,8,9]
Pass 2: [2,5,1,8,9] -> [2,1,5,8,9] -> [2,1,5,8,9]
Pass 3: [2,1,5,8,9] -> [1,2,5,8,9] -> Done!
Bubble sort complexity
```rust
fn bubble_sort(arr: &mut [i32]) {
    let n = arr.len();
    for i in 0..n {                 // Outer loop: up to n passes
        for j in 0..n - i - 1 {     // Inner loop: shrinks by one each pass (~n on average)
            if arr[j] > arr[j + 1] {
                arr.swap(j, j + 1); // O(1)
            }
        }
    }
}
```
Analysis:
- Time complexity: O(n^2) - nested loops
- Space complexity: O(1) - sorts in place
- Stable: Yes - equal elements stay in order
- Best case: O(n) if already sorted (with the early-exit optimization sketched below)
Verdict: Simple but too slow for large data!
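The O(n) best case requires a tweak the code above omits: stop as soon as a full pass makes no swaps. A minimal sketch of that optimization (the `swapped` flag and function name are ours):

```rust
fn bubble_sort_early_exit(arr: &mut [i32]) {
    let n = arr.len();
    for i in 0..n {
        let mut swapped = false; // Did this pass change anything?
        for j in 0..n - i - 1 {
            if arr[j] > arr[j + 1] {
                arr.swap(j, j + 1);
                swapped = true;
            }
        }
        if !swapped {
            break; // No swaps: already sorted, so one O(n) pass suffices
        }
    }
}
```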
Insertion Sort: Like sorting playing cards
Idea: Build sorted portion one element at a time
How you'd sort cards:
- Pick up first card - sorted!
- Pick up second card, insert in right place
- Pick up third card, insert in right place
- Continue...
Example: Sort [5, 2, 8, 1, 9]
[5] | 2, 8, 1, 9 Sorted portion: [5]
[2, 5] | 8, 1, 9 Insert 2: shift 5 right
[2, 5, 8] | 1, 9 Insert 8: already in place
[1, 2, 5, 8] | 9 Insert 1: shift everything
[1, 2, 5, 8, 9] Insert 9: done!
Insertion sort complexity
```rust
fn insertion_sort(arr: &mut [i32]) {
    for i in 1..arr.len() {               // n-1 iterations
        let key = arr[i];
        let mut j = i;
        while j > 0 && arr[j - 1] > key { // Up to i shifts in the worst case
            arr[j] = arr[j - 1];          // Shift the larger element right
            j -= 1;
        }
        arr[j] = key;                     // Insert key into its slot
    }
}
```
Analysis:
- Time complexity: O(n^2) worst case, O(n) best case
- Space complexity: O(1)
- Stable: Yes
- Adaptive: Fast on nearly-sorted data!
Verdict: Good for small or nearly-sorted arrays!
Mini-activity: The sound of sorting
We're going to watch a video comparing different sorting algorithms working on the same data.
Your task: Fill in this table as you watch. The video covers many algorithms; focus especially on:
- Bubble Sort
- Insertion Sort
- Merge Sort
- Quick Sort
For each algorithm, note:
- What pattern/strategy do you see?
- How fast/slow does it seem?
- Any advantages or disadvantages you notice?
| Algorithm | What's the strategy? | Speed | Pros/Cons |
|---|---|---|---|
| Selection Sort | | | |
| Insertion Sort | | | |
| Quick Sort | | | |
| Merge Sort | | | |
| Heap Sort | | | |
| Radix Sort (LSD) | | | |
| Radix Sort (MSD) | | | |
| std::sort | | | |
| std::stable_sort | | | |
| Shell sort | | | |
| Bubble sort | | | |
| Cocktail shaker | | | |
| Gnome sort | | | |
| Bitonic sort | | | |
| Bogo sort | | | |
After watching: What patterns did you notice? Which algorithms seem most efficient?
Takeaways
- Practical algorithms: Quick, Merge, Heap (all O(n log n))
- Special cases: Insertion for nearly-sorted, Radix for integers
- Avoid: Bubble, Selection, Bogo
- Real world: Use language built-ins (introsort, timsort)
Divide-and-conquer for Mergesort
Key idea: Break problem into smaller subproblems, solve recursively, combine results
Merge Sort approach:
- Divide: Split array in half
- Conquer: Sort each half recursively
- Combine: Merge the two sorted halves
Base case: Array of size 1 is already sorted!
Merge sort example
Sort [38, 27, 43, 3, 9, 82, 10]
[38, 27, 43, 3, 9, 82, 10] Split
/ \
[38, 27, 43, 3] [9, 82, 10] Split again
/ \ / \
[38, 27] [43, 3] [9, 82] [10] Split again
/ \ / \ / \ |
[38] [27] [43] [3] [9] [82] [10] Base case - size 1!
Now merge back up:
[27, 38] [3, 43] [9, 82] [10] Merge pairs
[3, 27, 38, 43] [9, 10, 82] Merge pairs
[3, 9, 10, 27, 38, 43, 82] Final merge!
Merging example
Merge [2, 5, 8] and [1, 3, 9]
Left: [2, 5, 8] Right: [1, 3, 9] Result: []
^ ^
Compare 2 vs 1 -> take 1
Left: [2, 5, 8] Right: [1, 3, 9] Result: [1]
^ ^
Compare 2 vs 3 -> take 2
Left: [2, 5, 8] Right: [1, 3, 9] Result: [1, 2]
^ ^
Compare 5 vs 3 -> take 3
... continue until Result: [1, 2, 3, 5, 8, 9]
Time: One comparison per element added = O(n)
Merge sort: Full implementation (skippable)
```rust
fn merge(left: &[i32], right: &[i32]) -> Vec<i32> {
    let mut result = Vec::new();
    let mut i = 0;
    let mut j = 0;
    // Compare the fronts of left and right, take the smaller
    while i < left.len() && j < right.len() {
        if left[i] <= right[j] {
            result.push(left[i]);
            i += 1;
        } else {
            result.push(right[j]);
            j += 1;
        }
    }
    // One side is exhausted; append whatever remains of the other
    result.extend_from_slice(&left[i..]);
    result.extend_from_slice(&right[j..]);
    result
}

fn merge_sort(arr: &[i32]) -> Vec<i32> {
    // Base case: 0 or 1 elements is already sorted
    if arr.len() <= 1 {
        return arr.to_vec();
    }
    // Divide
    let mid = arr.len() / 2;
    let left = merge_sort(&arr[..mid]);  // Recursive!
    let right = merge_sort(&arr[mid..]); // Recursive!
    // Combine: merge the two sorted halves
    merge(&left, &right)
}
```
Merge sort complexity analysis
Time complexity:
- Each level of recursion processes all n elements: O(n)
- How many levels? log_2(n) - we halve the array each time
- Total: O(n log n)
Visual: Binary tree of recursive calls
n Level 0: n work
/ \
n/2 n/2 Level 1: n work total
/ \ / \
n/4 n/4 n/4 n/4 Level 2: n work total
...
Height = log n, each level = n work = O(n log n)
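The same total falls out of the standard merge sort recurrence; a quick derivation for reference (textbook argument, with c the per-element merge cost):

$$T(n) = 2T(n/2) + cn, \qquad T(1) = c$$

Unrolling $k$ levels gives $T(n) = 2^k\,T(n/2^k) + k\,cn$; the recursion bottoms out when $n/2^k = 1$, i.e. $k = \log_2 n$, so $T(n) = cn + cn\log_2 n = O(n \log n)$.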
Space complexity: O(n) - need extra arrays for merging
Properties:
- Stable
- Predictable (always O(n log n))
- NOT in-place (higher space complexity)
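To make "stable" concrete, here is a small sketch using Rust's built-in stable sort (the data is made up for illustration); a stable algorithm like merge sort keeps tied elements in their original order:

```rust
fn main() {
    // (grade, name) pairs; Bob and Carol tie at 85
    let mut records = vec![(90, "Alice"), (85, "Bob"), (85, "Carol"), (70, "Dave")];

    // Stable sort by grade: Bob stays ahead of Carol because he came first
    records.sort_by_key(|&(grade, _)| grade);
    assert_eq!(records, vec![(70, "Dave"), (85, "Bob"), (85, "Carol"), (90, "Alice")]);

    // sort_unstable_by_key makes no such promise: Bob and Carol could swap
}
```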
Quicksort - another divide-and-conquer!
Idea: Pick a "pivot", partition array so:
- All elements < pivot are on the left
- All elements > pivot are on the right
- Recursively sort left and right portions
Difference from merge sort:
- Merge sort: Easy divide, hard combine
- Quicksort: Hard divide (partition), easy combine (nothing!)
Quicksort example
Sort [38, 27, 43, 3, 9, 82, 10] - pick last element as pivot
[38, 27, 43, 3, 9, 82, 10] Pivot = 10
Partition: move elements < 10 to left, > 10 to right
[3, 9, 10, 38, 27, 82, 43]
        ^
  Left  |  Right
Recursively sort left: [3, 9]
Recursively sort right: [38, 27, 82, 43]
Continue until done!
The partition operation
Goal: Rearrange array so pivot is in correct position
High-level algorithm:
- Choose pivot (often last element)
- Scan array, putting small elements left, large elements right
- Put pivot in the middle
- Return pivot's final position (a Rust sketch of this follows the worked example below)
Example partition: Array [38, 27, 43, 3, 9, 82, 10], pivot = 10
Start: [38, 27, 43, 3, 9, 82, 10]
i p
Scan: 38 > 10, skip
27 > 10, skip
43 > 10, skip
3 < 10, found small element!
Swap: [3, 27, 43, 38, 9, 82, 10]
i p
Continue: 9 < 10, swap with 27
[3, 9, 43, 38, 27, 82, 10]
i p
All remaining > 10. Place pivot:
[3, 9, 10, 38, 27, 82, 43]
^
Pivot position = 2
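The trace above is the classic Lomuto partition scheme (pivot = last element). Here is a minimal Rust sketch of quicksort built on it; this is one standard formulation, not the only way to partition:

```rust
// Partition arr around its last element; assumes arr is non-empty.
fn partition(arr: &mut [i32]) -> usize {
    let pivot_index = arr.len() - 1;
    let mut i = 0; // Boundary of the "< pivot" region
    for j in 0..pivot_index {
        if arr[j] < arr[pivot_index] {
            arr.swap(i, j); // Grow the small-element region
            i += 1;
        }
    }
    arr.swap(i, pivot_index); // Drop the pivot between the two regions
    i // Pivot's final position
}

fn quicksort(arr: &mut [i32]) {
    if arr.len() <= 1 {
        return; // Base case: already sorted
    }
    let p = partition(arr);
    let (left, right) = arr.split_at_mut(p);
    quicksort(left);            // Elements < pivot
    quicksort(&mut right[1..]); // Elements >= pivot, skipping the pivot itself
}
```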
Quicksort complexity
Time complexity:
- Best/average case: O(n log n)
  - A good pivot splits the array roughly in half
  - log n levels, n work per level
- Worst case: O(n^2)
  - Bad pivot (smallest/largest every time)
  - Happens when the array is already sorted and we pick the first/last element as pivot!
Space complexity: O(log n) - recursion stack (O(n) in the worst case of unbalanced partitions)
Properties:
- Not stable (elements can jump over equal elements)
- In-place (sorts in original array)
- Often fastest in practice
Improving quicksort: Choosing a better pivot
Problem: Always picking last element can lead to O(n^2)
Solutions:
- Random pivot: Pick random element (most common)
- Median-of-three: Take median of first, middle, last
- Median-of-medians: More complex, guarantees O(n log n)
In practice: Random pivot makes worst case extremely unlikely!
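As one illustration, a hedged sketch of median-of-three selection (the helper name is ours; real implementations fold this into partitioning, e.g. by swapping the chosen pivot to the end and reusing the Lomuto partition above):

```rust
// Index of the median of the first, middle, and last elements.
// Assumes a non-empty slice.
fn median_of_three(arr: &[i32]) -> usize {
    let (first, mid, last) = (0, arr.len() / 2, arr.len() - 1);
    let (a, b, c) = (arr[first], arr[mid], arr[last]);
    if (a <= b && b <= c) || (c <= b && b <= a) {
        mid // b is the median
    } else if (b <= a && a <= c) || (c <= a && a <= b) {
        first // a is the median
    } else {
        last // c is the median
    }
}
```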
Rust's built-in sorting (For your reference)
You don't usually implement sorting from scratch!
```rust
let mut numbers = vec![5, 2, 8, 1, 9];

// Sort in place
numbers.sort(); // Uses a hybrid algorithm (typically driftsort)
println!("{:?}", numbers); // [1, 2, 5, 8, 9]

// Sort with a custom comparison
numbers.sort_by(|a, b| b.cmp(a)); // Reverse order
println!("{:?}", numbers); // [9, 8, 5, 2, 1]
```
What Rust uses:
- sort(): Stable sort, O(n log n), merge-sort-based hybrid (driftsort)
- sort_unstable(): Faster, O(n log n), quicksort-based hybrid (e.g. ipnsort)
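For completeness, a short sketch of the unstable variant and the `sort_by_key` convenience (data made up for illustration):

```rust
let mut numbers = vec![5, 2, 8, 1, 9];

// Unstable sort: no order guarantee for equal elements, but typically faster
numbers.sort_unstable();
println!("{:?}", numbers); // [1, 2, 5, 8, 9]

// sort_by_key: sort by a derived key instead of writing a full comparator
let mut words = vec!["banana", "fig", "apple"];
words.sort_by_key(|w| w.len()); // Shortest first
println!("{:?}", words); // ["fig", "apple", "banana"]
```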
When to use which sort?
| Algorithm | When to use |
|---|---|
| Bubble/Insertion | Small arrays (< 50 items), nearly sorted data |
| Merge Sort | Need stable sort, predictable performance, external sorting (too big for memory) |
| Quicksort | General purpose, in-place sorting, average case matters more than worst |
| Rust's sort() | Need stability, default choice |
| Rust's sort_unstable() | Don't need stability, want maximum speed |
Rule of thumb: Use Rust's built-in sort() or sort_unstable() unless you have specific needs
Activity time!
Appendix - sorting custom types in Rust
```rust
#[derive(Debug)]
struct Student {
    name: String,
    gpa: f64,
}

let mut students = vec![
    Student { name: "Alice".to_string(), gpa: 3.8 },
    Student { name: "Bob".to_string(), gpa: 3.9 },
    Student { name: "Charlie".to_string(), gpa: 3.7 },
];

// Sort by GPA (f64 is not Ord, so use partial_cmp)
students.sort_by(|a, b| a.gpa.partial_cmp(&b.gpa).unwrap());

// Or with better NaN handling
students.sort_by(|a, b| {
    a.gpa.partial_cmp(&b.gpa)
        .unwrap_or(std::cmp::Ordering::Equal) // Treat NaN as equal
});

// Sort by name
students.sort_by(|a, b| a.name.cmp(&b.name));
```
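A side note (our suggestion, not covered in lecture): since Rust 1.62, `f64::total_cmp` gives a total order over floats, so NaN needs no special casing:

```rust
// total_cmp orders all f64 values, sorting NaN to the ends
students.sort_by(|a, b| a.gpa.total_cmp(&b.gpa));
```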