Lecture 32 - Sorting Algorithms
Logistics
- HW6 due Friday / HW7 released shortly after
- HW5 grades out soon; corrections will be regraded within a week
Quick note - clarifying space complexity
Learning objectives
By the end of today, you should be able to:
- Describe how different sorting algorithms work (bubble, insertion, merge, quick)
- Analyze the time complexity of each sorting algorithm
- Explain the merge sort and quicksort algorithms in detail
Motivation: Why do we care about sorting?
Sorting is everywhere:
- Search results (Google, Amazon)
- Leaderboards and rankings
- File systems (sort by date, name, size)
- Finding median, percentiles
- Preparing data for efficient search
Many problems become easier with sorted data
Think-pair-share: What makes a sorting algorithm good?
Question: If you have two sorting algorithms with the same time complexity, why might you prefer one over the other?
Bubble Sort: The simplest (and slowest!)
Idea: Repeatedly swap adjacent elements if they're in wrong order
Demo on the board
Algorithm:
- Compare arr[0] and arr[1], swap if needed
- Compare arr[1] and arr[2], swap if needed
- Continue to end of array
- Repeat until no swaps needed
Example: Sort [5, 2, 8, 1, 9]
Pass 1: [5,2,8,1,9] -> [2,5,8,1,9] -> [2,5,8,1,9] -> [2,5,1,8,9] -> [2,5,1,8,9]
Pass 2: [2,5,1,8,9] -> [2,1,5,8,9] -> [2,1,5,8,9]
Pass 3: [2,1,5,8,9] -> [1,2,5,8,9] -> Done!
Bubble sort complexity
```rust
fn bubble_sort(arr: &mut [i32]) {
    let n = arr.len();
    for i in 0..n {                 // Outer loop: up to n passes
        for j in 0..n - i - 1 {     // Inner loop: shrinks by one each pass (~n on average)
            if arr[j] > arr[j + 1] {
                arr.swap(j, j + 1); // O(1)
            }
        }
    }
}
```
Analysis:
- Time complexity: O(n^2) - nested loops
- Space complexity: O(1) - sorts in place
- Stable: Yes - equal elements stay in order
- Best case: O(n) if already sorted (with the early-exit optimization sketched below)
Verdict: Simple but too slow for large data!
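The O(n) best case requires a tweak the code above omits: stop as soon as a full pass makes no swaps. A minimal sketch of that optimization (the `swapped` flag and function name are ours):

```rust
fn bubble_sort_early_exit(arr: &mut [i32]) {
    let n = arr.len();
    for i in 0..n {
        let mut swapped = false; // Did this pass change anything?
        for j in 0..n - i - 1 {
            if arr[j] > arr[j + 1] {
                arr.swap(j, j + 1);
                swapped = true;
            }
        }
        if !swapped {
            break; // No swaps: already sorted, so one O(n) pass suffices
        }
    }
}
```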
Insertion Sort: Like sorting playing cards
Idea: Build sorted portion one element at a time
How you'd sort cards:
- Pick up first card - sorted!
- Pick up second card, insert in right place
- Pick up third card, insert in right place
- Continue...
Example: Sort [5, 2, 8, 1, 9]
[5] | 2, 8, 1, 9 Sorted portion: [5]
[2, 5] | 8, 1, 9 Insert 2: shift 5 right
[2, 5, 8] | 1, 9 Insert 8: already in place
[1, 2, 5, 8] | 9 Insert 1: shift everything
[1, 2, 5, 8, 9] Insert 9: done!
Insertion sort complexity
```rust
fn insertion_sort(arr: &mut [i32]) {
    for i in 1..arr.len() {               // n-1 iterations
        let key = arr[i];
        let mut j = i;
        while j > 0 && arr[j - 1] > key { // Up to i shifts in the worst case
            arr[j] = arr[j - 1];          // Shift the larger element right
            j -= 1;
        }
        arr[j] = key;                     // Insert key into its slot
    }
}
```
Analysis:
- Time complexity: O(n^2) worst case, O(n) best case
- Space complexity: O(1)
- Stable: Yes
- Adaptive: Fast on nearly-sorted data!
Verdict: Good for small or nearly-sorted arrays!
Mini-activity: The sound of sorting
We're going to watch a video comparing different sorting algorithms working on the same data.
Your task: Fill in this table as you watch. The video covers many algorithms; focus especially on:
- Bubble Sort
- Insertion Sort
- Merge Sort
- Quick Sort
For each algorithm, note:
- What pattern/strategy do you see?
- How fast/slow does it seem?
- Any advantages or disadvantages you notice?
| Algorithm | What's the strategy? | Speed | Pros/Cons |
|---|---|---|---|
| Selection Sort | | | |
| Insertion Sort | | | |
| Quick Sort | | | |
| Merge Sort | | | |
| Heap Sort | | | |
| Radix Sort (LSD) | | | |
| Radix Sort (MSD) | | | |
| std::sort | | | |
| std::stable_sort | | | |
| Shell sort | | | |
| Bubble sort | | | |
| Cocktail shaker | | | |
| Gnome sort | | | |
| Bitonic sort | | | |
| Bogo sort | | | |
After watching: What patterns did you notice? Which algorithms seem most efficient?
Takeaways
- Practical algorithms: Quick, Merge, Heap (all O(n log n))
- Special cases: Insertion for nearly-sorted, Radix for integers
- Avoid: Bubble, Selection, Bogo
- Real world: Use language built-ins (introsort, timsort)
Divide-and-conquer for Mergesort
Key idea: Break problem into smaller subproblems, solve recursively, combine results
Merge Sort approach:
- Divide: Split array in half
- Conquer: Sort each half recursively
- Combine: Merge the two sorted halves
Base case: Array of size 1 is already sorted!
Merge sort example
Sort [38, 27, 43, 3, 9, 82, 10]
[38, 27, 43, 3, 9, 82, 10] Split
/ \
[38, 27, 43, 3] [9, 82, 10] Split again
/ \ / \
[38, 27] [43, 3] [9, 82] [10] Split again
/ \ / \ / \ |
[38] [27] [43] [3] [9] [82] [10] Base case - size 1!
Now merge back up:
[27, 38] [3, 43] [9, 82] [10] Merge pairs
[3, 27, 38, 43] [9, 10, 82] Merge pairs
[3, 9, 10, 27, 38, 43, 82] Final merge!
Merging example
Merge [2, 5, 8] and [1, 3, 9]
Left: [2, 5, 8] Right: [1, 3, 9] Result: []
^ ^
Compare 2 vs 1 -> take 1
Left: [2, 5, 8] Right: [1, 3, 9] Result: [1]
^ ^
Compare 2 vs 3 -> take 2
Left: [2, 5, 8] Right: [1, 3, 9] Result: [1, 2]
^ ^
Compare 5 vs 3 -> take 3
... continue until Result: [1, 2, 3, 5, 8, 9]
Time: One comparison per element added = O(n)
Merge sort: Full implementation (skippable)
```rust
fn merge(left: &[i32], right: &[i32]) -> Vec<i32> {
    let mut result = Vec::new();
    let mut i = 0;
    let mut j = 0;
    // Compare the fronts of left and right, take the smaller
    while i < left.len() && j < right.len() {
        if left[i] <= right[j] {
            result.push(left[i]);
            i += 1;
        } else {
            result.push(right[j]);
            j += 1;
        }
    }
    // One side is exhausted; append whatever remains of the other
    result.extend_from_slice(&left[i..]);
    result.extend_from_slice(&right[j..]);
    result
}

fn merge_sort(arr: &[i32]) -> Vec<i32> {
    // Base case: 0 or 1 elements is already sorted
    if arr.len() <= 1 {
        return arr.to_vec();
    }
    // Divide
    let mid = arr.len() / 2;
    let left = merge_sort(&arr[..mid]);  // Recursive!
    let right = merge_sort(&arr[mid..]); // Recursive!
    // Combine: merge the two sorted halves
    merge(&left, &right)
}
```
Merge sort complexity analysis
Time complexity:
- Each level of recursion processes all n elements: O(n)
- How many levels? log_2(n) - we halve the array each time
- Total: O(n log n)
Visual: Binary tree of recursive calls
n Level 0: n work
/ \
n/2 n/2 Level 1: n work total
/ \ / \
n/4 n/4 n/4 n/4 Level 2: n work total
...
Height = log n, each level = n work = O(n log n)
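The same total falls out of the standard merge sort recurrence; a quick derivation for reference (textbook argument, with c the per-element merge cost):

$$T(n) = 2T(n/2) + cn, \qquad T(1) = c$$

Unrolling $k$ levels gives $T(n) = 2^k\,T(n/2^k) + k\,cn$; the recursion bottoms out when $n/2^k = 1$, i.e. $k = \log_2 n$, so $T(n) = cn + cn\log_2 n = O(n \log n)$.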
Space complexity: O(n) - need extra arrays for merging
Properties:
- Stable
- Predictable (always O(n log n))
- NOT in-place (higher space complexity)
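To make "stable" concrete, here is a small sketch using Rust's built-in stable sort (the data is made up for illustration); a stable algorithm like merge sort keeps tied elements in their original order:

```rust
fn main() {
    // (grade, name) pairs; Bob and Carol tie at 85
    let mut records = vec![(90, "Alice"), (85, "Bob"), (85, "Carol"), (70, "Dave")];

    // Stable sort by grade: Bob stays ahead of Carol because he came first
    records.sort_by_key(|&(grade, _)| grade);
    assert_eq!(records, vec![(70, "Dave"), (85, "Bob"), (85, "Carol"), (90, "Alice")]);

    // sort_unstable_by_key makes no such promise: Bob and Carol could swap
}
```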
Quicksort - another divide-and-conquer!
Idea: Pick a "pivot", partition array so:
- All elements < pivot are on the left
- All elements > pivot are on the right
- Recursively sort left and right portions
Difference from merge sort:
- Merge sort: Easy divide, hard combine
- Quicksort: Hard divide (partition), easy combine (nothing!)
Quicksort example
Sort [38, 27, 43, 3, 9, 82, 10] - pick last element as pivot
[38, 27, 43, 3, 9, 82, 10] Pivot = 10
Partition: move elements < 10 to left, > 10 to right
[3, 9, 10, 38, 27, 82, 43]
        ^
  Left  |  Right
Recursively sort left: [3, 9]
Recursively sort right: [38, 27, 82, 43]
Continue until done!
The partition operation
Goal: Rearrange array so pivot is in correct position
High-level algorithm:
- Choose pivot (often last element)
- Scan array, putting small elements left, large elements right
- Put pivot in the middle
- Return pivot's final position (a Rust sketch of this follows the worked example below)
Example partition: Array [38, 27, 43, 3, 9, 82, 10], pivot = 10
Start: [38, 27, 43, 3, 9, 82, 10]
i p
Scan: 38 > 10, skip
27 > 10, skip
43 > 10, skip
3 < 10, found small element!
Swap: [3, 27, 43, 38, 9, 82, 10]
i p
Continue: 9 < 10, swap with 27
[3, 9, 43, 38, 27, 82, 10]
i p
All remaining > 10. Place pivot:
[3, 9, 10, 38, 27, 82, 43]
^
Pivot position = 2
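The trace above is the classic Lomuto partition scheme (pivot = last element). Here is a minimal Rust sketch of quicksort built on it; this is one standard formulation, not the only way to partition:

```rust
// Partition arr around its last element; assumes arr is non-empty.
fn partition(arr: &mut [i32]) -> usize {
    let pivot_index = arr.len() - 1;
    let mut i = 0; // Boundary of the "< pivot" region
    for j in 0..pivot_index {
        if arr[j] < arr[pivot_index] {
            arr.swap(i, j); // Grow the small-element region
            i += 1;
        }
    }
    arr.swap(i, pivot_index); // Drop the pivot between the two regions
    i // Pivot's final position
}

fn quicksort(arr: &mut [i32]) {
    if arr.len() <= 1 {
        return; // Base case: already sorted
    }
    let p = partition(arr);
    let (left, right) = arr.split_at_mut(p);
    quicksort(left);            // Elements < pivot
    quicksort(&mut right[1..]); // Elements >= pivot, skipping the pivot itself
}
```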
Quicksort complexity
Time complexity:
- Best/average case: O(n log n)
  - A good pivot splits the array roughly in half
  - log n levels, n work per level
- Worst case: O(n^2)
  - Bad pivot (smallest/largest every time)
  - Happens when the array is already sorted and we pick the first/last element as pivot!
Space complexity: O(log n) - recursion stack (O(n) in the worst case of unbalanced partitions)
Properties:
- Not stable (elements can jump over equal elements)
- In-place (sorts in original array)
- Often fastest in practice
Improving quicksort: Choosing a better pivot
Problem: Always picking last element can lead to O(n^2)
Solutions:
- Random pivot: Pick random element (most common)
- Median-of-three: Take median of first, middle, last
- Median-of-medians: More complex, guarantees O(n log n)
In practice: Random pivot makes worst case extremely unlikely!
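As one illustration, a hedged sketch of median-of-three selection (the helper name is ours; real implementations fold this into partitioning, e.g. by swapping the chosen pivot to the end and reusing the Lomuto partition above):

```rust
// Index of the median of the first, middle, and last elements.
// Assumes a non-empty slice.
fn median_of_three(arr: &[i32]) -> usize {
    let (first, mid, last) = (0, arr.len() / 2, arr.len() - 1);
    let (a, b, c) = (arr[first], arr[mid], arr[last]);
    if (a <= b && b <= c) || (c <= b && b <= a) {
        mid // b is the median
    } else if (b <= a && a <= c) || (c <= a && a <= b) {
        first // a is the median
    } else {
        last // c is the median
    }
}
```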
Rust's built-in sorting (For your reference)
You don't usually implement sorting from scratch!
```rust
let mut numbers = vec![5, 2, 8, 1, 9];

// Sort in place
numbers.sort(); // Uses a hybrid algorithm (typically driftsort)
println!("{:?}", numbers); // [1, 2, 5, 8, 9]

// Sort with a custom comparison
numbers.sort_by(|a, b| b.cmp(a)); // Reverse order
println!("{:?}", numbers); // [9, 8, 5, 2, 1]
```
What Rust uses:
- sort(): Stable sort, O(n log n), merge-sort-based hybrid (driftsort)
- sort_unstable(): Faster, O(n log n), quicksort-based hybrid (e.g. ipnsort)
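For completeness, a short sketch of the unstable variant and the `sort_by_key` convenience (data made up for illustration):

```rust
let mut numbers = vec![5, 2, 8, 1, 9];

// Unstable sort: no order guarantee for equal elements, but typically faster
numbers.sort_unstable();
println!("{:?}", numbers); // [1, 2, 5, 8, 9]

// sort_by_key: sort by a derived key instead of writing a full comparator
let mut words = vec!["banana", "fig", "apple"];
words.sort_by_key(|w| w.len()); // Shortest first
println!("{:?}", words); // ["fig", "apple", "banana"]
```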
When to use which sort?
| Algorithm | When to use |
|---|---|
| Bubble/Insertion | Small arrays (< 50 items), nearly sorted data |
| Merge Sort | Need stable sort, predictable performance, external sorting (too big for memory) |
| Quicksort | General purpose, in-place sorting, average case matters more than worst |
| Rust's sort() | Need stability, default choice |
| Rust's sort_unstable() | Don't need stability, want maximum speed |
Rule of thumb: Use Rust's built-in sort() or sort_unstable() unless you have specific needs
Activity time!
Appendix - sorting custom types in Rust
```rust
#[derive(Debug)]
struct Student {
    name: String,
    gpa: f64,
}

let mut students = vec![
    Student { name: "Alice".to_string(), gpa: 3.8 },
    Student { name: "Bob".to_string(), gpa: 3.9 },
    Student { name: "Charlie".to_string(), gpa: 3.7 },
];

// Sort by GPA (f64 is not Ord, so use partial_cmp)
students.sort_by(|a, b| a.gpa.partial_cmp(&b.gpa).unwrap());

// Or with better NaN handling
students.sort_by(|a, b| {
    a.gpa.partial_cmp(&b.gpa)
        .unwrap_or(std::cmp::Ordering::Equal) // Treat NaN as equal
});

// Sort by name
students.sort_by(|a, b| a.name.cmp(&b.name));
```
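A side note (our suggestion, not covered in lecture): since Rust 1.62, `f64::total_cmp` gives a total order over floats, so NaN needs no special casing:

```rust
// total_cmp orders all f64 values, sorting NaN to the ends
students.sort_by(|a, b| a.gpa.total_cmp(&b.gpa));
```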