Lecture 25 - Systems and Rust

Logistics

  • Midterm 2 next Wednesday
  • Review sessions in lecture Monday and discussion Tuesday
  • HW5 due next Friday (if you signed up)

Learning objectives

Today we'll explicitly name some of the theoretical motivations underlying this past third of the course. You'll be able to:

  • Explain what systems programming is and why it's different from application programming
  • Identify common memory safety bugs (use-after-free, double-free, data races)
  • Appreciate why Rust makes the design choices it does

What is systems programming? (TC 12:25)

Systems programming: Writing software that controls the machine directly

Examples of systems software:

  • Operating systems
  • Databases
  • Game engines
  • Web browsers
  • Compilers and interpreters
  • Embedded systems (IoT devices, cars)
  • Network infrastructure (routers, load balancers)

Key characteristics:

  • Direct control over memory
  • Performance critical
  • Runs for long periods (can't crash!)
  • Often concurrent/parallel
  • Small memory footprint matters

Contrast: Application programming (Python scripts, web apps) runs on top of systems software

The memory management spectrum

Different languages make different trade-offs:

Control/Performance <----------------------------> Safety/Ease

C/C++              Rust           Java/Go         Python/JS
│                  │              │               │
Manual             Ownership      Garbage         Garbage
Memory             System         Collection      Collection
                                  (predictable)   (unpredictable)
│                  │              │               │
Fast, Unsafe       Fast, Safe     Slower, Safe    Slowest, Safe

Memory safety bugs: the problem Rust solves (TC 12:30)

The billion-dollar mistake

"I call it my billion-dollar mistake...the invention of the null reference." — Tony Hoare (invented null in 1965)

In large C/C++ codebases, roughly 70% of serious security vulnerabilities are memory safety bugs (per data published by Microsoft and Google)

We'll see five bugs that are common in languages like C/C++ but that Rust prevents:

  1. Use-after-free
  2. Double-free
  3. Dangling pointers
  4. Buffer overflow
  5. Data races

Bug #1: Use-after-free

What happens: You free memory, then try to use it

// C code - compiles but UNSAFE!
char* data = malloc(100);
strcpy(data, "hello");

free(data);  // Memory handed back to the allocator (not necessarily the OS)

// Later...
printf("%s", data);  // Reading freed memory!
                     // Might work, might crash, might read garbage

Why it's dangerous:

  • Memory might be reused by something else
  • Could read sensitive data (passwords, credit cards)
  • Could crash your program
  • Unpredictable - might work in testing, fail in production

How Rust prevents this:

The compiler tracks ownership and won't let you use freed memory!
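
A minimal sketch of the Rust analogue of the C snippet above. The function name `use_after_free_demo` is made up for illustration; `drop` is the standard-library way to free a value early. The commented-out line is the one the compiler rejects.

```rust
// Frees a heap-allocated String early and shows that any later use of it
// is a compile-time error rather than a runtime surprise.
fn use_after_free_demo() -> usize {
    let data = String::from("hello"); // heap allocation, owned by `data`
    let len = data.len();

    drop(data); // ownership moves into `drop`; the buffer is freed here

    // println!("{}", data);
    // ^ Uncommenting this line fails to compile:
    //   error[E0382]: borrow of moved value: `data`

    len
}
```

Calling `use_after_free_demo()` returns the length that was read while `data` was still alive. There is no way in safe Rust to read `data` after the `drop`.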

Bug #2: Double-free

What happens: You free the same memory twice

// C code - compiles, but UNDEFINED BEHAVIOR (often a crash)!
char* ptr1 = malloc(100);
char* ptr2 = ptr1;  // Two pointers to same memory

free(ptr1);  // Free once
free(ptr2);  // Free again! CRASH or worse

Why it's dangerous:

  • Corrupts memory allocator's internal state
  • Can lead to security exploits
  • Unpredictable behavior

How Rust prevents this:

Since each value has exactly ONE owner, it can only be freed once
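
Here is a rough Rust counterpart to the two-pointer C example, assuming a `String` stands in for the `malloc`'d buffer. The function name `double_free_demo` is hypothetical. The key difference is that `let ptr2 = ptr1;` is a move, not a second pointer to the same memory.

```rust
// Assigning a String moves ownership, so there is never a second owner
// that could free the same buffer again.
fn double_free_demo() -> String {
    let ptr1 = String::from("hello");
    let ptr2 = ptr1; // MOVE: `ptr1` is no longer usable after this line

    // drop(ptr1);
    // ^ Uncommenting this line fails to compile:
    //   error[E0382]: use of moved value: `ptr1`

    ptr2 // the single owner; the buffer is freed once, when the caller drops it
}
```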

Bug #3: Dangling pointers

What happens: A reference outlives the data it points to

// C code - compiles but UNSAFE!
#include <stdio.h>

int* get_number() {
    int x = 42;
    return &x;  // Returning pointer to local variable!
}                // x's stack storage is gone once the function returns

int main() {
    int* ptr = get_number();
    printf("%d", *ptr);  // Reading a dead stack slot: undefined behavior!
}

How Rust prevents this:

Lifetimes ensure references can't outlive the data they point to!
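
A small sketch of what happens if you try to port `get_number` to Rust literally. The rejected version is shown in comments; the version that compiles returns the value itself instead of a reference to a local.

```rust
// Literal port of the C function. The compiler rejects it:
//
// fn get_number() -> &i32 {
//     let x = 42;
//     &x
// }
//
// error[E0106]: missing lifetime specifier
// (the return type contains a borrowed value, but there is nothing
//  for it to be borrowed from)

// One simple fix: hand ownership of the value to the caller.
fn get_number() -> i32 {
    let x = 42;
    x // the value is returned by copy; no reference to `x` escapes
}
```

The same rule shows up inside a single function: storing `&x` in a variable that outlives `x` is rejected with "`x` does not live long enough" (E0597).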

Bug #4: Buffer overflow

What happens: Writing past the end of an array

// C code - compiles, but may crash or silently corrupt memory!
int arr[5];
for (int i = 0; i <= 5; i++) {  // Off by one! Valid indices are 0..4
    arr[i] = i;  // i == 5 writes past end of array
}

Why it's dangerous:

  • Overwrites other variables
  • Can overwrite return addresses (security exploits!)
  • Famous vulnerabilities: Heartbleed, etc.

How Rust prevents this:

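Rust checks array and slice indices at runtime and panics (a controlled stop, not memory corruption) on an out-of-range index. When the compiler can prove a constant index is out of range, it rejects the code at compile time!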
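
A minimal sketch of the two idiomatic ways around the C loop above. The helper names `fill` and `checked_read` are made up for this example; `iter_mut`, `enumerate`, and `get` are standard slice methods.

```rust
// Iterating over the array itself means there is no index to get wrong.
fn fill(arr: &mut [i32; 5]) {
    for (i, slot) in arr.iter_mut().enumerate() {
        *slot = i as i32;
    }
}

// `get` returns None for an out-of-range index instead of touching memory.
fn checked_read(arr: &[i32; 5], i: usize) -> Option<i32> {
    arr.get(i).copied()
}

// For comparison:
//   arr[i]   with a runtime i >= 5 panics ("index out of bounds")
//   arr[10]  with a constant index is rejected at compile time
//            (the deny-by-default `unconditional_panic` lint)
```

Either way, the neighbouring memory is never read or written.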

Bug #5: Data races (TC 12:40)

What happens: Two threads access the same memory, at least one writes, no synchronization

// C code with threads - compiles but DATA RACE!
int counter = 0;

void* increment(void* arg) {
    for (int i = 0; i < 1000000; i++) {
        counter++;  // Not thread-safe: read-modify-write with no synchronization
    }
    return NULL;  // a void* function must return a value
}

// Run two threads...
// Expected: counter = 2,000,000
// Actual: counter = ??? (unpredictable!)

Why it's dangerous:

  • Unpredictable results
  • Hard to reproduce bugs
  • Can corrupt data structures

How Rust prevents this:

We'll see more about this later, but Rust prevents data races at compile time!
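
As a small preview, here is one way the counter example could look in Rust. This is a sketch, not the only approach: it assumes `Arc<Mutex<u32>>` for the shared counter, and the function name `count_with_two_threads` is hypothetical. Handing both threads a plain mutable reference to the same integer is rejected by the compiler, so the shared state has to go through a synchronized type.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Two threads each add 1,000,000 to a shared counter.
fn count_with_two_threads() -> u32 {
    // Arc = shared ownership across threads, Mutex = one writer at a time.
    let counter = Arc::new(Mutex::new(0u32));
    let mut handles = Vec::new();

    for _ in 0..2 {
        let counter = Arc::clone(&counter); // each thread gets its own handle
        handles.push(thread::spawn(move || {
            for _ in 0..1_000_000 {
                // The lock is held only for this one statement.
                *counter.lock().unwrap() += 1;
            }
        }));
    }

    for handle in handles {
        handle.join().unwrap(); // wait for both threads to finish
    }

    let total = *counter.lock().unwrap();
    total
}
```

Because every increment happens while holding the lock, no updates are lost and the result is the same on every run.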

Memory safety: summary

Bug Type          What It Is                        How Rust Prevents
Use-after-free    Using freed memory                Borrow checker
Double-free       Freeing twice                     Borrow checker
Dangling pointer  Reference outlives data           Lifetimes
Buffer overflow   Array out of bounds               Bounds checking
Data race         Concurrent unsynchronized access  (We'll see!)

Key insight: All caught at compile time (except bounds, which panic safely)!

Feature, not a bug - "Zero-Cost Abstractions"

"What you don't use, you don't pay for. What you do use, you couldn't hand code any better." — Bjarne Stroustrup (C++ creator, but applies to Rust!)

Examples we've seen:

  • Vec<T>: As fast as manual array + size tracking
  • Iterators: Compile to same code as hand-written loops
  • Option/Result: Zero runtime cost vs. manual null checks
  • Traits: Static dispatch = direct function calls
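
To make two of these bullets concrete, here is a short sketch with hypothetical function names. It only shows that the iterator version and the hand-written loop compute the same thing; the claim that they compile to equivalent machine code applies to optimized builds and is not something this snippet checks. The `Option<Box<T>>` size check, on the other hand, is a documented guarantee.

```rust
use std::mem::size_of;

// Hand-written index loop, the way you might write it in C.
fn sum_of_squares_loop(xs: &[u64]) -> u64 {
    let mut total = 0;
    for i in 0..xs.len() {
        total += xs[i] * xs[i];
    }
    total
}

// Same computation with iterator adapters.
fn sum_of_squares_iter(xs: &[u64]) -> u64 {
    xs.iter().map(|x| x * x).sum()
}

// Option<Box<T>> uses the null pointer to represent None, so wrapping a
// Box in Option adds no extra bytes.
fn option_box_is_pointer_sized() -> bool {
    size_of::<Option<Box<i32>>>() == size_of::<Box<i32>>()
}
```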

The cost of memory bugs

Heartbleed (2014)

  • Buffer over-read bug (C) in OpenSSL, a widely used TLS library
  • Leaked passwords, private keys, personal data
  • Affected ~17% of secure (TLS-enabled) web servers
  • In safe Rust, the out-of-bounds read would panic at runtime instead of leaking memory

Dropbox (2016) (not a bug but...)

  • Dropbox's file-sync engine was written in Python
  • Performance bottlenecks, high memory usage, concurrency bugs
  • Dropbox rewrote the whole sync engine in Rust, which reportedly cut memory usage by about 10x, eliminated race-condition bugs, improved performance, and sped up development

WannaCry Ransomware (2017)

  • Used Windows SMB (communications protocol) buffer overflow
  • Infected 200,000+ computers
  • $4 billion in damages
  • In safe Rust, the overflowing write would panic at runtime instead of corrupting memory

Firefox (2019)

  • Double-free vulnerability (C++)
  • Could allow attackers to execute arbitrary code by exploiting the corrupted heap
  • Bounty of $270,000 awarded to white hat hackers
  • Would not compile in Rust
  • Firefox has been gradually moving to Rust

Zoom Vulnerabilities (2020)

  • Use-after-free bugs in video processing allowed for remote code execution
  • Fortunately, researchers found the bugs and they were patched quickly
  • Would not compile in Rust
  • Zoom paid over $7 million in bug bounties from 2019-2023

If you get really into this stuff... https://www.hackerone.com/bug-bounty-programs

Legendary gaming bugs

The corrupted blood incident (World of Warcraft, 2005)

  • Bug in debuff handling (use-after-free-like behavior)
  • Disease spread uncontrollably, "killed" thousands of players
  • CDC studied it as an epidemiology model!
  • Rust's borrow checker would have caught the improper state handling

The nuclear Gandhi (Civilization, 1991)

  • This example was removed because I apparently fell for a rumor and it never happened!

Pokémon item duplication glitch

  • Buffer overflow in inventory management
  • Players could duplicate rare items by exploiting memory corruption
  • Rust's bounds checking would panic instead of corrupting memory
  • (But this one is so much fun!)