Lecture 24 - Lifetimes

Logistics

  • This is the last lecture of new Rust material (systems talk on Friday)
  • Midterm 2 review in lecture on Monday
  • Joey's review in discussion on Tuesday (review topic survey)
  • Midterm 2 on Wednesday (Nov 5)

Learning Objectives

By the end of today, you should be able to:

  • Understand the problem lifetimes solve (dangling references)
  • Read and interpret lifetime annotations in function signatures
  • Know when lifetimes are automatic vs. when you need to write them
  • Recognize lifetime elision rules in action

The Problem: Dangling References

In many languages, this compiles but causes bugs:

// C code - compiles but DANGEROUS!
#include <stdio.h>

char* get_name() {
    char name[] = "Alice";
    return name;  // Returning pointer to local data!
}                // name is freed when function returns!

int main() {
    char* ptr = get_name();
    printf("%s", ptr);  // Reading freed memory - undefined behavior!
}

What happens: The string name is stored on the stack and freed when get_name() returns. The pointer now points to freed memory!

This is called a dangling pointer (a dangling reference, in Rust terms) - one of the most common bugs in C/C++.

(As you know!) Rust prevents this at compile time

#![allow(unused)]
fn main() {
fn get_name() -> &str {
    let name = String::from("Alice");
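    &name[..]  // Would dangle: name is dropped when the function returns
               // (rustc rejects this; it reports a missing lifetime specifier
               // because there is no input the result could borrow from)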
}
}

(Try to run it)

Rust's solution: The compiler tracks how long data lives (its lifetime) and prevents references from outliving their data!
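
One common fix, sketched here under the assumption that the caller just needs the name: return an owned String instead of a reference. Ownership moves to the caller, so nothing is left pointing at freed data.

```rust
// Returning an owned String moves the data out to the caller,
// so there is no reference that could dangle.
fn get_name() -> String {
    let name = String::from("Alice");
    name
}

fn main() {
    let name = get_name();
    println!("{}", name);
}
```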

What are lifetimes?

Lifetime: How long a piece of data is valid in your program

#![allow(unused)]
fn main() {
{
    let x = 5;             // ────┐ x's lifetime starts
    let r = &x;            //     │
    println!("{}", r);     //     │
}                          // ────┘ x's lifetime ends, r becomes invalid
}

Remember the borrow-checker rule: References can't live longer than the data they point to!

Most of the time, Rust figures this out automatically. But sometimes you need to help the compiler understand.

The Challenge: References crossing function boundaries

When references stay in one scope, lifetimes are obvious:

#![allow(unused)]
fn main() {
{
    let data = String::from("hello");
    let reference = &data;
    println!("{}", reference);
}  // Both data and reference end here - clear!
}

But what about functions that take and return references?

#![allow(unused)]
fn main() {
fn process(input: &str) -> &str {
    // Does the output reference come from input?
    // Or from something else?
    // How long will the returned reference be valid?
    input
}
}

The problem: Nothing written in this signature says how the output lifetime relates to the input lifetime!

A key example

#![allow(unused)]
fn main() {
fn longest(x: &str, y: &str) -> &str {
    if x.len() > y.len() { x } else { y }
}
}

Questions the compiler can't answer without help:

  • Does the returned reference come from x or y?
  • How long is the returned reference valid?
  • What if x and y have different lifetimes?

This is why we need lifetime annotations - to tell the compiler how references relate to each other across function boundaries!

When lifetimes are automatic

Rust fills in lifetimes for you in simple cases, using a fixed set of rules called lifetime elision:

#![allow(unused)]
fn main() {
fn first_word(text: &str) -> &str {
    text.split_whitespace().next().unwrap_or("")
}
}

Why this works: There's only one input reference, so the output must come from that input. Rust assumes the returned reference has the same lifetime as text.

#![allow(unused)]
fn main() {
let sentence = String::from("Hello world");
let word = first_word(&sentence);
println!("{}", word);  // Works!
}

Rust automatically knows: "word's lifetime is at most sentence's lifetime"
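
For reference, here is a sketch of what the shorthand stands for. The explicit version is written out by hand, and the name first_word_explicit is made up for this example; both functions should behave the same.

```rust
// Elided form, as in the lecture: no lifetime written.
fn first_word(text: &str) -> &str {
    text.split_whitespace().next().unwrap_or("")
}

// The same signature with the lifetime spelled out:
// the output is valid for as long as `text` is.
fn first_word_explicit<'a>(text: &'a str) -> &'a str {
    text.split_whitespace().next().unwrap_or("")
}

fn main() {
    let sentence = String::from("Hello world");
    println!("{}", first_word(&sentence));
    println!("{}", first_word_explicit(&sentence));
}
```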

When you need to write lifetimes

The compiler needs help when there are multiple possible sources for a returned reference:

#![allow(unused)]
fn main() {
fn longest(x: &str, y: &str) -> &str {
    if x.len() > y.len() { x } else { y }
}
}

Compiler error: missing lifetime specifier

The problem: Rust doesn't know if the returned reference comes from x or y, so it can't check if the reference will be valid!

Lifetime annotation syntax

Lifetime annotations use a single quote followed by a lowercase name:

&i32        // a reference (lifetime inferred)
&'a i32     // a reference with explicit lifetime 'a
&'a mut i32 // a mutable reference with explicit lifetime 'a

Common convention: Use 'a for the first lifetime, 'b for the second, etc.

The name 'a is pronounced "tick a" or "lifetime a"
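
To see the syntax in a full signature, here is a small sketch with two lifetime parameters and a mutable reference. The function append and its parameter names are hypothetical, chosen only to show where each piece of syntax goes.

```rust
// 'a is the lifetime of the buffer borrow, 'b the lifetime of the suffix.
// The output is tied to 'a only: it points into the buffer, not the suffix.
fn append<'a, 'b>(buffer: &'a mut String, suffix: &'b str) -> &'a str {
    buffer.push_str(suffix);
    buffer.as_str()
}

fn main() {
    let mut greeting = String::from("Hello");
    let view = append(&mut greeting, ", world");
    println!("{}", view);
}
```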

Fixing our longest function

#![allow(unused)]
fn main() {
fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() > y.len() { x } else { y }
}
}

What this means:

  • <'a> declares a lifetime parameter named 'a
  • x: &'a str means "x is a reference that lives for lifetime 'a"
  • y: &'a str means "y is a reference that lives for lifetime 'a"
  • -> &'a str means "the return value lives for lifetime 'a"

In English: "For some lifetime 'a, both inputs live at least that long, and the output lives no longer than that."

Understanding the constraint

#![allow(unused)]
fn main() {
fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() > y.len() { x } else { y }
}
}

Think of 'a as the overlap of the two input lifetimes:

x's lifetime:  |──────────────|
y's lifetime:       |────────────|
'a (overlap):       |─────────|

The returned reference can't live longer than the shorter of the two inputs!
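
The flip side, as a sketch: if only one parameter shares 'a with the output, only that parameter limits how long the result can be used. first_of is a hypothetical helper, not part of the lecture code.

```rust
// Only `x` is tied to the returned reference; `_y` gets its own
// (elided) lifetime and does not constrain the result.
fn first_of<'a>(x: &'a str, _y: &str) -> &'a str {
    x
}

fn main() {
    let string1 = String::from("long string");
    let result;

    {
        let string2 = String::from("short");
        result = first_of(&string1, &string2);
    }  // string2 is dropped here, but result only borrows from string1

    println!("{}", result);
}
```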

Using longest: Case 1 (Works!)

fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() > y.len() { x } else { y }
}

fn main() {
    let string1 = String::from("long string");
    let string2 = String::from("short");

    let result = longest(&string1, &string2);
    println!("Longest: {}", result);  // Works!
}

Why it works: Both string1 and string2 live for the entire main function, so result is always valid.

Using longest: Case 2 (Fails!)

fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() > y.len() { x } else { y }
}

fn main() {
    let string1 = String::from("long string");
    let result;

    {
        let string2 = String::from("short");
        result = longest(&string1, &string2);  // Compiler error!
    }  // string2 is dropped here

    println!("{}", result);  // Would use freed memory!
}

Error: string2 doesn't live long enough. The compiler prevents the dangling reference!
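
One possible fix, assuming the program only needs result while string2 is around: use the result inside the inner scope, where both strings are still alive.

```rust
fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() > y.len() { x } else { y }
}

fn main() {
    let string1 = String::from("long string");

    {
        let string2 = String::from("short");
        let result = longest(&string1, &string2);
        println!("{}", result);  // Both strings are still alive here
    }  // string2 and result both end here
}
```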

Lifetime Rules Don't Change Behavior

Important: Lifetime annotations don't change how long data lives. They just help the compiler verify safety.

#![allow(unused)]
fn main() {
fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() > y.len() { x } else { y }
}
}

This doesn't make x or y live longer. It just tells the compiler: "I promise the return value won't outlive either input."

The compiler then checks: "Is that promise kept?"

Lifetimes in Structs

If a struct holds references, it needs lifetime annotations:

struct Excerpt<'a> {
    text: &'a str,
}

fn main() {
    let novel = String::from("Call me Ishmael. Some years ago...");
    let first_sentence = novel.split('.').next().unwrap();
    let excerpt = Excerpt {
        text: first_sentence,
    };
    println!("{}", excerpt.text);  // Works!
}  // novel dropped last, so excerpt is always valid

The 'a ensures: An Excerpt instance can't outlive the data it references.

Exactly when you need and don't need annotations

Simple rule: If a function takes multiple input references and returns a reference, you need annotations. (Methods that take &self are the exception: the output gets the lifetime of &self.)

Automatic (No Annotations Needed):

1a. Function with input reference, returning a reference

#![allow(unused)]
fn main() {
fn first_word(s: &str) -> &str {
    // Compiler knows output comes from s
    s.split_whitespace().next().unwrap_or("")
}
}

1b. Method returning a reference

#![allow(unused)]
fn main() {
impl<'a> Excerpt<'a> {  // the impl header names the struct's lifetime; the method itself needs nothing extra
    fn get_text(&self) -> &str {
        // Compiler knows output comes from self
        self.text
    }
}
}

For 1a and 1b, the output takes the input's lifetime

2. Multiple inputs but no output reference

#![allow(unused)]
fn main() {
fn print_both(x: &str, y: &str) -> String {
    // No reference returned, no problem!
    format!("{} {}", x, y)
}
}

The function returns nothing or an owned value, so there's no mystery.

Need Annotations:

Multiple input references with reference output

#![allow(unused)]
fn main() {
fn longest(x: &str, y: &str) -> &str {  // ✗ Error!
    // Compiler doesn't know if output is from x or y
    if x.len() > y.len() { x } else { y }
}

// Fix: Add lifetime annotations
fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {  // ✓ Works!
    if x.len() > y.len() { x } else { y }
}
}

Methods and lifetimes (FYI)

#![allow(unused)]
fn main() {
struct Excerpt<'a> {
    text: &'a str,
}

impl<'a> Excerpt<'a> {
    fn get_text(&self) -> &str {
        self.text
    }
}
}

Note the syntax:

  • impl<'a> declares the lifetime parameter
  • Excerpt<'a> uses it
  • get_text doesn't need annotations (it uses the lifetime from &self)
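
A sketch of how far the &self rule goes: even with a second reference parameter, a method needs no annotations as long as the output borrows from self. The method text_after is hypothetical, added only for this example.

```rust
struct Excerpt<'a> {
    text: &'a str,
}

impl<'a> Excerpt<'a> {
    // Elided: the output borrows for as long as &self does.
    fn get_text(&self) -> &str {
        self.text
    }

    // Hypothetical method with two reference inputs. Still no annotations:
    // with &self present, the elided output lifetime comes from &self.
    fn text_after(&self, prefix: &str) -> &str {
        self.text.strip_prefix(prefix).unwrap_or(self.text)
    }
}

fn main() {
    let novel = String::from("Call me Ishmael. Some years ago...");
    let excerpt = Excerpt { text: novel.as_str() };
    println!("{}", excerpt.get_text());
    println!("{}", excerpt.text_after("Call me "));
}
```

If text_after tried to return prefix instead, the compiler would reject it, because the elided output lifetime is tied to &self and not to prefix.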

But honestly, just avoid references inside structs when you can
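
A sketch of the owned alternative, assuming an extra copy of the text is acceptable. OwnedExcerpt and make_excerpt are hypothetical names for this example. Because the struct owns its String, it has no lifetime parameter and can outlive the data it was built from.

```rust
// Owns its text, so no lifetime parameter is needed.
struct OwnedExcerpt {
    text: String,
}

impl OwnedExcerpt {
    fn get_text(&self) -> &str {
        &self.text
    }
}

fn make_excerpt() -> OwnedExcerpt {
    let novel = String::from("Call me Ishmael. Some years ago...");
    let first_sentence = novel.split('.').next().unwrap();
    // Copy the slice into a new String so the excerpt does not borrow from novel.
    OwnedExcerpt { text: first_sentence.to_string() }
}  // novel is dropped here; the excerpt keeps its own copy

fn main() {
    let excerpt = make_excerpt();
    println!("{}", excerpt.get_text());
}
```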

The 'static lifetime

One special lifetime you'll see occasionally:

#![allow(unused)]
fn main() {
let s: &'static str = "Hello world";
}

'static means the data lives for the entire program

Common sources of 'static data:

1. String literals

#![allow(unused)]
fn main() {
let s: &'static str = "Hello world";  // Stored in binary
fn get_greeting() -> &'static str {
    "Hello!"  // String literals are always 'static
}
}

2. Static variables

#![allow(unused)]
fn main() {
static MAX_SCORE: i32 = 100;          // Lives for entire program
static APP_NAME: &str = "DataAnalyzer"; // 'static reference
fn get_max() -> &'static i32 {
    &MAX_SCORE  // Can return reference to static
}
}

3. Leaked allocations (rare, but useful sometimes)

We won't cover it, but I felt like I had to mention it for correctness.
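
Because 'static data lives for the entire program, a 'static reference can be used anywhere a shorter-lived reference is expected. This is why first_word could fall back to "" earlier. A small sketch, with the hypothetical helper name_or_default:

```rust
// Returns the input, or a 'static fallback if the input is empty.
fn name_or_default<'a>(name: &'a str) -> &'a str {
    if name.is_empty() {
        "anonymous"  // a &'static str is accepted where &'a str is expected
    } else {
        name
    }
}

fn main() {
    let typed = String::from("Joey");
    println!("{}", name_or_default(&typed));
    println!("{}", name_or_default(""));
}
```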

In practice...

... you'll rarely write lifetime annotations

The good news: The compiler tells you exactly when you need them, and how to write them when you do!

The only thing you're responsible for remembering is: you need explicit lifetime annotations when a function has more than one reference input parameter and returns a reference, because the compiler can't determine the output's lifetime on its own.

When you see a lifetime error:

  1. Read the error message carefully
  2. Add the annotations it suggests
  3. Recompile and let the compiler check if you got it right

Activity - Stack-Heap practice

  • Review HW4 stack-heap answers
  • I'll do one on the board
  • More practice problems on paper
  • I'll review those on the board