Lecture 22 - Generics and Type Systems
Logistics
- HW4 due tonight (new policy doesn't apply)
- New HW policy for HW5-7 (see Piazza after class)
Take-aways from "comfort-check" quiz
Most comfortable:
- Stack and heap memory
- What .clone() does
- .iter() vs. .iter_mut()
- &self, &mut self, self
High variance:
- Ownership and borrow-checker rules
- Why you can't do text[0] on a String
- Modifying a Vec when using .iter()
Least comfortable:
- What .collect() does
- .entry().or_insert() for hashmaps
- .get() on hashmaps returning an Option
- Tuple structs
Learning Objectives (TC 12:25)
By the end of today, you should be able to:
- Write generic functions and structs using type parameters
- Use trait bounds to constrain generic behavior
- Recognize when you've been using generics all along
The Problem with Type-Specific Functions
Python is dynamically typed and quite flexible. We can pass many different types to a function:
def max(x, y):
    return x if x > y else y
>>> max(3, 2)
3
>>> max(3.1, 2.2)
3.1
>>> max('s', 't')
't'
Very flexible! Any downsides?
- The interpreter must determine the operand types each time the function is called
- That runtime lookup incurs a performance penalty
- No compile-time guarantees about type safety
Type system approaches (review)
Dynamic Typing (Python, JavaScript, Ruby, R):
- Types checked at runtime
- Flexible coding
- Prone to runtime errors
Static Typing (C/C++, Java, Rust, Go):
- Types checked at compile-time
- Fast execution
- Early error detection
Rust without generics
Rust is statically typed, so without generics we would have to write a separate version of the function for each type:
fn max_i32(x: i32, y: i32) -> i32 {
    if x > y { x } else { y }
}

fn max_f64(x: f64, y: f64) -> f64 {
    if x > y { x } else { y }
}

fn max_char(x: char, y: char) -> char {
    if x > y { x } else { y }
}
// ... etc., etc.
Problem: Supporting N types = writing N functions!
fn main() {
    println!("{}", max_i32(3, 8)); // 8
    println!("{}", max_f64(3.3, 8.1)); // 8.1
    println!("{}", max_char('a', 'b')); // b
}
The dilemma: Python's flexibility with runtime costs vs. Rust's safety with code duplication?
Solution: Generics give us both flexibility AND compile-time guarantees!
Compiling generic functions (Monomorphization)
Insight: Generics = compile-time code generation for zero runtime cost!
SOURCE CODE COMPILED OUTPUT
┌─ fn max<T>(x: T, y: T) ─┐ ┌ Specialized Functions ─┐
│ where T: PartialOrd │ ───► │ fn max_i32(x: i32, ...)│
│ { │ │ fn max_f64(x: f64, ...)│
│ if x > y { x } else{y}│ │ fn max_char(x: char,..)│
│ } │ └────────────────────────┘
└─────────────────────────┘ Monomorphization
A Simple Generic Example (TC 12:30)
Let's try writing a super simple generic function. Use the <T> syntax to indicate that the function is generic:
fn passit<T>(x: T) -> T {
    x
}
The T is a placeholder for the type (could be any letter, but T for "Type" is conventional).
fn passit<T>(x: T) -> T {
    x
}

fn main() {
    let x = passit(5);
    println!("x is {}", x); // x is 5
    let x = passit(1.1);
    println!("x is {}", x); // x is 1.1
    let x = passit('s');
    println!("x is {}", x); // x is s
}
This works! The function just passes through whatever type it receives.
Okay but that was pretty boring...
Let's try writing a generic max function.
fn max<T>(x: T, y: T) -> T {
    if x > y { x } else { y }
}
... but wait, there's a compiler error: E0369, binary operation > cannot be applied to type T!
Problem: Not all types support > comparison!
The Rust compiler is thorough enough to recognize that not all generic types may have the behavior we want.
Solution: Trait bounds specify required behavior.
Trait Bounds: Constraining Generic Types
So how can we make our max function? We need to add a trait bound to specify that T must support comparison:
use std::cmp::PartialOrd;

fn max<T: PartialOrd>(x: T, y: T) -> T {
    if x > y { x } else { y } // Now it works!
}

fn main() {
    // Type inference determines T:
    println!("{}", max(5, 10)); // T = i32
    println!("{}", max(3.14, 2.7)); // T = f64
    println!("{}", max('a', 'b')); // T = char

    // Complex comes from the external num crate (not part of std)
    let i = num::complex::Complex::new(10, 20);
    let j = num::complex::Complex::new(20, 5);
    // println!("{:?}", max(i, j)); // Won't compile if T doesn't implement PartialOrd
}
Key insight: T: PartialOrd = "T must support comparison operations"
We can place restrictions on the generic types we would support.
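A bound also works in the other direction: a user-defined type can opt in by implementing the trait. The sketch below assumes a hypothetical Version struct (not from the lecture) and derives PartialOrd, which compares fields in declaration order.

```rust
// Hypothetical type for illustration; not part of the lecture code.
// Deriving PartialOrd (which also requires PartialEq) compares fields
// top to bottom: major first, then minor.
#[derive(Debug, PartialEq, PartialOrd)]
struct Version {
    major: u32,
    minor: u32,
}

fn max<T: PartialOrd>(x: T, y: T) -> T {
    if x > y { x } else { y }
}

fn main() {
    let a = Version { major: 1, minor: 4 };
    let b = Version { major: 2, minor: 0 };
    // Version satisfies the T: PartialOrd bound, so this compiles.
    println!("{:?}", max(a, b)); // Version { major: 2, minor: 0 }
}
```

Removing PartialOrd from the derive list would bring back the compiler error at the call site, since Version would no longer meet the bound.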
Quick note on use std::cmp::PartialOrd;
The import in the example above is optional: PartialOrd is in the prelude (automatically imported), along with PartialEq, Eq, Ord, Copy, and Clone.
Derive macros such as #[derive(Debug)] and #[derive(PartialOrd)] are also in the prelude, which is why those never needed an import.
Traits outside the prelude must be imported before they can be named in a trait bound:
use std::fmt::{Debug, Display};
use std::ops::{Add, Sub, Mul, Div};
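As a rough sketch of bounds built from traits that do need imports, the function below combines Add and Display. The name add_and_show is hypothetical, chosen only for this example.

```rust
use std::fmt::Display;
use std::ops::Add;

// Hypothetical helper: adds two values and formats the result.
// Add<Output = T> reads as "T + T produces another T".
fn add_and_show<T: Add<Output = T> + Display>(a: T, b: T) -> String {
    format!("sum = {}", a + b)
}

fn main() {
    println!("{}", add_and_show(2, 3)); // sum = 5
    println!("{}", add_and_show(1.5, 2.0)); // sum = 3.5
}
```

Without the two use lines, the compiler cannot resolve the names Add and Display in the bound.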
Monomorphization in Action (TC 12:35)
// What you write:
fn max<T: PartialOrd>(x: T, y: T) -> T {
    if x > y { x } else { y }
}

fn main() {
    println!("{}", max(5, 10));
    println!("{}", max(3.14, 2.7));
}
// What the compiler generates (conceptually):
fn max_i32(x: i32, y: i32) -> i32 {
    if x > y { x } else { y }
}

fn max_f64(x: f64, y: f64) -> f64 {
    if x > y { x } else { y }
}

fn main() {
    println!("{}", max_i32(5, 10));
    println!("{}", max_f64(3.14, 2.7));
}
Generic Structs
#[derive(Debug)]
struct Point<T> {
    x: T,
    y: T,
}

fn main() {
    // Type inference at work:
    let int_point = Point { x: 5, y: 10 }; // Point<i32>
    let float_point = Point { x: 3.14, y: 2.7 }; // Point<f64>
}
Generic Struct Memory Layout
STACK
┌─ Point<i32> ───┐
│ x: 5 [4 bytes]│
│ y: 10 [4 bytes]│
└────────────────┘
8 bytes total
┌─ Point<f64> ─────┐
│ x: 3.14 [8 bytes]│
│ y: 2.7 [8 bytes]│
└──────────────────┘
16 bytes total
Memory insight: Generic structs adapt their size to the contained types!
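The sizes in the diagram can be checked with std::mem::size_of. This is a small sketch using the same Point struct; the Point<u8> line is an extra case added here for illustration.

```rust
use std::mem::size_of;

#[allow(dead_code)]
struct Point<T> {
    x: T,
    y: T,
}

fn main() {
    // Each concrete Point<T> is a distinct type with its own size.
    println!("{}", size_of::<Point<i32>>()); // 8
    println!("{}", size_of::<Point<f64>>()); // 16
    println!("{}", size_of::<Point<u8>>()); // 2
}
```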
Methods on Generic Structs
#[derive(Debug)]
struct Point<T> {
    x: T,
    y: T,
}

impl<T> Point<T> {
    fn new(x: T, y: T) -> Point<T> {
        Point { x, y }
    }

    fn get_x(&self) -> &T {
        &self.x
    }
}

fn main() {
    // Works for any type:
    let point1 = Point::new(1, 2); // Point<i32>
    let point2 = Point::new(1.5, 2.5); // Point<f64>
    println!("{}", point1.get_x());
    println!("{}", point2.get_x());
}
Trait Bounds on Methods (TC 12:40)
Sometimes a method only works for certain types. Let's implement a swap method:
#[derive(Debug)]
struct Point<T> {
    x: T,
    y: T,
}

// This won't compile!
impl<T> Point<T> {
    fn new(x: T, y: T) -> Point<T> {
        Point { x, y }
    }

    fn swap(&mut self) {
        let temp = self.x; // Might not be Copy!
        self.x = self.y;
        self.y = temp;
    }
}
Problem: We're trying to move self.x out of a value we only hold through &mut self, but T might not implement Copy! (The compiler error suggests a different fix, such as cloning the value.)
Solution: Add a trait bound to the impl block:
#[derive(Debug)]
struct Point<T> {
    x: T,
    y: T,
}

// new is available for every T
impl<T> Point<T> {
    fn new(x: T, y: T) -> Point<T> {
        Point { x, y }
    }
}

// Only implement swap for types that are Copy
impl<T: Copy> Point<T> {
    fn swap(&mut self) {
        let temp = self.x; // OK - T implements Copy
        self.x = self.y;
        self.y = temp;
    }
}

fn main() {
    let mut point = Point::new(2, 3);
    println!("{:?}", point); // Point { x: 2, y: 3 }
    point.swap();
    println!("{:?}", point); // Point { x: 3, y: 2 }
}
Key insight: impl<T: Copy> means "this implementation only exists for types that implement Copy"
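An alternative worth knowing, shown here as a sketch rather than the lecture's approach: std::mem::swap exchanges two values through mutable references without moving anything out of self, so no Copy bound is needed and the method works for types like String.

```rust
#[derive(Debug)]
struct Point<T> {
    x: T,
    y: T,
}

impl<T> Point<T> {
    fn new(x: T, y: T) -> Point<T> {
        Point { x, y }
    }

    // No trait bound needed: the two fields trade places in memory,
    // and nothing is ever moved out of self.
    fn swap(&mut self) {
        std::mem::swap(&mut self.x, &mut self.y);
    }
}

fn main() {
    let mut p = Point::new(String::from("left"), String::from("right"));
    p.swap();
    println!("{:?}", p); // Point { x: "right", y: "left" }
}
```

The trade-off is mostly one of teaching: the Copy-bounded version illustrates conditional impl blocks, while this version shows that a bound is only needed when the method body actually relies on it.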
Common Traits and Bounds (TC 12:45)
use std::fmt::Debug; // Need to import Debug!

// Debug: Check if it can be printed with {:?}
fn debug_value<T: Debug>(val: T) {
    println!("Value: {:?}", val);
}

// Clone: Check if it can be duplicated with .clone()
fn duplicate<T: Clone>(val: &T) -> T {
    val.clone()
}

// Copy: Check if it is automatically copied (no moves)
fn safe_copy<T: Copy>(val: T) -> (T, T) {
    (val, val) // val still usable!
}
Built-in Generic Types (You Know These!)
Remember these from earlier in the semester?
fn main() {
    // Option<T> - maybe has a value
    let maybe_number: Option<i32> = Some(42);

    // Result<T, E> - success or error
    let outcome: Result<i32, String> = Ok(42);

    // Vec<T> - growable array
    let numbers: Vec<i32> = vec![1, 2, 3];

    // Box<T> - heap-allocated value
    let boxed_data: Box<i32> = Box::new(5);
}
Now you understand what the <T> means!
These are all generic types that work with any type T.
When you wrote Option<i32>, you were using a generic enum specialized for i32.
When you wrote Result<f64, String>, you were using a generic enum specialized for returning f64 on success and String on error.
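To show that nothing special is going on, here is a sketch of how a generic enum in the spirit of Option could be declared by hand. Maybe and its unwrap_or method are hypothetical stand-ins, named to avoid clashing with the real standard library type.

```rust
// Hypothetical stand-in for the standard library's Option<T>.
#[derive(Debug, PartialEq)]
enum Maybe<T> {
    Just(T),
    Nothing,
}

impl<T> Maybe<T> {
    // Return the contained value, or the fallback if there is none.
    fn unwrap_or(self, fallback: T) -> T {
        match self {
            Maybe::Just(value) => value,
            Maybe::Nothing => fallback,
        }
    }
}

fn main() {
    let a: Maybe<i32> = Maybe::Just(42);
    let b: Maybe<i32> = Maybe::Nothing;
    println!("{}", a.unwrap_or(0)); // 42
    println!("{}", b.unwrap_or(0)); // 0
}
```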
Ownership Interlude: Trait Bounds Quiz
Question: Explain this function signature / why it's a "safe" max
use std::cmp::PartialOrd;

fn safe_max<T: PartialOrd + Clone>(x: &T, y: &T) -> T {
    if x > y { x.clone() } else { y.clone() }
}
Answer: We take &T parameters to avoid moving the arguments, but need Clone to return an owned T. PartialOrd enables the comparison operation!
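A short usage sketch makes the ownership point concrete: with String arguments, both inputs are still usable after the call because only references were passed in. The variable names are illustrative.

```rust
fn safe_max<T: PartialOrd + Clone>(x: &T, y: &T) -> T {
    if x > y { x.clone() } else { y.clone() }
}

fn main() {
    let a = String::from("apple");
    let b = String::from("banana");

    // Only references are passed, so a and b are not moved.
    let biggest = safe_max(&a, &b);

    // All three values remain usable here.
    println!("{} {} {}", a, b, biggest); // apple banana banana
}
```

The cost of this convenience is the clone: for large values, returning a reference with a lifetime would avoid the copy, at the price of a more involved signature.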
Generic vs. Type-Specific Implementations (TC 12:50)
Even though we have generic methods defined, we can still specify methods for specific types!
#[derive(Debug)]
struct Point<T> {
    x: T,
    y: T,
}

// Generic implementation - works for any type T
impl<T> Point<T> {
    fn new(x: T, y: T) -> Point<T> {
        Point { x, y }
    }
}

// Specialized implementation - ONLY for Point<i32>
impl Point<i32> {
    fn distance_from_origin(&self) -> f64 {
        ((self.x.pow(2) + self.y.pow(2)) as f64).sqrt()
    }
}

// Specialized implementation - ONLY for Point<f64>
impl Point<f64> {
    fn distance_from_origin(&self) -> f64 {
        (self.x.powi(2) + self.y.powi(2)).sqrt()
    }
}

fn main() {
    let int_point = Point::new(3, 4);
    println!("Distance: {}", int_point.distance_from_origin()); // 5.0

    let float_point = Point::new(3.0, 4.0);
    println!("Distance: {}", float_point.distance_from_origin()); // 5.0

    // let char_point = Point::new('a', 'b');
    // char_point.distance_from_origin(); // Error! No such method for Point<char>
}
Why Use Specialized Implementations?
- Different algorithms work better for different types (ints, floats)
- Some methods only make sense for certain types
- Sometimes you want drastically different behavior (e.g., are_you_a_float)
Readable bounds using where
use std::fmt::Debug;

fn analyze_data<T>(values: &[T]) -> Option<T>
where
    T: Ord + Clone + Debug,
{
    values.iter().max().cloned()
}
This is the same as:
use std::fmt::Debug;

fn analyze_data<T: Ord + Clone + Debug>(values: &[T]) -> Option<T> {
    values.iter().max().cloned()
}
Note that the bound is Ord rather than PartialOrd: Iterator::max requires a total order, so this function accepts integers and strings but not f64.
Use where when you have multiple bounds - it's more readable!
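A where clause pays off most once there is more than one type parameter. The sketch below assumes a hypothetical describe_largest function; it finds the largest element with a manual loop, which needs only PartialOrd and therefore also accepts f64.

```rust
use std::fmt::{Debug, Display};

// Hypothetical helper with two type parameters, each with its own bounds.
fn describe_largest<L, V>(label: L, values: &[V]) -> Option<String>
where
    L: Display,
    V: PartialOrd + Debug,
{
    // An empty slice has no largest element, so return None early.
    let mut largest = values.first()?;
    for v in values {
        if v > largest {
            largest = v;
        }
    }
    Some(format!("{}: {:?}", label, largest))
}

fn main() {
    println!("{:?}", describe_largest("max", &[1.5, 9.25, 3.0])); // Some("max: 9.25")
    println!("{:?}", describe_largest(7, &['a', 'z', 'q'])); // Some("7: 'z'")
}
```

Written inline, the same signature would put both bound lists inside the angle brackets, pushing the parameter list far to the right.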
"Polymorphism" and "Monomorphization"
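We say max is polymorphic: one generic definition works for many types. Monomorphization is the compiler turning that one definition into a separate specialized function for each concrete type that is used.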
GENERIC SOURCE                       COMPILER OUTPUT (roughly)
┌─────────────────────┐              ┌─────────────────┐
│ fn max<T>(x, y)     │  ────────►   │ fn max_i32(...) │
│ where T: PartialOrd │              │ fn max_f64(...) │
│ { ... }             │              │ fn max_char(...)│
└─────────────────────┘              └─────────────────┘
      One source                      Multiple functions
What we mean by "zero cost polymorphism"
The compiler generates specialized functions for each type you use.
max(5, 10);     // Compiles to direct i32 comparison (as fast as hand-written max_i32)
max(3.14, 2.7); // Compiles to direct f64 comparison (as fast as hand-written max_f64)
"Zero cost" means:
- No runtime type checking ("is this an i32 or f64?")
- No performance penalty compared to writing separate functions by hand
This is different from languages like Java (generics are type-erased, so primitives such as int must be boxed, which adds overhead) or Python (types are resolved dynamically at runtime).
Activity time
See Gradescope and our B1 website (linked on Piazza) for Activity 22 instructions