Lecture 28 - Testing and Rust from Python
Logistics
- HW4 corrections are due Wednesday night
- Friday we'll have a stack/heap and hand-coding redo opportunity (NEW problems, no grade limit)
- NEXT Tuesday (11/18) corrections during discussion
Today:
- Two independent topics that are super useful
- Part 1: Writing tests in Rust (2/3 of lecture, part of HW6)
- Part 2: Calling Rust from Python (1/3 of lecture)
But first! Solutions to Friday's activity
Why Keep Things Private?
- Encapsulation: External code can't directly modify Person fields, preventing invalid states
- Validation: validate_age() ensures age is always valid when creating a Person
- Flexibility: Can change internal implementation without affecting code that uses these modules
- Clarity: The public API shows exactly what's intended to be used
- Safety: Prevents accidental misuse of helper functions
Learning Objectives
By the end of today, you should be able to:
- Write unit tests in Rust using #[test] and assert! macros
- Run tests with cargo test and understand test output (though you've already been doing this!)
- Understand why testing matters for data science and systems programming
- Know the basics of PyO3 and how to call Rust from Python
- Understand when you might want to use Rust from Python
Why Write Tests?
The reality: Everyone's code has bugs.
Not writing tests is like not wearing a seatbelt because you think you'll never get in an accident.
What everyone does at first:
- Write code
- Run it
- See if it works
- Fix bugs
- Hope you didn't break something else
Better approach:
- Write code
- Write tests
- Run tests automatically
- Fix bugs
- Tests catch if you broke something else!
EVEN BETTER:
- Write tests that capture your desired behavior
- Write code until it passes the tests
- Improve and clean ("refactor") the code
(This is called test-driven development, or TDD - and it's effectively how the homeworks work!)
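A minimal sketch of the test-first loop, assuming a hypothetical clamp_score function (not from the homework). The tests in mod tests would be written first to pin down the desired behavior; the function body would start as todo!() and be filled in until cargo test passes.

```rust
/// Hypothetical example: clamp a raw score into the range 0.0..=100.0.
/// In a test-first workflow this body starts out as `todo!()` and is
/// filled in only after the tests below have been written.
pub fn clamp_score(raw: f64) -> f64 {
    raw.clamp(0.0, 100.0)
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn in_range_scores_are_unchanged() {
        assert_eq!(clamp_score(87.5), 87.5);
    }

    #[test]
    fn scores_above_100_are_capped() {
        assert_eq!(clamp_score(150.0), 100.0);
    }

    #[test]
    fn negative_scores_become_zero() {
        assert_eq!(clamp_score(-3.0), 0.0);
    }
}
```

Once the tests pass, the body can be refactored freely; the tests tell you whether the behavior changed.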
The simplest test
Tests in Rust are just functions with the #[test] attribute:
    #[test]
    fn it_works() {
        assert_eq!(2 + 2, 4);
    }
Run with:
cargo test
Output:
running 1 test
test it_works ... ok
test result: ok. 1 passed; 0 failed
Writing Your First Real Test
Let's test a simple function:
    // Our code
    pub fn add(a: i32, b: i32) -> i32 {
        a + b
    }

    // Our test
    #[test]
    fn test_add() {
        assert_eq!(add(2, 3), 5);
        assert_eq!(add(-1, 1), 0);
        assert_eq!(add(0, 0), 0);
    }
Three parts:
- #[test] - tells Rust this is a test
- Function name (usually starts with test_)
- Assertions (checking if things are true)
Assert Macros: Your Testing Tools
Rust provides several macros for testing:
assert!(condition)
Checks if something is true:
    #[test]
    fn test_positive() {
        let x = 5;
        assert!(x > 0);
    }
assert_eq!(left, right)
Checks if two things are equal:
    #[test]
    fn test_length() {
        let v = vec![1, 2, 3];
        assert_eq!(v.len(), 3);
    }
assert_ne!(left, right)
Checks if two things are NOT equal:
    #[test]
    fn test_different() {
        let a = String::from("hello");
        let b = String::from("world");
        assert_ne!(a, b);
    }
All three panic if the assertion fails - that's how tests fail!
Adding Custom Error Messages
You can add custom messages to make test failures more helpful:
    // calculate_score, data, and parse_data are placeholders for your own code.
    #[test]
    fn test_score_in_range() {
        let score = calculate_score(&data);
        assert!(
            score >= 0.0 && score <= 100.0,
            "Score should be between 0 and 100, but got: {}",
            score
        );
    }

    #[test]
    fn test_parse_result() {
        let result = parse_data("invalid");
        assert_eq!(
            result.len(),
            0,
            "Expected empty result for invalid input, got {} items",
            result.len()
        );
    }
When the test fails, you'll see your custom message. assert_eq! and assert_ne! also print the left and right values; assert! prints only the message.
This is especially helpful when debugging complex data or explaining why a test should pass.
Testing in Modules: Where Tests Live
Convention: Tests (typically) live in a tests module at the bottom of your file:
    pub fn is_even(n: i32) -> bool {
        n % 2 == 0
    }

    pub fn is_positive(n: i32) -> bool {
        n > 0
    }

    #[cfg(test)]
    mod tests {
        use super::*; // Import everything from parent module

        #[test]
        fn test_is_even() {
            assert!(is_even(2));
            assert!(is_even(0));
            assert!(!is_even(3));
        }

        #[test]
        fn test_is_positive() {
            assert!(is_positive(1));
            assert!(!is_positive(0));
            assert!(!is_positive(-1));
        }
    }
Why #[cfg(test)]?
- Only compiles tests when running cargo test
- Keeps your final binary smaller
What Makes a Good Test?
1. Test one thing at a time
Bad:
    #[test]
    fn test_everything() {
        assert_eq!(add(1, 2), 3);
        assert_eq!(multiply(2, 3), 6);
        assert_eq!(divide(10, 2), 5);
    }
Good:
    #[test]
    fn test_add() {
        assert_eq!(add(1, 2), 3);
    }

    #[test]
    fn test_multiply() {
        assert_eq!(multiply(2, 3), 6);
    }

    #[test]
    fn test_divide() {
        assert_eq!(divide(10, 2), 5);
    }
Why? If one fails, you know exactly which function broke!
2. Use #[should_panic] for expected errors
Sometimes functions should panic:
    pub fn get_first(v: &Vec<i32>) -> i32 {
        v[0] // Panics if empty
    }

    #[test]
    #[should_panic]
    fn test_empty_vec() {
        get_first(&vec![]); // Should panic!
    }
Or even better, specify the panic message:
    #[test]
    #[should_panic(expected = "index out of bounds")]
    fn test_empty_vec() {
        get_first(&vec![]);
    }
3. Test edge cases
Don't just test the happy path!
    pub fn average(numbers: &[i32]) -> f64 {
        let sum: i32 = numbers.iter().sum();
        sum as f64 / numbers.len() as f64
    }

    #[cfg(test)]
    mod tests {
        use super::*;

        #[test]
        fn test_average_normal() {
            assert_eq!(average(&[1, 2, 3]), 2.0);
        }

        #[test]
        fn test_average_single() {
            assert_eq!(average(&[42]), 42.0);
        }

        #[test]
        fn test_average_empty() {
            // Floating-point 0.0 / 0.0 does not panic: it yields NaN,
            // so #[should_panic] would make this test fail.
            assert!(average(&[]).is_nan());
        }
    }
Think about: What could go wrong? Test those cases!
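One way to make the empty-input case explicit is to return an Option rather than relying on what division by zero happens to do. A sketch, assuming a hypothetical safe_average (not part of the lecture code):

```rust
/// Hypothetical variant of `average` that returns `None` for empty input
/// instead of dividing by zero.
pub fn safe_average(numbers: &[i32]) -> Option<f64> {
    if numbers.is_empty() {
        return None;
    }
    // Sum in i64 so that large i32 values cannot overflow the accumulator.
    let sum: i64 = numbers.iter().map(|&n| i64::from(n)).sum();
    Some(sum as f64 / numbers.len() as f64)
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn empty_input_gives_none() {
        assert_eq!(safe_average(&[]), None);
    }

    #[test]
    fn negatives_and_positives_cancel() {
        assert_eq!(safe_average(&[-2, 2]), Some(0.0));
    }

    #[test]
    fn large_values_do_not_overflow() {
        assert_eq!(safe_average(&[i32::MAX, i32::MAX]), Some(i32::MAX as f64));
    }
}
```

Each test names the edge case it covers, so a failure points straight at the scenario that broke.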
More Testing Principles to Consider
4. Test behavior, not implementation
- Focus on what your function does, not how it does it
- If you refactor internal logic, tests shouldn't need to change
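As a rough illustration, the sketch below checks two hypothetical implementations of the same function with one shared set of assertions. Because the test only looks at inputs and outputs, swapping the loop for iterators requires no test changes.

```rust
/// Sum of squares, written with an explicit loop.
pub fn sum_of_squares_loop(values: &[i64]) -> i64 {
    let mut total: i64 = 0;
    for v in values {
        total += v * v;
    }
    total
}

/// Same behavior, written with iterators.
pub fn sum_of_squares_iter(values: &[i64]) -> i64 {
    values.iter().map(|v| v * v).sum()
}

#[cfg(test)]
mod tests {
    use super::*;

    /// Checks only inputs and outputs, so it works for any implementation.
    fn check_behavior(f: fn(&[i64]) -> i64) {
        assert_eq!(f(&[]), 0);
        assert_eq!(f(&[3]), 9);
        assert_eq!(f(&[1, -2, 3]), 14);
    }

    #[test]
    fn loop_version_has_expected_behavior() {
        check_behavior(sum_of_squares_loop);
    }

    #[test]
    fn iterator_version_has_expected_behavior() {
        check_behavior(sum_of_squares_iter);
    }
}
```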
5. Don't test the language/library
- Don't test that vec![] creates an empty vector
- Test your logic, not Rust's or your imports' behavior
6. Keep tests independent
- Each test should work on its own, in any order
- Don't rely on one test running before another
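Independence matters in practice because cargo test runs tests in parallel by default, so the order is not guaranteed. A small sketch, assuming a hypothetical drop_above function: each test builds its own data through a helper instead of sharing state.

```rust
/// Hypothetical function under test: remove every value above `max`.
pub fn drop_above(values: &mut Vec<i32>, max: i32) {
    values.retain(|&v| v <= max);
}

#[cfg(test)]
mod tests {
    use super::*;

    /// Each test calls this to get its own fresh data, so no test
    /// depends on what another test did to a shared value.
    fn sample() -> Vec<i32> {
        vec![1, 5, 10, 50]
    }

    #[test]
    fn drops_large_values() {
        let mut v = sample();
        drop_above(&mut v, 5);
        assert_eq!(v, vec![1, 5]);
    }

    #[test]
    fn keeps_everything_when_max_is_large() {
        let mut v = sample();
        drop_above(&mut v, 100);
        assert_eq!(v, sample());
    }
}
```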
Exception: Integration Tests
Unit tests (what we've been writing): Test individual functions in isolation
Integration tests: Test how multiple parts work together
In Rust, integration tests go in a separate tests/ directory:
my_project/
    src/
        lib.rs
    tests/
        integration_test.rs   <- Integration tests here!
Example: Testing that multiple components work together:
    // tests/integration_test.rs
    use my_project::*;

    #[test]
    fn test_full_pipeline() {
        let data = load_data("test.csv");
        let cleaned = remove_outliers(&data, 2.0);
        let result = calculate_statistics(&cleaned);
        assert!(result.mean > 0.0);
    }
Integration tests can depend on multiple functions working correctly - that's the point!
The car gap story.
For this class: Focus on unit tests. Integration tests are good to know about for larger projects.
Example: Testing Data Processing
Real-world example from data science - smoothing time series data:
    pub fn moving_average(data: &[f64], window_size: usize) -> Vec<f64> {
        let mut result = Vec::new();
        for i in 0..data.len() {
            let start = if i < window_size { 0 } else { i - window_size + 1 };
            let window = &data[start..=i];
            let avg = window.iter().sum::<f64>() / window.len() as f64;
            result.push(avg);
        }
        result
    }

    #[cfg(test)]
    mod tests {
        use super::*;

        #[test]
        fn test_moving_average_basic() {
            let data = vec![1.0, 2.0, 3.0, 4.0, 5.0];
            let result = moving_average(&data, 3);
            assert_eq!(result[2], 2.0); // (1+2+3)/3 = 2.0
            assert_eq!(result[4], 4.0); // (3+4+5)/3 = 4.0
        }

        #[test]
        fn test_moving_average_window_one() {
            let data = vec![1.0, 2.0, 3.0];
            let result = moving_average(&data, 1);
            assert_eq!(result, data); // Window of 1 = original data
        }

        #[test]
        fn test_moving_average_small_data() {
            let data = vec![5.0];
            let result = moving_average(&data, 3);
            assert_eq!(result, vec![5.0]); // Works with single element
        }

        #[test]
        fn test_moving_average_window_larger_than_data() {
            let data = vec![2.0, 4.0, 6.0];
            let result = moving_average(&data, 10);
            // Window larger than data: uses all available data
            assert_eq!(result[0], 2.0); // 2/1 = 2.0
            assert_eq!(result[1], 3.0); // (2+4)/2 = 3.0
            assert_eq!(result[2], 4.0); // (2+4+6)/3 = 4.0
        }
    }
Testing Best Practices: Quick Summary
- Write tests BEFORE or AS you code - don't wait until the end
- Test edge cases - empty inputs, negative numbers, zero, etc.
- One assertion (or at least one concept) per test (when possible) - easier to debug
- Use descriptive names - test_average_with_negative_numbers
- Test behavior, not implementation - catch bugs early!
In data science:
- Test your data cleaning functions
- Test statistical calculations with known results
- Test that your parsers handle bad data gracefully
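The last two points can be sketched as follows, using hypothetical mean and parse_reading helpers (assumptions for illustration, not course code). Floating-point results are compared with a small tolerance, and bad input is expected to produce None rather than a panic.

```rust
/// Hypothetical parser: returns `None` instead of panicking on bad input.
/// Non-finite values such as NaN are also rejected.
pub fn parse_reading(field: &str) -> Option<f64> {
    field.trim().parse::<f64>().ok().filter(|x| x.is_finite())
}

/// Mean of a non-empty slice (hypothetical helper).
pub fn mean(values: &[f64]) -> f64 {
    values.iter().sum::<f64>() / values.len() as f64
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn mean_matches_hand_computed_value() {
        // Sums like 0.1 + 0.2 + 0.3 may not be exactly 0.6 in floating
        // point, so compare with a tolerance instead of assert_eq!.
        let m = mean(&[0.1, 0.2, 0.3]);
        assert!((m - 0.2).abs() < 1e-12, "mean was {}", m);
    }

    #[test]
    fn bad_input_is_rejected() {
        assert_eq!(parse_reading("abc"), None);
        assert_eq!(parse_reading(""), None);
        assert_eq!(parse_reading("NaN"), None);
    }

    #[test]
    fn good_input_is_parsed() {
        assert_eq!(parse_reading(" 3.5 "), Some(3.5));
    }
}
```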
Calling Rust from Python
Python is great for:
- Data exploration
- Quick prototyping
- Rich ecosystem (pandas, scikit-learn, pytorch, etc.)
- Easy to write
Rust is great for:
- Performance (often far faster than pure Python on CPU-bound code; the exact speedup depends on the workload)
- Memory safety
- Parallel processing
Best of both worlds: Write the performance-critical parts in Rust, call them from Python!
Use Cases for Rust + Python
Real examples:
- Polars: Fast DataFrame library (like pandas, but written in Rust)
- Cryptography: Python's cryptography library has Rust internals
- Tokenizers: Hugging Face uses Rust for fast NLP tokenization
You might use it for:
- Processing large datasets
- Heavy numerical computations
- Performance-critical parts of your pipeline
PyO3: The Bridge Between Rust and Python
PyO3 is a Rust library that lets you:
- Call Rust from Python
- Call Python from Rust
- Create Python modules in Rust
Installation:
cargo add pyo3 --features extension-module
Exporting your Rust function to Python:
Rust code (src/lib.rs):
    use pyo3::prelude::*;

    #[pyfunction]
    fn add(a: i32, b: i32) -> i32 {
        a + b
    }

    #[pymodule]
    fn my_rust_module(m: &Bound<'_, PyModule>) -> PyResult<()> {
        m.add_function(wrap_pyfunction!(add, m)?)?;
        Ok(())
    }
Building and using: the maturin tool
maturin is a tool for building Python packages in Rust.
Step 1: Create a new Rust-Python project
pip install maturin
maturin new my_project
cd my_project
This creates:
my_project/
    Cargo.toml       <- Rust configuration
    pyproject.toml   <- Python configuration
    src/
        lib.rs       <- Your Rust code goes here!
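Step 2: Write your Rust code in src/lib.rs. The #[pymodule] function name must match the name Python imports (here my_project, not my_rust_module as in the earlier snippet).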
Step 3: Build and install into Python
maturin develop
This compiles the Rust code and installs it as a Python module in your current Python environment.
Step 4: Use it in Python (from any directory, as long as the same Python environment is active)!
import my_project
result = my_project.add(2, 3)
print(result) # 5
How it works: maturin develop compiles src/lib.rs into a native extension module that Python can load, then installs it into the active Python environment (its site-packages), where import can find it.
A Practical Example: Fast String Processing
Rust side (src/lib.rs):
    use pyo3::prelude::*;

    #[pyfunction]
    fn count_words(text: String) -> usize {
        text.split_whitespace().count()
    }

    #[pymodule]
    fn string_tools(m: &Bound<'_, PyModule>) -> PyResult<()> {
        m.add_function(wrap_pyfunction!(count_words, m)?)?;
        Ok(())
    }
Python side:
import string_tools
text = "The quick brown fox jumps over the lazy dog"
count = string_tools.count_words(text)
print(f"Word count: {count}") # Word count: 9
Simple, but imagine this with millions of strings!
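The two halves of this lecture fit together: one possible pattern is to keep the real logic in a plain Rust function with no PyO3 types, so cargo test can exercise it directly, and make the #[pyfunction] a thin wrapper. The sketch below shows only the plain-Rust half; count_words_impl is a hypothetical name, and the wrapper appears as a comment because it needs the pyo3 crate.

```rust
/// Plain Rust logic with no PyO3 types, so `cargo test` can call it directly.
/// (`count_words_impl` is a hypothetical name, not from the lecture code.)
pub fn count_words_impl(text: &str) -> usize {
    text.split_whitespace().count()
}

// The PyO3 side would then be a thin wrapper, along the lines of the
// lecture's `count_words`:
//
//     #[pyfunction]
//     fn count_words(text: String) -> usize {
//         count_words_impl(&text)
//     }

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn counts_words_in_a_sentence() {
        let text = "The quick brown fox jumps over the lazy dog";
        assert_eq!(count_words_impl(text), 9);
    }

    #[test]
    fn empty_string_has_no_words() {
        assert_eq!(count_words_impl(""), 0);
    }

    #[test]
    fn extra_whitespace_is_ignored() {
        assert_eq!(count_words_impl("  a \n b\t"), 2);
    }
}
```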
Realistic example with mixed code
# Python file: main.py
# Use Rust for the slow parts
import my_fast_rust_module
# Use Python for the easy parts
import pandas as pd
import matplotlib.pyplot as plt
# Load data with pandas
df = pd.read_csv("data.csv")
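# Process with Rust (fast!)
# (process_dataframe is a placeholder name; real code typically passes
# columns to Rust as lists or NumPy arrays)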
results = my_fast_rust_module.process_dataframe(df)
# Visualize with matplotlib (easy!)
plt.hist(results)
plt.show()
Summary: Testing + Python Integration
Testing:
- Use #[test] and assert! macros
- Run with cargo test
- Test edge cases and expected errors
- Write tests as you code!
Rust from Python:
- Use PyO3 + maturin
- #[pyfunction] for functions
- #[pymodule] for modules
- Build with maturin develop
- Use when you need performance
Both are about making Rust practical for real projects!
Activity: Writing tests for Friday's activity
Go to our site: https://trgardos.github.io/ds210-fa25-private/b1/activities/activity_28.html for code and instructions