Lecture 28 - Testing and Rust from Python

Logistics

HW4 corrections are due Wednesday night
Friday we'll have a stack/heap and hand-coding redo opportunity (NEW problems, no grade limit)
NEXT Tuesday (11/18) corrections during discussion

Today:

Two independent topics that are super useful
- Part 1: Writing tests in Rust (2/3 of lecture, part of HW6)
- Part 2: Calling Rust from Python (1/3 of lecture)

But first! Solutions to Friday's activity

Why Keep Things Private?

Encapsulation: External code can't directly modify Person fields, preventing invalid states
Validation: validate_age() ensures age is always valid when creating a Person
Flexibility: Can change internal implementation without affecting code that uses these modules
Clarity: The public API shows exactly what's intended to be used
Safety: Prevents accidental misuse of helper functions

Learning Objectives

By the end of today, you should be able to:

Write unit tests in Rust using #[test] and assert! macros
Run tests with cargo test and understand test output (though you've already been doing this!)
Understand why testing matters for data science and systems programming
Know the basics of PyO3 and how to call Rust from Python
Understand when you might want to use Rust from Python

Why Write Tests?

The reality: Everyone's code has bugs.

Not writing tests is like not wearing a seatbelt because you think you'll never get in an accident.

What everyone does at first:

Write code
Run it
See if it works
Fix bugs
Hope you didn't break something else

Better approach:

Write code
Write tests
Run tests automatically
Fix bugs
Tests catch if you broke something else!

EVEN BETTER:

Write tests that capture your desired behavior
Write code until it passes the tests
Improve and clean ("refactor") the code

(This is called test-driven development, or TDD - and it's effectively how the homeworks work!)
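A minimal sketch of that loop, assuming a hypothetical function `celsius_to_fahrenheit` (it is not part of any homework). The tests were written first to pin down the desired behavior, and the function body was filled in afterwards until `cargo test` passed.

```rust
// Step 2: the implementation, written only after the tests below existed.
// `celsius_to_fahrenheit` is a hypothetical example function.
pub fn celsius_to_fahrenheit(celsius: f64) -> f64 {
    celsius * 9.0 / 5.0 + 32.0
}

#[cfg(test)]
mod tests {
    use super::*;

    // Step 1: these tests came first and describe the desired behavior.
    #[test]
    fn test_freezing_point() {
        assert_eq!(celsius_to_fahrenheit(0.0), 32.0);
    }

    #[test]
    fn test_boiling_point() {
        assert_eq!(celsius_to_fahrenheit(100.0), 212.0);
    }
}
```

Step 3 (refactor) is then safe: any rewrite of the function body has to keep both tests passing.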

The simplest test

Tests in Rust are just functions marked with the #[test] attribute:

#[test]
fn it_works() {
    assert_eq!(2 + 2, 4);
}

Run with:

cargo test

Output:

running 1 test
test it_works ... ok

test result: ok. 1 passed; 0 failed

Writing Your First Real Test

Let's test a simple function:

// Our code
pub fn add(a: i32, b: i32) -> i32 {
    a + b
}

// Our test
#[test]
fn test_add() {
    assert_eq!(add(2, 3), 5);
    assert_eq!(add(-1, 1), 0);
    assert_eq!(add(0, 0), 0);
}

Three parts:

#[test] - tells Rust this is a test
Function name (usually starts with test_)
Assertions (checking if things are true)

Assert Macros: Your Testing Tools

Rust provides several macros for testing:

`assert!(condition)`

Checks if something is true:

#[test]
fn test_positive() {
    let x = 5;
    assert!(x > 0);
}

`assert_eq!(left, right)`

Checks if two things are equal:

#[test]
fn test_length() {
    let v = vec![1, 2, 3];
    assert_eq!(v.len(), 3);
}

`assert_ne!(left, right)`

Checks if two things are NOT equal:

#[test]
fn test_different() {
    let a = String::from("hello");
    let b = String::from("world");
    assert_ne!(a, b);
}

All three panic if the assertion fails - that's how tests fail!

Adding Custom Error Messages

You can add custom messages to make test failures more helpful:

// `calculate_score`, `parse_data`, and `data` are assumed to be
// defined elsewhere in your crate; they are not shown here.
#[test]
fn test_score_in_range() {
    let score = calculate_score(&data);
    assert!(
        score >= 0.0 && score <= 100.0,
        "Score should be between 0 and 100, but got: {}",
        score
    );
}

#[test]
fn test_parse_result() {
    let result = parse_data("invalid");
    assert_eq!(
        result.len(),
        0,
        "Expected empty result for invalid input, got {} items",
        result.len()
    );
}

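When the test fails, you'll see your custom message in the output. assert_eq! and assert_ne! also print the left and right values; plain assert! prints only the message, so include the value in it.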

This is especially helpful when debugging complex data or explaining why a test should pass.
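For a version that compiles on its own, here is a minimal sketch. The body of `calculate_score` (percentage of `true` entries) is purely an assumption made for illustration; the lecture does not define it.

```rust
// Hypothetical scoring function: percentage of entries that are `true`.
pub fn calculate_score(data: &[bool]) -> f64 {
    if data.is_empty() {
        return 0.0;
    }
    let passed = data.iter().filter(|&&ok| ok).count();
    100.0 * passed as f64 / data.len() as f64
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_score_in_range() {
        let data = [true, false, true, true];
        let score = calculate_score(&data);
        // The message names the offending value, since plain assert!
        // does not print it on its own.
        assert!(
            (0.0..=100.0).contains(&score),
            "Score should be between 0 and 100, but got: {}",
            score
        );
    }
}
```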

Testing in Modules: Where Tests Live

Convention: Tests (typically) live in a tests module at the bottom of your file:

pub fn is_even(n: i32) -> bool {
    n % 2 == 0
}

pub fn is_positive(n: i32) -> bool {
    n > 0
}

#[cfg(test)]
mod tests {
    use super::*;  // Import everything from parent module

    #[test]
    fn test_is_even() {
        assert!(is_even(2));
        assert!(is_even(0));
        assert!(!is_even(3));
    }

    #[test]
    fn test_is_positive() {
        assert!(is_positive(1));
        assert!(!is_positive(0));
        assert!(!is_positive(-1));
    }
}

Why #[cfg(test)]?

Only compiles tests when running cargo test
Keeps your final binary smaller
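Because the tests module is a child of the module it tests, `use super::*` also brings in private items, so helper functions can be tested without making them public. The sketch below assumes a hypothetical signature for `validate_age` and a made-up `can_vote` function. It also shows a test-only helper that exists only under `cargo test`.

```rust
// Private helper (no `pub`): invisible to other modules and crates.
// The signature is an assumption for this sketch.
fn validate_age(age: i32) -> bool {
    (0..=150).contains(&age)
}

// Hypothetical public function that relies on the private helper.
pub fn can_vote(age: i32) -> bool {
    validate_age(age) && age >= 18
}

#[cfg(test)]
mod tests {
    use super::*; // includes the private `validate_age`

    // Test-only helper: compiled only when running `cargo test`.
    fn sample_ages() -> Vec<i32> {
        vec![-1, 0, 17, 18, 150, 151]
    }

    #[test]
    fn test_validate_age_private_helper() {
        let valid: Vec<bool> = sample_ages().into_iter().map(validate_age).collect();
        assert_eq!(valid, vec![false, true, true, true, true, false]);
    }

    #[test]
    fn test_can_vote() {
        assert!(can_vote(18));
        assert!(!can_vote(17));
        assert!(!can_vote(151)); // fails validation
    }
}
```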

What Makes a Good Test?

1. Test one thing at a time

Bad:

#[test]
fn test_everything() {
    assert_eq!(add(1, 2), 3);
    assert_eq!(multiply(2, 3), 6);
    assert_eq!(divide(10, 2), 5);
}

Good:

#[test]
fn test_add() {
    assert_eq!(add(1, 2), 3);
}

#[test]
fn test_multiply() {
    assert_eq!(multiply(2, 3), 6);
}

#[test]
fn test_divide() {
    assert_eq!(divide(10, 2), 5);
}

Why? If one fails, you know exactly which function broke!

2. Use `#[should_panic]` for expected errors

Sometimes functions should panic:

pub fn get_first(v: &Vec<i32>) -> i32 {
    v[0]  // Panics if empty
}

#[test]
#[should_panic]
fn test_empty_vec() {
    get_first(&vec![]);  // Should panic!
}

Or even better, specify the panic message:

#[test]
#[should_panic(expected = "index out of bounds")]
fn test_empty_vec() {
    get_first(&vec![]);
}
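When a failure is something callers should be able to handle, returning an Option or Result is often preferred over panicking, and the test can assert on the returned value directly. A test function may also return a Result itself, which allows `?` inside the test. The sketch below uses two hypothetical functions, `get_first_checked` and `parse_count`, that are not part of the lecture code.

```rust
use std::num::ParseIntError;

// Hypothetical non-panicking variant of `get_first`.
pub fn get_first_checked(v: &[i32]) -> Option<i32> {
    v.first().copied()
}

// Hypothetical parser that reports bad input as an Err.
pub fn parse_count(text: &str) -> Result<i32, ParseIntError> {
    text.trim().parse::<i32>()
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_get_first_checked_empty() {
        assert_eq!(get_first_checked(&[]), None);
    }

    // A test that returns Result fails if it returns Err.
    #[test]
    fn test_parse_count_ok() -> Result<(), ParseIntError> {
        let n = parse_count(" 42 ")?;
        assert_eq!(n, 42);
        Ok(())
    }

    #[test]
    fn test_parse_count_invalid() {
        assert!(parse_count("forty-two").is_err());
    }
}
```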

3. Test edge cases

Don't just test the happy path!

pub fn average(numbers: &[i32]) -> f64 {
    // Floating-point division by zero yields NaN instead of panicking,
    // so reject empty input explicitly.
    assert!(!numbers.is_empty(), "cannot average an empty slice");
    let sum: i32 = numbers.iter().sum();
    sum as f64 / numbers.len() as f64
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_average_normal() {
        assert_eq!(average(&[1, 2, 3]), 2.0);
    }

    #[test]
    fn test_average_single() {
        assert_eq!(average(&[42]), 42.0);
    }

    #[test]
    #[should_panic]  // We expect this to panic!
    fn test_average_empty() {
        average(&[]);  // Empty input is rejected by the assert! above
    }
}

Think about: What could go wrong? Test those cases!
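An alternative design, sketched under the assumption that callers would rather handle the empty case than crash: a hypothetical `average_checked` returns Option<f64>, and the edge cases become ordinary assertions.

```rust
// Hypothetical variant of `average` that reports "no data" as None.
pub fn average_checked(numbers: &[i32]) -> Option<f64> {
    if numbers.is_empty() {
        return None;
    }
    // Sum in i64 so that large i32 inputs cannot overflow the accumulator.
    let sum: i64 = numbers.iter().map(|&n| i64::from(n)).sum();
    Some(sum as f64 / numbers.len() as f64)
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_average_checked_empty() {
        assert_eq!(average_checked(&[]), None);
    }

    #[test]
    fn test_average_checked_negative_numbers() {
        assert_eq!(average_checked(&[-2, 2]), Some(0.0));
    }

    #[test]
    fn test_average_checked_large_values() {
        // Summing these two values in i32 would overflow.
        assert_eq!(
            average_checked(&[i32::MAX, i32::MAX]),
            Some(i32::MAX as f64)
        );
    }
}
```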

More Testing Principles to Consider

4. Test behavior, not implementation

Focus on what your function does, not how it does it
If you refactor internal logic, tests shouldn't need to change

5. Don't test the language/library

Don't test that vec![] creates an empty vector
Test your logic, not Rust's or your imports' behavior

6. Keep tests independent

Each test should work on its own, in any order
Don't rely on one test running before another
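One common way to keep tests independent, sketched here with a hypothetical `remove_negatives` function: each test builds its own fresh data through a small helper instead of sharing mutable state. This matters in practice because `cargo test` runs tests in parallel by default, so no order can be assumed.

```rust
// Hypothetical function under test: drops negative values in place.
pub fn remove_negatives(values: &mut Vec<i32>) {
    values.retain(|&v| v >= 0);
}

#[cfg(test)]
mod tests {
    use super::*;

    // Every test calls this to get its own copy of the sample data.
    fn sample_values() -> Vec<i32> {
        vec![3, -1, 4, -1, 5]
    }

    #[test]
    fn test_remove_negatives_drops_negative_values() {
        let mut values = sample_values();
        remove_negatives(&mut values);
        assert_eq!(values, vec![3, 4, 5]);
    }

    #[test]
    fn test_remove_negatives_on_empty_input() {
        let mut values: Vec<i32> = Vec::new();
        remove_negatives(&mut values);
        assert!(values.is_empty());
    }

    #[test]
    fn test_sample_is_unaffected_by_other_tests() {
        // Passes regardless of which tests ran before it.
        assert_eq!(sample_values().len(), 5);
    }
}
```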

Exception: Integration Tests

Unit tests (what we've been writing): Test individual functions in isolation

Integration tests: Test how multiple parts work together

In Rust, integration tests go in a separate tests/ directory:

my_project/
  src/
    lib.rs
  tests/
    integration_test.rs  <- Integration tests here!

Example: Testing that multiple components work together:

// tests/integration_test.rs
use my_project::*;

#[test]
fn test_full_pipeline() {
    let data = load_data("test.csv");
    let cleaned = remove_outliers(&data, 2.0);
    let result = calculate_statistics(&cleaned);
    assert!(result.mean > 0.0);
}

Integration tests can depend on multiple functions working correctly - that's the point!

The car gap story.

For this class: Focus on unit tests. Integration tests are good to know about for larger projects.
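To make the pipeline example more concrete, here is one possible sketch of the library side (src/lib.rs). Everything in it is an assumption: the lecture only names `remove_outliers`, `calculate_statistics`, and a `mean` field, so the `Statistics` struct, the standard-deviation rule, and the sample data are invented for illustration. `load_data` is left out to keep the sketch free of file I/O.

```rust
// Hypothetical result type; only the `mean` field appears in the lecture.
pub struct Statistics {
    pub mean: f64,
    pub count: usize,
}

// Mean of a slice; returns 0.0 for empty input (an arbitrary choice here).
fn mean(data: &[f64]) -> f64 {
    if data.is_empty() {
        0.0
    } else {
        data.iter().sum::<f64>() / data.len() as f64
    }
}

// Keeps values within `threshold` standard deviations of the mean.
pub fn remove_outliers(data: &[f64], threshold: f64) -> Vec<f64> {
    let m = mean(data);
    let squared: Vec<f64> = data.iter().map(|x| (x - m).powi(2)).collect();
    let std_dev = mean(&squared).sqrt();
    data.iter()
        .copied()
        .filter(|x| (x - m).abs() <= threshold * std_dev)
        .collect()
}

pub fn calculate_statistics(data: &[f64]) -> Statistics {
    Statistics { mean: mean(data), count: data.len() }
}

#[cfg(test)]
mod tests {
    use super::*;

    // Exercises two functions together, like the integration test would.
    #[test]
    fn test_pipeline_drops_outlier() {
        let data = [1.0, 2.0, 3.0, 2.0, 100.0];
        let cleaned = remove_outliers(&data, 1.5);
        let result = calculate_statistics(&cleaned);
        assert_eq!(result.count, 4);
        assert_eq!(result.mean, 2.0);
    }
}
```

With this particular data a threshold of 2.0 would keep the value 100.0, because the outlier itself inflates the standard deviation. That interaction between components is the kind of thing an integration test can surface.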

Example: Testing Data Processing

Real-world example from data science - smoothing time series data:

pub fn moving_average(data: &[f64], window_size: usize) -> Vec<f64> {
    let mut result = Vec::new();
    for i in 0..data.len() {
        let start = if i < window_size { 0 } else { i - window_size + 1 };
        let window = &data[start..=i];
        let avg = window.iter().sum::<f64>() / window.len() as f64;
        result.push(avg);
    }
    result
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_moving_average_basic() {
        let data = vec![1.0, 2.0, 3.0, 4.0, 5.0];
        let result = moving_average(&data, 3);
        assert_eq!(result[2], 2.0);  // (1+2+3)/3 = 2.0
        assert_eq!(result[4], 4.0);  // (3+4+5)/3 = 4.0
    }

    #[test]
    fn test_moving_average_window_one() {
        let data = vec![1.0, 2.0, 3.0];
        let result = moving_average(&data, 1);
        assert_eq!(result, data);  // Window of 1 = original data
    }

    #[test]
    fn test_moving_average_small_data() {
        let data = vec![5.0];
        let result = moving_average(&data, 3);
        assert_eq!(result, vec![5.0]);  // Works with single element
    }

    #[test]
    fn test_moving_average_window_larger_than_data() {
        let data = vec![2.0, 4.0, 6.0];
        let result = moving_average(&data, 10);
        // Window larger than data: uses all available data
        assert_eq!(result[0], 2.0);        // 2/1 = 2.0
        assert_eq!(result[1], 3.0);        // (2+4)/2 = 3.0
        assert_eq!(result[2], 4.0);        // (2+4+6)/3 = 4.0
    }
}
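One edge case the tests above do not cover is a window_size of 0. Tracing the code as written suggests it does not panic: every window is an empty slice, and 0.0 / 0.0 is NaN, so each output element should be NaN. The sketch below copies the function so the block stands alone and adds a test that pins this behavior down. Whether NaN, a panic, or an error is the right answer is a design decision the lecture does not specify.

```rust
pub fn moving_average(data: &[f64], window_size: usize) -> Vec<f64> {
    let mut result = Vec::new();
    for i in 0..data.len() {
        let start = if i < window_size { 0 } else { i - window_size + 1 };
        let window = &data[start..=i];
        let avg = window.iter().sum::<f64>() / window.len() as f64;
        result.push(avg);
    }
    result
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_moving_average_window_zero_yields_nan() {
        let result = moving_average(&[1.0, 2.0, 3.0], 0);
        assert_eq!(result.len(), 3);
        // NaN never equals itself, so use is_nan() rather than assert_eq!.
        assert!(result.iter().all(|x| x.is_nan()));
    }

    #[test]
    fn test_moving_average_empty_data() {
        assert!(moving_average(&[], 3).is_empty());
    }
}
```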

Testing Best Practices: Quick Summary

Write tests BEFORE or AS you code - don't wait until the end
Test edge cases - empty inputs, negative numbers, zero, etc.
One assertion (or at least one concept) per test (when possible) - easier to debug
Use descriptive names - test_average_with_negative_numbers
Test behavior, not implementation - tests should keep passing when you refactor

In data science:

Test your data cleaning functions
Test statistical calculations with known results
Test that your parsers handle bad data gracefully
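For statistical calculations, exact float equality can fail because of rounding, so a known result is usually compared within a tolerance. A minimal sketch, assuming a hypothetical `approx_eq` helper and an arbitrarily chosen tolerance of 1e-12:

```rust
pub fn mean(values: &[f64]) -> f64 {
    values.iter().sum::<f64>() / values.len() as f64
}

// Hypothetical helper: true if `a` and `b` differ by at most `tolerance`.
pub fn approx_eq(a: f64, b: f64, tolerance: f64) -> bool {
    (a - b).abs() <= tolerance
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_float_rounding_is_real() {
        // The classic example: this sum is not exactly 0.3.
        assert!(0.1 + 0.2 != 0.3);
    }

    #[test]
    fn test_mean_known_result() {
        let result = mean(&[0.1, 0.2, 0.3]);
        assert!(
            approx_eq(result, 0.2, 1e-12),
            "Expected mean close to 0.2, got {}",
            result
        );
    }
}
```

The right tolerance depends on the magnitude of the data and the calculation; 1e-12 is only a placeholder here.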

Calling Rust from Python

Python is great for:

Data exploration
Quick prototyping
Rich ecosystem (pandas, scikit-learn, pytorch, etc.)
Easy to write

Rust is great for:

Performance (can be 10-100x faster than pure Python on CPU-bound code, depending on the workload)
Memory safety
Parallel processing

Best of both worlds: Write the performance-critical parts in Rust, call them from Python!

Use Cases for Rust + Python

Real examples:

Polars: Fast DataFrame library (like pandas, but written in Rust)
Cryptography: Python's cryptography library has Rust internals
Tokenizers: Hugging Face uses Rust for fast NLP tokenization

You might use it for:

Processing large datasets
Heavy numerical computations
Performance-critical parts of your pipeline

PyO3: The Bridge Between Rust and Python

PyO3 is a Rust library that lets you:

Call Rust from Python
Call Python from Rust
Create Python modules in Rust

Installation:

cargo add pyo3 --features extension-module

Exporting your Rust function to Python:

Rust code (src/lib.rs):

use pyo3::prelude::*;

#[pyfunction]
fn add(a: i32, b: i32) -> i32 {
    a + b
}

#[pymodule]
fn my_rust_module(m: &Bound<'_, PyModule>) -> PyResult<()> {
    m.add_function(wrap_pyfunction!(add, m)?)?;
    Ok(())
}

Building and using: the `maturin` tool

maturin is a tool for building Python packages in Rust.

Step 1: Create a new Rust-Python project

pip install maturin
maturin new my_project
cd my_project

This creates:

my_project/
  Cargo.toml          <- Rust configuration
  pyproject.toml      <- Python configuration
  src/
    lib.rs            <- Your Rust code goes here!

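Step 2: Write your Rust code in src/lib.rs. The #[pymodule] function name must match the module name Python imports (here my_project, not my_rust_module as in the earlier snippet), or the import will fail.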

Step 3: Build and install into Python

maturin develop

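This compiles the Rust code and installs it as a Python module in your current Python environment (maturin develop typically expects an active virtual environment or conda environment).

Step 4: Use it in Python (from any directory, as long as the same Python environment is active)!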

import my_project
result = my_project.add(2, 3)
print(result)  # 5

How it works: maturin develop compiles src/lib.rs into a shared library that Python can load as an extension module, then installs it where Python can find it (the environment's site-packages).

A Practical Example: Fast String Processing

Rust side (src/lib.rs):

use pyo3::prelude::*;

#[pyfunction]
fn count_words(text: String) -> usize {
    text.split_whitespace().count()
}

#[pymodule]
fn string_tools(m: &Bound<'_, PyModule>) -> PyResult<()> {
    m.add_function(wrap_pyfunction!(count_words, m)?)?;
    Ok(())
}

Python side:

import string_tools

text = "The quick brown fox jumps over the lazy dog"
count = string_tools.count_words(text)
print(f"Word count: {count}")  # Word count: 9

Simple, but imagine this with millions of strings!
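The two halves of this lecture combine nicely here. One possible layout, sketched below, keeps the real logic in a plain Rust function with no PyO3 types, so `cargo test` can exercise it directly, and leaves the Python-facing wrapper as a one-liner. The name `count_words_core` is hypothetical, and the wrapper is shown as a comment so the block compiles without the pyo3 crate.

```rust
// Plain Rust: no PyO3 types, so it can be unit-tested like any other function.
pub fn count_words_core(text: &str) -> usize {
    text.split_whitespace().count()
}

// The PyO3 wrapper would then only forward the call:
//
// #[pyfunction]
// fn count_words(text: String) -> usize {
//     count_words_core(&text)
// }

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_count_words_sentence() {
        let text = "The quick brown fox jumps over the lazy dog";
        assert_eq!(count_words_core(text), 9);
    }

    #[test]
    fn test_count_words_empty() {
        assert_eq!(count_words_core(""), 0);
    }

    #[test]
    fn test_count_words_extra_whitespace() {
        assert_eq!(count_words_core("  a \t b\n"), 2);
    }
}
```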

Realistic example with mixed code

# Python file: main.py

# Use Rust for the slow parts (my_fast_rust_module is a hypothetical module built with maturin)
import my_fast_rust_module

# Use Python for the easy parts
import pandas as pd
import matplotlib.pyplot as plt

# Load data with pandas
df = pd.read_csv("data.csv")

# Process with Rust (fast!)
results = my_fast_rust_module.process_dataframe(df)

# Visualize with matplotlib (easy!)
plt.hist(results)
plt.show()

Summary: Testing + Python Integration

Testing:

Use #[test] and assert! macros
Run with cargo test
Test edge cases and expected errors
Write tests as you code!

Rust from Python:

Use PyO3 + maturin
#[pyfunction] for functions
#[pymodule] for modules
Build with maturin develop
Use when you need performance

Both are about making Rust practical for real projects!

Activity: Writing tests for Friday's activity

Go to our site: https://trgardos.github.io/ds210-fa25-private/b1/activities/activity_28.html for code and instructions

Lauren's DS210 Materials