Lecture 28 - Testing and Rust from Python

Logistics

  • HW4 corrections are due Wednesday night
  • Friday we'll have a stack/heap and hand-coding redo opportunity (NEW problems, no grade limit)
  • NEXT Tuesday (11/18) corrections during discussion

Today:

  • Two independent topics that are super useful
    • Part 1: Writing tests in Rust (2/3 of lecture, part of HW6)
    • Part 2: Calling Rust from Python (1/3 of lecture)

But first! Solutions to Friday's activity

Why Keep Things Private?

  1. Encapsulation: External code can't directly modify Person fields, preventing invalid states
  2. Validation: validate_age() ensures age is always valid when creating a Person
  3. Flexibility: Can change internal implementation without affecting code that uses these modules
  4. Clarity: The public API shows exactly what's intended to be used
  5. Safety: Prevents accidental misuse of helper functions

Learning Objectives

By the end of today, you should be able to:

  • Write unit tests in Rust using #[test] and assert! macros
  • Run tests with cargo test and understand test output (though you've already been doing this!)
  • Understand why testing matters for data science and systems programming
  • Know the basics of PyO3 and how to call Rust from Python
  • Understand when you might want to use Rust from Python

Why Write Tests?

The reality: Everyone's code has bugs.

Not writing tests is like not wearing a seatbelt because you think you'll never get in an accident.

What everyone does at first:

  1. Write code
  2. Run it
  3. See if it works
  4. Fix bugs
  5. Hope you didn't break something else

Better approach:

  1. Write code
  2. Write tests
  3. Run tests automatically
  4. Fix bugs
  5. Tests catch if you broke something else!

EVEN BETTER:

  1. Write tests that capture your desired behavior
  2. Write code until it passes the tests
  3. Improve and clean ("refactor") the code

(This is called test-driven development, or TDD - and it's effectively how the homeworks work!)

The simplest test

Tests in Rust are just functions with the #[test] attribute:

#[test]
fn it_works() {
    assert_eq!(2 + 2, 4);
}

Run with:

cargo test

Output:

running 1 test
test it_works ... ok

test result: ok. 1 passed; 0 failed

Writing Your First Real Test

Let's test a simple function:

// Our code
pub fn add(a: i32, b: i32) -> i32 {
    a + b
}

// Our test
#[test]
fn test_add() {
    assert_eq!(add(2, 3), 5);
    assert_eq!(add(-1, 1), 0);
    assert_eq!(add(0, 0), 0);
}

Three parts:

  1. #[test] - tells Rust this is a test
  2. Function name (usually starts with test_)
  3. Assertions (checking if things are true)

Assert Macros: Your Testing Tools

Rust provides several macros for testing:

assert!(condition)

Checks if something is true:

#[test]
fn test_positive() {
    let x = 5;
    assert!(x > 0);
}

assert_eq!(left, right)

Checks if two things are equal:

#[test]
fn test_length() {
    let v = vec![1, 2, 3];
    assert_eq!(v.len(), 3);
}

assert_ne!(left, right)

Checks if two things are NOT equal:

#[test]
fn test_different() {
    let a = String::from("hello");
    let b = String::from("world");
    assert_ne!(a, b);
}

All three panic if the assertion fails - that's how tests fail!

Adding Custom Error Messages

You can add custom messages to make test failures more helpful:

// calculate_score, parse_data, and data stand in for your own code;
// they are not defined in this snippet.
#[test]
fn test_score_in_range() {
    let score = calculate_score(&data);
    assert!(
        score >= 0.0 && score <= 100.0,
        "Score should be between 0 and 100, but got: {}",
        score
    );
}

#[test]
fn test_parse_result() {
    let result = parse_data("invalid");
    assert_eq!(
        result.len(),
        0,
        "Expected empty result for invalid input, got {} items",
        result.len()
    );
}

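When the test fails, you'll see your custom message in the output. assert_eq! and assert_ne! also print the left and right values; assert! prints only the message, so include any values you want to see in the message itself.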

This is especially helpful when debugging complex data or explaining why a test should pass.
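Here is a self-contained version of the same idea that you can paste into a file and run. It is only a sketch: percent_correct is a hypothetical function standing in for calculate_score.

```rust
// Hypothetical scoring function: percentage of correct answers.
pub fn percent_correct(correct: u32, total: u32) -> f64 {
    if total == 0 {
        return 0.0;
    }
    100.0 * f64::from(correct) / f64::from(total)
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_percent_in_range() {
        let score = percent_correct(7, 8);
        // The extra arguments work like format!: a template, then values.
        assert!(
            (0.0..=100.0).contains(&score),
            "Score should be between 0 and 100, but got: {}",
            score
        );
    }

    #[test]
    fn test_percent_known_value() {
        assert_eq!(
            percent_correct(1, 4),
            25.0,
            "1 of 4 correct should be exactly 25%"
        );
    }
}
```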

Testing in Modules: Where Tests Live

Convention: Tests (typically) live in a tests module at the bottom of your file:

pub fn is_even(n: i32) -> bool {
    n % 2 == 0
}

pub fn is_positive(n: i32) -> bool {
    n > 0
}

#[cfg(test)]
mod tests {
    use super::*;  // Import everything from parent module

    #[test]
    fn test_is_even() {
        assert!(is_even(2));
        assert!(is_even(0));
        assert!(!is_even(3));
    }

    #[test]
    fn test_is_positive() {
        assert!(is_positive(1));
        assert!(!is_positive(0));
        assert!(!is_positive(-1));
    }
}

Why #[cfg(test)]?

  • Only compiles tests when running cargo test
  • Keeps your final binary smaller
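One consequence worth seeing in code: because the tests module is a child of the module it tests, use super::* also brings private items into scope. Helpers can stay private and still be tested. This is a minimal sketch, assuming a hypothetical validate_age helper whose signature and 0 to 150 range are invented for illustration:

```rust
// Private helper: not `pub`, so code outside this module cannot call it.
// The signature and the 0..=150 range are assumptions for illustration.
fn validate_age(age: i32) -> bool {
    (0..=150).contains(&age)
}

// Public API built on top of the private helper.
pub fn describe_age(age: i32) -> String {
    if validate_age(age) {
        format!("{} years old", age)
    } else {
        String::from("invalid age")
    }
}

#[cfg(test)]
mod tests {
    use super::*; // Child modules can see the parent's private items

    #[test]
    fn test_validate_age_accepts_boundaries() {
        assert!(validate_age(0));
        assert!(validate_age(150));
    }

    #[test]
    fn test_validate_age_rejects_out_of_range() {
        assert!(!validate_age(-1));
        assert!(!validate_age(151));
    }

    #[test]
    fn test_describe_age_uses_validation() {
        assert_eq!(describe_age(20), "20 years old");
        assert_eq!(describe_age(-5), "invalid age");
    }
}
```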

What Makes a Good Test?

1. Test one thing at a time

Bad:

#[test]
fn test_everything() {
    assert_eq!(add(1, 2), 3);
    assert_eq!(multiply(2, 3), 6);
    assert_eq!(divide(10, 2), 5);
}

Good:

#[test]
fn test_add() {
    assert_eq!(add(1, 2), 3);
}

#[test]
fn test_multiply() {
    assert_eq!(multiply(2, 3), 6);
}

#[test]
fn test_divide() {
    assert_eq!(divide(10, 2), 5);
}

Why? If one fails, you know exactly which function broke!

2. Use #[should_panic] for expected errors

Sometimes functions should panic:

pub fn get_first(v: &Vec<i32>) -> i32 {
    v[0]  // Panics if empty
}

#[test]
#[should_panic]
fn test_empty_vec() {
    get_first(&vec![]);  // Should panic!
}

Or even better, specify the panic message:

#[test]
#[should_panic(expected = "index out of bounds")]
fn test_empty_vec() {
    get_first(&vec![]);
}
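Not every failure has to be a panic. When a function reports errors through Result, a test can check the Err case directly. A test function may also return Result itself, so that ? can be used inside it. This is a minimal sketch, assuming a hypothetical parse_percentage function:

```rust
// Hypothetical parser: accepts strings like "42" and rejects values above 100.
pub fn parse_percentage(input: &str) -> Result<u8, String> {
    let value = input
        .trim()
        .parse::<u8>()
        .map_err(|e| format!("not a valid number: {}", e))?;
    if value > 100 {
        return Err(format!("{} is larger than 100", value));
    }
    Ok(value)
}

#[cfg(test)]
mod tests {
    use super::*;

    // A test that returns Err(...) counts as failed, so `?` works here.
    #[test]
    fn test_parse_valid() -> Result<(), String> {
        let value = parse_percentage(" 42 ")?;
        assert_eq!(value, 42);
        Ok(())
    }

    // For expected errors, assert on the Err value instead of panicking.
    #[test]
    fn test_parse_rejects_text() {
        assert!(parse_percentage("abc").is_err());
    }

    #[test]
    fn test_parse_rejects_too_large() {
        assert_eq!(
            parse_percentage("101"),
            Err(String::from("101 is larger than 100"))
        );
    }
}
```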

3. Test edge cases

Don't just test the happy path!

pub fn average(numbers: &[i32]) -> f64 {
    // Guard first: float division by zero gives NaN instead of panicking,
    // so without this check an empty slice would silently return NaN.
    assert!(!numbers.is_empty(), "cannot average an empty slice");
    let sum: i32 = numbers.iter().sum();
    sum as f64 / numbers.len() as f64
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_average_normal() {
        assert_eq!(average(&[1, 2, 3]), 2.0);
    }

    #[test]
    fn test_average_single() {
        assert_eq!(average(&[42]), 42.0);
    }

    #[test]
    #[should_panic]  // We expect this to panic!
    fn test_average_empty() {
        average(&[]);  // Empty input trips the assert! in average
    }
}

Think about: What could go wrong? Test those cases!
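Panicking is not the only way to handle an edge case like empty input. Below is a minimal sketch of a non-panicking variant, assuming callers would rather handle "no data" themselves. average_opt is a hypothetical name, and the tests cover empty input, negative numbers, and values large enough to overflow an i32 sum.

```rust
// Hypothetical variant of `average` that reports "no data" as None
// instead of panicking or returning NaN.
pub fn average_opt(numbers: &[i32]) -> Option<f64> {
    if numbers.is_empty() {
        return None;
    }
    // Sum in i64 so that adding many large i32 values does not overflow.
    let sum: i64 = numbers.iter().map(|&n| i64::from(n)).sum();
    Some(sum as f64 / numbers.len() as f64)
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_average_opt_empty_is_none() {
        assert_eq!(average_opt(&[]), None);
    }

    #[test]
    fn test_average_opt_negative_numbers() {
        assert_eq!(average_opt(&[-2, 2, 3]), Some(1.0));
    }

    #[test]
    fn test_average_opt_large_values_do_not_overflow() {
        assert_eq!(average_opt(&[i32::MAX, i32::MAX]), Some(i32::MAX as f64));
    }
}
```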

More Testing Principles to Consider

4. Test behavior, not implementation

  • Focus on what your function does, not how it does it
  • If you refactor internal logic, tests shouldn't need to change

5. Don't test the language/library

  • Don't test that vec![] creates an empty vector
  • Test your logic, not Rust's or your imports' behavior

6. Keep tests independent

  • Each test should work on its own, in any order
  • Don't rely on one test running before another
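cargo test runs tests in parallel by default, so shared mutable state between tests is fragile. One way to keep tests independent is to have each test build its own data through a small helper. This is a minimal sketch; drop_lowest and sample_scores are hypothetical names:

```rust
// Hypothetical function under test: removes the smallest value, if any.
pub fn drop_lowest(scores: &mut Vec<i32>) {
    // Find the index of the minimum first, then mutate.
    let lowest_idx = scores
        .iter()
        .enumerate()
        .min_by_key(|&(_, &score)| score)
        .map(|(idx, _)| idx);
    if let Some(idx) = lowest_idx {
        scores.remove(idx);
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    // Each test calls this to get its own fresh copy of the data,
    // so no test can see changes made by another.
    fn sample_scores() -> Vec<i32> {
        vec![70, 55, 90]
    }

    #[test]
    fn test_drop_lowest_removes_minimum() {
        let mut scores = sample_scores();
        drop_lowest(&mut scores);
        assert_eq!(scores, vec![70, 90]);
    }

    #[test]
    fn test_drop_lowest_shrinks_by_one() {
        let mut scores = sample_scores();
        drop_lowest(&mut scores);
        assert_eq!(scores.len(), 2);
    }

    #[test]
    fn test_drop_lowest_on_empty_is_noop() {
        let mut scores: Vec<i32> = Vec::new();
        drop_lowest(&mut scores);
        assert!(scores.is_empty());
    }
}
```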

Exception: Integration Tests

Unit tests (what we've been writing): Test individual functions in isolation

Integration tests: Test how multiple parts work together

In Rust, integration tests go in a separate tests/ directory:

my_project/
  src/
    lib.rs
  tests/
    integration_test.rs  <- Integration tests here!

Example: Testing that multiple components work together:

// tests/integration_test.rs
use my_project::*;

#[test]
fn test_full_pipeline() {
    let data = load_data("test.csv");
    let cleaned = remove_outliers(&data, 2.0);
    let result = calculate_statistics(&cleaned);
    assert!(result.mean > 0.0);
}

Integration tests can depend on multiple functions working correctly - that's the point!

The car gap story.

For this class: Focus on unit tests. Integration tests are good to know about for larger projects.

Example: Testing Data Processing

Real-world example from data science - smoothing time series data:

pub fn moving_average(data: &[f64], window_size: usize) -> Vec<f64> {
    let mut result = Vec::new();
    for i in 0..data.len() {
        let start = if i < window_size { 0 } else { i - window_size + 1 };
        let window = &data[start..=i];
        let avg = window.iter().sum::<f64>() / window.len() as f64;
        result.push(avg);
    }
    result
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_moving_average_basic() {
        let data = vec![1.0, 2.0, 3.0, 4.0, 5.0];
        let result = moving_average(&data, 3);
        assert_eq!(result[2], 2.0);  // (1+2+3)/3 = 2.0
        assert_eq!(result[4], 4.0);  // (3+4+5)/3 = 4.0
    }

    #[test]
    fn test_moving_average_window_one() {
        let data = vec![1.0, 2.0, 3.0];
        let result = moving_average(&data, 1);
        assert_eq!(result, data);  // Window of 1 = original data
    }

    #[test]
    fn test_moving_average_small_data() {
        let data = vec![5.0];
        let result = moving_average(&data, 3);
        assert_eq!(result, vec![5.0]);  // Works with single element
    }

    #[test]
    fn test_moving_average_window_larger_than_data() {
        let data = vec![2.0, 4.0, 6.0];
        let result = moving_average(&data, 10);
        // Window larger than data: uses all available data
        assert_eq!(result[0], 2.0);        // 2/1 = 2.0
        assert_eq!(result[1], 3.0);        // (2+4)/2 = 3.0
        assert_eq!(result[2], 4.0);        // (2+4+6)/3 = 4.0
    }
}
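Beyond spot-checking values, it can help to test properties that should hold for any input. The sketch below repeats moving_average so it is self-contained, then adds three such checks. The choice of properties is an illustration, not part of the original test suite.

```rust
pub fn moving_average(data: &[f64], window_size: usize) -> Vec<f64> {
    let mut result = Vec::new();
    for i in 0..data.len() {
        let start = if i < window_size { 0 } else { i - window_size + 1 };
        let window = &data[start..=i];
        let avg = window.iter().sum::<f64>() / window.len() as f64;
        result.push(avg);
    }
    result
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_output_length_matches_input() {
        let data = vec![1.0, 2.0, 3.0, 4.0];
        for window_size in 1..=6 {
            let result = moving_average(&data, window_size);
            assert_eq!(result.len(), data.len(), "window_size = {}", window_size);
        }
    }

    #[test]
    fn test_empty_input_gives_empty_output() {
        let data: Vec<f64> = Vec::new();
        assert!(moving_average(&data, 3).is_empty());
    }

    #[test]
    fn test_constant_data_is_unchanged() {
        // Averaging identical values should return that same value.
        let data = vec![2.5; 5];
        assert_eq!(moving_average(&data, 3), data);
    }
}
```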

Testing Best Practices: Quick Summary

  1. Write tests BEFORE or AS you code - don't wait until the end
  2. Test edge cases - empty inputs, negative numbers, zero, etc.
  3. One assertion (or at least one concept) per test (when possible) - easier to debug
  4. Use descriptive names - test_average_with_negative_numbers
  5. Test behavior, not implementation - tests keep passing when you refactor

In data science:

  • Test your data cleaning functions
  • Test statistical calculations with known results
  • Test that your parsers handle bad data gracefully
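For statistical calculations, exact equality on floating-point results is often too strict, because rounding can differ in the last digits. A common workaround is to compare within a small tolerance. This is a minimal sketch, assuming hypothetical helpers mean and approx_eq and an arbitrarily chosen tolerance of 1e-9:

```rust
// Hypothetical statistic under test.
pub fn mean(data: &[f64]) -> f64 {
    data.iter().sum::<f64>() / data.len() as f64
}

// True if `a` and `b` differ by less than `tol`.
pub fn approx_eq(a: f64, b: f64, tol: f64) -> bool {
    (a - b).abs() < tol
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_mean_known_result() {
        // 0.1 + 0.2 + 0.3 is not exactly 0.6 in floating point,
        // so compare against the known answer 0.2 with a tolerance.
        let result = mean(&[0.1, 0.2, 0.3]);
        assert!(
            approx_eq(result, 0.2, 1e-9),
            "expected about 0.2, got {}",
            result
        );
    }

    #[test]
    fn test_approx_eq_rejects_real_differences() {
        assert!(!approx_eq(1.0, 1.1, 1e-9));
    }
}
```

The right tolerance depends on the scale of your data; 1e-9 here is only a placeholder.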

Calling Rust from Python

Python is great for:

  • Data exploration
  • Quick prototyping
  • Rich ecosystem (pandas, scikit-learn, pytorch, etc.)
  • Easy to write

Rust is great for:

  • Performance (often much faster than pure Python for CPU-bound code; the exact speedup depends on the workload)
  • Memory safety
  • Parallel processing

Best of both worlds: Write the performance-critical parts in Rust, call them from Python!

Use Cases for Rust + Python

Real examples:

  • Polars: Fast DataFrame library (like pandas, but written in Rust)
  • Cryptography: Python's cryptography library has Rust internals
  • Tokenizers: Hugging Face uses Rust for fast NLP tokenization

You might use it for:

  • Processing large datasets
  • Heavy numerical computations
  • Performance-critical parts of your pipeline

PyO3: The Bridge Between Rust and Python

PyO3 is a Rust library that lets you:

  • Call Rust from Python
  • Call Python from Rust
  • Create Python modules in Rust

Installation:

cargo add pyo3 --features extension-module

Exporting your Rust function to Python:

Rust code (src/lib.rs):

use pyo3::prelude::*;

#[pyfunction]
fn add(a: i32, b: i32) -> i32 {
    a + b
}

#[pymodule]
fn my_rust_module(m: &Bound<'_, PyModule>) -> PyResult<()> {
    m.add_function(wrap_pyfunction!(add, m)?)?;
    Ok(())
}

Building and using: the maturin tool

maturin is a tool for building Python packages in Rust.

Step 1: Create a new Rust-Python project

pip install maturin
maturin new my_project
cd my_project

This creates:

my_project/
  Cargo.toml          <- Rust configuration
  pyproject.toml      <- Python configuration
  src/
    lib.rs            <- Your Rust code goes here!

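Step 2: Write your Rust code in src/lib.rs. The name of the #[pymodule] function is the name Python imports, so for this project it must be my_project (rename my_rust_module from the earlier example accordingly).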

Step 3: Build and install into Python

maturin develop

This compiles the Rust code and installs it as a Python module in your current Python environment.

Step 4: Use it in Python (from any directory, as long as the same Python environment is active)!

import my_project
result = my_project.add(2, 3)
print(result)  # 5

How it works: maturin develop compiles src/lib.rs into a native extension module (a shared library that Python can load), then installs it where Python can find it (the active environment's site-packages).

A Practical Example: Fast String Processing

Rust side (src/lib.rs):

use pyo3::prelude::*;

// Taking &str borrows the Python string's text instead of copying it into a new String.
#[pyfunction]
fn count_words(text: &str) -> usize {
    text.split_whitespace().count()
}

#[pymodule]
fn string_tools(m: &Bound<'_, PyModule>) -> PyResult<()> {
    m.add_function(wrap_pyfunction!(count_words, m)?)?;
    Ok(())
}

Python side:

import string_tools

text = "The quick brown fox jumps over the lazy dog"
count = string_tools.count_words(text)
print(f"Word count: {count}")  # Word count: 9

Simple, but imagine this with millions of strings!
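One way to act on that is to cross the Python-Rust boundary once with a whole batch, not once per string. Keeping the core logic in a plain Rust function also lets cargo test cover it without Python. This is a minimal sketch: count_words_batch is a hypothetical name, and the PyO3 wrapper is shown only as a comment because it needs the pyo3 crate to compile.

```rust
// Plain Rust core: no PyO3 types, so it can be unit-tested with `cargo test`.
pub fn count_words_batch(texts: &[String]) -> Vec<usize> {
    texts
        .iter()
        .map(|text| text.split_whitespace().count())
        .collect()
}

// Assumed thin wrapper for Python (requires the pyo3 crate):
//
// #[pyfunction]
// fn count_words_batch_py(texts: Vec<String>) -> Vec<usize> {
//     count_words_batch(&texts)
// }

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_count_words_batch() {
        let texts = vec![
            String::from("The quick brown fox"),
            String::from(""),
            String::from("  spaced   out  "),
        ];
        assert_eq!(count_words_batch(&texts), vec![4, 0, 2]);
    }
}
```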

Realistic example with mixed code

# Python file: main.py

# Use Rust for the slow parts
import my_fast_rust_module

# Use Python for the easy parts
import pandas as pd
import matplotlib.pyplot as plt

# Load data with pandas
df = pd.read_csv("data.csv")

# Process with Rust (fast!)
# my_fast_rust_module and process_dataframe are placeholders for your own code
results = my_fast_rust_module.process_dataframe(df)

# Visualize with matplotlib (easy!)
plt.hist(results)
plt.show()

Summary: Testing + Python Integration

Testing:

  • Use #[test] and assert! macros
  • Run with cargo test
  • Test edge cases and expected errors
  • Write tests as you code!

Rust from Python:

  • Use PyO3 + maturin
  • #[pyfunction] for functions
  • #[pymodule] for modules
  • Build with maturin develop
  • Use when you need performance

Both are about making Rust practical for real projects!

Activity: Writing tests for Friday's activity

Go to our site: https://trgardos.github.io/ds210-fa25-private/b1/activities/activity_28.html for code and instructions