Testing

Behind every casino floor sits a small room with bulletproof glass called the cage. The dealers shove pots, the players cash chips, the pit bosses watch the felt — and then every count, every payout, every claim of a winning hand walks back to the cage where a second person verifies the number before any money moves. The cage is unglamorous. Nobody who plays at the table ever sees it. But take the cage out and the floor starts losing chips inside a week. Tests in Rust are that cage. They sit behind the code, they check the answers your functions hand back, and they refuse to pass the work along if the numbers do not match.

[Figure: the test pyramid, with many small unit tests at the base, fewer integration tests above, and a handful of end-to-end tests at the peak.]

Kent Beck wrote SUnit, an early and widely copied unit test framework for Smalltalk, and in 1997 carried the design over to Java as JUnit with Erich Gamma. The idea was older, since researchers had been writing test harnesses since the 1960s, but JUnit packaged it in a way every working programmer could pick up in an afternoon. The Rust team built the same pattern straight into the language and into Cargo. When you write #[test] above a function, the compiler includes that function only when it builds the test binary that cargo test runs. Your ordinary builds, release builds included, never carry the test code. The cage is real, but it does not slow down the floor.

The functions we want to test come from the poker work in the last lesson — a function that picks the higher of two hand ranks and a function that splits a pot into equal shares with any leftover chip going to the dealer's left.

/// Returns the higher of two poker hand ranks.
///
/// Hand ranks are scored as integers — pair=2, two-pair=3, three-of-a-kind=4,
/// straight=5, flush=6, full house=7, four=8, straight flush=9, royal=10.
/// Ties resolve to the left hand.
fn winner(left: u32, right: u32) -> u32 {
    if left >= right { left } else { right }
}

/// Splits a pot into equal whole-chip shares for `players` players.
/// Returns `(share, remainder)`; the remainder chips are the ones that go
/// to the first player (the dealer's left). Zero players yields `(0, 0)`.
fn split_pot(pot: u32, players: u32) -> (u32, u32) {
    if players == 0 {
        return (0, 0);
    }
    let share = pot / players;
    let remainder = pot % players;
    (share, remainder)
}

Two small functions. The kind of thing that looks obvious until somebody asks the question Beck always asked his students: how do you know it works? Staring at the code is not knowing. Running it once with one input is not knowing. Knowing is checking it against a list of cases, including the cases that look weird (zero players, ties, a pot of one chip), and writing those checks down in code so the next person who edits the function reruns every check by typing one short command, cargo test, at the terminal.

To show what a test is before we hand the work to cargo test, here is a six-line test framework written by hand. A function called run_test takes a name and a closure. The closure does the checking and hands back Ok(()) if everything matched, or Err(reason) if it did not.

fn run_test(name: &str, body: impl Fn() -> Result<(), String>) {
    match body() {
        Ok(()) => println!("PASS: {name}"),
        Err(reason) => println!("FAIL: {name} -- {reason}"),
    }
}

That is the whole runner. Beck's JUnit is far bigger and does many things this does not (parallelism, fixtures, parameterized tests), but the core idea is the few lines above. A function under test, a function that asserts, a runner that catches the result and labels it.

The checks themselves use a tiny helper called assert_eq_int that compares two numbers and either returns Ok(()) or builds an error string. The five tests cover the cases worth thinking about — the high hand wins, ties go left, the pot splits cleanly, the remainder lands on the dealer's left, and zero players returns zero instead of crashing on the divide.

fn assert_eq_int(actual: u32, expected: u32) -> Result<(), String> {
    if actual == expected {
        Ok(())
    } else {
        Err(format!("expected {expected}, got {actual}"))
    }
}

fn check_winner_picks_higher() -> Result<(), String> {
    assert_eq_int(winner(9, 6), 9)
}

fn check_winner_ties_go_left() -> Result<(), String> {
    assert_eq_int(winner(7, 7), 7)
}

fn check_pot_splits_evenly() -> Result<(), String> {
    let (share, remainder) = split_pot(60, 3);
    assert_eq_int(share, 20)?;
    assert_eq_int(remainder, 0)
}

fn check_pot_remainder_goes_to_dealer() -> Result<(), String> {
    let (share, remainder) = split_pot(61, 3);
    assert_eq_int(share, 20)?;
    assert_eq_int(remainder, 1)
}

fn check_zero_players_returns_zero() -> Result<(), String> {
    let (share, remainder) = split_pot(50, 0);
    assert_eq_int(share, 0)?;
    assert_eq_int(remainder, 0)
}
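The lesson does not show how these checks reach the runner. As a minimal sketch, assuming a hypothetical CHECKS table and run_all helper, and a variant of run_test that also returns whether the check passed (none of these are in the lesson's code), the wiring could be a list of labels paired with function pointers:

```rust
/// Same logic as the lesson's `winner`.
fn winner(left: u32, right: u32) -> u32 {
    if left >= right { left } else { right }
}

/// Same logic as the lesson's `assert_eq_int`.
fn assert_eq_int(actual: u32, expected: u32) -> Result<(), String> {
    if actual == expected {
        Ok(())
    } else {
        Err(format!("expected {expected}, got {actual}"))
    }
}

fn check_winner_picks_higher() -> Result<(), String> {
    assert_eq_int(winner(9, 6), 9)
}

fn check_winner_ties_go_left() -> Result<(), String> {
    assert_eq_int(winner(7, 7), 7)
}

/// Assumption: a variant of `run_test` that prints the verdict and also
/// returns `true` on a pass, so the caller can keep a tally.
fn run_test(name: &str, body: impl Fn() -> Result<(), String>) -> bool {
    match body() {
        Ok(()) => {
            println!("PASS: {name}");
            true
        }
        Err(reason) => {
            println!("FAIL: {name} -- {reason}");
            false
        }
    }
}

/// A check is any plain function that returns Ok(()) or an error string.
type Check = fn() -> Result<(), String>;

/// Hypothetical table of (label, check) pairs; only two are listed here.
const CHECKS: &[(&str, Check)] = &[
    ("winner_picks_higher_rank", check_winner_picks_higher),
    ("winner_ties_go_left", check_winner_ties_go_left),
];

/// Hypothetical helper: runs every pair in order, returns how many passed.
fn run_all(checks: &[(&str, Check)]) -> usize {
    let mut passed = 0;
    for &(name, check) in checks {
        if run_test(name, check) {
            passed += 1;
        }
    }
    passed
}
```

Calling run_all(CHECKS) from main prints one verdict per entry, in table order, and returns the pass count. Keeping the label separate from the function name is a design choice: the label is what shows up in the report, so it can be worded for the reader.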

Run the program and the cage clerk reads each verdict aloud.

running mini test suite for the poker helpers:

PASS: winner_picks_higher_rank
PASS: winner_ties_go_left
PASS: pot_splits_evenly
PASS: pot_remainder_goes_to_dealer
PASS: zero_players_returns_zero

5 tests run -- this is what cargo test does for you, automatically.

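Five PASS lines, in the order the runner called them. If the winner function ever drifted, say somebody flipped >= to <= and it started returning the lower rank, the first line would flip to FAIL and tell you exactly which case stopped agreeing. One caution: changing >= to > would not trip the ties check, because winner returns only the rank and in a tie both ranks are equal. That check pins the returned value, not which side supplied it. The signal from a failing line is the entire point. The faster the feedback, the smaller the bug that reaches the floor.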
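Because winner returns only the rank, a test that wants to pin down the tie rule itself needs a return value that names the side. The sketch below assumes a hypothetical Side enum and winner_side function, neither of which is part of the lesson's code, to show one way that could look:

```rust
/// Hypothetical: which seat holds the winning hand.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Side {
    Left,
    Right,
}

/// Hypothetical variant of `winner` that reports the side, not the rank.
/// Ties resolve to the left hand, matching the lesson's rule.
fn winner_side(left: u32, right: u32) -> Side {
    if left >= right { Side::Left } else { Side::Right }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn ties_go_left() {
        // This assertion fails if `>=` is ever changed to `>`.
        assert_eq!(winner_side(7, 7), Side::Left);
    }

    #[test]
    fn higher_rank_wins_from_either_seat() {
        assert_eq!(winner_side(9, 6), Side::Left);
        assert_eq!(winner_side(6, 9), Side::Right);
    }
}
```

With the side in the return type, the tie case has an observable answer, so a change to the comparison operator shows up as a failing test.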

The hand-built runner is for showing the shape. Real Rust code does not write its own. It uses #[test] and lets Cargo do the labeling. Two of the same checks, written the way you would actually write them in a Rust file, look like this, with a third test added to show #[should_panic]. Note the #[cfg(any())] attribute: any() with no conditions is always false, so the compiler ignores this whole block in our lesson binary and the snapshot stays small. In a real crate you would write #[cfg(test)] in its place, so the module compiles only under cargo test, and let Cargo find the #[test] functions on its own.

// What cargo test actually compiles when you write real Rust tests.
// Gated with #[cfg(any())] so the block never compiles inside this lesson
// binary -- cargo test for THIS crate already runs tests/output.rs against
// the snapshot. The reader sees what real test code looks like here.
#[cfg(any())]
mod tests {
    use super::*;

    #[test]
    fn winner_picks_higher_rank() {
        assert_eq!(winner(9, 6), 9);
    }

    #[test]
    fn pot_splits_evenly() {
        assert_eq!(split_pot(60, 3), (20, 0));
    }

    #[test]
    #[should_panic]
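    fn divide_by_zero_panics() {
        // black_box hides the zero from the compiler. A literal
        // `1u32 / 0u32` is rejected at compile time by the
        // deny-by-default `unconditional_panic` lint.
        let divisor = std::hint::black_box(0u32);
        let _x = 1u32 / divisor;
    }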
}

Three macros do all the heavy lifting. assert_eq! panics if the two arguments differ and prints both values so you see what the function returned versus what you expected. assert_ne! is the opposite. assert! panics if the boolean is false. The #[should_panic] tag on the last test inverts the meaning — the test passes only if the body panics. That is how you write a test for a function that is supposed to fail loudly.
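A short sketch of the macros side by side may help. It assumes a hypothetical split_pot_strict function that panics on zero players instead of returning (0, 0); that function is an illustration only and is not part of the lesson's code.

```rust
/// Same logic as the lesson's `winner`.
fn winner(left: u32, right: u32) -> u32 {
    if left >= right { left } else { right }
}

/// Hypothetical strict variant of `split_pot`: panics on zero players.
fn split_pot_strict(pot: u32, players: u32) -> (u32, u32) {
    assert!(players != 0, "players must be non-zero");
    (pot / players, pot % players)
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn remainder_is_smaller_than_player_count() {
        let (_, remainder) = split_pot_strict(61, 3);
        // assert! takes a boolean and an optional format-style message.
        assert!(remainder < 3, "remainder {remainder} should be below 3");
    }

    #[test]
    fn winner_never_returns_the_lower_rank() {
        // assert_ne! panics if the two values are equal.
        assert_ne!(winner(9, 6), 6);
    }

    #[test]
    #[should_panic(expected = "players must be non-zero")]
    fn zero_players_panics() {
        // `expected` requires the panic message to contain this text,
        // so an unrelated panic does not count as a pass.
        let _ = split_pot_strict(50, 0);
    }
}
```

The expected argument on #[should_panic] is optional, but without it any panic at all passes the test, including one caused by a bug somewhere else in the body.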

The standard cargo test runner does four things our hand-built one does not. It compiles the #[test] functions of each target into a dedicated test binary. It runs the tests in parallel across threads by default, which is why they should not share state. It catches panics so a failing test does not stop the others. And it prints a color-coded summary at the end with pass and fail counts. Run cargo test in any crate that has tests and you see that output without writing a single line of harness code.
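To make the panic-catching point concrete, here is a minimal sketch of a hand-built runner that survives a panicking check. The name run_guarded is hypothetical, and this is not how Cargo's harness is implemented; it only illustrates the idea using std::panic::catch_unwind from the standard library.

```rust
use std::panic::{self, AssertUnwindSafe};

/// Hypothetical panic-aware runner: returns the verdict line as a String.
fn run_guarded(name: &str, body: impl FnOnce()) -> String {
    // catch_unwind turns a panic inside `body` into an Err value, so the
    // caller keeps running. AssertUnwindSafe tells the compiler we accept
    // that state captured by the closure may be left half-updated.
    match panic::catch_unwind(AssertUnwindSafe(body)) {
        Ok(()) => format!("PASS: {name}"),
        Err(_) => format!("FAIL: {name} -- panicked"),
    }
}
```

A check such as run_guarded("bad_math", || assert_eq!(1 + 1, 3)) comes back as a FAIL line instead of ending the program. The default panic message still goes to stderr, and the approach relies on panics unwinding, which is the default; it does not work in a build configured with panic = "abort".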

Three flavors of test live in a Rust project, and the difference matters because they catch different kinds of bugs. A unit test sits inside the file with the function it tests, usually under a mod tests block at the bottom, and it has access to private items, meaning the helpers that are not exported. An integration test lives in a separate tests/ folder at the root of the crate, gets its own binary, and can only call the crate's public surface, the same way an outside user would call it. That separation forces you to design a clean public API, because if your integration test cannot reach a function it needs, your real users cannot either. A doc-test is the third flavor and the most quietly clever: any fenced code block inside a /// doc comment is also a test. Drop a triple-backtick example into the comment on a public function such as winner in a library crate, and cargo test compiles and runs that example alongside the unit tests, so the documentation cannot drift from how the function works without a test failing.
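As a sketch of the doc-test flavor, assuming winner is made public in a library crate with the placeholder name poker (the lesson's own copy is a private function in a binary, and cargo test runs doc-tests only for library targets):

```rust
/// Returns the higher of two poker hand ranks.
///
/// # Examples
///
/// ~~~
/// // `poker` is a placeholder crate name (assumption).
/// assert_eq!(poker::winner(9, 6), 9);
/// assert_eq!(poker::winner(7, 7), 7);
/// ~~~
pub fn winner(left: u32, right: u32) -> u32 {
    if left >= right { left } else { right }
}
```

The example is fenced with ~~~, which rustdoc treats the same way as triple backticks. The doc-test is compiled as if it were an outside user of the crate, which is why it calls the function through the crate path and why the function has to be public.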

[Figure: an assertion as a turnstile. The value either matches and passes, or differs and gets stopped at the gate.]

The standard library's three macros handle the everyday cases. Two crates from the ecosystem solve harder ones. proptest runs your function against thousands of randomly generated inputs and tries to find the smallest input that breaks it — instead of writing one test for split_pot(60, 3), you tell proptest that for any pot and any non-zero player count, the share times the player count plus the remainder must equal the pot. The tool then hunts for a counterexample. insta is the snapshot tester that this very lesson uses behind the scenes — the file tests/output.rs runs our binary, captures the stdout, and compares it byte-for-byte against the saved snapshot. When you change the code and the output changes, insta prints a diff and waits for you to decide whether the new output is the right output. The cage and the camera, side by side.
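The split_pot property can be written down without any crate. The sketch below is not proptest: it walks a small range exhaustively instead of generating random inputs and shrinking failures, and the names shares_add_up and find_counterexample are hypothetical. It only shows what stating the property as code looks like.

```rust
/// Same logic as the lesson's `split_pot`.
fn split_pot(pot: u32, players: u32) -> (u32, u32) {
    if players == 0 {
        return (0, 0);
    }
    (pot / players, pot % players)
}

/// The property from the text, plus a bound on the remainder:
/// no chips are created or lost, and the leftover is less than one
/// chip per player.
fn shares_add_up(pot: u32, players: u32) -> bool {
    let (share, remainder) = split_pot(pot, players);
    share * players + remainder == pot && remainder < players
}

/// Hypothetical search helper: returns the first (pot, players) pair
/// that violates the property, or None if every pair in range holds.
/// Player counts start at 1 because the property excludes zero.
fn find_counterexample(max_pot: u32, max_players: u32) -> Option<(u32, u32)> {
    for pot in 0..=max_pot {
        for players in 1..=max_players {
            if !shares_add_up(pot, players) {
                return Some((pot, players));
            }
        }
    }
    None
}
```

A unit test can then assert that find_counterexample(500, 20) is None. The zero-player case fails the property, since split_pot(50, 0) returns (0, 0) and the chips do not add back up to 50, which is why the text restricts it to non-zero player counts.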

A question worth asking — why does the snapshot for this lesson live in a separate tests/output.rs file instead of inside main.rs? Because the main binary is what the reader copies and types into their own editor. Mixing test scaffolding into the binary would force every reader to learn two things at once — the concept and the harness. The harness lives next door, the binary stays clean, and the cage still does its job.

Tests catch the bugs you thought of. They do not catch the bugs you did not think of, and the worst production failures almost always come from a case nobody put on the list. That is what the next lesson solves — design patterns that make the unthought-of case impossible to construct in the first place, so the cage has less work to do.