Terminal Chatbot with an LLM

Picture the front desk of an old hotel. A guest walks up and says something. The clerk does not answer from memory of that one sentence. He flips open the green logbook on the counter, reads every line of every exchange the desk has had with that guest today, and only then writes the reply. Then he writes the guest's new sentence underneath, writes his own reply underneath that, and closes the book until the guest speaks again. A chatbot is exactly that desk. Every reply the model gives is a function of the entire logbook, not just the last thing said, which is why a chat that started 40 turns ago can still answer "what did I order first?" without missing a beat.

A hotel front-desk logbook recording an exchange between guest and clerk.

The shape comes from Alan Turing's 1950 test — a teleprinter, two rooms, alternating lines — but the modern logbook idea arrived in November 2022 when OpenAI shipped ChatGPT. The interface was not new. The trick under it was that every API call sent the full conversation back to the model, every time, with each turn tagged as either the user's words or the assistant's. The model never remembered anything between requests. The transcript itself was the memory. Anthropic shipped Claude with the same shape three months later. Every chatbot on the market today is the same loop — append the user's turn to the logbook, hand the whole book to a function that returns the next line, append that line too, repeat.

Start with the type the logbook is made of.

#[derive(Copy, Clone, PartialEq, Eq)]
enum Role {
    System,
    User,
    Assistant,
}

struct Message {
    role: Role,
    content: String,
}

fn role_label(r: Role) -> &'static str {
    match r {
        Role::System => "System",
        Role::User => "User",
        Role::Assistant => "Assistant",
    }
}

fn message(role: Role, content: &str) -> Message {
    Message {
        role,
        content: content.to_string(),
    }
}

A Message is a role plus the words. The role is one of three things, which is exactly the case enum was built for. System is the standing instructions written at the top of the book — "you are a helpful assistant, keep answers short, never give legal advice" — the rules the clerk has to follow no matter who walks in. User is the guest. Assistant is the clerk himself. Three variants, no possibility of a fourth, and the compiler will refuse a match that forgets one of them. A junior version of this design would use a string field called kind with values "system", "user", "assistant", and the day someone typos "assitant" the bug ships silently because strings agree to anything.

The next knob is the function that turns a logbook into the next reply. This is the only piece that changes when you swap a real language model in for a fake one. Everything else above and below it stays the same.

fn chat(history: &[Message], next_user: &str) -> Message {
    let turns = history.iter().filter(|m| m.role == Role::User).count();
    let lower = next_user.to_lowercase();
    let reply = if lower.contains("hi") || lower.contains("hello") {
        "hello! what can i help you with?".to_string()
    } else if lower.contains("weather") {
        "sunny.".to_string()
    } else if lower.contains("name") {
        "i am a tiny mock model. no name yet.".to_string()
    } else if lower.contains("bye") {
        format!("goodbye. {} turns logged.", turns + 1)
    } else {
        "noted.".to_string()
    };
    message(Role::Assistant, &reply)
}

The signature is the contract — chat takes a slice of every message so far and the new thing the user just said, and hands back exactly one Message tagged Assistant. The body in this lesson is a toy switch on keywords because we are stuck inside a snapshot test that has to produce the same bytes on every machine. A real backend would do the same call shape with very different bodies behind it. The shape is what travels.

Now the loop that drives the desk.

fn run_conversation() {
    let user_turns = [
        "hi there",
        "what is your name",
        "how is the weather today",
        "okay bye",
    ];
    let mut history: Vec<Message> = Vec::new();
    history.push(message(Role::System, "you are a helpful assistant."));
    for turn in user_turns.iter() {
        let assistant = chat(&history, turn);
        history.push(message(Role::User, turn));
        history.push(assistant);
    }
    println!("--- transcript ---");
    for m in &history {
        println!("{}: {}", role_label(m.role), m.content);
    }
    println!();
    println!("turns in history: {}", history.len());
}

Read the order carefully. We push the system message first because the standing rules belong at the top of the book and the model reads them before anything else. Then for each turn the guest speaks, we hand the current logbook plus the new line to chat, get back the clerk's reply, and only then write both lines into the book. Calling chat before writing the guest's line keeps the logbook honest — the function takes the words as an argument and the page only flips once we hold the answer in hand. After 4 turns the loop ends and we print the whole book top to bottom so the reader sees every line the desk wrote down.

--- transcript ---
System: you are a helpful assistant.
User: hi there
Assistant: hello! what can i help you with?
User: what is your name
Assistant: i am a tiny mock model. no name yet.
User: how is the weather today
Assistant: sunny.
User: okay bye
Assistant: goodbye. 4 turns logged.

turns in history: 9

The transcript reads cleanly. The system message at the top sets the rules. The user says hi and the assistant greets back. The user asks a name and the assistant answers from the keyword switch. Weather flips to "sunny." Goodbye includes a turn count built from filtering the history for user messages and adding one. The final line — "turns in history: 9" — is the proof that the logbook grew with every exchange. One system + four user turns + four assistant replies = 9 messages. Nothing was forgotten.

One question worth asking — why does chat take history: &[Message] by borrowed slice instead of taking ownership of a Vec<Message>? Because the caller is going to keep using that history after the call returns. If chat took ownership the loop above would have to clone the entire transcript on every turn, which on a 40-turn chat with a long system prompt is real wasted work. A borrowed slice says "let me read this for a moment, I will not change it, I will not keep it". The compiler enforces the promise and the caller spends no allocations.

The chat loop: history plus the new user line goes into chat(), the returned message is appended to history.

The piece this lesson cannot honestly show is the actual model call. A real chatbot's chat function builds a JSON body from the message list, sends it over HTTPS to Anthropic or OpenAI, and parses the streamed reply. Doing that here would break the snapshot — the network would change the bytes on every run, the API key would change per machine, and the test would flake. Look at the shape it would have, gated so the compiler skips it inside this lesson.

// What a real backend call looks like. Gated with #[cfg(any())] so it never
// compiles in this lesson -- a real call would block on the network, return
// different bytes on every run, and break the snapshot. Swap chat() for
// chat_remote() and the loop above keeps working unchanged.
#[cfg(any())]
mod remote {
    use super::{message, Message, Role};

    fn chat_remote(history: &[Message], next_user: &str) -> Message {
        let mut body = String::from("{\"model\":\"claude-opus\",\"messages\":[");
        for m in history {
            let role = match m.role {
                Role::System => "system",
                Role::User => "user",
                Role::Assistant => "assistant",
            };
            body.push_str(&format!(
                "{{\"role\":\"{}\",\"content\":\"{}\"}},",
                role, m.content
            ));
        }
        body.push_str(&format!(
            "{{\"role\":\"user\",\"content\":\"{}\"}}]}}",
            next_user
        ));
        let resp = reqwest::blocking::Client::new() // allow:network teaching a real backend
            .post("https://api.anthropic.com/v1/messages")
            .header("x-api-key", std::env::var("ANTHROPIC_API_KEY").unwrap()) // allow:env teaching a real backend
            .body(body)
            .send()
            .unwrap()
            .text()
            .unwrap();
        message(Role::Assistant, &resp)
    }
}

The body of chat_remote does what the mock did, in three steps. Walk the history, build the JSON messages array with the role names the API expects. Append the new user turn to the end. Send it with the API key from the environment, read the response, wrap it in a Message tagged Assistant. The loop above does not know or care which chat it is calling. Swap one for the other and the desk keeps running. That swap — pure-function-by-pure-function, with the contract sitting in the type signature — is what makes Rust comfortable for production agent code in a way that a script tangled with the network never is.

Side-by-side comparison of the mock keyword backend and the real HTTPS call to an LLM API.

The thing this design cannot do yet is stream the reply word by word the way ChatGPT does it on screen. To get that, the return type has to change from a single Message to something that yields chunks over time, which is what async streams and the broader async ecosystem give you, and what the next lesson on SDKs picks up.