AWS Lambda in Rust

A coat-check counter at a club only does work when somebody hands over a coat. No customers, no attendant on the clock. One coat walks up, the attendant takes it, hangs it on a hook, hands back a numbered ticket, then stands still until the next coat arrives. The club owner pays per coat, not per hour. AWS Lambda is that counter. You write the attendant — one function that takes one event and returns one response — and Amazon spins up a tiny machine to run it the instant a coat shows up, then bills you in milliseconds for the time the attendant was actually working.

One event enters the handler, one response leaves -- the only contract a Lambda function honors.

Amazon launched the service in November 2014. Tim Wagner ran the team. The bottleneck they were solving had been visible inside Amazon for years: EC2 at the time charged by the hour even if your server sat idle for fifty-nine of those minutes, and a real service that handles a few requests a day still has to keep a box warm around the clock. Wagner's group built a runtime that loaded your function on demand and billed in 100-millisecond slices. Amazon later moved function isolation onto Firecracker, a thin microVM monitor it wrote in Rust and open-sourced in 2018. In December 2020 the billing slice shrank to one millisecond. A function that runs for 80 ms is billed for 80 ms, with no rounding up to the nearest hour and no idle time on the bill.

The shape of the function is the only contract you have to honor. Three things define it — what an event looks like coming in, what a response looks like going out, and the function that turns the first into the second.

struct Event {
    source: &'static str,
    detail: &'static str,
}

struct Response {
    status: u16,
    body: String,
}

An Event is a record. source says where the coat came from — the upstairs cloakroom, the back door, the VIP line — and detail carries whatever facts the source thought were worth attaching. A Response is the claim ticket. status is the HTTP-style number the caller checks first, and body is the message the caller actually reads. Both types are plain structs with no behavior of their own. The behavior lives in one function.

fn handler(event: Event) -> Response {
    match event.source {
        "aws:s3" => Response {
            status: 200,
            body: format!("queued thumbnail job for {}", event.detail),
        },
        "aws:apigateway" => Response {
            status: 200,
            body: format!("HTTP 200 -- echoed {}", event.detail),
        },
        "aws:events" => Response {
            status: 204,
            body: format!("ran scheduled task at tick {}", event.detail),
        },
        other => Response {
            status: 400,
            body: format!("unknown source: {other}"),
        },
    }
}

Read handler as the attendant's whole job. The match on event.source is the attendant looking at where the coat came from and deciding which hook to use. An S3 upload event lands the file on a hook and starts a thumbnail job. An API Gateway HTTP event reads the request and ships back a 200. A scheduled timer event from EventBridge fires once on the cron, runs whatever cleanup task the function holds, and returns a 204, the HTTP code for "no content", because no caller is waiting to read a reply; the body string in that arm exists only so the demo has something to print. The fourth arm catches any source the function is wired to later (an SQS queue, a Kinesis stream, an IoT message) and rejects it loudly with a 400 so the function never silently swallows a coat it does not know how to hang.

Different AWS services fan into the same handler -- S3, API Gateway, EventBridge, SQS, Kinesis, and IoT each deliver their own JSON event, modeled here as one simplified Event shape.
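
The fallback arm is easiest to trust once it has been exercised. The following is a minimal sketch, not part of the lesson's demo: it repeats the Event, Response, and handler definitions so it compiles alone, and feeds in "aws:sqs", a source string assumed here purely for illustration, along with a made-up detail value.

```rust
// Event, Response, and handler are repeated from the lesson so this block
// compiles on its own.
struct Event {
    source: &'static str,
    detail: &'static str,
}

struct Response {
    status: u16,
    body: String,
}

fn handler(event: Event) -> Response {
    match event.source {
        "aws:s3" => Response {
            status: 200,
            body: format!("queued thumbnail job for {}", event.detail),
        },
        "aws:apigateway" => Response {
            status: 200,
            body: format!("HTTP 200 -- echoed {}", event.detail),
        },
        "aws:events" => Response {
            status: 204,
            body: format!("ran scheduled task at tick {}", event.detail),
        },
        other => Response {
            status: 400,
            body: format!("unknown source: {other}"),
        },
    }
}

// "aws:sqs" and "order-1234" are hypothetical values chosen for illustration;
// the handler has no arm for them, so the catch-all arm answers.
fn unknown_source_demo() -> Response {
    handler(Event {
        source: "aws:sqs",
        detail: "order-1234",
    })
}
```

Calling unknown_source_demo() yields status 400 with a body that names the offending source, which is the loud rejection the fourth arm is there to provide.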

Drive three real events through the same handler and watch the responses come back.

fn demo() {
    let events = [
        Event {
            source: "aws:s3",
            detail: "uploads/cat.jpg",
        },
        Event {
            source: "aws:apigateway",
            detail: "GET /hello",
        },
        Event {
            source: "aws:events",
            detail: "0900Z",
        },
    ];

    for (i, event) in events.iter().enumerate() {
        let invocation = i + 1;
        println!("--- invocation {invocation} ---");
        println!("event.source: {}", event.source);
        println!("event.detail: {}", event.detail);
        let response = handler(Event {
            source: event.source,
            detail: event.detail,
        });
        println!("response.status: {}", response.status);
        println!("response.body: {}", response.body);
        println!();
    }
}

The loop is hardcoded so the binary is deterministic, but the shape is the shape Amazon's runtime uses. The Lambda service holds the handler in memory, waits for an event, hands it to the function, captures the return value, ships it back to whoever invoked the function, then waits for the next event. Three events fire here. Three responses come out. The function never knows the events arrived seconds apart on a real cron, microseconds apart on a load test, or out of order from a queue with millions of messages backed up.

--- invocation 1 ---
event.source: aws:s3
event.detail: uploads/cat.jpg
response.status: 200
response.body: queued thumbnail job for uploads/cat.jpg

--- invocation 2 ---
event.source: aws:apigateway
event.detail: GET /hello
response.status: 200
response.body: HTTP 200 -- echoed GET /hello

--- invocation 3 ---
event.source: aws:events
event.detail: 0900Z
response.status: 204
response.body: ran scheduled task at tick 0900Z

Read the output top to bottom. Invocation 1 is the S3 path: the cloakroom drops off uploads/cat.jpg and the handler returns a 200 with the thumbnail-job message. Invocation 2 is the API Gateway path: the HTTP request lands and the handler echoes it back with a 200. Invocation 3 is the scheduled tick: the cron fires at 0900Z, the handler runs the cleanup task and returns a 204 because no caller is waiting on a reply, and the body text is there only for the demo printout. Same function, three sources, three honest responses.
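
The wait, hand over, capture, ship back cycle described above can also be sketched with a channel standing in for the Lambda service. This is an illustration under stated assumptions, not how Amazon's runtime is built: serve and run_loop_demo are hypothetical names, the handler is reduced to a single arm, and std::sync::mpsc plays the role of the event source.

```rust
use std::sync::mpsc;
use std::thread;

struct Event {
    source: &'static str,
    detail: &'static str,
}

struct Response {
    status: u16,
    body: String,
}

// Simplified stand-in for the lesson's handler: one arm is enough here.
fn handler(event: Event) -> Response {
    Response {
        status: 200,
        body: format!("got {} from {}", event.detail, event.source),
    }
}

// Hypothetical service loop: block until an event arrives, call the handler,
// ship the response back, repeat until the event channel is closed.
fn serve(events: mpsc::Receiver<Event>, responses: mpsc::Sender<Response>) {
    for event in events {
        let response = handler(event);
        if responses.send(response).is_err() {
            break; // nobody is listening for responses any more
        }
    }
}

fn run_loop_demo() -> Vec<String> {
    let (event_tx, event_rx) = mpsc::channel();
    let (response_tx, response_rx) = mpsc::channel();
    let worker = thread::spawn(move || serve(event_rx, response_tx));

    event_tx
        .send(Event { source: "aws:s3", detail: "uploads/cat.jpg" })
        .unwrap();
    event_tx
        .send(Event { source: "aws:events", detail: "0900Z" })
        .unwrap();
    drop(event_tx); // closing the channel ends the loop inside serve

    worker.join().unwrap();
    response_rx
        .iter()
        .map(|r| format!("{} {}", r.status, r.body))
        .collect()
}
```

The handler never sees the channel, the thread, or the timing between events. That separation is the point: the loop belongs to the platform, the function belongs to you.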

The hidden cost the output cannot show is cold start. The first time a coat arrives the counter is empty, and the attendant has to walk up from the break room, put on their gloves, and clock in. That walk is the cold start, and the language you wrote the function in has a large say in how long it takes. A Rust function compiles to a small native binary that loads in roughly 50 milliseconds. A Java function loads a JVM and an entire object graph and can take several seconds the first time, closer to 5 in bad cases. Python and Node sit in the middle at around 200 to 400 ms. Treat all of these as rough figures: they shift with memory size, package size, and how much setup the function does before its first event. The reason teams started rewriting hot-path Lambdas in Rust in 2021 is the cold start. When a user request hits a function that has sat idle long enough for Lambda to reclaim its environment, 50 ms feels instant and 5 seconds feels broken.

Cold-start time by runtime -- the gap between event arrival and the first line of user code running.
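
Language choice is one lever on cold start; where the setup work lives is another. Lambda reuses a warm execution environment for later invocations, so anything initialized once per process is paid for on the cold path only. The sketch below illustrates that pattern with std::sync::OnceLock (Rust 1.70 or later). Shared, the table name "coat-tickets", and the INIT_RUNS counter are hypothetical and exist only to make the one-time setup observable.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

// Counts how many times the expensive setup actually ran.
static INIT_RUNS: AtomicUsize = AtomicUsize::new(0);

// Hypothetical shared state: imagine parsed configuration or an SDK client.
struct Shared {
    table_name: String,
}

static SHARED: OnceLock<Shared> = OnceLock::new();

// The closure runs on the first call only; every later call returns the
// already-built value.
fn shared() -> &'static Shared {
    SHARED.get_or_init(|| {
        INIT_RUNS.fetch_add(1, Ordering::SeqCst);
        Shared {
            table_name: String::from("coat-tickets"),
        }
    })
}

fn handle(detail: &str) -> String {
    format!("stored {detail} in {}", shared().table_name)
}

fn warm_invocations_demo() -> (String, String, usize) {
    let first = handle("coat-1"); // cold path: runs the initializer
    let second = handle("coat-2"); // warm path: reuses the same Shared
    (first, second, INIT_RUNS.load(Ordering::SeqCst))
}
```

Two invocations, one initialization. The attendant puts the gloves on once and keeps them on for every coat until the counter closes.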

The real entry point looks almost identical to the one above, with the runtime crate doing the loop for you.

// What the same handler looks like wired to the real lambda_runtime crate.
// Gated with #[cfg(any())] so it never compiles inside this lesson -- the
// real crate pulls in tokio and serde, which would break our stdlib-only
// rule. The reader sees the shape of a production Lambda entry point here.
#[cfg(any())]
mod real_runtime {
    use lambda_runtime::{run, service_fn, Error, LambdaEvent};
    use serde::{Deserialize, Serialize};

    #[derive(Deserialize)]
    struct Event {
        source: String,
        detail: String,
    }

    #[derive(Serialize)]
    struct Response {
        status: u16,
        body: String,
    }

    async fn handler(event: LambdaEvent<Event>) -> Result<Response, Error> {
        let body = format!("got {} from {}", event.payload.detail, event.payload.source);
        Ok(Response { status: 200, body })
    }

    #[tokio::main]
    async fn main() -> Result<(), Error> {
        run(service_fn(handler)).await
    }
}

The lambda_runtime crate exists so you do not write the event loop by hand. run(service_fn(handler)) polls the Lambda Runtime API, an HTTP endpoint local to the execution environment, for the next event. Each time one arrives it deserializes the JSON into the typed Event, calls your function, serializes the Response back to JSON, and posts it back. The #[tokio::main] attribute is there because the runtime makes those HTTP calls asynchronously and tokio is the executor that drives the futures. Production Rust Lambdas almost always run on tokio for that reason; the demo binary above leaves it out because the stdlib version makes the shape of the contract easier to read.

Lambda bills two things -- count of invocations and GB-seconds of memory used while the function ran.

Pricing is the part that decides whether Lambda is the right tool for the job. Amazon charges for two things: the count of invocations and the GB-seconds of memory allocated while the function ran. At list prices of about $0.20 per million requests and $0.0000166667 per GB-second (x86, us-east-1, at the time of writing), a function with 128 MB of memory that runs for 100 ms costs about $0.0000004, or roughly 40 cents per million calls. The free tier covers the first million requests and 400,000 GB-seconds every month. The 15-minute ceiling is the hard limit: the attendant cannot stay on the clock longer than that for any single coat, which means video transcoding longer than 15 minutes or training runs that take hours have to go somewhere else. Sustained load is the other place Lambda gets expensive. A workload running at 100% utilization 24/7 costs roughly 3 to 5 times what the same workload would cost on a reserved EC2 instance, because the per-millisecond price assumes idle gaps. Spiky traffic loves Lambda. A steady firehose hates it.
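
The two-part bill is simple enough to work through in code. The sketch below assumes list prices of $0.20 per million requests and $0.0000166667 per GB-second (x86, us-east-1, at the time of writing) and ignores the free tier; the constants and function names are illustrative, so check the current pricing page before relying on the numbers.

```rust
// Assumed list prices (x86, us-east-1); verify against the current pricing page.
const PRICE_PER_REQUEST_USD: f64 = 0.20 / 1_000_000.0;
const PRICE_PER_GB_SECOND_USD: f64 = 0.0000166667;

// Cost of one invocation: a flat request charge plus memory multiplied by time.
fn invocation_cost_usd(memory_mb: f64, duration_ms: f64) -> f64 {
    let gb_seconds = (memory_mb / 1024.0) * (duration_ms / 1000.0);
    PRICE_PER_REQUEST_USD + gb_seconds * PRICE_PER_GB_SECOND_USD
}

// Cost of a month of identical invocations, free tier ignored.
fn monthly_cost_usd(invocations: f64, memory_mb: f64, duration_ms: f64) -> f64 {
    invocations * invocation_cost_usd(memory_mb, duration_ms)
}
```

Under those assumptions, invocation_cost_usd(128.0, 100.0) comes to about $0.0000004 and monthly_cost_usd(1_000_000.0, 128.0, 100.0) to about $0.41, split almost evenly between the request charge and the memory charge. Doubling either the memory or the duration doubles only the second half.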

The thing this design cannot do on its own is talk to other AWS services without bringing in the SDK. The handler knows how to take an event and return a response, but uploading the resized thumbnail to S3, reading a row from DynamoDB, or putting a message on SQS all happen through the AWS SDK for Rust, published as per-service crates such as aws-sdk-s3 and aws-sdk-dynamodb from the aws-sdk-rust repository. That is the next bottleneck the rest of this section solves.