From Transistors to Microprocessors

A microprocessor is a city built on a fingernail. Every transistor is a building, every wire a street, and the whole grid runs on a patch of silicon that started out small enough for a grain of rice to cover. In 1971 a team at Intel finished the first commercial one, the 4004, and crammed 2,300 buildings onto that patch. By 2023 a single Apple chip held 92 billion. The plot of land grew perhaps a few dozen times over. The city grew forty million times over. This is the story of how the village on silicon turned into Manhattan, and why your phone is faster than the supercomputers your parents grew up reading about.

The Intel 4004 die, 1971: 2,300 transistors on a sliver of silicon about 3 mm by 4 mm.

Before the 4004, a CPU was a whole closet full of chips wired together on a green board. The computer industry had spent the 1960s shrinking room-sized mainframes into refrigerator-sized minis, but the brain was still spread across dozens of separate slabs of silicon, each holding maybe a few hundred transistors. A Japanese calculator company called Busicom walked into Intel in 1969 wanting twelve custom chips for a desk calculator. An Intel engineer named Ted Hoff looked at the order and asked a different question: what if a single programmable processor chip, backed by a few memory chips, replaced all that custom logic? Intel talked Busicom into the change, and in 1970 Federico Faggin joined to turn the idea into silicon, leading a design that was laid out by hand on sheets of mylar bigger than a kitchen table. When the 4004 shipped in November 1971, it was a single sliver of silicon, about 3 mm by 4 mm, that held the entire central processor of a computer. The village fit on one block.

The reason this worked is a trick called photolithography. A chip is built like a printed photograph: shine ultraviolet light through a stencil, etch the pattern into a wafer of silicon, repeat for each layer. Make the stencil finer and the buildings get smaller. Same fingernail, more rooms. In 1965 a Fairchild Semiconductor engineer named Gordon Moore noticed that the number of components the industry could fit on a chip had been doubling about every year. He wrote a four-page article predicting the doubling would continue for at least a decade, and in 1975 he revised the pace to a doubling every two years. The trend outlasted his ten-year horizon by about 50 years. That guess became Moore's Law, and Moore went on to co-found Intel in 1968, the same Intel that would ship his prediction's first major proof in the 4004.
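
The doubling rule is easy to turn into arithmetic. Here is a minimal sketch, assuming a fixed doubling period and starting from the 4004's 2,300 transistors in 1971. The function name `projected_transistors` and its parameters are made up for this illustration, and the two-year period is the rule of thumb from the text, not a measured constant.

```rust
/// Projected transistor count under a fixed doubling period.
/// `doubling_years` is an assumption of this sketch (2.0 for the
/// familiar two-year version of Moore's Law).
fn projected_transistors(
    base_count: f64,
    base_year: u32,
    target_year: u32,
    doubling_years: f64,
) -> f64 {
    // Years elapsed, computed in floating point so earlier years also work.
    let elapsed = target_year as f64 - base_year as f64;
    // Each full doubling period multiplies the count by 2.
    base_count * 2f64.powf(elapsed / doubling_years)
}

fn main() {
    // Start from the 4004 (2,300 transistors, 1971) and step a decade at a time.
    for year in [1971, 1981, 1991, 2001, 2011, 2021] {
        let n = projected_transistors(2_300.0, 1971, year, 2.0);
        println!("{year}: about {n:.0} transistors");
    }
}
```

With a two-year period, every decade is five doublings, so each line of output is 32 times the one before it. Real chips drift above and below this idealized line.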

Transistor count per chip, 1971 to 2026, plotted on a log scale.

Look at what doubling every two years actually means. The 4004's 2,300 transistors became the 8086's 29,000 in 7 years. The 8086 became the original Pentium's 3.1 million in another 15. The Pentium became the Apple M3 Max's 92 billion in another 30. Each step on a normal chart is invisibly small compared to the next, because the curve is exponential. The only honest way to draw it is on a logarithmic scale, where every tick on the y-axis is a doubling. Here is the data, printed by a tiny Rust program — chip, year, transistor count, and a row of # marks where each extra # is one more doubling.

struct Chip {
    year: u16,
    name: &'static str,
    transistors: u64,
}

const CHIPS: &[Chip] = &[
    Chip { year: 1971, name: "Intel 4004",       transistors: 2_300 },
    Chip { year: 1978, name: "Intel 8086",       transistors: 29_000 },
    Chip { year: 1985, name: "Intel 386",        transistors: 275_000 },
    Chip { year: 1993, name: "Intel Pentium",    transistors: 3_100_000 },
    Chip { year: 2000, name: "Pentium 4",        transistors: 42_000_000 },
    Chip { year: 2008, name: "Core i7 (Nehalem)", transistors: 731_000_000 },
    Chip { year: 2015, name: "Apple A9",         transistors: 2_000_000_000 },
    Chip { year: 2020, name: "Apple M1",         transistors: 16_000_000_000 },
    Chip { year: 2023, name: "Apple M3 Max",     transistors: 92_000_000_000 },
    Chip { year: 2026, name: "Apple M5 Max*",    transistors: 180_000_000_000 },
];

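/// Floor of log base 2, computed by counting how many times `n`
/// can be halved before it reaches 1. Returns 0 for n = 0 or n = 1.
fn log2(mut n: u64) -> u32 {
    let mut bits = 0;
    while n > 1 {
        n >>= 1;
        bits += 1;
    }
    bits
}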

fn main() {
    println!("Moore's Law: transistors per chip, 1971 to 2026");
    println!("year  chip                   transistors      log2(count)  bar");
    println!("----  ---------------------  ---------------  -----------  ------------------------------------");
    for chip in CHIPS {
        let bits = log2(chip.transistors);
        let bar: String = "#".repeat(bits as usize);
        println!(
            "{:<4}  {:<21}  {:>15}  {:>11}  {}",
            chip.year, chip.name, chip.transistors, bits, bar,
        );
    }
    println!("----  ---------------------  ---------------  -----------  ------------------------------------");
    println!("* 2026 figure is a projection from the 2-year doubling trend.");
}

Run it and read the bars sideways like a city's skyline growing over time.

Moore's Law: transistors per chip, 1971 to 2026
year  chip                   transistors      log2(count)  bar
----  ---------------------  ---------------  -----------  ------------------------------------
1971  Intel 4004                        2300           11  ###########
1978  Intel 8086                       29000           14  ##############
1985  Intel 386                       275000           18  ##################
1993  Intel Pentium                  3100000           21  #####################
2000  Pentium 4                     42000000           25  #########################
2008  Core i7 (Nehalem)            731000000           29  #############################
2015  Apple A9                    2000000000           30  ##############################
2020  Apple M1                   16000000000           33  #################################
2023  Apple M3 Max               92000000000           36  ####################################
2026  Apple M5 Max*             180000000000           37  #####################################
----  ---------------------  ---------------  -----------  ------------------------------------
* 2026 figure is a projection from the 2-year doubling trend.

The bars climb by one # for every doubling. From 1971 to 2023, that bar grew from 11 marks to 36: twenty-five doublings in fifty-two years. The number of buildings on a chip went up by a factor of 40 million. That is what made the personal computer possible. The 4004 was an industrial part, sold to calculator makers. The 8088, a lower-cost variant of the 1978 8086, ran the IBM PC in 1981. The 386 in 1985 brought 32-bit computing and paged memory, which made real multitasking practical. By the time Apple shipped the M1 in 2020, the entire computer (CPU, GPU, memory controller, neural engine) fit on one slab smaller than a Saltine cracker. The city had become a planet.
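
The doubling arithmetic can be checked directly from the two endpoints of the table. This is a small sketch, with helper names (`doublings`, `doubling_period`) invented for the purpose. It uses floating-point logarithms, so it reports fractional doublings, where the bar chart rounds down to whole ones.

```rust
/// Number of doublings needed to go from `start` to `end`
/// (log base 2 of the ratio). Fractional results are expected.
fn doublings(start: f64, end: f64) -> f64 {
    (end / start).log2()
}

/// Average years per doubling between two (year, count) points.
fn doubling_period(y0: u32, n0: f64, y1: u32, n1: f64) -> f64 {
    (y1 - y0) as f64 / doublings(n0, n1)
}

fn main() {
    let (y0, n0) = (1971, 2_300.0); // Intel 4004
    let (y1, n1) = (2023, 92_000_000_000.0); // Apple M3 Max
    println!("growth factor: {:.0}", n1 / n0);
    println!("doublings:     {:.2}", doublings(n0, n1));
    println!("years each:    {:.2}", doubling_period(y0, n0, y1, n1));
}
```

The exact ratio of 92 billion to 2,300 is 40 million, a little more than the 2^25 (about 33.5 million) that twenty-five whole doublings would give. Spread over fifty-two years, that works out to one doubling slightly more often than every 2.1 years, which is about as close to "every two years" as a rule of thumb gets.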

Same fingernail, more buildings — the 4004 next to a modern Apple M-series die.

The reason this mattered for everyday people is that price per transistor runs opposite to transistor count. As you cram more buildings onto the same plot, each building gets cheaper, because the cost of making the wafer grows far more slowly than the count. A 4004 cost about $200 in 1971 money, which is over $1,500 today. An Apple M1, with seven million times more transistors, costs Apple maybe $50 to make. This is why a phone in your pocket has more compute than the Cray-1 supercomputer that the National Center for Atmospheric Research bought in 1977 for about $8.8 million. Moore's Law did not just make computers faster. It made them disposable. It made them yours.
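
To see how far the price of a single building fell, divide each chip's price by its transistor count. The sketch below uses only the figures quoted above. The M1 manufacturing cost is a rough estimate rather than a published number, and the function name `cost_per_transistor` is made up for this example.

```rust
/// Dollars per transistor, given a unit price and a transistor count.
fn cost_per_transistor(price_usd: f64, transistors: f64) -> f64 {
    price_usd / transistors
}

fn main() {
    // Figures quoted in the text. The 4004 price is in 1971 dollars and the
    // M1 cost is an estimate, so treat the result as an order of magnitude.
    let c4004 = cost_per_transistor(200.0, 2_300.0);
    let m1 = cost_per_transistor(50.0, 16_000_000_000.0);

    println!("4004: ${:.4} per transistor", c4004);
    println!("M1:   ${:.2e} per transistor", m1);
    println!("ratio: about {:.0}x cheaper per transistor", c4004 / m1);
}
```

On these numbers a 4004 transistor cost almost nine cents, and an M1 transistor costs a few billionths of a dollar. Adjusting the 4004 price for inflation would only widen the gap.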

The 1976 Cray-1 next to a modern smartphone: the phone has more compute.

There is one wrinkle. The doubling cannot last forever. The leading chips of 2023 are built on a process called 3 nanometer. That label is a marketing name rather than a measured width, but real transistor dimensions now range from a few nanometers to a few tens of nanometers. A silicon atom is about 0.2 nanometers across. We are running out of atoms to make the buildings out of. The industry has been patching around this for years with tricks (stacking chips vertically, splitting work across many cores, building specialized parts for one job only), but the simple "shrink and double" curve is bending. The 2026 projection in the table above is a guess, not a guarantee. The next bottleneck is what to do when you cannot make the buildings any smaller, only smarter about how they are laid out.

Three nanometers is only about 15 silicon atoms across: the floor is in sight.

Next lesson — what the CPU actually does with all those transistors, and why every chip on Earth still speaks one of a handful of languages called instruction sets.