Nine Years of Struggle, Then $20B in 30 Days — Stanford MS&E 435 Class #2

Stanford MS&E 435 Class #2 · April 15, 2026

Sunny Madra · 17:38 "At the same power, you get 2.5x the tokens."

The session, published April 15, 2026, runs about 56 minutes: a conversation with an investor and the former president of Groq.

For nine years after its 2016 founding, Groq was told that its SRAM-centric deterministic architecture was "interesting, but niche," and it stayed in NVIDIA's shadow. In December 2025, NVIDIA acquired that same company for $20 billion, the largest M&A deal in NVIDIA's history. The trigger was a single text message. From showing Jensen a working system to the $20 billion wire took roughly 30 days.

Two speakers tell the story. Brad Gerstner is the founder and CEO of Altimeter Capital, a major investor in OpenAI, Anthropic, Cerebras, and Groq; over 18 years he grew the hedge fund to more than $15 billion in AUM. Sunny Madra is the former president of Groq and a serial entrepreneur with four exits behind him (Pivotal / Ford / Groq / NVIDIA). The discussion is moderated by MS&E 435 instructor Apoorv Agrawal (partner at Altimeter Capital).

The premise carries over from Class #1: the "software ate the world" formula, in which serving one more user costs almost nothing, doesn't apply to AI. In a world where every additional user consumes GPU compute, marginal cost isn't zero. Because that constraint binds globally and simultaneously, "tokens per watt" has become the fundamental economic metric of the era. That's the starting frame.
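The non-zero marginal cost can be made concrete with a small sketch. This is an illustration only: the class name, function names, and every number below are assumptions chosen for easy arithmetic, not figures from the class, and the model counts electricity alone (no hardware, cooling, or networking).

```python
from dataclasses import dataclass


@dataclass
class ServingCost:
    """Illustrative per-user energy cost model. All inputs are assumed values."""
    tokens_per_user_per_day: float  # tokens one user consumes per day
    joules_per_token: float         # energy spent generating one token
    usd_per_kwh: float              # electricity price

    def marginal_cost_per_user_per_day(self) -> float:
        # Convert joules to kWh (1 kWh = 3.6e6 J), then price the energy.
        kwh = self.tokens_per_user_per_day * self.joules_per_token / 3.6e6
        return kwh * self.usd_per_kwh


def total_daily_cost(users: int, cost: ServingCost) -> float:
    # Energy cost grows linearly with the user count: there is no point at
    # which the next user becomes free.
    return users * cost.marginal_cost_per_user_per_day()


# Hypothetical inputs, not measurements.
example = ServingCost(
    tokens_per_user_per_day=1_000_000,
    joules_per_token=0.36,
    usd_per_kwh=0.10,
)
cost_1m_users = total_daily_cost(1_000_000, example)
cost_2m_users = total_daily_cost(2_000_000, example)
```

Under these assumptions, doubling the user base doubles the energy bill, and the only lever that lowers the per-user figure is `joules_per_token`, which is the inverse of tokens per watt-second. That is why the metric sits at the center of the discussion.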

The proposal Groq brought to NVIDIA was technically simple. Separate inference into "prefill" (input processing) and "decode" (output generation), and further separate decode into its "compute-intensive" and "memory-bandwidth-intensive" parts. NVIDIA's GPUs are strong at compute and pair it with HBM, but that memory sits off-chip and is comparatively slow. Groq's SRAM chips have less compute but roughly an order of magnitude more memory bandwidth, because the SRAM is on-chip. Connect the two via NVLink Fusion, and at the same power footprint you get 2.5x the tokens. That is the basis for "same power, 2.5x tokens."
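A rough way to see why memory bandwidth matters for decode, and what a 2.5x uplift means at a fixed power budget, is the sketch below. It assumes the common simplification that generating one token for a single sequence requires streaming all model weights from memory once; the function names and all numeric inputs are hypothetical. The 2.5x factor is taken from the talk as a system-level claim and is not derived from the bandwidth ratio.

```python
def decode_tokens_per_second_bound(bandwidth_bytes_per_s: float,
                                   weight_bytes: float) -> float:
    """Upper bound on single-sequence decode speed, assuming each generated
    token requires reading every model weight from memory once."""
    return bandwidth_bytes_per_s / weight_bytes


def tokens_per_second_at_power(power_watts: float,
                               tokens_per_joule: float) -> float:
    """Throughput at a fixed power budget (1 W = 1 J/s)."""
    return power_watts * tokens_per_joule


# Hypothetical numbers, chosen only to make the arithmetic easy to follow.
WEIGHT_BYTES = 100e9             # assumed 100 GB of model weights
OFF_CHIP_BW = 2e12               # assumed 2 TB/s off-chip memory bandwidth
ON_CHIP_BW = 10 * OFF_CHIP_BW    # "an order of magnitude" more, as described

off_chip_bound = decode_tokens_per_second_bound(OFF_CHIP_BW, WEIGHT_BYTES)
on_chip_bound = decode_tokens_per_second_bound(ON_CHIP_BW, WEIGHT_BYTES)

POWER_BUDGET_W = 1_000_000       # assumed 1 MW power budget
BASELINE_TOKENS_PER_JOULE = 2.0  # assumed baseline efficiency
UPLIFT = 2.5                     # factor quoted in the talk

baseline_tps = tokens_per_second_at_power(
    POWER_BUDGET_W, BASELINE_TOKENS_PER_JOULE)
combined_tps = tokens_per_second_at_power(
    POWER_BUDGET_W, BASELINE_TOKENS_PER_JOULE * UPLIFT)
```

With these inputs the bandwidth-bound decode ceiling rises tenfold when weights are served from the faster memory, while the fixed-power throughput rises by exactly the quoted 2.5x. The gap between the two ratios is expected: a real system spends power on prefill, compute, and interconnect as well, so the end-to-end gain is smaller than the raw bandwidth advantage.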

Key Observations

Brad sat on the email to Jensen for a week (18:00)

The moment Sunny told Brad "we can partner with NVIDIA, send a message to Jensen," Brad froze. He would be proposing a collaboration to the head of a direct competitor, and he had built his close relationship with Jensen through his own political fundraising work. "I didn't want to be seen as suggesting a crazy idea." A week of silence followed. When Sunny followed up ("well, did you send it?"), Brad finally hit send. Jensen replied immediately: "interesting, let's talk." It is a scene where the human side shows through: even a venture capitalist has self-protective biases.

"From working system to $20B wire — 30 days" (20:23)

Apoorv pins down the timeline: from showing Jensen a working prototype to NVIDIA wiring $20 billion took just 30 days. A company that had been called "niche" for nine years became the target of the largest M&A deal in NVIDIA's history within a month. The industry norm, that "partnerships and acquisitions take six months in a world where quarterly earnings drive timing," was compressed by the urgency of physical constraints (power, memory).

"NVIDIA isn't making GPUs anymore" (20:39)

Sunny's observation: NVIDIA has already built a vertically integrated ecosystem of seven chip families and five rack designs. "NVIDIA isn't making GPUs; they're making the entire inference system." So there was already room internally for a decode-specialized or SRAM-centric chip. When Groq's working prototype arrived, NVIDIA could conclude that "the culture and engineering are complementary." Had Groq simply been building "a better GPU," internal conflict would have blocked the acquisition. Sunny says this outright.

Video Outline

  • (00:00) Shared premise — the "software ate the world" formula doesn't apply to AI
  • (00:30) Guest introduction — Brad Gerstner (Altimeter), Invest America
  • (03:55) Sunny Madra introduction — four consecutive exits across Pivotal / Ford / Groq / NVIDIA
  • (05:11) 2,000 years of human per-capita GDP — flat for 1,800 years, then explosive growth from the 1800s
  • (15:00) The origin of the Groq → NVIDIA partnership
  • (16:04) Inference split into prefill / decode, then decode further subdivided
  • (16:34) Structural difference between GPU (compute + slow HBM) and Groq (SRAM-centric, an order of magnitude faster)
  • (17:00) NVLink Fusion for chip-to-chip connection
  • (17:38) 2.5x the tokens at the same power footprint
  • (17:51) Sunny's text — "we can partner with NVIDIA"
  • (18:00) Brad sat on the email for a week
  • (20:23) "Working system → $20B wire in 30 days"
  • (20:39) NVIDIA isn't making GPUs — it's making the entire inference system
  • (21:46) Marc Andreessen's observation — the era when the people around you spend $100–$1,000 a day on Claude / ChatGPT tokens

Sources

Class #2 | MS&E 435: Economics of the AI Supercycle, Stanford University Spring '26, Apoorv Agrawal
