
Ten Years of Scale

DATE: May 14, 2026
SOURCE: SCALE AI BLOG
PARTICIPANTS: JASON DROEGE (CEO, SCALE AI)

1. Key Themes


Data Infrastructure Is a Foundational Moat, Not a Commodity

Scale's core thesis—that data quality and infrastructure matter as much as models and compute—proved prescient and is now embedded in the world's leading AI systems.

"Scale was built on a belief that was unpopular: data matters as much as models and compute... Most major frontier models today were originally built on the data, human feedback, and evaluation infrastructure that Scale spent a decade perfecting."


The AI Reliability Gap Is the Next Major Investment Frontier

The next competitive battleground is not building capable AI, but building verifiably reliable AI—especially as it moves into high-stakes domains.

"AI is moving into every domain that matters: medicine, education, law, defense, finance, public services. These are not places where you get to guess or take shortcuts. They require AI applications that work reliably, every single time."


Enterprise and Government AI Deployment Is Accelerating Into High-Stakes Sectors

Scale has repositioned from a training data vendor into mission-critical infrastructure for institutions where errors have irreversible consequences.

"That infrastructure is now inside the institutions where the stakes are highest: hospitals where misdiagnosis costs lives, financial institutions where bad calls cost billions, and national security environments where the margin for error is zero."


Early Contrarian Bets on Unglamorous Work Create Category-Defining Positions

Scale's long-term advantage was built by doing work the industry considered low-status—before anyone recognized it as a category worth owning.

"In the early days, that meant doing work most of the industry thought was unglamorous: labeling data, creating data to capture edge cases, red-teaming to make sure models were safe, and building pipelines from scratch. There wasn't a playbook for this, but we started building the infrastructure for AI before that was a category people recognized."


2. Contrarian Perspectives


Defense Tech Was a Viable—Even Necessary—Bet Before It Was Socially Acceptable

In 2020, working with the Pentagon was reputationally risky (other major tech companies were retreating from defense contracts). Scale made the opposite call, and the defense tech sector has since exploded in legitimacy and investment.

"We chose to work with the Pentagon in 2020, before defense tech was a popular choice, because we believed the stakes were too high to sit on the sidelines."


The "Reliability Race" Will Supersede the "Capability Race"

The consensus narrative has centered on model capability (GPT-4 vs. Claude vs. Gemini). Scale is implicitly arguing that the next decade's competition shifts to verifiable reliability—a distinct and underappreciated dimension.

"Scale was never just about labeling data. It was about building AI that verifiably works. Our mission is to develop reliable AI systems for the world's most important decisions."


Foundational AI Work Predates Public Recognition by Years—Patience Is a Strategic Asset

Scale was partnering with Waymo and OpenAI long before either autonomous vehicles or large language models entered mainstream consciousness—suggesting that category-defining infrastructure companies are built well before their markets arrive.

"We started working with Waymo eight years before you could ride one to the office... We partnered with OpenAI on InstructGPT... a full year before ChatGPT took the world by storm."


3. Companies Identified

Company | Description | Why Mentioned | Quote
Scale AI | AI data infrastructure and reliability company | Central subject; 10-year retrospective | "What started as an idea in a college dorm room has grown into one of the defining companies behind the most consequential technology of our time."
Waymo | Autonomous vehicle company | Early Scale customer; used as proof of long-term, pre-market conviction | "We started working with Waymo eight years before you could ride one to the office, because we believed in the future of self-driving."
OpenAI | AI research and deployment company | Early Scale partner on InstructGPT, the precursor to ChatGPT | "We partnered with OpenAI on InstructGPT, the predecessor to ChatGPT and the first time a language model could reliably follow instructions, a full year before ChatGPT took the world by storm."
U.S. Pentagon / Department of Defense | U.S. federal defense institution | Scale's early government customer; cited as a contrarian bet that proved correct | "We chose to work with the Pentagon in 2020, before defense tech was a popular choice, because we believed the stakes were too high to sit on the sidelines."

4. People Identified

Person | Description | Why Mentioned | Quote
Jason Droege | CEO, Scale AI | Author of the article; provides the strategic framing for Scale's decade of work | "The reliability race is on, and we're building the team to meet the moment."

5. Operating Insights


Do the Unglamorous Work Before the Market Validates It

Scale built durable competitive advantage by investing in data labeling, edge-case capture, and red-teaming when most of the industry considered that work unglamorous. Operators should look for foundational, unsexy work in emerging categories that is likely to become essential infrastructure.

"There wasn't a playbook for this, but we started building the infrastructure for AI before that was a category people recognized."


Evaluation Infrastructure Is as Strategically Important as Training Infrastructure

Scale emphasizes not just building models, but measuring them—suggesting that evaluation tooling is an underbuilt and undervalued layer of the AI stack.

"We've taken everything we've learned from training models—their strengths and weaknesses, how to turn them into applications, how to evaluate them—and applied it to the hardest problems inside enterprises and governments."


Pay Your Data Contributors—It's Both Ethics and Moat

Scale has paid over $1 billion to contributors globally. Sustained payouts at that volume build a network of human feedback that is difficult to replicate and that reinforces the data-quality flywheel.

"Over $1 billion paid to the contributors who made it possible."


6. Overlooked Insights


Multilingual AI Coverage at 150 Languages Is a Rarely Discussed Competitive Differentiator

Global AI deployment requires language and locale coverage that most companies underinvest in. Scale's breadth here likely underpins its ability to serve international governments and enterprises in ways competitors cannot easily replicate.

"AI developed across 150 languages and locales."


15 Billion Human Decisions Is a Proprietary Data Flywheel That Compounds

This figure is easy to skim past, but it represents a training signal corpus of extraordinary scale—one that took a decade and over a billion dollars to assemble, creating a structural barrier to entry that a new entrant cannot shortcut.

"15 billion human decisions applied to train the world's leading models."