Freda Duan | 张小珺Jùn｜商业访谈录 Summary

Podcast: 张小珺Jùn｜商业访谈录
Episode: 141 - Freda's Investment Notes, Episode 2: Tokenmaxxing, Putting Electric Motors into Steam Engines, Relay Race to Basketball Game, Loneliness, Human Connection
Participants: Freda Duan (Partner, Ultimetry Capital, Bay Area investor); Xiao Jun Zhang (Host)

1. Key Themes

Token Economics Are Widely Misunderstood — and That Misunderstanding Is Creating Both Distortions and Opportunities

The market treats token consumption as a proxy for AI value creation, but this is fundamentally flawed. The key metric that matters is Token Per Task — how efficiently a model completes a unit of work — not raw token volume. Models with higher quality can complete the same coding task in 100 lines vs. 1,000 lines, consuming drastically different amounts of compute. Freda draws an analogy to industrial energy efficiency:

"I would imagine it as the industrial era equivalent of looking at Dollar Per Kilowatt Hour — it's a concept of comparing energy-output ratio." 00:02:41

And on the direction of travel:

"The right answer is definitely not more is better — in fact, it's the opposite. Because ultimately everyone will realize what you want is the outcome, and you want to achieve that outcome with maximum efficiency." 00:04:34

The billing model will inevitably shift from per-token to outcome-based pricing, with Sierra (AI customer service) already pioneering this:

"Sierra charges purely based on outcomes. If the AI resolves the customer service issue without transferring to a human agent, they charge. If it transfers, they don't charge... As a customer, I'm completely interest-aligned with Sierra — we both want to solve the problem, and we both want to minimize token burn." 00:10:45

The AI Coding Loop May Be Creating an Irreversible Moat — For the First Time

For two years, AI model leadership rotated every few months. Freda now questions whether that dynamic has fundamentally changed, because coding agents have created a recursive self-improvement loop: better AI trains the next generation of better AI faster.

"Once this cycle starts running, it has a slight flavor of what researchers call recursive self-improvement. If it truly passes a certain critical threshold, the curve becomes extremely steep, and then you're approaching something like singularity — however you want to define it. Those trying to catch up later may find it truly meaningless." 00:12:30

The analogy Freda uses is the horse-to-automobile transition: early cars broke down frequently, so fast horses could keep up. But once the engine ran reliably, the comparison became meaningless. The implication: every major tech company is racing for the same reason (OpenAI reorganizing around coding, Google with Sergey Brin personally overseeing it, Meta building its own coding model, xAI acquiring Cursor).

AI's Economic Diffusion Will Lag Technology Diffusion by Years — We Are Still in the "Electric Motor in a Steam Engine" Phase

Freda synthesizes Dario Amodei's framework of technology diffusion vs. economic diffusion. The historical precedent is stark: after the invention of electricity, it took ~40 years for societal productivity to actually improve. Factories simply replaced steam engines with electric motors but kept the same vertical factory layout. The productivity leap only came when factories were redesigned from first principles around electricity (i.e., the assembly line).

"When I look at where AI is on the curve right now, I think we're probably at the stage of 'putting electric motors into steam engines' — everyone has added AI into their own workflows, but no one has truly asked: why does this process exist in its current form? Why does Meta, or any company, need 80,000 people? Why do we need so many layers of hierarchy?" 00:39:43

The computing parallel: PCs were in offices and banks by the 1980s, but macro productivity didn't improve until the mid-to-late 1990s, when companies like Walmart and Amazon redesigned their entire businesses around computers and networks.

2. Contrarian Perspectives

The TAM for AI Coding Was Wrong by an Order of Magnitude — and Most Investors Are Still Using the Wrong Framework

The early consensus framing was: ~4-5 million US developers × ~$200/month × 12 months ≈ $10 billion per year. Anthropic alone has already blown past that. The correct TAM, as Dario argued from the start, is the entire white-collar labor market ($30-40 trillion globally), because anything a computer can automate is within scope.
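The gap between the two framings is easier to see when the arithmetic is written out. The sketch below is illustrative only: it assumes the $200/month figure is paid for 12 months a year, and the function name is hypothetical. The input numbers are the ones quoted in this section.

```python
def annual_seat_market(seats: int, price_per_month: float) -> float:
    """Annual revenue if every seat pays the monthly price for 12 months."""
    return seats * price_per_month * 12


# Figures from the text: ~4-5 million US developers at ~$200/month.
seat_low = annual_seat_market(4_000_000, 200.0)
seat_high = annual_seat_market(5_000_000, 200.0)

# Figures from the text: white-collar labor market of $30-40 trillion.
labor_low, labor_high = 30e12, 40e12

print(f"Seat-based TAM: ${seat_low / 1e9:.1f}B to ${seat_high / 1e9:.1f}B per year")
print(f"Labor-based TAM is {labor_low / seat_high:,.0f}x to {labor_high / seat_low:,.0f}x larger")
```

On these inputs the labor-based framing is thousands of times larger than the seat-based one. How much of that labor spend converts into vendor revenue is a separate question the sketch does not address.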

"The true TAM for coding is actually anything that can be operated by a computer. As the world becomes increasingly digital, this scope is continuously expanding — including using AI to operate Excel, and eventually AI to execute trades." 00:18:18

"Every time I think about Cowork — a product built by just two people — I find it genuinely astonishing." 00:18:46

Token Volume as an Investment Signal Is a Lagging, Misleading Metric — Like Bragging About High Electricity Consumption

Using token consumption as a bullish signal for a company is the AI-era equivalent of a factory bragging about how much electricity it burns:

"Just like in the industrial era, you wouldn't brag about a light bulb that consumed a lot of electricity. Instead you'd buy the more expensive LED. I think over time this will change somewhat." 00:08:46

A large company CFO buying Cursor credits sees token usage exploding and concludes employees love it and productivity is up — but may be missing that Cursor consumes more tokens for the same output than more efficient competitors. The market is still rewarding token consumption, not token efficiency.
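A minimal sketch of the Token Per Task comparison, assuming hypothetical usage and pricing figures (none of the numbers below come from the episode, and the class and model names are invented for illustration). It shows how a model that is more expensive per token can still be cheaper per completed task, which is the LED-bulb point in code form.

```python
from dataclasses import dataclass


@dataclass
class ModelRun:
    """Aggregate usage for one model over a batch of comparable tasks."""
    name: str
    tokens_used: int          # total tokens consumed (hypothetical figure)
    tasks_completed: int      # tasks finished successfully
    price_per_million: float  # dollars per one million tokens (hypothetical)

    def tokens_per_task(self) -> float:
        return self.tokens_used / self.tasks_completed

    def cost_per_task(self) -> float:
        return self.tokens_per_task() * self.price_per_million / 1_000_000


# Illustrative numbers only: same task count, very different token burn.
verbose = ModelRun("verbose_model", tokens_used=50_000_000,
                   tasks_completed=1_000, price_per_million=3.0)
efficient = ModelRun("efficient_model", tokens_used=8_000_000,
                     tasks_completed=1_000, price_per_million=10.0)

for run in (verbose, efficient):
    print(f"{run.name}: {run.tokens_per_task():,.0f} tokens/task, "
          f"${run.cost_per_task():.3f}/task")
```

A dashboard that tracks only `tokens_used` would rank the verbose model as the more heavily adopted tool, while cost per task points the other way.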

Software Companies Are Structurally Impaired — and Private Market Valuations Haven't Caught Up Yet

Public software stocks have already fallen 50%+. But private market valuations are still priced on pre-AI assumptions, creating a dangerous dislocation:

"Many primary market star companies, regardless of what they do, or some very large, very good companies — if they went public today, their valuations would be severely cut... And software companies have an additional problem: if their stock price falls 50% and they still want to give employees the same equity compensation, they have to issue twice as many shares." 00:30:35

The structural argument: Anthropic has ~3,000 employees, generates revenue per employee roughly an order of magnitude higher than traditional SaaS companies (well into the millions per employee vs. ~$500K for traditional software), and has no meaningful sales team. This calls into question whether the traditional enterprise software GTM model is defensible at all.
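The equity-compensation point in the quote above is simple arithmetic: holding the dollar value of a grant fixed, the number of shares needed scales with the inverse of the share price. A short sketch, with hypothetical function names and an illustrative $100,000 grant:

```python
def shares_needed(grant_value: float, share_price: float) -> float:
    """Shares required to deliver a grant of a fixed dollar value."""
    return grant_value / share_price


def share_multiplier(price_drop: float) -> float:
    """Factor by which share issuance grows after a fractional price drop."""
    if not 0 <= price_drop < 1:
        raise ValueError("price_drop must be in [0, 1)")
    return 1 / (1 - price_drop)


# Illustrative grant: $100,000 at $100/share, then at $50/share.
before = shares_needed(100_000, 100.0)
after = shares_needed(100_000, 50.0)
print(before, after, share_multiplier(0.5))
```

A 50% price decline doubles the shares issued per grant, and the multiplier grows faster than linearly for deeper declines (a 75% drop means four times the shares).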

Institutions Need to Learn from Retail Investors — Not the Other Way Around

Markets are becoming increasingly thematic, narrative-driven, and retail-dominated. Quant trading already represents 60-70%+ of US equity volume, but high-frequency quant is largely a momentum amplifier, not a directional force. The actual directional price-setters are retail investors and medium-frequency players.

"The future is no longer institutions educating retail investors — instead, institutions need to learn from retail investors, because the market very clearly is becoming increasingly stylized, increasingly thematic, and increasingly narrative-driven. Retail investors are inherently part of price formation." 00:47:43

The "Negative Snowball" Business Model for Foundation Models Has a Second Escape Valve That Even Dario Didn't Anticipate

Dario's original framework said the only way to escape the negative snowball (training costs growing faster than revenue) was for scaling to slow down. But Freda identifies a second escape mechanism that actually happened: revenue growth outpacing the training cost curve.
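The escape condition can be sketched as a comparison of two ratios: training cost as a share of revenue against inference gross margin. The sketch below is a simplification that ignores all other operating costs, and the growth rates and starting values are hypothetical, chosen only to show the two regimes.

```python
def training_share_path(revenue_growth: float, training_growth: float,
                        years: int, revenue: float = 1.0,
                        training: float = 1.0) -> list[float]:
    """Training cost as a fraction of revenue, year by year."""
    path = []
    for _ in range(years):
        revenue *= revenue_growth
        training *= training_growth
        path.append(training / revenue)
    return path


def is_profitable(training_share: float, inference_gross_margin: float) -> bool:
    """Simplified rule: profitable when training share is below gross margin."""
    return training_share < inference_gross_margin


GROSS_MARGIN = 0.5  # hypothetical inference gross margin

# Hypothetical: training cost triples each year in both scenarios.
slow = training_share_path(revenue_growth=2.0, training_growth=3.0, years=3)
fast = training_share_path(revenue_growth=5.0, training_growth=3.0, years=3)

print([is_profitable(s, GROSS_MARGIN) for s in slow])
print([is_profitable(s, GROSS_MARGIN) for s in fast])
```

When revenue grows more slowly than training cost, the share climbs every year (the negative snowball). When revenue grows faster, the share falls and eventually drops below the gross margin, even though training spend keeps rising in absolute terms.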

"The most recent months have proven there's actually another path: if your revenue growth slope is steeper — not just 30 vs. 3, but much higher — your company can suddenly become profitable." 00:20:40

"Anthropic has actually pushed its predictions to an extremely refined, beautiful form — very first-principles. But there's a very simple formula here: if your training costs as a percentage of revenue are lower than your inference gross margin, the company can make money." 00:20:48

3. Companies Identified

Sierra AI customer service company. Mentioned as the leading example of outcome-based pricing in AI — charges only when the AI resolves the issue without escalating to a human, with pricing tiered by problem complexity. Freda views this as the gold standard business model for AI applications.
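To make the contrast with per-token billing concrete, here is a minimal sketch of the two billing rules side by side. It is not Sierra's actual pricing logic: the tier labels, fee schedule, and token price are all hypothetical, and only the rule "charge when the AI resolves the issue, charge nothing on a human handoff, tier by complexity" comes from the description above.

```python
from dataclasses import dataclass


@dataclass
class Ticket:
    complexity: str       # hypothetical tier label: "simple" or "complex"
    resolved_by_ai: bool  # False means the ticket was handed to a human
    tokens_used: int


# Hypothetical fee schedule: dollars per AI-resolved ticket, by tier.
OUTCOME_FEES = {"simple": 1.00, "complex": 4.00}


def outcome_bill(tickets: list[Ticket]) -> float:
    """Charge only for tickets the AI resolved without a human handoff."""
    return sum(OUTCOME_FEES[t.complexity] for t in tickets if t.resolved_by_ai)


def per_token_bill(tickets: list[Ticket], price_per_million: float) -> float:
    """Charge for every token consumed, resolved or not."""
    return sum(t.tokens_used for t in tickets) * price_per_million / 1_000_000


tickets = [
    Ticket("simple", True, 4_000),
    Ticket("complex", True, 30_000),
    Ticket("complex", False, 60_000),  # escalated to a human: no outcome fee
]
print(outcome_bill(tickets), per_token_bill(tickets, 5.0))
```

Under the outcome rule, the escalated ticket burned the most tokens and earned the vendor nothing, so the vendor has a direct incentive to resolve issues with as few tokens as possible.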

"As a customer, I'm completely interest-aligned with Sierra — we both want to solve the problem, and we both want to minimize token burn. So it's actually an optimal solution." 00:10:45

Anthropic Foundation model company (OpenAI, Anthropic, ByteDance are portfolio companies of Ultimetry). Called out as the clearest example of a company that has broken prior TAM assumptions, grown from $1B to $5B ARR in under 6 months (faster than OpenAI at comparable stage), and effectively operates without a real sales team while serving 80% enterprise customers.

"Anthropic has about 3,000 employees. Traditional software companies might have revenue per employee of around $500K. Anthropic's revenue per employee should be well into the millions — that's an order of magnitude difference. And Anthropic doesn't even have a proper sales team." 00:28:43

Cursor AI coding tool. First company where Freda recognized the TAM miscalculation — Cursor's ARR rapidly hit ~$1B, which at the time represented ~10% of the entire estimated TAM, signaling the framework was completely wrong.

"The earliest signal was probably Cursor — its coding revenue very quickly shot up to maybe $1 billion, then Claude Code came along. At that point, $1B already represented 10% of our TAM estimate, which clearly wasn't normal. That was when a very timely rethinking was needed." 00:19:13

Sierra (again — worth noting separately as investment theme) Pioneer of outcome-based AI billing. Already validated by enterprise customer acceptance.

"Its clients are very accepting of this billing model — if AI helps you resolve the customer service issue without transferring to a human, I charge. If I transfer, I don't charge." 00:10:45

Harvey / Legora Legal AI companies. Called out as having crossed ~$1B ARR, among the first non-coding AI verticals to generate meaningful revenue.

"Legal, whether Harvey or Legora, has exceeded $1 billion." 00:50:42

Abridge / OpenEvidence Healthcare AI companies. Among the first vertical AI companies outside coding to generate ~$200M ARR.

"Healthcare, whether Abridge or OpenEvidence, has around $200 million." 00:50:13

AgentMail Early-stage startup building email infrastructure specifically for AI agents (not humans). Gmail's API rate limits are designed for human usage patterns; agents need something fundamentally different.

"There's a startup called AgentMail — it's an email service built specifically for agents to send emails." 00:56:47

Cerebras / Groq AI inference chip companies. Called out as "also big" and grouped with the startups that have reached billion-dollar revenue scale, representing meaningful scale in the AI infrastructure stack.

"Chip companies, whether Cerebras (about to IPO) or Groq, are also big — this covers almost all startups that have reached the billion-dollar revenue scale." 00:50:57

Together AI / Fireworks AI inference infrastructure companies. Both cited as having generated "several billion" in industry revenue collectively, representing the AI inference infrastructure layer.

"AI inference infrastructure — whether Together or Fireworks — the industry has several billion in revenue." 00:50:42

NVIDIA Called out specifically for its open-source autonomous driving model Alpamayo (rendered as "Alp Mayo" in the transcript), which Freda views as potentially becoming the Android of the automotive industry.

"The most interesting thing is NVIDIA, which has launched its own open-source model called Alpamayo. If this really takes off and becomes something like the Android system for automakers, the industry landscape will change. It won't be just one or two automakers doing autonomous driving; there will be a rapid, broad rollout." 00:59:07

4. People Identified

Dario Amodei (CEO, Anthropic) Repeatedly cited as the clearest strategic thinker on foundation model economics. His "negative snowball" framework, white-collar TAM slide ($30-40T), and distinction between technology diffusion and economic diffusion are the foundational frameworks Freda builds on.

"Going back to what Dario said — the entire global white-collar workforce. His very first slide in his investor pitch is the white-collar TAM — he says it's a $30-40 trillion market. That's what he's targeting." 00:18:00

Stanley Druckenmiller Referenced in context of AI-driven unemployment risk. His intellectual humility on the question — refusing to make definitive calls on whether AI will cause structural unemployment — is cited as a model for how investors should approach irreducible uncertainty.

"Someone recently asked Druckenmiller: will AI definitely cause unemployment and deflation? His exact words were: if you hold that view dogmatically, you're arrogant and not open-minded." 00:09:04

Geoffrey Hinton Referenced as a cautionary example of expert overconfidence on AI displacement timelines. In 2015-16, Hinton declared radiologists would be replaced by deep learning. Today, radiologist headcount and salaries are at all-time highs.

"Back in 2015-16, Hinton said we shouldn't be training any more radiologists because deep learning had already surpassed human radiologists. Yet today the number of radiologist positions hasn't decreased — it's actually hit new highs, with rising salaries." 00:12:40

Sergey Brin Called out as personally overseeing Google's coding AI efforts, signaling the highest-level strategic priority within Alphabet.

"Google has Sergey personally overseeing Coding." 00:14:35

5. Operating Insights

Redesign the Organization from a Relay Race into a Basketball Team

The current enterprise product development process (PM writes PRD → designer interprets → developer builds → QA tests → GTM launches, taking 6 months end-to-end) is a relay race where AI will sequentially eliminate each bottleneck, making the next step the new constraint. The correct response isn't to optimize each handoff — it's to redesign the entire structure.

"I think previously it might have been a relay race, passing the baton one by one. After this, it will be more like a small team sport — basketball, perhaps — with maybe 3-5 people in a small squad, with all the necessary skills within the team. This team should be able to make decisions directly, only escalating truly major issues upward." 00:42:45

Practical implication: QA should be embedded in development. PMs need to be more full-stack. The entire 6-month pipeline should collapse to weeks for most products.

Capture Decision Process Data, Not Just Decision Outcomes — This Is the Next Software Category

Current CRM records what decision was made (25% discount given) but not why (CFO's concern, which options were considered, who persuaded whom). This decision-process data has always been too unstructured to capture. AI can now process raw, unstructured information — making the capture of decision context both possible and enormously valuable.
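One way to picture the difference is as a data structure. The sketch below is purely hypothetical (the class, field names, and sample values are invented for illustration): a CRM today fills in only `outcome`, while the remaining fields hold the decision context that AI could extract from raw sources such as call transcripts and email threads.

```python
from dataclasses import dataclass, field


@dataclass
class DecisionRecord:
    outcome: str  # what a CRM stores today, e.g. the discount granted
    concerns: list[str] = field(default_factory=list)
    options_considered: list[str] = field(default_factory=list)
    persuaded_by: dict[str, str] = field(default_factory=dict)  # person -> persuader
    raw_sources: list[str] = field(default_factory=list)  # transcripts, emails

    def has_context(self) -> bool:
        """True when any of the 'why' behind the outcome was captured."""
        return bool(self.concerns or self.options_considered or self.persuaded_by)


# Outcome only, as recorded today.
crm_only = DecisionRecord(outcome="25% discount")

# Same outcome with hypothetical decision context attached.
with_context = DecisionRecord(
    outcome="25% discount",
    concerns=["CFO concern (details hypothetical)"],
    options_considered=["15% discount with longer term", "25% discount"],
    persuaded_by={"CFO": "account executive"},
    raw_sources=["call_transcript.txt"],
)
print(crm_only.has_context(), with_context.has_context())
```

The design choice worth noting is that `raw_sources` keeps pointers to the unstructured material rather than forcing everything into fixed fields, in line with the argument that AI can now work from raw information directly.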

"Everything about the decision-making process just disappears. Previously this couldn't be recorded because there was too much unstructured data. But in the AI era, you no longer need a compression tool — you can directly process this more raw information. This is potentially a significant wave of opportunity." 00:35:24

Build Infrastructure for Agents, Not Humans — Every Layer Needs to Be Rebuilt

Agent behavior patterns are fundamentally different from human behavior patterns. Gmail's API rate limits, browser authentication flows, payment compliance — all designed for humans — actively break agent workflows. This creates an entire infrastructure rebuild opportunity.

"I think all infrastructure going forward — whether browsers, identity, payments, compliance — will all need to be redone." 00:57:11

The practical signal, as framed in the episode: Discord (WebSocket-based, persistent real-time connection) suits agents well, while Slack (characterized as HTTP-based request-response) fits them poorly. Invest and build accordingly.

6. Overlooked Insights

Google's TPU Strategy Shift Is the Most Important Strategic Change in Its Last Decade — and the Market Is Missing It

Freda drops this almost as a parenthetical while discussing chip sector dynamics, but it's potentially massive. Google TPUs were historically internal-only. Google has now begun selling TPUs directly as hardware to external customers. This is not just a new revenue stream — it represents a fundamental strategic repositioning: Google is now competing directly with NVIDIA in the merchant silicon market, using its own proprietary architecture.

"Google's TPU has changed the most — it previously didn't sell externally at all. But now it has started selling TPUs directly to customers as hardware. I think this is the most important change in its strategy over the past decade." 00:03:53

Why this is underappreciated: If Google successfully opens TPU access to external customers, it (1) generates a new high-margin hardware revenue stream, (2) creates an alternative accelerator supply chain that reduces NVIDIA's pricing power over Google's own cloud customers, and (3) gives Google Cloud a differentiated silicon offering that few major cloud providers can match. Amazon's Trainium is also quietly scaling toward "independent chip company" revenue levels (a multibillion-dollar annualized run rate, per Amazon's earnings commentary), suggesting the trend is broad. The investor implication: custom silicon from hyperscalers may be a more credible long-term NVIDIA competitor than any pure-play chip startup.

Agentic B2B Commerce in Cross-Border Trade Is a Massive, Underdiscussed Opportunity

Freda dismisses B2C agentic commerce briefly (consumers enjoy browsing; AI just placing orders adds limited value), then makes a much stronger point about B2B cross-border commerce that gets almost no follow-up: the transaction complexity (volume-based pricing, import/export compliance, early-payment discount structures, certification requirements) makes it a perfect agent use case, and the market is enormous and highly fragmented.

"Cross-border e-commerce — say your company is in the US but you want to remotely purchase a batch of office furniture. There's an extremely broad information dimension: your order quantity, unit price, volume discounts, certifications, import/export compliance, payment terms, early-payment discounts... It's a very complex, very long chain with a lot of communication costs. Anywhere communication costs are high is naturally, extremely well-suited for agents." 00:58:11

Why this is underappreciated: B2B cross-border commerce is a multi-trillion dollar market that is still predominantly handled by email chains, spreadsheets, and human brokers. It has none of the consumer experience considerations that make B2C agentic commerce a weak value proposition. The friction is entirely in information asymmetry, compliance complexity, and negotiation — all things agents are specifically good at. No major VC-backed startup appears to be attacking this head-on yet.