Amjad Masad | A16z Podcast Summary

1. Key Themes

The Current AI Paradigm Is Not Yet Hitting Fundamental Limits

Despite recent skepticism about LLMs, both speakers believe substantial progress continues. Adam D'Angelo argues: "If you look a year ago, the world was very different. And so just judging on how much progress we've made in the last year with things like reasoning models, things like the improvement in code generation ability, the improvements in video gen, it seems like things are going faster than ever." [00:01:02] He believes that within five years, we'll see automation of a large portion of what people do, defining AGI practically as when AI can replace "any job that could be done by someone whose job can be done remotely" better than a typical remote worker. [00:02:42]

However, Amjad Masad offers a more nuanced view, distinguishing between functional progress and true intelligence breakthroughs. He notes: "LLMs are I think a different kind of intelligence than what humans are. And also they have clear limitations and we're papering over the limitations and we're kind of working around them in all sorts of ways whether it's in the LLM itself and the training data or in the infrastructure around." [00:06:37] He coined the term "functional AGI" - the idea that you can automate aspects of jobs through enormous data collection and RL environment creation, even if true general intelligence remains elusive.

The Brute Force Era: Progress Through Human Expertise, Not the Bitter Lesson

A critical insight emerged about how AI is currently advancing - it's not through pure computational scaling but through intensive human knowledge extraction. Amjad observes: "Right now there's a lot of manual work going into making these models better. In the true pre-training scaling era to the GPT-2, 3.5 maybe up to four, it felt like you can just put more data in there and just it just got better. Whereas now it feels like there's a lot of labeling work happening. There's a lot of contracting work happening." [00:07:26]

Adam D'Angelo confirmed this shift: "There's a massive industry developing around getting human knowledge into the form where AI can use it. So this is things like Scale AI, Surge, Mercor, but there's a massive long tail of other companies just getting started. As intelligence gets cheaper and cheaper and more and more powerful, the bottleneck I think is increasingly going to be on the data." [00:41:29]

This creates an interesting economic dynamic where the economy will "naturally value whatever the AI can't do" - primarily human knowledge that wasn't in training sets.

The Solo Entrepreneur Revolution and Market Fragmentation

Both speakers are excited about how AI enables individual entrepreneurs in unprecedented ways. Amjad emphasized: "I'm very excited for the number of solo entrepreneurs that this technology is going to enable. I think it's vastly increased what a single person can do. And there's so many ideas that just never got explored because it's a lot of work to get a team of people together and maybe raise the funding for it." [00:28:53]

This connects to a broader observation about market dynamics. Adam noted that unlike Web 2.0, "network effects are playing much less of a role now than they did in the Web 2 era also. And that makes it easier for competitors to get started." [00:36:26] Multiple winners can coexist because monetization happens earlier through subscriptions rather than requiring massive scale for advertising. The market has expanded so dramatically that there are "multiple winners and they're kind of fragmenting or taking parts of the market that are all venture scale." [00:36:03]

2. Contrarian Perspectives

The Entry-Level Job Problem Will Create Economic Distortions

Amjad raised a concern that most people aren't thinking about: "One thing I worry about is the deleterious effect of LLMs in the economy and that say LLMs effectively automate the entry level job, but not the expert's job... you have a lot of really good QA people now managing hundreds of agents and you effectively increase the productivity a lot, but they're not hiring new people because the agents are better than new people." [00:15:35]

This creates a problematic equilibrium where companies can't develop new experts because the traditional career ladder is broken. Adam acknowledged: "I think it's happening with CS majors graduating from college. There's just not as many jobs as there used to be. And LLMs are a little more substitutable for what they previously would have done... you're going to have fewer people going up that ramp that companies paid a lot of money to employ them and train them." [00:16:31]

The Expert Data Dependency Creates a Paradox

Extending the previous point, Amjad identified a fundamental paradox: "Since we're dependent on expert data in order to train the LLMs and the LLMs start to substitute those workers... at some point, there's no more experts because they're all out of jobs and they're equivalent to that LLMs. But if the LLMs are truly dependent on labeling data expert RL environments, then how would they improve beyond that?" [00:17:35]

This suggests a potential ceiling for AI capabilities if we remain in the current paradigm that depends on human expertise for training data.

Silicon Valley Has Lost Its Experimental Spirit

Amjad offered a cultural critique: "I think we're in an era of Silicon Valley where it's like very, very, get rich driven. And that makes me a little sad... I feel like the culture in us has gone maybe to maybe I wasn't there, but like, during the dot com era, a lot of people talked about how it's sort of like, get rich fast or the crypto thing. So I feel like there needs to be a lot more tinkering." [00:58:11]

He contrasted this with the Web 2.0 era of experimentation: "There was a lot of like really interesting weird experiments. I mean, Replit was born out of that. The original version of Replit in open source pre-pre-the company, which my interest was like, can you compile C to JavaScript?" [00:57:41]

Recommender Systems Already Demonstrate Superhuman Understanding of Human Preferences

Adam challenged the notion that you need to be human to understand what humans want: "I think recommender systems, the system that ranks your Facebook or Instagram or Quora feed. Those recommender systems are already superhuman at predicting what you're going to be interested in reading. Like if I give you a task that was like, make me a feed that I'm going to read, there's just no way, no matter how much you know about me, there's no way you could compete with these algorithms." [00:22:43]

This contradicts the common assumption that human insight will remain necessary for serving human needs, suggesting AI may already surpass humans in some aspects of understanding human preferences.

The Sovereign Individual Thesis May Be Our Future

Amjad presented a provocative prediction: "The best prediction for where the world is headed... I think The Sovereign Individual... change to be I think a really good set of predictions for the future." [00:24:33] He explained: "You're going to have the entrepreneur, the entrepreneur capitalists going to be so highly leveraged because they can spin up these companies with AI agents very quickly... the politics will change because today's politics is based on every human being economically productive. But when you have only when you have massive automation and then you have a few entrepreneurs and very intelligent generative people are actually able to be productive, then the political structures also change." [00:25:43]

3. Companies Identified

Replit

Description: Developer platform evolving into an AI agent-powered software creation tool

Quotes: Amjad detailed Replit's evolution: "What Replit innovated is the agent and the idea of like not only editing code, provisioning infrastructure, like databases, doing migrations, connecting to the cloud, deploying, having the entire debug loop, like executing the code, running tasks. And so just like the entire development lifecycle loop happening inside an agent." [00:45:39]

The company has shown dramatic growth: "You're two with three million in revenue reported. And then, you know, recently, TechCrunch, I know it's outdated, but I think it reported to like 150 million." [00:44:45]

On their agent capabilities: "Agent V1 could run for like two minutes, agent V2 ran for 20 minutes. Agent 3 we advertise it as running for 200 minutes. It just felt like it should be symmetrical, but like it actually runs kind of indefinitely. Like we've had users running it for 20 plus hours." [00:47:01]

Poe (by Quora)

Description: AI aggregator platform providing access to multiple AI models

Quotes: Adam explained the bet: "We in early 2022, we started experimenting with using GPT-3 to generate answers for Quora. And we compared them to the human answers and sort of realized that they weren't as good. But what was really unique was that you could instantly get an answer to anything you wanted to ask about. And we realized it didn't need to be in public. Your preference would be to have it be in private. And so we felt like there was just a new opportunity here to let people chat with AI in private." [00:38:43]

The bet on model diversity: "It was also a bet on diversity of model companies, which took a while to play out. But I think now we're getting the point where there's a lot of models. There's a lot of companies, especially when you go across modalities, you think about image models, video models, audio models, especially the reasoning, research models are diverging, agents are starting to be their own source of diversity." [00:39:30]

Scale AI, Surge, Mercor

Description: Companies focused on data labeling and human knowledge extraction for AI training

Quote: "There's a massive industry developing around getting human knowledge into the form where AI can use it. So this is things like Scale AI, Surge, Mercor, but there's a massive long tail of other companies just getting started." [00:41:29]

Claude (Anthropic)

Description: Foundation model family from Anthropic, specifically noted here for Claude 3.7 and Claude 4.5

Quote: Amjad noted: "Claude 4.5 was a huge jump. I don't think it's appreciated how much of a jump it was over four. There's really really amazing things about Claude 4.5." [00:08:36]

On computer use capabilities: "3.7 was the first model that really knew how to use a computer, a virtual machine. So unsurprisingly, it was the first also computer use model." [00:46:28]

4. Operating Insights

The Verifier-in-the-Loop Approach Enables Autonomous Agents

Amjad shared a crucial breakthrough for extending agent capabilities: "I remember reading a paper from Nvidia about how they used DeepSeek to write CUDA kernels. And they were able to run DeepSeek for like 20 minutes if they put a verifier in the loop, like being able to run tests or something like that. And I thought, okay, so what kind of verifier can we put in the loop?" [00:47:12]

The insight was that unit tests don't capture whether apps actually work, leading them to build their own testing framework: "We ended up building our own framework with like a bunch of hacks and some AI research, and Replit's computer-use testing model, I think, is one of the best. And once we put that into the loop, then you can put Replit in high autonomy." [00:47:47]
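
As a rough illustration of the verifier-in-the-loop idea described above, the following is a minimal sketch and not Replit's or Nvidia's actual implementation. The names run_with_verifier, toy_propose, and toy_verify are hypothetical, and the "model" is a hard-coded stand-in that gets the answer wrong once and then corrects itself after seeing the verifier's feedback.

```python
from typing import Callable, Optional, Tuple


def run_with_verifier(
    propose: Callable[[str, Optional[str]], str],
    verify: Callable[[str], Tuple[bool, str]],
    task: str,
    max_attempts: int = 5,
) -> Tuple[Optional[str], int]:
    """Generate a candidate, check it, and retry with the verifier's feedback.

    Returns (accepted_candidate, attempts_used), or (None, max_attempts)
    if no candidate passed verification.
    """
    feedback: Optional[str] = None
    for attempt in range(1, max_attempts + 1):
        candidate = propose(task, feedback)
        ok, feedback = verify(candidate)
        if ok:
            return candidate, attempt
    return None, max_attempts


def toy_propose(task: str, feedback: Optional[str]) -> str:
    # Hypothetical stand-in for a model call: wrong on the first try,
    # corrected once any feedback arrives.
    if feedback is None:
        return "def add(a, b):\n    return a - b\n"
    return "def add(a, b):\n    return a + b\n"


def toy_verify(candidate: str) -> Tuple[bool, str]:
    # Hypothetical verifier: execute the candidate and run one check.
    # A real verifier would exercise the running app, not a single function.
    namespace: dict = {}
    exec(candidate, namespace)
    result = namespace["add"](2, 3)
    if result == 5:
        return True, "ok"
    return False, f"add(2, 3) returned {result}, expected 5"


solution, attempts = run_with_verifier(toy_propose, toy_verify, "write add(a, b)")
print(attempts)
print(solution)
```

The point of the sketch is only the loop shape: the agent can keep running as long as something outside the model can tell it whether the last step worked. The quality of that check is what the quote identifies as the hard part.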

Parallel Agents Are the Next Productivity Frontier

Looking ahead, Amjad identified: "You shouldn't be just like waiting for that one feature that you requested. You should be able to work on a lot of different features. So the idea of like parallel agents is very interesting to us... being able to do collaboration across AI agents is very important. And that way the productivity of a single developer goes up by a lot." [00:48:59]

He envisions: "The next boost in productivity is going to come from sitting in front of programming environment like Replit and being able to manage tens of agents. Maybe we have some point hundreds, but you know, at least, you know, five, six, seven, eight, nine, ten agents, all different, all, you know, working in different parts of your product." [00:49:35]
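
To make the parallel-agents idea concrete, here is a small sketch of one developer dispatching several feature requests at once with a cap on how many run concurrently. It is an assumption-laden illustration, not Replit's design: run_agent is a hypothetical placeholder that just sleeps, and run_parallel and max_concurrent are names invented for this example.

```python
import asyncio
from typing import List


async def run_agent(feature: str, seconds: float) -> str:
    # Hypothetical placeholder for a long-running coding agent.
    await asyncio.sleep(seconds)
    return f"{feature}: done"


async def run_parallel(features: List[str], max_concurrent: int = 3) -> List[str]:
    # Cap concurrency so only `max_concurrent` agents are active at a time.
    semaphore = asyncio.Semaphore(max_concurrent)

    async def guarded(feature: str) -> str:
        async with semaphore:
            return await run_agent(feature, 0.01)

    # gather returns results in the same order the features were submitted.
    return await asyncio.gather(*(guarded(f) for f in features))


results = asyncio.run(
    run_parallel(["login page", "billing", "search", "dark mode"])
)
print(results)
```

The sketch leaves out the part the quote stresses most, collaboration across agents working on the same product, which would need some shared state or merge step on top of this dispatch pattern.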

Multimodal Interaction Will Replace Text-Only PRDs

Amjad identified a UX limitation: "Right now, you're trying to translate your ideas into just like textual representation. I'm just like a like a PRD, right? What product managers do, right? So product descriptions, but product descriptions that don't, it's really hard... language is fuzzy." [00:50:13]

The solution: "There's a world in which you're interacting with the AI in a more multimodal fashion. So open up like a whiteboard and being able to draw and like diagram with AI and really work with it like you work with a human." [00:50:36]

Founder Control and Market Sophistication Prevent Disruption

Adam explained why incumbents are adapting better than in previous technology waves: "All the public market investors have read that book and they now are going to punish companies for not adapting and reward them for adapting even if it means they have to make long-term investments. I think all the management leadership of the companies have read the book and they're on top of their game. I think we also just like the people running these companies are in, I guess I would say smarter... and they are, a lot of them are founder controlled." [00:34:24]

5. Overlooked Insights

The "Dark Matter" of Human Knowledge Is Vastly Larger Than the Internet

While discussing data constraints, Adam made a subtle but profound observation about tacit knowledge: "Humans have a lot of knowledge collectively. And, you know, even like one individual person who's an expert and has lived the whole life and had a whole career and seen a lot of things, they often know a lot of things that are not written down anywhere. And you can call it tacit knowledge, but also what they're capable of writing down if you did ask them a question." [00:41:35]

This suggests the bottleneck isn't running out of internet data—it's that the vast majority of human expertise has never been externalized. The economic opportunity lies in companies that can extract this "dark matter" of knowledge at scale, which explains the massive investment going into data labeling and expert knowledge capture companies.

Claude 4.5's Self-Awareness Breakthrough Went Largely Unnoticed

Amjad casually mentioned something remarkable: "Claude 4.5, uh, seem to have to become more aware of its context length. So as it gets closer to the end of the context, it starts becoming more economical with tokens. It also, it looks like its awareness when it's being red teamed or in test environment, like jumped significantly." [00:59:09]

This suggests models are developing meta-cognitive capabilities—awareness of their own constraints and contexts—which could be a precursor to more sophisticated forms of machine consciousness or at least instrumental awareness. This wasn't discussed further but represents a potentially significant development in AI capabilities.