Matt Bornstein | AI + a16z Summary

Podcast: AI + a16z | Guests: Scott Chacon (CEO, GitButler; Co-founder, GitHub) & Matt Bornstein (GP, a16z)

1. Key Themes

Git Was Never Designed — and That's Now a Critical Problem

Git's CLI was never intentionally designed as a user interface. It began as Unix plumbing commands by Linus Torvalds' team, with a volunteer's Perl scripts accidentally becoming the default interface. This lack of intentional design has compounded over 20 years, and now — with agents as a new and dominant "persona" — the mismatch is acute.

"It's just the user interface that we want to inject some taste and say, here's a way that we think people are trying to use Git and make it easy to do." [00:05:59]

"Git started as plumbing commands for the Linux kernel team. Unix primitives meant to be wrapped in whatever scripts each developer preferred. A volunteer wrote a unified interface. It got pulled into core. And for 20 years, almost nothing has changed." [00:00:55]

Agents Are a Fundamentally New Persona That Current Tooling Ignores

Coding agents behave like neither humans nor traditional scripts. They run status after every command, prefer human-readable output to JSON (when handed JSON, they compensate by piping it through jq or writing Python scripts to extract the one field they need), and benefit from next-step hints baked into CLI output. Per Scott, existing tooling was not built with this persona in mind.
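
The behavior described above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `Repo` class, the `run` function, the command names, and the hint strings are hypothetical and are not GitButler's actual CLI or output format. The sketch only shows the idea of appending a status summary and a next-step hint to the output of commands that change state.

```python
from dataclasses import dataclass, field

# Hypothetical command names; not GitButler's actual CLI surface.
MUTATING = {"commit", "branch"}

# Hypothetical next-step hints keyed by command.
HINTS = {
    "commit": "push the branch or keep committing",
    "branch": "make changes, then commit",
}


@dataclass
class Repo:
    """Toy in-memory repository state used only for this illustration."""
    branch: str = "main"
    commits: list = field(default_factory=list)

    def status(self) -> str:
        return f"on {self.branch}, {len(self.commits)} commit(s)"


def run(repo: Repo, command: str, arg: str = "", status_after: bool = True) -> str:
    """Run a toy command and return human-readable, agent-oriented output."""
    if command == "commit":
        repo.commits.append(arg)
        lines = [f"committed: {arg}"]
    elif command == "branch":
        repo.branch = arg
        lines = [f"switched to {arg}"]
    elif command == "status":
        return repo.status()
    else:
        return f"error: unknown command {command!r}; try: status, commit, branch"

    # Agents tend to ask for status next, so include it in the same response.
    if status_after and command in MUTATING:
        lines.append(f"status: {repo.status()}")
    # Suggest a likely next step, which a script-oriented CLI would not do.
    if command in HINTS:
        lines.append(f"hint: {HINTS[command]}")
    return "\n".join(lines)
```

Calling `run(Repo(), "commit", "fix typo")` returns three lines (the result, the status, and a hint), which saves the agent a follow-up status call.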

"We added a dash dash status after to all the mutable commands because we're like, you're going to run this next. So we might as well just give you that as the output, right? That's stuff you would never do for scripting. You would never do for Unix philosophy. You'd never do for humans really, right? But agents really want it." [00:14:52]

"It turns out that the agents liked just actually getting the human data because they would kind of compensate by piping it through jq or writing Python scripts to get the one piece of data that they want out of it." [00:14:24]

The Next Developer Superpower Is Communication, Not Coding

As implementation becomes commoditized by agents, the bottleneck shifts to the quality of specifications, write-ups, and inter-team communication. Developers who can write clearly and describe what they want precisely will dramatically outproduce those who cannot.

"The software developers that will be the best producers of product in the near future are the ones who can communicate, the ones who can write, the ones who can describe. That is, I think, the next superpower." [00:32:48]

"The why rather than the how becomes, I think, more and more valuable as the how becomes cheaper." [00:33:44]

2. Contrarian Perspectives

Agent-to-Agent Chat Channels Actually Hurt Performance

Scott's team built and tested a real-time communication channel between concurrent agents, something that sounds obviously valuable. The result was unambiguous: communication overhead slowed them down, and the agents were already inferring context from shared file state more efficiently than explicit messaging.

"We put it like through, very sadly, we were all devastated. It does not help. They will see that something else is happening. They'll figure out why. Like it'll be like, looks like somebody's working on this, some other feature because they added this stuff. So I'm going to leave that alone. And it's faster, right? Because they don't have the overhead of the communication." [00:20:05]

Agents Prefer Human-Readable Output Over Machine-Optimized Formats Like JSON

The conventional wisdom is to give machines machine-friendly data. Scott's empirical testing showed the opposite: agents preferred human-formatted output and wrote their own parsing scripts around JSON, suggesting that LLMs' flexible input processing inverts the traditional design assumption entirely.

"We thought it would love JSON and it doesn't like JSON that much. And so like, how do you optimize for what an agent really wants? Because we can put in stuff like, guess what I think the agent's next step is going to be and give it some context, some extra context that we wouldn't give a human." [00:15:49]

The "Next GitHub" Won't Look Anything Like GitHub

Scott argues that asking "what's the next GitHub?" is the wrong question — just as GitHub didn't look like SourceForge. The implication is that incumbents (including GitHub) cannot pivot fast enough, and the winner will be something entirely unfamiliar in form.

"Whatever is the next GitHub is going to be the same problem set, right? It's not going to look like GitHub. I think GitHub will be more like a SourceForge or something. The entire programming community changed so fast. And I think GitHub took advantage of that and was able to grow and provide tooling that nobody had, and nobody was really thinking about." [00:25:14]

Most Teams Are Massively Underutilizing Agents — Running More Agents Is the Wrong Optimization

Rather than parallelizing more agents, Scott argues the bottleneck is the quality and quantity of specs and write-ups being fed to agents. Teams are "not spending enough tokens" not because of capacity but because spec-writing and coordination haven't kept pace.

"I almost feel — I don't know if there's a phrase for this — but you're not spending enough tokens, right? Like you're not having enough things working at the same time. The problem set becomes training a team to be good at writing, right? And figuring out what that coordination looks like to decide which of these series of write-ups describes the product that we want to have." [00:35:32]

Code Review Is Already Mostly Theater — and Agents Expose This

Scott points out bluntly that developers rarely do serious code review today, and suggests that agent-assisted review (pull it, run it, test it, surface a short list of concerns) would be materially better than the current PR-based culture.

"If you ask almost any software developer, when you do code review, do you really read the whole PR? Like, do you go through every line and think it through? Do you pull it down and test it out and then leave the good feedback on each line? Agents are very good at that, right?" [00:00:00]

3. Companies Identified

GitButler: A Git-compatible developer tool reimagining the Git user interface for both humans and coding agents, featuring parallel branch management, multi-modal interfaces (GUI, TUI, CLI), and agent-optimized output formats. Why mentioned: Core subject of the episode; building agent-native developer tooling on top of Git's storage layer without replacing it.

"With GitButler, you can go back and forth between Git and GitButler. We don't want it to be Jujutsu where you have to have some co-located thing that's a really different way of doing stuff. We want it to be a Git compatible tool." [00:11:09]

GitHub: The dominant global code collaboration platform co-founded by Scott Chacon, now owned by Microsoft. Why mentioned: Both a reference point for the scale of the opportunity ahead and a cautionary tale about incumbents' ability (or inability) to pivot in paradigm shifts.

"It's a behemoth. That is both its advantage and disadvantage. Its advantage is it has everyone in the world using it. And so whatever it does, it does at scale, which is awesome. The question is, do they care enough to do that? Or do they have the vision to do that?" [00:23:50]

4. People Identified

Scott Chacon: Co-founder of GitHub, author of Pro Git (the definitive reference book on Git), CEO of GitButler, based in Berlin. Why mentioned: Rare combination of deep technical Git expertise and product vision; uniquely positioned to rethink the foundational developer tooling layer for the agentic era.

"I started building some stuff and realized that the tooling for Git hasn't changed since I left. I was approached by Apress to write a third edition of the book and I was like, why? It hasn't changed. It's exactly the same." [00:03:15]

Kirill: Co-founder of GitButler (last name not mentioned). Why mentioned: Showed Scott an example of two agents autonomously stacking branches when competing over the same file, a novel emergent behavior in multi-agent workflows.

"My co-founder Kirill today was showing me this thing where the two agents had, they were both trying to sort of vie for the same file and edit it in a way that wasn't really compatible. And so one stacked their branch on top of the other one and then they kept working and committing to their part of the stack." [00:21:26]

Linus Torvalds: Creator of Linux and Git. Why mentioned: Illustrates the philosophical DNA of Git, which was designed for performance, not usability, and is deliberately indifferent to collaboration primitives like PRs and issues.

"Linus has talked about us, right, where he's like, 'They're a good host. I hate PRs, and I hate issues, I hate everything else they have, but abuse them as a host if you want to.'" [00:05:04]

5. Operating Insights

Build Agent Personas the Way You Build User Personas — Then Test Empirically

Scott's team discovered that guessing what agents want (e.g., JSON output) was wrong. The correct process: instrument your CLI, observe actual agent behavior across tool calls, identify failure patterns, and iterate. The insight is that agents can even self-report their struggles when asked.
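
The observe-and-iterate step can be sketched as a small log analysis. This is an illustration under assumptions, not the team's actual instrumentation: the `ToolCall` record, its fields, and the `struggle_report` function are hypothetical names, and the 50-call window simply mirrors the "last 50 tool calls" phrasing in the episode.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class ToolCall:
    """One recorded CLI invocation by an agent (hypothetical log schema)."""
    command: str           # e.g. "status", "commit"
    exit_code: int         # 0 means success
    retried: bool = False  # True if the agent immediately re-ran a variant


def struggle_report(calls, window=50):
    """Rank commands by failure ratio over the last `window` calls.

    Returns (command, failures, total) tuples for commands with at least
    one failure, worst ratio first.
    """
    recent = calls[-window:]
    total = Counter(c.command for c in recent)
    failed = Counter(c.command for c in recent if c.exit_code != 0 or c.retried)
    report = [(cmd, failed[cmd], total[cmd]) for cmd in total if failed[cmd]]
    # Highest failure ratio first; ties broken by command name for stable output.
    return sorted(report, key=lambda r: (-r[1] / r[2], r[0]))
```

A report like this points at which commands to inspect first; the qualitative follow-up (asking the agent what it struggled with) still has to happen separately.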

"You really have to dig in. You have to start asking it, look through the last 50 tool calls that you did — what did you struggle with? What had errors, what did not do what you expected? And weirdly, it will kind of tell us. And we can kind of work on the skill files and figure it out. But it's a new era of trying to figure out usability." [00:16:17]

Use Prototype + Spec Loops Instead of Pure Spec or Pure Vibe Coding

Scott describes a highly productive workflow: write a spec, build a prototype with agents, test it, update the spec based on what you learn, repeat. This avoids both the ambiguity of spec-only work and the drift of pure vibe coding — and produces alignment across the team through show-and-tell rather than document review.

"Every time I have a decision, I just make it, build it. Then I try it out and then I go back to the spec and I fix the spec and I tell it, okay, do it again. That's really nice because I don't have to spend all my time implementing it. And I don't have to just have a writeup that I'm trying to convince you to read. I can have show and tell all the time." [00:36:29]

Triage Code Review Like an ER — Not All Code Deserves the Same Human Scrutiny

Scott's team uses a red-wristband triage model: high-stakes API boundaries get intensive human review, while lower-risk UX changes are vibe coded, run through the eval loop, and shipped if nothing breaks. This raises throughput without relaxing scrutiny where failures are costly.
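
One way to picture this triage is a path-based classifier. The sketch below is an assumption for illustration only: the tier names, the path patterns, and the rule that unknown paths default to human review are all hypothetical, and the episode does not describe how GitButler actually assigns review levels.

```python
from fnmatch import fnmatch

# Hypothetical path patterns; a real team would define its own risk boundaries.
TIERS = [
    ("human-review", ["api/*", "storage/*", "*/migrations/*"]),
    ("agent-review", ["ui/*", "docs/*"]),
]
DEFAULT_TIER = "human-review"  # unknown paths get the stricter treatment


def tier_for_path(path: str) -> str:
    """Return the review tier of the first pattern group that matches."""
    for tier, patterns in TIERS:
        if any(fnmatch(path, pattern) for pattern in patterns):
            return tier
    return DEFAULT_TIER


def tier_for_change(paths) -> str:
    """A change is as risky as its riskiest file."""
    tiers = {tier_for_path(p) for p in paths} or {DEFAULT_TIER}
    return "human-review" if "human-review" in tiers else "agent-review"
```

Under these assumed patterns, a change touching only `ui/` files would go through the agent-driven path, while any change that also touches `api/` escalates to human review.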

"If it's really — if something goes wrong, it's bad — then that's very human. Like we look at stuff, we handwrite stuff. There are sort of other levels of triage where it's not that important. If they're just calling those APIs and it's a UX problem, we refine vibe code that, right? Run it through the eval loop, it doesn't break anything, it makes things 10% faster — ship it." [00:31:22]

6. Overlooked Insights

Appending Agent Transcripts to Git Commits Is a Coming Data Infrastructure Crisis

Scott briefly mentions a metadata system that attaches agent thinking logs and chat transcripts to commits or branches. This is mentioned almost as an aside, but it signals a massive and underappreciated infrastructure problem: as agent-assisted development becomes standard, the provenance metadata (why was this code written, what was the agent thinking, what prompts drove this decision) will balloon into a data engineering challenge that current Git infrastructure is entirely unprepared for.

"We're trying to figure out a metadata system — add transcripts to commits or branches or whatever, right? And Git can't really do that very well. It becomes a big data problem. You'd be surprised how quickly that balloons, right? If you're really trying to keep every transcript, every tool call, or everything that the LLM was thinking — like it becomes a really, really big data problem very fast, even on small projects." [00:40:15]

This is investable: a provenance and audit layer for AI-generated code, one that makes agent reasoning searchable, reviewable, and attached to the version graph, could become foundational infrastructure for software teams within 3-5 years.
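
To make the storage problem concrete, here is a minimal sketch of one possible approach: keep transcripts outside Git in a compressed, content-addressed store and link them to commits by ID. This is an assumption for illustration, not the metadata system Scott's team is building; `TranscriptStore` and its methods are hypothetical names, and the store is in-memory only.

```python
import hashlib
import zlib


class TranscriptStore:
    """Toy content-addressed store for agent transcripts, kept outside Git."""

    def __init__(self):
        self.blobs = {}      # digest -> compressed transcript bytes
        self.by_commit = {}  # commit id -> list of digests

    def attach(self, commit_id: str, transcript: str) -> str:
        """Store a transcript and link it to a commit; return its digest."""
        data = transcript.encode("utf-8")
        digest = hashlib.sha256(data).hexdigest()
        # Identical transcripts are stored once, however many commits cite them.
        self.blobs.setdefault(digest, zlib.compress(data))
        refs = self.by_commit.setdefault(commit_id, [])
        if digest not in refs:
            refs.append(digest)
        return digest

    def transcripts_for(self, commit_id: str):
        """Return all transcripts linked to a commit, in attachment order."""
        return [
            zlib.decompress(self.blobs[d]).decode("utf-8")
            for d in self.by_commit.get(commit_id, [])
        ]

    def stored_bytes(self) -> int:
        """Total compressed size, useful for watching how fast the data grows."""
        return sum(len(blob) for blob in self.blobs.values())
```

Only the short digest would need to travel with the commit itself; search, retention, and access control over the transcripts remain open problems that this sketch does not address.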

The CRDT-Based Continuous Working Directory Recorder Was Abandoned for Humans But Is Ideal for Agents

Scott mentioned, then quickly moved past, the fact that GitButler's first version continuously recorded every file buffer save using CRDTs, producing a full scrubbable timeline of the working directory. The team killed it because the UI exposed too much information for humans, but Scott himself noted it could be re-implemented for agents. The capability is genuinely novel: letting an agent rewind the working directory to an arbitrary point in time is exactly the kind of undo/recovery primitive that would make agentic coding safer to run autonomously at scale.

"We did like a CRDT based version where it was just constantly recording every sort of file buffer save, right? Like everything you ever changed so you could just take a timeline and scrub it back to any point in your working directory. It wasn't even that much data, but the user interface was too much information for humans. It'd almost be interesting to re-implement some of that and have agents take a look — if you want to get the entire working directory back to what it was 27 minutes ago, here you go." [00:41:43]

No one in the conversation pressed on this. But continuous-state working directory snapshots, resurrected as an agent-facing primitive, could be a core safety and debugging layer for autonomous coding workflows — and is currently an open product opportunity.
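
The rewind primitive can be sketched with a plain append-only log of saves. Note the substitution: this is not a CRDT and not GitButler's original implementation, just a single-writer event log that shows what "get the working directory back to what it was 27 minutes ago" could look like. `WorkdirTimeline` and its methods are hypothetical names.

```python
import bisect


class WorkdirTimeline:
    """Records every file save and rebuilds the working directory at a past time."""

    def __init__(self):
        self.times = []   # save timestamps in seconds, ascending
        self.events = []  # (path, content or None for deletion), parallel to times

    def record(self, timestamp: float, path: str, content):
        """Append one save event; pass content=None to record a deletion."""
        if self.times and timestamp < self.times[-1]:
            raise ValueError("saves must be recorded in time order")
        self.times.append(timestamp)
        self.events.append((path, content))

    def state_at(self, timestamp: float) -> dict:
        """Replay all saves up to and including `timestamp`."""
        end = bisect.bisect_right(self.times, timestamp)
        files = {}
        for path, content in self.events[:end]:
            if content is None:
                files.pop(path, None)
            else:
                files[path] = content
        return files
```

An agent holding the current time `now` in seconds could call `state_at(now - 27 * 60)` to recover the file contents from 27 minutes earlier. Replaying from the start is linear in the number of saves; a real system would likely add periodic checkpoints.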