Eric Jang
AI researcher known for work on robotics and deep RL; guest on the Dwarkesh Podcast, where he explained how AlphaGo works.
“Eric Jang, who came on to explain how AlphaGo works, did a similar thing when he was trying to build a very strong Go bot. And he had interesting observations about the kinds of, like, it's really good at just running an experiment and going down that path. But it's bad at stopping at dead ends.”
“The 18-month thing, I was actually surprised he brought it up first, because I had the same judgment independently.”
“I highly recommend John Schulman's general[ized] advantage estimation paper as, like, a good treatment on how to think about various ways to compute it.”
“A 10-layer neural network pass... 10 steps of reasoning... is able to amortize and approximate to a very high fidelity a nearly intractable search problem.”
“You can also use kind of a Karpathy-style auto-research hyperparameter tuning to make your architecture pretty good.”
“The beauty of how AlphaGo trains itself is that it actually can take this final search process, the outcome of the search process, and tell the policy network, hey, like, you know, instead of having MCTS do all this legwork to arrive here, why don't you just predict that from the get-go.”
“In 2020, there was an open source project called KataGo by David Wu from Jane Street, who basically achieved a 40x reduction in compute needed to train a really strong Go bot tabula rasa... This is what most Go practitioners today train against when they're playing an AI.”
AI-extracted from podcast / newsletter / paper summaries. May contain errors.
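
The quote about telling the policy network to "just predict that from the get-go" can be made concrete with a small sketch. This is not code from the podcast or from any AlphaGo release. It assumes the AlphaGo Zero-style recipe, where the MCTS visit counts at the root are normalized into a target distribution and the policy is trained toward it with a cross-entropy loss. The function names, the three-move toy position, the visit counts, and the learning rate are all hypothetical, and a bare list of logits stands in for the policy network.

```python
import math

def visit_counts_to_target(visit_counts, temperature=1.0):
    """Normalize MCTS root visit counts into a target distribution (pi ~ N^(1/T))."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [n ** (1.0 / temperature) for n in visit_counts]
    total = sum(scaled)
    if total == 0:
        raise ValueError("at least one move must have been visited")
    return [s / total for s in scaled]

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]

def policy_loss(logits, target):
    """Cross-entropy between the search target and the policy's prediction."""
    probs = softmax(logits)
    return -sum(t * math.log(p) for t, p in zip(target, probs) if t > 0)

def policy_loss_grad(logits, target):
    """Gradient of the cross-entropy w.r.t. the logits (target must sum to 1)."""
    probs = softmax(logits)
    return [p - t for p, t in zip(probs, target)]

# Hypothetical toy position with three legal moves; search visited move 1 most.
visit_counts = [10, 70, 20]
target = visit_counts_to_target(visit_counts)

# Stand-in "policy network": raw logits, initially uniform over the moves.
logits = [0.0, 0.0, 0.0]
loss_before = policy_loss(logits, target)

# Plain gradient descent pulls the policy toward what search found.
for _ in range(200):
    grad = policy_loss_grad(logits, target)
    logits = [x - 0.5 * g for x, g in zip(logits, grad)]

loss_after = policy_loss(logits, target)
print(softmax(logits), loss_before, loss_after)
```

After training, the stand-in policy puts most of its probability on the move that search preferred, without running any search. A real system would apply the same loss to a neural network over many positions.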