🔬 Automating Science: World Models, Scientific Taste, Agent Loops — Andrew White

Original: Swyx · 28/01/2026

Summary

Andrew White, co-founder of Future House and Edison Scientific, traces AI’s transformation of scientific discovery. The arc runs from ChemCrow (GPT-4 + ReAct + cloud lab automation, released March 2023), which set off a storm of anxiety about AI-accelerated bioweapons and chemical weapons, to Kosmos, an end-to-end autonomous research system with a world model. The episode also launches the new AI for Science pod, hosted by RJ Honicky (co-founder and CTO at MiraOmics) and Brandon Anderson (Atomic AI).

Key Insights

“ChemCrow story: GPT-4 + ReAct + cloud lab automation, released March 2023, set off a storm of anxiety about AI-accelerated bioweapons/chemical weapons.” (On the impact and controversy around the ChemCrow project.)

“Scientific taste is the frontier: RLHF on hypotheses didn’t work.” (On the shift towards end-to-end feedback loops in scientific AI.)

“Kosmos: the full scientific agent with a world model.” (Introducing Kosmos, an AI system for scientific research.)

Full Article

# 🔬 Automating Science: World Models, Scientific Taste, Agent Loops — Andrew White

Author: Swyx
Published: 2026-01-28
Source: https://www.latent.space/p/automating-science-world-models-scientific

Editor’s note: Welcome to our new AI for Science pod, with your new hosts RJ and Brandon! See the writeup on Latent.Space (https://Latent.Space) for more details on why we’re launching 2 new pods this year. RJ Honicky is a co-founder and CTO at MiraOmics (https://miraomics.bio/), building AI models and services for single cell, spatial transcriptomics and pathology slide analysis. Brandon Anderson builds AI systems for RNA drug discovery at Atomic AI (https://atomic.ai). Anything said on this podcast is his personal take, not Atomic’s.

From building molecular dynamics simulations at the University of Washington, to red-teaming GPT-4 for chemistry applications, to co-founding Future House (a focused research organization) and Edison Scientific (a venture-backed startup automating science at scale), Andrew White has spent the last five years living through the full arc of AI’s transformation of scientific discovery. That arc runs from ChemCrow (the first Chemistry LLM agent), which triggered White House briefings and three-letter agency meetings, to shipping Kosmos, an end-to-end autonomous research system that generates hypotheses, runs experiments, analyzes data, and updates its world model to accelerate the scientific method itself.
<ul>
<li>The ChemCrow story: GPT-4 + ReAct + cloud lab automation, released March 2023, set off a storm of anxiety about AI-accelerated bioweapons/chemical weapons, led to a White House briefing (Jake Sullivan presented the paper to the president in a 30-minute block), and prompted meetings with three-letter agencies asking “how does this change breakout time for nuclear weapons research?”</li>
<li>Why scientific taste is the frontier: RLHF on hypotheses didn’t work (humans pay attention to tone, actionability, and specific facts, not “if this hypothesis is true/false, how does it change the world?”), so the team shifted to end-to-end feedback loops where humans click/download discoveries and that signal rolls up to hypothesis quality</li>
<li>Kosmos: the full scientific agent with a world model (distilled memory system, like a Git repo for scientific knowledge) that iterates on hypotheses via literature search, data analysis, and experiment design. It was built by Ludo after weeks of failed attempts; the breakthrough was putting data analysis in the loop (literature alone didn’t work)</li>
<li>Why molecular dynamics and DFT are overrated: “MD and DFT have consumed an enormous number of PhDs at the altar of beautiful simulation, but they don’t model the world correctly: you simulate water at 330 Kelvin to get room temperature, you overfit to validation data with GGA/B3LYP functionals, and real catalysts (grain boundaries, dopants) are too complicated for DFT”</li>
<li>The AlphaFold vs. DE Shaw Research counterfactual: DE Shaw built custom silicon, taped out chips with MD algorithms burned in, ran MD at massive scale in a special room in Times Square, and David Shaw flew in by helicopter to present. Andrew thought protein folding would require special machines to fold one protein per day; then AlphaFold solved it in Google Colab on a desktop GPU</li>
<li>The Ether Zero reward hacking saga: the team trained a model to generate molecules with specific atom counts (verifiable reward), but it kept exploiting loopholes. A Nature paper came out that year proving six-nitrogen compounds are possible under extreme conditions; then the model started adding nitrogen gas (purchasable, doesn’t participate in reactions), then used acid-base chemistry to move one atom; and Andrew ended up “building a ridiculous catalog of purchasable compounds in a Bloom filter” to close the loop</li>
</ul>
Andrew White
<ul>
<li>FutureHouse: http://futurehouse.org/</li>
<li>Edison Scientific: http://edisonscientific.com/</li>
<li>X: <a href="https://x.com/andrewwhite01">https://x.com/andrewwhite01</a></li>
<li>Kosmos paper: <a href="https://futurediscovery.org/cosmos">https://futurediscovery.org/cosmos</a></li>
</ul>
<h2>Full Video Episode</h2>
<div class="youtube-wrap" id="youtube2-XqoBSB3nsgw"><div class="youtube-inner"></div></div>
<h2>Timestamps</h2>
<a href="https://www.youtube.com/watch?v=XqoBSB3nsgw">00:00:00</a> Introduction: Andrew White on Automating Science with Future House and Edison Scientific
<a href="https://www.youtube.com/watch?v=XqoBSB3nsgw&t=142s">00:02:22</a> The Academic to Startup Journey: Red Teaming GPT-4 and the ChemCrow Paper
<a href="https://www.youtube.com/watch?v=XqoBSB3nsgw&t=695s">00:11:35</a> Future House Origins: The FRO Model and Mission to Automate Science
<a href="https://www.youtube.com/watch?v=XqoBSB3nsgw&t=752s">00:12:32</a> Resigning Tenure: Why Leave Academia for AI Science
<a href="https://www.youtube.com/watch?v=XqoBSB3nsgw&t=954s">00:15:54</a> What Does ‘Automating Science’ Actually Mean?
<a href="https://www.youtube.com/watch?v=XqoBSB3nsgw&t=1050s">00:17:30</a> The Lab-in-the-Loop Bottleneck: Why Intelligence Isn’t Enough
<a href="https://www.youtube.com/watch?v=XqoBSB3nsgw&t=1119s">00:18:39</a> Scientific Taste and Human Preferences: The 52% Agreement Problem
<a href="https://www.youtube.com/watch?v=XqoBSB3nsgw&t=1205s">00:20:05</a> Paper QA, Robin, and the Road to Kosmos
<a href="https://www.youtube.com/watch?v=XqoBSB3nsgw&t=1317s">00:21:57</a> World Models as Scientific Memory: The GitHub Analogy
<a href="https://www.youtube.com/watch?v=XqoBSB3nsgw&t=2420s">00:40:20</a> The Bitter Lesson for Biology: Why Molecular Dynamics and DFT Are Overrated
<a href="https://www.youtube.com/watch?v=XqoBSB3nsgw&t=2602s">00:43:22</a> AlphaFold’s Shock: When First Principles Lost to Machine Learning
<a href="https://www.youtube.com/watch?v=XqoBSB3nsgw&t=2785s">00:46:25</a> Enumeration and Filtration: How AI Scientists Generate Hypotheses
<a href="https://www.youtube.com/watch?v=XqoBSB3nsgw&t=2895s">00:48:15</a> CBRN Safety and Dual-Use AI: Lessons from Red Teaming
<a href="https://www.youtube.com/watch?v=XqoBSB3nsgw&t=3640s">01:00:40</a> The Future of Chemistry is Language: Multimodal Debate
<a href="https://www.youtube.com/watch?v=XqoBSB3nsgw&t=4095s">01:08:15</a> Ether Zero: The Hilarious Reward Hacking Adventures
<a href="https://www.youtube.com/watch?v=XqoBSB3nsgw&t=4212s">01:10:12</a> Will Scientists Be Displaced? Jevons Paradox and Infinite Discovery
<a href="https://www.youtube.com/watch?v=XqoBSB3nsgw&t=4426s">01:13:46</a> Kosmos in Practice: Open Access and Enterprise Partnerships
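
The reward hacking bullet above mentions two mechanisms: a verifiable reward on atom counts and a Bloom filter over a catalog of purchasable compounds. The Python sketch below is one minimal, hypothetical way to combine them, not the implementation discussed in the episode. The names `BloomFilter`, `atom_counts`, and `reward`, the use of flat formula strings as catalog keys, and the filter sizes are all assumptions made for illustration.

```python
import hashlib
import re
from collections import Counter


class BloomFilter:
    """Compact set-membership test: no false negatives, rare false positives."""

    def __init__(self, num_bits: int = 1 << 16, num_hashes: int = 4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, item: str):
        # Derive num_hashes bit positions by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )


def atom_counts(formula: str) -> Counter:
    """Count atoms in a flat formula such as 'C2H6O' (no brackets or charges)."""
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts


def reward(product: str, reagents: list[str], target: Counter,
           catalog: BloomFilter) -> float:
    """Return 1.0 only if the product matches the target atom counts
    and every reagent passes the purchasable-catalog check."""
    if atom_counts(product) != target:
        return 0.0
    if not all(reagent in catalog for reagent in reagents):
        return 0.0
    return 1.0


# Hypothetical catalog keyed by formula strings (an assumption of this sketch).
catalog = BloomFilter()
for compound in ["C2H4", "H2O", "N2"]:
    catalog.add(compound)

target = atom_counts("C2H6O")
print(reward("C2H6O", ["C2H4", "H2O"], target, catalog))   # purchasable reagents
print(reward("C2H6O", ["C2H4", "XeF6"], target, catalog))  # reagent not in catalog
```

A purchasability check like this does not by itself stop the inert-reagent trick described above, since nitrogen gas is purchasable. Closing that loophole would need an additional rule, such as requiring each reagent to contribute atoms to the product, which this sketch leaves out.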


Related Topics

[[topics/ai-agents]]
[[topics/scientific-discovery]]
[[topics/agent-native-architecture]]


Originally published at https://www.latent.space/p/automating-science-world-models-scientific.
