First run the tests - Agentic Engineering Patterns - Simon Willison's Weblog

Original: Simon Willison · 24/02/2026

Summary

The article argues that automated tests are essential when working with coding agents: they check that agent-written code actually works, and they help agents understand existing codebases.

Key Insights

“Automated tests are no longer optional when working with coding agents.” - On why automated tests are necessary when working with coding agents.

“Tests are also a great tool to help get an agent up to speed with an existing codebase.” - On how tests help agents understand existing projects.

“‘First run the tests’ provides a four word prompt that encompasses a substantial amount of software engineering discipline.” - On how the prompt steers agents towards testing practices.

Full Article

Automated tests are no longer optional when working with coding agents.

The old excuses for not writing them - that they’re time consuming and expensive to constantly rewrite while a codebase is rapidly evolving - no longer hold when an agent can knock them into shape in just a few minutes.

They’re also vital for ensuring AI-generated code does what it claims to do. If the code has never been executed it’s pure luck if it actually works when deployed to production.

Tests are also a great tool to help get an agent up to speed with an existing codebase. Watch what happens when you ask Claude Code or similar about an existing feature - the chances are high that they’ll find and read the relevant tests.

Agents are already biased towards testing, but the presence of an existing test suite will almost certainly push the agent into testing new changes that it makes.

Any time I start a new session with an agent against an existing project I’ll start by prompting a variant of the following:

First run the tests

For my Python projects I have pyproject.toml set up such that I can prompt this instead:

Run “uv run pytest”

These four word prompts serve several purposes:

1. They tell the agent that there is a test suite and force it to figure out how to run the tests. This makes it almost certain that the agent will run the tests in the future to ensure it didn’t break anything.

2. Most test harnesses will give the agent a rough indication of how many tests there are. This can act as a proxy for how large and complex the project is, and also hints that the agent should search the tests themselves if it wants to learn more.

3. They put the agent in a testing mindset. Having run the tests it’s natural for it to then expand them with its own tests later on.

Similar to “Use red/green TDD”, “First run the tests” provides a four word prompt that encompasses a substantial amount of software engineering discipline that’s already baked into the models.
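As a rough illustration of the test-count signal mentioned above, the sketch below pulls outcome counts out of a pytest-style summary line. It is not from the original article: the helper names summarize and total_tests are hypothetical, and the input is assumed to resemble the final line pytest usually prints (for example "12 passed, 1 failed in 0.34s"). Other test harnesses report their counts differently.

```python
import re

# Hypothetical helper, not part of pytest: matches "<number> <outcome>" pairs
# in a summary line that is assumed to look like pytest's final line.
_COUNT = re.compile(
    r"(\d+) (passed|failed|skipped|errors?|xfailed|xpassed|deselected)"
)


def summarize(summary_line: str) -> dict[str, int]:
    """Return a mapping of outcome name to count found in the summary line."""
    counts: dict[str, int] = {}
    for number, outcome in _COUNT.findall(summary_line):
        counts[outcome] = int(number)
    return counts


def total_tests(summary_line: str) -> int:
    """Add up every reported outcome as a rough size estimate for the suite."""
    return sum(summarize(summary_line).values())


if __name__ == "__main__":
    line = "==== 12 passed, 1 failed in 0.34s ===="
    print(summarize(line))
    print(total_tests(line))
```

An agent does not need a helper like this, since it reads the summary directly. The point is only that a single line of harness output already carries a usable estimate of how much test coverage a project has.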

Related Articles

Agentic manual testing - Agentic Engineering Patterns

Simon Willison · how-to · 78% similar

My fireside chat about agentic engineering at the Pragmatic Summit

Simon Willison · explanation · 71% similar

Red/green TDD - Agentic Engineering Patterns - Simon Willison's Weblog

Simon Willison · explanation · 68% similar

Originally published at https://simonwillison.net/guides/agentic-engineering-patterns/first-run-the-tests/#atom-everything.
