Original: Anthropic Engineering · 31/01/2026
Summary
Research work involves open-ended problems where it’s very difficult to predict the required steps in advance. Claude now has Research capabilities that allow it to search across the web, Google Workspace, and any integrations to accomplish complex tasks. The journey of this multi-agent system from prototype to production taught us critical lessons about system architecture, tool design, and prompt engineering.
Key Insights
“Research work involves open-ended problems where it’s very difficult to predict the required steps in advance.” — Discussing the inherent challenges and unpredictability in designing AI for research tasks.
“Once intelligence reaches a threshold, multi-agent systems become a vital way to scale performance.” — Explaining the importance of multi-agent systems in scaling AI capabilities beyond individual agent limitations.
“Multi-agent systems work mainly because they help spend enough tokens to solve the problem.” — Highlighting the efficiency and effectiveness of multi-agent systems in utilizing tokens for complex tasks.
“Multi-agent systems excel at valuable tasks that involve heavy parallelization, information that exceeds single context windows, and interfacing with numerous complex tools.” — Summarizing the types of tasks where multi-agent systems are most effective.
Topics
- Multi-agent systems
- Token efficiency in AI models
- System architecture and design
- AI research methodologies
Full Article
Published: 2026-01-31
Source: https://www.anthropic.com/engineering/multi-agent-research-system
Claude now has Research capabilities that allow it to search across the web, Google Workspace, and any integrations to accomplish complex tasks. The journey of this multi-agent system from prototype to production taught us critical lessons about system architecture, tool design, and prompt engineering.
A multi-agent system consists of multiple agents (LLMs autonomously using tools in a loop) working together. Our Research feature involves an agent that plans a research process based on user queries, and then uses tools to create parallel agents that search for information simultaneously. Systems with multiple agents introduce new challenges in agent coordination, evaluation, and reliability.
This post breaks down the principles that worked for us. We hope you’ll find them useful to apply when building your own multi-agent systems.
Benefits of a multi-agent system
Research work involves open-ended problems where it’s very difficult to predict the required steps in advance. You can’t hardcode a fixed path for exploring complex topics, as the process is inherently dynamic and path-dependent. When people conduct research, they tend to continuously update their approach based on discoveries, following leads that emerge during investigation.
This unpredictability makes AI agents particularly well-suited for research tasks. Research demands the flexibility to pivot or explore tangential connections as the investigation unfolds. The model must operate autonomously for many turns, making decisions about which directions to pursue based on intermediate findings. A linear, one-shot pipeline cannot handle these tasks.
The essence of search is compression: distilling insights from a vast corpus. Subagents facilitate compression by operating in parallel with their own context windows, exploring different aspects of the question simultaneously before condensing the most important tokens for the lead research agent.
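The lead-agent and subagent arrangement described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the production system: `plan`, `run_subagent`, `lead_agent`, and `Finding` are hypothetical names, the subagent is a stub with no real model or search calls, and the "compression" is a simple truncation standing in for a model-written digest.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Finding:
    subtask: str
    summary: str  # condensed result handed back to the lead agent


async def run_subagent(subtask: str) -> Finding:
    """Hypothetical subagent. A real one would be an LLM using search
    tools in a loop inside its own context window."""
    await asyncio.sleep(0)  # stand-in for tool calls and model turns
    raw_notes = f"long exploration transcript for {subtask!r} " * 20
    # Compression step: return a short digest, not the full transcript.
    return Finding(subtask=subtask, summary=raw_notes[:40].strip())


def plan(query: str) -> list[str]:
    """Hypothetical planner that splits a query into independent
    subtasks. A real lead agent would ask the model to do this."""
    return [f"{query} :: aspect {i}" for i in range(1, 4)]


async def lead_agent(query: str) -> list[Finding]:
    subtasks = plan(query)
    # Subagents run concurrently, each with its own separate state.
    # gather() returns results in the same order as the subtasks.
    return await asyncio.gather(*(run_subagent(s) for s in subtasks))


findings = asyncio.run(lead_agent("board members of IT companies"))
```

The lead agent only ever sees the short `summary` fields, which is the point of the design: each subagent can consume a large context of its own while the coordinator's context stays small.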
Each subagent also provides separation of concerns (distinct tools, prompts, and exploration trajectories), which reduces path dependency and enables thorough, independent investigations.
Once intelligence reaches a threshold, multi-agent systems become a vital way to scale performance. For instance, although individual humans have become more intelligent in the last 100,000 years, human societies have become exponentially more capable in the information age because of our collective intelligence and ability to coordinate. Even generally intelligent agents face limits when operating as individuals; groups of agents can accomplish far more.
Our internal evaluations show that multi-agent research systems excel especially for breadth-first queries that involve pursuing multiple independent directions simultaneously. We found that a multi-agent system with Claude Opus 4 as the lead agent and Claude Sonnet 4 subagents outperformed single-agent Claude Opus 4 by 90.2% on our internal research eval. For example, when asked to identify all the board members of the companies in the Information Technology S&P 500, the multi-agent system found the correct answers by decomposing this into tasks for subagents, while the single-agent system failed to find the answer with slow, sequential searches.
Multi-agent systems work mainly because they help spend enough tokens to solve the problem. In our analysis, three factors explained 95% of the performance variance in the BrowseComp
Related Topics
- [[topics/multi-agent-systems]]
- [[topics/token-efficiency-in-ai-models]]
- [[topics/system-architecture-and-design]]
- [[topics/ai-research-methodologies]]
Originally published at https://www.anthropic.com/engineering/multi-agent-research-system.