Original: Anthropic Engineering · 20/02/2026
Summary
This page is the index of the Anthropic Engineering blog. Its featured post quantifies how infrastructure noise affects agentic coding evals; the other listed posts cover evaluations, harnesses, tool use, security, and context engineering for AI agents.
Key Insights
“Infrastructure configuration can swing agentic coding benchmarks by several percentage points.” — Discussing the impact of infrastructure on coding evaluations.
“Beyond permission prompts: making Claude Code more secure and autonomous.” — Highlighting advancements in security and autonomy for AI coding.
“Effective context engineering for AI agents.” — Emphasizing the importance of context in developing AI agents.
Full Article
Featured
Quantifying infrastructure noise in agentic coding evals
Infrastructure configuration can swing agentic coding benchmarks by several percentage points, sometimes more than the leaderboard gap between top models.
Building a C compiler with a team of parallel Claudes · Feb 05, 2026
Designing AI-resistant technical evaluations · Jan 21, 2026
Demystifying evals for AI agents · Jan 09, 2026
Effective harnesses for long-running agents · Nov 26, 2025
Introducing advanced tool use on the Claude Developer Platform · Nov 24, 2025
Code execution with MCP: Building more efficient agents · Nov 04, 2025
Beyond permission prompts: making Claude Code more secure and autonomous · Oct 20, 2025
Effective context engineering for AI agents · Sep 29, 2025
A postmortem of three recent issues · Sep 17, 2025
Writing effective tools for agents — with agents · Sep 11, 2025
Desktop Extensions: One-click MCP server installation for Claude Desktop · Jun 26, 2025
How we built our multi-agent research system · Jun 13, 2025
Claude Code: Best practices for agentic coding · Apr 18, 2025
The “think” tool: Enabling Claude to stop and think in complex tool use situations · Mar 20, 2025
Raising the bar on SWE-bench Verified with Claude 3.5 Sonnet · Jan 06, 2025
Building effective agents · Dec 19, 2024
Introducing Contextual Retrieval · Sep 19, 2024
Related Articles
[AINews] Autoresearch: Sparks of Recursive Self Improvement
Swyx · explanation · 77% similar
Effective harnesses for long-running agents
Anthropic Engineering · how-to · 75% similar
[AINews] The high-return activity of raising your aspirations for LLMs
Swyx · explanation · 75% similar
Originally published at https://www.anthropic.com/engineering/.