Original: Anthropic Engineering · 20/02/2026

Summary

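This page is the index of the Anthropic Engineering blog. Its featured post reports that infrastructure configuration can swing agentic coding benchmark scores by several percentage points, sometimes more than the leaderboard gap between top models. The remaining entries cover agent evaluations, harnesses for long-running agents, tool use, MCP, context engineering, and Claude Code best practices.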

Key Insights

“Infrastructure configuration can swing agentic coding benchmarks by several percentage points.” — Discussing the impact of infrastructure on coding evaluations.
“Beyond permission prompts: making Claude Code more secure and autonomous.” — Highlighting advancements in security and autonomy for AI coding.
“Effective context engineering for AI agents.” — Emphasizing the importance of context in developing AI agents.

Full Article

Featured: Quantifying infrastructure noise in agentic coding evals
Infrastructure configuration can swing agentic coding benchmarks by several percentage points, sometimes more than the leaderboard gap between top models.

Building a C compiler with a team of parallel Claudes · Feb 05, 2026
Designing AI-resistant technical evaluations · Jan 21, 2026
Demystifying evals for AI agents · Jan 09, 2026
Effective harnesses for long-running agents · Nov 26, 2025
Introducing advanced tool use on the Claude Developer Platform · Nov 24, 2025
Code execution with MCP: Building more efficient agents · Nov 04, 2025
Beyond permission prompts: making Claude Code more secure and autonomous · Oct 20, 2025
Effective context engineering for AI agents · Sep 29, 2025
A postmortem of three recent issues · Sep 17, 2025
Writing effective tools for agents — with agents · Sep 11, 2025
Desktop Extensions: One-click MCP server installation for Claude Desktop · Jun 26, 2025
How we built our multi-agent research system · Jun 13, 2025
Claude Code: Best practices for agentic coding · Apr 18, 2025
The “think” tool: Enabling Claude to stop and think in complex tool use situations · Mar 20, 2025
Raising the bar on SWE-bench Verified with Claude 3.5 Sonnet · Jan 06, 2025
Building effective agents · Dec 19, 2024
Introducing Contextual Retrieval · Sep 19, 2024
