Quoting Thariq Shihipar

Original: Simon Willison · 20/02/2026

Summary

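At Claude Code, we build our entire harness around prompt caching. A high prompt cache hit rate decreases costs and helps us create more generous rate limits for our subscription plans, so we run alerts on our prompt cache hit rate and declare SEVs if they’re too low.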

Key Insights

“Long running agentic products like Claude Code are made feasible by prompt caching which allows us to reuse computation from previous roundtrips and significantly decrease latency and cost.” — Discussing the role of prompt caching in improving Claude Code’s performance.

“A high prompt cache hit rate decreases costs and helps us create more generous rate limits for our subscription plans, so we run alerts on our prompt cache hit rate and declare SEVs if they’re too low.” — Explaining the operational benefits of maintaining a high prompt cache hit rate for Claude Code.

Topics

Full Article

# Quoting Thariq Shihipar

Author: Simon Willison
Published: 2026-02-20
Source: https://simonwillison.net/2026/Feb/20/thariq-shihipar/#atom-everything

Long running agentic products like Claude Code are made feasible by prompt caching which allows us to reuse computation from previous roundtrips and significantly decrease latency and cost. […] At Claude Code, we build our entire harness around prompt caching. A high prompt cache hit rate decreases costs and helps us create more generous rate limits for our subscription plans, so we run alerts on our prompt cache hit rate and declare SEVs if they’re too low.

— Thariq Shihipar
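The quote mentions alerting on prompt cache hit rate but does not say how that rate is computed or what level triggers a SEV. The following is a minimal sketch of one way to measure it, not Claude Code's actual monitoring. It assumes each request reports the three usage counters exposed by the Anthropic Messages API (`input_tokens`, `cache_read_input_tokens`, `cache_creation_input_tokens`). The `Usage` class, the function names, and the 0.80 threshold are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Usage:
    """Per-request token counts (illustrative container, not an SDK type)."""
    input_tokens: int                 # input tokens that were neither read from nor written to the cache
    cache_read_input_tokens: int      # input tokens served from the prompt cache
    cache_creation_input_tokens: int  # input tokens written to the cache on this request


def cache_hit_rate(usages: Iterable[Usage]) -> Optional[float]:
    """Return the fraction of all input tokens read from cache, or None if there was no input."""
    read = 0
    total = 0
    for u in usages:
        read += u.cache_read_input_tokens
        total += u.input_tokens + u.cache_read_input_tokens + u.cache_creation_input_tokens
    return read / total if total else None


# Hypothetical threshold: the source does not state what counts as "too low".
ALERT_THRESHOLD = 0.80


def should_alert(usages: Iterable[Usage], threshold: float = ALERT_THRESHOLD) -> bool:
    """Alert only when there was traffic and the hit rate fell below the threshold."""
    rate = cache_hit_rate(usages)
    return rate is not None and rate < threshold


# Three roundtrips of one session: the first writes the prefix to the cache,
# the later ones mostly read it back.
session = [
    Usage(input_tokens=50, cache_read_input_tokens=0, cache_creation_input_tokens=10_000),
    Usage(input_tokens=60, cache_read_input_tokens=10_000, cache_creation_input_tokens=200),
    Usage(input_tokens=40, cache_read_input_tokens=10_200, cache_creation_input_tokens=300),
]
rate = cache_hit_rate(session)
```

Measuring by tokens rather than by request weights long prompts more heavily, which is closer to how cost and latency scale. A per-request hit count would be an equally reasonable choice under different assumptions.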

Key Takeaways

Notable Quotes

Long running agentic products like Claude Code are made feasible by prompt caching which allows us to reuse computation from previous roundtrips and significantly decrease latency and cost.

Context: Discussing the role of prompt caching in improving Claude Code’s performance.

A high prompt cache hit rate decreases costs and helps us create more generous rate limits for our subscription plans, so we run alerts on our prompt cache hit rate and declare SEVs if they’re too low.

Context: Explaining the operational benefits of maintaining a high prompt cache hit rate for Claude Code.

Related Topics

[[topics/agent-native-architecture]]
[[topics/prompt-engineering]]
[[topics/claude-code]]

Related Articles

[AINews] Anthropic's Agent Autonomy study

Swyx · explanation · 70% similar

Effective harnesses for long-running agents

Anthropic Engineering · how-to · 69% similar

I dream about AI subagents; they whisper to me while I'm asleep

Geoffrey Huntley · explanation · 68% similar

Originally published at https://simonwillison.net/2026/Feb/20/thariq-shihipar/#atom-everything.
