Lenny Rachitsky · 11/02/2026
Summary
I put the newest AI coding models from OpenAI and Anthropic head-to-head, testing them on real engineering work I’m actually doing. I compare GPT-5.3 Codex with Opus 4.6 (and Opus 4.6 Fast) by asking them to redesign my marketing website and refactor some genuinely gnarly components.
Key Insights
“I compare GPT-5.3 Codex with Opus 4.6 (and Opus 4.6 Fast) by asking them to redesign my marketing website and refactor some genuinely gnarly components.” — Introduction to the AI models being compared and the tasks they were assigned.
“Why Codex excels at code review but struggles with creative, greenfield work.” — Discussion on Codex’s strengths and weaknesses.
“The surprising way Opus and Codex complement each other in a real-world engineering workflow.” — Insight into how different AI models can be used together effectively.
Full Article
I put the newest AI coding models from OpenAI and Anthropic head-to-head, testing them on real engineering work I’m actually doing. I compare GPT-5.3 Codex with Opus 4.6 (and Opus 4.6 Fast) by asking them to redesign my marketing website and refactor some genuinely gnarly components. Through side-by-side experiments, I break down where each model shines—creative development versus code review—and share how I’m thinking about combining them to build a more effective AI engineering stack. Listen on YouTube, Spotify, or Apple Podcasts.
What you’ll learn:
- The strengths and weaknesses of OpenAI’s Codex vs. Anthropic’s Opus for different coding tasks
- How I shipped 44 PRs containing 98 commits across 1,088 files in just five days using these models
- Why Codex excels at code review but struggles with creative, greenfield work
- The surprising way Opus and Codex complement each other in a real-world engineering workflow
- How to use Git concepts like work trees to maximize productivity with AI coding assistants
- Why Opus 4.6 Fast might be worth the 6x price increase (but be careful with your token budget)
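The worktree idea above can be sketched in a few commands. This is a minimal, hypothetical example using a throwaway repository; the branch and directory names are illustrative, not from the episode. Each worktree is a separate checkout sharing one Git object store, so two AI coding sessions can work on different branches at the same time without stepping on each other.

```shell
# Create a throwaway repo to demonstrate worktrees (illustrative only).
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"
git commit -q --allow-empty -m "initial commit"

# One worktree per task: each gets its own directory and branch,
# so one agent can redesign the site while another refactors components.
git worktree add "$repo-redesign" -b redesign-site
git worktree add "$repo-refactor" -b refactor-components

# Lists the main checkout plus the two task worktrees.
git worktree list
```

Point each AI assistant at its own worktree directory; when a task lands, merge its branch and remove the worktree with `git worktree remove`.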
Brought to you by:
WorkOS: Make your app enterprise-ready today
In this episode, we cover:
(00:00) Introduction to new AI coding models
(02:13) My test methodology for comparing models
(03:30) Codex’s unique features: Git primitives, skills, and automations
(09:05) Testing GPT-5.3 Codex on a website redesign task
(10:40) Challenges with Codex’s literal interpretation of prompts
(15:00) Comparing the before and after with Codex
(16:23) Testing Opus 4.6 on the same website redesign task
(20:56) Comparing the visual results of both models
(21:30) Real-world engineering impact: 44 PRs in five days
(23:03) Refactoring components with Opus 4.6
(24:30) Using Codex for code review and architectural analysis
(26:55) Cost considerations for Opus 4.6 Fast
(28:52) Conclusion
Tools referenced:
• OpenAI’s GPT-5.3 Codex
• Anthropic’s Claude Opus 4.6
• Cursor
• GitHub
Other references:
• Tailwind CSS
• Git
• Bugbot
Where to find Claire Vo:
Originally published at https://www.lennysnewsletter.com/p/claude-opus-46-vs-gpt-53-codex-how.