Lenny Rachitsky · 10/02/2026
Summary
“AI product sense—understanding what a model can do and where it fails, and working within those constraints to build a product that people love—is becoming the new core skill of product management.”

“The uncomfortable truth is that the hardest part of AI product development comes when real users arrive with messy inputs, unclear intent, and zero patience.”

“It’s not a theory or a framework but, rather, important practice that gives you early feedback on model behavior, failure modes, and tradeoffs.”
👋 Hey there, I’m Lenny. Each week, I answer reader questions about building product, driving growth, and accelerating your career. For more: Lenny’s Podcast | How I AI | Lennybot | My favorite AI/PM courses, public speaking course, and interview prep copilot.

In part two of our in-depth series on building AI product sense (don’t miss part one), Dr. Marily Nika—a longtime AI PM at Google and Meta, and an OG AI educator—shares a simple weekly ritual that you can implement today that will rapidly build your AI product sense. Let’s get into it.

For more from Marily, check out her AI Product Management Bootcamp & Certification course (which is also available for private corporate sessions) and her recently launched AI Product Sense and AI PM Interview prep course (both courses are 15% off using these links). You can also watch her free Lightning Lesson on how to excel as a senior IC PM in the AI era, and subscribe to her newsletter.

P.S. You can listen to this post in convenient podcast form: Spotify / Apple / YouTube.
In this post, I’ll walk through my three steps for building AI product sense:
1. Map the failure modes (and the intended behavior)
2. Define the minimum viable quality (MVQ)
3. Design guardrails where behavior breaks

Once that AI product sense muscle develops, you should be able to evaluate a product across a few concrete dimensions: how the model behaves under ambiguity, how users experience failures, where trust is earned or lost, and how costs change at scale. It’s about understanding and predicting how the system will respond to different circumstances. In other words, the work expands from “Is this a good product idea?” to “How will this product behave in the real world?” Let’s start building AI product sense.

Map the failure modes (and the intended behavior)
Every AI feature has a failure signature: the pattern of breakdowns it reliably falls into when the world gets messy. And the fastest way to build AI product sense is to deliberately push the model into those failure modes before your users ever do. I run the following rituals once a week, usually Wednesday mornings before my first meeting, on whatever AI workflow I’m currently building. Together, they take under 15 minutes and are worth every second. The results consistently surface issues for me that would otherwise show up much later in production.

Ritual 1: Ask a model to do something obviously wrong (2 min.)
Goal: Understand the model’s tendency to force structure onto chaos.

Take the kind of chaotic, half-formed, emotionally inconsistent data every PM deals with daily—think Slack threads, meeting notes, Jira comments—and ask the model to extract “strategic decisions” from it. This is where generative models reveal their most dangerous pattern: when confronted with mess, they confidently invent structure.

Here’s an example messy Slack thread:

Alice: “Stripe failing for EU users again?”
Ben: “no idea, might be webhook?”
Sara: “lol can we not rename the onboarding modal again?”
Kyle: “Still haven’t figured out what to do with dark mode”
Alice: “We need onboarding out by Thursday”
Ben: “Wait, is the banner still broken on mobile???”
Sara: “I can fix the copy later”

I asked the model to extract “strategic product decisions” from this thread, and it confidently hallucinated a roadmap, assigned the wrong owners, and turned offhand comments into commitments. This is the kind of failure signature every AI PM must design around:

1. Re-run the same Slack thread through the model
Use the same messy context that caused the hallucination: “Based on this Slack discussion, draft our Q4 roadmap.” Let’s say the model invents features you never discussed. Great: you’ve found a failure mode.
2. Now tell the model what good looks like and run it again
Add one short line explaining the expected behavior. For example: “Try again, but only include items explicitly mentioned in the thread. If something is missing, say ‘Not enough information.’” Run that prompt against the exact same Slack thread. A correct, trustworthy behavior lists only the items explicitly mentioned and flags everything else as “Not enough information.”

3. Compare the two outputs—and the inputs that led to them—side by side
This contrast of the two outputs above—confident hallucination vs. humble clarity—is what teaches you how the model behaves today, and what you need to design toward. And that contrast is where AI product sense sharpens fastest. You’re looking for the gap between the two runs: where the first output invented structure, and where the second asked for clarification or admitted it didn’t know.

Ritual 2: Ask a model to do something ambiguous (3 min.)
Goal: Understand the model’s semantic fragility.

Ambiguity is kryptonite for probabilistic systems: if a model doesn’t fully understand the user’s intent, it fills the gaps with its best guess (i.e., hallucinations and bad ideas). That’s when user trust starts to crack. Try, for example, inputting a PRD into NotebookLM and asking it to “Summarize this PRD for the VP of Product.”

How to try this in 2 minutes (NotebookLM):
- Open NotebookLM → create a new notebook
- Upload a PRD (Google Doc/PDF works well)
- Ask: “Summarize this for execs and list the top 5 risks and open questions.”
Ambiguous prompts: what to test, what breaks, what to do
Here are a few ambiguous prompts to try, along with the different interpretations you should explicitly test.

Ritual 3: Ask a model to do something unexpectedly difficult (3 min.)
Goal: Understand the model’s first point of failure.

Pick one task that feels simple to a human PM but stresses a model’s reasoning, context, or judgment. You’re not trying to exhaustively test the model. You’re trying to see where it breaks first, so you know where the product needs organizing structure. Where it starts to go wrong is exactly where you need to design guardrails, narrow inputs, or split the task into smaller steps. Note: this isn’t the final solution yet; it’s the intended behavior. In the guardrails section later, I’ll show how to turn this into an explicit rule in the product (prompt + UX + fallback behavior).

Example 1: “Group these 40 bugs into themes and propose a roadmap.”
Example 2: “Summarize this PRD and flag risks for leadership.”
Define a minimum viable quality (MVQ)
Even when you understand a model’s failure modes and have designed around them, it’s nearly impossible to predict exactly how AI features will behave once they hit the real world. Performance almost always drops once they’re out of the controlled development environment. Since you don’t know how it will drop, or by how much, one of the best ways to keep the bar high from the start is to define a minimum viable quality (MVQ) and check your product against it throughout development. A strong MVQ explicitly defines three thresholds:

- Acceptable bar: where it’s good enough for real users
- Delight bar: where the feature feels magical
- Do-not-ship bar: the unacceptable failure rates that will break trust
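To make the three thresholds concrete, here is a minimal sketch of an MVQ written down as an explicit config plus a ship/no-ship check. All numbers and names are hypothetical illustrations, not values from the article; your actual bars depend on the product and context factors discussed below.

```python
# Hypothetical MVQ for an AI feature, expressed as task-success-rate thresholds.
# These numbers are placeholders; set your own bars per feature.
MVQ = {
    "acceptable": 0.90,   # good enough for real users
    "delight": 0.97,      # the feature feels magical
    "do_not_ship": 0.80,  # below this, failures break user trust
}

def ship_decision(success_rate: float) -> str:
    """Map an observed success rate to a launch decision."""
    if success_rate < MVQ["do_not_ship"]:
        return "do not ship"
    if success_rate >= MVQ["delight"]:
        return "ship: delight bar met"
    if success_rate >= MVQ["acceptable"]:
        return "ship: acceptable"
    return "hold: between do-not-ship and acceptable"

print(ship_decision(0.92))  # ship: acceptable
```

Writing the bars down like this, even roughly, forces the team to agree on them before launch pressure arrives.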
Acceptable bar
Delight bar
You don’t need a perfect percentage to know that you’ve hit the right delight bar, but you can look for behavioral signals.

Do-not-ship bar
Five strategic context factors that raise or lower your MVQ bar
Here are the five factors that most often determine where that bar should be set, and how they change your product decision.

Estimating the cost envelope
One of the most common mistakes new AI PMs make is falling in love with a magical AI demo without checking whether it’s financially viable. That’s why it’s important to estimate the AI product or feature’s cost envelope early.

Cost envelope = the rough range of what this feature will cost to run at scale for your users

You don’t need perfect numbers, but you do need a ballpark.
Example: AI meeting notes again
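As a sketch of what the back-of-envelope math looks like for a meeting-notes feature: multiply usage volume by tokens per run by the per-token price. Every number below is a hypothetical placeholder, not real pricing; substitute your own model’s rates and your users’ actual usage.

```python
# Back-of-envelope cost envelope for an AI meeting-notes feature.
# All defaults are hypothetical placeholders, not real vendor pricing.

def monthly_cost_per_user(
    meetings_per_month: int = 20,        # avg meetings a user records
    tokens_per_meeting: int = 12_000,    # transcript + summary tokens
    price_per_1k_tokens: float = 0.002,  # blended $ rate per 1K tokens
) -> float:
    """Rough $/user/month to run the feature at scale."""
    return meetings_per_month * tokens_per_meeting / 1000 * price_per_1k_tokens

cost = monthly_cost_per_user()
print(f"~${cost:.2f} per user per month")  # ~$0.48 with these defaults
```

Multiply the result by your projected user count and compare it against what the feature earns (or saves); if the envelope swamps the business model, the magical demo isn’t viable yet.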
Design guardrails where behavior breaks
Now that you better understand where a model’s behavior breaks and what you’re looking for to greenlight a launch, it’s time to codify some guardrails and design them into the product. A good guardrail determines what the product should do when the model hits its limits, so that users don’t get confused, misled, or lose trust. In practice, guardrails protect users from experiencing a model’s failure modes.

At a startup I’ve been collaborating with, we built an AI feature to increase the team’s productivity: it summarized long Slack threads into “decisions and action items.” In testing, it worked well—until it started assigning owners for action items when no one had actually agreed to anything yet. Sometimes it even picked the wrong person. Because my team had developed our AI product sense, we figured out that the fix was a new guardrail in the product, not a different underlying model. So we added one simple rule to the system prompt (in this case, just a line of additional instruction): “Only assign an owner if someone explicitly volunteers or is directly asked and confirms. Otherwise, surface themes and ask the user what to do next.” That single constraint eliminated the biggest trust issue almost immediately.

What good guardrails look like in practice
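A guardrail like the owner rule above can also be enforced outside the prompt, as a post-processing check on the model’s output. The sketch below is my own illustration of that pattern (the data shapes and names are assumptions, not the startup’s actual implementation): strip any owner the model assigned unless that person explicitly confirmed.

```python
# Sketch of a post-processing guardrail: only keep owners who explicitly
# confirmed. Data shapes are illustrative assumptions, not a real system.

def apply_owner_guardrail(action_items: list[dict], confirmed_owners: set[str]) -> list[dict]:
    """Remove model-assigned owners that no one actually confirmed."""
    guarded = []
    for item in action_items:
        owner = item.get("owner")
        if owner and owner not in confirmed_owners:
            # Fall back to surfacing the theme and asking the user.
            item = {**item, "owner": None, "note": "Owner unconfirmed; ask the user."}
        guarded.append(item)
    return guarded

items = [
    {"task": "Fix EU Stripe webhooks", "owner": "Ben"},
    {"task": "Ship onboarding modal", "owner": "Sara"},
]
print(apply_owner_guardrail(items, confirmed_owners={"Sara"}))
```

The prompt rule reduces the failure; the code check guarantees that even when the model ignores the instruction, the bad output never reaches the user.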
Originally published at https://www.lennysnewsletter.com/p/building-ai-product-sense-part-2.