How Many User Interviews Do You Need? The Sample Size Guide for Qualitative Research

Discover the right number of user interviews for your research. Learn about data saturation, theoretical saturation, and practical frameworks for knowing when you've collected enough qualitative data.

The short answer: for most user research studies, plan on 5–8 interviews as the minimum for exploratory work with a tightly defined audience, 10–15 as the standard range for discovery research, and 20–30 when studying multiple distinct user segments. The real answer depends on your research goals, population diversity, and a concept most researchers misunderstand: data saturation.

One of the most debated questions in qualitative research is also one of the most consequential: how many interviews do you actually need? Run too few and you risk missing critical patterns — anchoring on the idiosyncrasies of 3 participants instead of the real signal. Run too many and you waste time and budget on diminishing returns. The answer lies in understanding saturation, and in recognizing why your instinct to stop early may be the single biggest threat to your research quality.

Why There's No Universal Answer

Unlike quantitative research, qualitative interviews don't follow statistical power calculations. You cannot plug numbers into a formula and get a sample size. Qualitative research is iterative and inductive — you're building understanding, not measuring frequency.

The goal is not to statistically represent a population. It's to reach saturation: the point at which new interviews stop revealing new themes, patterns, or insights. When your 14th participant echoes what your 8th said, you've likely reached saturation for that topic and audience.

But saturation is notoriously hard to predict in advance. New research from 2025 published in the Journal of Marketing Research shows that data saturation and theoretical saturation are distinct milestones — and conflating the two is one of the leading causes of incomplete qualitative studies that produce weak, unsupported conclusions.

Data Saturation vs. Theoretical Saturation: Why Both Matter

These terms are often used interchangeably, but they describe very different milestones:

Data saturation is the point at which no new information emerges from interviews. You've heard the same themes, quotes, and patterns enough times that new conversations add nothing novel. This can happen in as few as 9–12 interviews for a well-defined, homogeneous audience.

Theoretical saturation goes further: it means your emerging theory or explanation of the phenomenon is robust enough to stand on its own. The categories you've identified are fully developed, their properties are clearly understood, and the relationships between them are mapped. This typically requires 1.5–2x as many interviews as data saturation alone (if you hit data saturation at 12 interviews, plan for roughly 18–24).

A 2025 study published in a SAGE journal found that relying solely on data saturation "often leads to premature closure and weak theorization." The practical implication: hearing repeated themes does not mean you fully understand why those themes exist. That deeper explanatory understanding — which drives better product decisions — requires more data.

"The challenge is that saturation is a property of the data, not of the sample size. You can't know in advance that 12 interviews will be enough — you discover it by coding as you go."
— Guest, Bunce & Johnson, foundational saturation research (2006)

What the Research Actually Shows

Despite the contextual nature of qualitative research, empirical studies produce useful benchmarks:

  • The foundational saturation study (Guest, Bunce & Johnson, 2006) found that over 90% of all codes emerged within the first 12 interviews of a homogeneous population — a finding cited across UX research for nearly two decades
  • JMIR research (2024) found that true code saturation required 16–24 interviews across diverse populations, while near-saturation occurred at 9–17
  • Nielsen Norman Group advises 5 participants for usability testing (catching approximately 85% of interface problems), but explicitly states that interview-based studies require more due to higher variability in the information collected
  • A 2021 systematic review of 23 empirical tests of saturation found that most studies in behavioral and social research require between 9 and 17 interviews to reach data saturation — with the higher end required when participant populations are more diverse

The wide variance in these numbers reflects the single most important variable: how homogeneous your population is. The more similar your participants are in terms of experience, role, and context, the faster you will reach saturation.

The "5 Users" Rule: Why It Gets Misapplied

Jakob Nielsen's famous recommendation to test with 5 users is one of the most misunderstood heuristics in UX. The 5-user principle applies specifically to usability testing — where you're identifying discrete, observable problems with a known interface. In that context, a mathematical model shows that 5 users surface roughly 85% of all usability issues.
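
That model, from Nielsen and Landauer, estimates the share of problems found by n users as 1 − (1 − L)^n, where L is the probability that a single user exposes a given problem (about 31% in their data). A minimal sketch in Python, assuming their published value of L:

```python
# Nielsen & Landauer's cumulative-detection model: the share of
# usability problems found by n test users, where L is the chance
# a single user exposes a given problem (~0.31 in their data).
def problems_found(n: int, L: float = 0.31) -> float:
    return 1 - (1 - L) ** n

for n in range(1, 9):
    print(f"{n} users: {problems_found(n):.0%} of problems found")
# 5 users surface ~84% of problems, the basis of the "5 users" heuristic
```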

But Nielsen Norman Group itself is unambiguous: "In an interview-based study, because there's more variability in the kinds of information being collected, the point of saturation is often higher than for user tests, so 5 interviews are often not enough."

For open-ended research — understanding mental models, motivations, decision-making processes, pain points — 5 participants is too thin a base. You risk mistaking one participant's strong personality for a pattern. You risk confirmation bias: anchoring on themes that appeared early and were never challenged.

This matters because many product teams use the 5-user heuristic as a shortcut to justify stopping research early. The result: insights that reflect the researcher's prior beliefs more than the user's actual reality.

4 Factors That Determine Your Actual Sample Size

1. Population Homogeneity

A tightly defined audience — say, "B2B SaaS product managers at 50–500 person companies evaluating project management software" — will converge on saturation far faster than a broad one like "anyone who has managed a project." The more defined your target participant profile, the fewer interviews you need to uncover consistent patterns.

2. Research Scope and Depth

Narrowly scoped studies (e.g., "understand friction in the onboarding flow") reach saturation faster than broad exploratory studies (e.g., "understand how PMs think about prioritization and trade-offs"). The latter has more dimensions, which means more conversations before patterns fully emerge.

3. Interview Structure

Highly structured interviews with specific, focused questions generate more repeatable data — saturation arrives sooner. Open-ended, exploratory interviews produce more varied data — you'll need more conversations before the landscape clarifies. Structured question types like scales and multiple-choice also help: quantitative patterns aggregate quickly across participants, even when qualitative probing continues.

4. Analysis Method

Inductive coding — developing themes from the data bottom-up — typically requires more interviews than deductive coding, where you test predefined hypotheses. Grounded theory studies usually need 20+ interviews. If you're testing a specific hypothesis with a pre-defined codebook, 8–12 may be sufficient.

Practical Guidance by Research Type

Research Type | Recommended Range | Key Consideration
Usability testing | 5–8 per segment | Based on Nielsen Norman's 85% issue capture principle
Exploratory discovery | 8–12 | For homogeneous B2B or B2C audiences
Problem validation | 10–15 | Testing known hypotheses against defined segments
Multiple user segments | 6–8 per segment | Each segment treated as an independent mini-study
Jobs to Be Done switch interviews | 12–20 | Need diversity across timeline moments
Voice of customer (VoC) programs | Ongoing batches of 5–10 | Continuous model; saturation resets with new contexts
Employee listening studies | 15–25 | Organizational complexity and hierarchy require more

How to Know When You've Reached Saturation

Reaching saturation isn't just a count — it's a judgment call requiring ongoing analysis. Here are the signals that tell you it's time to stop:

  • New interviews produce no new codes. Your last 3–4 sessions surfaced no themes beyond those already in your codebook. Everything new participants share fits existing categories.
  • You can predict what participants will say. When you start anticipating responses before they're given, you've likely heard enough.
  • Contradictions have been resolved. Early interviews often surface conflicting data points. Saturation means you've interviewed enough to understand why those contradictions exist — which segment they belong to, what context drives them.
  • You could write a complete, defensible report. If you sat down right now to write your findings and couldn't fill out each key theme with multiple supporting quotes, you haven't reached saturation.

A practical technique: analyze interviews in batches of 3–5. After each batch, assess how many new codes emerged. Plot this on a simple graph. When the new-code curve flattens toward zero, you've hit saturation.
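
As a minimal sketch of that technique (the codes and data here are illustrative), assuming you record the set of codes applied to each interview:

```python
# Batch-based saturation check: count how many previously unseen
# codes each batch of interviews contributes. Data is illustrative.
interviews = [
    {"pricing_confusion", "onboarding_friction"},
    {"pricing_confusion", "integration_needs"},
    {"onboarding_friction", "trust_concerns"},
    {"trust_concerns", "pricing_confusion"},
    # ...one set of codes per completed interview
]

def new_codes_per_batch(interviews, batch_size=4):
    """Return the number of never-before-seen codes in each batch."""
    seen, counts = set(), []
    for i in range(0, len(interviews), batch_size):
        batch_codes = set().union(*interviews[i:i + batch_size])
        counts.append(len(batch_codes - seen))
        seen |= batch_codes
    return counts

# When this sequence flattens toward zero, you've reached saturation.
print(new_codes_per_batch(interviews, batch_size=2))
```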

How AI-Moderated Research Changes the Calculus

Traditional qualitative research is constrained by a simple economic reality: a 45-minute moderated user interview costs $150–$300 in researcher time plus $75–$200 per participant in recruiting costs. At that rate, running 20 interviews costs $4,500–$10,000 — which is why most teams stop at 5–7.

AI-native research platforms like Koji eliminate the per-interview labor cost entirely. When interviews run automatically — with an AI consultant conducting conversations at any hour, in any language, without scheduling overhead — the marginal cost of the 15th interview is the same as the 5th.

This changes what's possible for every research team:

  • Run larger samples as standard practice, not as a special high-budget project
  • Test across multiple segments simultaneously — run 8 interviews per segment across 4 segments in parallel, completing 32 interviews in the time it would take a human researcher to run 5 sequential sessions
  • Reach theoretical saturation, not just data saturation — because additional interviews cost almost nothing, there's no economic reason to stop at the first plateau
  • Use structured questions — Koji's 6 question types (open_ended, scale, single_choice, multiple_choice, ranking, yes_no) let you mix quantitative patterns and qualitative depth in a single study, so scale responses aggregate automatically across 50+ participants while open-ended probing captures the why behind every number (see the sketch after this list)
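
For illustration, here's what defining such a mixed study could look like in Python. The six type identifiers come from the list above; everything else (field names, structure) is a hypothetical sketch, not Koji's documented API:

```python
# Hypothetical mixed-method study definition. The question type
# identifiers are the six named above; the schema itself is
# illustrative, not Koji's actual API.
study = {
    "title": "Onboarding friction discovery",
    "questions": [
        {"type": "scale", "text": "How easy was your first session?",
         "min": 1, "max": 7},
        {"type": "single_choice", "text": "How did you first hear about us?",
         "options": ["Search", "Referral", "Social", "Other"]},
        {"type": "open_ended",
         "text": "Walk me through the moment you got stuck."},
        {"type": "yes_no", "text": "Did you consider giving up at that point?"},
    ],
}
# Scale and choice answers aggregate automatically across participants;
# the open-ended question is where the AI moderator probes for the why.
```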

With Koji's automatic thematic analysis, you don't manually code 25 interviews. The AI surfaces patterns, highlights where new themes stopped emerging, and flags outliers that suggest you need more participants from a specific segment — all in real time.

Teams using AI-assisted research report 60% faster time-to-insight compared to traditional manual research workflows — not because they run fewer interviews, but because analysis no longer creates a bottleneck.

A Practical Decision Framework

Use this four-step process to scope your next study:

Step 1: Rate your population tightness (1–5)

  • 1 = very diverse (e.g., "US consumers aged 18–45")
  • 5 = very specific (e.g., "enterprise procurement managers who evaluated 3+ vendors in the past 6 months")

Step 2: Rate your research depth (1–5)

  • 1 = narrow and evaluative (usability test, concept test)
  • 5 = broad and exploratory (new market discovery, understanding unmet needs)

Step 3: Apply the matrix

  • Tight population + shallow depth → 5–8 interviews
  • Tight population + deep exploration → 12–18 interviews
  • Diverse population + shallow depth → 15–20 interviews
  • Diverse population + deep exploration → 25–40 interviews

Step 4: Plan for iteration. Start at the lower bound of your range. After every 5 interviews, assess whether new themes are emerging. Stop when new-code frequency drops to near zero. This iterative approach is more reliable than pre-committing to a fixed number.
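
The Step 3 matrix reduces to a simple lookup on the two ratings. A sketch, treating ratings of 4–5 as "tight" or "deep" (that threshold is an assumption, not part of the framework):

```python
# Illustrative lookup for the Step 3 matrix. Treating a rating of
# 4-5 as "tight"/"deep" is an assumed threshold for this sketch.
def recommended_range(tightness: int, depth: int) -> tuple[int, int]:
    tight, deep = tightness >= 4, depth >= 4
    if tight and deep:
        return (12, 18)   # tight population + deep exploration
    if tight:
        return (5, 8)     # tight population + shallow depth
    if deep:
        return (25, 40)   # diverse population + deep exploration
    return (15, 20)       # diverse population + shallow depth

print(recommended_range(tightness=5, depth=2))  # -> (5, 8)
```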

The Cost of Stopping Too Early

Research teams that stop short of saturation often pay the price later — not in their research budget, but in product decisions made on incomplete data. A feature built for a "pattern" that turned out to be two idiosyncratic participants. A segment's unique needs missed entirely because the team talked to 6 people from one segment and assumed they spoke for every segment. A pivot that looked obvious after 5 interviews and was refuted by the 10th.

The irony is that with AI-moderated research, running 20 interviews now costs less time and money than running 7 interviews through traditional means. The constraint has shifted from cost to willingness — and the teams who embrace larger samples are consistently reaching more reliable insights.

Key Takeaways

  • There is no universal sample size for qualitative research — the right number depends on population homogeneity, research depth, and analysis method
  • Data saturation (no new themes emerging) typically occurs at 9–17 interviews; theoretical saturation requires 1.5–2x that number
  • The "5 users" rule applies to usability testing, not interview-based discovery research — for the latter, aim for 10–15 as a baseline
  • Factors accelerating saturation: homogeneous audience, structured questions, narrow scope, deductive coding
  • Factors requiring more interviews: diverse audiences, broad exploratory scope, grounded theory methods, multiple user segments
  • AI-moderated interviews remove the cost barrier to running sufficient samples — with Koji, running 25 interviews is as feasible as running 5
