User Interviews vs Usability Testing: When to Use Each (and How They Work Together)

User interviews vs usability testing — the difference between generative and evaluative research, when to use each, and how one AI platform can run both.

Short answer (BLUF): Use user interviews to learn what to build and usability testing to learn whether what you built works. Interviews are generative research — they explore needs, motivations, and problems before a design exists. Usability testing is evaluative research — it watches people attempt tasks on a prototype or product to find where the design breaks (NN/g; Dovetail). They answer different questions, run at different moments, and the strongest teams use both — often in the same week. Here is how to tell them apart and when to reach for each.

The core difference: generative vs. evaluative

The cleanest way to separate the two is the research stage they belong to:

  • User interviews are generative (discovery). You sit with a person and ask about their world: what they are trying to accomplish, what frustrates them, how they solve the problem today. There is usually no product in front of them. The goal is to generate direction — problems worth solving, unmet needs, the language users actually use. This is the home of customer discovery and jobs-to-be-done.
  • Usability testing is evaluative. You give a person a specific task on a prototype or live interface ("sign up and invite a teammate") and watch where they hesitate, misclick, or give up. The goal is to evaluate a design and find friction. Sessions often use the think-aloud protocol, in which participants narrate their thinking as they work.

As the Nielsen Norman Group frames it, generative methods like interviews shape what you build, while evaluative methods like usability testing measure how well the thing you built actually works (NN/g).

Side-by-side comparison

| | User interviews | Usability testing |
| --- | --- | --- |
| Research type | Generative / discovery | Evaluative |
| Question answered | "What should we build, and why?" | "Does this design work?" |
| When in the process | Before design; ongoing discovery | After a prototype or product exists |
| What you show the user | Nothing; you ask about their life | A prototype, mockup, or live product |
| Structure | Open conversation, semi-structured | Task-based scenarios |
| Primary output | Needs, pain points, opportunities, JTBD | Friction points, task success, severity |
| Typical sample | 5–15+ for themes | ~5 users find ~85% of issues |
| Risk if skipped | You build the wrong thing | You ship the right thing, broken |

A key planning number sits in that table: for usability testing, 5 users uncover roughly 85% of an interface's usability problems (Nielsen & Landauer, NN/g), which is why evaluative rounds stay small and frequent. As Jakob Nielsen put it:

"The best results come from testing no more than 5 users and running as many small tests as you can afford." — Jakob Nielsen, Nielsen Norman Group

Interviews scale differently: because you are looking for the range of needs and motivations rather than counting task failures, you typically keep going until themes repeat — often more than 5.
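
The 5-user figure for usability testing comes from a simple model. Here is a minimal sketch in Python, assuming the formula and the average per-user discovery rate of 31% reported in Nielsen's NN/g article on testing with 5 users (neither is stated in this guide, and the function name is illustrative):

```python
def share_of_problems_found(n_users: int, per_user_rate: float = 0.31) -> float:
    """Expected share of usability problems seen at least once in n_users sessions.

    Model: found = 1 - (1 - L) ** n, where L is the chance that a single
    participant runs into any given problem. The 0.31 default is the average
    reported by Nielsen and Landauer; your product's rate may differ.
    """
    if n_users < 0:
        raise ValueError("n_users must be zero or more")
    if not 0.0 <= per_user_rate <= 1.0:
        raise ValueError("per_user_rate must be between 0 and 1")
    return 1.0 - (1.0 - per_user_rate) ** n_users


# Print the curve for a few round sizes to see the diminishing returns.
for n in (1, 3, 5, 10, 15):
    print(f"{n:>2} users -> {share_of_problems_found(n):.0%}")
```

With the default rate, five users come out at about 84%, which is the "roughly 85%" quoted above. The rate varies by product and task, so treat the curve as a planning aid rather than a guarantee.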

When to use which

Reach for user interviews when:

  • You are exploring a new problem space or audience.
  • You do not yet know what to build, or whether the problem is real.
  • You need the why behind a metric ("why is activation dropping?").
  • You are validating a problem before committing design resources. (See problem vs. solution interviews.)

Reach for usability testing when:

  • You have a prototype, mockup, or live flow.
  • You want to find where users get stuck completing a real task.
  • You are choosing between two designs or iterating toward a fix.
  • You need to validate that a redesign actually improved task success.

The trap to avoid: running a usability test when you actually have a discovery question. If you show users a polished design and they "like it," you have learned almost nothing about whether you are solving a real problem. Conversely, asking people in an interview whether they would use a feature is notoriously unreliable — watch them attempt the task instead. Match the method to the question.
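
The two lists above can be read as a small decision rule. Here is a rough sketch in Python; the names QuestionKind and pick_method are made up for illustration, and the "prototype first" branch is an assumption about what to do when you have a behavior question but nothing to test yet:

```python
from enum import Enum


class QuestionKind(Enum):
    NEEDS = "what do people need, and why?"      # discovery question
    BEHAVIOR = "can people complete this task?"  # evaluation question


def pick_method(kind: QuestionKind, has_testable_design: bool) -> str:
    """Map a research question to a method, following the lists above."""
    if kind is QuestionKind.NEEDS:
        # Discovery questions go to interviews even when a polished design
        # exists; showing the design mostly tells you whether people like it.
        return "user interviews"
    if has_testable_design:
        return "usability testing"
    # Assumption: with nothing to test yet, make something testable first
    # instead of asking people to predict their own behavior.
    return "prototype first, then usability testing"


print(pick_method(QuestionKind.NEEDS, has_testable_design=True))
print(pick_method(QuestionKind.BEHAVIOR, has_testable_design=True))
```

The first call is the trap case: a design exists, but the question is still a discovery question, so the answer stays "user interviews".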

How they work together

The two are a loop, not a choice:

  1. Interview to find a problem worth solving.
  2. Design a solution.
  3. Usability test the design to find friction.
  4. Interview again when the test surfaces a deeper "why" you did not expect.

The classic example: a usability test shows users abandoning checkout at the shipping step. The test tells you where. A quick follow-up interview tells you why — they expected free shipping. You need both lenses.

Run both with Koji

Most teams use one tool for interviews and another for usability testing, then stitch the insights together by hand. Koji runs both generative interviews and evaluative, task-based sessions on one AI-moderated platform:

  • One AI moderator, two jobs. Run an open-ended discovery interview to explore a problem, or attach a prototype link and have the AI run a task-based usability session with real-time, non-leading probing. Both happen by voice or text, asynchronously, at the participant's convenience.
  • Six structured question types for either mode. Combine open_ended, scale, single_choice, multiple_choice, ranking, and yes_no: a generative "walk me through how you do this today" can sit beside an evaluative SEQ (Single Ease Question) scale and a yes_no task-success check. Each question type aggregates into a matching chart automatically. See the structured questions guide.
  • Dozens in parallel, synthesized automatically. Whether discovery or usability, themes, quotes, and quantified answers compile into a live report as sessions finish — no re-watching recordings. (Turning interviews into insights.)
  • A quality gate on every session. Each interview is scored 1–5, so low-effort responses are flagged rather than diluting your findings.
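
To make the mixed-mode idea concrete, here is a rough Python sketch of a session guide that combines generative and evaluative questions, plus a filter for low-scoring sessions. The six type names come from the list above; everything else (the Question class, the mode labels, the flag_low_effort helper, and the cutoff of 3) is hypothetical and is not Koji's actual API or schema.

```python
from dataclasses import dataclass

# The six question type names listed above.
QUESTION_TYPES = {
    "open_ended", "scale", "single_choice",
    "multiple_choice", "ranking", "yes_no",
}
MODES = {"generative", "evaluative"}


@dataclass(frozen=True)
class Question:
    kind: str    # one of QUESTION_TYPES
    prompt: str
    mode: str    # "generative" or "evaluative" (illustrative label)

    def __post_init__(self) -> None:
        if self.kind not in QUESTION_TYPES:
            raise ValueError(f"unknown question type: {self.kind}")
        if self.mode not in MODES:
            raise ValueError(f"unknown mode: {self.mode}")


# One guide can mix a discovery prompt with task-based checks.
guide = [
    Question("open_ended", "Walk me through how you do this today.", "generative"),
    Question("scale", "Overall, how easy or difficult was this task? (1-7)", "evaluative"),
    Question("yes_no", "Were you able to invite a teammate?", "evaluative"),
]


def flag_low_effort(scores: dict[str, int], minimum: int = 3) -> list[str]:
    """Return session ids whose 1 to 5 quality score falls below the cutoff."""
    return sorted(sid for sid, score in scores.items() if score < minimum)


print(flag_low_effort({"s1": 5, "s2": 2, "s3": 4}))
```

Validating the type name up front keeps a typo such as "likert" from silently producing a question that cannot be charted later.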

A traditional setup forces you to choose between a generative tool and an evaluative one, and it needs a researcher to moderate each session live. An AI-native platform like Koji lets you move between what to build and whether it works without switching tools or waiting on calendars, and you do not need a research background to run either.

Frequently asked questions

Is a user interview the same as a usability test? No. A user interview is generative — it explores needs and problems, usually with no product present. A usability test is evaluative — it watches a user attempt tasks on a specific design to find friction.

Which should I do first? Almost always interviews. Discovery interviews tell you what to build; usability testing then checks whether your design of that thing works. Testing a great design for the wrong problem is wasted effort.

A worked example

Say activation is dropping in a B2B SaaS onboarding flow. Here is how the two methods divide the work:

  1. Usability test (evaluative): You give five new users the task "set up your workspace and invite a teammate." Four of five stall at the integration step. The test tells you where the flow breaks — precisely and quickly.
  2. User interview (generative): You then ask those users — and a few who churned — about their first week. You learn that the integration is not just confusing; many do not understand why they would connect it at all. The interview tells you why, and reframes the problem from "fix the integration UI" to "explain the value of integrating in the first place."

Neither method alone gets you there. The usability test without the interview leads you to polish a step whose purpose users do not understand. The interview without the test leaves you guessing where the friction actually lives.
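
The "where" half of this example is easy to tally once sessions are done. Here is a minimal sketch in Python, assuming you record how many steps each participant completed; the step names and participant ids are invented to match the scenario above:

```python
from collections import Counter

# Hypothetical breakdown of the task "set up your workspace and invite a teammate".
STEPS = ["create workspace", "connect integration", "invite teammate"]


def stall_points(completed_steps: dict[str, int], steps: list[str]) -> Counter:
    """Count, per step, how many participants stopped before finishing it.

    completed_steps maps a participant id to the number of steps they
    finished, from 0 up to len(steps).
    """
    stalls: Counter = Counter()
    for participant, done in completed_steps.items():
        if not 0 <= done <= len(steps):
            raise ValueError(f"bad step count for {participant}: {done}")
        if done < len(steps):
            stalls[steps[done]] += 1  # the first step they did not finish
    return stalls


# Four of five participants stall at the integration step.
results = {"p1": 1, "p2": 1, "p3": 3, "p4": 1, "p5": 1}
stalls = stall_points(results, STEPS)
success_rate = sum(done == len(STEPS) for done in results.values()) / len(results)

print(stalls.most_common(1))
print(f"task success: {success_rate:.0%}")
```

The tally points at the integration step, but it cannot say why people stop there. That is the gap the follow-up interviews fill.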

A simple cadence for both

You do not have to choose one method per quarter. A sustainable rhythm looks like:

  • Continuous discovery interviews — a steady trickle (even 2–3 a week) so you always have a live read on customer needs and language. See continuous discovery.
  • Usability testing with every design iteration: small 5-user rounds whenever a flow changes, not just before launch.
  • Targeted interviews after each test — whenever a usability result surprises you, follow up with a short interview to capture the why.

The teams that ship the best products are not the ones that pick the "right" method once. They are the ones that loop between what to build and whether it works fast enough that the two never drift apart.

Mistakes teams make choosing between them

  • Using a usability test to validate an idea. Showing users a polished design and asking if they like it is not validation; it invites confirmation bias. Validate the problem with interviews first.
  • Trusting interview claims about future behavior. "Would you use this?" is notoriously unreliable. If behavior is the question, watch a task instead of asking a hypothetical.
  • Treating "5 users" as a universal rule. Five is the sweet spot for evaluative usability rounds. Generative interview sample sizes depend on how many segments and how much variation you are studying — keep going until themes repeat.
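
"Keep going until themes repeat" can be turned into a simple stopping rule. Here is a minimal sketch in Python, assuming you tag each interview with a set of themes and treat three consecutive interviews with no new theme as saturation. The threshold, the function name, and the sample themes are assumptions for illustration, not a standard:

```python
from typing import Iterable, Optional


def interviews_until_saturation(
    themes_per_interview: Iterable[set[str]], quiet_run: int = 3
) -> Optional[int]:
    """Return the 1-based interview count at which saturation is reached.

    Saturation here means `quiet_run` interviews in a row added no theme
    that earlier interviews had not already raised. Returns None if the
    interviews run out before that happens.
    """
    seen: set[str] = set()
    quiet = 0
    for count, themes in enumerate(themes_per_interview, start=1):
        new_themes = set(themes) - seen
        seen |= new_themes
        quiet = 0 if new_themes else quiet + 1
        if quiet >= quiet_run:
            return count
    return None


sessions = [
    {"pricing confusion", "manual exports"},
    {"manual exports", "no audit trail"},
    {"pricing confusion"},
    {"slow onboarding"},
    {"manual exports", "slow onboarding"},
    {"no audit trail"},
    {"pricing confusion", "no audit trail"},
]
print(interviews_until_saturation(sessions))
```

In this sample the last new theme appears in the fourth interview, so the rule stops after the seventh. If you study several segments, run the check per segment, since a theme that is old news in one segment can be new in another.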

Related resources

AI-Moderated Interviews: How Automated Research Works (And Why It Works Better)

Understand how AI-moderated interviews work, when to use them over human-moderated sessions, and how to get the most from automated qualitative research.

Generative vs. Evaluative Research: When to Use Each Method

Understand the difference between generative and evaluative research, when to use each, and how combining both leads to better product decisions. Includes a comparison table and decision framework.

How Many Interviews Are Enough? A Guide to Sample Size

Understand saturation, practical guidelines, and research-backed recommendations for qualitative sample sizes.

Problem Interviews vs. Solution Interviews: When to Use Each

Problem interviews uncover whether a pain is real and worth solving; solution interviews test whether your proposed answer actually fixes it. Learn the difference, when to run each, the questions to ask, and how to run both at scale with AI.

Structured Questions in AI Interviews

Mix quantitative data collection — scales, ratings, multiple choice, ranking — with AI-powered conversational follow-up in a single interview.

Think-Aloud Protocol: How to Run and Analyze Think-Aloud Sessions

A complete guide to the think-aloud protocol — the most widely used usability testing method. Learn how to set up sessions, moderate effectively, analyze verbal data, and run remote think-aloud studies.

How to Conduct Usability Testing: The Complete Guide

A comprehensive guide to usability testing for UX researchers and product managers. Covers types of testing, participant numbers, step-by-step facilitation, and the most common mistakes to avoid.

The Definitive Guide to User Interviews

Everything you need to plan, conduct, and analyze user interviews that produce actionable research insights.