{"site":{"name":"Koji","description":"AI-native customer research platform that helps teams conduct, analyze, and synthesize customer interviews at scale.","url":"https://www.koji.so","contentTypes":["blog","documentation"],"lastUpdated":"2026-05-11T15:26:33.068Z"},"content":[{"type":"documentation","id":"078813d6-3326-46e1-a29a-4b4c753f904f","slug":"quantitative-user-research-methods","title":"Quantitative User Research: Methods, Examples, and When to Use Them","url":"https://www.koji.so/docs/quantitative-user-research-methods","summary":"A pillar guide to quantitative user research methods. Defines quant vs qual research, walks through nine core methods (surveys, A/B testing, analytics, card sorting, tree testing, SUS, SEQ, preference/MaxDiff/conjoint, and NPS/CSAT/CES) with sample-size rules and limitations, plus a mixed-methods workflow and how AI-native platforms unify quantitative scale with qualitative depth.","content":"## What is quantitative user research? (TL;DR)\n\nQuantitative user research is the systematic collection of **numerical data about user behavior, preferences, or attitudes** to identify patterns, measure outcomes, and validate hypotheses at scale. Where qualitative research answers *\"why?\"*, quantitative research answers *\"what, how often, and how many?\"* — and lets you state findings with statistical confidence rather than narrative interpretation.\n\nIn 2026, quantitative methods are no longer optional even for design-led teams. According to the User Interviews 2025 State of User Research Report, the median researcher runs **2 mixed-methods, 3 qualitative, and 1 quantitative study every six months** — meaning every researcher is expected to be at least quant-literate. And per Lyssna's 2025 Research Synthesis Report, **surveys are used by 83% of research teams** and remain the second-most common research method after interviews (92%).\n\nThis guide covers the nine core quantitative UX research methods, when to use each one, how many participants you need, and how AI-native research is collapsing the wall between qualitative depth and quantitative scale.\n\n## Quantitative vs. qualitative research: the one-line distinction\n\n| Dimension | Qualitative | Quantitative |\n|---|---|---|\n| **Question type** | Why? How? In what context? | What? How many? How often? |\n| **Output** | Themes, quotes, narratives | Counts, percentages, statistical effects |\n| **Sample size** | 5–30 | 30–thousands |\n| **Generalizability** | Suggestive | Statistically defensible |\n| **Best for** | Generating hypotheses, exploring problem space | Validating hypotheses, measuring outcomes |\n| **Example output** | \"Users feel anxious about pricing because the per-seat math is unclear\" | \"62% of trial users visit the pricing page 2+ times before converting\" |\n\nThe two are complementary, not competitive. Nielsen Norman Group's landmark \"When to Use Which UX Research Methods\" framework places virtually every method on a spectrum between qualitative-attitudinal and quantitative-behavioral — and the strongest research programs deliberately use both. For a deeper dive see our [qualitative vs. 
quantitative research guide](/docs/qualitative-vs-quantitative-research).\n\n## When to use quantitative user research\n\nQuantitative methods earn their cost when you need to:\n\n- **Measure** the size or impact of a known problem\n- **Compare** two or more design options statistically\n- **Validate** an insight uncovered through qualitative work\n- **Prioritize** which of several issues affects the most users\n- **Benchmark** experience over time or against competitors\n- **Predict** behavior at scale (cohort analysis, propensity modeling)\n\nDo *not* lead with quantitative research when the problem space is unclear or the hypothesis is unformed — quant is great at testing answers and terrible at generating questions.\n\n## The 9 core quantitative user research methods\n\n### 1. Surveys\n\n**What it is.** A structured questionnaire delivered to a sample of users to measure attitudes, behaviors, satisfaction, or preferences at scale.\n\n**When to use.** Whenever you need to estimate the proportion of users who think, feel, or do something. Surveys are the workhorse method — flexible enough to run as in-product intercepts, post-purchase emails, or full panel studies.\n\n**Sample size.** For an estimate within ±5% margin at 95% confidence on a binary question, you need roughly **385 responses** for any population over ~10,000.\n\n**Limitations.** Surveys measure what users *say*, not what they *do*. They are also vulnerable to self-selection bias and respondent fatigue. See our [survey fatigue guide](/docs/survey-fatigue) for details.\n\n### 2. A/B testing (split testing)\n\n**What it is.** A controlled experiment where two or more design variants are randomly assigned to user groups, and the difference in a target metric (conversion, engagement, retention) is measured.\n\n**When to use.** When you have two or more concrete options and a measurable success metric. Best for tactical optimization (button color, copy, layout) and validation of bigger changes once they're built.\n\n**Sample size.** Driven by your baseline conversion rate and the minimum detectable effect (MDE). Detecting a 2% absolute lift on a 10% baseline conversion (10% → 12%) typically needs roughly **3,800 users per variant** at 95% confidence and 80% power; halving the detectable effect roughly quadruples the requirement.\n\n**Limitations.** Tells you *which* variant won but rarely *why*. Pair with qualitative research to interpret unexpected results. See our [A/B testing vs. user research comparison](/docs/ab-testing-vs-user-research) for the trade-offs.
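\n\nFor the arithmetic-inclined, here is a quick sanity check of both sample-size rules above. This is a minimal sketch using the standard normal-approximation formulas, assuming 95% confidence and, for the A/B case, 80% power:\n\n```python\nimport math\n\n# z constants: 1.96 = two-sided 95% confidence, 0.84 = 80% power.\ndef survey_sample_size(margin=0.05, z=1.96, p=0.5):\n    # Worst-case binary proportion (p = 0.5) maximizes variance p * (1 - p).\n    return math.ceil(z**2 * p * (1 - p) / margin**2)\n\ndef ab_sample_size_per_variant(p1, p2, z_alpha=1.96, z_beta=0.84):\n    # Two-proportion test: users per variant to detect a shift from p1 to p2.\n    variance = p1 * (1 - p1) + p2 * (1 - p2)\n    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)\n\nprint(survey_sample_size())                    # 385\nprint(ab_sample_size_per_variant(0.10, 0.12))  # 3834\n```\n\nFor small populations, apply a finite population correction; above ~10,000 it barely moves the number, which is why the flat 385 rule holds.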
\n\n### 3. Product analytics\n\n**What it is.** Passive measurement of user behavior inside your live product — page views, feature usage, conversion funnels, retention curves, drop-off points.\n\n**When to use.** Always. Analytics is the foundation of behavioral quant research and the source of most \"we noticed X\" hypotheses that other methods then explore.\n\n**Sample size.** Whatever your product produces — usually thousands or millions of events.\n\n**Limitations.** Tells you *what* users did but never *why*. A 30% drop-off on step 3 of onboarding could be confusion, intentional skip, or a technical bug — analytics can't distinguish.\n\n### 4. Card sorting (closed and hybrid)\n\n**What it is.** Participants group content items into categories and label them. *Open* card sorts let participants invent labels; *closed* card sorts use predefined categories; *hybrid* sorts let participants add their own categories alongside predefined ones. Quantitative analysis measures the percentage of participants who created similar groupings.\n\n**When to use.** When you're designing or restructuring information architecture (navigation, taxonomy, content hubs).\n\n**Sample size.** Per Nielsen Norman Group, **30+ participants** is the threshold for quantitatively reliable card-sort patterns.\n\n**Limitations.** Tests organization in isolation, not in the context of real tasks. See our [card sorting guide](/docs/card-sorting-guide) for full methodology.\n\n### 5. Tree testing\n\n**What it is.** Participants are given a navigation tree (no visual design) and asked to find specific items. Quantitative outputs include success rate, time to find, directness, and first-click accuracy.\n\n**When to use.** To validate or compare information architecture before investing in design and build. Often paired with card sorting (card sort to design IA, tree test to validate it).\n\n**Sample size.** **50+ participants** per tree is the standard for quantitative confidence.\n\n**Limitations.** Tests the IA only — not visual design, copy, or interaction. See our [tree testing guide](/docs/tree-testing-guide).\n\n### 6. System Usability Scale (SUS)\n\n**What it is.** A standardized 10-question Likert questionnaire that produces a single 0–100 usability score. SUS has been used in over 1,300 published studies, making it the most-validated usability metric in the field.\n\n**When to use.** When you need a comparable, benchmarkable usability score over time, across products, or against industry norms. Industry average SUS is ~68; a score above 80 is generally considered excellent.\n\n**Sample size.** **30+ participants** for stable scores; smaller samples can produce wide confidence intervals.\n\n**Limitations.** A single composite score — useful for benchmarking, weak for diagnosing specific issues. See our [SUS guide](/docs/system-usability-scale-guide).
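\n\nFor reference, standard SUS scoring takes only a few lines: odd-numbered (positively worded) items contribute *response − 1*, even-numbered (negatively worded) items contribute *5 − response*, and the 0–40 raw sum is multiplied by 2.5. A minimal sketch, where `responses` is one hypothetical participant's answers:\n\n```python\ndef sus_score(responses):\n    # responses: one participant's ten answers on a 1-5 Likert scale.\n    assert len(responses) == 10\n    raw = sum((r - 1) if i % 2 == 1 else (5 - r)\n              for i, r in enumerate(responses, start=1))\n    return raw * 2.5  # scale the 0-40 raw sum to the 0-100 SUS range\n\nprint(sus_score([4, 2, 5, 1, 4, 2, 4, 2, 5, 1]))  # 85.0\n```\n\nAverage the per-participant scores to get the study-level SUS.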
\n\n### 7. Single Ease Question (SEQ) and task-level metrics\n\n**What it is.** A single 7-point post-task rating (\"Overall, how difficult or easy was this task?\"). Often paired with completion rate, time on task, and error count.\n\n**When to use.** Inside any moderated or unmoderated usability test where you want comparable difficulty scores across tasks or designs.\n\n**Sample size.** **15+ for stable averages** at the task level.\n\n**Limitations.** Rates perceived ease, not actual ease — useful but should be triangulated with completion rate. See our [SEQ guide](/docs/single-ease-question-seq-guide).\n\n### 8. Preference and ranking studies\n\n**What it is.** Participants choose between two or more design options or rank a list. Modern variants include **MaxDiff** (forced trade-offs across many items) and **conjoint analysis** (decomposing preferences across feature combinations).\n\n**When to use.** Pricing research, feature prioritization, message testing, brand positioning. MaxDiff is dramatically more discriminating than rating scales when you need to rank many items.\n\n**Sample size.** 100–300 for stable rankings; conjoint typically needs 200+.\n\n**Limitations.** Stated preference often diverges from revealed preference — what users say they want isn't always what they choose. See our [preference testing guide](/docs/preference-testing-guide), [MaxDiff guide](/docs/maxdiff-analysis-guide), and [conjoint analysis guide](/docs/conjoint-analysis-guide).\n\n### 9. Net Promoter Score (NPS), CSAT, and CES\n\n**What it is.** Three of the most common standardized customer-experience metrics:\n- **NPS** (\"how likely are you to recommend?\") — relationship-level loyalty\n- **CSAT** (\"how satisfied?\") — interaction-level satisfaction\n- **CES** (\"how easy was it?\") — effort to complete a goal\n\n**When to use.** As ongoing tracking metrics tied to specific touchpoints. Most useful when paired with an open-ended follow-up that captures the *why*.\n\n**Sample size.** Hundreds per cohort for stable scoring; thousands for trend detection.\n\n**Limitations.** All three are lagging indicators and notoriously sensitive to wording, channel, and timing. See our [NPS guide](/docs/nps-survey-guide), [CSAT guide](/docs/csat-survey-guide), and [CES guide](/docs/customer-effort-score-guide).
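\n\nThe scoring behind these metrics is deliberately simple. A minimal sketch of the standard NPS arithmetic and the common top-two-box CSAT convention (the score lists are hypothetical):\n\n```python\ndef nps(scores):\n    # NPS: percent promoters (9-10) minus percent detractors (0-6), 0-10 scale.\n    promoters = sum(1 for s in scores if s >= 9)\n    detractors = sum(1 for s in scores if s <= 6)\n    return 100 * (promoters - detractors) / len(scores)\n\ndef csat(scores):\n    # CSAT (top-two-box): percent of 4s and 5s on a 1-5 satisfaction scale.\n    return 100 * sum(1 for s in scores if s >= 4) / len(scores)\n\nprint(nps([10, 9, 9, 8, 7, 7, 6, 4, 10, 3]))  # 10.0\nprint(csat([5, 4, 4, 3, 5, 2, 4, 5]))         # 75.0\n```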
\n\n## Quantitative research sample-size cheat sheet\n\n| Method | Recommended minimum | Notes |\n|---|---|---|\n| Survey (±5% margin, 95% CI) | 385 | For populations >10k |\n| A/B test (small effect) | 1,000s per variant | Driven by baseline + MDE |\n| Card sort | 30+ | Per NN/g |\n| Tree test | 50+ | Per NN/g |\n| SUS | 30+ | For stable composite |\n| SEQ at task level | 15+ | Triangulate with completion |\n| MaxDiff / conjoint | 200+ | Higher for many items |\n| NPS / CSAT / CES | 100s–1000s | Per cohort |\n\n## Common quantitative research mistakes\n\n- **Running quant before qual.** You can't measure something if you don't know what to measure. Almost every successful survey is preceded by 6–10 interviews that surface what to ask about.\n- **Ignoring statistical significance.** A 5% lift in conversion across 80 users is noise. Design tests with the sample size your effect size requires.\n- **Cherry-picking the metric.** \"Engagement is up 20%!\" — but conversion is flat. Pick a primary metric *before* running the test.\n- **Treating the score as the insight.** SUS=72 isn't actionable. SUS=72, with the lowest sub-scores on questions 4 and 10, is.\n- **Forgetting the comparison.** A standalone CSAT of 4.1/5 means nothing without a baseline, benchmark, or trend line.\n- **Survey-only research.** Surveys answer the questions you thought to ask. Pair with interviews to surface the questions you didn't.\n\n## Mixed methods: where quant gets its meaning\n\nThe modern best practice is **mixed-methods research** — pairing quantitative measurement with qualitative depth in the same study. NN/g recommends combining behavioral analytics (what users do) with attitudinal interviews (what users think) to triangulate findings.\n\nClassic mixed-methods workflow:\n\n1. **Discover** with 5–10 qualitative interviews → identify candidate problems\n2. **Quantify** with a survey or analytics → measure prevalence and impact\n3. **Validate** with A/B test or usability test → confirm intervention works\n4. **Track** with NPS / SUS / CSAT → monitor outcome over time\n\nSee our [mixed-methods research guide](/docs/mixed-methods-research-guide) for end-to-end examples.\n\n## How AI-native research closes the quant/qual gap\n\nFor decades, quantitative and qualitative research lived in separate tools, separate teams, and separate timelines. Quant teams shipped a survey in days; qualitative teams shipped a synthesis in weeks. AI-native research platforms are dissolving that wall.\n\nWhen an AI moderator can run **structured questions** and **adaptive open-ended probes** in the same conversation, every interview becomes a *mixed-methods study*. Per Lyssna's 2025 research, **54.7% of researchers now use AI-assisted analysis** — and the most advanced workflows are already running thousands of moderated conversations per month with full thematic analysis applied automatically.\n\n## How Koji unifies quant and qual in one study\n\nKoji is built on the premise that quantitative and qualitative data should come from the same conversation, not separate tools. Here is how that plays out:\n\n- **Six structured question types** in every study — *open_ended, scale, single_choice, multiple_choice, ranking, yes_no*. Scale and ranking questions produce quantitative distributions; open_ended questions produce qualitative depth, all in the same interview. See our [structured questions guide](/docs/structured-questions-guide).\n- **AI-moderated probing.** When a respondent picks \"3 out of 5\" on a scale, the AI moderator adaptively asks *why* — turning a single quant data point into a quote-backed insight.\n- **Real-time aggregation.** Distributions, charts, and theme clusters update live as responses come in. You see a histogram and the supporting verbatims side by side.\n- **Quality scoring.** Every interview is rated 1–5 on response depth, so low-effort responses don't pollute your quant or qual analysis.\n- **Statistical confidence indicators.** Reports show sample sizes and surface when a finding is durable vs. early-signal.\n\nWhile traditional setups require Qualtrics for the survey, Dovetail for the qualitative analysis, and a researcher to sit between them, Koji collapses the entire pipeline into a single AI-native workflow — typically delivering both quantitative and qualitative findings in 24–48 hours.\n\n## Quantitative research vs. analytics: what's the difference?\n\nA common confusion: isn't product analytics already quantitative research? Sort of, but with a key distinction:\n\n- **Analytics** is *passive observation* of behavior in your live product. It tells you what your existing users did.\n- **Quantitative research** is *active measurement* — typically a designed study with a hypothesis, a control, and a defined sample. It tells you what *all* potential users would likely do, or how a change would shift behavior.\n\nAnalytics is the cheapest source of quantitative signal you have, and most teams underuse it. But it can't answer questions about non-users, alternative designs, or root cause — that's where the nine methods above earn their keep.\n\n## Related Resources\n\n- [Qualitative vs. Quantitative Research](/docs/qualitative-vs-quantitative-research)\n- [Mixed Methods Research Guide](/docs/mixed-methods-research-guide)\n- [Structured Questions Guide](/docs/structured-questions-guide)\n- [Survey Design Best Practices](/docs/survey-design-best-practices)\n- [System Usability Scale Guide](/docs/system-usability-scale-guide)\n- [A/B Testing vs. 
User Research](/docs/ab-testing-vs-user-research)\n- [How to Analyze Open-Ended Survey Responses](/docs/ai-analyze-open-ended-survey-responses)","category":"Research Methods","lastModified":"2026-05-11T03:19:23.254593+00:00","metaTitle":"Quantitative User Research: 9 Methods, Examples & When to Use — Koji","metaDescription":"Complete guide to quantitative user research — the 9 core methods, sample-size rules, mixed-methods workflows, and how AI-native research closes the quant/qual gap.","keywords":["quantitative user research","quantitative research methods","quantitative ux research","quantitative research examples","quant ux research","quantitative vs qualitative research","user research methods","sample size user research"],"aiSummary":"A pillar guide to quantitative user research methods. Defines quant vs qual research, walks through nine core methods (surveys, A/B testing, analytics, card sorting, tree testing, SUS, SEQ, preference/MaxDiff/conjoint, and NPS/CSAT/CES) with sample-size rules and limitations, plus a mixed-methods workflow and how AI-native platforms unify quantitative scale with qualitative depth.","aiPrerequisites":["Familiarity with basic user research concepts","Understanding of qualitative interviews"],"aiLearningOutcomes":["Distinguish quantitative from qualitative research and know when each is appropriate","Identify and apply the nine core quantitative user research methods","Calculate appropriate sample sizes for surveys, A/B tests, card sorts, tree tests, and standardized questionnaires","Combine quantitative and qualitative methods in a mixed-methods workflow","Avoid the most common quantitative research mistakes","Recognize how AI-native research compresses the quant/qual divide"],"aiDifficulty":"intermediate","aiEstimatedTime":"16 min read"}],"pagination":{"total":1,"returned":1,"offset":0}}