{"site":{"name":"Koji","description":"AI-native customer research platform that helps teams conduct, analyze, and synthesize customer interviews at scale.","url":"https://www.koji.so","contentTypes":["blog","documentation"],"lastUpdated":"2026-05-21T13:44:11.047Z"},"content":[{"type":"documentation","id":"397f712c-aef9-4792-b2f7-85e9e24cf9e0","slug":"survey-sample-size-guide","title":"Survey Sample Size: How Many Responses Do You Really Need? (2026 Guide)","url":"https://www.koji.so/docs/survey-sample-size-guide","summary":"For most product surveys, 384 responses provide ±5% margin of error at 95% confidence for any population above ~20,000. This guide covers the n = (z² × p × (1-p)) / e² formula, finite population correction, statistical power for A/B comparisons, sample size benchmarks by use case (concept testing, NPS, pricing research, conjoint), and when 15–30 AI-moderated interviews beat 1,000 survey responses for depth.","content":"# Survey Sample Size: How Many Responses Do You Really Need? (2026 Guide)\n\n**Answer-first (BLUF):** For most product and marketing surveys, **384 responses give you a ±5% margin of error at 95% confidence** for any population larger than ~20,000 — and that number barely moves whether you have 50,000 users or 50 million. For directional decisions you can act on quickly, 100–200 responses are often enough. For statistical comparisons between segments, you need 384 per segment (not total). And for qualitative depth — the kind of \"why\" that no sample size formula can capture — switch from surveys to AI-moderated interviews, where 15–30 conversations consistently outperform 1,000 multiple-choice answers.\n\n## The one-paragraph version (if you only read this)\n\nIf you need a number right now: use **n = 384** for a one-time decision-grade survey on a population of any reasonable size, with 95% confidence and a ±5% margin of error. If you're comparing two groups (e.g., free vs. paid users), use 384 *per group*. 
If your population is small, use a finite-population correction (formula below): it applies below ~20,000 and saves the most below ~1,000. And remember: the *quality* of your sample matters more than the *quantity* — 100 well-recruited responses crush 5,000 self-selected ones every time.\n\n## What sample size actually means\n\nIn survey research, sample size is the number of responses you collect from a defined target population. The reason it matters: you're using the sample to draw conclusions about the entire population, and the math of statistical inference says you can only be *so confident* in those conclusions based on how many people you ask.\n\nThree numbers drive every sample size calculation:\n\n1. **Confidence level** — how often, across repeated samples, your estimate would land within the margin of error of the true population value. **95% is the standard** in product research; 90% is acceptable for directional work; 99% is reserved for high-stakes regulatory or clinical contexts.\n2. **Margin of error** — the plus-or-minus accuracy you'll tolerate. ±5% is standard for product surveys; ±3% for more rigorous work; ±10% for early-stage exploration.\n3. **Population size** — the total number of people in the group you want to learn about. Here's the surprising part: above ~20,000 people, the population size stops affecting the required sample size meaningfully.\n\nA fourth number also matters: the **expected proportion** (written *p*), which sets the response variance. Researchers conservatively assume *p = 0.5* (maximum variance, since *p × (1-p)* peaks there) when they don't know the population's answer distribution. 
This gives the largest required sample size and is the safe default.\n\n## The formula every researcher should memorize\n\nFor an infinite (or very large) population:\n\n```\nn = (z² × p × (1-p)) / e²\n```\n\nWhere:\n- **n** = required sample size\n- **z** = z-score for your confidence level (1.645 for 90%, **1.96 for 95%**, 2.576 for 99%)\n- **p** = expected proportion (use **0.5** when unknown — most conservative)\n- **e** = margin of error in decimal form (0.05 for ±5%)\n\nPlugging in the standard values (95% confidence, ±5% margin, p = 0.5):\n\n```\nn = (1.96² × 0.5 × 0.5) / 0.05²\nn = (3.8416 × 0.25) / 0.0025\nn = 384.16\n```\n\nThat's where the famous **384 number** comes from. It's the universal \"good enough\" sample size for any large population at standard rigor.\n\n### For smaller populations: the finite population correction\n\nIf your population is under ~20,000, apply the finite population correction:\n\n```\nn_adjusted = n / (1 + ((n - 1) / N))\n```\n\nWhere **N** is your total population size. For a population of 500, the corrected sample is 217 (not 384) — a meaningful savings for B2B research on small named-account lists.\n\n### Standard sample size reference table\n\n| Population size | Required sample (95% confidence, ±5%) |\n|----|----|\n| 100 | 80 |\n| 250 | 152 |\n| 500 | 217 |\n| 1,000 | 278 |\n| 5,000 | 357 |\n| 10,000 | 370 |\n| 50,000 | 381 |\n| 100,000+ | 384 |\n\n## What about statistical power?\n\nSample size formulas above answer the question \"how precisely can I estimate one number?\" If you're comparing groups (A/B testing, segment differences, pre/post analysis), you need **statistical power analysis** instead.\n\n**The 80% power rule of thumb:** Most researchers set statistical power at 0.80 — meaning if there really is a difference between groups, you'll detect it 80% of the time. 
According to peer-reviewed methodology research, \"the minimum power of a study required is ideally 80%, which is a commonly accepted benchmark in research methodology.\"\n\nThe sample size you need for an A/B comparison depends on:\n- **Effect size** (how big a difference you care about detecting)\n- **Significance level** (alpha, usually 0.05)\n- **Power** (usually 0.80)\n- **Baseline rate** (your control group conversion or response rate)\n\nFor a typical product survey comparing two segments near a 50% baseline (the conservative case), detecting a 10-percentage-point difference at 95% confidence and 80% power takes roughly **390 responses per group** (about 780 total). To detect a smaller 5-point difference, that jumps to roughly **1,570 per group**, and a 2-point difference needs about 9,800 per group. Baselines far from 50% need somewhat fewer.\n\nThis is why \"we got 500 responses, let's slice it ten ways\" almost always produces underpowered analyses. Each slice needs to clear the per-group sample size threshold.\n\n## Sample size benchmarks by use case\n\nFormulas give you statistical floors. Real-world benchmarks tell you what working researchers actually use:\n\n| Use case | Practical sample size | Why |\n|----------|----------------------|-----|\n| Concept validation (single concept) | 50–150 | Directional read on appeal, fast turnaround |\n| Concept testing (multiple variants) | 100 per variant | A/B level comparison |\n| Pricing research (e.g., Van Westendorp) | 300–500 | Need range estimates, not just a point |\n| NPS measurement (single market) | 300–400 | Confidence interval on the score |\n| NPS comparison across segments | 300 per segment | Each segment needs its own n |\n| Brand tracking wave | 300–500 per wave | Detect quarter-over-quarter movement |\n| Customer satisfaction (CSAT) | 384+ | Standard ±5% precision |\n| Persona research | 50–100 per persona | Plus 15–30 qualitative interviews |\n| Internal employee survey | Census preferred | Just ask everyone if you can |\n| Pre/post product launch | 300 each wave | Enough for large lifts (about 11 points near a 50% baseline); smaller lifts need more |\n| Conjoint analysis | 300–500 | 
Need enough choice tasks |\n\n## The most common sample size mistakes\n\n### 1. Confusing total sample with per-segment sample\nIf you're going to slice your survey by industry, role, or company size, every slice you care about needs to clear the sample size threshold *independently*. A 400-person survey split across 5 industries gives you 80 per industry, which is roughly a ±11% margin of error at 95% confidence: underpowered for anything but the broadest claims.\n\n### 2. Treating self-selected respondents as a random sample\nThe sample size formula assumes random sampling. A pop-up survey on your homepage isn't random — it overweights frequent visitors. A LinkedIn poll skews toward your network. Calculate your sample size for the question you can actually answer (e.g., \"what do my homepage visitors think\"), not the one you wish you could (\"what do users think\").\n\n### 3. Ignoring response rate when sizing distribution\nIf you need 400 completed responses and your typical email response rate is 5%, you need to send invitations to 8,000 people. Plan distribution backward from completes.\n\n### 4. Defaulting to \"as many as we can get\"\nThis isn't cost-free. Oversized samples:\n- Inflate cost and incentive spend\n- Make analysis slower\n- Tempt you into over-slicing\n- Can add noise, because response quality tends to drop once you recruit past your best-fit respondents\n\nDecide your target sample, hit it, and stop.\n\n### 5. Forgetting that quality > quantity\nA 100-response survey from a well-screened panel that matches your ideal customer profile (ICP) will out-predict a 5,000-response survey from a Facebook ad. 
Sample size is the *floor* for statistical confidence; sample *quality* is the ceiling on insight.\n\n## When sample size is the wrong question\n\nSample size formulas assume you're measuring something you can already define — a known metric like NPS, a known choice like preference between concepts, a known proportion like \"% who would buy at price X.\"\n\nIf you're still trying to understand *what to measure* — what users actually care about, why they churn, what frustrates them about your category — sample size becomes a distraction. You need depth, not breadth. The right tool isn't a 1,000-person survey; it's 15–30 qualitative interviews.\n\nResearch from Nielsen Norman Group has shown that **testing with roughly 5 users surfaces ~85% of the usability problems** in a flow, and other qualitative research suggests that 15–30 conversations reach thematic saturation for most discovery questions. For exploratory work, you don't need more respondents — you need richer conversations with fewer.\n\nThis is where AI-moderated interview platforms have completely rewritten the trade-off.\n\n## The modern AI-native approach with Koji\n\nThe historical reason teams over-relied on surveys was simple: interviews were expensive. Recruiting, scheduling, moderating, transcribing, and analyzing 30 interviews cost more than running a 1,000-person survey — so PMs picked the survey, even when the question called for depth.\n\nAI-moderated platforms like Koji collapse the interview cost curve and change the calculus:\n\n- **AI moderates interviews 24/7.** A 30-person interview study that used to take 4–6 weeks now finishes in days, with Koji's AI conducting and probing each conversation in real time.\n- **Hybrid structured + open-ended in one session.** Koji supports all 6 structured question types ([scale, single_choice, multiple_choice, ranking, yes_no, open_ended](/docs/structured-questions-guide)) inside the same interview. 
You get survey-quality numbers *and* interview-depth context from every respondent — no need to choose.\n- **Automatic thematic analysis.** Instead of manually coding 30 transcripts (40+ hours of work), Koji surfaces themes, sentiment, and quotes automatically. You spend your time on interpretation, not data entry.\n- **Real-time reporting as responses come in.** Watch themes emerge while the study is still in the field. Decide *during* the study whether you've reached saturation, instead of guessing at the start.\n- **Sample size flexibility.** Because each interview costs a fraction of traditional moderated research, you can comfortably run 50–200-person interview studies that previously would have been replaced by a thin survey.\n\nWhile traditional survey tools like SurveyMonkey and Qualtrics require you to pick \"wide and shallow\" *or* \"narrow and deep\" — and then plug in a sample size calculator to size the wide-and-shallow option — AI-native platforms like Koji let you have both at once. The sample size question itself shifts: instead of \"how many responses do I need to be confident?\" it becomes \"how many conversations do I need to understand the *why*?\"\n\nThat's a better question.\n\n## How to choose your sample size in 5 steps\n\n1. **Write down your decision.** What will you do differently based on the result? If the answer is \"nothing meaningful,\" reduce your sample size — you're over-investing.\n2. **Identify your target population.** B2B buyers at Series B SaaS companies? Free users of your iOS app? The definition matters: it changes both the formula and the recruiting strategy.\n3. **Pick your confidence level and margin of error.** Defaults: 95% and ±5%. Only deviate with a reason.\n4. **List the comparisons you need to make.** Every group you'll compare needs its own sample size, not a slice of one total.\n5. 
**Plan distribution backward from your target n.** Divide it by your expected response rate (400 completes at a 5% rate means 8,000 invitations), then pad for screening attrition and quality removals.\n\n## Quick sample size cheat sheet\n\n- **One-time directional read:** 100–200 responses\n- **Decision-grade single metric:** 384 responses\n- **Two-segment comparison:** 384 per segment (768 total)\n- **Pricing or conjoint:** 300–500 responses\n- **Brand tracking wave:** 300–500 per wave\n- **Qualitative depth:** 15–30 AI-moderated interviews (skip the survey)\n- **Anything involving slicing more than 5 ways:** rethink the study design — you probably need a different methodology\n\n## Related Resources\n\n- [Structured questions guide](/docs/structured-questions-guide) — Get survey-grade numbers and interview-grade depth in one session\n- [How many user interviews you need](/docs/how-many-user-interviews) — The qualitative counterpart to this guide\n- [Survey design best practices](/docs/survey-design-best-practices) — Get the *most* out of every response\n- [Qualitative vs quantitative research](/docs/qualitative-vs-quantitative-research) — Picking the right method, not just the right sample size\n- [Mixed methods research guide](/docs/mixed-methods-research-guide) — When you genuinely need both\n- [How to increase survey response rates](/docs/how-to-increase-survey-response-rates) — Practical tactics for hitting your target n\n\n---\n\n*Sources: Memon et al., \"Sample Size for Survey Research: Review and Recommendations,\" Journal of Applied Structural Equation Modeling (2020); Cochran, \"Sampling Techniques\" (1977); Hair et al., \"A Primer on Partial Least Squares Structural Equation Modeling\" (2017); Qualtrics Sample Size Calculator methodology documentation; CloudResearch sample size guide.*","category":"Research Methods","lastModified":"2026-05-21T03:24:55.795516+00:00","metaTitle":"Survey Sample Size: How Many Responses You Need (2026)","metaDescription":"For most surveys, 384 responses give ±5% margin at 95% confidence. 
Full guide with formulas, benchmarks by use case, and when to switch to AI interviews instead.","keywords":["survey sample size","how many survey responses","sample size calculator","statistical significance survey","sample size formula","survey statistical power","minimum sample size"],"aiSummary":"For most product surveys, 384 responses provide ±5% margin of error at 95% confidence for any population above ~20,000. This guide covers the n = (z² × p × (1-p)) / e² formula, finite population correction, statistical power for A/B comparisons, sample size benchmarks by use case (concept testing, NPS, pricing research, conjoint), and when 15–30 AI-moderated interviews beat 1,000 survey responses for depth.","aiPrerequisites":["Basic familiarity with surveys","Comfort with simple math"],"aiLearningOutcomes":["Calculate the correct sample size for any survey using the standard formula","Apply finite population correction for small populations","Use statistical power analysis for A/B segment comparisons","Pick the right sample size for common use cases (NPS, pricing, concept testing)","Recognize when depth (AI interviews) beats breadth (large surveys)"],"aiDifficulty":"beginner","aiEstimatedTime":"14 min read"}],"pagination":{"total":1,"returned":1,"offset":0}}