Questionnaire Design: The Complete Guide to Writing Questions That Get Honest Answers

A research-backed guide to questionnaire design — defining your constructs, writing unbiased questions, choosing response scales, ordering for flow, pre-testing, and avoiding the biases that quietly ruin your data.

What is questionnaire design?

Questionnaire design is the discipline of constructing a set of questions that measures what you actually intend to measure — accurately, without bias, and with as little burden on the respondent as possible. A questionnaire is the instrument; a survey is the broader process of distributing it and analyzing the results. Good design is the difference between data you can trust and data that quietly misleads you, because the way a question is worded, scaled, and ordered changes the answer.

How much does it change the answer? In a classic Pew Research Center split-sample experiment, simply adding the phrase "even if it meant that U.S. forces might suffer thousands of casualties" to a question flipped the result from 68% in favor / 25% opposed to 43% in favor / 48% opposed — a 25-point swing from one clause. Wording is not a detail; it is the measurement.
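To make the arithmetic behind that swing explicit, here is a minimal sketch that recomputes it from the four figures quoted above. The function and variable names are illustrative only and do not come from Pew.

```python
def point_change(before_pct: float, after_pct: float) -> float:
    """Change between two survey results, in percentage points."""
    return after_pct - before_pct

# Figures quoted above: results without and with the casualties clause.
baseline = {"favor": 68, "oppose": 25}
with_clause = {"favor": 43, "oppose": 48}

favor_change = point_change(baseline["favor"], with_clause["favor"])
oppose_change = point_change(baseline["oppose"], with_clause["oppose"])

# Net margin = favor minus oppose; its sign shows which side leads.
margin_before = baseline["favor"] - baseline["oppose"]
margin_after = with_clause["favor"] - with_clause["oppose"]

print(f"favor: {favor_change:+} pts, oppose: {oppose_change:+} pts")
print(f"net margin: {margin_before:+} -> {margin_after:+}")
```

Support falls 25 points, opposition rises 23, and the net margin moves from +43 to -5, which is why the result reads as a flip rather than a dip.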

Bottom line: Every questionnaire is a series of design decisions — objectives, question type, wording, scales, order, and pre-testing — and each one can introduce or remove bias. Below is the full process. And when a questionnaire is the wrong tool entirely, an AI-moderated conversation that adapts to each answer often gets you closer to the truth.

"Writing clear and neutral survey questions is much more difficult than it might seem. We spend a lot of time thinking about the phrasing and ordering of our survey questions. Paying close attention to these seemingly minor factors makes a huge difference." — Courtney Kennedy, VP of Methods and Innovation, Pew Research Center

Step 1 — Start with objectives and constructs, not questions

Before you write a single question, name the decision the questionnaire will inform and the constructs (the abstract things — satisfaction, trust, effort, intent) you need to measure. A common professional practice is to first explore a topic with open-ended questions to learn how people actually talk about it, then convert that language into closed-ended items. Skipping this step is how you end up with twenty questions that don't add up to an answer.

Step 2 — Choose the right question type

  • Closed-ended questions (scales, single/multiple choice, yes/no) produce comparable, quantifiable data and are fast to answer.
  • Open-ended questions capture unprompted depth and the reasons behind a rating.

Neither is universally better — the strongest questionnaires mix both. Pew has shown the choice matters: when respondents were offered "the economy" as an option, 58% selected it, versus only 35% who volunteered it unprompted. Closed options shape what people say, so use open-ended questions wherever you'd otherwise be guessing at the answer set.

Step 3 — Write each question to be neutral and answerable

Most measurement error is born here. The rules:

  • No leading or loaded words. "Welfare" pulls different answers than "assistance to the poor." A single adjective can move results 12 points — Pew found 60% said plenty of "jobs" were available versus 48% for plenty of "good jobs" (Pew Research Center, 2019).
  • No double-barreled questions. As NN/g's Maddie Brown defines it, "A double-barreled question asks respondents to answer two things at once." "How satisfied are you with our price and support?" can't be answered cleanly — split it. Watch for the word "and."
  • No jargon, no double negatives, no absolutes ("always," "never").
  • Make options mutually exclusive and exhaustive (no overlapping age bands; include "Other" or "Prefer not to say" where needed).
  • Ask about recent past behavior, not future predictions. People are poor forecasters of their own behavior — "How likely are you to use this feature?" is far weaker than "When did you last do X?" This is NN/g's most distinctive rule.
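Some of these rules can be screened for mechanically before a human review. The sketch below is a rough heuristic, not an established tool: the word lists, the warning codes, and the `lint_question` name are all assumptions made for illustration. It will raise false positives (a harmless "and") and cannot catch loaded wording, so treat a flag as a prompt to reread the question.

```python
import re

# Illustrative word lists; extend them for your own domain.
ABSOLUTES = {"always", "never"}
NEGATIONS = {"not", "no", "never", "don't", "doesn't", "isn't", "can't", "won't"}
PREDICTION = re.compile(r"\b(how likely|would you|will you)\b")

def lint_question(text: str) -> list[str]:
    """Return heuristic warning codes for one question; empty list means no flags."""
    lowered = text.lower().replace("\u2019", "'")  # normalize curly apostrophes
    words = re.findall(r"[a-z']+", lowered)
    warnings = []
    if "and" in words:
        warnings.append("double_barreled?")   # may be asking two things at once
    if ABSOLUTES & set(words):
        warnings.append("absolute")           # "always" / "never"
    if sum(word in NEGATIONS for word in words) >= 2:
        warnings.append("double_negative?")
    if PREDICTION.search(lowered):
        warnings.append("prediction")         # asks about the future, not past behavior
    return warnings

for question in [
    "How satisfied are you with our price and support?",
    "How likely are you to use this feature?",
    "When did you last export a report?",
]:
    print(question, "->", lint_question(question))
```

The first question is flagged as possibly double-barreled, the second as a prediction, and the third passes with no flags.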

Step 4 — Construct response scales deliberately

Scale design has real, measured effects on data quality:

  • Number of points. Reliability and validity are weak for 2–4 point scales, rise toward a sweet spot around 7, and test–retest reliability declines above 10 points (Preston & Colman, Acta Psychologica, 2000). Respondents in that study most preferred 10-, then 7- and 9-point scales.
  • Label every point, not just the ends. A fully labeled 7-point scale reached .719 reliability versus .506 when only the endpoints were labeled (Maitland, Survey Practice, citing Alwin 2007).
  • Match the scale to the construct. Bipolar constructs (satisfaction: dissatisfied↔satisfied) suit a 7-point scale with a neutral midpoint; unipolar constructs (how effective?) suit a 5-point scale from "not at all" to "extremely."
  • Balance the scale (equal positive and negative options) and prefer item-specific wording over generic agree/disagree, which invites acquiescence bias.
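The scale rules above can be written down as a small data structure with a self-check. This is a sketch under stated assumptions: the `Scale` class, its `problems` method, and the example label wordings are hypothetical, and the 5-to-10 point window is only a loose reading of the reliability findings cited above.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Scale:
    polarity: str             # "bipolar" or "unipolar"
    labels: tuple[str, ...]   # one label per scale point, lowest to highest

    def problems(self) -> list[str]:
        """Return design issues found in this scale; empty list means none."""
        issues = []
        count = len(self.labels)
        if any(not label.strip() for label in self.labels):
            issues.append("unlabeled point")
        if len(set(self.labels)) != count:
            issues.append("duplicate label")
        if self.polarity == "bipolar" and count % 2 == 0:
            issues.append("bipolar scale has no neutral midpoint")
        if count < 5 or count > 10:
            issues.append("point count outside 5-10")
        return issues

# Bipolar construct: 7 fully labeled points with a neutral midpoint.
SATISFACTION_7 = Scale("bipolar", (
    "Very dissatisfied", "Dissatisfied", "Somewhat dissatisfied",
    "Neither satisfied nor dissatisfied",
    "Somewhat satisfied", "Satisfied", "Very satisfied",
))

# Unipolar construct: 5 fully labeled points from "not at all" to "extremely".
EFFECTIVENESS_5 = Scale("unipolar", (
    "Not at all effective", "Slightly effective", "Moderately effective",
    "Very effective", "Extremely effective",
))

print(SATISFACTION_7.problems(), EFFECTIVENESS_5.problems())
```

A four-point bipolar scale such as `Scale("bipolar", ("Bad", "Poor", "Good", "Great"))` would be flagged twice: no neutral midpoint, and too few points.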

Step 5 — Order questions for flow

Order changes answers through context effects. Pew documents that a question placed earlier can shift a later one by 10 points via assimilation or contrast. Best practices:

  • Funnel technique: broad and easy first, narrow and specific later.
  • Sensitive questions late, once you've earned a little trust; demographics last.
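  • Watch primacy/recency: in long option lists, items near the top get picked more when respondents read the list themselves, and items near the end get picked more when the list is read aloud. Randomize option order where appropriate.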
  • Keep wording identical across waves if you're tracking a trend over time.
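One way to randomize option order against primacy and recency effects is to shuffle per respondent while keeping catch-all options at the bottom. The sketch below assumes a hypothetical `randomized_options` helper and respondent IDs used as seeds, so each person sees a stable order if the page reloads.

```python
import random

def randomized_options(options, respondent_id, pinned=("Other", "Prefer not to say")):
    """Shuffle answer options per respondent, keeping pinned options last."""
    movable = [option for option in options if option not in pinned]
    anchored = [option for option in options if option in pinned]
    rng = random.Random(respondent_id)  # seeded, so the order is reproducible
    rng.shuffle(movable)
    return movable + anchored

options = ["Price", "Speed", "Support", "Integrations", "Other"]
print(randomized_options(options, "respondent-001"))
print(randomized_options(options, "respondent-002"))
```

Do not shuffle ordinal options such as rating scales or age bands, where the order itself carries meaning.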

Step 6 — Keep it short

Length is the silent killer of data quality. Across 26,000+ surveys, SurveyMonkey found 10-question surveys averaged 89% completion versus 79% for 40-question surveys, and that time spent per question falls steeply as a survey drags on: respondents spend ~75 seconds on the first question but under 20 by question 30, roughly a quarter as long. An academic study (Sauermann et al., 2018) found a 13-question survey hit 63% completion versus 37% for a 72-question version. Shorter questionnaires don't just feel kinder; they produce more honest data with less satisficing.

Step 7 — Pre-test before you launch

Pretesting is not optional. Run your draft past 5+ people using cognitive interviewing / think-aloud ("tell me what this question is asking you") and a soft-launch pilot to a small slice of your sample. You will discover ambiguous wording, broken skip logic, and questions that mean something different to respondents than you intended — every time.

The biases to design against

| Bias | What it does | Design fix |
| --- | --- | --- |
| Acquiescence | Tendency to agree | Item-specific scales, not agree/disagree |
| Social desirability | Over-report "good" answers | Self-administered mode, neutral wording, anonymity |
| Leading/loaded | Wording pushes an answer | Balanced, neutral phrasing |
| Order effects | Earlier Qs prime later ones | Funnel + randomization |
| Satisficing/straight-lining | Low-effort answering | Shorter survey, attention checks |

Mode matters too: Pew found self-administered (web) answers differ from interviewer-administered ones by about 5 percentage points on average across 60 questions, largely due to social-desirability pressure.
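Straight-lining, one of the biases listed above, is also easy to screen for after collection. The sketch below assumes responses arrive as a mapping from a respondent ID to that person's ratings on one grid of items. The function names and the four-item minimum are illustrative choices, not a standard.

```python
def is_straight_lined(ratings, min_items=4):
    """True when every item in a rating grid received the identical answer."""
    return len(ratings) >= min_items and len(set(ratings)) == 1

def flag_straight_liners(responses, min_items=4):
    """Return the IDs of respondents whose grid answers never vary."""
    return [
        respondent_id
        for respondent_id, ratings in responses.items()
        if is_straight_lined(ratings, min_items)
    ]

responses = {
    "r1": [4, 4, 4, 4, 4],   # identical throughout: flagged
    "r2": [5, 3, 4, 2, 4],   # varied: kept
    "r3": [3, 3],            # too short a grid to judge
}
print(flag_straight_liners(responses))
```

A flag is a reason to inspect a response, not proof of low effort: someone can hold the same view on every item.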

When a questionnaire is the wrong tool — the modern alternative

Here is the uncomfortable truth a static questionnaire can't escape: it asks the same fixed questions of everyone and can never follow up. The most interesting answer — the "it depends," the unexpected workaround — slips through because there's no one there to ask "why?"

This is where AI-native research changes the economics. With Koji, you design the instrument once and an AI interviewer administers it conversationally — reading questions in a natural voice or text, and probing each answer with 1–3 adaptive follow-up questions exactly where a human researcher would. You still get the structured, quantifiable data of a questionnaire, plus the depth of an interview, without manually running hundreds of calls. Teams adopting AI-assisted research consistently report far faster time-to-insight than the design-distribute-wait-export cycle of legacy survey tools.

Koji's structured questions give you six instrument types in one study — open_ended, scale, single_choice, multiple_choice, ranking, and yes_no — each with the design properties above baked in (configurable scale points and labels, mutually exclusive options, optional "Other"). Because every question carries a stable ID, scale distributions and open-ended themes are aggregated together automatically in the report, so you don't export a CSV and start over in a spreadsheet.
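To make the idea of typed questions with stable IDs concrete, here is a hypothetical sketch of such an instrument as plain data. Only the six type names come from the text above. The class names, fields, and validation rules are assumptions for illustration and are not Koji's actual schema or API.

```python
from dataclasses import dataclass, field
from enum import Enum

class QuestionType(Enum):
    OPEN_ENDED = "open_ended"
    SCALE = "scale"
    SINGLE_CHOICE = "single_choice"
    MULTIPLE_CHOICE = "multiple_choice"
    RANKING = "ranking"
    YES_NO = "yes_no"

NEEDS_OPTIONS = {QuestionType.SINGLE_CHOICE, QuestionType.MULTIPLE_CHOICE, QuestionType.RANKING}

@dataclass
class Question:
    id: str                      # stable ID, so answers can be aggregated per question
    type: QuestionType
    text: str
    options: list[str] = field(default_factory=list)
    scale_labels: list[str] = field(default_factory=list)
    allow_other: bool = False

    def validate(self) -> list[str]:
        """Return structural errors for this question; empty list means valid."""
        errors = []
        if self.type in NEEDS_OPTIONS and len(self.options) < 2:
            errors.append("needs at least two options")
        if len(set(self.options)) != len(self.options):
            errors.append("duplicate options")
        if self.type is QuestionType.SCALE and not self.scale_labels:
            errors.append("scale has no labels")
        return errors

study = [
    Question("q1", QuestionType.SCALE, "How helpful was our support team?",
             scale_labels=["Not at all helpful", "Slightly helpful", "Moderately helpful",
                           "Very helpful", "Extremely helpful"]),
    Question("q2", QuestionType.SINGLE_CHOICE, "Which plan are you on?",
             options=["Free", "Pro", "Enterprise"], allow_other=True),
    Question("q3", QuestionType.OPEN_ENDED, "What did you do the last time you got stuck?"),
]
print({question.id: question.validate() for question in study})
```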

Before and after: fixing three flawed questions

The fastest way to internalize these rules is to see them applied.

1. The double-barreled question

  • ❌ "How satisfied are you with the speed and reliability of the app?"
  • ✅ Split into two: "How satisfied are you with the app's speed?" and "How satisfied are you with the app's reliability?" — because a respondent can love one and hate the other.

2. The leading question

  • ❌ "How much did you enjoy our award-winning onboarding experience?"
  • ✅ "How would you describe your onboarding experience?" — the "award-winning" framing and the assumption of enjoyment both push the answer upward.

3. The vague, unanswerable scale

  • ❌ "Rate our service: 1–10." (1 = what? 10 = what?)
  • ✅ "How would you rate our support team's helpfulness?" on a labeled 5-point scale from "Not at all helpful" to "Extremely helpful" — labeled, item-specific, and matched to a unipolar construct.

A pre-launch checklist

Before you send any questionnaire, confirm: every question maps to a research objective; no question is leading, loaded, or double-barreled; answer options are mutually exclusive and exhaustive; scales are labeled and balanced; the order funnels broad to narrow with sensitive items late; the whole thing takes under ten minutes; and you've piloted it with at least five people. If a question doesn't earn its place against a decision you need to make, cut it — every question you remove raises the quality of the answers to the ones that remain.
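The "mutually exclusive and exhaustive" item on that checklist is the easiest one to verify in code when the options are numeric bands such as age ranges. A minimal sketch, assuming bands are inclusive integer `(low, high)` pairs and that you supply the range the question is meant to cover; `band_problems` is a hypothetical helper name.

```python
def band_problems(bands, lowest, highest):
    """Report overlaps and gaps in inclusive (low, high) bands covering lowest..highest."""
    problems = []
    covered_to = lowest - 1  # highest value accounted for so far
    for low, high in sorted(bands):
        if low > covered_to + 1:
            problems.append(f"gap: {covered_to + 1}-{low - 1}")
        elif low <= covered_to and covered_to >= lowest:
            problems.append(f"overlap: {low}-{min(high, covered_to)}")
        covered_to = max(covered_to, high)
    if covered_to < highest:
        problems.append(f"gap: {covered_to + 1}-{highest}")
    return problems

clean = [(18, 24), (25, 34), (35, 44), (45, 120)]
flawed = [(18, 25), (25, 34), (36, 120)]  # 25 appears twice, 35 is missing

print(band_problems(clean, 18, 120))
print(band_problems(flawed, 18, 120))
```

The clean set returns no problems. The flawed set reports an overlap at 25 and a gap at 35, the two mistakes that force a respondent to guess which box applies to them.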
