The Delphi Method: A Complete Guide to Reaching Expert Consensus
A practical guide to the Delphi method — the structured, multi-round technique for building expert consensus through anonymous questionnaires and controlled feedback. Learn the process, panel size, rounds, and modern AI-assisted alternatives.
The Delphi method is a structured research technique for reaching expert consensus. Instead of putting experts in a room — where the loudest voice or the most senior title often wins — you poll them anonymously across several rounds, share an anonymized summary of the group's answers between rounds, and let each expert revise. Over successive rounds, opinions converge toward a defensible consensus.
It is one of the most reliable ways to make a decision when hard data is thin and judgment is all you have: forecasting demand for a new category, prioritizing a roadmap with no historical analytics, or validating assumptions in a market no one has measured yet.
Where the Delphi Method Comes From
The Delphi method was developed at the RAND Corporation in the early 1950s by Olaf Helmer and Norman Dalkey. The first study, in 1951, was a classified U.S. Air Force project; the methodology was declassified and formally published in 1963 as Dalkey and Helmer's "An Experimental Application of the Delphi Method to the Use of Experts" in the journal Management Science.
In the developers' own framing, the method "was devised in order to obtain the most reliable opinion consensus of a group of experts ... by a series of intensive questionnaires interspersed with controlled opinion feedback." That sentence still defines the technique today.
The Four Defining Features
What separates a true Delphi study from "asking some experts what they think" is four structural properties:
- Anonymity. Experts never know whose answer is whose. This removes reputation, seniority, and personality from the equation so ideas are judged on merit.
- Iteration. The study runs in multiple rounds, giving experts repeated chances to refine their views.
- Controlled feedback. Between rounds, the facilitator returns an anonymized statistical summary — typically the median answer and the spread — so experts can react to the group without being pressured by it.
- Statistical group response. The final output is expressed as a distribution (a median or mean and its range), not a forced unanimous vote. Dissent is preserved and visible.
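As a rough illustration of the controlled-feedback and statistical-group-response features, the summary a facilitator returns between rounds can be computed in a few lines. This is a minimal sketch, not a prescribed procedure: the function name `summarize_round`, the sample ratings, and the choice of the "inclusive" quartile method are all assumptions made for the example.

```python
from statistics import median, quantiles

def summarize_round(ratings):
    """Return the anonymized statistics a facilitator might feed back.

    Only values are used, never names, so the summary stays anonymous.
    """
    # "inclusive" treats the ratings as the whole population (an assumed choice).
    q1, _, q3 = quantiles(ratings, n=4, method="inclusive")
    return {
        "n": len(ratings),
        "median": median(ratings),
        "q1": q1,
        "q3": q3,
        "iqr": q3 - q1,
    }

# Hypothetical 1-7 ratings from seven panelists for a single item.
round_two = [3, 4, 5, 5, 6, 6, 7]
summary = summarize_round(round_two)
print(summary)
```

Reporting the quartiles alongside the median keeps the spread, and therefore the dissent, visible instead of collapsing the panel into a single number.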
The Delphi Process, Step by Step
- Define the question. A sharp, answerable problem statement is half the battle. (See our guide on writing a research question.)
- Select the expert panel. Recruit people with genuine, relevant expertise — and deliberately seek diversity of perspective, not just agreement.
- Round one. Often an open-ended questionnaire to surface the full range of considerations. Responses are collected and analyzed.
- Round two. Experts receive the anonymized summary and re-answer, now able to see where the group stands. Many studies introduce rating scales here.
- Subsequent rounds. Repeat until your predefined consensus threshold or stopping rule is met.
- Synthesize. Report the final group position, the level of consensus, and — importantly — any persistent disagreement.
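The control flow of the steps above can be sketched as a loop that stops as soon as a predefined rule is satisfied. This is a simplified sketch under stated assumptions: every round is treated as a numeric rating round (the open-ended first round is skipped), and `run_delphi`, `fake_collect`, `within_one_point`, and the canned ratings are hypothetical names and data invented for the example.

```python
from statistics import median

def run_delphi(collect_round, consensus_reached, max_rounds=3):
    """Run rating rounds until the stopping rule is met or max_rounds is hit.

    collect_round(round_number, feedback) -> list of ratings (hypothetical hook)
    consensus_reached(ratings) -> bool, the rule fixed before round one
    """
    history = []
    feedback = None  # the first round has no group summary to show
    for round_number in range(1, max_rounds + 1):
        ratings = collect_round(round_number, feedback)
        history.append(ratings)
        if consensus_reached(ratings):
            break
        feedback = median(ratings)  # anonymized summary for the next round
    return history

# Canned ratings stand in for real panel responses.
canned = {1: [2, 4, 5, 7, 7], 2: [4, 5, 5, 6, 6], 3: [5, 5, 5, 5, 6]}

def fake_collect(round_number, feedback):
    return canned[round_number]

def within_one_point(ratings):
    # Assumed stopping rule: all ratings fall within one scale point.
    return max(ratings) - min(ratings) <= 1

history = run_delphi(fake_collect, within_one_point, max_rounds=4)
print(len(history))  # number of rounds actually run
```

Passing the stopping rule in as a function makes it explicit that the rule is chosen before data collection starts, not adjusted along the way.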
How Many Rounds and Experts?
Most Delphi studies run two or three rounds. A two-round design works well when there is a solid base of existing knowledge to anchor the first questionnaire; more rounds risk fatigue without proportionate gains.
On panel size, the methodology literature (Turoff) recommends roughly 10 to 50 experts, though many published studies use 7 to 15. There is no magic number: the right size depends on how homogeneous your expert pool is. A narrow, homogeneous pool converges quickly; a diverse one needs more participants to be representative.
Defining and Measuring Consensus
Consensus is not "everyone agrees" — it is a threshold you set in advance. A systematic review by Diamond and colleagues (2014), examining 100 Delphi studies, found that while 98 claimed to assess consensus, only 72 actually defined it. The most common measure was percent agreement, with a median threshold of 75%. Rigorous studies also report the interquartile range or a stability measure (how little answers change between rounds).
The lesson is simple: decide your consensus definition and stopping rule before round one. Otherwise it is tempting to keep going until the data says what you hoped.
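To make that concrete, here is a minimal sketch of two of the measures mentioned above, percent agreement and a simple stability check, combined into one stopping rule. The 75% threshold comes from the median reported by Diamond and colleagues; everything else is an assumption for illustration, including the function names, the "6 or higher counts as agreement" cutoff on a 1-7 scale, the half-point stability limit, and the sample ratings.

```python
def percent_agreement(ratings, agree_at=6):
    """Share of panelists rating at or above `agree_at` (assumed cutoff)."""
    return sum(r >= agree_at for r in ratings) / len(ratings)

def mean_absolute_change(previous, current):
    """Average per-expert shift between two rounds.

    Both lists must be in the same expert order.
    """
    return sum(abs(c - p) for p, c in zip(previous, current)) / len(current)

CONSENSUS_THRESHOLD = 0.75  # median threshold reported by Diamond et al.
STABILITY_LIMIT = 0.5       # assumed: average shift of half a point or less

# Hypothetical ratings from the same eight experts in two consecutive rounds.
round_two = [5, 6, 6, 7, 4, 6, 7, 6]
round_three = [6, 6, 6, 7, 5, 6, 7, 6]

agreement = percent_agreement(round_three)
shift = mean_absolute_change(round_two, round_three)
stop = agreement >= CONSENSUS_THRESHOLD and shift <= STABILITY_LIMIT
print(agreement, shift, stop)
```

Writing the threshold and limit down as constants before the first questionnaire goes out is one simple way to hold yourself to the rule.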
Why It Works — and the Evidence
The core advantage is bias reduction. By removing face-to-face dynamics, Delphi mutes the groupthink and dominant-voice effects that distort ordinary panels. In Rowe and Wright's review of the forecasting literature, Delphi was more accurate than the statistical average of individual experts in 12 studies and less accurate in 2, and it beat standard interacting groups in 5 studies to 1, with accuracy tending to improve across rounds. The authors are careful to note that the evidence is mixed and improvement is not guaranteed, but the direction is encouraging.
Limitations to Respect
- It is slow. Multiple rounds of questionnaire-summarize-recirculate traditionally take weeks.
- Facilitator influence. How questions are framed and summaries are written shapes the outcome. Neutrality matters.
- Inconsistent consensus definitions. As Diamond et al. showed, many studies never rigorously define what they are measuring.
- Dropout. Attrition of 11 to 33% is common, and far higher in some studies. Because experts with divergent opinions are more likely to drop out, attrition can quietly manufacture a false consensus. Keep panels engaged and report your response rates.
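The dropout risk in particular lends itself to a quick check: compute the attrition rate, then compare how far dropouts and stayers sat from the group median in the last round they both answered. The sketch below assumes the facilitator keeps pseudonymous expert IDs that the panel never sees; the IDs, ratings, and function names are hypothetical.

```python
from statistics import median

def attrition_rate(invited, completed):
    """Fraction of invited experts who did not complete the round."""
    return 1 - len(completed) / len(invited)

def dropout_divergence(previous_round, dropouts):
    """Mean distance from the previous median for dropouts and for stayers."""
    center = median(previous_round.values())

    def mean_distance(ids):
        return sum(abs(previous_round[i] - center) for i in ids) / len(ids)

    stayers = [i for i in previous_round if i not in dropouts]
    return mean_distance(dropouts), mean_distance(stayers)

# Hypothetical round-one ratings keyed by pseudonymous ID.
round_one = {"e1": 5, "e2": 5, "e3": 6, "e4": 2,
             "e5": 5, "e6": 7, "e7": 4, "e8": 5}
# Experts who did not return for round two.
dropouts = ["e4", "e6"]
completed = [i for i in round_one if i not in dropouts]

rate = attrition_rate(round_one, completed)
dropout_gap, stayer_gap = dropout_divergence(round_one, dropouts)
print(rate, dropout_gap, stayer_gap)
```

If dropouts sat much further from the median than stayers, as they do in this invented data, tighter agreement in the next round may reflect who left rather than who was persuaded.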
When to Use Delphi (and When Not To)
Delphi shines when you need judgment, not behavior: forecasting, prioritizing under uncertainty, validating a roadmap with dispersed B2B experts, or building agreement among stakeholders who cannot easily meet. It is a poor fit when you can simply observe what users do (use usability testing or analytics) or when you need the depth of a single person's story (use expert interviews or JTBD interviews).
The Modern Approach: Delphi with AI
The Delphi method's biggest weakness has always been operational: the manual grind of distributing questionnaires, summarizing responses, and re-circulating them across rounds. While legacy survey tools like SurveyMonkey can collect the answers, they leave the synthesis — the hard, slow part — entirely to you. AI-native platforms change the economics.
How Koji Helps
- Automated synthesis between rounds. Koji's automatic thematic analysis summarizes open-ended round-one responses instantly, so the controlled feedback that defines a Delphi study no longer takes a researcher days to assemble.
- AI-moderated expert interviews in parallel. Rather than waiting on a slow questionnaire cycle, Koji can run AI-moderated voice or text interviews with each expert simultaneously, probing reasoning in real time — and keeping responses anonymous.
- Six structured question types. Koji's structured questions — open_ended, scale, single_choice, multiple_choice, ranking, and yes_no — map cleanly onto Delphi rounds: open-ended exploration first, then scale and ranking questions to measure convergence.
- Real-time consensus reporting. Track agreement and dispersion across rounds without manual tabulation, and see exactly when your threshold is reached.
Automating synthesis and reporting can compress a multi-week Delphi study into something you can run in days, without sacrificing the anonymity and iteration that make the method work. You do not need a methodology PhD to run one; you need a clear question and the right tool.
A Worked Example: A Delphi Study for a B2B Roadmap
Imagine a B2B SaaS team deciding which three capabilities to build next year in a market with no usage history to lean on. A Delphi study fits perfectly.
Panel. They recruit 14 experts: five long-tenured customers, four internal domain specialists, three industry analysts, and two implementation partners — deliberately diverse so consensus is earned, not assumed.
Round one is open-ended: "What unmet need will most shape buying decisions in this category over the next 18 months?" Responses are collected anonymously and clustered into eight distinct themes.
Round two returns the eight anonymized themes with a short rationale for each. Experts now rate every theme on a 1–7 scale for both impact and confidence, and rank their top three. The facilitator shares the median and interquartile range for each theme.
Round three shows experts where they sit relative to the group. Those whose ratings fall outside the interquartile range are asked to either revise or briefly justify their position, which surfaces the reasoning behind genuine disagreement rather than burying it. Three themes cross the predefined 75% agreement threshold, and their ratings barely move between rounds two and three.
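The round-three follow-up can be sketched as a small filter that lists which panelists fall outside the interquartile range for a theme. The 14 pseudonymous IDs and their impact ratings are invented for illustration, `outside_iqr` is a hypothetical helper, and the "inclusive" quartile method is an assumed choice; a real study would fix its quartile convention in advance.

```python
from statistics import quantiles

def outside_iqr(ratings_by_expert):
    """Return the IDs whose rating is below Q1 or above Q3."""
    q1, _, q3 = quantiles(ratings_by_expert.values(), n=4, method="inclusive")
    return sorted(e for e, r in ratings_by_expert.items() if r < q1 or r > q3)

# Hypothetical impact ratings (1-7) for one theme from a 14-person panel.
theme_ratings = {
    "P01": 6, "P02": 6, "P03": 5, "P04": 7, "P05": 6, "P06": 6, "P07": 3,
    "P08": 6, "P09": 5, "P10": 6, "P11": 7, "P12": 6, "P13": 2, "P14": 6,
}

to_follow_up = outside_iqr(theme_ratings)
print(to_follow_up)
```

Only the facilitator sees these IDs. The panel still receives nothing but the median, the range, and the anonymized justifications.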
Outcome. The team leaves with a ranked, defensible set of three priorities, a documented record of why each was chosen, and an explicit note of the dissent that remains. Critically, the analyst who would have dominated a live meeting carried exactly the same weight as the quietest customer — which is the entire point.
This is the kind of structured, bias-resistant decision that is almost impossible to reach in a single workshop, and it is why Delphi endures more than 70 years after RAND first formalized it.
Related Resources
- Expert Interviews Guide — the one-on-one alternative when you need depth over consensus
- Writing a Research Question — get the question right before you recruit your panel
- Research Synthesis Guide — turn rounds of expert input into a decision
- Survey Design Best Practices — design questionnaires that produce clean data
- Structured Questions Guide — the six question types that power each Delphi round
- Jobs to Be Done Framework — pair expert judgment with customer need