{"site":{"name":"Koji","description":"AI-native customer research platform that helps teams conduct, analyze, and synthesize customer interviews at scale.","url":"https://www.koji.so","contentTypes":["blog","documentation"],"lastUpdated":"2026-06-06T07:23:21.441Z"},"content":[{"type":"documentation","id":"8a658bce-bf69-4322-a3f9-42326d68a177","slug":"desirability-testing-guide","title":"Desirability Testing: Measuring How a Design Makes Users Feel","url":"https://www.koji.so/docs/desirability-testing-guide","summary":"Desirability testing measures the emotional response a design evokes — the \"do I want this?\" that usability testing ignores. The classic Microsoft Reaction Cards method asks participants to pick adjectives describing how a design feels, then explain why. AI-moderated interviews scale this by presenting the words as a structured multiple-choice question and automatically probing each selection, yielding word-frequency data plus the reasoning in one unmoderated session.","content":"## What Is Desirability Testing? (BLUF)\n\nDesirability testing measures the **emotional and aesthetic response** a design, brand, or product evokes — the gut-level \"do I want this?\" that usability testing ignores. A usability test tells you whether people *can* use your interface; a desirability test tells you whether they *like* it and what feelings it triggers. The classic method, pioneered at Microsoft, hands participants a deck of 118 adjective cards (\"Trustworthy,\" \"Confusing,\" \"Innovative,\" \"Slow\") and asks them to pick the words that describe how a design makes them feel — then to explain *why*.\n\nThe \"why\" is the hard part. Picking words is fast; understanding the reasoning behind those words is where the insight lives — and where traditional facilitation gets expensive. 
With a platform like Koji, you run desirability testing as an [AI-moderated interview](/docs/how-ai-interviewers-work) that presents the reaction words as a [multiple-choice question](/docs/multiple-choice-questions-ai-interviews), then [automatically probes](/docs/probing-and-follow-up-questions) each chosen word: *\"You called it 'overwhelming' — what specifically felt like too much?\"* You get the quantitative word-frequency data *and* the qualitative explanation in a single, unmoderated session.\n\n---\n\n## Why Desirability Matters\n\nPeople rationalize decisions, but they *make* them emotionally. Two products can be equally usable yet perform completely differently in the market because one feels trustworthy, modern, and effortless while the other feels clunky and generic. Desirability testing surfaces that gap early — before you ship, and while it's still cheap to change.\n\nUse it to:\n\n- **Compare design directions** — which of three landing-page concepts feels most \"credible\" to your target buyer?\n- **Validate a rebrand** — does the new identity read as \"premium\" or \"cold\"?\n- **Catch emotional landmines** — a checkout flow that tests as usable but feels \"untrustworthy\" will still leak conversions.\n- **Differentiate from competitors** — discover the adjectives you own vs. the ones rivals own.\n\nDesirability pairs naturally with [usability testing](/docs/usability-testing-guide) and [concept testing](/docs/concept-testing-methodology): usability checks *can they*, concept checks *do they get it*, desirability checks *do they want it*.\n\n---\n\n## The Microsoft Reaction Cards Method\n\nMicrosoft researchers Joey Benedek and Trish Miner introduced the Product Reaction Cards in 2002: a set of 118 adjectives, deliberately balanced at roughly 60% positive and 40% negative/neutral so participants aren't nudged toward flattery.\n\nThe classic protocol:\n\n1. 
**Expose** the participant to the design (a screenshot, prototype, live site, or brand mark).\n2. **Present the word set** and ask them to choose the words that best describe their reaction. (Traditionally 3–5 words to force prioritization.)\n3. **Probe** the top choices: *\"Why did you pick this word? What about the design made you feel that?\"*\n4. **Aggregate** across participants to find the dominant adjectives and the outliers.\n\nThe output is a word-frequency map (often shown as a sorted list or word cloud) plus the verbatim reasoning behind each high-frequency term.\n\n### A starter reaction-word set\n\nIf 118 cards is too many for an unmoderated session, a curated 20–30 word subset works well. Balance positive and negative:\n\n> Trustworthy · Innovative · Clean · Professional · Friendly · Fast · Confusing · Overwhelming · Generic · Dated · Cluttered · Intimidating · Calm · Premium · Approachable · Boring · Cheap · Reliable · Playful · Confident · Slow · Cold\n\n---\n\n## Running Desirability Testing at Scale with AI\n\nThe traditional bottleneck is moderation. Spreading physical or digital cards, recording selections, and interviewing each person about their choices is slow and doesn't scale past a handful of sessions. Here's how an AI-native workflow removes that ceiling:\n\n### 1. Configure the reaction set as a structured question\n\nCreate a [multiple_choice question](/docs/structured-questions-guide) listing your reaction words, capped to 3–5 selections. Koji supports [six structured question types](/docs/structured-questions-guide) — open_ended, scale, single_choice, multiple_choice, ranking, and yes_no — so you can combine the word picker with a 1–10 [desirability scale](/docs/scale-questions-guide) (\"How appealing is this design to you?\") in the same flow.\n\n### 2. Let the AI probe every selection\n\nThis is the differentiator. 
After a participant picks \"Premium\" and \"Confusing,\" Koji's AI interviewer follows up on each: *\"You picked 'premium' — which part of the design gave you that impression?\"* and *\"And 'confusing' — where did you get lost?\"* No moderator required, available 24/7, in [voice or text](/docs/voice-vs-text-interviews).\n\n### 3. Aggregate automatically\n\nBecause each reaction word is a structured value, Koji produces a frequency distribution across all participants automatically — while [thematic analysis](/docs/thematic-analysis-guide) clusters the *reasons* behind each word into [coded themes](/docs/understanding-themes-patterns). You see not just that 14 of 20 people said \"trustworthy,\" but the three recurring reasons they gave.\n\n### 4. Compare variants\n\nRun the same desirability test against Design A and Design B as separate studies, then compare their adjective profiles side by side. This turns a subjective \"I think B looks better\" debate into evidence.\n\n---\n\n## How Many Participants?\n\nDesirability testing tolerates small samples for directional reads — 15–25 participants per design usually reveal the dominant adjectives. For [statistically comparable](/docs/survey-sample-size-guide) word frequencies between variants, push toward 40–50 per cell. Because AI interviews are unmoderated and run in parallel, scaling to those numbers costs you setup time, not facilitation time.\n\n---\n\n## Common Pitfalls\n\n- **Letting people pick too many words.** Unlimited selection floods you with mild positives. Cap at 3–5 to force genuine prioritization.\n- **Skipping the \"why.\"** The word frequencies are interesting; the reasoning is actionable. Always probe. (This is exactly the step manual studies cut for time — and AI moderation restores.)\n- **An all-positive word set.** If every adjective flatters, you'll learn nothing. 
Keep the ~60/40 positive-to-critical balance.\n- **Testing in isolation.** Pair desirability with [first-click testing](/docs/first-click-testing-guide) or [usability testing](/docs/usability-testing-guide) so emotional and functional findings inform each other.\n- **Ignoring your target segment.** \"Premium\" from a bargain shopper and \"premium\" from your actual ICP mean different things — [screen participants](/docs/screener-questions-guide) accordingly.\n\n---\n\n## Desirability vs. Related Methods\n\n| Method | Question it answers |\n|--------|---------------------|\n| **Desirability testing** | How does this make you *feel*? |\n| **[Usability testing](/docs/usability-testing-guide)** | Can you *use* it? |\n| **[Concept testing](/docs/concept-testing-methodology)** | Do you *understand and value* the idea? |\n| **[Preference testing](/docs/preference-testing-guide)** | Which option do you *prefer*? |\n| **[Five-second test](/docs/5-second-test-guide)** | What's the *first impression*? 
|\n\n---\n\n## Related Resources\n\n- [Structured Questions in AI Interviews](/docs/structured-questions-guide) — build the reaction-word picker with multiple_choice + scale\n- [Usability Testing Guide](/docs/usability-testing-guide) — pair \"can they\" with \"do they want it\"\n- [Concept Testing Methodology](/docs/concept-testing-methodology) — validate the idea, not just the look\n- [Preference Testing Guide](/docs/preference-testing-guide) — A/B your design directions\n- [The 5-Second Test](/docs/5-second-test-guide) — capture first impressions\n- [Scale Questions in AI Interviews](/docs/scale-questions-guide) — quantify the desirability rating","category":"Research Methods","lastModified":"2026-06-06T03:14:16.881358+00:00","metaTitle":"Desirability Testing & Microsoft Reaction Cards Guide (2026) | Koji","metaDescription":"How to run desirability testing with Microsoft Reaction Cards to measure emotional response to a design — plus how to scale it with AI-moderated interviews that probe every word choice.","keywords":["desirability testing","microsoft reaction cards","product reaction cards","emotional response design testing","desirability testing ux","how a design makes users feel","reaction word testing"],"aiSummary":"Desirability testing measures the emotional response a design evokes — the \"do I want this?\" that usability testing ignores. The classic Microsoft Reaction Cards method asks participants to pick adjectives describing how a design feels, then explain why. 
AI-moderated interviews scale this by presenting the words as a structured multiple-choice question and automatically probing each selection, yielding word-frequency data plus the reasoning in one unmoderated session.","aiPrerequisites":["Basic understanding of UX research or usability testing"],"aiLearningOutcomes":["Define desirability testing and when to use it","Run the Microsoft Reaction Cards protocol","Scale desirability studies with AI-moderated interviews","Avoid the common pitfalls (too many words, skipping the why)"],"aiDifficulty":"intermediate","aiEstimatedTime":"10 minutes"}],"pagination":{"total":1,"returned":1,"offset":0}}