{"site":{"name":"Koji","description":"AI-native customer research platform that helps teams conduct, analyze, and synthesize customer interviews at scale.","url":"https://www.koji.so","contentTypes":["blog","documentation"],"lastUpdated":"2026-06-22T20:50:54.751Z"},"content":[{"type":"documentation","id":"d46b378d-b311-4291-956a-7dda702d4eae","slug":"verbatim-analysis-guide","title":"Verbatim Analysis: How to Code and Analyze Open-Ended Responses at Scale (2026)","url":"https://www.koji.so/docs/verbatim-analysis-guide","summary":"Verbatim analysis (verbatim coding) classifies open-ended survey responses into a structured code frame so free text becomes countable. Manual coding — read responses, build a code frame, pilot, refine, code, QC, quantify — is accurate but slow, expensive, and inconsistent across coders. AI coding categorizes responses automatically at a fraction of the time and cost, with slightly lower edge-case accuracy. Koji auto-codes verbatims: cycle-1 open coding (2–5 grounded descriptive or in_vivo theme labels per answer with supporting quotes), cycle-2 axial clustering into a canonical per-question code frame, sentiment scoring, quote retrieval, and segment comparison — each theme carries a high/medium/low confidence flag for human audit. The deeper fix: Koji replaces thin survey verbatims with AI-moderated conversations that probe for the why, capturing reasoned answers instead of five-word fragments. Koji uses six structured question types (open_ended, scale, single_choice, multiple_choice, ranking, yes_no).","content":"## The short answer\n\n**Verbatim analysis** (or verbatim coding) is the process of classifying open-ended, free-text survey responses into a structured set of codes so you can count, compare, and quantify what people actually said. 
Traditionally it is done by hand: a researcher reads every comment, builds a **code frame**, then assigns codes response by response across several rounds of review to control for error and bias ([Blix](https://blix.ai/blog/verbatim-coding); [Voxco](https://www.voxco.com/resources/verbatim-coding)). It is **accurate but slow and expensive** — and it scales terribly.\n\nAI changes the economics. AI-powered coding categorizes responses automatically, **handling large datasets in a fraction of the time and cost of manual coding** ([Insight Platforms](https://www.insightplatforms.com/how-get-value-verbatims-autocoding/)). **Koji** auto-codes verbatims into themes with supporting quotes and sentiment — and goes one step further: it replaces the thin survey verbatim with an **AI-moderated interview** that probes for the *why*, so each \"verbatim\" is a reasoned answer instead of a five-word fragment.\n\n## What verbatim coding is — and why it exists\n\nOpen-ended questions capture what closed questions miss, but raw text is not analyzable on its own. Verbatim coding bridges the gap: once each comment is assigned to a code, each code gets a count, and unstructured opinion becomes quantified evidence you can chart and track over time.\n\nA **code frame** (or codebook) is the heart of it — the agreed list of categories, each with a clear definition and example responses. Build it well and two coders reach the same answer; build it poorly and your \"data\" is just two people's guesses. See the [qualitative research codebook](/docs/qualitative-research-codebook) guide for how to construct one.\n\n## The manual verbatim coding workflow\n\n1. **Read a sample** of responses to understand the range.\n2. **Draft the code frame** — group recurring ideas into named codes with definitions.\n3. **Pilot-code** a subset with two coders; measure agreement.\n4. **Refine** the frame — merge overlaps, split codes that are too broad.\n5. 
**Code the full set**, allowing multiple codes per response.\n6. **Quality-check** — review low-confidence assignments and resolve disagreements.\n7. **Quantify** — count codes, cross-tabulate by segment, pull representative quotes.\n\nThis mirrors formal qualitative coding stages — see [coding qualitative data](/docs/coding-qualitative-data) and [open, axial, and selective coding](/docs/open-axial-selective-coding). Done by hand on thousands of responses, it can take a researcher weeks.\n\n## Manual vs. AI verbatim analysis\n\n| Dimension | Manual coding | AI verbatim analysis (Koji) |\n| --- | --- | --- |\n| Speed | Days to weeks | Minutes |\n| Cost | High (analyst hours) | Low (automated) |\n| Consistency | Varies by coder and fatigue | Stable, auditable prompt |\n| Scale | Hundreds before it strains | Tens of thousands |\n| Nuance on edge cases | Strong | Slightly weaker on rare cases, with low-confidence flags for review |\n| Bias control | Multiple review rounds | Surfaces minority + dissenting themes; confidence flags |\n\nThe honest tradeoff: AI coding is slightly less accurate than a careful human on rare edge cases, but it is dramatically faster and cheaper, and it never tires on response 4,000. 
Koji mitigates the accuracy gap by attaching a **confidence level (high/medium/low)** and **supporting quotes** to every coded theme, so a human can audit exactly which comments drove each code.\n\n## How Koji auto-codes verbatims\n\nWhen Koji analyzes responses, it performs the coding workflow automatically:\n\n- **Cycle-1 open coding.** Each open-ended answer gets 2–5 short, grounded theme labels — either *descriptive* (an analyst-style topic label like \"Onboarding friction\") or *in vivo* (the respondent's own framing) — each tied to the specific message and a verbatim supporting quote.\n- **Cycle-2 axial clustering.** Across all responses, near-duplicate themes are merged into a **canonical code frame per question**, so \"too expensive\", \"not worth the price\", and \"costs too much\" collapse into one countable code.\n- **Sentiment and intensity** scoring per theme.\n- **Quote retrieval** — representative verbatims for every theme, preserved in the respondent's original words.\n- **Segment comparison** — how themes differ across customer groups.\n\nYou get a structured, quantified report with evidence — not a spreadsheet of raw text. Explore this further in [how to analyze open-ended survey responses with AI](/docs/ai-analyze-open-ended-survey-responses), [thematic analysis](/docs/thematic-analysis-guide), and [understanding themes and patterns](/docs/understanding-themes-patterns).\n\n## The deeper fix: better verbatims, not just faster coding\n\nFaster coding still leaves a ceiling problem: a survey verbatim is whatever the respondent typed in one rushed box — often \"it was fine.\" You cannot code depth that was never captured. This is where Koji's core advantage applies. Instead of one static text field, Koji runs an **AI-moderated conversation** that probes shallow answers in the moment (\"you said it was fine — what would have made it great?\"). 
The result is a verbatim with reasoning attached, captured as structured Q&A pairs.\n\nKoji's six **structured question types** (open_ended, scale, single_choice, multiple_choice, ranking, yes_no) let one study combine codable open ends with quantitative scales, so you can correlate *what* people rate with *why* they rate it. See the [structured questions guide](/docs/structured-questions-guide).\n\n## Practical tips for cleaner verbatim analysis\n\n- **Ask one clear thing per open question** — compound questions produce uncodable answers.\n- **Let the AI probe** rather than stacking more text boxes.\n- **Keep the code frame per question** — codes that span unrelated questions blur meaning.\n- **Review low-confidence codes**, not every code — that is where AI plus human is strongest.\n- **Track code frames over time** so you can trend the same themes across waves.\n\n## Sentiment, intensity, and emotion in verbatim analysis\n\nCoding *what* a response is about is only half the job; *how strongly* and *how positively* it was said is the other half. Mature verbatim analysis layers three signals onto every coded comment:\n\n- **Sentiment** — positive, negative, or neutral toward the topic.\n- **Intensity** — how strong the feeling is (\"annoying\" vs \"the worst experience of my year\").\n- **Emotion** — the specific feeling (frustration, delight, confusion), which often predicts behavior better than a polarity label.\n\nKoji scores sentiment and intensity per theme automatically, so you can rank themes not just by frequency but by emotional weight — surfacing the issue that 8% mention but feel furious about, which a pure count would bury. See [sentiment analysis in interviews](/docs/sentiment-analysis-interviews) for more.\n\n## From codes to decisions\n\nA code frame is a means, not an end. Once verbatims are coded and quantified, close the loop:\n\n1. **Rank themes** by frequency and intensity together.\n2. 
**Cross-tabulate** by segment — does the complaint concentrate in new users, enterprise, or one region?\n3. **Pull the verbatim quote** that best represents each theme to make it real for stakeholders.\n4. **Trend the code frame** across waves to see whether a fix actually moved the needle.\n\nKoji's report does this for you — themes, counts, sentiment, representative quotes, and segment splits in one shareable view via [generating research reports](/docs/generating-research-reports).\n\n## Common verbatim coding mistakes to avoid\n\n- **A vague code frame.** Codes without clear definitions produce inconsistent coding and untrustworthy counts.\n- **Codes that span unrelated questions.** Keep the frame per question so meaning stays sharp.\n- **Coding only the dominant themes.** Minority and dissenting views are often the most actionable; Koji is prompted to surface them rather than collapse everything into the majority.\n- **Trusting counts without reading quotes.** Always sanity-check a few verbatims behind each code.\n- **Treating a one-word answer as data.** It is the absence of data — capture depth upstream with AI follow-up probing instead.\n\n## Related Resources\n\n- [Structured Questions Guide](/docs/structured-questions-guide) — the six question types behind every Koji study\n- [How to Analyze Open-Ended Survey Responses with AI](/docs/ai-analyze-open-ended-survey-responses)\n- [Coding Qualitative Data](/docs/coding-qualitative-data)\n- [Qualitative Research Codebook](/docs/qualitative-research-codebook)\n- [Thematic Analysis Guide](/docs/thematic-analysis-guide)\n- [Topic Modeling for Customer Feedback](/docs/topic-modeling-customer-feedback)","category":"Analysis & Synthesis","lastModified":"2026-06-22T03:20:22.595591+00:00","metaTitle":"Verbatim Analysis: Code Open-Ended Responses at Scale (2026) | Koji","metaDescription":"Verbatim coding turns open-ended answers into countable themes. 
Learn the code-frame workflow, the manual vs AI tradeoff, and how Koji auto-codes verbatims with quotes, sentiment, and the depth surveys miss.","keywords":["verbatim analysis","verbatim coding","open ended response coding","code frame","survey coding","open ended survey analysis","autocoding verbatims","qualitative coding"],"aiSummary":"Verbatim analysis (verbatim coding) classifies open-ended survey responses into a structured code frame so free text becomes countable. Manual coding — read responses, build a code frame, pilot, refine, code, QC, quantify — is accurate but slow, expensive, and inconsistent across coders. AI coding categorizes responses automatically at a fraction of the time and cost, with slightly lower edge-case accuracy. Koji auto-codes verbatims: cycle-1 open coding (2–5 grounded descriptive or in_vivo theme labels per answer with supporting quotes), cycle-2 axial clustering into a canonical per-question code frame, sentiment scoring, quote retrieval, and segment comparison — each theme carries a high/medium/low confidence flag for human audit. The deeper fix: Koji replaces thin survey verbatims with AI-moderated conversations that probe for the why, capturing reasoned answers instead of five-word fragments. Koji uses six structured question types (open_ended, scale, single_choice, multiple_choice, ranking, yes_no).","aiPrerequisites":["Open-ended responses to analyze (survey export or new Koji study)","Familiarity with basic survey design"],"aiLearningOutcomes":["Understand what verbatim coding is and why code frames matter","Run the seven-step manual coding workflow","Weigh the manual vs AI verbatim analysis tradeoff","See how Koji auto-codes verbatims with cycle-1 and cycle-2 coding","Capture deeper verbatims with AI follow-up probing"],"aiDifficulty":"intermediate","aiEstimatedTime":"11 min read"}],"pagination":{"total":1,"returned":1,"offset":0}}