Verbatim Analysis: How to Code and Analyze Open-Ended Responses at Scale (2026)

Verbatim coding turns messy open-ended answers into countable themes — but manual coding is slow, expensive, and bias-prone. Learn the code-frame workflow, the manual vs AI tradeoff, and how Koji auto-codes verbatims and captures the depth a survey verbatim never could.

The short answer

Verbatim analysis (or verbatim coding) is the process of classifying open-ended, free-text survey responses into a structured set of codes so you can count, compare, and quantify what people actually said. Traditionally it is done by hand: a researcher reads every comment, builds a code frame, then assigns codes response by response across several rounds of review to control for error and bias (Blix; Voxco). Done carefully it is accurate, but it is slow and expensive, and the effort grows with every additional response.

AI changes the economics. AI-powered coding categorizes responses automatically, handling large datasets in a fraction of the time and cost of manual coding (Insight Platforms). Koji auto-codes verbatims into themes with supporting quotes and sentiment — and goes one step further: it replaces the thin survey verbatim with an AI-moderated interview that probes for the why, so each "verbatim" is a reasoned answer instead of a five-word fragment.

What verbatim coding is — and why it exists

Open-ended questions capture what closed questions miss, but raw text is not analyzable on its own. Verbatim coding bridges the gap: once each comment is assigned to a code, each code gets a count, and unstructured opinion becomes quantified evidence you can chart and track over time.

A code frame (or codebook) is the heart of it — the agreed list of categories, each with a clear definition and example responses. Build it well and two coders reach the same answer; build it poorly and your "data" is just two people's guesses. See the qualitative research codebook guide for how to construct one.
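As a rough illustration of what "a clear definition and example responses" means in practice, a code frame can be held as a small data structure that refuses codes without definitions and flags assignments that fall outside the frame. This is a minimal sketch: the class names, fields, and sample codes are hypothetical and are not Koji's schema.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Code:
    """One entry in a code frame: a name, a definition, and example responses."""
    name: str
    definition: str
    examples: tuple[str, ...] = ()


@dataclass
class CodeFrame:
    """The agreed list of codes for a single open-ended question."""
    question: str
    codes: dict[str, Code] = field(default_factory=dict)

    def add(self, code: Code) -> None:
        # A code without a definition is what makes two coders disagree.
        if not code.definition.strip():
            raise ValueError(f"code {code.name!r} needs a definition")
        if code.name in self.codes:
            raise ValueError(f"duplicate code {code.name!r}")
        self.codes[code.name] = code

    def validate(self, assigned: list[str]) -> list[str]:
        """Return any assigned code names that are not in the frame."""
        return [name for name in assigned if name not in self.codes]


# Hypothetical frame for one question.
frame = CodeFrame(question="Why did you cancel?")
frame.add(Code("price", "Cost or value for money is the stated reason.",
               ("too expensive", "not worth the price")))
frame.add(Code("onboarding_friction", "Setup or first use was hard.",
               ("couldn't get it installed",)))

unknown = frame.validate(["price", "support"])
print(unknown)  # ['support']
```

An assignment that falls outside the frame, like "support" here, is a signal to either extend the frame deliberately or recode the response.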

The manual verbatim coding workflow

  1. Read a sample of responses to understand the range.
  2. Draft the code frame — group recurring ideas into named codes with definitions.
  3. Pilot-code a subset with two coders; measure agreement.
  4. Refine the frame — merge overlaps, split codes that are too broad.
  5. Code the full set, allowing multiple codes per response.
  6. Quality-check — review low-confidence assignments and resolve disagreements.
  7. Quantify — count codes, cross-tabulate by segment, pull representative quotes.

This mirrors formal qualitative coding stages — see coding qualitative data and open, axial, and selective coding. Done by hand on thousands of responses, it can take a researcher weeks.
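Step 3 says to measure agreement without naming a statistic. One common choice is Cohen's kappa, which corrects raw agreement for the agreement two coders would reach by chance. The sketch below assumes each coder assigns exactly one code per response; with multiple codes per response you would typically compute it per code (present or absent). The labels are invented for illustration.

```python
from collections import Counter


def cohens_kappa(coder_a: list[str], coder_b: list[str]) -> float:
    """Cohen's kappa for two coders who each gave one code per response."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must label the same non-empty set")
    n = len(coder_a)
    # Share of responses where the two coders chose the same code.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Agreement expected by chance, from each coder's code frequencies.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used one identical code throughout
    return (observed - expected) / (1 - expected)


# Invented pilot sample: ten responses, two disagreements.
coder_a = ["price"] * 4 + ["ux"] * 3 + ["support"] * 3
coder_b = ["price"] * 3 + ["ux"] * 4 + ["support"] * 2 + ["price"]

kappa = cohens_kappa(coder_a, coder_b)
print(f"{kappa:.2f}")  # 0.70
```

Raw agreement here is 80%, but kappa is lower because some of that agreement would occur by chance. What counts as "good enough" is a judgment call for the team; a low value usually points back to step 4, refining the frame.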

Manual vs. AI verbatim analysis

| Dimension | Manual coding | AI verbatim analysis (Koji) |
|---|---|---|
| Speed | Days to weeks | Minutes |
| Cost | High (analyst hours) | Low (automated) |
| Consistency | Varies by coder and fatigue | Stable, auditable prompt |
| Scale | Hundreds before it strains | Tens of thousands |
| Nuance on edge cases | Strong | Strong, with low-confidence flags for review |
| Bias control | Multiple review rounds | Surfaces minority + dissenting themes; confidence scores |

The honest tradeoff: automated coding can be slightly less accurate than a careful human on rare edge cases, but it is dramatically faster and cheaper, and its consistency does not degrade by response 4,000. Koji mitigates the accuracy gap by attaching a confidence level (high/medium/low) and supporting quotes to every coded theme, so a human can audit exactly which comments drove each code.
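The audit step can be sketched as a simple filter: pull only the themes whose confidence falls in the levels you want a human to check, along with their supporting quotes. The `CodedTheme` record and its field names are assumptions made for this example, not Koji's export format.

```python
from dataclasses import dataclass, field


@dataclass
class CodedTheme:
    """Hypothetical record for one coded theme."""
    theme: str
    confidence: str                      # "high", "medium", or "low"
    quotes: list[str] = field(default_factory=list)


def review_queue(themes, levels=("low",)):
    """Return the themes a human should audit, keeping their quotes."""
    return [t for t in themes if t.confidence in levels]


coded = [
    CodedTheme("price", "high", ["too expensive for a small team"]),
    CodedTheme("onboarding_friction", "medium", ["took a while to set up"]),
    CodedTheme("trust", "low", ["not sure about it"]),
]

for t in review_queue(coded):
    print(t.theme, t.quotes)  # trust ['not sure about it']
```

Passing `levels=("low", "medium")` widens the queue when the stakes of a study justify more human review.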

How Koji auto-codes verbatims

When Koji analyzes responses, it performs the coding workflow automatically:

  • Cycle-1 open coding. Each open-ended answer gets 2–5 short, grounded theme labels — either descriptive (an analyst-style topic label like "Onboarding friction") or in vivo (the respondent's own framing) — each tied to the specific message and a verbatim supporting quote.
  • Cycle-2 axial clustering. Across all responses, near-duplicate themes are merged into a canonical code frame per question, so "too expensive", "not worth the price", and "costs too much" collapse into one countable code.
  • Sentiment and intensity scoring per theme.
  • Quote retrieval — representative verbatims for every theme, preserved in the respondent's original words.
  • Segment comparison — how themes differ across customer groups.
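To make the merge step concrete, here is a deliberately simple sketch of collapsing near-duplicate labels into one countable code. It uses a hand-written alias table, which is an assumption for illustration only; it is not how Koji clusters themes, and a production system would more likely rely on semantic similarity than on a fixed lookup.

```python
from collections import Counter

# Hypothetical alias table: raw theme label -> canonical code.
ALIASES = {
    "too expensive": "price",
    "not worth the price": "price",
    "costs too much": "price",
    "hard to set up": "onboarding_friction",
}


def canonicalize(label: str) -> str:
    """Normalize case and whitespace, then map to a canonical code if known."""
    key = " ".join(label.lower().split())
    return ALIASES.get(key, key)


def count_codes(responses: list[list[str]]) -> Counter:
    """Count each canonical code at most once per response."""
    counts = Counter()
    for labels in responses:
        counts.update({canonicalize(label) for label in labels})
    return counts


# Each inner list holds the raw theme labels attached to one response.
responses = [
    ["Too expensive", "hard to set up"],
    ["not worth the price"],
    ["Costs too much", "too expensive"],   # two labels, one canonical code
    ["slow support"],
]

counts = count_codes(responses)
print(counts["price"])  # 3
```

Counting per response rather than per label keeps a single respondent who phrases the same complaint twice from inflating the code.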

You get a structured, quantified report with evidence — not a spreadsheet of raw text. Explore this further in how to analyze open-ended survey responses with AI, thematic analysis, and understanding themes and patterns.

The deeper fix: better verbatims, not just faster coding

Faster coding still leaves a ceiling problem: a survey verbatim is whatever the respondent typed in one rushed box — often "it was fine." You cannot code depth that was never captured. This is where Koji's core advantage applies. Instead of one static text field, Koji runs an AI-moderated conversation that probes shallow answers in the moment ("you said it was fine — what would have made it great?"). The result is a verbatim with reasoning attached, captured as structured Q&A pairs.

Koji's six structured question types (open_ended, scale, single_choice, multiple_choice, ranking, yes_no) let one study combine codable open ends with quantitative scales, so you can correlate what people rate with why they rate it. See the structured questions guide.
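One way to relate what people rate to why they rate it is to average the scale answer across every response carrying a given code. The sketch below assumes each response has already been reduced to a numeric rating plus a list of codes; the dictionary keys and the data are invented for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical paired answers: one scale rating and the codes
# assigned to the matching open-ended answer.
answers = [
    {"rating": 2, "codes": ["price"]},
    {"rating": 3, "codes": ["price", "onboarding_friction"]},
    {"rating": 9, "codes": ["support_quality"]},
    {"rating": 8, "codes": ["support_quality"]},
    {"rating": 4, "codes": ["onboarding_friction"]},
]


def mean_rating_by_code(answers):
    """Average rating among responses that mention each code."""
    ratings = defaultdict(list)
    for answer in answers:
        for code in answer["codes"]:
            ratings[code].append(answer["rating"])
    return {code: mean(values) for code, values in ratings.items()}


by_code = mean_rating_by_code(answers)
print(by_code["price"])            # 2.5
print(by_code["support_quality"])  # 8.5
```

With a sample this small the averages are only descriptive; on a real study you would also report how many responses sit behind each mean.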

Practical tips for cleaner verbatim analysis

  • Ask one clear thing per open question — compound questions produce uncodable answers.
  • Let the AI probe rather than stacking more text boxes.
  • Keep a separate code frame for each question; codes that span unrelated questions blur meaning.
  • Review low-confidence codes, not every code — that is where AI plus human is strongest.
  • Track code frames over time so you can trend the same themes across waves.

Sentiment, intensity, and emotion in verbatim analysis

Coding what a response is about is only half the job; how strongly and how positively it was said is the other half. Mature verbatim analysis layers three signals onto every coded comment:

  • Sentiment — positive, negative, or neutral toward the topic.
  • Intensity — how strong the feeling is ("annoying" vs "the worst experience of my year").
  • Emotion — the specific feeling (frustration, delight, confusion), which often predicts behavior better than a polarity label.

Koji scores sentiment and intensity per theme automatically, so you can rank themes not just by frequency but by emotional weight. That surfaces the issue only 8% of respondents mention but feel furious about, which a pure count would bury. See sentiment analysis in interviews for more.
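The difference between ranking by count and ranking by emotional weight can be sketched with two sorts and a filter for low-frequency, high-intensity themes. The 1 to 5 intensity scale, the thresholds, and the sample data are all assumptions for this example; they are not Koji's scoring scheme.

```python
# Hypothetical per-theme summary: share of respondents mentioning the theme
# and mean intensity on an assumed 1 (mild) to 5 (furious) scale.
themes = [
    {"theme": "pricing", "share": 0.40, "intensity": 2.1},
    {"theme": "ui polish", "share": 0.25, "intensity": 1.5},
    {"theme": "data loss on export", "share": 0.08, "intensity": 4.8},
]


def rank(themes, key):
    """Theme names sorted from highest to lowest on the given field."""
    return [t["theme"] for t in sorted(themes, key=lambda t: t[key], reverse=True)]


def hidden_hotspots(themes, max_share=0.10, min_intensity=4.0):
    """Themes few people mention but that those people feel strongly about."""
    return [t["theme"] for t in themes
            if t["share"] <= max_share and t["intensity"] >= min_intensity]


print(rank(themes, "share")[0])      # pricing
print(rank(themes, "intensity")[0])  # data loss on export
print(hidden_hotspots(themes))       # ['data loss on export']
```

Keeping the two rankings separate, rather than folding them into one composite score, avoids baking in an arbitrary weighting between frequency and intensity.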

From codes to decisions

A code frame is a means, not an end. Once verbatims are coded and quantified, close the loop:

  1. Rank themes by frequency and intensity together.
  2. Cross-tabulate by segment — does the complaint concentrate in new users, enterprise, or one region?
  3. Pull the verbatim quote that best represents each theme to make it real for stakeholders.
  4. Trend the code frame across waves to see whether a fix actually moved the needle.

Koji's report does this for you: themes, counts, sentiment, representative quotes, and segment splits in one shareable view. See generating research reports.
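If you are working from a raw export instead, the segment cross-tab in step 2 can be sketched with counters: for each segment, compute the share of respondents who mentioned each code. The row layout, segment names, and codes are invented for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical rows: (segment, codes assigned to that respondent's answer).
rows = [
    ("new_user", ["onboarding_friction"]),
    ("new_user", ["onboarding_friction", "price"]),
    ("new_user", ["price"]),
    ("enterprise", ["price"]),
    ("enterprise", ["sso_missing"]),
]


def crosstab(rows):
    """Share of respondents in each segment who mentioned each code."""
    totals = Counter(segment for segment, _ in rows)
    counts = defaultdict(Counter)
    for segment, codes in rows:
        counts[segment].update(set(codes))  # one count per respondent per code
    return {
        segment: {code: n / totals[segment] for code, n in code_counts.items()}
        for segment, code_counts in counts.items()
    }


table = crosstab(rows)
print(round(table["new_user"]["onboarding_friction"], 2))  # 0.67
print(table["enterprise"]["price"])                        # 0.5
```

The same function covers step 4 if you pass the survey wave in place of the segment, which gives each code's share per wave.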

Common verbatim coding mistakes to avoid

  • A vague code frame. Codes without clear definitions produce inconsistent coding and untrustworthy counts.
  • Codes that span unrelated questions. Keep the frame per question so meaning stays sharp.
  • Coding only the dominant themes. Minority and dissenting views are often the most actionable; Koji is prompted to surface them rather than collapse everything into the majority.
  • Trusting counts without reading quotes. Always sanity-check a few verbatims behind each code.
  • Treating a one-word answer as data. It is the absence of data — capture depth upstream with AI follow-up probing instead.

Related Resources

Related Articles

How to Analyze Open-Ended Survey Responses with AI (2026 Guide)

Stop manually coding free-text survey responses. Learn how AI analyzes open-ended answers at scale — surfacing themes, sentiment, and quotes in minutes, plus why an AI interview captures 10x more depth than any survey can.

How to Code Qualitative Data: A Step-by-Step Guide

Learn the complete process of qualitative coding — from building a codebook to identifying themes — and how AI tools like Koji automate the most time-consuming parts.

Open, Axial, and Selective Coding: The Complete Guide to Qualitative Coding Phases

A complete guide to the three coding phases of grounded theory — open, axial, and selective coding. Examples, decision points, and how AI-native research with Koji compresses weeks of coding into minutes.

How to Build a Qualitative Research Codebook (With Examples and Templates)

A qualitative codebook is the rulebook for how you code your data — code names, definitions, inclusion criteria, examples, and exceptions. Done well, it makes coding consistent across analysts. Done badly, it produces findings nobody can defend.

Structured Questions in AI Interviews

Mix quantitative data collection — scales, ratings, multiple choice, ranking — with AI-powered conversational follow-up in a single interview.

The Complete Guide to Thematic Analysis

Learn how to systematically analyze qualitative data using Braun and Clarke's six-phase thematic analysis framework.

Topic Modeling for Customer Feedback: How to Find Themes in Open-Ended Responses at Scale

A practical guide to topic modeling for customer feedback — how LDA and modern NLP surface hidden themes in open-ended survey responses and reviews, the limitations of traditional methods, and the faster AI-native alternative.

Understanding Themes & Patterns

Learn how Koji identifies recurring themes across interviews and how to use them for decision-making.