Open, Axial, and Selective Coding: The Complete Guide to Qualitative Coding Phases

What are open, axial, and selective coding?

Open, axial, and selective coding are the three sequential phases of qualitative analysis in grounded theory — each one moves your data from raw text toward an explanatory theory. Open coding breaks data into discrete concepts. Axial coding finds the relationships between those concepts. Selective coding chooses one core category and ties everything else to it.

Developed by Anselm Strauss and Juliet Corbin in the 1990s as a refinement of Glaser and Strauss's original grounded theory (Springer, 2019), the three-phase approach is now one of the most widely used coding frameworks in qualitative research: it is taught in PhD programs, used by UX researchers, and applied in product discovery worldwide.

This guide walks through each phase with concrete examples, shows where teams typically get stuck, and explains how AI-native research platforms like Koji compress what used to be weeks of manual coding into automated thematic analysis you can validate and refine in hours.

Why grounded theory coding still matters

In an era of AI-assisted analysis, why bother understanding manual coding phases? Three reasons:

AI thematic analysis is a coding accelerator, not a replacement. A 2025 tutorial in the Journal of Medical Internet Research found that ChatGPT-assisted coding "enhanced the efficiency and diversity of coding" but showed shortcomings in depth and context compared to manual work (JMIR, 2025). Researchers who understand the underlying phases can validate and refine AI output; those who do not, cannot.
Grounded theory is how you build theory from data, not impose theory on data. Open coding in particular is the discipline of staying open to what the data is telling you, rather than coding to confirm a hypothesis you already hold.
The three phases map onto how every product team thinks anyway — observation → relationships → core insight. Knowing the formal vocabulary makes synthesis conversations sharper.

Phase 1: Open coding

Open coding is the first pass through your data, where you read line-by-line (or segment-by-segment) and label each meaningful unit with a short code. The goal is breadth, not precision — capture every concept, action, or meaning you see, without pre-committing to categories.

How it works

Read or listen to your data once with no coding — interview transcripts, field notes, open-ended survey responses. Get a feel for the whole.
Pass through again, line-by-line. For each meaningful chunk, write a short label (a "code") that captures what is happening. Codes can be:
- Descriptive — "first-time login confusion"
- In-vivo — using participants' exact words: "the dashboard is overwhelming"
- Process — "deciding-whether-to-upgrade"
Keep codes provisional. Resist the urge to consolidate yet. You will end up with hundreds of codes — that is normal and correct.
Write memos. When a code feels significant or ambiguous, write a paragraph explaining what you saw. These memos become the connective tissue for axial coding.

Open coding example

Suppose you ran six interviews about how product managers prepare for stakeholder reviews. A passage like:

"I usually rebuild my deck from scratch the morning of, even though I have a template, because I never feel like the template covers what this particular VP cares about."

might produce open codes like:

last-minute deck rebuilding
template-not-fitting-audience
VP-specific preferences
avoiding-template-reuse
morning-of-review preparation

Notice the codes are granular and overlapping. That is fine in open coding.
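One way to keep the passage, its open codes, and the memo together is a small record per coded segment. The sketch below is illustrative only: `CodedSegment`, its field names, the interview ID, and the memo text are assumptions made for this example, not part of any tool mentioned in this guide.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class CodedSegment:
    """One meaningful chunk of data plus the provisional codes attached to it."""
    interview_id: str                       # hypothetical identifier for the source interview
    text: str                               # the participant's words, verbatim
    codes: list[str] = field(default_factory=list)
    memo: str = ""                          # analyst's reasoning behind the codes


segment = CodedSegment(
    interview_id="pm-01",
    text=(
        "I usually rebuild my deck from scratch the morning of, even though "
        "I have a template, because I never feel like the template covers "
        "what this particular VP cares about."
    ),
    codes=[
        "last-minute deck rebuilding",
        "template-not-fitting-audience",
        "VP-specific preferences",
        "avoiding-template-reuse",
        "morning-of-review preparation",
    ],
    # Invented memo, shown only to illustrate where the reasoning is stored.
    memo="Template exists but is distrusted; rebuilding seems tied to one VP.",
)


def code_frequencies(segments: list[CodedSegment]) -> Counter:
    """Count how often each open code has been applied across segments."""
    return Counter(code for seg in segments for code in seg.codes)


frequencies = code_frequencies([segment])
print(len(segment.codes), "codes on one passage")
```

Keeping the memo on the same record as the codes means the reasoning travels with the label when codes are later grouped.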

Common open coding mistakes

Coding too sparsely. If you have one code per page, you are summarizing, not coding. Aim for codes every 1–3 sentences.
Importing pre-existing categories. If you already "know" the themes, you will see only those themes. Open coding requires deliberate openness.
Conflating codes too early. "Confusion" and "frustration" feel similar but may have different antecedents. Keep them separate until axial coding tells you whether to merge.
Skipping memos. Memos are where the meaning lives. A spreadsheet of codes without memos loses the analytical reasoning behind each label.
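The "codes every 1–3 sentences" guideline can be turned into a rough self-check. This is a minimal sketch under stated assumptions: the function names are hypothetical, the sentence splitter is deliberately naive, and the sample transcript is invented.

```python
import re


def sentences_per_code(transcript: str, code_count: int) -> float:
    """Return the average number of sentences covered by each code."""
    # Naive split on ., ! or ? followed by whitespace; good enough for a rough check.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", transcript.strip()) if s]
    if code_count == 0:
        return float("inf")
    return len(sentences) / code_count


def is_too_sparse(transcript: str, code_count: int,
                  max_sentences_per_code: float = 3.0) -> bool:
    """Flag a transcript that averages more than about 3 sentences per code."""
    return sentences_per_code(transcript, code_count) > max_sentences_per_code


# Invented six-sentence transcript for illustration.
sample = "I open the app. I get lost. I close it. I ask a colleague. She shows me. I forget again."
print(is_too_sparse(sample, code_count=1))  # six sentences per code
print(is_too_sparse(sample, code_count=3))  # two sentences per code
```

A flag here is a prompt to reread the transcript, not proof that the coding is wrong; some passages genuinely carry little codable content.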

Phase 2: Axial coding

Axial coding takes the codes from open coding and finds the relationships between them — what causes what, what conditions affect what, what consequences flow from what. The output of axial coding is a smaller set of categories, each with subcategories that explain how the underlying codes connect.

Strauss and Corbin's coding paradigm

The canonical axial coding model uses a six-element framework — sometimes called the "coding paradigm" (Springer, 2019):

Phenomenon — the central concept being explained.
Causal conditions — what gives rise to it.
Context — the setting in which it occurs.
Intervening conditions — broader structural factors that shape it.
Action/interaction strategies — how people respond.
Consequences — what happens as a result.

Not every study uses all six elements. The framework is a scaffold, not a checklist.

Axial coding example

Using our open codes from earlier, axial coding might cluster them into:

Category: Audience-fit anxiety in stakeholder presentations

Causal conditions: generic templates, varied stakeholder priorities
Context: high-stakes review meetings, limited prep time
Intervening conditions: lack of stakeholder profile data, no shared template library
Action strategies: last-minute rebuilding, asking peers for the latest VP version, padding decks with backup slides
Consequences: late nights, slide redundancy, decreased confidence

Now you have a category with structure, not just a list of codes. A reader can see the causal logic.
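The same category can be recorded as a structure with one slot per paradigm element. This is a sketch, assuming a hypothetical `AxialCategory` class; the values are taken from the example above, and the `missing_elements` helper is an illustrative addition rather than part of Strauss and Corbin's model.

```python
from dataclasses import dataclass, field


@dataclass
class AxialCategory:
    """A category organized by the six coding-paradigm elements."""
    phenomenon: str
    causal_conditions: list[str] = field(default_factory=list)
    context: list[str] = field(default_factory=list)
    intervening_conditions: list[str] = field(default_factory=list)
    action_strategies: list[str] = field(default_factory=list)
    consequences: list[str] = field(default_factory=list)

    def missing_elements(self) -> list[str]:
        """Names of paradigm elements with nothing recorded yet."""
        return [name for name, value in vars(self).items() if not value]


audience_fit = AxialCategory(
    phenomenon="Audience-fit anxiety in stakeholder presentations",
    causal_conditions=["generic templates", "varied stakeholder priorities"],
    context=["high-stakes review meetings", "limited prep time"],
    intervening_conditions=[
        "lack of stakeholder profile data",
        "no shared template library",
    ],
    action_strategies=[
        "last-minute rebuilding",
        "asking peers for the latest VP version",
        "padding decks with backup slides",
    ],
    consequences=["late nights", "slide redundancy", "decreased confidence"],
)

print(audience_fit.missing_elements())
```

Because the framework is a scaffold rather than a checklist, an empty slot is a question to take back to the data, not an error.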

When to move from open to axial

Move when:

You stop seeing meaningfully new open codes (an early sign of saturation — see Data Saturation in Qualitative Research).
Open codes are starting to cluster around recognizable concepts.
You have written enough memos to have a sense of the relationships at stake.

In practice, open and axial coding overlap. Most researchers iterate between the two before settling categories.
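The first signal, fewer new open codes, can be tracked by counting how many previously unseen codes each interview contributes. The sketch below assumes a hypothetical `new_codes_per_interview` helper and an invented set of codes per interview, listed in the order the interviews were coded.

```python
def new_codes_per_interview(codes_by_interview: list[set[str]]) -> list[int]:
    """For each interview, count the codes not seen in any earlier interview."""
    seen: set[str] = set()
    counts: list[int] = []
    for codes in codes_by_interview:
        counts.append(len(codes - seen))  # codes appearing for the first time
        seen |= codes
    return counts


# Invented example data: one set of open codes per interview.
interviews = [
    {"last-minute deck rebuilding", "template-not-fitting-audience", "VP-specific preferences"},
    {"template-not-fitting-audience", "asking peers for latest version", "backup slides"},
    {"VP-specific preferences", "backup slides", "late nights"},
    {"last-minute deck rebuilding", "late nights"},
    {"template-not-fitting-audience", "backup slides"},
]

print(new_codes_per_interview(interviews))
```

A run of zeros or near-zeros at the end of the list suggests, but does not prove, that open coding is approaching saturation; wording differences between near-identical codes can hide or inflate the count.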

Phase 3: Selective coding

Selective coding is the final phase, where you choose one core category, the unifying concept that explains the most variation in your data, and integrate every other category around it. This is where grounded theory earns its name: the output is a theory grounded in the data.

How to identify the core category

Strauss and Corbin proposed several criteria. The core category should:

Appear frequently in the data
Connect to all (or most) other categories
Explain variation across cases
Be abstract enough to apply broadly, concrete enough to feel grounded
Have explanatory power — using it, can you make sense of why participants did what they did?
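The "connects to all (or most) other categories" criterion can be screened with a simple count, which may help narrow the candidates before the judgment-based criteria are applied. This is only a sketch: the `connectivity` function, the category names, and the recorded links are hypothetical, and abstraction and explanatory power still have to be assessed by reading the data.

```python
def connectivity(category: str, categories: list[str],
                 links: set[frozenset[str]]) -> float:
    """Share of the other categories that this category is linked to."""
    others = [c for c in categories if c != category]
    if not others:
        return 0.0
    linked = sum(1 for other in others if frozenset({category, other}) in links)
    return linked / len(others)


categories = [
    "Audience-fit anxiety",
    "Template-tool mismatch",
    "Time-pressure cascading",
    "Cross-team learning gaps",
]

# Hypothetical relationships an analyst recorded during axial coding.
links = {
    frozenset({"Audience-fit anxiety", "Template-tool mismatch"}),
    frozenset({"Audience-fit anxiety", "Time-pressure cascading"}),
    frozenset({"Audience-fit anxiety", "Cross-team learning gaps"}),
    frozenset({"Template-tool mismatch", "Time-pressure cascading"}),
}

# Most-connected categories first; ties keep their original order.
ranked = sorted(categories, key=lambda c: connectivity(c, categories, links), reverse=True)
print(ranked[0])
```

A high score marks a category worth examining as a core candidate; it does not settle the choice.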

Selective coding example

From the stakeholder presentation study, we might have several axial categories:

Audience-fit anxiety
Template-tool mismatch
Time-pressure cascading
Cross-team learning gaps

The selective coding question: is there one core category that ties these together?

A candidate: "the gap between generic preparation tools and audience-specific expectations." Each axial category becomes a facet of this core. Audience-fit anxiety is the felt experience; template-tool mismatch is the artifact-level cause; time-pressure cascading is the temporal consequence; cross-team learning gaps is the structural condition.

Now you have a theory. You can write a research narrative around it. You can design product interventions against it.

When selective coding is too ambitious

Not every qualitative study needs to produce a grounded theory. For most product and UX research, axial coding output is sufficient: a set of well-structured categories with relationships explained. Selective coding is the right move when:

You are doing dissertation-level or publication-targeted research.
The decision your insights inform is large enough to warrant a unifying explanation.
You have enough data (typically 20+ rich interviews) to support theoretical claims.

For a sprint-level study, stop at axial coding and skip selective coding. Forcing selective coding on thin data produces theories that overreach.

Open vs. axial vs. selective: at-a-glance

Phase	Goal	Output	Typical duration (manual)
Open	Identify concepts	100s of codes + memos	Days to weeks
Axial	Find relationships	5–15 categories with subcategory structure	Days
Selective	Build a unifying theory	One core category integrating all others	Days

Manual grounded theory coding for a 20-interview study typically takes 80–120 hours of analyst time across the three phases — which is why so few teams do it rigorously.
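The 80–120 hour figure follows from simple arithmetic, assuming roughly one hour of audio per interview and the ratio of 4–6 coding hours per hour of audio cited in this guide's FAQ. The function name and parameter names below are hypothetical.

```python
def manual_coding_hours(
    interviews: int,
    audio_hours_per_interview: float = 1.0,
    coding_hours_per_audio_hour: tuple[float, float] = (4.0, 6.0),
) -> tuple[float, float]:
    """Return a (low, high) estimate of analyst hours across all three phases."""
    audio_hours = interviews * audio_hours_per_interview
    low, high = coding_hours_per_audio_hour
    return (audio_hours * low, audio_hours * high)


print(manual_coding_hours(20))  # (80.0, 120.0)
```

Changing `audio_hours_per_interview` shows how quickly the estimate grows for longer sessions: 20 interviews of 90 minutes each come to 120–180 hours under the same ratio.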

How AI-native research changes the workflow

This is where Koji enters the picture. Rather than spend 80–120 hours coding by hand, modern teams use AI to accelerate the heavy lift while preserving researcher judgment for the strategic moves.

With Koji:

AI-moderated interviews generate clean transcripts automatically — no transcription cost, no waiting.
Automatic thematic analysis runs the equivalent of open + axial coding the moment interviews complete: clustering quotes, surfacing recurring themes, and labeling them in plain language.
Quality scoring (1–5 scale) identifies the richest interviews — the ones most likely to yield axial categories with explanatory depth.
AI consultants can be configured per-study to encode your research focus and coding priorities, so the AI's clustering reflects what you care about, not just what is statistically most frequent.
Six structured question types (open_ended, scale, single_choice, multiple_choice, ranking, yes_no) capture both the qualitative material that powers grounded coding and the quantitative anchors that ground theoretical claims. See the Structured Questions Guide.
Real-time reports let you see emerging categories as interviews land, so axial coding happens during fieldwork instead of after.

The critical principle: AI accelerates open and axial coding; selective coding remains a human judgment call. Koji surfaces candidate themes faster than any manual workflow can, but choosing the core category — the unifying explanation that earns the right to be called a theory — is editorial work that belongs to you.

The 2025 JMIR tutorial on ChatGPT for grounded theory captured the principle well: AI is strongest at the breadth-and-clustering end of coding, weakest at the depth-and-judgment end (JMIR, 2025). Use Koji for the former; bring researcher expertise to the latter.

A practical workflow combining manual and AI coding

Run interviews via Koji (AI-moderated voice or text — see Setting Up Voice Interviews).
Let Koji's automatic thematic analysis surface initial themes. Treat these as candidate axial categories, not finished conclusions.
Read 3–5 full transcripts manually. Even with AI clustering, deep reading is irreplaceable for catching nuance, contradictions, and emergent codes the AI missed.
Refine categories. Merge AI-generated themes that overlap, split themes that conflate distinct concepts, add categories the AI missed.
Apply the coding paradigm to top categories. For each major axial category, ask: what causes it, what consequences flow from it, what conditions shape it.
Identify the core category (selective coding) — only if the study scope warrants it.
Write the research narrative using the core category as the spine. See Research Storytelling.
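For the "refine categories" step, one way to spot themes that may deserve merging is to measure how many supporting quotes they share. The sketch below uses Jaccard overlap on quote IDs. The `themes` mapping, the quote IDs, and the 0.5 threshold are assumptions for illustration; it does not represent Koji's export format or any platform's API.

```python
from itertools import combinations


def jaccard(a: set[str], b: set[str]) -> float:
    """Shared items divided by total distinct items across both sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def overlapping_themes(themes: dict[str, set[str]],
                       threshold: float = 0.5) -> list[tuple[str, str, float]]:
    """List theme pairs whose supporting quotes overlap at or above the threshold."""
    pairs = []
    for (name_a, quotes_a), (name_b, quotes_b) in combinations(themes.items(), 2):
        score = jaccard(quotes_a, quotes_b)
        if score >= threshold:
            pairs.append((name_a, name_b, round(score, 2)))
    return pairs


# Hypothetical themes, each mapped to the IDs of the quotes that support it.
themes = {
    "Template distrust": {"q1", "q2", "q3", "q4"},
    "Templates feel generic": {"q2", "q3", "q4", "q5"},
    "Late-night prep": {"q6", "q7"},
}

print(overlapping_themes(themes))
```

High overlap is a reason to reread the shared quotes, not an automatic merge: two themes can draw on the same passages and still describe distinct concepts.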

Frequently asked questions

Do I need to do all three coding phases for every study? No. Open and axial coding are sufficient for most product and UX research. Selective coding is appropriate for academic work or large strategic studies where a unifying theory is the deliverable.

What is the difference between open coding and thematic analysis? Thematic analysis is a broader, more flexible methodology popularized by Braun and Clarke. It overlaps significantly with open + axial coding from grounded theory but is methodologically less prescriptive. See the Thematic Analysis Guide for a full comparison.

How long does manual coding take? Roughly 4–6 hours of coding per hour of interview, across all three phases. A 20-interview study (20 hours of audio) typically requires 80–120 hours of analyst time. AI-native platforms like Koji can compress the open/axial portion by 70–90%.

Can AI do grounded theory coding for me? AI accelerates open and axial coding effectively, but selective coding requires human judgment. The 2025 JMIR tutorial found AI useful for breadth and clustering, weaker for depth and theoretical integration. Best practice is a hybrid workflow: AI for the heavy lift, human researcher for the strategic synthesis.

What is in-vivo coding? In-vivo coding uses participants' exact words as codes — for example, "the dashboard is overwhelming" instead of "interface complexity." It is a sub-technique within open coding that preserves participant voice and is especially valuable when terminology itself is part of the finding.

Do I need a software tool for grounded theory coding? For studies under 5 interviews, a spreadsheet works. For 5–20 interviews, dedicated tools like NVivo, ATLAS.ti, or Delve speed manual coding. For ongoing or large-scale research, AI-native platforms like Koji handle the open/axial work automatically while you direct the strategic synthesis.

Related resources

Structured Questions Guide — Combine open-ended and quantitative question types in interviews to power richer grounded theory analysis.
Coding Qualitative Data — Broader overview of qualitative coding approaches beyond grounded theory.
Grounded Theory Qualitative Research — The full methodology of which open/axial/selective coding is the analytical engine.
Thematic Analysis Guide — A flexible alternative to grounded theory coding.
Data Saturation in Qualitative Research — Knowing when to stop collecting and start coding.
Research Synthesis Guide — The broader synthesis workflow that selective coding feeds into.
