Observational Research: Study What Users Do

What Is Observational Research?

Observational research is a family of methods in which you study users by watching their actual behavior, ideally in their real environment and on real tasks, rather than asking them to describe it. The premise is simple and slightly uncomfortable: people are unreliable narrators of their own behavior. What they say and what they do often diverge, and the gap between the two is where many of the most valuable insights live.

That gap has a name: the say-do gap. A user will tell you they prioritize security, then reuse the same password across every account. A user will say onboarding was "easy" after you watched them stumble through it twice. Self-reported behavior is filtered through memory, ego, and the desire to give a helpful answer. Observed behavior bypasses those filters. Observational research exists to capture the version of the truth that interviews and surveys cannot reach.

As Jakob Nielsen of the Nielsen Norman Group put it: "To design an easy-to-use interface, pay attention to what users do, not what they say. Self-reported claims are unreliable, as are user speculations about future behavior." This is one of the foundational principles of modern user research — and observational methods are how you act on it.

Why the Say-Do Gap Exists

Most participants are not lying. They simply cannot accurately report their own behavior, for four well-documented reasons:

Memory is reconstructive. People do not replay events; they rebuild them, smoothing over the friction and forgetting the small frustrations.
Social desirability bias. Participants want to be helpful and competent, so they round their answers toward what sounds reasonable.
Rationalization. People invent plausible explanations for behavior that was actually driven by habit, defaults, or impulse.
No introspective access. Much of what users do is automatic. They genuinely do not know why they tapped where they tapped.

In usability research specifically, self-reporting is consistently less reliable than observation: users routinely report that an interface was easier than it actually was, and that gap only becomes visible when someone watches.

The Main Types of Observational Research

Observational research is a spectrum, from completely natural settings to controlled lab conditions.

Naturalistic (Field) Observation

Watching users in their real environment with no intervention — a "fly on the wall." It captures genuine, unprompted behavior but offers no control over what happens.

Contextual Inquiry

A hybrid of observation and interviewing. The researcher watches users perform real tasks in their own environment and asks questions in the moment. The Nielsen Norman Group describes contextual inquiry as a type of ethnographic field study focused on understanding work practices and behaviors. It is the workhorse method for B2B and workflow products.

Ethnographic Research

Deep, prolonged immersion in users' lives and culture. A famous example: an IDEO team observing hospital nurses noticed they were writing patient information on tape stuck to their scrubs, because the official systems were not accessible at the bedside. No survey would have surfaced that workaround — it had to be seen.

Usability Testing (Lab Observation)

Scripted observation in a controlled setting, watching users attempt defined tasks with a product or prototype. Because every participant attempts the same tasks, the same problems recur quickly: the Nielsen Norman Group estimates that five participants surface roughly 85% of an interface's usability problems, which is why usability tests stay small.

Diary Studies and Self-Observation

Participants log their own behavior over days or weeks. Not pure observation, but it captures behavior over time that a researcher cannot be present for.

Observation Tells You What. It Rarely Tells You Why.

Here is the honest limitation of observational research: it is unbeatable at revealing what people do and where they struggle, and it is usually silent on why. You can watch a nurse tape a note to her scrubs. You can see exactly where it happens. But the observation alone does not tell you whether the official system is too slow, too far away, requires a login she does not have, or simply was never trusted. To know that, you have to ask.

This is the central trade-off in research method selection. Behavioral, observational methods give you unfiltered truth about actions. Attitudinal methods — interviews — give you the reasoning, motivation, and emotion behind those actions. Strong research does not choose. It pairs them: observe to find the behavior, then interview to explain it.

The catch has always been cost. Running an observational study and then recruiting, scheduling, and moderating follow-up interviews — and manually analyzing all of it — is slow enough that most teams do one or the other and live with the blind spot.

How to Run an Observational Study

Define the behavior in question. Be specific: "how users recover from a failed payment," not "how users feel about checkout."
Choose the setting. Natural environment for genuine behavior; controlled setting for comparability. The choice determines what you can observe and how well it generalizes.
Recruit a representative sample. Off-target participants produce confident, wrong conclusions.
Observe without interfering. Your job is to watch, not to coach. Note workarounds, hesitations, errors, and emotional cues.
Capture rigorously. Timestamped notes, screen or video recording where consent allows, and a structured log so sessions can be compared.
Pair observation with inquiry. Immediately after the session, while context is fresh, ask the user to explain the moments you flagged. This is the step that converts behavior into insight.
Analyze for patterns. Look across sessions for recurring behaviors and workarounds — the repeated patterns, not the one-off anomalies.
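
The capture and analysis steps above can be sketched as a small structured log. This is a minimal illustration, not a prescribed format: the Observation fields, the KINDS categories, and the sessions_per_kind helper are all hypothetical names chosen for the example, and the sample notes are made up.

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical event categories, taken from the cues named in the steps above.
KINDS = {"workaround", "hesitation", "error", "emotional_cue"}


@dataclass(frozen=True)
class Observation:
    session_id: str      # which participant or session the note belongs to
    offset_seconds: int  # time since session start, so notes line up with recordings
    kind: str            # one of KINDS
    note: str            # neutral description of what happened, not why

    def __post_init__(self) -> None:
        if self.kind not in KINDS:
            raise ValueError(f"unknown kind: {self.kind!r}")
        if self.offset_seconds < 0:
            raise ValueError("offset_seconds must be non-negative")


def sessions_per_kind(log: list[Observation]) -> Counter:
    """Count how many distinct sessions showed each kind of event."""
    seen = {(obs.session_id, obs.kind) for obs in log}
    return Counter(kind for _, kind in seen)


# Made-up notes from two sessions about recovering from a failed payment.
log = [
    Observation("p1", 42, "hesitation", "Paused on the payment error banner"),
    Observation("p1", 95, "workaround", "Reopened checkout in a new tab"),
    Observation("p2", 30, "error", "Re-entered the same card number"),
    Observation("p2", 61, "workaround", "Switched to a different browser"),
    Observation("p2", 80, "workaround", "Cleared the cart and started over"),
]
counts = sessions_per_kind(log)
print(counts["workaround"])  # 2: two sessions, even though there are three notes
```

Counting sessions rather than raw notes keeps one talkative session from looking like a pattern, which is the distinction the last step draws between recurring behaviors and one-off anomalies.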

How Koji Adds the "Why" at Scale

Koji is an AI-native research platform built to close exactly the gap observation leaves open — the why — without the cost that normally makes teams skip it.

Capture the why while context is fresh. Right after an observed session or usability test, send participants a Koji AI-moderated interview. It runs by voice or text, asynchronously, so there is no second round of scheduling. The AI interviewer asks about the specific moments you observed and probes every short answer with a real follow-up — turning "it was fine" into the actual reason.

Interview at the scale observation cannot reach. Field observation is labor-intensive; a researcher can only be in one place at a time. Koji's AI-moderated interviews run in parallel with dozens or hundreds of participants at once, so the explanatory layer can match the breadth of your behavioral data instead of being a small, expensive sample.

Quantify what you observed. Koji supports six structured question types — open_ended, scale, single_choice, multiple_choice, ranking, and yes_no. After observing a behavior, you can confirm how widespread it is with a yes_no or single_choice question, measure intensity with a scale, and capture the reasoning with open_ended follow-ups. See the structured questions guide for how to combine them.

Get patterns without the manual grind. Koji's automatic thematic analysis reads every interview and clusters the recurring explanations behind the behavior you saw, so the why arrives as organized themes — not a backlog of recordings waiting to be coded.

Traditional survey tools like SurveyMonkey capture only fixed, self-reported answers, the very data observational research exists to distrust, with no chance to probe them. Koji conducts a real follow-up conversation about the behavior you observed. Pair Koji interviews with your observational study and you get both halves of the truth: what users actually did, and why they did it. And because the interviews are AI-moderated, you do not need a dedicated research team or a PhD in research methods to run them.

Common Observational Research Mistakes

Interfering with the behavior. Coaching, hinting, or hovering changes what you came to see. Watch first; ask later.
Ignoring the observer effect. People behave differently when watched. Minimize your presence and give participants time to settle.
Observing without ever asking why. The behavior is the symptom. Without the follow-up interview, you are guessing at the cause.
Over-generalizing from one session. A single dramatic moment is an anecdote. Look for the pattern across participants.
Confirmation bias in note-taking. Record what happened, not what supports your hypothesis. Timestamped, neutral notes protect you from yourself.

How Many Observation Sessions Do You Need?

Observational research is qualitative, so you are looking for patterns, not statistical significance. For structured observation such as usability testing, the Nielsen Norman Group estimates that five participants surface roughly 85% of an interface's usability problems, and diminishing returns set in quickly after that. For field observation and contextual inquiry, a reasonable rule of thumb is that recurring behaviors and workarounds become clear after five to eight sessions per distinct user group.
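
The five-participant figure comes from a simple discovery model: if each participant independently reveals a given problem with probability p, then n participants reveal it with probability 1 - (1 - p)^n. The sketch below assumes p = 0.31, the average rate Nielsen reported across the projects he studied; your own product's rate may be higher or lower, so treat the output as an illustration rather than a guarantee. The function name is hypothetical.

```python
def share_of_problems_found(participants: int, detection_rate: float = 0.31) -> float:
    """Expected share of usability problems seen at least once.

    Assumes every participant independently reveals any given problem
    with the same probability (detection_rate). Real studies only
    approximate this.
    """
    if participants < 0:
        raise ValueError("participants must be non-negative")
    if not 0.0 <= detection_rate <= 1.0:
        raise ValueError("detection_rate must be between 0 and 1")
    return 1.0 - (1.0 - detection_rate) ** participants


# Show how quickly the curve flattens as participants are added.
for n in (1, 3, 5, 8, 15):
    print(f"{n:>2} participants: {share_of_problems_found(n):.1%}")
```

With the assumed rate, five participants land at about 84%, which is where the "roughly 85%" rule of thumb comes from, and each additional participant adds less than the one before.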

The signal to stop is saturation: the point at which new sessions stop producing new behaviors and you can reliably predict what the next participant will do. If you are still seeing genuinely new workarounds at session ten, your sample is probably too broad: you are likely observing several different user groups at once and should segment them. Running more sessions than saturation requires does not buy more insight; it buys repetition. If what you want is to quantify how widespread an observed behavior is, that is a job for a follow-up study with structured questions, not for more observation.
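
One way to make saturation concrete is to track how many previously unseen behaviors each session adds. The sketch below is a heuristic under stated assumptions: behaviors have already been coded into short labels, and "saturated" is taken to mean two sessions in a row with nothing new. The function names, the quiet_run threshold, and the sample labels are all hypothetical.

```python
def new_behaviors_per_session(sessions: list[set[str]]) -> list[int]:
    """For each session, in order, count behaviors not seen in any earlier session."""
    seen: set[str] = set()
    counts = []
    for behaviors in sessions:
        fresh = behaviors - seen
        counts.append(len(fresh))
        seen |= fresh
    return counts


def saturation_point(sessions: list[set[str]], quiet_run: int = 2):
    """Return the 1-based session at which quiet_run sessions in a row added nothing new.

    Returns None if that never happens. quiet_run=2 is an arbitrary
    threshold chosen for this example.
    """
    run = 0
    for index, count in enumerate(new_behaviors_per_session(sessions), start=1):
        run = run + 1 if count == 0 else 0
        if run >= quiet_run:
            return index
    return None


# Made-up coded behaviors from five sessions with one user group.
sessions = [
    {"new_tab", "retype_card"},
    {"retype_card", "switch_browser"},
    {"new_tab", "call_support"},
    {"switch_browser", "retype_card"},
    {"new_tab"},
]
print(new_behaviors_per_session(sessions))  # [2, 1, 1, 0, 0]
print(saturation_point(sessions))           # 5
```

If the per-session counts never settle toward zero, that matches the warning above: the sample likely mixes several user groups and should be segmented before adding more sessions.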

Observational Research in Remote and Digital Products

Classic observational research assumes you can be physically present. Most modern products do not allow that — users are remote, distributed, and using software alone. The methods adapt rather than disappear. Session recordings and analytics act as a form of asynchronous observation, capturing real behavior at scale. Unmoderated remote usability tests let you watch task performance without sharing a room. Diary studies capture self-observed behavior over days. The principle holds in every case: behavior first, explanation second. The digital shift simply makes the explanatory follow-up easier — an observed session and an AI-moderated interview can now happen back to back, remotely, within the same hour.

Related Resources

Contextual Inquiry — observation and interviewing combined in the user's environment
Ethnographic Research — deep immersion in users' lives and culture
Attitudinal vs Behavioral Research — what people say versus what they do
Usability Testing Guide — structured observation of task performance
Structured Questions Guide — Koji's six question types for quantifying observed behavior
Diary Study Guide — capturing self-observed behavior over time
