Usability Testing Script Template: A Free, Ready-to-Use Script for Moderated & Unmoderated Tests (2026)

A complete copy-paste usability testing script - intro, warm-up, tasks, post-task questions, and wrap-up - plus how to run it unmoderated at scale with an AI moderator.

A usability testing script is the repeatable runsheet that keeps every session consistent: a short introduction and consent, a warm-up, three to five scenario-based tasks with think-aloud prompts, a quick post-task rating, and a wrap-up. Below is a complete, copy-paste script you can use today - followed by how to run the exact same script unmoderated at scale with Koji's AI moderator, so you are not stuck personally hosting 20 calls.

Why You Need a Script

Without a script, two sessions are never comparable - you phrase a task one way for the first participant and differently for the next, and your results become anecdotes instead of data. A script standardizes wording, task order, and timing so that when five users all stumble on the same step, you know it is the design, not your facilitation.

A standard usability test script has five sections. Here is each one, ready to use.

The Complete Usability Testing Script

1. Introduction and consent (2 minutes)

Thanks for joining. Today I'll ask you to complete a few short tasks using [product].
There are no right or wrong answers - we're testing the design, not you. If something
is confusing, that's exactly what we want to learn.

As you work, please think aloud: say what you're looking at, what you expect to happen,
and anything that surprises you. Is it okay if we record this session for research notes?

2. Warm-up (2 minutes)

Before we start, tell me a little about how you currently handle [the relevant task].
What tools do you use today, and how often?

The warm-up relaxes the participant and gives you context for interpreting their behavior.

3. Scenario tasks (12-18 minutes)

Frame each task as a goal, not a click path. Three to five tasks is the sweet spot.

Task 1: Imagine you want to [realistic goal]. Show me how you'd do that.
Task 2: You've just [context]. Find a way to [goal].
Task 3: Suppose [scenario]. Walk me through what you'd do next.

While they work, use only neutral think-aloud nudges: "What are you thinking right now?", "What did you expect to happen?", "Tell me more about that." Never point to the answer.

4. Post-task questions

Immediately after each task, capture ease and success:

- Did you complete the task successfully? (Yes / No / Partially)
- How difficult or easy was that task? (1 = Very difficult ... 7 = Very easy) [Single Ease Question]
- What, if anything, was confusing about that step?

5. Wrap-up (3 minutes)

Overall, what was your impression of [product]?
If you could change one thing, what would it be?
How likely would you be to recommend this to a colleague? (0-10)

Moderated vs Unmoderated: The Tradeoff

A moderated test lets a researcher probe live, but it does not scale - every session needs a calendar slot and your full attention. An unmoderated test scales, but classic unmoderated tools just record clicks and leave you guessing why a user got stuck. See Moderated vs Unmoderated Research for the full comparison.

Koji collapses that tradeoff. Its AI moderator runs the script unmoderated - so participants complete it on their own schedule, 24/7 - while still doing the thing only a moderator used to do: prompting think-aloud and asking adaptive follow-ups when a participant hesitates or sounds confused.

How to Run This Script in Koji

  1. Create a study and paste your tasks into the interview plan. Koji turns your goals into a methodology-backed guide.
  2. Choose voice or text. Voice mode captures spontaneous think-aloud reactions; text mode is great for at-desk testing of web flows. A text interview costs 1 credit; a voice interview costs 3.
  3. Add structured post-task questions. Use a yes_no question for task success, a scale question (1-7) for the Single Ease Question, and an open_ended question for the "what was confusing" probe. See the Structured Questions Guide.
  4. Let the AI moderate. It reads each task, encourages think-aloud, and probes hesitation automatically - the Think-Aloud Protocol without a human host.
  5. Read the aggregated report. Koji charts task success rates and SEQ distributions across every participant and surfaces the recurring friction themes with supporting quotes.
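The three post-task question types from step 3 can be written down as plain data before you build the study. The sketch below is only an illustration: the `Question` class and `post_task_questions` helper are hypothetical names, not Koji's API, and only the type names `yes_no`, `scale`, and `open_ended` come from the steps above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Question:
    """One structured question (hypothetical shape, not Koji's schema)."""
    kind: str                        # "yes_no", "scale", or "open_ended"
    prompt: str
    scale_min: Optional[int] = None  # only used when kind == "scale"
    scale_max: Optional[int] = None


def post_task_questions(task_label: str) -> List[Question]:
    """Build the same three post-task questions for one task."""
    return [
        Question("yes_no", f"{task_label}: Did you complete the task successfully?"),
        Question("scale", f"{task_label}: How difficult or easy was that task?", 1, 7),
        Question("open_ended", f"{task_label}: What, if anything, was confusing about that step?"),
    ]


# Three tasks, three questions each, identical wording for every participant.
plan = [q for n in range(1, 4) for q in post_task_questions(f"Task {n}")]
```

Generating the questions from one helper keeps the wording identical across tasks, which is the same consistency the script itself is meant to provide.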

The quality gate means only sessions scoring 3 or higher consume credits, so abandoned or low-effort sessions do not cost you. And because no session needs a calendar slot, teams can run 30 usability sessions in the time it used to take to schedule three.
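As a rough budgeting aid, the credit rules above can be turned into a few lines of arithmetic. This is a sketch under stated assumptions: the per-interview costs (1 for text, 3 for voice) and the threshold of 3 come from the text, while the function name, the tuple layout, and the sample scores are made up for illustration.

```python
CREDITS_PER_INTERVIEW = {"text": 1, "voice": 3}  # costs stated in the article
QUALITY_GATE = 3                                 # sessions scoring below this are free


def credits_used(sessions):
    """Sum credits for (mode, quality_score) pairs that pass the quality gate."""
    return sum(
        CREDITS_PER_INTERVIEW[mode]
        for mode, score in sessions
        if score >= QUALITY_GATE
    )


# Illustrative sessions only: two of the five fall below the gate and cost nothing.
sessions = [("voice", 4), ("voice", 2), ("text", 3), ("text", 1), ("voice", 5)]
total = credits_used(sessions)  # 3 + 0 + 1 + 0 + 3 = 7
```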

Metrics to Track

Metric                       | How to capture in Koji   | What it tells you
-----------------------------|--------------------------|--------------------------------------------
Task success rate            | yes_no question per task | Can users actually complete the core flows?
Single Ease Question (SEQ)   | scale question (1-7)     | How hard did each task feel?
Time on task                 | session timestamps       | Where do users slow down?
System Usability Scale (SUS) | 10 scale questions       | Benchmarkable overall usability score
Qualitative friction         | open_ended + AI themes   | The why behind every failure

For scoring details, see the Single Ease Question Guide and the System Usability Scale Guide.
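If you export raw responses and want to check the numbers yourself, the three quantitative metrics reduce to short calculations. The sketch below uses the standard SUS scoring rule (odd items contribute the response minus 1, even items contribute 5 minus the response, and the sum is multiplied by 2.5). The function names and sample responses are illustrative assumptions, not Koji output.

```python
from statistics import mean


def task_success_rate(outcomes):
    """Share of participants who completed the task (outcomes are booleans)."""
    return sum(outcomes) / len(outcomes)


def sus_score(responses):
    """Score one participant's 10 SUS answers (each 1-5, in questionnaire order)."""
    if len(responses) != 10 or any(r not in range(1, 6) for r in responses):
        raise ValueError("SUS needs exactly 10 responses between 1 and 5")
    odd_items = sum(r - 1 for r in responses[0::2])   # items 1, 3, 5, 7, 9
    even_items = sum(5 - r for r in responses[1::2])  # items 2, 4, 6, 8, 10
    return (odd_items + even_items) * 2.5             # scales the total to 0-100


# Illustrative data for one task and one participant.
success = task_success_rate([True, True, False, True, True])  # 4 of 5 completed
seq_mean = mean([6, 5, 3, 6, 7])                               # SEQ ratings on the 1-7 scale
sus = sus_score([4, 2, 5, 1, 4, 2, 5, 2, 4, 1])
```

SUS is scored per participant and then averaged across participants, whereas task success and SEQ are summarized per task.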

Five Tips for Better Sessions

  1. Test with 5 users per round. By Nielsen's widely cited estimate, five well-run sessions surface roughly 85% of usability problems - then fix and retest.
  2. Goals, not instructions. Never name the button. "Find a plan for a team of five," not "Click Pricing."
  3. Stay neutral. Resist helping. Silence is data.
  4. Pilot the script once. A single dry run catches confusing task wording before it pollutes your dataset.
  5. Separate observation from interpretation. Record what happened first; decide what it means at synthesis.
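The 85% figure in tip 1 is usually traced to a simple discovery model: if each participant independently uncovers a fraction p of the problems, n participants uncover 1 - (1 - p)^n of them. The sketch below assumes p = 0.31, the average reported in Nielsen and Landauer's work; that value is not stated in this article and varies by product and task, so treat the output as a rule of thumb.

```python
def share_of_problems_found(n_users, p_detect=0.31):
    """Expected share of problems seen at least once across n_users sessions.

    p_detect is the assumed chance that a single participant hits any given
    problem; 0.31 is a commonly quoted average, not a constant.
    """
    return 1 - (1 - p_detect) ** n_users


# Diminishing returns: each extra participant adds less than the one before.
for n in (1, 3, 5, 10):
    print(n, round(share_of_problems_found(n), 2))
```

The curve flattens quickly after five participants, which is why several small rounds with fixes in between tend to find more than one large round.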

Adapting the Script for Different Test Types

The five-section structure stays the same; only the task framing changes.

  • Prototype testing. Set expectations explicitly: "This is an early prototype, so some buttons may not work - just tell me what you would expect to happen." This frees participants to react honestly instead of apologizing for dead ends.
  • Live website or app. Use real account data and real goals. Live tests surface performance and content issues that prototypes hide.
  • Mobile. Keep tasks shorter and fewer - thumb fatigue and small screens compress attention. Voice mode works especially well here because participants can talk while they tap instead of typing.
  • Comparative testing. Run the same tasks against two designs (A and B) as separate Koji studies, then compare the aggregated task-success and SEQ charts side by side.
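For a comparative test, the side-by-side reading can be reduced to two differences: task success rate and mean SEQ. The sketch below uses made-up results for two hypothetical designs; the `summarize` helper and the data layout are assumptions for illustration, not a Koji export format.

```python
from statistics import mean


def summarize(sessions):
    """Summarize (task_completed, seq_rating) pairs for one design."""
    return {
        "n": len(sessions),
        "success_rate": sum(done for done, _ in sessions) / len(sessions),
        "mean_seq": mean(seq for _, seq in sessions),
    }


# Illustrative results for the same task on two designs (not real data).
design_a = [(True, 6), (True, 5), (False, 3), (True, 6), (False, 2)]
design_b = [(True, 7), (True, 6), (True, 5), (True, 6), (False, 4)]

a, b = summarize(design_a), summarize(design_b)
delta_success = b["success_rate"] - a["success_rate"]  # positive favors design B
delta_seq = b["mean_seq"] - a["mean_seq"]              # positive favors design B
```

With only five participants per design, differences like these are directional rather than statistically conclusive, so read them alongside the qualitative friction themes.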

Common Scripting Mistakes to Avoid

  1. Naming the interface element. "Click the blue Settings gear" tells the participant exactly where to go and invalidates the test. Describe the goal, never the control.
  2. Stacking two tasks in one prompt. One goal per task keeps success measurable and the think-aloud focused.
  3. Rescuing a struggling participant. The instinct to help is strong, but a user getting stuck is the finding. Stay quiet and let it play out.
  4. Skipping the pilot. Always run the script once before fielding it. A single confusing task wording, repeated across 20 sessions, poisons the whole dataset.
  5. Forgetting the post-task rating. Capturing the Single Ease Question immediately after each task - while the experience is fresh - is far more reliable than one overall rating at the end.

Because Koji standardizes the script across every unmoderated session, these mistakes are caught once at design time rather than repeated live in 20 separate calls. The AI delivers the exact same task wording and the same neutral think-aloud nudges to every participant, which is the consistency that makes five-user findings trustworthy.

Related Articles

Structured Questions in AI Interviews

Mix quantitative data collection — scales, ratings, multiple choice, ranking — with AI-powered conversational follow-up in a single interview.

Single Ease Question (SEQ): The 7-Point UX Metric for Task-Level Usability (2026)

The complete 2026 guide to the Single Ease Question (SEQ): the verbatim 7-point scale wording, Sauro–MeasuringU benchmarks (5.3–5.5 average), correlation with task completion, when to use SEQ vs SUS, and how to bundle SEQ into AI-moderated interviews on Koji to get task-level usability scores in days.

System Usability Scale (SUS): Complete Guide with Calculator, Benchmarks & Examples

The definitive 2026 guide to the System Usability Scale (SUS): the 10-question formula, scoring calculator, Sauro–Lewis benchmark grades, and how to deploy SUS at scale with AI-moderated interviews on Koji.

Think-Aloud Protocol: How to Run and Analyze Think-Aloud Sessions

A complete guide to the think-aloud protocol — the most widely used usability testing method. Learn how to set up sessions, moderate effectively, analyze verbal data, and run remote think-aloud studies.

How to Conduct Usability Testing: The Complete Guide

A comprehensive guide to usability testing for UX researchers and product managers. Covers types of testing, participant numbers, step-by-step facilitation, and the most common mistakes to avoid.

Unmoderated vs Moderated User Research: How to Choose

Understand the real differences between moderated and unmoderated user research — and how AI-moderated interviews give you depth at scale that traditional approaches never could.