Single Ease Question (SEQ): The 7-Point UX Metric for Task-Level Usability (2026)
The complete 2026 guide to the Single Ease Question (SEQ): the verbatim 7-point scale wording, Sauro–MeasuringU benchmarks (5.3–5.5 average), correlation with task completion, when to use SEQ vs SUS, and how to bundle SEQ into AI-moderated interviews on Koji to get task-level usability scores in days.
What Is the Single Ease Question (SEQ)?
The Single Ease Question (SEQ) is a one-item, 7-point rating scale used immediately after a user attempts a task to measure how difficult or easy that task felt. It is the simplest, fastest, and most-validated post-task usability metric in modern UX research, and is the standard companion to behavioural measures like task completion rate and time on task.
The SEQ was popularised by Jeff Sauro and the team at MeasuringU after years of empirical comparison against other post-task questionnaires (After-Scenario Questionnaire, NASA-TLX, Subjective Mental Effort Question). Sauro’s research established that a single well-anchored question correlated just as strongly with task completion and time-on-task as longer multi-item scales — and was dramatically less work to administer. The result is a metric that has effectively become the default post-task measure across modern usability research.
The Verbatim SEQ Wording
Overall, how difficult or easy was [the task] to complete?
1 (Very Difficult) · 2 · 3 · 4 · 5 · 6 · 7 (Very Easy)
A few critical implementation details (a minimal code sketch follows the list):
- The scale runs from 1 (Very Difficult) to 7 (Very Easy). Reversing the polarity invalidates direct comparison to MeasuringU benchmarks.
- Only the endpoints are labelled. Some teams label the midpoint or every point; both reduce sensitivity.
- It is administered immediately after the task, not at the end of the session. The experience must be fresh.
- The bracketed task name should be specific. Use the actual task wording the user just attempted (“purchasing a coffee subscription”), not a generic “the previous task.”
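If you administer the SEQ in your own survey tool or script, these rules are straightforward to encode. Below is a minimal sketch in plain Python; it is not tied to any survey platform, and the render_seq helper is purely illustrative.

```python
# SEQ presentation rules: 1 = Very Difficult, 7 = Very Easy,
# only the endpoints are labelled, and the task wording is specific.
SEQ_SCALE = range(1, 8)  # 1..7
SEQ_ANCHORS = {1: "Very Difficult", 7: "Very Easy"}

def render_seq(task: str) -> str:
    """Return the SEQ prompt for a specific, just-attempted task."""
    points = " · ".join(
        f"{p} ({SEQ_ANCHORS[p]})" if p in SEQ_ANCHORS else str(p)
        for p in SEQ_SCALE
    )
    return f"Overall, how difficult or easy was {task} to complete?\n{points}"

print(render_seq("purchasing a coffee subscription"))
```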
Why SEQ Works
SEQ’s superpower is predictive validity — the score correlates strongly with what users actually did. Sauro’s benchmark research at MeasuringU established that:
- A raw SEQ score of 5.9 corresponds to a task completion rate of roughly 86% and an average task time of about 2 minutes.
- A raw SEQ score of 4.7 corresponds to a completion rate of roughly 58% and an average task time of about 2.8 minutes.
- The relationship is roughly linear within the 4.0–6.5 range that covers most real-world tasks.
This is unusually strong for a self-reported metric. Most attitudinal measures correlate weakly with behaviour; SEQ predicts task success nearly as well as the longer multi-item questionnaires it replaced, which is why it has survived two decades of usability research.
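To make the "roughly linear" claim concrete, here is a back-of-the-envelope interpolation using only the two published pairs above. This is not MeasuringU's regression model, just an illustration of the slope those two points imply.

```python
# Two published (SEQ score, completion rate %) pairs from the benchmarks above.
x1, y1 = 5.9, 86.0
x2, y2 = 4.7, 58.0

# Two-point interpolation, NOT a validated model; only indicative for SEQ ~4.0-6.5.
slope = (y1 - y2) / (x1 - x2)      # ~23 percentage points of completion per SEQ point
intercept = y1 - slope * x1

def rough_completion_estimate(seq_score: float) -> float:
    return slope * seq_score + intercept

print(round(slope, 1))                            # 23.3
print(round(rough_completion_estimate(5.3), 1))   # 72.0 -> ~72% at an average-ish SEQ
```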
SEQ Benchmarks
According to MeasuringU’s published benchmark dataset of more than 400 tasks and 10,000+ users:
| SEQ Score | Interpretation |
|---|---|
| 6.5+ | Top-decile task. Almost all users succeed without friction. |
| 5.6–6.4 | Above average. Workable; minor friction. |
| 5.3–5.5 | Population average. Typical for a competent but unremarkable task. |
| 4.5–5.2 | Below average. Friction is real and worth investigating. |
| <4.5 | Bottom-decile. Likely a usability emergency. |
A crucial calibration: the 5.3–5.5 average sits above the nominal scale midpoint of 4. This is normal for 7-point scales: respondents cluster toward the positive end of the scale. Treating 4 as "average" is the single most common SEQ misinterpretation.
Industry benchmark. “Across over 400 tasks and 10,000 users the average score hovers between about 5.3 and 5.6, which is above the nominal midpoint of 4 but is typical for 7-point scales.” — MeasuringU, 10 Things to Know About the Single Ease Question
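If you report SEQ regularly, it can help to encode these bands once. A small sketch follows; the band edges mirror the table above, and the function name and label strings are our own shorthand, not an official MeasuringU classification.

```python
def seq_benchmark_band(mean_seq: float) -> str:
    """Map a mean SEQ score to the benchmark band from the table above."""
    if mean_seq >= 6.5:
        return "Top decile: almost all users succeed without friction"
    if mean_seq >= 5.6:
        return "Above average: workable, minor friction"
    if mean_seq >= 5.3:
        return "Population average"
    if mean_seq >= 4.5:
        return "Below average: friction worth investigating"
    return "Bottom decile: likely a usability emergency"

print(seq_benchmark_band(5.9))  # Above average: workable, minor friction
print(seq_benchmark_band(4.2))  # Bottom decile: likely a usability emergency
```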
SEQ vs SUS: When to Use Each
SEQ and SUS are not competing — they measure different things at different cadences.
| Dimension | SEQ | SUS |
|---|---|---|
| Scope | One specific task | Entire product/system |
| Timing | Immediately after each task | At the end of the test session |
| Question count | 1 | 10 |
| Scale | 1–7 | 1–5 (Likert) |
| Output | Per-task ease score | 0–100 system score |
| Best for | Diagnosing which task is hard | Benchmarking the whole product |
| Sample-size floor | ~10 per task | ~8 per study |
| Time to administer | <10 seconds | 60–90 seconds |
The canonical pattern in a moderated usability study is: SEQ after every task → SUS at the end. SEQ tells you which task is hard; SUS tells you whether the product is competitive against the industry average of 68. See the SUS guide for the full Sauro–Lewis benchmark scale.
How to Run a SEQ Study — Step by Step
Step 1: Define your tasks
Write each task as a goal the user can attempt without coaching. “Find a coat under £100 and add it to your basket” is a task. “Browse the catalogue” is not.
Step 2: Pick a sample size
Aim for 10–12 participants per task for reliable means. For directional sprint testing, 8 is workable; for benchmarking or external reporting, aim for 30+. SEQ is unusually robust at small samples, but treat n=8 as the absolute floor.
Step 3: Run the task
Let the user attempt the task end-to-end. Do not interrupt. If they ask for help, treat it as a failure and move on.
Step 4: Administer the SEQ immediately
The instant the task ends — succeeded or failed — show the SEQ. Do not allow time for rationalisation. The fresher the response, the more diagnostic the score.
Step 5: Always pair SEQ with an open-ended probe
This is the single most under-used best practice. A bare SEQ score tells you the task is hard; the open-ended “What made the task feel that way?” tells you why. Without the probe, SEQ is a thermometer with no diagnosis.
Step 6: Analyse per task and across tasks
Per task: report the mean SEQ, the 95% confidence interval, and the % of users below 5. Across tasks: rank tasks by mean SEQ to identify the friction hotspots. Pair SEQ scores with task completion rates to triangulate.
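Here is a minimal sketch of that per-task analysis, assuming you have a list of 1–7 ratings per task. It uses only the Python standard library; the task names and scores are made-up examples, and the normal-approximation confidence interval is a simplification (with small samples, use the t-distribution instead).

```python
from math import sqrt
from statistics import mean, stdev

def summarise_task(scores: list[int]) -> dict:
    """Mean SEQ, approximate 95% CI, and % of users scoring below 5 for one task."""
    n = len(scores)
    m = mean(scores)
    half_width = 1.96 * stdev(scores) / sqrt(n) if n > 1 else float("nan")
    return {
        "n": n,
        "mean_seq": round(m, 2),
        "ci95": (round(m - half_width, 2), round(m + half_width, 2)),
        "pct_below_5": round(100 * sum(s < 5 for s in scores) / n),
    }

tasks = {
    "Purchase a coffee subscription": [6, 7, 5, 6, 7, 4, 6, 6, 5, 7],
    "Cancel an order": [3, 4, 5, 2, 4, 6, 3, 4, 5, 3],
}

# Rank tasks from hardest to easiest to surface the friction hotspots.
for name, scores in sorted(tasks.items(), key=lambda kv: mean(kv[1])):
    print(name, summarise_task(scores))
```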
Common SEQ Mistakes to Avoid
- Reversing the scale. Some teams label 1 as “easy” and 7 as “difficult.” This breaks every benchmark comparison. Stick to 1 = Very Difficult, 7 = Very Easy.
- Treating 4 as the average. The scale midpoint is not the observed population average; the real average is 5.3–5.5, so a score of 4 is well below average.
- Administering SEQ at the end of the session. Recall bias collapses the diagnostic value. Administer immediately after each task.
- Reporting SEQ without an open-ended probe. A score without a why is a metric you cannot act on.
- Using SEQ to benchmark the whole product. SEQ is a task metric. For a product-level benchmark, use SUS.
- Stopping at n=5. SEQ requires more participants than think-aloud sessions because it is quantitative. n=8 is a floor, n=10–12 is reliable, n=20+ is publishable.
The Modern Approach: SEQ at Scale With AI-Moderated Research
SEQ has always been easy to administer but expensive to run at scale. The traditional bottleneck is everything around the SEQ: recruiting, scheduling, moderating, transcribing the probes, then thematically analysing the open-ended responses. A 5-task SEQ study with 15 participants is two weeks of work for a research team, and most of that time goes to everything except the SEQ itself.
AI-native research platforms like Koji compress this workflow end to end. The modern SEQ workflow looks like this:
- Build the study in minutes. Use Koji’s structured questions, specifically the scale type (1–7), to add the SEQ after each task. Add an open-ended probe directly underneath, and use the yes_no question type for binary task success (a sketch of this per-task bundle follows the list).
- Launch via personalised link or in-product widget. No scheduling, no moderator availability constraints. The AI moderator runs the task with users 24/7.
- Get clean per-task data. Koji’s ground-truth widget scores every scale answer at high confidence. Per-task SEQ averages, 95% confidence intervals, and distributions update on the report in real time.
- Get the why automatically. Koji’s thematic analysis engine clusters the open-ended probe responses into friction themes per task — eliminating the manual coding step that traditionally consumes the entire week after a study closes.
- Compare across releases. Re-run the same SEQ study after every release to track per-task ease over time, exactly as you would track SUS or NPS at the system level.
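For illustration, here is what a per-task SEQ bundle might look like when expressed as a plain data structure. This is not Koji’s actual API or schema: only the question-type names scale and yes_no and the SEQ wording come from the text above; the dictionary shape, field names, and the open_ended type are hypothetical.

```python
# Hypothetical study definition for illustration only; not Koji's real schema.
study = {
    "name": "Checkout usability - March release",
    "tasks": [
        {
            "prompt": "Purchase a coffee subscription",
            "questions": [
                {"type": "yes_no", "text": "Were you able to complete the task?"},
                {
                    "type": "scale", "min": 1, "max": 7,
                    "min_label": "Very Difficult", "max_label": "Very Easy",
                    "text": "Overall, how difficult or easy was purchasing a coffee subscription to complete?",
                },
                {"type": "open_ended", "text": "What made the task feel that way?"},
            ],
        },
        # ...repeat the same bundle for each remaining task
    ],
}
```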
Forrester’s State of Customer Insights 2024 found teams using AI-moderated research achieve 60% faster time-to-insight than teams running equivalent studies manually. For SEQ studies specifically — where the bottleneck is rarely the metric itself but the moderation and analysis around it — the gap is closer to 80%. Koji customers routinely run 5-task SEQ studies in an afternoon that previously took a fortnight.
The broader point is that SEQ’s adoption has historically been limited not by the metric’s value (which is well-established) but by the operational cost of running enough sessions to make the score meaningful. Removing that operational cost is the actual research breakthrough — the metric itself has been settled science for two decades.
When NOT to Use SEQ
SEQ is not the right tool for:
- System-level benchmarking — use SUS instead
- Loyalty or recommendation intent — use NPS
- Effort to resolve a problem — use Customer Effort Score (CES)
- Generative discovery (“what should we build?”) — use Mom Test interviews or JTBD interviews
SEQ shines for one job and one job only: measuring the perceived ease of a specific task immediately after it is attempted. Used inside its lane, it is the highest-leverage metric in the usability researcher’s toolkit.
Related Resources
- Structured Questions in AI Interviews — the six Koji question types, including the scale type used to deploy SEQ
- System Usability Scale (SUS): Complete Guide — the system-level companion to SEQ
- Customer Effort Score (CES): How to Measure and Reduce Friction — a related effort-based metric for support and resolution flows
- HEART Framework: Google’s 5-Metric UX Model — where SEQ slots in as the Task Success attitudinal signal
- Likert Scale Questions in User Research — broader scale-design principles relevant to SEQ
- Usability Testing: The Complete Guide — the parent methodology in which SEQ is administered