Usability Metrics: Task Success Rate, Time on Task & Error Rate (2026)

What are the core usability metrics?

The three core usability metrics are task success rate (effectiveness), time on task (efficiency), and error rate (accuracy). Together they answer the questions that matter most in a usability test: Can users finish the job? How long does it take them? And how many mistakes do they make along the way? Other behavioral measures, such as lostness, refine these three, while attitudinal measures, such as completion confidence and task-level satisfaction, are layered on top of them.

The Nielsen Norman Group calls success rate "the simplest usability metric" precisely because it is the bottom line of usability: if users cannot complete what they came to do, nothing else about the interface matters. The strength of these three metrics is that they are objective and behavioral. Unlike attitudinal scores such as NPS or SUS, they record what users actually did, not what they later said they felt.

This guide defines each metric, gives you the formulas and the published industry benchmarks, explains how many participants you need, and shows how an AI-native platform like Koji captures all three automatically — turning a multi-day analysis grind into a real-time dashboard.

Metric 1: Task success rate (effectiveness)

Task success rate is the percentage of participants who complete a task successfully out of everyone who attempted it.

Task success rate = (number of successful attempts ÷ total attempts) × 100

If 17 of 20 participants successfully add an item to their cart and reach checkout, your success rate is 85%.
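As a minimal sketch, the formula above can be written as a small Python function. The name `task_success_rate` is illustrative and not part of any tool mentioned in this guide; the inputs are the 17-of-20 example from the text.

```python
def task_success_rate(successes: int, attempts: int) -> float:
    """Return the task success rate as a percentage of all attempts."""
    if attempts <= 0:
        raise ValueError("attempts must be positive")
    if not 0 <= successes <= attempts:
        raise ValueError("successes must be between 0 and attempts")
    return 100 * successes / attempts


rate = task_success_rate(17, 20)
print(f"{rate:.0f}%")  # 85%
```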

The benchmark: In an analysis of 1,189 tasks across 115 usability studies, MeasuringU founder Jeff Sauro found that the average task completion rate is 78%. Many teams therefore treat roughly 78–80% as the dividing line between "acceptable" and "needs work" for an important task. Success rate is highly sensitive to task difficulty, though, so the right target is always relative to the task and to your own historical baseline.

Binary vs. levels of success. The cleanest version is binary: a participant either completed the task or did not. But many teams record levels of success — full success, partial success (completed with significant struggle or workaround), and failure — because a binary view hides the difference between a user who breezed through and one who barely limped to the finish line. Partial successes are often where your richest design insights hide.
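One way to keep both views is to tally the three levels and derive the binary rate from them. The sketch below assumes hypothetical level labels (`full`, `partial`, `failure`) and made-up outcomes for 20 participants. Whether a partial success counts toward the binary rate is a policy choice for your team, shown here as a parameter.

```python
from collections import Counter

LEVELS = ("full", "partial", "failure")  # hypothetical labels, not a standard


def success_breakdown(outcomes, count_partial_as_success=True):
    """Return the percentage share of each level plus a derived binary rate."""
    if not outcomes:
        raise ValueError("need at least one outcome")
    counts = Counter(outcomes)
    unknown = set(counts) - set(LEVELS)
    if unknown:
        raise ValueError(f"unknown levels: {sorted(unknown)}")
    total = len(outcomes)
    shares = {level: 100 * counts[level] / total for level in LEVELS}
    passed = counts["full"] + (counts["partial"] if count_partial_as_success else 0)
    shares["binary_success"] = 100 * passed / total
    return shares


# Made-up outcomes for 20 participants, for illustration only.
outcomes = ["full"] * 12 + ["partial"] * 5 + ["failure"] * 3
print(success_breakdown(outcomes))
print(success_breakdown(outcomes, count_partial_as_success=False))
```

With these made-up outcomes, the same study reads as 85% under the lenient rule and 60% under the strict one, which is the gap a binary-only view hides.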

The trap: success rate alone is misleading. A task can show a 90% success rate while users take three minutes and make two errors getting there. That is why success rate must always be read alongside time and errors.

Metric 2: Time on task (efficiency)

Time on task measures how long it takes a participant to complete a task, usually reported in seconds as an average (median or geometric mean) over successful attempts only. Including failed attempts pollutes the number: a user who gave up after 10 seconds would otherwise look "efficient."

Because time data is almost always skewed by a few very slow users, the geometric mean or the median is the statistically appropriate measure of center for small samples, not the arithmetic mean. Report a measure of spread too — the range or confidence interval — because an average of 45 seconds means something very different if the spread is 40–50 seconds versus 10–120 seconds.
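The summary described above can be sketched with Python's standard `statistics` module (`geometric_mean` requires Python 3.8 or later). The function name and the six attempts are made up for illustration; failed attempts are filtered out before any statistic is computed.

```python
from statistics import geometric_mean, mean, median


def time_on_task_summary(attempts):
    """Summarize time on task over successful attempts only.

    `attempts` is a list of (seconds, succeeded) pairs.
    """
    times = [seconds for seconds, succeeded in attempts if succeeded]
    if not times:
        raise ValueError("no successful attempts to summarize")
    return {
        "n": len(times),
        "median": median(times),
        "geometric_mean": geometric_mean(times),
        "arithmetic_mean": mean(times),  # shown only for comparison
        "range": (min(times), max(times)),
    }


# Made-up data: five successes (one very slow) and one early give-up.
attempts = [(38, True), (42, True), (45, True), (51, True), (120, True), (10, False)]
summary = time_on_task_summary(attempts)
for name, value in summary.items():
    print(name, value)
```

In this made-up sample, the single 120-second attempt pulls the arithmetic mean (59.2 seconds) well above the median (45 seconds), while the geometric mean lands between the two.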

How to use it: time on task is most powerful as a comparative metric — old design vs. new, your product vs. a competitor, or release over release. An absolute "good" time rarely exists in isolation; a 30-second task time is excellent for a complex configuration flow and terrible for a one-click action.

Metric 3: Error rate (accuracy)

An error is any unintended action, slip, mistake, or omission a user makes while attempting a task. Error rate is typically expressed as errors per task (the average number of errors across all attempts) or as a defect rate (the percentage of attempts containing at least one error).

Errors per task = total errors observed ÷ total attempts
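Both ways of expressing error rate can be computed from a list with one error count per attempt. This is a minimal sketch; the function name and the ten error counts are made up for illustration.

```python
def error_metrics(errors_per_attempt):
    """Return (errors per task, defect rate in percent).

    `errors_per_attempt` holds one error count per attempt, including zeros.
    """
    n = len(errors_per_attempt)
    if n == 0:
        raise ValueError("need at least one attempt")
    if any(count < 0 for count in errors_per_attempt):
        raise ValueError("error counts cannot be negative")
    errors_per_task = sum(errors_per_attempt) / n
    attempts_with_error = sum(1 for count in errors_per_attempt if count > 0)
    defect_rate = 100 * attempts_with_error / n
    return errors_per_task, defect_rate


# Made-up error counts for 10 attempts, for illustration only.
counts = [0, 1, 0, 2, 1, 0, 3, 0, 0, 0]
errors_per_task, defect_rate = error_metrics(counts)
print(f"{errors_per_task:.1f} errors per task, {defect_rate:.0f}% of attempts with an error")
```

The two numbers answer different questions: errors per task reflects how much friction there is overall, while the defect rate shows how widely it is spread across attempts.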

The benchmark: In an analysis of 719 tasks using consumer and business software, Jeff Sauro found an average of 0.7 errors per task, with roughly two out of every three users making at least one error. Errors are far more common than most teams assume, which is exactly why counting them surfaces friction that success rate alone would never reveal.

Not all errors are equal. Classify them by severity (does the error block completion, or merely slow the user down?) and by type (slips, where the user knows the goal but executes the wrong action, vs. mistakes, where the user has the wrong mental model). The pattern in where errors cluster is usually more actionable than the raw count.

Putting the three together

Effectiveness, efficiency, and accuracy form a triangle. A mature usability scorecard reads all three at once:

Metric	What it measures	Typical benchmark
Task success rate	Effectiveness — can they finish?	~78% average (Sauro/MeasuringU)
Time on task	Efficiency — how fast?	Comparative; no universal target
Error rate	Accuracy — how clean?	~0.7 errors/task; ~2 of 3 users err
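To read all three metrics from the same raw data, one option is to store one record per attempt and compute the scorecard in a single function. This is a sketch, not a description of how any particular platform stores results: the `Attempt` record and its field names are hypothetical, and the four attempts are made up.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class Attempt:
    succeeded: bool
    seconds: float
    errors: int


def scorecard(attempts):
    """Compute success rate, median time on task, and errors per task."""
    if not attempts:
        raise ValueError("need at least one attempt")
    successes = [a for a in attempts if a.succeeded]
    return {
        "success_rate_pct": 100 * len(successes) / len(attempts),
        # Time is summarized over successful attempts only.
        "median_seconds": median(a.seconds for a in successes) if successes else None,
        # Errors are averaged over every attempt, successful or not.
        "errors_per_task": sum(a.errors for a in attempts) / len(attempts),
    }


# Made-up attempts, for illustration only.
attempts = [
    Attempt(succeeded=True, seconds=40, errors=0),
    Attempt(succeeded=True, seconds=55, errors=1),
    Attempt(succeeded=True, seconds=180, errors=2),
    Attempt(succeeded=False, seconds=20, errors=3),
]
print(scorecard(attempts))
```

Keeping the per-attempt records, rather than only the aggregates, makes it possible to see that the slowest successful attempt here is also the one with the most errors.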

Layer a task-level satisfaction question on top — a single scale question such as "How easy or difficult was that task?" — and you capture the user's attitude alongside their behavior. The Nielsen Norman Group repeatedly finds that performance and satisfaction metrics correlate only moderately, so measuring both protects you from shipping something users can use but hate using.

How many participants do you need?

For qualitative, formative usability testing, where the goal is finding problems to fix, testing with five users per round uncovers roughly 85% of issues, the classic Nielsen-Landauer finding. But the moment you want reliable quantitative metrics like a stable success rate or time on task, five is far too few: the confidence interval on a metric from five users is enormous.

A practical rule of thumb:

5–8 users — formative testing, finding usability problems (not for reporting precise numbers).
15–20 users — a reasonably tight success-rate estimate for a single design.
30–50+ users — benchmark-grade metrics you intend to track over time or quote externally. (See our usability benchmarking guide for the full methodology.)

Use adjusted-Wald binomial confidence intervals for small-sample completion rates rather than naive percentages: a 4-of-5 success rate of "80%" actually carries a 95% confidence interval running from roughly 36% to 98%.
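The adjusted-Wald interval adds z²/2 successes and z² trials to the observed counts before applying the ordinary Wald formula. The sketch below uses only the Python standard library (`NormalDist` requires Python 3.8 or later); the function name is illustrative, and a 95% confidence level is assumed by default.

```python
from math import sqrt
from statistics import NormalDist


def adjusted_wald_interval(successes, n, confidence=0.95):
    """Return the (low, high) adjusted-Wald interval for a completion rate."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n_adj = n + z ** 2
    p_adj = (successes + z ** 2 / 2) / n_adj
    margin = z * sqrt(p_adj * (1 - p_adj) / n_adj)
    # Clamp to the valid range for a proportion.
    return max(0.0, p_adj - margin), min(1.0, p_adj + margin)


low, high = adjusted_wald_interval(4, 5)
print(f"4/5: {low:.0%} to {high:.0%}")  # 4/5: 36% to 98%

# The same observed 80% at larger sample sizes.
for successes, n in [(16, 20), (40, 50)]:
    low_n, high_n = adjusted_wald_interval(successes, n)
    print(f"{successes}/{n}: {low_n:.0%} to {high_n:.0%}")
```

All three samples have an observed 80% success rate, but the interval narrows as the sample grows, which is the reasoning behind the participant ranges above.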

The modern approach: capturing usability metrics with AI

Traditionally, collecting these three metrics meant scheduling moderated sessions, watching every recording, manually timing each task with a stopwatch, tallying errors by hand, and reconciling notes across a research team. A 20-participant benchmark could swallow a week of analyst time — which is why most teams measured usability once a quarter at best, if at all.

This is exactly the bottleneck AI-native research platforms remove. Koji captures all three core metrics automatically:

Task success rate is recorded directly through structured questions. Frame each task with a yes_no or single_choice outcome question, and Koji aggregates the completion rate across every respondent in real time — no manual tallying.
Time on task is timestamped automatically for every session, with the distribution (median, range, outliers) computed and charted as responses arrive.
Error and friction signals surface through Koji's AI moderator, which probes in the moment ("What made that step confusing?") and then clusters the open-ended answers into themed friction findings, so you see where and why users struggle, not just that they did.

Koji supports all six structured question types (open_ended, scale, single_choice, multiple_choice, ranking, and yes_no), which means you can capture a binary success flag, a 1–5 ease rating, and a rich open-ended "what went wrong" in a single automated study. Because Koji runs 24/7, you can recruit 30–50 participants for a true quantitative benchmark in days rather than weeks, and re-run the identical study every release to track the trend line. Teams using AI-assisted research tools often report faster time-to-insight because the counting, timing, and tagging (the slow part) are done the instant the last response lands.

You do not need a PhD in measurement theory to run a rigorous usability study. Define the tasks, attach the right structured questions, and let the platform handle the statistics.

Common mistakes to avoid

Reporting time on task for failed attempts. Always separate successful and unsuccessful times.
Using the arithmetic mean on small samples. Time data is skewed — use the median or geometric mean.
Quoting a success rate without a confidence interval. "80% from five users" is not a precise number.
Measuring success but never satisfaction. A usable-but-frustrating product still loses users.
Changing the task wording between benchmark rounds. Consistency is what makes release-over-release comparison valid.

Related Resources

Structured Questions Guide — the six question types for capturing success, ease, and friction
Usability Testing: The Complete Guide — the end-to-end method these metrics live inside
Usability Benchmarking Guide — turning these metrics into a tracked program
System Usability Scale (SUS) Guide — the standard attitudinal usability score
Customer Effort Score Guide — measuring perceived ease at the task level
Think-Aloud Protocol — surfacing the why behind every error
