SUPR-Q: The Standardized Questionnaire for Measuring Website Quality, Trust & Loyalty (2026 Guide)
SUPR-Q is an 8-item questionnaire that scores your website or app on usability, trust, appearance, and loyalty — then converts it to a percentile rank against a normative database. Here is how to run, score, and interpret it (and how to do it faster with AI).
What is the SUPR-Q?
The SUPR-Q (Standardized User Experience Percentile Rank Questionnaire) is an 8-item survey that measures the overall quality of a website or app and expresses the result as a percentile rank: a single number from 0 to 100 that tells you how your experience compares with the other digital products in a benchmark database.
The bottom line: A SUPR-Q score at the 50th percentile is average, meaning about half of the products in the benchmark database score higher and half score lower. A score at the 90th percentile means your site outperforms about 90% of them. The questionnaire rolls up four sub-dimensions (Usability, Trust & Credibility, Appearance, and Loyalty) into one comparable score, which is what makes it broader than a usability-only metric like SUS.
SUPR-Q was developed by Jeff Sauro and the team at MeasuringU and validated against a normative database of 150+ websites and over 5,000 users. That normative database is the magic: a raw average says little on its own, whereas a percentile rank is meaningful because it is anchored to real-world data from e-commerce, B2B, travel, finance, and SaaS sites. When you report "our checkout flow is at the 72nd percentile," everyone in the room instantly understands whether that is good.
The 8 SUPR-Q questions and their four factors
SUPR-Q uses seven statements rated on a 5-point agreement scale (1 = Strongly disagree, 5 = Strongly agree), plus one 11-point (0–10) likelihood-to-recommend item, for eight items in total. The items load onto four factors, two items each:
Usability
1. The website is easy to use.
2. It is easy to navigate within the website.
Trust & Credibility
3. The information on the website is trustworthy.
4. The website is trustworthy.
Appearance
5. I find the website to be attractive.
6. The website has a clean and simple presentation.
Loyalty
7. How likely are you to recommend this website to a friend or colleague? (0–10, the Net Promoter item)
8. I will likely return to the website in the future.
Two design choices make SUPR-Q efficient. First, it is short — eight items take respondents under two minutes. Second, it bakes loyalty (including the NPS question) directly into the instrument, so you capture an attitude metric and a behavioral-intent metric in one pass instead of fielding two separate surveys.
How SUPR-Q scoring works
There are three layers to a SUPR-Q result:
- The raw score. Average the eight items (after converting the 0–10 NPS item to the same 1–5 footing using the published transformation) to get a mean from 1 to 5.
- The percentile rank. Convert that raw mean to a percentile against the normative database. A raw score around 3.9 typically lands near the 50th percentile; the relationship is non-linear, which is exactly why the lookup against the benchmark matters more than the raw average.
- The sub-scores. Report each of the four factors separately. A site can sit at the 80th percentile on Appearance but the 30th percentile on Trust — and that gap is the most actionable thing the instrument gives you.
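The raw score and the four sub-scores can be sketched for a single respondent as follows. This is a minimal sketch, not the official scoring procedure: the item keys are hypothetical labels, and halving the 0–10 recommend item is an assumption about the published transformation that should be checked against the official scoring rules before use.

```python
from statistics import mean

# Hypothetical labels for the eight items, grouped by factor.
FACTORS = {
    "usability": ["easy_to_use", "easy_to_navigate"],
    "trust": ["info_trustworthy", "site_trustworthy"],
    "appearance": ["attractive", "clean_simple"],
    "loyalty": ["recommend", "return"],
}

def rescale(item, value):
    """Put every item on a comparable footing.

    Assumption: the 0-10 recommend item is halved (giving 0-5).
    Verify this against the published scoring rules.
    """
    return value / 2 if item == "recommend" else value

def score_respondent(answers):
    """Return the overall raw score and one raw score per factor."""
    scaled = {item: rescale(item, value) for item, value in answers.items()}
    factors = {
        name: mean(scaled[item] for item in items)
        for name, items in FACTORS.items()
    }
    overall = mean(scaled.values())
    return overall, factors

# One made-up respondent: strong on usability, weak on trust.
answers = {
    "easy_to_use": 5, "easy_to_navigate": 4,
    "info_trustworthy": 2, "site_trustworthy": 3,
    "attractive": 4, "clean_simple": 4,
    "recommend": 8, "return": 4,
}
overall, factors = score_respondent(answers)
print(overall)   # 3.75
print(factors)   # usability 4.5, trust 2.5, appearance 4, loyalty 4.0
```

The study-level raw score is then the mean of these per-respondent values across the whole sample.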
The percentile framing is the whole point: it turns an abstract 1–5 average into a competitive statement ("we beat 72% of sites") that executives and designers both understand without a statistics lesson.
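To make the lookup step concrete, here is a small sketch of an empirical percentile rank. The benchmark list is made up for illustration and is not the real normative data, which is maintained by MeasuringU; the function name `percentile_rank` and the half-weight treatment of ties are assumptions of this example.

```python
from bisect import bisect_left, bisect_right

def percentile_rank(score, benchmark):
    """Percent of benchmark scores below `score`, counting ties as half."""
    ordered = sorted(benchmark)
    below = bisect_left(ordered, score)
    ties = bisect_right(ordered, score) - below
    return 100 * (below + 0.5 * ties) / len(ordered)

# Made-up raw scores for ten benchmark sites, NOT the real normative database.
benchmark = [3.2, 3.5, 3.6, 3.8, 3.9, 3.9, 4.0, 4.1, 4.3, 4.5]

print(percentile_rank(3.9, benchmark))  # 50.0
print(percentile_rank(4.2, benchmark))  # 80.0
```

In this toy data a 0.3-point gain in the raw mean moves the site 30 percentile points, because most benchmark scores cluster near the middle. That clustering is why the relationship between raw score and percentile is non-linear.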
SUPR-Q vs SUS vs NPS vs CES
These instruments are complementary, not interchangeable:
- SUS (System Usability Scale) measures perceived usability only, on a 0–100 scale. It is the gold standard for usability, but says nothing about trust or visual appeal.
- SUPR-Q measures the broader experience — usability plus trust, appearance, and loyalty — and is purpose-built for websites and web apps.
- NPS measures loyalty alone. SUPR-Q includes the NPS item but contextualizes it inside the full experience.
- CES (Customer Effort Score) measures friction on a single task.
Rule of thumb: use SUS when you are evaluating a tool or task flow, and SUPR-Q when you are benchmarking a whole website or marketing/commerce experience where trust and aesthetics drive conversion. Many mature teams track SUPR-Q quarterly as a site-health benchmark and SUS per release.
How many participants do you need?
SUPR-Q is a quantitative instrument, so sample size matters more than it does for a formative usability test:
- 30–50 respondents for a directional internal read.
- 75–100 respondents for a stable benchmark you will track over time.
- 100+ per segment if you want to compare percentile ranks between audiences (e.g., new vs. returning visitors) with confidence.
Because the score is anchored to an external normative database, you do not need thousands of responses — you need enough to make your own mean stable, then the benchmark does the comparative work.
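One way to see why these ranges are reasonable is to look at how the margin of error around your own mean shrinks as the sample grows. The sketch below uses a normal approximation, and the standard deviation of 0.7 is a hypothetical value chosen for illustration; substitute the spread you observe in your own data.

```python
from math import sqrt
from statistics import NormalDist

def margin_of_error(sd, n, confidence=0.95):
    """Half-width of a normal-approximation confidence interval for a mean."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z * sd / sqrt(n)

ASSUMED_SD = 0.7  # hypothetical spread of raw scores on the 1-5 scale

for n in (30, 50, 75, 100):
    print(n, round(margin_of_error(ASSUMED_SD, n), 2))
```

Under that assumption the margin is roughly ±0.25 points at 30 respondents and ±0.14 at 100. Going from 30 to 100 respondents tightens the estimate noticeably, while each additional respondent beyond that buys less.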
Common SUPR-Q pitfalls
- Reporting only the overall score. The four sub-scores are where the insight lives. A strong overall percentile can hide a Trust problem that is quietly killing conversion.
- Modifying the wording. As with SUS, the normative database is built on the exact published items. Rewrite them and you forfeit the percentile comparison.
- Surveying the wrong moment. Field SUPR-Q after a representative task, not on arrival — you are measuring the experience, not first impressions (use a 5-second test for that).
- Ignoring the "why." A percentile rank tells you where you stand, never why. Pair every quantitative score with open-ended follow-up.
How to run SUPR-Q faster with Koji
Traditional SUPR-Q studies mean building a survey, recruiting a panel, exporting to a stats tool, and manually computing percentile ranks — usually a one-to-two-week cycle. Platforms like Koji collapse that into an afternoon by treating SUPR-Q as a set of structured questions inside an AI-moderated interview.
Koji supports six structured question types — open_ended, scale, single_choice, multiple_choice, ranking, and yes_no (see the structured questions guide). For SUPR-Q you map the eight items to:
- Scale questions for the seven 1–5 agreement statements and the 0–10 NPS item (Koji captures the exact ground-truth value via the response widget, so scoring is deterministic, with no transcription guesswork).
- An open_ended probe layered on the Trust and Appearance factors. This is the part traditional surveys cannot do: when a respondent rates Trust a 2, Koji's AI interviewer automatically asks a follow-up — "What made the site feel less trustworthy to you?" — and keeps probing up to your configured depth.
The result is a study that delivers the standardized percentile score and the qualitative reason behind every low sub-score, with the per-question distribution charts and themed open-text findings generated automatically in a real-time report. No moderator, no manual coding, voice or text, running 24/7. That is the difference between knowing you are at the 35th percentile on Trust and knowing exactly which three things to fix.
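The mapping described above can be sketched as plain data. This is a hypothetical structure for planning a study, not Koji's actual schema or API: only the type names `scale` and `open_ended` come from the text, and every field name (`min`, `max`, `factor`, `follow_up`) is an assumption made for illustration.

```python
# Hypothetical study definition as plain data. This is NOT Koji's real schema;
# only the type names "scale" and "open_ended" come from the article.
ITEMS = [
    # (factor, scale minimum, scale maximum, item text)
    ("usability", 1, 5, "The website is easy to use."),
    ("usability", 1, 5, "It is easy to navigate within the website."),
    ("trust", 1, 5, "The information on the website is trustworthy."),
    ("trust", 1, 5, "The website is trustworthy."),
    ("appearance", 1, 5, "I find the website to be attractive."),
    ("appearance", 1, 5, "The website has a clean and simple presentation."),
    ("loyalty", 0, 10,
     "How likely are you to recommend this website to a friend or colleague?"),
    ("loyalty", 1, 5, "I will likely return to the website in the future."),
]
PROBED_FACTORS = {"trust", "appearance"}  # factors that get a "why" follow-up

def build_questions(items=ITEMS):
    """Turn the item list into question dicts, adding probes where wanted."""
    questions = []
    for factor, low, high, text in items:
        questions.append({
            "type": "scale",
            "min": low,
            "max": high,
            "factor": factor,
            "text": text,
            "follow_up": "open_ended" if factor in PROBED_FACTORS else None,
        })
    return questions

questions = build_questions()
print(len(questions))                                        # 8
print(sum(q["follow_up"] is not None for q in questions))    # 4
```

Keeping the item text in one list makes it easy to confirm that the wording matches the published items before the study goes live.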
Worked example: reading a SUPR-Q result
Imagine you run SUPR-Q on your checkout flow with 90 shoppers and get an overall result at the 68th percentile. On its own that looks fine — above average. But break out the four factors and the story changes:
- Usability: 81st percentile. The flow is easy to use.
- Appearance: 74th percentile. It looks clean and attractive.
- Trust & Credibility: 38th percentile. A clear weakness.
- Loyalty: 55th percentile. Middling intent to return and recommend.
The overall percentile hid the real problem. Shoppers can complete checkout easily and find it attractive, but they do not fully trust it — which on a payment page directly suppresses conversion. The action is obvious: invest in trust signals (security badges, clearer policies, social proof), not in visual polish or flow simplification. This is why reporting the sub-scores is non-negotiable, and why pairing each score with an open-ended "why" question — automatically, on the low factors — turns a benchmark into a roadmap.
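The same reading can be automated once the sub-score percentiles are in hand. The sketch below uses the four numbers from this worked example; the function name `weak_factors` and the 50th-percentile cutoff are assumptions chosen for illustration, not part of the instrument.

```python
def weak_factors(percentiles, threshold=50):
    """Return (factor, percentile) pairs below the threshold, weakest first."""
    low = [(name, p) for name, p in percentiles.items() if p < threshold]
    return sorted(low, key=lambda pair: pair[1])

# Sub-score percentiles from the worked example above.
result = {
    "Usability": 81,
    "Appearance": 74,
    "Trust & Credibility": 38,
    "Loyalty": 55,
}

print(weak_factors(result))  # [('Trust & Credibility', 38)]

# Spread between the strongest and weakest factor.
spread = max(result.values()) - min(result.values())
print(spread)  # 43
```

A large spread like this one is a signal that the overall percentile is averaging over very different experiences and should not be reported alone.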
When to re-field SUPR-Q
Treat SUPR-Q as a tracking benchmark, not a one-off. Re-field it after any significant redesign, and on a fixed quarterly cadence even when nothing changes, so you can see whether competitors, expectations, or your own iterations have moved your percentile. Keep the audience, task, and wording identical between waves — the only variable you want to change is the experience itself.
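When comparing two waves, it helps to check whether the change in the raw mean is larger than sampling noise before reading anything into a percentile shift. The following is a rough sketch using a normal approximation for the difference of two independent means; the data are made up, the function name `wave_change` is hypothetical, and with samples this small a t-based interval would be more appropriate.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def wave_change(before, after, confidence=0.95):
    """Difference in mean raw score (after - before) with an approximate CI."""
    diff = mean(after) - mean(before)
    se = sqrt(stdev(before) ** 2 / len(before) + stdev(after) ** 2 / len(after))
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return diff, diff - z * se, diff + z * se

# Made-up per-respondent raw scores for two waves.
wave_1 = [3.5, 3.75, 4.0, 3.25, 3.5]
wave_2 = [4.0, 4.25, 4.5, 3.75, 4.0]

diff, low, high = wave_change(wave_1, wave_2)
print(round(diff, 2), round(low, 2), round(high, 2))
```

If the interval excludes zero, the movement between waves is unlikely to be noise alone. If it straddles zero, treat the percentile change as provisional until the next wave.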
Related Resources
- Structured Questions Guide — the six question types that power quantitative-plus-qualitative studies in Koji
- System Usability Scale (SUS): Complete Guide — the usability-only companion benchmark
- HEART Framework: Google's 5-Metric UX Model — for behavioral UX measurement at scale
- Likert Scale Questions in User Research — how to design the agreement scales SUPR-Q relies on
- CSAT vs NPS vs CES — choosing the right experience metric
- Usability Benchmarking Guide — how to track UX metrics over time