HIPAA-Compliant AI User Research: A Practical Playbook for Healthcare and HealthTech

Run AI-moderated customer research in healthcare contexts without putting PHI at risk. Patterns for HIPAA alignment, anonymous-mode interviews, BYOK, sub-processor handling, and what Enterprise teams need from a vendor.

Answer first: You can run AI-moderated customer research in HIPAA-regulated contexts without ever touching Protected Health Information (PHI) — and that is almost always the right design. Most patient and provider research questions (about workflow friction, brand perception, app usability, billing experience, even symptom journeys) can be answered through interviews that are scoped to avoid the 18 HIPAA identifiers entirely. When PHI is genuinely required, the same study can be re-run on the Enterprise tier with a signed Business Associate Agreement, customer-controlled LLM keys (BYOK), and per-study retention controls. Koji is designed so that the default research path produces de-identified qualitative data: anonymous-mode interviews, no demographic identifiers required, transcript-level redaction in reports, and a per-study retention setting. With tools like Koji, healthcare teams get the speed of AI customer research without inheriting the compliance overhead of a clinical system of record.

If you're a product, UX, or marketing team at a payer, provider, EHR vendor, digital therapeutics company, pharmacy, or any HealthTech startup, this guide walks through how to design AI customer research that holds up under your privacy office's review.

What HIPAA actually requires (in research terms)

HIPAA's Privacy Rule applies to Covered Entities (health plans, healthcare clearinghouses, and healthcare providers that conduct standard transactions such as billing electronically) and their Business Associates (vendors that handle PHI on a Covered Entity's behalf). It governs Protected Health Information: individually identifiable health information. Under the Safe Harbor de-identification method, health information stops being PHI once 18 specific identifiers (name, address, dates more granular than year, phone, email, MRN, etc.) have been removed.

For customer research, three rules dominate:

  1. De-identification removes the obligation. If your interview data does not contain any of the 18 HIPAA identifiers and there is no reasonable basis to re-identify a participant, the data is not PHI and HIPAA does not apply.
  2. Authorization is needed to use PHI for research outside treatment, payment, and operations — and most Covered Entities require IRB review for any study that touches PHI.
  3. Business Associates need a BAA. Any vendor that receives PHI must sign a Business Associate Agreement that flows down HIPAA obligations.

The practical implication for product teams: scope your research so you never need to collect PHI in the first place. Almost every product-discovery, churn, onboarding, pricing, and brand question can be answered without it.

The default Koji healthcare research pattern

For 90% of healthcare and HealthTech research, run the study with the following configuration. This produces interview data that is not PHI under HIPAA's de-identification standard and does not require a BAA.

  • Anonymous mode on. No email, name, phone, or address collected at intake. Participants are assigned a stable but opaque respondent ID. (Anonymizing customer interview data covers the controls in detail.)
  • Screening avoids the 18 identifiers. Use structured screener questions on role and behavior ("how often do you log medication intake?") rather than identifiable demographics ("what is your date of birth?"). The research screener questions doc has a full pattern library.
  • The AI interviewer is briefed not to elicit PHI. Add a short company-context instruction (see company context guide) telling the AI moderator: "Do not ask for, and do not record, the participant's name, date of birth, address, medical record number, specific diagnoses, or specific treatment dates. If a participant volunteers this information, acknowledge it briefly and steer the conversation back to the research topic."
  • Per-study retention configured. Set transcripts to auto-delete after 90 days (or whatever your privacy office approves). Reports and themes remain; raw transcripts are purged.
  • Reports use anonymized quotes. Koji's AI-generated reports surface themes and pull representative quotes. Configure the export to scrub any inadvertent identifiers before sharing outside your team.

This pattern lets a digital health PM run 30 patient interviews in a week, surface the friction themes, and ship the fix — without putting their company through a BAA negotiation for every study.
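The configuration above can be written down as a pre-launch check that a researcher or privacy reviewer runs before a study goes live. The sketch below is illustrative only: `StudyConfig`, its field names, and `prelaunch_issues` are hypothetical and do not reflect Koji's actual settings schema or API, and the string match on the moderator brief is a deliberately crude heuristic.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical field names for illustration; not Koji's real settings schema.
@dataclass
class StudyConfig:
    anonymous_mode: bool
    intake_fields: List[str] = field(default_factory=list)
    moderator_instructions: str = ""
    transcript_retention_days: Optional[int] = None  # None = kept indefinitely
    scrub_quotes_on_export: bool = False

# Intake fields that would collect a direct identifier.
IDENTIFYING_FIELDS = {"name", "email", "phone", "address", "date_of_birth"}

def prelaunch_issues(cfg: StudyConfig, max_retention_days: int = 90) -> List[str]:
    """Return a list of problems; an empty list means the config matches the default pattern."""
    issues = []
    if not cfg.anonymous_mode:
        issues.append("anonymous mode is off")
    collected = IDENTIFYING_FIELDS.intersection(f.lower() for f in cfg.intake_fields)
    if collected:
        issues.append("intake collects identifiers: " + ", ".join(sorted(collected)))
    # Crude heuristic: look for the opening words of the PHI-avoidance brief.
    if "do not ask for" not in cfg.moderator_instructions.lower():
        issues.append("moderator brief has no PHI-avoidance instruction")
    days = cfg.transcript_retention_days
    if days is None or days > max_retention_days:
        issues.append(f"transcript retention exceeds {max_retention_days} days")
    if not cfg.scrub_quotes_on_export:
        issues.append("export does not scrub quotes")
    return issues
```

The 90-day ceiling is taken from the example retention window above; substitute whatever your privacy office approves.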

When you genuinely need PHI: the Enterprise path

Some research questions cannot be answered without PHI. Examples: post-discharge care research that requires linking back to the original encounter; clinical trial recruitment screening; a payer study comparing benefit utilization across named members; provider research that requires NPI-level tracking.

For those studies:

  • Move to the Enterprise tier and request a Business Associate Agreement before any PHI flows. Standard self-serve plans (Insights, Interviews) do not include a BAA.
  • Use BYOK (Bring Your Own Key) so the LLM calls hit your Anthropic, OpenAI, or Google Vertex account on a contract you've already negotiated with HIPAA terms. See the Bring Your Own Key doc for setup. BYOK means the LLM provider is your sub-processor, not Koji's, and the conversation content never enters the default shared inference pipeline.
  • Restrict the participant audience with personalized interview links tied to an internal participant ID rather than a name or email.
  • Tighten retention and access. Restrict the study to specific workspace members, set retention to the shortest period your protocol allows, and disable export to third-party tools like Slack and Notion unless those vendors are also under BAA.

This is the same pattern enterprise healthcare buyers expect from any AI vendor in 2026 — and it's the reason Koji's architecture separates the application layer (Vercel + Supabase, both SOC 2 Type II) from the inference layer (where BYOK lets you bring your own contracts).
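One way to tie interview links to an internal participant ID without exposing a name or email is to hand out random tokens and keep the token-to-participant mapping inside your own systems. The sketch below assumes that approach; the `pid` query parameter, the example base URL, and both function names are hypothetical, not Koji's documented link format. The tokens are random rather than hashed from participant data, so a token on its own reveals nothing about the person.

```python
import secrets

def assign_tokens(internal_ids):
    """Map each internal participant ID to a random, unguessable link token.

    The returned mapping is the only way back to the participant, so it
    should stay in a system that is already cleared to hold PHI.
    """
    mapping = {}
    used = set()
    for pid in internal_ids:
        token = secrets.token_urlsafe(12)  # 12 random bytes -> 16 URL-safe characters
        while token in used:  # collisions are vanishingly unlikely; guard anyway
            token = secrets.token_urlsafe(12)
        used.add(token)
        mapping[pid] = token
    return mapping

def build_link(base_url, token):
    """Attach the opaque token to a study URL (parameter name is an assumption)."""
    return f"{base_url}?pid={token}"

if __name__ == "__main__":
    tokens = assign_tokens(["P-001", "P-002"])
    for pid, token in tokens.items():
        print(pid, build_link("https://example.com/interview/study-123", token))
```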

The 18 HIPAA identifiers your interviews must avoid

If your study is running under the de-identification path, the AI moderator and your screener should never elicit or store:

  1. Names
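  2. Geographic subdivisions smaller than a state (street, city, county, ZIP code; the first 3 digits of a ZIP code may be kept only when the area they cover has more than 20,000 people)
  3. Dates more granular than year (DOB, admission date, discharge date, date of death), plus any age over 89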
  4. Phone numbers
  5. Fax numbers
  6. Email addresses
  7. Social Security numbers
  8. Medical record numbers
  9. Health plan beneficiary numbers
  10. Account numbers
  11. Certificate / license numbers
  12. Vehicle identifiers
  13. Device identifiers and serial numbers
  14. URLs that identify the individual
  15. IP addresses
  16. Biometric identifiers (fingerprints, voiceprints)
  17. Full-face photos or comparable images
  18. Any other unique identifying number, characteristic, or code

Number 16 is worth a pause for voice-mode interviews. A voiceprint is a HIPAA identifier. For studies in regulated contexts, prefer text-mode interviews (see the voice vs text interviews guide), or use Enterprise + BAA + BYOK if voice is required.
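Several of the identifiers in this list have a recognizable shape, which makes a first-pass automated scan of transcripts practical. The following is a minimal sketch, assuming US-style phone and SSN formats and plain-text transcripts; the `scrub` function and the placeholder labels are illustrative and are not a Koji feature. Pattern matching cannot catch names, addresses, or free-form clinical details, so it supplements human review of early transcripts rather than replacing it.

```python
import re

# Order matters: the more specific patterns run before the looser phone pattern.
PATTERNS = [
    ("EMAIL", re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")),
    ("SSN", re.compile(r"\b\d{3}-\d{2}-\d{4}\b")),
    ("IP", re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")),
    ("DATE", re.compile(r"\b(?:\d{1,2}/\d{1,2}/\d{2,4}|\d{4}-\d{2}-\d{2})\b")),
    ("PHONE", re.compile(
        r"(?<!\d)(?:\+?1[-.\s]?)?(?:\(\d{3}\)|\d{3})[-.\s]?\d{3}[-.\s]?\d{4}(?!\d)"
    )),
]

def scrub(text):
    """Replace pattern-shaped identifiers with placeholders.

    Returns the scrubbed text and a per-label count of replacements, so a
    reviewer can see how often participants are oversharing.
    """
    counts = {}
    for label, pattern in PATTERNS:
        text, n = pattern.subn(f"[{label}]", text)
        counts[label] = n
    return text, counts

if __name__ == "__main__":
    sample = "Email me at jane.doe@example.com or call (555) 123-4567."
    print(scrub(sample))
```

A non-zero count in the first few transcripts is a signal to tighten the moderator's steering instructions before scaling the study.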

Structured questions as a compliance lever

Koji's six structured question types — open_ended, scale, single_choice, multiple_choice, ranking, yes_no — give you a way to capture sensitive content as categorical buckets instead of free-text PHI. Examples:

  • Instead of asking "what medications are you on?" (free-text → likely PHI), use a multiple_choice question with anonymized therapeutic categories.
  • Instead of "when were you diagnosed?" (date → identifier), use a single_choice question with year ranges.
  • Instead of "how often do you experience symptoms?" (open-text invitation to overshare), use a scale (1-5) or a single_choice with frequency buckets.

The structured questions guide is the canonical reference; the scale questions guide covers Likert and frequency patterns specifically.

This is one of the most important Koji differentiators for regulated industries. Traditional survey tools force you into a binary choice: free-text (sensitive overshare risk) or rigid multiple choice (low signal). Koji's AI-moderated structured questions let the AI follow up within the bucket boundary — depth without unbounded free text.
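The bucketing idea can be sketched in a few lines. The example below coarsens an exact diagnosis year and a symptom frequency into the kind of answer options a single_choice question would offer; the bucket edges, the labels, and the function names are assumptions chosen for illustration, not values taken from Koji.

```python
from bisect import bisect_right

# Hypothetical bucket boundaries. Each edge is the lower bound of the next bucket.
DIAGNOSIS_EDGES = [1, 5, 10]  # years since diagnosis
DIAGNOSIS_LABELS = [
    "Less than 1 year ago",
    "1-4 years ago",
    "5-9 years ago",
    "10 or more years ago",
]

FREQUENCY_EDGES = [1, 4, 12]  # episodes per month
FREQUENCY_LABELS = [
    "Less than once a month",
    "1-3 times a month",
    "4-11 times a month",
    "12 or more times a month",
]

def bucket(value, edges, labels):
    """Return the label of the bucket that contains value."""
    return labels[bisect_right(edges, value)]

def diagnosis_range(diagnosis_year, current_year):
    """Coarsen an exact diagnosis year into a range relative to the current year."""
    return bucket(current_year - diagnosis_year, DIAGNOSIS_EDGES, DIAGNOSIS_LABELS)

if __name__ == "__main__":
    print(diagnosis_range(2019, 2024))
    print(bucket(3, FREQUENCY_EDGES, FREQUENCY_LABELS))
```

Wider buckets lose some precision but make it harder to single out a participant, which matters most in small or rare-condition samples.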

A safer participant intake pattern

For any healthcare research study, your intake form should:

  • Not require email. Use anonymous mode and let participants land via a generic study link rather than a personalized one.
  • Include an explicit research-only consent. Starting from the research consent form templates library, add a statement such as: "This research is conducted under our internal research policy, not as a clinical activity. We do not need or want your PHI. Please do not share names, dates of birth, MRNs, or specific clinical details."
  • Set expectations on AI moderation. Disclose that an AI is moderating, that no human will listen to recordings in real time, and that the data will be used only for product/service improvement.
  • Provide a contact for withdrawal. Even with anonymous mode, give participants a way to request deletion of their interview by quoting their respondent ID (visible at the end of every Koji interview).

What Koji's architecture gives you out of the box

  • TLS 1.2+ in transit, AES-256 at rest for all participant content.
  • No model training on customer data — Koji has contractual no-train clauses with its LLM providers, and on Enterprise BYOK the contract is yours directly.
  • SOC 2 Type II infrastructure via Vercel and Supabase as primary sub-processors.
  • Per-study retention controls so transcripts can be purged on a schedule.
  • Workspace-scoped access so a study is only visible to the people you explicitly add.
  • Audit logs of who accessed which study and when (Enterprise tier).

Koji is not, by default, a HIPAA Covered Entity or Business Associate. The Enterprise tier supports a BAA on request for customers who need to send PHI through the platform. For most healthcare research programs, the de-identified pattern above is faster to launch, faster to ship insights from, and avoids the BAA path entirely.

Comparison: HIPAA on Koji vs. traditional research tools

  • SurveyMonkey and Typeform require enterprise plans to sign a BAA — and even then, free-text responses are an overshare hazard. Without AI-moderated probing, you can't coach participants away from PHI in the moment.
  • Qualtrics has a HIPAA-eligible XM tier, but the platform is built around quantitative survey logic, not conversational depth, and its cost and complexity suit enterprise buyers only.
  • Manual moderator interviews can be HIPAA-aligned, but they're slow and expensive. A single in-depth provider interview typically costs $150–$400 in incentives plus 60-90 minutes of moderator time.
  • Koji lets you run de-identified AI interviews at scale on self-serve plans, and graduates to Enterprise + BAA + BYOK when a study genuinely needs PHI. The same study design, two compliance modes.

Common pitfalls to avoid

  • Treating anonymized as anonymous. A combination of role + employer + ZIP can re-identify a single person in a small market. Be careful with combinations.
  • Forgetting voice biometrics. Voice recordings are themselves an identifier. Default to text in HIPAA-regulated studies unless you have a BAA in place.
  • Exporting raw transcripts to non-BAA tools. If you push transcripts to a Notion workspace or a Slack channel, those vendors are now in scope. Use anonymized themes and quotes instead, or restrict integration use to BAA-covered destinations.
  • Letting screener questions identify rare conditions. A screener that filters for "adults with a rare disease in a small ZIP code" produces an effectively re-identifiable sample even before the interview starts.
  • Skipping the IRB conversation when PHI is involved. Most Covered Entities require IRB review for any research that touches PHI, even when an external vendor handles the moderation.
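The re-identification pitfalls above (role + employer + ZIP, rare conditions in small areas) can be checked mechanically by counting how many respondents share each combination of attributes. The sketch below is a simple k-anonymity style check; the field names and the threshold `k` are assumptions for illustration, and HIPAA itself does not prescribe a specific cell size.

```python
from collections import Counter

def small_cells(records, quasi_identifiers, k=5):
    """Return attribute combinations shared by fewer than k respondents.

    records: list of dicts, one per respondent (screener answers only).
    quasi_identifiers: keys that could identify someone in combination.
    """
    counts = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return {combo: n for combo, n in counts.items() if n < k}

if __name__ == "__main__":
    respondents = [
        {"role": "nurse", "employer_type": "hospital", "region": "Northeast"},
        {"role": "nurse", "employer_type": "hospital", "region": "Northeast"},
        {"role": "nurse", "employer_type": "hospital", "region": "Northeast"},
        {"role": "pharmacist", "employer_type": "clinic", "region": "Northeast"},
    ]
    # Flags the lone pharmacist: that combination appears only once.
    print(small_cells(respondents, ["role", "employer_type", "region"], k=3))
```

Any combination the check flags is a candidate for coarser screener options or for dropping one of the attributes from reports.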

A 5-step checklist before you launch

  1. Scope the question. Can you answer it without any of the 18 identifiers? If yes, use the default path. If no, use the Enterprise + BAA path.
  2. Configure anonymous mode and disable demographic intake. Use behavioral screeners only.
  3. Brief the AI interviewer with explicit PHI-avoidance instructions in company context.
  4. Set retention to the shortest acceptable window (often 30-90 days for transcripts).
  5. Review the first 3 transcripts before scaling. If participants are oversharing, tighten the AI's steering instructions and re-launch.

Related Resources

Related Articles

Bring Your Own Key (BYOK)

Use your own AI provider API keys with Koji for greater control over costs and model access.

Plan Comparison Guide

Compare Koji's credit-based pricing plans — Free, Insights, Interviews, and Enterprise — to find the right fit for your research needs.

Personalized Interview Links: Send Targeted Research Invitations to Every Participant

Embed participant-specific context into Koji interview URLs so the AI greets each person by name, references their company, and tailors the conversation — automatically. Covers CSV import, URL parameters, and CRM integration patterns.

Voice vs Text Interview: When to Use Each Mode

Choosing between voice and text mode for your AI interview? This guide breaks down response depth, completion rate, audience fit, and cost — plus a decision matrix that tells you which mode wins for each research scenario.

Intake Forms and Consent

Collect participant information and consent before interviews begin with customizable form fields.

Scale Questions in AI Interviews: Measure NPS, CSAT, and Ratings Automatically

Learn how to configure and use scale questions in Koji AI interviews to capture NPS, CSAT, and satisfaction ratings — with automatic probing and aggregated distribution charts in your research report.

Company Context: How to Make Your AI Interviewer a Domain Expert

Learn how to configure Koji's company context setting so your AI interviewer asks sharper, more relevant follow-up questions across every study you run.

Structured Questions in AI Interviews

Mix quantitative data collection — scales, ratings, multiple choice, ranking — with AI-powered conversational follow-up in a single interview.

Research Screener Questions: How to Write Questions That Find the Right Participants

Learn how to write effective screener questions that filter the right participants for your user research studies. Includes 10 proven templates, best practices, and common mistakes to avoid.

Research Consent Form Templates: GDPR-Compliant Forms for Every Study

Ready-to-use consent form templates for user research, UX studies, and AI interviews. Covers GDPR compliance, informed consent best practices, and how to collect consent automatically with Koji.

Research Ethics and Informed Consent: A Practical Guide for UX Teams

A practical guide to ethical UX research — covering the Belmont Report's three principles, GDPR informed consent requirements, how to handle AI tools responsibly, and how to build ethical maturity in your research practice.

Anonymizing Customer Interview Data: A Practical Guide for Privacy-Safe Research

Five operational techniques for handling PII in AI customer interviews — from intake-time anonymization to stakeholder-safe quote sharing — without sacrificing research signal.

GDPR-Compliant AI User Research: A Practical Guide

How to run AI-moderated customer interviews under GDPR. Lawful basis, consent flows, data minimization, retention, sub-processors, and how Koji handles each requirement.

AI-Powered Patient and Provider Research for Healthcare

How healthcare organizations use Koji to conduct patient experience research, provider feedback studies, and clinical workflow analysis at scale — while maintaining HIPAA-aware research practices.