HIPAA-Compliant AI User Research: A Practical Playbook for Healthcare and HealthTech

Run AI-moderated customer research in healthcare contexts without putting PHI at risk. Patterns for HIPAA alignment, anonymous-mode interviews, BYOK, sub-processor handling, and what Enterprise teams need from a vendor.

Answer first: You can run AI-moderated customer research in HIPAA-regulated contexts without ever touching Protected Health Information (PHI), and that is almost always the right design. Most patient and provider research questions (about workflow friction, brand perception, app usability, billing experience, even symptom journeys) can be answered through interviews that are scoped to avoid the 18 HIPAA identifiers entirely. When PHI is genuinely required, the same study can be re-run on the Enterprise tier with a signed Business Associate Agreement (BAA), customer-controlled LLM keys (BYOK), and per-study retention controls. The BAA is available on request and executed during onboarding; see /compliance/hipaa for the full HIPAA-ready posture, scope, and the controls applied.

Koji is designed so that the default research path produces de-identified qualitative data: anonymous-mode interviews, no demographic identifiers required, transcript-level redaction in reports, and a per-study retention setting. With tools like Koji, healthcare teams get the speed of AI customer research without inheriting the compliance overhead of a clinical system of record.

If you're a product, UX, or marketing team at a payer, provider, EHR vendor, digital therapeutics company, pharmacy, or any HealthTech startup, this guide walks through how to design AI customer research that holds up under your privacy office's review.

What HIPAA actually requires (in research terms)

HIPAA's Privacy Rule applies to Covered Entities (health plans, healthcare clearinghouses, and healthcare providers that conduct standard transactions such as billing electronically) and their Business Associates (vendors that handle PHI on a Covered Entity's behalf). It governs Protected Health Information: individually identifiable health information, which for de-identification purposes means health information tied to any of 18 specific identifiers (name, address, dates more granular than year, phone, email, MRN, etc.).

For customer research, three rules dominate:

De-identification removes the obligation. If your interview data does not contain any of the 18 HIPAA identifiers and there is no reasonable basis to re-identify a participant, the data is not PHI and HIPAA does not apply.
Authorization is needed to use PHI for research outside treatment, payment, and operations — and most Covered Entities require IRB review for any study that touches PHI.
Business Associates need a BAA. Any vendor that receives PHI must sign a Business Associate Agreement that flows down HIPAA obligations.

The practical implication for product teams: scope your research so you never need to collect PHI in the first place. Almost every product-discovery, churn, onboarding, pricing, and brand question can be answered without it.

The default Koji healthcare research pattern

For 90% of healthcare and HealthTech research, run the study with the following configuration. This produces interview data that is not PHI under HIPAA's de-identification standard and does not require a BAA.

Anonymous mode on. No email, name, phone, or address collected at intake. Participants are assigned a stable but opaque respondent ID. (Anonymizing customer interview data covers the controls in detail.)
Screening avoids the 18 identifiers. Use structured screener questions on role and behavior ("how often do you log medication intake?") rather than identifiable demographics ("what is your date of birth?"). The research screener questions doc has a full pattern library.
The AI interviewer is briefed not to elicit PHI. Add a short company-context instruction (see company context guide) telling the AI moderator: "Do not ask for, and do not record, the participant's name, date of birth, address, medical record number, specific diagnoses, or specific treatment dates. If a participant volunteers this information, acknowledge it briefly and steer the conversation back to the research topic."
Per-study retention configured. Set transcripts to auto-delete after 90 days (or whatever your privacy office approves). Reports and themes remain; raw transcripts are purged.
Reports use anonymized quotes. Koji's AI-generated reports surface themes and pull representative quotes. Configure the export to scrub any inadvertent identifiers before sharing outside your team.

This pattern lets a digital health PM run 30 patient interviews in a week, surface the friction themes, and ship the fix — without putting their company through a BAA negotiation for every study.
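
As an illustration of the "scrub any inadvertent identifiers" step, here is a minimal Python sketch of a regex pass over quotes before they leave your team. It is not Koji's export logic; the function name `scrub_quote` and the pattern set are assumptions made for this example. The patterns only cover common US formats for email, SSN, phone, and slash-style dates, and they will not catch names or free-form dates, so treat this as a first filter rather than a replacement for human review.

```python
import re

# Illustrative patterns only (assumed for this sketch): common US formats.
# Order matters: SSN is checked before PHONE so it gets the more specific label.
PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(
        r"(?<!\d)(?:\+?1[\s.-]?)?(?:\(\d{3}\)|\d{3})[\s.-]?\d{3}[\s.-]?\d{4}(?!\d)"
    ),
    "DATE": re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b"),
}


def scrub_quote(text: str) -> str:
    """Replace each matched identifier with a bracketed category label."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


if __name__ == "__main__":
    print(scrub_quote("Email me at jane.doe@example.com or call 555-123-4567."))
    # Email me at [EMAIL] or call [PHONE].
```

Replacing matches with a category label, rather than deleting them, keeps the quote readable and shows reviewers what kind of detail was removed.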

When you genuinely need PHI: the Enterprise path

Some research questions cannot be answered without PHI. Examples: post-discharge care research that requires linking back to the original encounter; clinical trial recruitment screening; a payer study comparing benefit utilization across named members; provider research that requires NPI-level tracking.

For those studies:

Move to the Enterprise tier and request a Business Associate Agreement before any PHI flows. Standard self-serve plans (Insights, Interviews) do not include a BAA.
Use BYOK (Bring Your Own Key) so the LLM calls hit your Anthropic, OpenAI, or Google Vertex account on a contract you've already negotiated with HIPAA terms. See the Bring Your Own Key doc for setup. BYOK means the LLM provider is your sub-processor, not Koji's, and the conversation content never enters the default shared inference pipeline.
Restrict the participant audience with personalized interview links tied to an internal participant ID rather than a name or email.
Tighten retention and access. Restrict the study to specific workspace members, set retention to the shortest period your protocol allows, and disable export to third-party tools like Slack and Notion unless those vendors are also under BAA.

This is the same pattern enterprise healthcare buyers expect from any AI vendor in 2026 — and it's the reason Koji's architecture separates the application layer (Vercel + Supabase, both SOC 2 Type II) from the inference layer (where BYOK lets you bring your own contracts).
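
The export restriction in the list above can be expressed as a simple allowlist check. The Python sketch below is hypothetical: the destination names, the `BAA_COVERED` set, and the `allowed_exports` function are invented for illustration and do not describe a Koji setting or API.

```python
# Hypothetical set of destinations your organization has under BAA.
BAA_COVERED = frozenset({"internal_warehouse"})


def allowed_exports(requested, study_contains_phi, baa_covered=BAA_COVERED):
    """Return the export destinations permitted for a study.

    De-identified studies may export anywhere; studies that contain PHI
    may only export to destinations covered by a BAA.
    """
    if not study_contains_phi:
        return list(requested)
    return [dest for dest in requested if dest in baa_covered]


if __name__ == "__main__":
    print(allowed_exports(["slack", "notion", "internal_warehouse"], True))
    # ['internal_warehouse']
```

Keeping the rule in one function makes it easy to review with your privacy office and to update when a new vendor signs a BAA.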

The 18 HIPAA identifiers your interviews must avoid

If your study is running under the de-identification path, the AI moderator and your screener should never elicit or store:

1. Names
2. Geographic subdivisions smaller than a state (street, city, ZIP code; the first three digits of a ZIP code may be kept only where that three-digit area contains more than 20,000 people)
3. Dates more granular than year (DOB, admission date, discharge date)
4. Phone numbers
5. Fax numbers
6. Email addresses
7. Social Security numbers
8. Medical record numbers
9. Health plan beneficiary numbers
10. Account numbers
11. Certificate / license numbers
12. Vehicle identifiers
13. Device identifiers and serial numbers
14. URLs that identify the individual
15. IP addresses
16. Biometric identifiers (fingerprints, voiceprints)
17. Full-face photos or comparable images
18. Any other unique identifying number, characteristic, or code

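Number 16 on this list, biometric identifiers, is worth a pause for voice-mode interviews. A voiceprint is a HIPAA identifier, so a voice recording can make an otherwise de-identified interview identifiable. For studies in regulated contexts, prefer text-mode interviews (voice vs text interviews), or use Enterprise + BAA + BYOK if voice is required.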
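
One way to keep screeners clear of these identifiers is to lint question text before a study launches. The Python sketch below is a rough keyword check, not a Koji feature: the `IDENTIFIER_HINTS` map and `lint_question` function are assumptions for this example, and the phrase list covers only a handful of the 18 categories.

```python
# Hypothetical keyword map: identifier category -> phrases that suggest
# a screener question is eliciting that identifier. Extend for your domain.
IDENTIFIER_HINTS = {
    "name": ("your name", "full name"),
    "date": ("date of birth", "birthday", "admission date", "discharge date"),
    "geography": ("zip code", "street address", "home address"),
    "contact": ("phone number", "email address", "fax number"),
    "record number": ("medical record number", "member id", "social security"),
}


def lint_question(question: str) -> list[str]:
    """Return the identifier categories a question appears to ask for."""
    lowered = question.lower()
    return [
        category
        for category, hints in IDENTIFIER_HINTS.items()
        if any(hint in lowered for hint in hints)
    ]


if __name__ == "__main__":
    print(lint_question("What is your date of birth?"))
    # ['date']
    print(lint_question("How often do you log medication intake?"))
    # []
```

A keyword check like this produces false negatives, so it works best as a pre-launch prompt for a human reviewer rather than as an approval gate.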

Structured questions as a compliance lever

Koji's six structured question types — open_ended, scale, single_choice, multiple_choice, ranking, yes_no — give you a way to capture sensitive content as categorical buckets instead of free-text PHI. Examples:

Instead of asking "what medications are you on?" (free-text → likely PHI), use a multiple_choice question with anonymized therapeutic categories.
Instead of "when were you diagnosed?" (date → identifier), use a single_choice question with year ranges.
Instead of "how often do you experience symptoms?" (open-text invitation to overshare), use a scale (1-5) or a single_choice with frequency buckets.

The structured questions guide is the canonical reference; the scale questions guide covers Likert and frequency patterns specifically.
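
To make the year-range idea concrete, here is a minimal Python sketch that turns a diagnosis year into a coarse bucket and builds the option list for a single_choice question. The dictionary returned by `diagnosis_year_question` is a hypothetical shape for illustration, not Koji's question schema; only the type name single_choice comes from the list of question types above.

```python
def year_bucket(year: int, width: int = 5, floor: int = 2000) -> str:
    """Map a year to a range label such as '2015-2019'."""
    if year < floor:
        return f"before {floor}"
    start = floor + ((year - floor) // width) * width
    return f"{start}-{start + width - 1}"


def diagnosis_year_question(floor: int = 2000, last: int = 2024, width: int = 5) -> dict:
    """Build a single_choice question whose options are year ranges (hypothetical shape)."""
    ranges = sorted({year_bucket(y, width, floor) for y in range(floor, last + 1)})
    return {
        "type": "single_choice",
        "prompt": "Roughly when were you first diagnosed?",
        "options": [f"before {floor}"] + ranges,
    }


if __name__ == "__main__":
    print(year_bucket(2017))
    # 2015-2019
    print(diagnosis_year_question()["options"])
    # ['before 2000', '2000-2004', '2005-2009', '2010-2014', '2015-2019', '2020-2024']
```

Wider buckets reduce re-identification risk at the cost of analytical resolution, so pick the narrowest width your privacy office accepts.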

This is one of the most important Koji differentiators for regulated industries. Traditional survey tools force you into a binary choice: free-text (sensitive overshare risk) or rigid multiple choice (low signal). Koji's AI-moderated structured questions let the AI follow up within the bucket boundary — depth without unbounded free text.

A safer participant intake pattern

For any healthcare research study, your intake form should:

Not require email. Use anonymous mode and let participants land via a generic study link rather than a personalized one.
Include an explicit research-only consent. Starting from the research consent form templates library, add a short statement: "This research is conducted under our internal research policy, not as a clinical activity. We do not need or want your PHI. Please do not share names, dates of birth, MRNs, or specific clinical details."
Set expectations on AI moderation. Disclose that an AI is moderating, that no human will listen to recordings in real time, and that the data will be used only for product/service improvement.
Provide a contact for withdrawal. Even with anonymous mode, give participants a way to request deletion of their interview by quoting their respondent ID (visible at the end of every Koji interview).
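
For teams that track participants in their own systems, an opaque respondent ID can be sketched as a random token that is generated once and stored, rather than derived from a name or email. The Python example below is an assumption for illustration; the `resp_` prefix and `new_respondent_id` function are invented here and do not describe the format of Koji's respondent IDs.

```python
import secrets


def new_respondent_id(prefix: str = "resp") -> str:
    """Generate a random ID that carries no information about the participant.

    The ID is stable only because it is stored alongside the interview;
    it cannot be recomputed from any participant attribute.
    """
    return f"{prefix}_{secrets.token_hex(8)}"


if __name__ == "__main__":
    print(new_respondent_id())  # random on every call
```

Avoid hashing an email or MRN to produce the ID: a hash of a known value can be recomputed by anyone who holds that value, which defeats the purpose of anonymous mode.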

What Koji's architecture gives you out of the box

TLS 1.2+ in transit, AES-256 at rest for all participant content.
No model training on customer data — Koji has contractual no-train clauses with its LLM providers, and on Enterprise BYOK the contract is yours directly.
SOC 2 Type II infrastructure via Vercel and Supabase as primary sub-processors.
Per-study retention controls so transcripts can be purged on a schedule.
Workspace-scoped access so a study is only visible to the people you explicitly add.
Audit logs of who accessed which study and when (Enterprise tier).

Koji is not, by default, a HIPAA Covered Entity or Business Associate. The Enterprise tier supports a BAA on request for customers who need to send PHI through the platform. For most healthcare research programs, the de-identified pattern above is faster to launch, faster to ship insights from, and avoids the BAA path entirely.

Comparison: HIPAA on Koji vs. traditional research tools

SurveyMonkey and Typeform offer a BAA only on enterprise plans, and even then free-text responses are an overshare hazard. Without AI-moderated probing, you can't coach participants away from PHI in the moment.
Qualtrics has a HIPAA-eligible XM tier, but the platform is built around quantitative survey logic, not conversational depth. Its cost and complexity put it in enterprise-only territory.
Manual moderator interviews can be HIPAA-aligned, but they're slow and expensive. A single in-depth provider interview is typically $150–$400 in incentive plus 60–90 minutes of moderator time.
Koji lets you run de-identified AI interviews at scale on self-serve plans, and graduates to Enterprise + BAA + BYOK when a study genuinely needs PHI. The same study design, two compliance modes.

Common pitfalls to avoid

Treating anonymized as anonymous. A combination of role + employer + ZIP can re-identify a single person in a small market. Before reporting by segment, check how many participants share each combination and merge or suppress the very small groups.
Forgetting voice biometrics. Voice recordings are themselves an identifier. Default to text in HIPAA-regulated studies unless you have a BAA in place.
Exporting raw transcripts to non-BAA tools. If you push transcripts to a Notion workspace or a Slack channel, those vendors are now in scope. Use anonymized themes and quotes instead, or restrict integration use to BAA-covered destinations.
Letting screener questions identify rare conditions. A screener that filters for "adults with a rare disease in a small ZIP code" produces an effectively re-identifiable sample even before the interview starts.
Skipping the IRB conversation when PHI is involved. Most Covered Entities require IRB review for any research that touches PHI, even when an external vendor handles the moderation.
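
The re-identification risk from combined attributes can be checked with a simple k-anonymity count: group participants by their quasi-identifiers and flag any combination shared by fewer than k people. The Python sketch below assumes screener data is available as a list of dictionaries; the field names (`role`, `employer_type`, `zip3`) and the `small_groups` function are hypothetical.

```python
from collections import Counter


def small_groups(rows, quasi_identifiers, k=5):
    """Return quasi-identifier combinations shared by fewer than k participants."""
    counts = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return {combo: n for combo, n in counts.items() if n < k}


if __name__ == "__main__":
    rows = [
        {"role": "nurse", "employer_type": "hospital", "zip3": "021"},
        {"role": "nurse", "employer_type": "hospital", "zip3": "021"},
        {"role": "nurse", "employer_type": "hospital", "zip3": "021"},
        {"role": "pharmacist", "employer_type": "retail", "zip3": "021"},
    ]
    print(small_groups(rows, ["role", "employer_type", "zip3"], k=2))
    # {('pharmacist', 'retail', '021'): 1}
```

Any combination this returns is a candidate for coarsening (for example, dropping the geography field) before quotes or segment breakdowns are shared. The threshold k is a policy choice for your privacy office, not a value set by HIPAA.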

A 5-step checklist before you launch

1. Scope the question. Can you answer it without any of the 18 identifiers? If yes, default path. If no, Enterprise + BAA path.
2. Configure anonymous mode and disable demographic intake. Use behavioral screeners only.
3. Brief the AI interviewer with explicit PHI-avoidance instructions in company context.
4. Set retention to the shortest acceptable window (often 30-90 days for transcripts).
5. Review the first 3 transcripts before scaling. If participants are oversharing, tighten the AI's steering instructions and re-launch.
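
If you mirror transcripts into your own storage, the retention window from the checklist can be enforced with a small scheduled job. The Python sketch below only selects which records are due for purging; the record shape (`respondent_id`, `completed_at`) and the `transcripts_to_purge` function are assumptions for this example, and Koji's built-in per-study retention setting is configured in the product rather than through code like this.

```python
from datetime import datetime, timedelta, timezone


def transcripts_to_purge(transcripts, retention_days, now=None):
    """Return respondent IDs whose transcripts are older than the retention window.

    `transcripts` is a list of dicts with a `respondent_id` and a
    timezone-aware `completed_at` datetime (hypothetical record shape).
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=retention_days)
    return [t["respondent_id"] for t in transcripts if t["completed_at"] <= cutoff]


if __name__ == "__main__":
    now = datetime(2026, 4, 1, tzinfo=timezone.utc)
    records = [
        {"respondent_id": "resp_a", "completed_at": datetime(2025, 12, 1, tzinfo=timezone.utc)},
        {"respondent_id": "resp_b", "completed_at": datetime(2026, 3, 1, tzinfo=timezone.utc)},
    ]
    print(transcripts_to_purge(records, retention_days=90, now=now))
    # ['resp_a']
```

Passing `now` explicitly keeps the function testable and makes the cutoff date easy to audit.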

Related Resources

Structured Questions Guide — the six question types that let you capture sensitive content as categorical buckets.
GDPR-Compliant AI User Research — sister guide for EU privacy law.
Anonymizing Customer Interview Data — the field-level controls Koji provides for de-identification.
Bring Your Own Key (BYOK) — how to route LLM calls through your own contracted account on the Enterprise tier.
Research Consent Form Templates — starter templates you can adapt for healthcare contexts.
Research Screener Questions — screening patterns that don't collect identifiers.
AI Research for Healthcare — broader playbook for patient and provider research.
Research Ethics Guide — informed consent, incentives, and minimizing harm in qualitative research.
