{"site":{"name":"Koji","description":"AI-native customer research platform that helps teams conduct, analyze, and synthesize customer interviews at scale.","url":"https://www.koji.so","contentTypes":["blog","documentation"],"lastUpdated":"2026-07-11T08:42:49.734Z"},"content":[{"type":"documentation","id":"eaee5706-e75d-4e8a-9fcc-9fd073e9fadd","slug":"hipaa-compliant-ai-user-research","title":"HIPAA-Compliant AI User Research: A Practical Playbook for Healthcare and HealthTech","url":"https://www.koji.so/docs/hipaa-compliant-ai-user-research","summary":"HIPAA-compliant AI customer research is possible by scoping studies to avoid the 18 HIPAA identifiers entirely — name, DOB, MRN, email, voice biometrics, etc. The default Koji pattern uses anonymous mode (no email or demographics collected), structured screener questions, and an AI moderator briefed to steer away from PHI. Voice mode is avoided for regulated studies because voiceprints are themselves identifiers. When PHI is genuinely required, the Enterprise tier supports a Business Associate Agreement plus BYOK (Bring Your Own Key) so LLM calls route through the customer's own Anthropic, OpenAI, or Vertex contract. Per-study retention, workspace-scoped access, AES-256 at rest, and TLS 1.2+ in transit are standard. The compliance lever Koji uniquely provides is structured questions: six question types (open_ended, scale, single_choice, multiple_choice, ranking, yes_no) that let teams capture sensitive content as categorical buckets instead of free-text PHI. 
Most product, churn, onboarding, and brand research questions can be answered without any PHI at all, which makes the de-identified path faster and cheaper and spares the majority of healthcare research programs a BAA negotiation.","content":"# HIPAA-Compliant AI User Research: A Practical Playbook for Healthcare and HealthTech\n\n**Answer first:** You can run AI-moderated customer research in HIPAA-regulated contexts without ever touching Protected Health Information (PHI) — and that is almost always the right design. Most patient and provider research questions (about workflow friction, brand perception, app usability, billing experience, even symptom journeys) can be answered through interviews that are scoped to *avoid* the 18 HIPAA identifiers entirely. When PHI is genuinely required, the same study can be re-run on the [Enterprise tier](/enterprise) with a signed Business Associate Agreement, customer-controlled LLM keys (BYOK), and per-study retention controls. The BAA is **available on request and executed during onboarding** — see [/compliance/hipaa](/compliance/hipaa) for the full HIPAA-ready posture, scope, and the controls applied. Koji is designed so that the default research path produces *de-identified* qualitative data: anonymous-mode interviews, no demographic identifiers required, transcript-level redaction in reports, and a per-study retention setting. 
With tools like Koji, healthcare teams get the speed of AI customer research without inheriting the compliance overhead of a clinical system of record.\n\nIf you're a product, UX, or marketing team at a payer, provider, EHR vendor, digital therapeutics company, pharmacy, or any HealthTech startup, this guide walks through how to design AI customer research that holds up under your privacy office's review.\n\n## What HIPAA actually requires (in research terms)\n\nHIPAA's Privacy Rule applies to **Covered Entities** (health plans, healthcare providers that bill electronically, healthcare clearinghouses) and their **Business Associates** (vendors that handle PHI on a Covered Entity's behalf). It governs *Protected Health Information* — health information tied to one of 18 specific identifiers (name, address, dates more granular than year, phone, email, MRN, etc.).\n\nFor customer research, three rules dominate:\n\n1. **De-identification removes the obligation.** If your interview data does not contain any of the 18 HIPAA identifiers and there is no reasonable basis to re-identify a participant, the data is not PHI and HIPAA does not apply.\n2. **Authorization is needed to use PHI for research** outside treatment, payment, and operations — and most Covered Entities require IRB review for any study that touches PHI.\n3. **Business Associates need a BAA.** Any vendor that *receives* PHI must sign a Business Associate Agreement that flows down HIPAA obligations.\n\nThe practical implication for product teams: **scope your research so you never need to collect PHI in the first place.** Almost every product-discovery, churn, onboarding, pricing, and brand question can be answered without it.\n\n## The default Koji healthcare research pattern\n\nFor 90% of healthcare and HealthTech research, run the study with the following configuration. 
This produces interview data that is not PHI under HIPAA's de-identification standard and does not require a BAA.\n\n- **Anonymous mode on.** No email, name, phone, or address collected at intake. Participants are assigned a stable but opaque respondent ID. ([Anonymizing customer interview data](/docs/anonymizing-customer-interview-data) covers the controls in detail.)\n- **Screening avoids the 18 identifiers.** Use structured screener questions on role and behavior (\\\"how often do you log medication intake?\\\") rather than identifiable demographics (\\\"what is your date of birth?\\\"). The [research screener questions](/docs/research-screener-questions) doc has a full pattern library.\n- **The AI interviewer is briefed not to elicit PHI.** Add a short company-context instruction (see [company context guide](/docs/company-context-guide)) telling the AI moderator: *\\\"Do not ask for, and do not record, the participant's name, date of birth, address, medical record number, specific diagnoses, or specific treatment dates. If a participant volunteers this information, acknowledge it briefly and steer the conversation back to the research topic.\\\"*\n- **Per-study retention configured.** Set transcripts to auto-delete after 90 days (or whatever your privacy office approves). Reports and themes remain; raw transcripts are purged.\n- **Reports use anonymized quotes.** Koji's AI-generated reports surface themes and pull representative quotes. Configure the export to scrub any inadvertent identifiers before sharing outside your team.\n\nThis pattern lets a digital health PM run 30 patient interviews in a week, surface the friction themes, and ship the fix — without putting their company through a BAA negotiation for every study.\n\n## When you genuinely need PHI: the Enterprise path\n\nSome research questions cannot be answered without PHI. 
Examples: post-discharge care research that requires linking back to the original encounter; clinical trial recruitment screening; a payer study comparing benefit utilization across named members; provider research that requires NPI-level tracking.\n\nFor those studies:\n\n- **Move to the Enterprise tier** and request a Business Associate Agreement before any PHI flows. Standard self-serve plans (Insights, Interviews) do not include a BAA.\n- **Use BYOK (Bring Your Own Key)** so the LLM calls hit *your* Anthropic, OpenAI, or Google Vertex account on a contract you've already negotiated with HIPAA terms. See the [Bring Your Own Key](/docs/bring-your-own-key) doc for setup. BYOK means the LLM provider is *your* sub-processor, not Koji's, and the conversation content never enters the default shared inference pipeline.\n- **Restrict the participant audience** with [personalized interview links](/docs/personalized-interview-links) tied to an internal participant ID rather than a name or email.\n- **Tighten retention and access.** Restrict the study to specific workspace members, set retention to the shortest period your protocol allows, and disable export to third-party tools like Slack and Notion unless those vendors are also under BAA.\n\nThis is the same pattern enterprise healthcare buyers expect from any AI vendor in 2026 — and it's the reason Koji's architecture separates the application layer (Vercel + Supabase, both SOC 2 Type II) from the inference layer (where BYOK lets you bring your own contracts).\n\n## The 18 HIPAA identifiers your interviews must avoid\n\nIf your study is running under the de-identification path, the AI moderator and your screener should never elicit or store:\n\n1. Names\n2. Geographic subdivisions smaller than a state (street, city, ZIP — except first 3 digits of ZIP in some cases)\n3. Dates more granular than year (DOB, admission date, discharge date)\n4. Phone numbers\n5. Fax numbers\n6. Email addresses\n7. Social Security numbers\n8. 
Medical record numbers\n9. Health plan beneficiary numbers\n10. Account numbers\n11. Certificate / license numbers\n12. Vehicle identifiers\n13. Device identifiers and serial numbers\n14. URLs that identify the individual\n15. IP addresses\n16. Biometric identifiers (fingerprints, voiceprints)\n17. Full-face photos or comparable images\n18. Any other unique identifying number, characteristic, or code\n\nNumber 16 is worth a pause for voice-mode interviews. A voiceprint is a HIPAA identifier. For studies in regulated contexts, prefer text-mode interviews ([voice vs text interviews](/docs/voice-vs-text-interviews)), or use Enterprise + BAA + BYOK if voice is required.\n\n## Structured questions as a compliance lever\n\nKoji's six structured question types — open_ended, scale, single_choice, multiple_choice, ranking, yes_no — give you a way to capture sensitive content as *categorical buckets* instead of free-text PHI. Examples:\n\n- Instead of asking \\\"what medications are you on?\\\" (free-text → likely PHI), use a `multiple_choice` question with anonymized therapeutic categories.\n- Instead of \\\"when were you diagnosed?\\\" (date → identifier), use a `single_choice` question with year ranges.\n- Instead of \\\"how often do you experience symptoms?\\\" (open-text invitation to overshare), use a `scale` (1-5) or a `single_choice` with frequency buckets.\n\nThe [structured questions guide](/docs/structured-questions-guide) is the canonical reference; the [scale questions guide](/docs/scale-questions-guide) covers Likert and frequency patterns specifically.\n\nThis is one of the most important Koji differentiators for regulated industries. Traditional survey tools force you into a binary choice: free-text (sensitive overshare risk) or rigid multiple choice (low signal). 
Koji's AI-moderated structured questions let the AI follow up *within* the bucket boundary — depth without unbounded free text.\n\n## A safer participant intake pattern\n\nFor any healthcare research study, your intake form should:\n\n- **Not require email.** Use anonymous mode and let participants land via a generic study link rather than a personalized one.\n- **Include an explicit research-only consent.** Re-using the [research consent form templates](/docs/research-consent-form-templates) library, add a sentence: *\\\"This research is conducted under our internal research policy, not as a clinical activity. We do not need or want your PHI. Please do not share names, dates of birth, MRNs, or specific clinical details.\\\"*\n- **Set expectations on AI moderation.** Disclose that an AI is moderating, that no human will listen to recordings in real time, and that the data will be used only for product/service improvement.\n- **Provide a contact for withdrawal.** Even with anonymous mode, give participants a way to request deletion of their interview by quoting their respondent ID (visible at the end of every Koji interview).\n\n## What Koji's architecture gives you out of the box\n\n- **TLS 1.2+ in transit, AES-256 at rest** for all participant content.\n- **No model training on customer data** — Koji has contractual no-train clauses with its LLM providers, and on Enterprise BYOK the contract is yours directly.\n- **SOC 2 Type II infrastructure** via Vercel and Supabase as primary sub-processors.\n- **Per-study retention controls** so transcripts can be purged on a schedule.\n- **Workspace-scoped access** so a study is only visible to the people you explicitly add.\n- **Audit logs** of who accessed which study and when (Enterprise tier).\n\nKoji is not, by default, a HIPAA Covered Entity or Business Associate. The Enterprise tier supports a BAA on request for customers who need to send PHI through the platform. 
For most healthcare research programs, the de-identified pattern above is faster to launch, faster to ship insights from, and avoids the BAA path entirely.\n\n## Comparison: HIPAA on Koji vs. traditional research tools\n\n- **SurveyMonkey and Typeform** sign a BAA only on enterprise plans — and even then, free-text responses are an overshare hazard. Without AI-moderated probing, you can't coach participants away from PHI in the moment.\n- **Qualtrics** has a HIPAA-eligible XM tier, but the platform is built around quantitative survey logic, not conversational depth. The cost and complexity are enterprise-only.\n- **Manual moderator interviews** can be HIPAA-aligned but they're slow and expensive. A single in-depth provider interview is typically $150–$400 in incentive plus 60-90 minutes of moderator time.\n- **Koji** lets you run de-identified AI interviews at scale on self-serve plans, and graduates to Enterprise + BAA + BYOK when a study genuinely needs PHI. The same study design, two compliance modes.\n\n## Common pitfalls to avoid\n\n- **Treating anonymized as anonymous.** A combination of role + employer + ZIP can re-identify a single person in a small market. Review the fields you collect in combination, not just one at a time.\n- **Forgetting voice biometrics.** Voice recordings are themselves an identifier. Default to text in HIPAA-regulated studies unless you have a BAA in place.\n- **Exporting raw transcripts to non-BAA tools.** If you push transcripts to a Notion workspace or a Slack channel, those vendors are now in scope. 
Use anonymized themes and quotes instead, or restrict integration use to BAA-covered destinations.\n- **Letting screener questions identify rare conditions.** A screener that filters for \\\"adults with a rare disease in a small ZIP code\\\" produces an effectively re-identifiable sample even before the interview starts.\n- **Skipping the IRB conversation when PHI is involved.** Most Covered Entities require IRB review for any research that touches PHI, even when an external vendor handles the moderation.\n\n## A 5-step checklist before you launch\n\n1. **Scope the question.** Can you answer it without any of the 18 identifiers? If yes, use the default path. If no, use the Enterprise + BAA path.\n2. **Configure anonymous mode and disable demographic intake.** Use behavioral screeners only.\n3. **Brief the AI interviewer** with explicit PHI-avoidance instructions in company context.\n4. **Set retention** to the shortest acceptable window (often 30-90 days for transcripts).\n5. **Review the first 3 transcripts** before scaling. 
If participants are oversharing, tighten the AI's steering instructions and re-launch.\n\n## Related Resources\n\n- [Structured Questions Guide](/docs/structured-questions-guide) — the six question types that let you capture sensitive content as categorical buckets.\n- [GDPR-Compliant AI User Research](/docs/gdpr-compliant-ai-user-research) — sister guide for EU privacy law.\n- [Anonymizing Customer Interview Data](/docs/anonymizing-customer-interview-data) — the field-level controls Koji provides for de-identification.\n- [Bring Your Own Key (BYOK)](/docs/bring-your-own-key) — how to route LLM calls through your own contracted account on the Enterprise tier.\n- [Research Consent Form Templates](/docs/research-consent-form-templates) — starter templates you can adapt for healthcare contexts.\n- [Research Screener Questions](/docs/research-screener-questions) — screening patterns that don't collect identifiers.\n- [AI Research for Healthcare](/docs/ai-research-for-healthcare) — broader playbook for patient and provider research.\n- [Research Ethics Guide](/docs/research-ethics-guide) — informed consent, incentives, and minimizing harm in qualitative research.","category":"Research Operations","lastModified":"2026-06-26T03:19:50.795273+00:00","metaTitle":"HIPAA-Compliant AI User Research: Healthcare & HealthTech Playbook","metaDescription":"Run AI-moderated customer research in healthcare contexts without putting PHI at risk. Anonymous-mode patterns, BYOK, BAA path, and the 18 HIPAA identifiers your interviews must avoid.","keywords":["hipaa compliant user research","hipaa research tool","hipaa survey tool","healthtech user research","patient research compliance","ai user research healthcare","hipaa de-identification","healthcare customer research","baa user research"],"aiSummary":"HIPAA-compliant AI customer research is possible by scoping studies to avoid the 18 HIPAA identifiers entirely — name, DOB, MRN, email, voice biometrics, etc. 
The default Koji pattern uses anonymous mode (no email or demographics collected), structured screener questions, and an AI moderator briefed to steer away from PHI. Voice mode is avoided for regulated studies because voiceprints are themselves identifiers. When PHI is genuinely required, the Enterprise tier supports a Business Associate Agreement plus BYOK (Bring Your Own Key) so LLM calls route through the customer's own Anthropic, OpenAI, or Vertex contract. Per-study retention, workspace-scoped access, AES-256 at rest, and TLS 1.2+ in transit are standard. The compliance lever Koji uniquely provides is structured questions: six question types (open_ended, scale, single_choice, multiple_choice, ranking, yes_no) that let teams capture sensitive content as categorical buckets instead of free-text PHI. Most product, churn, onboarding, and brand research questions can be answered without any PHI at all, which makes the de-identified path faster and cheaper and spares the majority of healthcare research programs a BAA negotiation.","aiPrerequisites":["Basic understanding of the HIPAA Privacy Rule","A research question that touches healthcare context","A Koji account (self-serve for de-identified studies; Enterprise for PHI)"],"aiLearningOutcomes":["Understand which research questions need PHI and which do not","Design AI interview studies that stay outside HIPAA scope","Configure Koji anonymous mode, screeners, and AI steering for healthcare research","Know when to escalate to Enterprise + BAA + BYOK","Avoid the 5 most common HIPAA pitfalls in customer research"],"aiDifficulty":"intermediate","aiEstimatedTime":"18 min read"}],"pagination":{"total":1,"returned":1,"offset":0}}