- What types of personal information does this tool detect?
- Three categories: names (sequences of two or more capitalised words that are not at the start of a sentence, typical of proper nouns such as "Sarah Johnson" or "Dr. Marcus Webb"), email addresses (the local-part@domain.tld shape described in RFC 5322), and phone numbers (E.164-style international numbers such as +1 415 555 0182, US/UK local formats such as (415) 555-0182, and other strings of 7–15 digits with common separators). The tool is intentionally conservative: it highlights candidate matches and lets you uncheck false positives before redacting.
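The email and phone rules described above could be sketched roughly as follows. This is a minimal illustration, not the tool's actual code: the names `EMAIL_RE`, `PHONE_RE`, and `findCandidates` are hypothetical, and the email pattern is a deliberately loose approximation rather than a full RFC 5322 parser.

```javascript
// Hypothetical patterns for illustration only; the tool's real expressions
// are not shown in this FAQ.

// Loose email shape: local-part@domain.tld
const EMAIL_RE = /[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)*\.[A-Za-z]{2,}/g;

// Phone candidates: optional "+", then digits mixed with spaces, dots,
// hyphens, or parentheses, starting and ending on a digit.
const PHONE_RE = /\+?\(?\d[\d\s().-]{5,}\d/g;

function findCandidates(text) {
  const hits = [];
  for (const m of text.matchAll(EMAIL_RE)) {
    hits.push({ type: "EMAIL", start: m.index, end: m.index + m[0].length, value: m[0] });
  }
  for (const m of text.matchAll(PHONE_RE)) {
    // Keep only runs with 7 to 15 digits, the range described in the answer above.
    const digitCount = m[0].replace(/\D/g, "").length;
    if (digitCount >= 7 && digitCount <= 15) {
      hits.push({ type: "PHONE", start: m.index, end: m.index + m[0].length, value: m[0] });
    }
  }
  // Return candidates in document order so they can be listed for review.
  return hits.sort((a, b) => a.start - b.start);
}

const sample = "Contact Sarah at sarah.j@example.org or (415) 555-0182, room 12.";
console.log(findCandidates(sample).map((h) => `${h.type}: ${h.value}`));
```

Returning candidates with offsets, rather than redacting immediately, matches the review-then-redact flow the answer describes: each hit can be shown with a checkbox before anything is replaced.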
- Will it catch every name in my transcript?
- Not necessarily. Name detection uses heuristic pattern matching: it finds runs of Title-Cased words (like "Anna García" or "Prof. David Lee") that are unlikely to be sentence-starting common words. It will miss names written in ALL CAPS or all lowercase, and single first names without a surname. For high-stakes anonymisation (IRB submissions, GDPR compliance, legal discovery), always do a manual pass after running this tool. The scan is a time-saving first pass, not a certified redaction system.
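A Title-Case run heuristic of the kind described above might look like the sketch below. It is an assumption-laden illustration, not the tool's implementation: `RUN_RE`, `COMMON_STARTERS`, and `findNames` are hypothetical names, and the list of common sentence-starting words is a tiny placeholder.

```javascript
// Hypothetical sketch of a Title-Case run heuristic; not the tool's actual code.

// Placeholder list of common words that often start a sentence.
const COMMON_STARTERS = new Set(["The", "Then", "We", "When", "Yesterday", "After", "But", "And", "So"]);

// Optional title, then two or more Title-Cased words (Unicode-aware, so "García" works).
const RUN_RE = /(?:(?:Dr|Prof|Mr|Mrs|Ms)\.\s+)?\p{Lu}\p{Ll}+(?:\s+\p{Lu}\p{Ll}+)+/gu;

function findNames(text) {
  const names = [];
  for (const m of text.matchAll(RUN_RE)) {
    let value = m[0];
    let start = m.index;
    // Strip leading common words such as "Yesterday" that are capitalised
    // only because they begin a sentence.
    let first;
    while ((first = value.match(/^(\p{L}+)\s+/u)) && COMMON_STARTERS.has(first[1])) {
      start += first[0].length;
      value = value.slice(first[0].length);
    }
    // Require at least two name words, not counting a title like "Prof."
    const words = value.split(/\s+/).filter((w) => !w.endsWith("."));
    if (words.length >= 2) {
      names.push({ type: "NAME", start, end: start + value.length, value });
    }
  }
  return names;
}

const sample =
  "Yesterday Anna García met Prof. David Lee. MARIA LOPEZ and john smith also joined, and Tom left early.";
console.log(findNames(sample).map((n) => n.value));
```

In this sample the sketch picks up "Anna García" and "Prof. David Lee" but skips "MARIA LOPEZ", "john smith", and the lone first name "Tom", which are exactly the kinds of misses the answer warns about.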
- Is my transcript data sent to any server?
- No. This page is a self-contained HTML file. All detection and redaction logic is JavaScript that runs entirely inside your browser, and your text is never sent in a network request. You can verify this by opening DevTools → Network and watching the request list while you scan. The page also works offline after the first load because there are no external libraries or API calls.
- What do the placeholders look like in the output?
- Each redacted span is replaced with a bracketed label matching its type: [NAME] for person names, [EMAIL] for email addresses, and [PHONE] for phone numbers. These tokens are easy to search and count in downstream tools (Word, NVivo, Atlas.ti, etc.) and are a common convention in qualitative research transcription.
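The placeholder substitution described above can be sketched as follows. This is a minimal illustration under stated assumptions: `redact` and `countPlaceholders` are hypothetical helpers, and the spans are assumed to be non-overlapping `{ type, start, end }` objects that the user has already confirmed.

```javascript
// Hypothetical helpers for illustration; not the tool's actual code.

// Replace each confirmed span with its bracketed label.
// Assumes spans do not overlap.
function redact(text, spans) {
  // Work from the last span to the first so earlier offsets stay valid.
  const ordered = [...spans].sort((a, b) => b.start - a.start);
  let out = text;
  for (const s of ordered) {
    out = out.slice(0, s.start) + `[${s.type}]` + out.slice(s.end);
  }
  return out;
}

// Count placeholders in redacted text, e.g. for a quick audit.
function countPlaceholders(text) {
  const counts = { NAME: 0, EMAIL: 0, PHONE: 0 };
  for (const m of text.matchAll(/\[(NAME|EMAIL|PHONE)\]/g)) {
    counts[m[1]] += 1;
  }
  return counts;
}

const sample = "Sarah Johnson: sarah@example.org, (415) 555-0182";
// Build spans by locating each value in the sample text.
const spanFor = (type, value) => {
  const start = sample.indexOf(value);
  return { type, start, end: start + value.length };
};
const redacted = redact(sample, [
  spanFor("NAME", "Sarah Johnson"),
  spanFor("EMAIL", "sarah@example.org"),
  spanFor("PHONE", "(415) 555-0182"),
]);
console.log(redacted);
console.log(countPlaceholders(redacted));
```

Replacing from the end of the text backwards avoids recomputing offsets after each substitution, since a label is rarely the same length as the span it replaces.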
- Can I use this for GDPR, HIPAA, or IRB research anonymisation?
- This tool is a practical aid to speed up de-identification, but it does not guarantee complete anonymisation on its own. GDPR, HIPAA, and IRB protocols typically require human review of the final output. Use this tool as a first-pass sweep to remove obvious PII, then conduct your own review of the result before sharing. Always consult your institution's data governance requirements.