Transcript Redactor

Remove names, emails, and phone numbers from interview transcripts — browser-only, zero upload, works offline. Review every match before redacting.

Step 1 — Paste or upload your transcript

Nothing leaves your browser. All processing happens in JavaScript on your device.

Step 2 — Review detected matches

Step 3 — Redacted transcript

How it works

1. Paste or upload your raw interview transcript, including speaker labels and timestamps.
2. Scan runs three independent pattern detectors: capitalised-noun name detection, an RFC 5322-style email regex, and E.164 plus common local phone formats.
3. Review every highlighted match inline. Uncheck any false positives (e.g. brand names you want to keep) before continuing.
4. Redact replaces each checked match with its placeholder: [NAME], [EMAIL], or [PHONE]. Copy or download the clean text.

All processing runs in your browser — the original text and the redacted output never leave your device.
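
The redact step can be sketched in a few lines of JavaScript. This is a minimal illustration, not the tool's actual source: the match shape ({ start, end, type, checked }) and the names PLACEHOLDERS and redact are assumptions made for the example, and overlapping matches are not handled.

```javascript
// Hypothetical match shape: { start, end, type, checked }, with "end" exclusive.
const PLACEHOLDERS = { name: "[NAME]", email: "[EMAIL]", phone: "[PHONE]" };

function redact(text, matches) {
  // Keep only the matches the reviewer left checked, then work from the end
  // of the text backwards so earlier offsets stay valid after each replacement.
  const selected = matches
    .filter((m) => m.checked)
    .sort((a, b) => b.start - a.start);

  let output = text;
  for (const m of selected) {
    output = output.slice(0, m.start) + PLACEHOLDERS[m.type] + output.slice(m.end);
  }
  return output;
}

const sample = "Call Sarah Johnson at sarah@example.com.";
const matches = [
  { start: 5, end: 18, type: "name", checked: true },
  { start: 22, end: 39, type: "email", checked: false }, // unchecked in review
];
console.log(redact(sample, matches));
// → "Call [NAME] at sarah@example.com."
```

Replacing from the last match to the first avoids recomputing offsets, because each placeholder usually differs in length from the text it replaces.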

Frequently asked questions

What types of personal information does this tool detect?
Three categories: names (sequences of two or more capitalised words not at the start of a sentence, typical of proper nouns like "Sarah Johnson" or "Dr. Marcus Webb"), email addresses (RFC 5322-style pattern: local-part@domain.tld), and phone numbers (E.164-style international numbers such as +1 415 555 0182, US/UK local formats like (415) 555-0182, and other strings of 7–15 digits with separators). The tool is intentionally conservative: it only highlights candidate matches, and you uncheck any false positives before anything is redacted.
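
As a rough illustration of the email and phone detectors described above, the sketch below uses two deliberately simplified regular expressions. They are assumptions made for this example, not the tool's real patterns: the email pattern covers only the common local-part@domain.tld shape rather than the full RFC 5322 grammar, and the phone pattern accepts an optional "+" followed by 7 to 15 digits with optional separators. The helper name findMatches is also hypothetical.

```javascript
// Simplified patterns for illustration; the tool's real expressions may differ.
const EMAIL_RE = /[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+/g;
// Optional "+" and "(", one digit, then 6 to 14 more digits, each preceded by
// up to two separators (whitespace, dot, hyphen, or parenthesis).
const PHONE_RE = /\+?\(?\d(?:[\s().-]{0,2}\d){6,14}/g;

function findMatches(text, regex, type) {
  // matchAll needs the "g" flag and yields each match together with its index.
  return Array.from(text.matchAll(regex), (m) => ({
    start: m.index,
    end: m.index + m[0].length,
    type,
    value: m[0],
    checked: true, // every candidate starts checked and can be unchecked in review
  }));
}

const line = "Reach me at sarah@example.com or (415) 555-0182.";
const found = [
  ...findMatches(line, EMAIL_RE, "email"),
  ...findMatches(line, PHONE_RE, "phone"),
];
console.log(found.map((m) => `${m.type}: ${m.value}`));
// → ["email: sarah@example.com", "phone: (415) 555-0182"]
```

A pattern this loose will also flag other digit runs, such as dates written as 2023-01-15, which is why the review step matters.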
Will it catch every name in my transcript?
Name detection uses heuristic pattern matching: it finds runs of Title-Cased words (like "Anna García" or "Prof. David Lee") that are unlikely to be sentence-starting common words. It will miss names written in ALL CAPS or all lowercase, as well as single first names without a surname. For high-stakes anonymisation (IRB submissions, GDPR compliance, legal discovery), always do a manual pass after running this tool. The scan is a time-saving first pass, not a certified redaction system.
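
The heuristic above might look something like the following sketch. It is one possible reading of the description, not the tool's code: the honorific list, the COMMON_OPENERS stoplist, and the function name findNames are all assumptions chosen for the example, and the stoplist is far shorter than a real one would need to be.

```javascript
// One Title-Cased word: an uppercase letter followed by lowercase letters.
// The "u" flag enables \p{...} classes, so accented names like "García" match.
const TITLE_WORD = String.raw`\p{Lu}\p{Ll}+`;
// Optional honorific, then two or more Title-Cased words on the same line.
// The lookbehind stops a match from starting in the middle of a word.
const NAME_RE = new RegExp(
  String.raw`(?<!\p{L})(?:(?:Dr|Prof|Mr|Mrs|Ms)\.[ \t]+)?${TITLE_WORD}(?:[ \t]+${TITLE_WORD})+`,
  "gu"
);
// Hypothetical stoplist of capitalised words that usually open a sentence.
const COMMON_OPENERS = new Set(["The", "Then", "When", "After", "But", "And", "So", "Yes", "No", "Well"]);

function findNames(text) {
  const results = [];
  for (const m of text.matchAll(NAME_RE)) {
    let value = m[0];
    let start = m.index;
    const firstWord = value.split(/\s+/)[0];
    if (COMMON_OPENERS.has(firstWord)) {
      // Drop the opener and the whitespace after it, shifting the start offset.
      const trimmed = value.slice(firstWord.length).trimStart();
      start += value.length - trimmed.length;
      value = trimmed;
    }
    // Keep only candidates that still contain at least two words.
    if (value.split(/\s+/).length >= 2) {
      results.push({ start, end: start + value.length, type: "name", value, checked: true });
    }
  }
  return results;
}

const names = findNames("Then Anna García met Prof. David Lee. MARIA and bob stayed.");
console.log(names.map((n) => n.value));
// → ["Anna García", "Prof. David Lee"]
```

The sample shows both the intended matches and the documented gaps: "MARIA" and "bob" are skipped because neither is Title-Cased.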
Is my transcript data sent to any server?
No. This page is a self-contained HTML file. All detection and redaction logic is JavaScript that runs entirely inside your browser. No network requests are made with your text. You can verify this by opening DevTools → Network and confirming that no requests appear while you scan. The page also works offline after the first load because there are no external libraries or API calls.
What do the placeholders look like in the output?
Each redacted span is replaced with a bracketed label matching its type: [NAME] for person names, [EMAIL] for email addresses, and [PHONE] for phone numbers. These tokens are easy to search and count in downstream tools (Word, NVivo, Atlas.ti, etc.) and are a common convention in qualitative research transcription.
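
Because the placeholders are fixed tokens, counting them takes very little code. The helper below is a hypothetical example for checking a redacted transcript, not part of the tool itself; the name countPlaceholders is an assumption.

```javascript
// Count how often each placeholder token appears in a redacted transcript.
function countPlaceholders(redactedText) {
  const counts = { NAME: 0, EMAIL: 0, PHONE: 0 };
  for (const m of redactedText.matchAll(/\[(NAME|EMAIL|PHONE)\]/g)) {
    counts[m[1]] += 1; // m[1] is the label captured inside the brackets
  }
  return counts;
}

const example = "[NAME] emailed [EMAIL]; later [NAME] called [PHONE].";
console.log(countPlaceholders(example));
// → { NAME: 2, EMAIL: 1, PHONE: 1 }
```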
Can I use this for GDPR, HIPAA, or IRB research anonymisation?
This tool is a practical aid to speed up de-identification, but it does not guarantee complete anonymisation on its own. GDPR, HIPAA, and IRB protocols typically require human review of the final output. Use this tool as a first-pass sweep to remove obvious PII, then conduct your own review of the result before sharing. Always consult your institution's data governance requirements.