- Is this really free with no signup?
- Yes. There is no account, no subscription, and no watermark. The tool is a single HTML file that runs locally in your browser. You can even save it and use it without an internet connection once the page has loaded.
- Does my PDF get uploaded anywhere?
- No. The file never leaves your device. All processing — text extraction, OCR, and Excel generation — runs inside your browser tab using WebAssembly and JavaScript. This makes it safe for confidential PDFs like financial statements, contracts, or medical records.
- What is the difference between "Text PDF" and "Scanned PDF" mode?
- A text PDF (also called a native or digital PDF) has actual selectable text embedded in it — you can copy-paste text from it in a regular PDF viewer. The converter extracts this text directly, which is fast and accurate. A scanned PDF is just an image of a page (e.g. a photocopied document). No text is embedded, so the tool renders the page as a picture and runs OCR to recognise the characters. OCR is slower and accuracy depends on scan quality, but it handles most real-world scanned documents well. "Auto" mode checks each page individually and picks the right method.
- How does the table detection work?
- For text PDFs, each text item returned by pdf.js carries an X/Y position. The converter clusters items with similar Y positions into rows, then sorts the items within each row by X position to form columns. Rows with similar column counts are treated as one table. For OCR output, the word-position data from Tesseract's hOCR output is used in the same way. The result is not always perfect (complex multi-column layouts or nested tables may need manual cleanup in Excel), but it correctly handles the vast majority of straightforward data tables.
- What PDF types or layouts might not convert perfectly?
- Layouts that can produce imperfect results include multiple tables side by side on the same page, tables that span two pages, merged cells, and heavily styled PDFs with coloured backgrounds. In these cases, you may need to tidy the output in Excel. Simple single-column tables, financial statements, and structured data exports typically convert cleanly.
- Can it handle large PDFs?
- Yes, but OCR mode is CPU-intensive. A 50-page text PDF typically converts in a few seconds. A 50-page scanned PDF may take several minutes in OCR mode, depending on your device's speed. The progress bar shows you each page as it is processed so you can see it working. There is no artificial file-size limit.
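
For readers curious about the table-detection answer above, the row-and-column clustering can be sketched roughly as follows. This is a minimal illustration, not the converter's actual code: the function names `clusterRows` and `groupTables`, the `{ str, x, y }` item shape, and the 3-unit Y tolerance are all assumptions made for the example, and "similar column counts" is simplified here to "consecutive rows with the same number of cells".

```javascript
// Illustrative sketch only: names, item shape, and tolerance are assumptions.
// Each item is { str, x, y }, a simplified stand-in for a positioned text item.

// Cluster items with similar Y positions into rows, then sort each row by X.
function clusterRows(items, yTolerance = 3) {
  // PDF Y coordinates grow upward, so sort descending to read top to bottom.
  const sorted = [...items].sort((a, b) => b.y - a.y);
  const rows = [];
  for (const item of sorted) {
    const last = rows[rows.length - 1];
    if (last && Math.abs(last.y - item.y) <= yTolerance) {
      last.items.push(item); // close enough in Y: same row
    } else {
      rows.push({ y: item.y, items: [item] }); // start a new row
    }
  }
  // Within each row, order the cells left to right by X position.
  return rows.map((row) =>
    row.items.sort((a, b) => a.x - b.x).map((item) => item.str)
  );
}

// Simplification: consecutive rows with the same cell count form one table.
function groupTables(rows) {
  const tables = [];
  for (const row of rows) {
    const current = tables[tables.length - 1];
    if (current && current[0].length === row.length) {
      current.push(row);
    } else {
      tables.push([row]);
    }
  }
  return tables;
}

// Usage: items may arrive in any order and with slightly uneven Y values.
const items = [
  { str: "Price", x: 200, y: 700.5 },
  { str: "Item", x: 50, y: 700 },
  { str: "9.50", x: 200, y: 680 },
  { str: "Tea", x: 50, y: 681 },
];
const rows = clusterRows(items);
// rows is [["Item", "Price"], ["Tea", "9.50"]]
const tables = groupTables(rows);
// tables holds a single table containing both rows
```

A fixed tolerance like this is the simplest choice. A real implementation would likely scale it with font size, which is one reason merged cells and side-by-side tables can confuse this kind of approach.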