PDF to Excel Converter — Offline & Free

Drop a PDF with tables, whether text-based or scanned, and download a .xlsx file, typically within seconds for text PDFs. Everything happens in your browser. Your PDF is never uploaded or sent anywhere.

Drop your PDF here

or click to browse — PDF files only

100% OFFLINE: File stays on your device. Zero uploads. Zero tracking.

Mode

OCR Language

One worksheet per page (uncheck to merge all pages into one worksheet)

Trim whitespace from cells

How it works

All processing runs inside your browser using open-source libraries. Nothing is sent to any server.

1. Load PDF: pdf.js parses the PDF binary entirely in-browser, so no upload is needed.

2. Extract text: For text PDFs, the position of each text fragment is read directly. Nearby fragments are grouped into rows and columns by their X/Y coordinates.

3. OCR for scans: Scanned pages are rendered to a canvas and passed to Tesseract.js, which runs a local OCR engine (an LSTM neural network) with trained models for 100+ languages.

4. Export to Excel: SheetJS (xlsx) writes each page's table as a worksheet and packages everything into a standard .xlsx file that Excel, Numbers, and Google Sheets can open.
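
For the curious, the grouping in step 2 might look roughly like the sketch below. It is an illustration, not the converter's actual code: the TextItem shape, the groupIntoRows name, and the yTolerance value are all assumptions made for this example.

```typescript
// Hypothetical shape: one text fragment with the X/Y position of its origin.
interface TextItem {
  text: string;
  x: number;
  y: number;
}

// Cluster fragments into rows by Y, then order each row by X.
// yTolerance is an assumed tuning value, not one taken from the tool.
function groupIntoRows(items: TextItem[], yTolerance = 3): string[][] {
  // PDF coordinates grow upward, so sort by descending Y to read top to bottom.
  const sorted = [...items].sort((a, b) => b.y - a.y);
  const rows: TextItem[][] = [];
  let rowY = 0;
  for (const item of sorted) {
    const current = rows[rows.length - 1];
    if (current !== undefined && Math.abs(item.y - rowY) <= yTolerance) {
      current.push(item);
    } else {
      rows.push([item]);
      rowY = item.y; // anchor the row at its first (topmost) fragment
    }
  }
  // Left-to-right order within each row gives the column order.
  return rows.map((row) =>
    [...row].sort((a, b) => a.x - b.x).map((item) => item.text)
  );
}

const sampleItems: TextItem[] = [
  { text: "Qty", x: 200, y: 700 },
  { text: "Item", x: 50, y: 701 },
  { text: "Widget", x: 50, y: 680 },
  { text: "4", x: 200, y: 679 },
];
const sampleTable = groupIntoRows(sampleItems);
// sampleTable is [["Item", "Qty"], ["Widget", "4"]]
```

The fragments arrive in no particular order and their Y values differ by a point or so, yet they still land in two clean rows. A small tolerance like this is what lets slightly misaligned text on the same line stay together.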

Frequently asked questions

Is this really free with no signup?: Yes. There is no account, no subscription, and no watermark. The tool is a single HTML file that runs locally in your browser. You can even save it and use it without an internet connection once the page has loaded.
Does my PDF get uploaded anywhere?: No. The file never leaves your device. All processing — text extraction, OCR, and Excel generation — runs inside your browser tab using WebAssembly and JavaScript. This makes it safe for confidential PDFs like financial statements, contracts, or medical records.
What is the difference between "Text PDF" and "Scanned PDF" mode?: A text PDF (also called a native or digital PDF) has actual selectable text embedded in it — you can copy-paste text from it in a regular PDF viewer. The converter extracts this text directly, which is fast and accurate. A scanned PDF is just an image of a page (e.g. a photocopied document). No text is embedded, so the tool renders the page as a picture and runs OCR to recognise the characters. OCR is slower and accuracy depends on scan quality, but it handles most real-world scanned documents well. "Auto" mode checks each page individually and picks the right method.
How does the table detection work?: For text PDFs, each text fragment returned by pdf.js carries an X/Y position. The converter clusters fragments with similar Y positions into rows, then sorts the fragments within each row by X position to form columns. Rows with similar column counts are treated as one table. For OCR output, the word-position data from Tesseract's hOCR output is used in the same way. The result is not always perfect: complex multi-column layouts or nested tables may need manual cleanup in Excel. Straightforward data tables, however, usually convert correctly.
What PDF types or layouts might not convert perfectly?: Layouts with several tables side by side on the same page, tables that span two pages, merged cells, and heavily styled PDFs with coloured backgrounds can produce imperfect results. For these cases, you may need to tidy the output in Excel. Simple tables in single-column layouts, financial statements, and structured data exports typically convert cleanly.
Can it handle large PDFs?: Yes, but OCR mode is CPU-intensive. A 50-page text PDF typically converts in a few seconds. A 50-page scanned PDF may take several minutes in OCR mode, depending on your device's speed. The progress bar shows you each page as it is processed so you can see it working. There is no artificial file-size limit.
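
A note for developers: the table-detection answer above says that rows with similar column counts are treated as one table. One way to picture that rule is the sketch below. It is only an assumption about how such a rule could work: the splitIntoTables name, the maxDiff threshold, and the choice to compare each row against the first row of the current table are all made up for this example.

```typescript
// Split a page's rows into tables. A row joins the current table when its
// cell count is within maxDiff of that table's first row; otherwise it
// starts a new table. maxDiff is an assumed threshold.
function splitIntoTables(rows: string[][], maxDiff = 1): string[][][] {
  const result: string[][][] = [];
  for (const row of rows) {
    const current = result[result.length - 1];
    const first = current?.[0];
    if (
      current !== undefined &&
      first !== undefined &&
      Math.abs(row.length - first.length) <= maxDiff
    ) {
      current.push(row);
    } else {
      result.push([row]);
    }
  }
  return result;
}

// Made-up sample rows: a title line, a three-column table, and a footer line.
const pageRows: string[][] = [
  ["Invoice"],
  ["Item", "Qty", "Price"],
  ["Widget", "4", "9.99"],
  ["Gadget", "1"],
  ["Thank you"],
];
const pageTables = splitIntoTables(pageRows);
// pageTables has 3 entries: the title, the 3-row table, and the footer
```

Comparing against the first row rather than the previous one keeps a table from drifting: a row with one missing cell still belongs, but a one-cell footer after it does not.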