PDF Table to Excel

Extract tables from any text-based PDF — financial reports, invoices, bank statements — and download as .xlsx. No upload. All processing happens inside your browser. Your file never leaves your device.

Drop a PDF here or click to choose
Text-based PDFs only (financial reports, invoices, statements)
Private by design: your PDF is read locally by pdf.js and processed entirely in this tab. Zero bytes of your file are sent to any server.

How it works

The tool reads your PDF entirely in-browser using pdf.js, then exports structured tables to .xlsx via SheetJS — no server, no upload, no account.

1. Parse text blocks: pdf.js extracts every text item with its x/y coordinate and font size from each page.
2. Detect rows & columns: Items are clustered into rows by y-coordinate proximity, then into columns by x-coordinate alignment across rows.
3. Group into tables: Consecutive rows with consistent column structure are merged into a single table. Gaps between rows break tables apart.
4. Export to Excel: SheetJS writes one worksheet per detected table into a single .xlsx file, which downloads instantly to your device.
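
The row and column detection in step 2 can be sketched roughly as follows. This is a minimal illustration, not the tool's actual code: the TextItem shape, the function names, and both tolerance values are assumptions chosen for the example.

```typescript
// Hypothetical, simplified text item: a string plus the x/y position
// that would be taken from the data pdf.js reports for it.
interface TextItem {
  str: string;
  x: number;
  y: number;
}

// Cluster items into rows: an item joins the current row when its y is
// within yTolerance of the row's first item (assumed tolerance).
function clusterRows(items: TextItem[], yTolerance = 2): TextItem[][] {
  // PDF y grows upward, so sort descending to read top to bottom.
  const sorted = items.slice().sort((a, b) => b.y - a.y);
  const rows: TextItem[][] = [];
  let current: { y: number; items: TextItem[] } | undefined;
  for (const item of sorted) {
    if (current !== undefined && Math.abs(current.y - item.y) <= yTolerance) {
      current.items.push(item);
    } else {
      current = { y: item.y, items: [item] };
      rows.push(current.items);
    }
  }
  // Order each row left to right.
  for (const row of rows) row.sort((a, b) => a.x - b.x);
  return rows;
}

// Collect column anchors: x positions that recur across rows, merged
// when they start within xTolerance of an existing anchor.
function columnAnchors(rows: TextItem[][], xTolerance = 5): number[] {
  const xs: number[] = [];
  for (const row of rows) for (const item of row) xs.push(item.x);
  xs.sort((a, b) => a - b);
  const anchors: number[] = [];
  let last = -Infinity;
  for (const x of xs) {
    if (x - last > xTolerance) {
      anchors.push(x);
      last = x;
    }
  }
  return anchors;
}

// Place every item in the column whose anchor is nearest to its x.
function toGrid(rows: TextItem[][], anchors: number[]): string[][] {
  return rows.map((row) => {
    const cells: string[] = anchors.map(() => "");
    for (const item of row) {
      let best = 0;
      let bestDist = Infinity;
      anchors.forEach((anchor, col) => {
        const dist = Math.abs(anchor - item.x);
        if (dist < bestDist) {
          bestDist = dist;
          best = col;
        }
      });
      cells[best] = cells[best] ? `${cells[best]} ${item.str}` : item.str;
    }
    return cells;
  });
}
```

Calling clusterRows, then columnAnchors, then toGrid yields an array of string arrays, which is a convenient shape for a spreadsheet writer. Real documents would likely need tolerances scaled to the font size rather than fixed values.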

What works well: financial statements, invoices, bank statements, price lists, schedules — any PDF where text was laid out in a grid. What doesn't: scanned/image PDFs (no text layer), hand-drawn tables, or PDFs with heavily merged cells.

Frequently asked questions

Is my PDF actually private — does it get uploaded anywhere?
No. The entire process runs inside your browser tab using the pdf.js library (loaded once from a CDN) and SheetJS. Your PDF bytes are read from local disk through the browser's FileReader API and are never transmitted over the network. To verify this, open the Network tab in your browser's developer tools before dropping a file: once the page has finished loading, dropping a PDF triggers no further outbound requests.
Why does my PDF show "no tables detected"?
This tool works on text-based PDFs — PDFs where the text was embedded when the file was created (e.g. exported from Excel, Word, or an accounting system). If your PDF was scanned from paper or is an image-only PDF, there is no text layer for pdf.js to parse. In that case, run an OCR step first (many free tools do this) to produce a searchable PDF, then drop it here. Also, very irregular layouts (rotated text, overlapping columns, merged header cells spanning the full width) may confuse the column-alignment detector.
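A quick way to tell a scanned PDF from a text-based one is to count how much text the parser found. The sketch below is a hypothetical pre-check, not necessarily what this tool does: the input shape (one array of strings per page) and the 20-character threshold are assumptions.

```typescript
// pagesText[i] holds the strings of the text items found on page i
// (hypothetical input shape; a scanned page would yield an empty array).
function hasTextLayer(pagesText: string[][], minChars = 20): boolean {
  let chars = 0;
  for (const page of pagesText) {
    for (const str of page) {
      // Whitespace-only items do not count as real text.
      chars += str.trim().length;
    }
  }
  return chars >= minChars;
}
```

If this returns false, the file most likely needs OCR before any table detection can succeed.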
How does the tool tell where one table ends and another begins?
After grouping text items into rows by their vertical (y) coordinate, the detector looks for vertical gaps larger than 1.5× the typical row height. A gap that size signals a paragraph break or white space separating two distinct tables. Each continuous block of aligned rows becomes its own worksheet in the downloaded .xlsx file, labelled Table 1, Table 2, and so on. If your PDF has a large header before the data rows, it will usually appear as a separate single-row "table" — you can delete that sheet in Excel.
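The gap rule described above can be sketched as follows. This is an illustration under stated assumptions: the Row shape and function names are hypothetical, "typical row height" is taken to be the median row height, and the gap is measured between the y positions of consecutive rows.

```typescript
// Hypothetical row summary: vertical position, text height, and cell values.
interface Row {
  y: number;
  height: number;
  cells: string[];
}

function median(values: number[]): number {
  const sorted = values.slice().sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  const upper = sorted[mid] ?? 0;
  const lower = sorted[mid - 1] ?? upper;
  return sorted.length % 2 === 1 ? upper : (lower + upper) / 2;
}

// Split rows (ordered top to bottom) wherever the vertical gap to the
// previous row exceeds gapFactor times the typical row height.
function splitTables(rows: Row[], gapFactor = 1.5): Row[][] {
  const threshold = gapFactor * median(rows.map((r) => r.height));
  const tables: Row[][] = [];
  let current: Row[] = [];
  let previous: Row | undefined;
  for (const row of rows) {
    // PDF y decreases down the page, so previous.y - row.y is the gap.
    if (previous !== undefined && previous.y - row.y > threshold) {
      tables.push(current);
      current = [];
    }
    current.push(row);
    previous = row;
  }
  if (current.length > 0) tables.push(current);
  return tables;
}

// Worksheet labels in the order the tables were found.
function sheetNames(tables: Row[][]): string[] {
  return tables.map((_, i) => `Table ${i + 1}`);
}
```

With rows of height 10 spaced 12 units apart, the threshold is 15, so ordinary line spacing keeps rows together while a 56-unit gap starts a new table.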