- Is my file uploaded to any server?
- No, never. The entire process runs inside your browser: Tesseract.js (compiled to WebAssembly) handles OCR, pdf.js renders PDFs, and SheetJS builds the workbook. Your image or PDF is processed locally and is never transmitted anywhere. This also means the tool works offline once the OCR engine has been downloaded the first time.
- What types of images does the image to CSV converter support?
- PNG, JPEG/JPG, TIFF, BMP, and WebP are all accepted directly. Scanned PDFs are also supported: the first page is rendered to a canvas using pdf.js before OCR runs on it. Digitally created (native) PDFs with selectable text do not need OCR; you can copy the text and paste it straight into a spreadsheet.
- How accurate is the OCR?
- Accuracy depends heavily on image quality. Clean, high-resolution scans (200 DPI or higher) with good contrast typically achieve 95–99% character accuracy. Low-resolution, blurry, or skewed scans produce more errors. After downloading, always spot-check numbers and dates in your spreadsheet. For critical financial or legal data, treat the CSV as a draft and review it against the original document.
- How does the table structure get detected?
- The OCR engine returns each word with its bounding-box position on the page. This tool groups words that share the same horizontal band into rows, then clusters them into columns by analysing the horizontal gap distribution across all rows. Tables with visible grid lines, ruled columns, or consistent spacing work best. Irregular or free-form layouts may need manual clean-up in your spreadsheet editor.
- What is the difference between CSV and Excel (.xlsx) output?
- CSV (comma-separated values) is a plain-text format readable by any spreadsheet app, text editor, database, or programming language. It has no formatting, no multiple sheets, and no formulas. Excel .xlsx is a zipped, XML-based workbook format (Office Open XML) that supports column widths, bold headers, number formatting, and multiple sheets, and it opens directly in Microsoft Excel, Google Sheets, Apple Numbers, and LibreOffice Calc. For data pipelines, CSV is usually easier; for human review, .xlsx is more convenient.
- Can I convert a multi-page PDF?
- Currently only the first page of a PDF is processed. For multi-page documents, use our Bulk OCR PDF tool, or split your PDF into individual pages first using the PDF Page Picker. Multi-page support is on the roadmap.
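
The row and column grouping described under "How does the table structure get detected?" can be illustrated with a minimal sketch. It is not this tool's actual code. The `Word` shape, the function names (`groupIntoRows`, `findColumns`, `buildGrid`), and the `tolerance` and `minGap` parameters are all hypothetical. For the column step it uses simple interval merging (any horizontal gap of at least `minGap` pixels that no word crosses separates two columns) as a stand-in for a fuller analysis of the gap distribution.

```typescript
// Hypothetical word shape: recognised text plus its bounding box in pixels.
interface Word {
  text: string;
  x0: number; // left edge
  x1: number; // right edge
  y0: number; // top edge
  y1: number; // bottom edge
}

// Group words into rows: a word joins the current row when its vertical
// centre is within `tolerance` word-heights of the row's running centre.
function groupIntoRows(words: Word[], tolerance = 0.5): Word[][] {
  const sorted = [...words].sort((a, b) => (a.y0 + a.y1) - (b.y0 + b.y1));
  const rows: Word[][] = [];
  let rowCentre = 0;
  for (const w of sorted) {
    const centre = (w.y0 + w.y1) / 2;
    const height = w.y1 - w.y0;
    const current = rows[rows.length - 1];
    if (current && Math.abs(centre - rowCentre) <= tolerance * height) {
      current.push(w);
      rowCentre += (centre - rowCentre) / current.length; // running mean
    } else {
      rows.push([w]);
      rowCentre = centre;
    }
  }
  for (const row of rows) row.sort((a, b) => a.x0 - b.x0); // left to right
  return rows;
}

// Find column extents by merging the horizontal spans of all words.
// Spans closer together than `minGap` are treated as the same column.
function findColumns(words: Word[], minGap: number): Array<[number, number]> {
  const spans = words
    .map((w): [number, number] => [w.x0, w.x1])
    .sort((a, b) => a[0] - b[0]);
  const columns: Array<[number, number]> = [];
  for (const [start, end] of spans) {
    const last = columns[columns.length - 1];
    if (last && start - last[1] < minGap) {
      last[1] = Math.max(last[1], end); // extend the current column
    } else {
      columns.push([start, end]); // wide gap: start a new column
    }
  }
  return columns;
}

// Place each word into the cell whose column contains the word's centre.
function buildGrid(words: Word[], minGap = 20): string[][] {
  const columns = findColumns(words, minGap);
  return groupIntoRows(words).map((row) => {
    const cells: string[] = columns.map(() => "");
    for (const w of row) {
      const centre = (w.x0 + w.x1) / 2;
      const col = columns.findIndex(([s, e]) => centre >= s && centre <= e);
      if (col >= 0) cells[col] = cells[col] ? `${cells[col]} ${w.text}` : w.text;
    }
    return cells;
  });
}

// Made-up sample: a two-column table with a two-word header cell.
const sampleWords: Word[] = [
  { text: "Item", x0: 10, x1: 50, y0: 10, y1: 30 },
  { text: "Unit", x0: 200, x1: 240, y0: 10, y1: 30 },
  { text: "Price", x0: 245, x1: 290, y0: 10, y1: 30 },
  { text: "Widget", x0: 10, x1: 70, y0: 40, y1: 60 },
  { text: "4.50", x0: 200, x1: 240, y0: 41, y1: 61 },
];
const grid = buildGrid(sampleWords);
console.log(JSON.stringify(grid));
// → [["Item","Unit Price"],["Widget","4.50"]]
```

In this sketch the 5-pixel gap between "Unit" and "Price" falls below `minGap`, so the two words share a cell, while the 130-pixel gap before them starts a new column. A fixed threshold like this is fragile on tight or irregular layouts, which is consistent with the note above that free-form tables may need manual clean-up.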