Question 1

Is this safe? Does it upload my PDF?

Accepted Answer

No upload whatsoever. The entire extraction runs in your browser using JavaScript and the open-source pdf.js library. Your PDF never leaves your device, is never sent to a server, and is never logged. This makes it safe for sensitive files like contracts, medical reports, or financial statements.
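For readers curious what in-browser extraction looks like, here is a minimal sketch, not the tool's actual source. The interfaces are hypothetical stand-ins modeled on the parts of pdf.js the sketch relies on (`numPages`, `getPage`, `getTextContent`, and text items carrying a `str` field). The function name `extractPages` is also an assumption.

```typescript
// Minimal structural types, modeled on the parts of pdf.js this sketch uses.
interface TextContentLike {
  items: Array<{ str?: string }>; // marked-content items carry no `str`
}
interface PageLike {
  getTextContent(): Promise<TextContentLike>;
}
interface DocumentLike {
  numPages: number;
  getPage(pageNumber: number): Promise<PageLike>; // pages are 1-indexed
}

// Collect the text of every page, one string per page, entirely in memory.
// Nothing here performs a network request.
async function extractPages(doc: DocumentLike): Promise<string[]> {
  const pages: string[] = [];
  for (let n = 1; n <= doc.numPages; n++) {
    const page = await doc.getPage(n);
    const content = await page.getTextContent();
    const strings = content.items
      .map((item) => item.str || "")
      .filter((s) => s.length > 0);
    pages.push(strings.join(" "));
  }
  return pages;
}
```

In a browser, a real document object would typically come from `pdfjsLib.getDocument({ data: await file.arrayBuffer() }).promise`, where `file` is the `File` the user picked. The bytes are read from local memory, which is why no server is involved.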

Question 2

What is the maximum file size?

Accepted Answer

PDFs up to 100 MB are accepted. Files over 30 MB may be slower on mobile. For very large PDFs, consider splitting the file first using the Split PDF tool, then extracting text from each part.
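The two thresholds above can be expressed as a small check. This is an illustrative sketch only: the function name, the verdict labels, and the choice of binary megabytes (1 MB = 1024 × 1024 bytes) are assumptions, not details confirmed by the tool.

```typescript
const MB = 1024 * 1024; // assumption: binary megabytes
const MAX_BYTES = 100 * MB; // files above this are rejected
const SLOW_MOBILE_BYTES = 30 * MB; // files above this may be slow on mobile

type SizeVerdict = "ok" | "slow-on-mobile" | "too-large";

// Classify a file by its size in bytes (for example, `file.size` in a browser).
function checkFileSize(bytes: number): SizeVerdict {
  if (bytes > MAX_BYTES) return "too-large";
  if (bytes > SLOW_MOBILE_BYTES) return "slow-on-mobile";
  return "ok";
}
```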

Question 3

Does it work offline?

Accepted Answer

Yes, once the page has loaded. The PDF engine is cached in your browser, and extraction runs locally even without an internet connection.

Question 4

Will this work on iPhone / iPad?

Accepted Answer

Yes, on modern iOS Safari. iOS limits per-tab memory, so very large PDFs may be slow. For best results on mobile, use the page range option to extract just the pages you need.
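As an illustration of how a page range can limit the work done, here is a sketch that turns a range string into page numbers. The `"1-3, 7"` syntax and the function name `parsePageRange` are assumptions for this example. The tool's actual page range format may differ.

```typescript
// Parse a hypothetical range string such as "1-3, 7" into sorted, unique,
// 1-indexed page numbers, clamped to the document's page count.
function parsePageRange(spec: string, numPages: number): number[] {
  const seen: { [page: number]: boolean } = {};
  for (const part of spec.split(",")) {
    const match = /^\s*(\d+)\s*(?:-\s*(\d+))?\s*$/.exec(part);
    if (!match) continue; // skip anything that is not "N" or "N-M"
    const start = parseInt(match[1], 10);
    const end = match[2] !== undefined ? parseInt(match[2], 10) : start;
    for (let p = Math.max(1, start); p <= Math.min(end, numPages); p++) {
      seen[p] = true;
    }
  }
  return Object.keys(seen)
    .map(Number)
    .sort((a, b) => a - b);
}
```

Only the listed pages would then need to be loaded and read, which keeps memory use lower on devices with tight per-tab limits.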

Question 5

Why is the extracted text empty?

Accepted Answer

Your PDF is probably a scan: each page is a photo, not real text. Text-based PDFs store the characters themselves, which we can read; scans store only an image of each page. Extracting text from scans needs OCR, which we don't yet offer client-side.
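One way to recognize this situation in code is to check how much text came back. The sketch below is a heuristic under stated assumptions: `pageTexts` holds one extracted string per page, and the threshold of 10 non-whitespace characters per page is an arbitrary illustrative value, not one the tool is known to use.

```typescript
// Guess whether a PDF is a scan from how little text was extracted.
// `pageTexts` is assumed to hold one extracted string per page.
function looksScanned(pageTexts: string[], minCharsPerPage = 10): boolean {
  if (pageTexts.length === 0) return false; // nothing to judge
  let totalChars = 0;
  for (const text of pageTexts) {
    totalChars += text.replace(/\s+/g, "").length; // ignore whitespace
  }
  return totalChars / pageTexts.length < minCharsPerPage;
}
```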

Question 6

Why does the text look jumbled on two-column papers?

Accepted Answer

pdf.js returns text in the order it is stored inside the PDF, which is not always the order a person would read it. On multi-column layouts, that stored order often interleaves the columns. Single-column layouts (most office docs, books, contracts) extract cleanly.
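To make the interleaving concrete, here is a naive post-processing sketch that reorders positioned text for a two-column page. It is not something the tool does. The `PositionedText` shape is a simplification assumed for this example (in pdf.js, a text item's position is available as `transform[4]` and `transform[5]`), and splitting at the page's horizontal midpoint is a crude assumption that breaks on full-width headings, figures, and uneven columns.

```typescript
interface PositionedText {
  str: string;
  x: number; // horizontal position; in pdf.js, transform[4]
  y: number; // vertical position; in pdf.js, transform[5] (PDF y grows upward)
}

// Read the left column top to bottom, then the right column top to bottom.
function reorderTwoColumns(items: PositionedText[], pageWidth: number): string[] {
  const mid = pageWidth / 2;
  const left = items.filter((it) => it.x < mid);
  const right = items.filter((it) => it.x >= mid);
  const topToBottom = (a: PositionedText, b: PositionedText) => b.y - a.y;
  return left
    .sort(topToBottom)
    .concat(right.sort(topToBottom))
    .map((it) => it.str);
}
```

Given items stored as left line 1, right line 1, left line 2, right line 2, the function returns both left lines first and then both right lines.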

Question 7

What's the difference between .txt and .md output?

Accepted Answer

The content is identical. The .md extension just means apps that recognize Markdown (Obsidian, VS Code, GitHub) will treat the file like a Markdown document. Pick whatever your downstream tool prefers.

Question 8

Can I extract tables as proper rows and columns?

Accepted Answer

Not reliably. PDF doesn't store tables as structured data, only as text positioned at coordinates. The output will contain all the table cells in roughly reading order, but you'll need to reformat them manually.
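If you have access to the text positions, a rough heuristic for rebuilding rows is to group text that shares nearly the same vertical position. The sketch below illustrates the idea. The `Cell` shape, the function name, and the 2-unit tolerance are assumptions for this example, and the approach fails on wrapped cell text, merged cells, and rotated pages.

```typescript
interface Cell {
  str: string;
  x: number; // horizontal position on the page
  y: number; // vertical position on the page (PDF y grows upward)
}

// Group cells whose y values fall within `tolerance` of the row's first cell,
// then order each row left to right.
function groupIntoRows(items: Cell[], tolerance = 2): string[][] {
  const byY = items.slice().sort((a, b) => b.y - a.y); // top of page first
  const rows: Cell[][] = [];
  for (const item of byY) {
    const current = rows[rows.length - 1];
    if (current && Math.abs(current[0].y - item.y) <= tolerance) {
      current.push(item);
    } else {
      rows.push([item]);
    }
  }
  return rows.map((row) => row.sort((a, b) => a.x - b.x).map((c) => c.str));
}
```

Splitting each row into consistent columns is harder, because nothing in the PDF says where one column ends and the next begins.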
