Extract text from scanned or image-based PDFs using OCR. Supports 100+ languages. Free & private!
Files never leave your device! OCR runs in your browser.
3 simple steps
Upload a scanned PDF or image.
Tesseract.js recognizes text from images.
Download the text or a searchable PDF, or copy the text to your clipboard.
OCR (Optical Character Recognition) converts images of text into actual editable text. It "reads" the text from scanned documents, photos, or screenshots.
100% private. Tesseract.js runs entirely in your browser, so your files are never sent to any server. The only download is the language model, which is fetched once and then cached.
Over 100 languages, including English, Hindi, Gujarati, Arabic, Chinese, Japanese, Korean, and all major European languages.
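Tesseract identifies each language by a short code and accepts several at once when the codes are joined with "+". The sketch below shows how a language picker might build that string. The helper name buildLanguageString and the lookup table are hypothetical and not taken from this tool. The codes themselves are the standard Tesseract trained-data names, and "Chinese" is assumed here to mean Simplified Chinese.

```typescript
// Standard Tesseract language codes for the languages named above.
// Assumption: "Chinese" is mapped to Simplified Chinese (chi_sim).
const LANGUAGE_CODES: Record<string, string> = {
  English: "eng",
  Hindi: "hin",
  Gujarati: "guj",
  Arabic: "ara",
  Chinese: "chi_sim",
  Japanese: "jpn",
  Korean: "kor",
};

// Hypothetical helper: turns display names into a "+"-joined code string,
// the form Tesseract accepts for multi-language recognition.
function buildLanguageString(names: string[]): string {
  const codes = names.map((name) => {
    if (!Object.prototype.hasOwnProperty.call(LANGUAGE_CODES, name)) {
      throw new Error(`Unknown language: ${name}`);
    }
    return LANGUAGE_CODES[name];
  });
  // Drop duplicates while keeping the order the user chose.
  const unique = codes.filter((code, index) => codes.indexOf(code) === index);
  return unique.join("+");
}
```

For a document that mixes English and Hindi, buildLanguageString(["English", "Hindi"]) gives "eng+hin". Each extra language means another model to download and cache, so picking only the languages actually present keeps the first run faster.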
Accuracy depends on image quality. Clear scans at 300+ DPI typically achieve 90-99% accuracy. For best results, use the 3× render scale, which turns each PDF page into a larger image before recognition.
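To see what a render scale means in pixels, here is a rough calculation. It assumes the page is rasterized the way PDF.js-style renderers do it, where scale 1 draws one pixel per PDF point (1/72 inch). That is an assumption about how this tool renders pages, and the function name renderedSize is hypothetical.

```typescript
// PDF page sizes are measured in points; one point is 1/72 inch.
const POINTS_PER_INCH = 72;

interface RenderedSize {
  widthPx: number;
  heightPx: number;
  pixelsPerInch: number;
}

// Hypothetical helper: size of the image produced when a page is rendered
// at a given scale, assuming scale 1 means one pixel per point.
function renderedSize(widthPt: number, heightPt: number, scale: number): RenderedSize {
  if (scale <= 0) {
    throw new Error("scale must be positive");
  }
  return {
    widthPx: Math.round(widthPt * scale),
    heightPx: Math.round(heightPt * scale),
    pixelsPerInch: POINTS_PER_INCH * scale,
  };
}

// US Letter is 612 x 792 points (8.5 x 11 inches).
const letterAt3x = renderedSize(612, 792, 3);
console.log(letterAt3x); // widthPx 1836, heightPx 2376, pixelsPerInch 216
```

Under that assumption, 3× gives a 1836 × 2376 pixel image at 216 pixels per inch for a US Letter page. A higher scale hands the recognizer more pixels per character, but it cannot add detail that the original scan did not capture, and it costs more memory and time per page.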