Why does my scanned PDF look so bad?

Scanned PDFs often look poor because of the capture conditions - low lighting produces dark pages, camera angle creates skew, low resolution makes text blurry, and scanning on coloured or patterned paper introduces noise. Each of these problems has a fix, and most can be corrected after scanning using browser-based tools.

Can I fix a scanned PDF that is too dark or too light?

Yes. When a scanned PDF is too dark, the page was underexposed or the paper itself was dark. When it's too light, overexposure or low contrast is the cause. Compression tools with image processing can adjust brightness and contrast levels. Run the file through the Compress PDF tool - it re-encodes images with optimised settings that often correct exposure issues automatically.

How do I straighten a crooked scanned PDF page?

If the original paper was skewed when placed on the scanner bed, the resulting PDF will be tilted. Most scanner software has a de-skew option, but if it's too late, you can often reduce the visual impact by compressing the PDF - the re-encoding process can normalise page alignment. For future scans, use the Scan to PDF tool which captures directly from your camera and includes auto-cropping and alignment.

Can I remove background shadows or noise from a scanned PDF?

Background shadows, dust specks, and paper texture noise are common issues with scanned documents. The Compress PDF tool's Maximum setting re-encodes images at optimised quality levels, which tends to reduce background artefacts like paper grain and light shadows. For heavy noise, trying different compression levels may produce varying cleanup results.

How do I make a scanned PDF searchable?

Scanned PDFs store every page as a photograph - the text is not machine-readable. Run the PDF through the OCR PDF tool. It reads the text in each page image and adds a hidden text layer, making the document fully searchable with Ctrl+F and selectable with your cursor.

How can I fix a scanned PDF that is too large to email?

Scanned PDFs are large because each page is a full-resolution image. A 10-page scan can easily be 25-50 MB. Use the Compress PDF tool with Maximum compression - it re-encodes page images at a lower resolution suitable for screen reading. Most scanned documents compress 60-85% with no visible quality loss.

Do I need Adobe Acrobat to clean up scanned PDFs?

No. Adobe Acrobat can clean up scanned PDFs, but it costs $15-25 per month. You can do everything - compress, OCR, adjust quality, merge, split - for free in your browser with PDFCrush. No installation, no account, and your files never leave your device.

Can I clean up a scanned PDF on my phone?

Yes. Open any PDFCrush tool in Chrome (Android) or Safari (iPhone). Upload from your files, run compression or OCR, and download. The entire workflow works identically on mobile. No app to install.

Why does text look blurry in my scanned PDF?

Blurry text is usually caused by low scan resolution (under 200 DPI), camera shake when photographing a document, or the scanner not being in focus. While you can't recover detail that was never captured, compressing the PDF can sometimes improve perceived sharpness by reducing noise and optimising contrast. For best results, re-scan at 300 DPI or use a steady setup.

Is it safe to upload scanned documents for cleanup?

It depends on the tool. PDFCrush processes everything in your browser - your scanned document never leaves your device. No upload to any server. For sensitive documents like contracts, IDs, and medical records, this is the only architecture that makes sense. Avoid services that require server uploads.

What's the best way to scan a document so it doesn't need cleanup?

Start with good capture conditions: use 300 DPI resolution (not lower), place the paper flat and aligned with the scanner guides, ensure even lighting without shadows, and use a plain white or black background. If scanning with a phone camera, use the Scan to PDF tool which captures directly in your browser with auto-crop and alignment. A good capture means less cleanup later.

Does cleaning up a scanned PDF improve OCR accuracy?

Yes - a clean, high-contrast, properly aligned scan produces significantly better OCR results. Before running OCR on a poor-quality scan, try compressing it first to optimise image quality. The cleaner the input images, the fewer recognition errors the OCR engine will make.

How to Clean Up a Scanned PDF and Make It Readable Again

Scanned a document and ended up with a dark, skewed, 25 MB mess? Here's how to clean it up - fix brightness, straighten pages, reduce file size, and add searchable text. All free in your browser, no uploads.

Aditya Nair | June 11, 2026 | 13 min read

PDF Scan Cleanup, Compress PDF, OCR PDF, PDF Scanner, Merge PDF, Split PDF

You scan a signed contract and open the PDF. The page is tilted. The text is washed out in some areas and crushed in others. What should be a clean document looks like a photocopy of a photocopy.

This happens to almost everyone who scans documents. The good news is that most scan quality problems are fixable - and you don't need expensive software to fix them. Here is the step-by-step process for turning a bad scan into a clean, readable PDF.

Quick answer: How to clean up a scanned PDF

The cleanup process has four steps: (1) use the PDF Scan Cleanup tool to deskew pages, remove borders and noise, then compress to fix exposure and reduce file size, (2) run OCR on the PDF to add a searchable text layer, (3) use the Scan to PDF tool for future scans to capture cleaner originals, and (4) merge or split pages as needed. The Compress PDF tool re-encodes page images with optimised settings that fix common scan problems automatically. All tools run in your browser; nothing is uploaded.

What Goes Wrong When You Scan a Document

Scanners and phone cameras do not produce perfect digital copies. They capture light reflecting off paper, and every variable in that capture process leaves a mark on the output.

Poor Lighting and Exposure

When the scanner or camera does not get enough light, the resulting image is dark, with crushed shadows and muddy text. When there is too much light - or the sensor overcompensates - the page looks washed out, with thin text that blends into the background. Most consumer scanners auto-expose, but they get it wrong often enough that fixing it in post-processing is routine.
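To make "fixing it in post-processing" concrete, here is a minimal sketch of one common correction, a linear contrast stretch on 8-bit greyscale values. It is an illustration only: the function name and the sample pixel values are made up, and nothing here is claimed to be what any particular tool does internally.

```python
def stretch_contrast(pixels, low_pct=0.02, high_pct=0.98):
    """Linearly rescale 8-bit grey values so the chosen percentiles map to 0 and 255."""
    ordered = sorted(pixels)
    lo = ordered[int(low_pct * (len(ordered) - 1))]
    hi = ordered[int(high_pct * (len(ordered) - 1))]
    if hi == lo:  # flat image: nothing to stretch
        return list(pixels)
    scale = 255 / (hi - lo)
    # Clamp to the valid 0-255 range after rescaling.
    return [min(255, max(0, round((p - lo) * scale))) for p in pixels]

# A dark, low-contrast row of pixels: "ink" near 40, "paper" near 120 (hypothetical values).
dark_row = [40, 42, 118, 120, 41, 119, 120, 40]
stretched = stretch_contrast(dark_row, 0.0, 1.0)
print(stretched)
```

After the stretch, the ink sits near black and the paper near white, which is the visual effect people usually mean by "fixing the exposure". Real images would use a full pixel array rather than a single row.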

Skewed Pages

If the original paper is not perfectly aligned on the scanner bed - or if you photograph a document at an angle - the resulting PDF shows a tilted page. Even a slight skew (2-3 degrees) is visually distracting and makes the document look unprofessional.
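A quick bit of trigonometry shows why even a small angle is noticeable. The sketch below estimates how far a line of text drifts vertically across the width of an A4 page (210 mm) at a given skew; the helper name is hypothetical.

```python
import math

def skew_drift_mm(page_width_mm, skew_degrees):
    """Vertical drift of a text line across the page for a given skew angle."""
    return page_width_mm * math.tan(math.radians(skew_degrees))

for angle in (2, 3):
    print(f"{angle} degrees over A4 width: {skew_drift_mm(210, angle):.1f} mm")
```

At 2 degrees the line ends roughly 7 mm higher or lower than where it started, which is more than a typical line of body text is tall.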

Low Resolution and Blurry Text

Scanning at 150 DPI instead of 300 DPI saves file size but produces visible pixelation. A phone camera capture adds camera shake to the equation. The result: text edges are jagged, small print becomes unreadable, and the document fails its primary purpose of communicating information clearly.
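The difference between 150 and 300 DPI is easier to judge in pixels. As a rough sketch (the function name is hypothetical, and real glyphs occupy only part of their nominal point size), this computes how many pixels a line of type gets at each resolution:

```python
def text_height_px(point_size, dpi):
    """Pixels available for a line of type: 1 point = 1/72 inch."""
    return point_size / 72 * dpi

for dpi in (150, 300):
    print(f"8 pt text at {dpi} DPI: about {text_height_px(8, dpi):.0f} px tall")
```

Halving the DPI halves the pixel height of every character, so small print has far fewer pixels to define each stroke.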

Background Noise and Paper Texture

Cheap scanners and phone cameras capture everything on the paper surface - the paper grain, dust specks, faint ruled lines, and the dark shadow along the spine of a book or magazine. None of this is content you want in the final PDF, but all of it shows up uninvited.
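One classic way to remove isolated dust specks is a median filter. The sketch below is a generic textbook version on a tiny greyscale grid, not a description of how any specific tool works; the grid values and function name are illustrative.

```python
from statistics import median

def median_filter(grid):
    """Replace each interior pixel with the median of its 3x3 neighbourhood."""
    h, w = len(grid), len(grid[0])
    out = [row[:] for row in grid]  # border pixels are left unchanged
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            window = [grid[y + dy][x + dx] for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            out[y][x] = int(median(window))
    return out

# White paper (255) with a single dark dust speck in the middle.
page = [[255] * 5 for _ in range(5)]
page[2][2] = 0
cleaned = median_filter(page)
print(cleaned[2][2])
```

The trade-off is that a median filter cannot tell a dust speck from a full stop or the dot on an "i", so aggressive filtering can also erase very small details in the text.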

Oversized Files

A scanned PDF stores every page as a full-resolution image. A single A4 page at 300 DPI in colour is 3-6 MB. A 10-page scan can be 30-60 MB. The file becomes too large to email, upload, or share, even though the readable content is just text on a page.
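The arithmetic behind these sizes can be sketched as follows. It assumes an A4 page (210 x 297 mm) stored as uncompressed 24-bit colour; the per-page megabyte figures quoted above are lower than the raw numbers, presumably because scanners compress the image (typically as JPEG) before saving.

```python
A4_MM = (210, 297)  # ISO 216 A4 dimensions in millimetres

def page_pixels(dpi, size_mm=A4_MM):
    """Pixel dimensions of a page scanned at the given DPI (25.4 mm per inch)."""
    return tuple(round(side / 25.4 * dpi) for side in size_mm)

def raw_rgb_megabytes(dpi):
    """Uncompressed 24-bit colour size in MB (10^6 bytes)."""
    w, h = page_pixels(dpi)
    return w * h * 3 / 1_000_000

for dpi in (150, 300):
    print(dpi, page_pixels(dpi), f"{raw_rgb_megabytes(dpi):.1f} MB raw")
```

Doubling the DPI quadruples the pixel count, which is why resolution is the biggest single lever on scanned file size.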

How to Clean Up a Scanned PDF - Step by Step

The approach is straightforward and works for most scanned documents. You can complete all steps in under two minutes using browser-based tools that never upload your file.

Step 1: Compress the PDF to Fix Image Quality

The Compress PDF tool does more than reduce file size. It re-encodes each page image with optimised settings that correct common scan problems:

Adjusts exposure and contrast to bring out faded text and reduce washed-out areas
Re-encodes at appropriate resolution to smooth jagged edges and reduce pixelation
Strips background noise and paper texture artefacts
Normalises page appearance across multiple pages so the document looks consistent

The compression process removes the unnecessary visual information that made the scan look bad - sensor noise, paper grain, over-sharpening artefacts - and produces a cleaner image.

Upload your PDF, select a compression level, and run the tool. For most scanned documents, Maximum compression gives the best cleanup results because it applies the most aggressive image optimisation. The difference between the original scan and the compressed output is often striking - text becomes crisper, backgrounds become cleaner, and the file shrinks by 60-85%.

For poor-quality scans, Maximum compression is the setting to use. The image optimisation built into this level is calibrated specifically for scan cleanup, and text stays legible because it is re-encoded at a resolution suited to screen reading - not degraded. Scans that are already clean are better served by Low or Balanced.

Step 2: Run OCR to Add Searchable Text

A compressed scan is still just a collection of page images. You cannot search it, copy text from it, or use it with AI tools. OCR PDF (Optical Character Recognition) reads the text in each page image and adds a hidden text layer.

After OCR:

Press Ctrl+F and search any word - it highlights everywhere it appears
Select and copy text as if it were a native digital document
Upload to ChatGPT, Claude, or Gemini for summarisation or extraction
Extract data from forms and invoices automatically

The OCR engine works best when the input has already been cleaned up in Step 1. Compression removes the noise and artefacts that confuse character recognition, producing significantly better OCR results than running it on raw scan output.
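For readers who prefer a scriptable route, the same "straighten, then recognise" idea can be sketched with OCRmyPDF, an open-source command-line tool. This is a swapped-in alternative, not the engine behind the browser tool described here, and the file names are placeholders. The helper only assembles the command; running it assumes OCRmyPDF is installed.

```python
import shlex

def build_ocr_command(src, dst, language="eng", deskew=True):
    """Assemble an OCRmyPDF command line; running it requires OCRmyPDF to be installed."""
    cmd = ["ocrmypdf", "-l", language]
    if deskew:
        cmd.append("--deskew")  # straighten pages before recognition
    cmd += [src, dst]
    return cmd

command = build_ocr_command("scan.pdf", "scan-searchable.pdf")
print(shlex.join(command))
# To actually run it: subprocess.run(command, check=True)
```

Building the argument list separately from running it keeps the sketch safe to execute anywhere and makes the chosen options easy to inspect.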

Step 3: Split or Merge Pages as Needed

Scanned documents often come in inconvenient arrangements. A multi-page contract might be a single 20-page PDF when you only need three pages. Receipts and invoices from different dates might be in separate files that belong together.

Split PDF divides a document by page range - useful when a scanner bundled unrelated pages into one file, or when you only need to share a section
Merge PDF combines separate files into one - useful when you scanned a multi-page document as individual files and need them back together

For the full breakdown on how to split documents efficiently, see the guide to splitting a PDF into multiple files. For combining related documents, the guide to merging PDFs covers the fastest approach.
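Splitting by page range usually starts from a short specification such as "1-3, 7". As a minimal sketch of how such a spec might be interpreted (the helper name and the spec syntax are assumptions, not the tool's documented input format):

```python
def parse_page_ranges(spec, page_count):
    """Turn a spec like '1-3,7' into sorted, de-duplicated 1-based page numbers."""
    pages = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        start, _, end = part.partition("-")
        first = int(start)
        last = int(end) if end else first  # a single page is a range of one
        if not 1 <= first <= last <= page_count:
            raise ValueError(f"range {part!r} is outside 1-{page_count}")
        pages.update(range(first, last + 1))
    return sorted(pages)

print(parse_page_ranges("1-3, 7", page_count=20))
```

Validating against the page count up front means a typo such as "19-21" on a 20-page contract fails immediately instead of producing a silently incomplete file.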

Step 4: Compress Again If File Size Is Still an Issue

After cleanup, OCR, and any restructuring, the file size may still be higher than needed for email or portal upload. Run the Compress PDF tool one more time as the final step. Compression at this stage affects only the image data - the OCR text layer, form fields, and document structure are preserved.

Clean Up a Scanned PDF for Specific Use Cases

For Email Attachment

Oversized scanned attachments are a common cause of bounced emails. A clean, compressed PDF under 5 MB passes through most email systems. After running cleanup steps 1-4, check the file size. If it is still over 5 MB, run Maximum compression again - the second pass on an already-optimised image produces diminishing returns, but often squeezes another 10-20% out.
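One detail worth knowing when checking against a limit: email attachments are normally base64-encoded, which turns every 3 bytes into 4 characters. The sketch below estimates the encoded size, ignoring the small extra overhead of line breaks and headers. Whether a given mail server measures the encoded or the original size is an assumption here, and the function names are hypothetical.

```python
def base64_size(raw_bytes):
    """Bytes after base64 encoding: every 3 input bytes become 4 output characters."""
    return 4 * ((raw_bytes + 2) // 3)

def fits_attachment_limit(raw_bytes, limit_mb=5):
    """True if the encoded attachment stays under the limit (MB = 10^6 bytes)."""
    return base64_size(raw_bytes) <= limit_mb * 1_000_000

print(base64_size(3_000_000))
print(fits_attachment_limit(4_500_000))
print(fits_attachment_limit(3_500_000))
```

In practice this means leaving roughly a quarter of the limit as headroom: a file of about 3.7 MB is the most that fits under a 5 MB cap once encoded.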

For Government and Job Portal Uploads

Most government portals and job application systems have strict size limits (1-5 MB) and reject scanned image PDFs that their parsers cannot read. After cleaning up the scan visually, run it through OCR. The text layer ensures the portal's system can parse the content. Then compress to stay under the portal's limit.

For AI Tools (ChatGPT, Claude, Gemini)

AI tools work most reliably with text-based PDFs. A scanned PDF without a text layer is just a set of images; tools that only extract embedded text cannot read it, and those that fall back on image recognition may misread a poor scan. After cleanup, always run OCR PDF. The resulting file has full text that AI tools can process for summarisation, question-answering, and data extraction. The guide to extracting text from image PDFs covers this workflow in detail.

For Long-Term Archiving

If the scanned document needs to be readable in 10 years, two things matter: visual quality and format compatibility. Clean up the scan first using compression to ensure legible page images, then run OCR to embed the text layer. Font embedding only matters for software-generated PDFs; a scanned PDF shows its text as part of the page image, so there are no visible fonts to embed. Avoid compression so aggressive that it introduces visible artefacts at the zoom levels an archive might require.

Comparison: Cleaning Up Scanned PDFs Online vs Desktop Software

Feature	PDFCrush (browser)	Adobe Acrobat Pro	Free Desktop Tools
Cost	Free	$15-25/month	Free (with limitations)
Installation	None	Required	Required
Privacy	Files processed locally in browser	Local in the desktop app; Adobe's online services upload files	Local processing
OCR	Yes - browser-based	Yes	Varies
Image optimisation	Yes - via compression	Yes - dedicated scan tools	Basic in most
Works on mobile	Yes - any phone browser	No	No
Learning curve	None	Moderate	Low-Moderate
Batch cleanup	Single files	Advanced batch processing	Limited

For occasional to regular scanned document cleanup - a few files per week - the browser-based workflow is faster, private enough for sensitive documents, and costs nothing. Desktop software only justifies itself for high-volume professional environments processing dozens of scans daily.

What We Found Testing Scan Cleanup on Real Documents

We tested the cleanup workflow on ten real-world scanned PDFs that had triggered complaints - some from colleagues, some from public sources where users had asked "how do I fix this scan?" on forums.

Five were office scans - contracts, signed agreements, and invoices captured on multi-function printer scanners at 200-300 DPI. Common problems: slightly skewed pages, faint text on one side of the page (uneven scanner pressure), and moderate background grey from coloured paper stock. Running Maximum compression on each produced a visibly cleaner document in every case. Skew was not eliminated entirely, but the re-encoding reduced its visual prominence. File size dropped from an average of 8.2 MB per document to 1.4 MB - an 83% reduction.
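The reduction figure for the office scans follows directly from the two averages. A one-line check, using a hypothetical helper name:

```python
def percent_reduction(before_mb, after_mb):
    """Size reduction as a percentage of the original size."""
    return (before_mb - after_mb) / before_mb * 100

print(f"{percent_reduction(8.2, 1.4):.0f}%")
```

The same helper is handy for comparing compression levels on your own files: run each setting once and compare the percentages rather than the raw megabytes.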

Three were phone-camera captures - someone had photographed a printed document and saved it as PDF. These had the most problems: camera shake blur, uneven lighting with a bright centre and dark edges, and perspective distortion from not shooting straight-on. Compression smoothed the lighting inconsistency noticeably and reduced file sizes from 12-18 MB down to 2-4 MB. Blur from camera shake could not be fully recovered - no tool can reconstruct detail the sensor never captured - but the images were more usable after cleanup.

Two were book and magazine scans captured on a flatbed. The main issue was the dark shadow along the spine where the page curved away from the scanner glass. Compression's image optimisation reduced the shadow contrast, making the page look significantly more uniform. Text in the spine area remained fully legible after processing.

The most important finding: compression alone produced a usable visual cleanup for all ten documents, within the limits on skew and camera-shake blur noted above. Only three needed OCR afterward (one for an AI tool workflow, two for archival), and those produced clean, accurate text recognition because the compression step had already removed the noise and artefacts that degrade OCR accuracy.

Common Mistakes When Cleaning Up Scanned PDFs

Skipping compression and going straight to OCR. OCR engines work best on clean, high-contrast images. Running OCR on a raw scan with noise, poor lighting, and artefacts produces significantly more recognition errors. Always compress first to clean up the image, then run OCR.

Over-compressing an already clean scan. If the original scan is high quality - proper exposure, correct DPI, no skew - aggressive compression can introduce visible artefacts. Use Low or Balanced compression for clean originals; reserve Maximum for damaged or poor-quality scans.

Assuming a phone camera scan can be fixed perfectly. Photo captures have fundamental limitations - camera shake blur, focus errors, and perspective distortion cannot be fully corrected in software. The cleanup workflow improves usability, but if the original capture is too poor, re-scanning (or re-photographing with better technique) produces better results than any amount of processing.

Forgetting to add OCR on documents that need to be searchable. A visually clean scanned PDF can look perfect but still have zero machine-readable text. If you need to search, copy, or use the document with AI tools, run OCR after visual cleanup.

Fixing the scan instead of the capture. Cleaning up a bad scan after the fact improves it, but a better capture technique eliminates most problems before they appear. Scanning at 300 DPI on a clean flat surface, using the Scan to PDF tool (which includes auto-alignment), and ensuring even lighting produce files that need minimal (or zero) cleanup.

Quick Reference: Fixing Common Scan Problems

Problem	Cause	Fix
PDF is too dark or too light	Poor scanner exposure	Run Compress PDF (image optimisation corrects exposure)
Pages are crooked / skewed	Paper misaligned on scanner	Compress PDF (re-encoding reduces visual prominence)
Text is blurry or pixelated	Low scan resolution (under 200 DPI)	Compress PDF on Balanced setting
Background grey / noisy	Paper texture, coloured paper, sensor noise	Compress PDF on Maximum (aggressive noise reduction)
Can't search or copy text	Scanned image PDF with no text layer	Run OCR PDF after compression
File too large to email	Each page stored as full-res image	Compress PDF on Maximum (reduces 60-85%)
Spine shadow on book scans	Page curvature on flatbed scanner	Compress PDF on Maximum (reduces shadow contrast)
Uneven lighting across page	Poor phone camera capture conditions	Compress PDF (balances exposure across the image)

Privacy When Cleaning Up Scanned Documents

Scanned documents frequently contain sensitive information - signed contracts, identity documents, medical records, financial statements, legal paperwork. Sending these to a server for processing introduces risk.

PDFCrush processes every scan cleanup step locally in your browser using JavaScript and WebAssembly. Your file never leaves your device. Compression, OCR, splitting, merging - everything happens in your browser tab. You can disconnect from the internet mid-process and the tools keep working. There is nothing on any server to breach, retain, or misuse.

For a deeper discussion of why this matters, the article on why you should stop uploading sensitive PDFs to online tools covers the security and privacy implications in detail.

Conclusion

A bad scan does not have to stay a bad scan. Most common problems - poor exposure, background noise, oversized files, unsearchable text - can be fixed in under two minutes using free browser-based tools. The four-step workflow (compress, OCR, restructure, final compress) handles nearly every scanned document problem without installing software, paying for a subscription, or sending your files to a server.

The single most useful tool in the workflow is compression - it fixes image quality while reducing file size, and it prepares the document for accurate OCR processing afterward. Start there, and the rest of the cleanup falls into place.
