Scanned PDFs Are Now Searchable on Google

For some time now, you have been able to search for PDF files with Boolean terms or with PDF-specific search engines like Data-Sheet, PDFGeni.com and PDF Search Engine. Sites like Scribd, PDFoo.com and the now-defunct DocuFarm have also taken steps to make searching for PDF files a one-stop journey.
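As a quick illustration of the Boolean approach, here is a minimal Python sketch that builds a Google search URL restricted to PDF results. It relies on Google's `filetype:pdf` operator; the function name `pdf_search_url` and the sample search terms are made up for this example.

```python
from urllib.parse import urlencode


def pdf_search_url(terms, exclude=()):
    """Build a Google search URL limited to PDF results.

    `terms` may include quoted phrases and Boolean operators such as OR.
    `exclude` lists words to filter out with a leading minus sign.
    (Hypothetical helper, for illustration only.)
    """
    parts = list(terms)
    parts += [f"-{word}" for word in exclude]
    parts.append("filetype:pdf")  # restrict results to PDF files
    query = " ".join(parts)
    return "https://www.google.com/search?" + urlencode({"q": query})


# Example: PDFs mentioning "annual report" or prospectus, but not drafts
url = pdf_search_url(['"annual report"', "OR", "prospectus"], exclude=["draft"])
print(url)
```

Pasting the printed URL into a browser should run the same search you would otherwise type by hand.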

But despite this convenience, PDF searches have been limited to native PDF files, meaning those created from digital documents (as opposed to PDFs made by scanning paper documents). Well, last week Google announced that it is making scanned PDFs searchable by using OCR technology to make the text in those PDFs indexable.

Now companies and organizations can have their scanned-in material
indexed alongside natively created PDFs that are already indexed, making
Google’s resources that much more extensive. Material not normally seen
on the web will be accessible. Admittedly, though, the scanned PDFs
reached through a search engine, while useful and more convenient, will
be of lesser quality.

Can you tell the difference between a native and scanned PDF?

This brings me back to an article I read a few months back about PDF and
SEO by Search Engine Optimisation consultant Tim Nash, whose blog takes
a look into the workings of the SEO world. The article summarizes test
results on how effective search engine indexing of PDFs actually is and
the influence it has on the ranking of your ordinary PDF files.

Of course, the study doesn’t cover how scanned PDFs would rank among
native files or how well they could be indexed on a large scale,
something that’s now being put to the test. It does, however, leave you
with the simple question of how to deal with the scanned PDFs that come
up on a search engine results page (SERP).

Scanned PDFs have long been a special case when it comes to conversion.
Their content can’t be converted normally unless it has first been
processed electronically with OCR technology.

With the Professional versions of Able2Extract and Able2Doc you can
convert those scanned PDFs with the built-in OCR conversion option.
After opening the file in either application, you can enable the
option from the menu and then convert the file into a regular document
you can work with, such as Word, Excel or PowerPoint.

So whether Google (or the other PDF search engines that are sure to
follow Google’s lead) throws you a scanned or native PDF, you can still
keep pace without missing a beat.