There are two main types of PDF documents: those created electronically using
PDF creation software, and those created from a scanner or other photo-imaging
equipment. PDF creation software builds a document with an internal structure
that records characters, fonts and positions, although the raw information
makes little sense to the human eye. A scanned PDF is essentially a flat image
of a document, so scanning a page of text produces only a picture of the words.
To take information from a scanned PDF, OCR (optical character recognition)
technology is required so that each character can be recognized and then
represented as text. You can generally tell whether a document is scanned by
enlarging it on your screen and looking closely at the text. Viewed closely,
the text in a scanned image looks noticeably blurrier or more jagged than the
text in a created PDF, which stays sharp at any magnification.
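
The visual check above can be roughly approximated in code. The following is a
minimal sketch in Python, not part of Able2Extract and not how it works
internally. It assumes that a created PDF declares at least one font object
and that a scanned PDF contains image objects but no fonts. The function name
guess_pdf_kind is hypothetical. The check only looks at uncompressed markers
in the raw bytes, so it can return "unknown" for files that pack their objects
into compressed streams, and it will report "created" for a scanned file that
already carries an OCR text layer.

```python
import re

# Markers that typically appear in uncompressed PDF object dictionaries.
# Assumption: a font dictionary is declared as "/Type /Font" and a raster
# image as "/Subtype /Image"; whitespace between the two names is optional.
FONT_MARKER = re.compile(rb"/Type\s*/Font\b")
IMAGE_MARKER = re.compile(rb"/Subtype\s*/Image\b")


def guess_pdf_kind(data: bytes) -> str:
    """Guess whether raw PDF bytes look 'created', 'scanned', or 'unknown'.

    This is a heuristic only: it never decompresses streams and never
    inspects the page content itself.
    """
    has_font = FONT_MARKER.search(data) is not None
    has_image = IMAGE_MARKER.search(data) is not None
    if has_font:
        # Fonts imply real text objects, as written by PDF creation software.
        return "created"
    if has_image:
        # Images without any font suggest a flat picture of a page.
        return "scanned"
    return "unknown"
```

To try it on a file, read the file in binary mode and pass the bytes, for
example guess_pdf_kind(open("example.pdf", "rb").read()), where example.pdf is
a placeholder name. Treat the result as a hint and confirm it with the visual
check described above.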
Do I have the ability to convert scanned PDFs?
No. Able2Extract version 6.0 Standard Edition does not contain the OCR engine
required to convert scanned PDFs. To convert scanned PDF files, you need
Able2Extract Professional.
Most users of Able2Extract will probably not need OCR, since most PDFs are
generated and saved by computer applications, and Able2Extract can handle
these created PDFs. However, when a user scans a paper document and saves the
result as a PDF, the file is a scanned PDF, also called an image PDF. In these
cases, Able2Extract Professional can lift the textual information off the PDF
and perform the conversion.