Investintech Offers PDF Conversion Solutions

PDF and OCR Fact Sheet

 

Are all PDF documents the same?

No. PDF documents can be created in a variety of ways. PDFs that are generated from an electronic source – such as a Word document, a computer-generated report, or spreadsheet data – have an internal structure that can be read and interpreted. These “generated” PDF documents already contain characters that have an electronic character designation, so conversion can rely on those designations and provide reliable output.
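To make the idea of an "electronic character designation" concrete, here is a minimal Python sketch. It assumes an uncompressed page content stream in which text is drawn with the PDF `Tj` (show text) operator. The function name `shown_text` and both sample streams are hypothetical, written for illustration only. Real PDFs usually compress their streams and apply font encodings, so this is not a working converter.

```python
import re
from typing import List

# A simple literal string "( ... )" followed by the Tj (show text) operator.
# Nested parentheses and TJ arrays are deliberately not handled in this sketch.
TJ_PATTERN = re.compile(rb"\(((?:[^()\\]|\\.)*)\)\s*Tj\b")

def shown_text(content_stream: bytes) -> List[str]:
    """Return the literal strings a content stream draws with Tj."""
    return [m.group(1).decode("latin-1")
            for m in TJ_PATTERN.finditer(content_stream)]

# Hypothetical page from a "generated" PDF: the characters are stored as text.
generated_page = b"BT /F1 12 Tf 72 720 Td (Invoice total: 42.00) Tj ET"

# Hypothetical page from a scanned PDF: it only paints an image (/Im0 Do),
# so there are no character codes to read.
scanned_page = b"q 612 0 0 792 0 0 cm /Im0 Do Q"

print(shown_text(generated_page))
print(shown_text(scanned_page))
```

The generated page yields its text directly, while the scanned page yields nothing. That gap is what OCR has to fill.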

PDF documents can also be created by scanning a paper document into electronic format. A scanned document is really just a “picture” of the words it contains. To convert a scanned document into an editable format, OCR software must analyze the “image” of each character and match it to an electronic character. Because of this, it is much more difficult to ensure that the character “recognized” by the OCR software is the character on the scanned document. The quality of OCR output is affected by factors such as poor image quality, a mixture of fonts, and italicized or underlined text, which may blur the shape of individual characters.
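The matching step can be illustrated with a toy Python sketch. It uses nearest-template matching on tiny 3x5 bitmaps, a deliberately simplified stand-in and not the method used by any commercial OCR engine. The glyph templates and the names `hamming` and `recognize` are invented for this example.

```python
# Toy 3x5 glyph templates: "#" is ink, "." is background.
TEMPLATES = {
    "I": ["###", ".#.", ".#.", ".#.", "###"],
    "L": ["#..", "#..", "#..", "#..", "###"],
    "O": ["###", "#.#", "#.#", "#.#", "###"],
    "T": ["###", ".#.", ".#.", ".#.", ".#."],
}

def hamming(a, b):
    """Count the pixels that differ between two equally sized bitmaps."""
    return sum(pa != pb
               for row_a, row_b in zip(a, b)
               for pa, pb in zip(row_a, row_b))

def recognize(glyph):
    """Return (character, distance) for the closest template."""
    return min(((ch, hamming(glyph, tpl)) for ch, tpl in TEMPLATES.items()),
               key=lambda pair: pair[1])

# A clean scan of "O" with one pixel lost is still closest to "O".
noisy_o = ["###", "#.#", "#.#", "#.#", "##."]

# A scan of "I" whose bottom stroke faded looks exactly like "T".
faded_i = ["###", ".#.", ".#.", ".#.", ".#."]

print(recognize(noisy_o))
print(recognize(faded_i))
```

The second case shows why scan quality matters: once the bottom stroke of the "I" is lost, the closest template is "T", and the software has no way to know the original character was different.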

What is OCR (Optical Character Recognition)?

Optical Character Recognition (OCR) is a visual recognition process that turns printed or written text into electronic characters. When a document is scanned and saved as a PDF, OCR software can interpret each character image in the PDF and assign it an electronic character, which can then be placed into an editable format such as a Text or Word document.

Given the proliferation of scan-to-PDF technology available today, Investintech’s OCR solutions focus only on the conversion of already created scanned PDF documents. The quality of the OCR conversion process will largely depend on the quality of the scanned image and the clarity of the characters of that image.

The OCR technology that drives Investintech’s OCR-enabled products is licensed from Nuance, Inc. – a global leader in OCR-based technologies.

Nuance® OCR © 1994-2007 Nuance, Inc.

 
