Prime Recognition OCR Software

PDF Conversion Details

Overview

Prime Recognition software can convert scanned images into PDF files. Several Prime Recognition products support PDF output, including PrimeOCR, an award-winning, high-accuracy "Voting" OCR engine; PrimeZone (image to PDF only); and PrimePost (PRO to PDF).

PrimeOCR's PDF output provides the most accurate OCR results available to the production imaging marketplace, while minimizing PDF file size through full compression and retaining the original image and text layout.

Supports Adobe Acrobat

Three styles of PDF documents can be produced:

  • PDF Image Only

    documents contain a bitmap image of the original scanned document. Text is not included in this type of document.

  • PDF Normal

    documents include the formatted text output from the PrimeOCR engine, and image zones, if any. These files are significantly smaller than the original compressed bitmap image files.

  • PDF Image with Hidden Text

    includes information from both the PDF Image Only and the PDF Normal file types. The original bitmap image is included in the document, while the OCR results are hidden behind the image. This type of document is useful when the original image must be retained but the OCR results still need to be indexed, searched, or copied into another application.
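
As an illustration of how a searchable text layer can coexist with a page image, the sketch below builds the PDF content-stream operators for invisible text. It relies on text rendering mode 3 ("3 Tr"), which the PDF specification defines as text that is neither filled nor stroked. This is one common general technique, not a description of PrimeOCR's internals: the source does not say how PrimeOCR hides its text, and the function names and the /F1 font resource name are assumptions made for the example.

```python
def escape_pdf_string(text: str) -> str:
    """Escape backslashes and parentheses for a PDF literal string."""
    return text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")


def hidden_text_ops(text: str, x: float, y: float, font_size: float = 10.0) -> str:
    """Return content-stream operators that place `text` invisibly at (x, y).

    /F1 is a hypothetical font resource name; a real page would have to
    declare it in its resource dictionary.
    """
    return "\n".join([
        "BT",                                  # begin text object
        f"/F1 {font_size:g} Tf",               # select font and size
        "3 Tr",                                # rendering mode 3: invisible text
        f"{x:g} {y:g} Td",                     # move to the word's position
        f"({escape_pdf_string(text)}) Tj",     # show the (escaped) string
        "ET",                                  # end text object
    ])


if __name__ == "__main__":
    print(hidden_text_ops("Invoice (copy)", 72, 720))
```

A viewer draws nothing for these operators, yet the string remains available to search, indexing, and copy operations, which is the behavior described for this document type.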

Advantages of using PrimeOCR for PDF Creation

OCR Accuracy

  • PrimeOCR generates 50-80% fewer character recognition errors than other OCR engines.

Designed for high-volume, unattended production environments

  • Memory management for robust operation. Many of today's products that produce PDF files have limitations processing a large number of documents in batch mode, or handling multi-page TIFFs. Prime Recognition products manage memory effectively so thousands of images and multi-page TIFFs can be processed quickly without complications.

  • Capability to process batches of images in directories and subdirectories, facilitating hands-off operation of large imaging jobs.

  • Fault tolerance and process logging. Image/OCR errors are captured and recorded in log files and processing continues automatically. The software is designed for robust, continuous operation.

  • Support for long filenames and NTFS compressed drives. Prime Recognition offers the latest in Windows compatibility.

  • Three zoning workflows are supported: automatic zoning within OCR, automatic zoning with manual QA, and manual zoning before OCR.

  • Image enhancement (deskew, auto-rotation, despeckle, etc.) is controlled by the user and may be run either as a separate step before OCR or within the OCR process.
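
The batch and fault-tolerance behavior described above can be sketched generically as follows. This is not PrimeOCR's API: the convert callable stands in for whatever conversion step is actually used, and the function names, logger name, and log format are assumptions made for the example. The sketch only shows the pattern of walking subdirectories, logging each failure, and continuing with the next image.

```python
import logging
from pathlib import Path
from typing import Callable, Iterable


def find_images(root: Path, extensions: Iterable[str] = (".tif", ".tiff")) -> list[Path]:
    """Collect image files under `root`, including all subdirectories."""
    wanted = {ext.lower() for ext in extensions}
    return sorted(p for p in root.rglob("*")
                  if p.is_file() and p.suffix.lower() in wanted)


def run_batch(root: Path, convert: Callable[[Path], None], log_path: Path) -> tuple[int, int]:
    """Convert every image under `root`; log failures and keep going.

    `convert` is a hypothetical placeholder for the real conversion step.
    Returns (succeeded, failed).
    """
    handler = logging.FileHandler(log_path, encoding="utf-8")
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger = logging.getLogger("batch_sketch")
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)

    succeeded = failed = 0
    try:
        for image in find_images(root):
            try:
                convert(image)
                succeeded += 1
                logger.info("converted %s", image)
            except Exception:
                # Record the error with its traceback, then move on.
                failed += 1
                logger.exception("failed on %s", image)
    finally:
        logger.removeHandler(handler)
        handler.close()
    return succeeded, failed
```

Catching the error per image, rather than around the whole loop, is what lets a single bad file be recorded without stopping an unattended run.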

Speed

  • The single engine (Level 1) version of PrimeOCR is over 85% faster than other production imaging solutions.

  • The very high accuracy (Level 6) version of PrimeOCR is at least 15% faster than alternatives.

Process Time

OCR Conversion of 21 multi-page TIFF files to PDF Image Plus Text

Product            OCR Process Time (min:sec)   % faster with PrimeOCR
Other product      7:30                         n/a
PrimeOCR Level 1   1:05                         85%
PrimeOCR Level 3   3:00                         55%
PrimeOCR Level 6   5:50                         15%

  • Conversion of TIFF images to the PDF Image Only document format is 92% faster than alternatives.

Process Time

Conversion of 21 multi-page TIFF files to PDF Image Only

Product         Process Time (sec)   % faster with PrimeOCR
Other product   80                   n/a
PrimeOCR        6                    92%
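
For readers who want to relate the timings to the "% faster" column, the sketch below computes the relative time saved, (baseline - candidate) / baseline. The formula is an assumption about how such a figure is typically derived; the source does not state its method. The function names are illustrative only.

```python
def to_seconds(mmss: str) -> int:
    """Convert a 'minutes:seconds' string such as '7:30' to seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def percent_faster(baseline_s: float, candidate_s: float) -> float:
    """Percentage of the baseline time saved by the candidate."""
    return (baseline_s - candidate_s) / baseline_s * 100.0


if __name__ == "__main__":
    # PDF Image Only: 80 s for the other product vs. 6 s for PrimeOCR.
    print(round(percent_faster(80, 6), 1))  # 92.5
    # PDF with text, Level 1: 7:30 vs. 1:05.
    print(round(percent_faster(to_seconds("7:30"), to_seconds("1:05")), 1))  # 85.6
```

Under this assumed formula, the two cases shown come out at 92.5% and 85.6%, in line with the published 92% and 85% figures if those were rounded down.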

File Size

  • Prime Recognition's PDF output can use up to 80% less disk space than alternatives, depending on the PDF file type.

Conversion of 21 multi-page TIFF files (876.6KB total size)

PDF File Type            PrimeOCR         Other Product    % saving with PrimeOCR
                         (File Size KB)   (File Size KB)
Normal                   117.0            620.1            80%
Image Only               926.0            1263.0           25%
Image plus hidden text   988.0            1560.5           35%

  • All fonts are mapped to the base fonts found in the PDF reader, which reduces file size. However, the "look and feel" of a document in PDF Normal format may suffer when the base fonts do not closely match the fonts in the original document.

  • Both text and images are compressed within the PDF file to minimize file size.

  • To further minimize file size, PrimeOCR PDF output can desample (reduce the resolution of) the images within a PDF file. The target resolution is fully configurable by the user, from 50 dpi to 600 dpi.
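
To give a feel for why resolution has such a large effect on image size, the sketch below computes the pixel dimensions and the uncompressed bitonal (1 bit per pixel) size of a page at a given dpi. The 8.5 x 11 inch page size and the function names are assumptions for the example, and the figures are for raw rasters only; compressed sizes inside an actual PDF depend on the compression used and on page content.

```python
def pixel_dimensions(width_in: float, height_in: float, dpi: int) -> tuple[int, int]:
    """Pixel width and height of a page scanned or resampled at `dpi`."""
    return round(width_in * dpi), round(height_in * dpi)


def bitonal_bytes(width_in: float, height_in: float, dpi: int) -> int:
    """Uncompressed size of a 1-bit-per-pixel raster, rows padded to whole bytes."""
    width_px, height_px = pixel_dimensions(width_in, height_in, dpi)
    bytes_per_row = (width_px + 7) // 8
    return bytes_per_row * height_px


if __name__ == "__main__":
    # Assumed page size: 8.5 x 11 inches.
    for dpi in (600, 300, 150, 50):
        w, h = pixel_dimensions(8.5, 11, dpi)
        print(f"{dpi:>3} dpi: {w} x {h} px, {bitonal_bytes(8.5, 11, dpi)} bytes uncompressed")
```

Because pixel count scales with the square of the resolution, halving the dpi cuts the raw raster to roughly a quarter of its size.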

PrimeOCR PDF I/O Specifications

Input File Formats:

  • TIFF, including large multi-page files (thousands of pages)
  • PCX
  • Bitonal images, color and grayscale
  • JPEG
  • PDF
  • many others ...

Output File Formats:

Additional Information

Contact Us

Information and Sales:
sales@primerecognition.com

Support:
support@primerecognition.com

Call Us
(425) 895-0550

Testimonials

"The University of Michigan Digital Library Production Services is extraordinarily pleased with the increase in OCR quality made possible through the use of PrimeOCR. Scalability is a critical issue in digital libraries, and Prime Recognition has contributed to our creating a large and scalable digital library production service."
~ John Price-Wilkin, University of Michigan

"PrimeOCR gives us a much cleaner document before verification than most OCR packages do after verification."  ~ Doug Thompson, Scan Center of America
