PrimeOCR

Overview

PrimeOCR, Prime Recognition's award-winning production OCR product, is a Windows OCR engine that reduces OCR error rates by 65-80% compared with conventional OCR by implementing "Voting" OCR technology.

PrimeOCR reduces overall OCR processing costs by reducing the total number of errors generated from OCR and providing a level of reliability not available with other OCR engines.

For automated OCR - see the PrimeOCR Job Server.

PrimeOCR produces fewer errors

Today's best OCR engines achieve, on average, only 98% accuracy on images of typical quality. On a typical page of 2000 characters, that means that 40 errors remain in the OCR output.

By using PrimeOCR, error rates can be reduced by 65-80%. This means that the 40 errors generated by today's OCR engines can be reduced to between 14 (at 65%) and 8 (at 80%) by using PrimeOCR.
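
The arithmetic behind these figures can be checked with a short sketch. The page size, baseline accuracy, and reduction range are taken from the text above; the function names are illustrative only.

```python
def errors_per_page(chars_per_page, accuracy):
    """Expected number of character errors on one page."""
    return round(chars_per_page * (1 - accuracy))


def errors_after_reduction(baseline_errors, reduction):
    """Errors remaining after a fractional reduction in the error rate."""
    return round(baseline_errors * (1 - reduction))


baseline = errors_per_page(2000, 0.98)             # conventional OCR
low_end = errors_after_reduction(baseline, 0.65)   # 65% fewer errors
high_end = errors_after_reduction(baseline, 0.80)  # 80% fewer errors

print(baseline, low_end, high_end)  # prints "40 14 8"
```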

PrimeOCR saves time and money

Although PrimeOCR may cost more up front, total system and operating costs are much lower with PrimeOCR.

In some OCR-intensive applications, the cost of manual error correction, including labor and verification workstations, can be 50-70% of total image system costs. By reducing the need for error correction, PrimeOCR lowers both the annual or per-project labor cost of manual error correction and the associated capital investment.

PrimeOCR produces cleaner data

Not only does PrimeOCR reduce the total number of errors during OCR, but it also reduces the total number of errors that make it into your database or final application by 75%.

Only 40-60% of errors generated by standard OCR software are "flagged" for correction. Since manual error correction typically looks only at flagged errors, up to 60% of the errors produced by the OCR software are never reviewed and remain in the OCR data. PrimeOCR generates more accurate suspicious character "flags", reducing the total number of errors that remain in the data after processing.
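
As a rough worked example, the sketch below applies these percentages to the 40 errors per page used earlier. It assumes that every flagged error is reviewed and corrected and that unflagged errors are never caught; both are simplifications made for illustration. The 75% figure is applied directly to the conventional result, since PrimeOCR's own flagging rate is not stated here.

```python
def residual_errors(total_errors, flagged_fraction):
    """Errors left in the data when only flagged errors are corrected.

    Assumes every flagged error is fixed during verification.
    """
    return round(total_errors * (1 - flagged_fraction))


conventional_errors = 40  # per 2000-character page at 98% accuracy

best_case = residual_errors(conventional_errors, 0.60)   # 60% flagged
worst_case = residual_errors(conventional_errors, 0.40)  # 40% flagged

# A 75% reduction in errors that reach the final data
reduced_best = round(best_case * 0.25)
reduced_worst = round(worst_case * 0.25)

print(best_case, worst_case)        # prints "16 24"
print(reduced_best, reduced_worst)  # prints "4 6"
```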

The PrimeOCR Job Server - Production level reliability image processing

The PrimeOCR Job Server provides the flexibility to handle a wide range of OCR processing options and the reliability to process thousands of images without error. Each job defines the images to process, any pre/post OCR processing options required, and the type of output. The OCR Job Server queues the jobs for batch processing, and displays completed job statistics for effective batch management.

All Prime Recognition products are designed for easy installation, simple operation, reliable processing, and are scalable to match your OCR throughput requirements. Within minutes of installation, the PrimeOCR Job Server is generating high accuracy output.

Ever planned to process thousands of images overnight, only to find in the morning that the OCR engine crashed on a poor quality image? PrimeOCR's automatic engine recovery feature senses when an engine fails and re-initializes it for the next image. This level of software reliability eliminates downtime and mandatory manual intervention during OCR processing, so implementing PrimeOCR in your imaging system yields real operating efficiencies.

By leveraging PrimeOCR's features, customers reduce OCR errors from image processing, and gain efficiencies in production imaging systems.

Interested in improving OCR accuracy in your imaging system, or having problems with your current system crashing during batch processing? Let us show you how PrimeOCR can impact your conversion operations. Give PrimeOCR a try.

PrimeOCR - fast AND accurate

PrimeOCR can be set up in a mode called "selective voting". In this mode PrimeOCR offers the best of both worlds: the high speed of conventional OCR when you can afford it, and the high accuracy of Prime Recognition's technology when you need it.

PrimeOCR automatically identifies the quality of documents. On clean documents, PrimeOCR runs only one engine; on lower quality documents, it runs multiple engines and votes the results. Selective voting is configurable: you decide when to run more engines. This flexibility offers a number of advantages. For example, you may vote less often because you need higher throughput to meet an upcoming deadline, or "turn up" voting for a project whose customer is more demanding of OCR accuracy.
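
The idea can be sketched in a few lines. This is not PrimeOCR's implementation: the quality score (0.0 to 1.0), the threshold parameter, and the engine names are all hypothetical, chosen only to show how a single setting can trade speed against accuracy.

```python
def engines_to_run(quality_score, engines, threshold=0.9):
    """Choose which engines to run on one image.

    quality_score: hypothetical image quality estimate from 0.0 to 1.0.
    threshold: images scoring at or above it get a single engine.
    """
    if quality_score >= threshold:
        return engines[:1]   # clean image: one engine, no voting
    return list(engines)     # poorer image: all engines, then vote


engines = ["engine_a", "engine_b", "engine_c"]  # placeholder names

print(engines_to_run(0.95, engines))                  # one engine
print(engines_to_run(0.70, engines))                  # all three engines
print(engines_to_run(0.95, engines, threshold=0.99))  # voting "turned up"
```

Raising the threshold sends more images through voting (higher accuracy, lower throughput); lowering it does the opposite.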

PrimeOCR can address your throughput requirements while addressing high accuracy OCR needs.

Features

Options for deskew/image pre-processing such as auto-rotation
Options to auto-zone, manually zone, or full page OCR
Options to save image zones
Support for color and grayscale images
Priority management
Includes industry leading OCR engines
Reduces errors by 65-80%
Reduces labor costs required for verification
More accurate flagging of suspicious characters
High fault tolerance when operating under Windows
Automatic recovery ensures continuous processing and limits manual intervention
Common output formats including:
- Formatted ASCII
- RTF - support for color/grayscale images
- PDF - support for color/grayscale images
- PRO (required for verification using PrimeVerify)
- HTML
Scalable for growth or increased document capture
Network architecture provides flexibility, ensures reliability and maximizes operational efficiencies
OCR Job server manages image files through PrimeOCR processing
Available through an API/SDK as a Windows DLL
Template/Job Wizard makes it easy to set up processing files

Additional Information

How High Accuracy OCR Saves Operating Costs
How High Accuracy OCR Reduces OCR Errors
PrimeOCR as a certified Capture/InputAccel module
Prime Recognition customers
Prime Recognition partners

System Requirements

Software:

Windows based workstation or server.

Hardware:

Windows compatible computer. Up to 4 CPUs/cores supported.
A hard disk with 50-150 megabytes (Meg) of space for installation
At least 256 megabytes of Random Access Memory (RAM), 512 megabytes recommended. Additional memory may be required for processing color/grayscale or higher resolution images.

Technical Specifications

Recognition Data Types

PrimeOCR recognizes the following data types:

Characters - Machine Print and Dot Matrix text in any of the following 11 languages:

Danish
Dutch
English (US or UK)
French
German
Italian
Norwegian
Portuguese
Spanish
Swedish

plus Russian, Simplified Chinese, Traditional Chinese, Japanese and Korean characters.

Optical Marks (OMR) - When an area on the image is "zoned" as OMR, PrimeOCR will return the percentage of black space contained within the zone. This percentage can be used to determine whether a user has marked a selection on the page.
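
A minimal sketch of how such a percentage could be computed and used follows. The image representation (rows of 0/1 pixels), the zone tuple, and the 30% threshold are assumptions for illustration and are not part of the PrimeOCR API.

```python
def black_percentage(image, zone):
    """Percentage of black pixels inside a rectangular zone.

    image: list of rows, each a list of pixels (0 = white, 1 = black).
    zone: (left, top, right, bottom); right and bottom are exclusive.
    """
    left, top, right, bottom = zone
    pixels = [px for row in image[top:bottom] for px in row[left:right]]
    if not pixels:
        raise ValueError("zone is empty")
    return 100.0 * sum(pixels) / len(pixels)


def is_marked(percentage, threshold=30.0):
    """Treat the zone as marked when enough of it is black."""
    return percentage >= threshold


image = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]
checkbox = (1, 1, 3, 3)  # 2 x 2 zone containing three black pixels

pct = black_percentage(image, checkbox)
print(pct, is_marked(pct))  # prints "75.0 True"
```

In practice the threshold would be tuned per form, since an empty checkbox outline already contributes some black pixels.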

Graphics - PrimeOCR normally ignores any graphics (e.g., pictures) found on an image. It can instead be instructed to save the graphic to a file. A path to the graphic is added to the text output for later page reconstruction.

PrimeOCR Access Methods

PrimeOCR can be accessed through:

The PrimeOCR Job Server - The Job Server controls PrimeOCR processing and can instruct PrimeOCR to process all images found in a directory/subdirectories, with no user intervention or coding. It also records major activities performed by PrimeOCR.

PrimeView/PrimeVerify - This graphical interface for end-users consists of two applications for sending images to the Job Server and editing PrimeOCR results. See the PrimeVerify Data Sheet for more information.

Software Developer's Kit (SDK) - The SDK consists of 32 simple, orthogonal API calls, packaged as a Dynamic Link Library (DLL) and accessible from the following languages:

C or C++
C++ .NET
Visual Basic
VB .NET
Any language capable of accessing a DLL, such as PowerBuilder.

The SDK also includes:

Complete documentation
Working source code examples

Image Input

PrimeOCR will read images from either file or memory in the following formats:

TIFF (single or multi-page) all compression types
PDF
JPEG
Color or grayscale
PCX
PDA

Valid resolutions include 200, 240, 300, 400, and 600 DPI as well as Standard or Fine FAX.

Pre-Processing

PrimeOCR offers a variety of ways to enhance and define your image for optimal OCR results:

Image Enhancement:

Improves image quality for better OCR using features such as:

Deskew
Image Registration
Despeckle
Line Removal, etc.

Image Zoning:

Manual Zoning
Auto Zoning
Zone Content Restrictions include: None, Alphabetic, Alphabetic Upper/Lower Case, Numeric, Graphic, and OMR.

OCR Processing

PrimeOCR has several features that improve OCR accuracy, fault tolerance, and speed:

Configurable Accuracy

The base PrimeOCR configuration achieves 65% fewer errors than conventional OCR using a "3 engine" voting configuration. Even greater accuracy can be achieved through the following:

4, 5, or 6 Engines - Add a 4th OCR engine to the base configuration for 75% fewer errors, a 5th engine for 80% fewer errors, or a 6th engine for 83% fewer errors.
Character Training - PrimeOCR can be trained to recognize specific character sets or fonts.
Engine Customization - Users may select which engines participate in the recognition process or even weigh engine results differently.
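
The general idea of voting with per-engine weights can be sketched as below. PrimeOCR's actual voting algorithm is not described here, so this is a plain character-by-character weighted majority vote, and it assumes each engine returns a string of the same length (real engine outputs would first need to be aligned).

```python
from collections import defaultdict


def weighted_vote(readings, weights=None):
    """Combine per-engine readings character by character.

    readings: equal-length strings, one per engine.
    weights: optional per-engine weights (default: every engine counts 1).
    """
    if weights is None:
        weights = [1.0] * len(readings)
    if len(weights) != len(readings):
        raise ValueError("need one weight per engine")
    if len({len(r) for r in readings}) > 1:
        raise ValueError("readings must have the same length")

    result = []
    for chars in zip(*readings):
        scores = defaultdict(float)
        for ch, weight in zip(chars, weights):
            scores[ch] += weight
        # Highest total weight wins; ties go to the earliest engine.
        result.append(max(scores, key=scores.get))
    return "".join(result)


readings = ["c1ean", "clean", "cleam"]  # three engines, two different mistakes

print(weighted_vote(readings))             # prints "clean"
print(weighted_vote(readings, [3, 1, 1]))  # prints "c1ean"
```

The second call shows why weights matter: a heavily weighted engine can override the other two, which helps only if that engine really is more reliable on the material being processed.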

High Fault Tolerance

Automatic Engine Recovery - A poor quality image can cause a conventional OCR product to "crash". To solve this problem, PrimeOCR can sense when an engine fails and automatically reinitialize it for the next image. This increases throughput by allowing PrimeOCR to run unattended, 24 hours a day!
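
The recovery pattern itself is simple to illustrate. In the sketch below, FlakyEngine is a hypothetical stand-in that fails on one particular image and stays broken afterwards; it is not a PrimeOCR class, and a real engine failure (such as a crashed process) would need process-level supervision rather than exception handling.

```python
class FlakyEngine:
    """Hypothetical OCR engine that breaks on a bad image."""

    def __init__(self):
        self.healthy = True

    def recognize(self, image):
        if not self.healthy:
            raise RuntimeError("engine is no longer usable")
        if image == "bad.tif":
            self.healthy = False
            raise RuntimeError("engine crashed")
        return f"text of {image}"


def process_batch(images, make_engine):
    """Process every image, reinitializing the engine after a failure."""
    engine = make_engine()
    results = {}
    for image in images:
        try:
            results[image] = engine.recognize(image)
        except Exception:
            results[image] = None   # record the failed image
            engine = make_engine()  # fresh engine for the next image
    return results


results = process_batch(["a.tif", "bad.tif", "c.tif"], FlakyEngine)
print(results)
```

Without the reinitialization step, every image after bad.tif would fail as well, which is the overnight batch scenario described above.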

Configurable Speed

Multi-Processor Support - This option allows PrimeOCR to utilize up to four processors in a multi-CPU system for faster throughput.

Selective Voting - While "Voting" takes longer than conventional OCR, you can speed up the processing on high quality images through Selective Voting. The result: faster OCR speeds on high quality documents and more processing power on lower quality documents.

Output Format

PrimeOCR can generate file output in the following formats:

ASCII - Text only output, left justified.
Formatted ASCII - Spaces are added to text to mimic the original imaging layout.
PDF - Converts scanned images, including color images, into PDF "Normal", "Image + Text", or "Image Only" files. Accessible PDF and PDF/A output are also supported.
RTF - Retains original character attributes and page layout using frames and paragraph conventions. Color/grayscale image zones are supported.
Comma Delimited ASCII - Useful for exporting text fields to other applications.
Confidence/Character Attribute Reporting - Provides text and information on each character to aid in OCR verification. Attributes include line coordinates as well as character confidence, font, location, point size, style, etc.
HTML - Transfer OCR results directly to the Web for on-line viewing. Color/grayscale image zones are supported.
XHTML
XML
Tab Delimited - Useful for forms-based applications. Each defined zone's output is separated by a tab, so the file can be easily imported into any popular database application.
RRI3 - RRI's FormWorks compatible format.
ZYINDEX - ZyLab's ZyIndex compatible format.
Custom output - each conversion project is unique in its requirements. Contact us if you need customized output including advanced parsing of text output or any other custom pre or post OCR processing.
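
To illustrate the delimited formats, the sketch below writes one line per page with one field per zone, using Python's standard csv module. The exact layout of PrimeOCR's delimited files is not specified here, so the one-row-per-page arrangement and the sample data are assumptions.

```python
import csv
import io


def zones_to_delimited(pages, delimiter="\t"):
    """Write one line per page, one field per zone."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=delimiter, lineterminator="\n")
    for zones in pages:
        writer.writerow(zones)
    return buffer.getvalue()


pages = [["Smith", "123 Main St, Apt 4", "98052"]]  # made-up zone text

tab_output = zones_to_delimited(pages)
comma_output = zones_to_delimited(pages, delimiter=",")

print(repr(tab_output))    # 'Smith\t123 Main St, Apt 4\t98052\n'
print(repr(comma_output))  # 'Smith,"123 Main St, Apt 4",98052\n'
```

A zone whose text contains a comma needs quoting in the comma-delimited form but not in the tab-delimited one, which is one reason tab-delimited output is convenient for forms data.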

Complementary products for PrimeOCR

Prime Recognition has two add-on applications that can be customized for specific image document types to improve PrimeOCR accuracy rates:

PrimeZone - This custom pre-OCR auto-zoning application creates a zone template for each image based upon specific document types such as Phonebooks, Greenbar, etc.

PrimePost - This custom post-OCR utility performs automatic error correction based upon predefined document types.

How To Buy/Pricing

Prime Recognition's products are designed for the production market. They are therefore significantly more expensive than desktop OCR products, but are competitively priced for the production market. Most configurations cost between $4,600 and $8,000 per PC. PrimeOCR's increased accuracy, fault tolerance, and other "high end" features pay for themselves very quickly. Most of our clients report very short payback periods.

The "base configuration" of PrimeOCR starts at $4,600.00 for a license to process unlimited pages on one PC, or $1,400.00 for a page-limited license on one PC (the license expires after 150,000 pages have been processed).
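
For a rough comparison of the two license types, the sketch below uses the base prices quoted above. It assumes that additional page-limited licenses can be purchased at the same price once one expires, which is not stated here, and it ignores add-on modules; contact sales for actual terms.

```python
import math

UNLIMITED_PRICE = 4600.00     # unlimited pages, one PC
PAGE_LIMITED_PRICE = 1400.00  # expires after PAGE_LIMIT pages, one PC
PAGE_LIMIT = 150_000


def page_limited_cost(pages):
    """Cost of enough page-limited licenses to cover the given volume.

    Assumes licenses can be repurchased at the same price (not confirmed).
    """
    licenses = max(1, math.ceil(pages / PAGE_LIMIT))
    return licenses * PAGE_LIMITED_PRICE


def cheaper_option(pages):
    """Name the cheaper base license for a given page volume."""
    if page_limited_cost(pages) < UNLIMITED_PRICE:
        return "page-limited"
    return "unlimited"


print(cheaper_option(100_000))  # prints "page-limited"
print(cheaper_option(450_000))  # prints "page-limited"
print(cheaper_option(500_000))  # prints "unlimited"
```

Under these assumptions, three page-limited licenses ($4,200) still undercut the unlimited license, so the unlimited license becomes the cheaper choice once volume on one PC exceeds 450,000 pages.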

The base configuration of PrimeOCR includes Level 3 accuracy, the PrimeOCR Job Server application, the SDK, a sample REST API implementation, 2000 zones/page, all file outputs except for PDF (e.g. ASCII, RTF, XML, HTML, PRO, etc.), auto-zoning and recognition of English and 10 Western European languages.

Available add-on modules, which add to the price of the base configuration, include:

additional voting levels for increased recognition accuracy (Level 6 maximum)
support for processing color or grayscale images
image enhancement (deskew, auto-rotation, despeckle, etc.)
PDF input/output
support for multi-core CPUs
support for Capture/InputAccel
barcode recognition
recognition of Asian languages (Chinese, Japanese and Korean)
recognition of Russian language
recognition of rotated text (90, 180, 270 degrees)
support for large images up to E sized Engineering drawings

An annual software maintenance program is required for the first year. It costs 15% of the license cost and is already included in the pricing examples provided above.

Prime Recognition also offers a variety of pricing options to match your financial needs.

We encourage you to contact us at sales@primerecognition.com or give us a call at: (425)895-0550 to let us tailor a pricing program to your needs.

Additional Information

See how PrimeOCR can be cost effective for your OCR processing operation.
See how PrimeOCR provides cleaner data by reducing OCR errors.
Prime Recognition products are designed for the production imaging market with features that provide powerful, scalable OCR solutions. See how they can easily integrate into your current OCR processing flow.
See how high accuracy OCR software can save you operational costs.

User Manual

Download User Manual