Prime Recognition OCR Software

Accessible PDF Overview

PrimeOCR accessible PDF output offers a number of features that increase the accessibility of the contents of the PDF file for people with disabilities.  These features also help government agencies and businesses comply with US government regulations as outlined in Section 508 of the Rehabilitation Act.

Section 508 requires that Federal agencies' electronic documents, including PDF files, be accessible to people with disabilities.

Features:

PrimeOCR makes OCRed text within scanned image PDF files accessible and usable in line with Section 508 standards.

  • Higher character recognition accuracy delivers more accurate results
  • Text reading order can be identified
  • Alternate text can be inserted for each graphic on a scanned page
  • Natural language of recognized text can be included
  • Text can be viewed in "Reflow" view mode - including text from Image plus Hidden Text files
  • Text retains formatting and paragraph wrapping when exported to another application

508 requirements and the corresponding PrimeOCR PDF output:

A text equivalent for every non-text element shall be provided.

Alternate text is provided for each significant graphic on the page.

Alternate text can be provided by the user for each zoned graphic on the scanned image (using PrimeView).

If the operator does not provide alternate text during manual zoning, PrimeOCR automatically inserts the default string "This is a graphic from a scanned page". The alternate text can be updated later with a PDF tag editor.
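The fallback rule for alternate text can be illustrated with a minimal Python sketch. This is not PrimeOCR code: the `GraphicZone` class and the `alt_text_for` function are hypothetical names chosen for illustration, and treating whitespace-only text as "not provided" is an assumption. Only the default string is taken from the description above.

```python
from dataclasses import dataclass
from typing import Optional

# Default string quoted in the description above.
DEFAULT_ALT_TEXT = "This is a graphic from a scanned page"


@dataclass
class GraphicZone:
    """Hypothetical stand-in for a zoned graphic; not a PrimeOCR type."""
    page: int
    operator_alt_text: Optional[str] = None  # text entered during manual zoning, if any


def alt_text_for(zone: GraphicZone) -> str:
    """Return the operator's alternate text, or the default when none was given."""
    # Assumption: empty or whitespace-only text counts as "not provided".
    text = (zone.operator_alt_text or "").strip()
    return text if text else DEFAULT_ALT_TEXT
```

A zone created without operator text, such as `GraphicZone(page=1)`, yields the default string, while `GraphicZone(page=1, operator_alt_text="Company logo")` keeps the operator's wording.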

Documents shall be organized so they are readable without requiring an associated style sheet.

Reading order is identified and tagged in the PDF file along with paragraph markers.

Reading order is determined from autozoning the scanned page or set by the user manually using PrimeView.
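To make the idea of a reading order concrete, here is a rough Python sketch of one generic heuristic: group text zones into columns, read columns left to right, and read zones top to bottom within each column. PrimeOCR's actual autozoning logic is not described here, so this is an assumed stand-in rather than the product's method, and the `Zone` class and `reading_order` function are hypothetical names.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Zone:
    """Hypothetical text zone: a bounding box in page pixels, origin at top-left."""
    name: str
    left: int
    top: int
    right: int
    bottom: int


def reading_order(zones: List[Zone]) -> List[Zone]:
    """Order zones column by column (left to right), top to bottom within a column."""
    columns: List[List[Zone]] = []
    for zone in sorted(zones, key=lambda z: z.left):
        for column in columns:
            # A zone joins a column when it overlaps horizontally
            # with the first zone placed in that column.
            anchor = column[0]
            if zone.left < anchor.right and anchor.left < zone.right:
                column.append(zone)
                break
        else:
            # No overlapping column found: start a new one to the right.
            columns.append([zone])

    ordered: List[Zone] = []
    for column in columns:
        ordered.extend(sorted(column, key=lambda z: z.top))
    return ordered
```

A simple heuristic like this breaks down on layouts such as a full-width headline above two columns, which is the kind of page where setting the order manually is the safer choice.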

The language of the document is identified and tagged for each page. (The language for each page is taken from the language setting in the processing template, as defined in PrimeView or through an API call.)

Text retains formatting, so when the OCR text is exported to RTF from a PDF viewer, the output text reflows within paragraphs and character formatting is retained.

OCRed text can be viewed in "Reflow" mode for all versions of PDF output, including image plus hidden text.

Row and column headers shall be identified for data tables.

Rows and columns are defined and tagged in the PDF file.

Rows and columns must be manually defined within the template (using PrimeView); otherwise autozoning will segment the table as a single text zone or as several text zones.
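As an illustration of what a tagged table looks like, the following Python sketch prints a tag outline for a manually defined table. `Table`, `TR`, `TH` and `TD` are the standard structure type names in the PDF specification; whether PrimeOCR emits exactly this layout is an assumption, and `table_tag_outline` with its `header_rows` and `header_cols` parameters is a hypothetical helper, not part of any PrimeOCR API.

```python
from typing import List


def table_tag_outline(rows: List[List[str]], header_rows: int = 1, header_cols: int = 0) -> str:
    """Render a plain-text outline: Table > TR > TH (header cells) or TD (data cells)."""
    lines = ["Table"]
    for row_index, row in enumerate(rows):
        lines.append("  TR")
        for col_index, cell in enumerate(row):
            # A cell is a header if it sits in a header row or a header column.
            is_header = row_index < header_rows or col_index < header_cols
            tag = "TH" if is_header else "TD"
            lines.append(f"    {tag}: {cell}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(table_tag_outline([["Item", "Qty"], ["Paper", "2"]]))
```

Marking header cells as `TH` is what allows assistive technology to announce which row or column a data cell belongs to, which is why the rows and columns need to be defined before the PDF is generated.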

Automated processing:

Autozoning may be used to automatically include accessibility features in PDF output.  Autozoning automatically finds graphics on the scanned page and identifies columns of text to be recognized.  If the document is too complex and autozoning does not reliably determine the correct reading order, manual zoning with PrimeView may be used (see below).  When autozoning is used during OCR, accessible PDF output will include:

  • Reading order of the text that is recognized during OCR. Text will be identified and tagged in the PDF output file.

  • Paragraphs will be identified within the text for later export into another application.  Formatting of the text is retained during export.

  • Graphics within the scanned page will be identified and tagged with a generic alternative text tag - "This is a graphic from a scanned page". The alternative text for the graphic can be easily modified later with PDF editing software.

  • Tables will not be recognized or tagged when using autozoning. Tables can only be identified by manually zoning the page prior to OCR taking place.

Manual processing:

Depending on the document style, a less automated approach may need to be used to produce more accurate accessible OCR text in PDF output.  By using new features in PrimeView specifically designed for adding accessibility attributes to PDF output, an operator can quickly zone text columns, zone graphics on a scanned page, provide alternate text for each graphic, and identify table rows and columns.

The zoning information collected from PrimeView is provided to PrimeOCR when the PDF output is generated from the scanned image to create accessible PDF output.
