INFO: GlyphReader Engine - Overview


GlyphReader Engine

The GlyphReader engine is a highly accurate OCR engine built for DotImage. Tested with the ISRI OCR Performance Toolkit, it achieved a 99.5% accuracy rate, which is higher than that of other expensive, industry-leading OCR engines.

GlyphReader is a lexicon OCR engine requiring no dictionary. It supports European characters only. The following ASCII characters are supported.

The GlyphReader engine does not support font name or family determination. This engine does support font size, baseline, glyph bounds, and confidence.

The DefaultFontName property determines the font name used by the OcrGlyph class.

Features

GlyphReader supports the following features:

  • European Character Set
  • Reports individual character position and size
  • Reports character confidence
  • Recognizes rotated pages and reports the rotation angle
  • Automatically breaks merged characters and merges broken characters
  • Optionally rejects low confidence characters
  • Optionally rejects low confidence lines
  • Optionally disables recognition of specific characters
  • Full Page color OCR can be generated when combined with the Searchable PDF Module

Features that are found in some engines but not in GlyphReader include zoning and determining font characteristics.

Output Formats

As with any OCR engine using the DotImage OCR interface, all foreign translators are supported. Text translation is supported out of the box. Searchable PDF is available with the PDF Translator add-on. Therefore, the following MIME types are supported for output:

  • text/plain
  • application/pdf (Requires PDF Translator add-on)

Licensing

The DotImage OCR GlyphReader Engine is licensed per concurrent use: two applications that use GlyphReader simultaneously require two GlyphReader licenses. If the application will reside only on a server, you have the option of purchasing a server license. A server license allows an unlimited number of users to connect to the server running the GlyphReader-enabled application, with up to 20 concurrent processes/threads running at once.
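
To illustrate the 20 process/thread ceiling mentioned above, here is a minimal sketch of how a server application might cap the number of OCR jobs it runs at once. It is written in Python purely for illustration and is not part of the DotImage API: the names ConcurrencyGate and run_all are hypothetical, and the job argument stands in for whatever call actually performs recognition. License enforcement itself is done by the SDK; this only shows one way to keep an application within the stated limit.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Limit quoted for the server license in this article.
MAX_CONCURRENT_OCR = 20


class ConcurrencyGate:
    """Hypothetical helper: caps how many jobs run at once and records the peak."""

    def __init__(self, limit):
        self._slots = threading.BoundedSemaphore(limit)
        self._lock = threading.Lock()
        self._active = 0
        self.peak = 0

    def run(self, job, *args):
        # Block until one of the `limit` slots is free.
        with self._slots:
            with self._lock:
                self._active += 1
                self.peak = max(self.peak, self._active)
            try:
                return job(*args)
            finally:
                with self._lock:
                    self._active -= 1


def run_all(job, items, limit=MAX_CONCURRENT_OCR, workers=64):
    """Run job(item) for every item, never exceeding `limit` jobs at a time.

    Returns the results in input order and the highest concurrency observed.
    """
    gate = ConcurrencyGate(limit)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(lambda item: gate.run(job, item), items))
    return results, gate.peak
```

If the thread pool is used only for OCR, setting its worker count to the limit achieves the same effect; the separate gate is useful when the pool is shared with other work.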

Deployment

GlyphReader requires the following managed assemblies:

  • Atalasoft.Shared.dll
  • Atalasoft.dotImage.Lib.dll
  • Atalasoft.dotImage.dll
  • Atalasoft.dotImage.Ocr.dll
  • Atalasoft.dotImage.GlyphReader.dll

GlyphReader also requires the following unmanaged assemblies and support files, located in Program Files\Atalasoft\DotImage 10.0\Bin\OcrResources\GlyphReader\v3.0:

  • GlyphReader.dll
  • GlyphReaderEngine.exe
  • GlyphReader.ini
  • TOCR31.gar
  • TOCR31.n3s
  • TOCR31.teh

These unmanaged assemblies and support files must be located in System32 on the client machine. Alternatively, they can be installed alongside the managed assemblies, but only if the OcrResourceLoader or GlyphReaderLoader class is instantiated in a static constructor of the class that invokes GlyphReader.

By default the unmanaged assemblies are found in SDK folder\OcrResources\GlyphReader\v3.0\.

Due to the architecture of the GlyphReader engine, using a location other than a default search path such as System32 requires creating an instance of the OcrResourceLoader or GlyphReaderLoader in a static constructor before any OCR code is loaded. This is the case even if the resources are in the assembly folder. When creating the loader, you can specify an alternate location for the resources if desired.
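
Because a missing resource file is a common deployment problem, a pre-flight check can be useful. The following is a minimal sketch, written in Python for illustration only, that reports which of the files listed above are absent from a given folder. The file names come from the lists in this article; the function names missing_files and check_deployment are hypothetical, and the folders to check depend on where your installer places the files.

```python
from pathlib import Path

# File names taken from the lists in this article.
MANAGED_ASSEMBLIES = [
    "Atalasoft.Shared.dll",
    "Atalasoft.dotImage.Lib.dll",
    "Atalasoft.dotImage.dll",
    "Atalasoft.dotImage.Ocr.dll",
    "Atalasoft.dotImage.GlyphReader.dll",
]
UNMANAGED_RESOURCES = [
    "GlyphReader.dll",
    "GlyphReaderEngine.exe",
    "GlyphReader.ini",
    "TOCR31.gar",
    "TOCR31.n3s",
    "TOCR31.teh",
]


def missing_files(folder, required):
    """Return the names in `required` that are not regular files in `folder`."""
    folder = Path(folder)
    return [name for name in required if not (folder / name).is_file()]


def check_deployment(app_folder, resource_folder):
    """Hypothetical helper: map each file group to the files it is missing."""
    return {
        "managed": missing_files(app_folder, MANAGED_ASSEMBLIES),
        "unmanaged": missing_files(resource_folder, UNMANAGED_RESOURCES),
    }
```

Pass the application folder as app_folder and either System32 or the alternate resource location as resource_folder; an empty list for both groups means every file named in this article was found.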

When consuming GlyphReaderEngine in a web application, web service, or WCF service, see HOWTO: Load GlyphReaderEngine by Reflection.

Example

See the GlyphReaderEngine object reference for examples.

See Also

Original Article:
Q10362 - INFO: GlyphReader Engine - Overview