GlyphReader Engine
The GlyphReader engine is a highly accurate OCR engine built for DotImage.
Tested with the ISRI OCR Performance Toolkit, it achieved a 99.5% accuracy
rate and was found to be more accurate than other expensive,
industry-leading OCR engines.
GlyphReader is a lexicon OCR engine requiring no dictionary. It supports
European characters only. The following ASCII characters are supported.
The GlyphReader engine does not support font name or family
determination. This engine does support font size, baseline, glyph
bounds, and confidence.
The DefaultFontName property determines the font name used by the OcrGlyph
class.
Features
GlyphReader supports the following features:
- European Character Set
- Reports individual character position and size
- Reports character confidence
- Performs OCR on rotated pages and reports the rotation angle
- Automatically breaks merged characters, or merges broken characters
- Optionally rejects low-confidence characters
- Optionally rejects low-confidence lines
- Allows disabling recognition of specific characters
- Generates full-page color OCR output when combined with the Searchable PDF Module
Features that are found in some engines but not in GlyphReader include
zoning and determining font characteristics.
Output Formats
As with any OCR engine using the DotImage OCR interface, all foreign
translators are supported. Text translation is supported out of the box.
Searchable PDF is available with the PDF Translator add-on. Therefore, the
following MIME types are supported for output:
- text/plain
- application/pdf (requires the PDF Translator add-on)
Licensing
The DotImage OCR GlyphReader Engine is licensed per concurrent use. Two
GlyphReader licenses are required for two applications to use GlyphReader
simultaneously. If the application resides only on a server, you have the
option of purchasing a server license. A server license allows an unlimited
number of users to connect to the server running the GlyphReader-enabled
application, with up to 20 concurrent processes/threads running at once.
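A server application can enforce the concurrency cap itself rather than rely on callers to behave. The following is a minimal sketch, assuming the 20-process/thread limit quoted above and a hypothetical job function supplied by the application. It does not call any DotImage API, and the names `MAX_CONCURRENT_OCR` and `run_with_license_slot` are illustrative only.

```python
import threading

# Cap taken from the server license description above.
MAX_CONCURRENT_OCR = 20

# One slot per permitted concurrent OCR job.
_ocr_slots = threading.BoundedSemaphore(MAX_CONCURRENT_OCR)


def run_with_license_slot(job, *args):
    """Run job(*args) while holding one of the MAX_CONCURRENT_OCR slots.

    `job` is a hypothetical callable that performs the OCR work.
    Callers block until a slot is free, so no more than
    MAX_CONCURRENT_OCR jobs run at the same time in this process.
    """
    with _ocr_slots:
        return job(*args)
```

Note that a semaphore like this only limits threads inside a single process; if several processes share the license, the cap would need to be coordinated between them by other means.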
Deployment
GlyphReader requires the following managed assemblies:
- Atalasoft.Shared.dll
- Atalasoft.dotImage.Lib.dll
- Atalasoft.dotImage.dll
- Atalasoft.dotImage.Ocr.dll
- Atalasoft.dotImage.GlyphReader.dll
GlyphReader also requires the following unmanaged assemblies and support
files, located in
Program Files\Atalasoft\DotImage 10.0\Bin\OcrResources\GlyphReader\v3.0:
- GlyphReader.dll
- GlyphReaderEngine.exe
- GlyphReader.ini
- TOCR31.gar
- TOCR31.n3s
- TOCR31.teh
These unmanaged assemblies must be located in System32 on the client machine.
Alternatively, they can be installed alongside the managed assemblies, but
only if the OcrResourceLoader or GlyphReaderLoader class is instantiated in
a static constructor of the class that invokes GlyphReader.
By default, the unmanaged assemblies are found in SDK folder\OcrResources\GlyphReader\v3.0\.
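Before deploying, it can help to confirm that the resource folder actually contains every unmanaged file listed above. The following is a minimal sketch of such a check. The file names are copied from that list, while the function name `missing_resources` and the idea of running it as a pre-deployment script are assumptions for illustration, not part of DotImage.

```python
from pathlib import Path

# File names copied from the unmanaged assemblies and support files listed above.
GLYPHREADER_RESOURCES = [
    "GlyphReader.dll",
    "GlyphReaderEngine.exe",
    "GlyphReader.ini",
    "TOCR31.gar",
    "TOCR31.n3s",
    "TOCR31.teh",
]


def missing_resources(folder):
    """Return the required file names that are not present in `folder`."""
    root = Path(folder)
    return [name for name in GLYPHREADER_RESOURCES if not (root / name).is_file()]


if __name__ == "__main__":
    import sys

    # Folder to check; defaults to the current directory.
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    missing = missing_resources(target)
    if missing:
        print("Missing GlyphReader resources: " + ", ".join(missing))
        sys.exit(1)
    print("All GlyphReader resources found in " + target)
```

Run it against whichever folder the deployment uses (System32, the assembly folder, or an alternate location); a non-zero exit code means at least one file is absent.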
Due to the architecture of the GlyphReader engine, specifying a location
other than a default search path such as System32 requires creating an
instance of OcrResourceLoader or GlyphReaderLoader in a static constructor,
before any OCR code is loaded. This is required even if the resources are
in the assembly folder. When creating the loader, you can specify an
alternate location for the resources if desired.
When consuming GlyphReaderEngine in a web application, web service, or WCF
service, see HOWTO: Load GlyphReaderEngine by Reflection.
Example
See the GlyphReaderEngine object reference for examples.
See Also
Original Article:
Q10362 - INFO: GlyphReader Engine - Overview