OcrApi Class
Namespace: Patagames.Ocr
The OcrApi type exposes the following members.

Properties

| Name | Description |
|---|---|
| AllWordConfidences | Returns all word confidences (between 0 and 100) in an array. |
| AvailableLanguages | Gets the available languages. |
| DataPath | Gets the path to the tessdata folder. |
| EngineMode | Gets the current OCR engine mode (OEM). |
| Handle | Gets the handle to the Tesseract API object. |
| InitLanguages | Gets the languages string used in the last valid initialization. |
| InputFilters | Gets the collection of filters that are applied to the input image. |
| InputImage | Gets or sets the input image. |
| InputName | Gets or sets the name of the input file. Needed for training and reading a UNLV zone file, and for searchable PDF output. |
| Iterator | Gets a reading-order iterator to the results of AnalyseLayout and/or Recognize. |
| LicenseKey | Gets or sets the license key. Null for trial mode. |
| LoadedLanguages | Gets the loaded languages. Includes all languages loaded by the last Init, including those loaded as dependencies of other loaded languages. |
| MutableIterator | Gets a mutable iterator to the results of AnalyseLayout and/or Recognize. |
| OutputName | Gets or sets the name of the output files. Needed only for debugging. |
| PageSegmentationMode | Gets or sets the current page segmentation mode. |
| PathToEngine | Gets or sets the path to tesseract.dll. Null for automatic detection. See the Remarks section for details. |
| Rectangle | Restricts recognition to a sub-rectangle of the image. |
| SourceResolution | Gets or sets the resolution of the source image in pixels per inch. Set this right after SetImage so that appropriate font sizes can be returned for the text. |
| TextConfidences | Gets the (average) confidence value between 0 and 100. |
| ThresholdedImage | Gets a copy of the internal thresholded image from Tesseract. |
| ThresholdedImageScaleFactor | Gets the scale factor of the thresholded image that would be returned by ThresholdedImage and the various methods that call GetComponentImages. Equals 0 if no thresholder has been set. |
| Version | Gets the version identifier as a static string. |

Methods

| Name | Description |
|---|---|
| AdaptToWordStr | Applies the given word to the adaptive classifier if possible. |
| AnalyseLayout | Runs page layout analysis in the mode set by PageSegmentationMode. |
| Clear | Free up recognition results and any stored image data, without actually freeing any recognition data that would be time-consuming to reload. Afterwards, you must call SetImage or GetTextFromImage before doing any Recognize or Get* operation. |
| ClearAdaptiveClassifier | Call between pages or documents to free up memory and forget adaptive data. |
| ClearPersistentCache | Clear any library-level memory caches. |
| Create | Creates a handle to the base API interface. |
| Dispose | Releases all resources used by this OcrApi. |
| DumpToPGM | Obsolete. Dump the internal binary image to a PGM file. |
| GetAltoText | Make an XML-formatted string with Alto markup from the internal data structures. |
| GetBoolVariable | Get the value of an internal "parameter." |
| GetBoxText | The recognized text is returned as a string coded in the same format as a box file used in training. |
| GetComponentImages(PageIteratorLevel, Boolean, OcrBoxa, OcrPixa, Int32) | Get the given level kind of components (block, textline, word, etc.) as a leptonica-style Boxa, Pixa pair, in reading order. |
| GetComponentImages(PageIteratorLevel, Boolean, Boolean, Int32, OcrBoxa, OcrPixa, Int32, Int32) | Get the given level kind of components (block, textline, word, etc.) as a leptonica-style Boxa, Pixa pair, in reading order. |
| GetDoubleVariable | Get the value of an internal "parameter." |
| GetHOCRText | Make an HTML-formatted string with hOCR markup from the internal data structures. |
| GetIntVariable | Get the value of an internal "parameter." |
| GetLSTMBoxText | Make a box file for LSTM training from the internal data structures. |
| GetRegions | Get the result of page layout analysis as a leptonica-style Boxa, Pixa pair, in reading order. |
| GetStringVariable | Get the value of an internal "parameter." |
| GetStrips | Get textlines and strips of image regions as a leptonica-style Boxa, Pixa pair, in reading order. Enables downstream handling of non-rectangular regions. |
| GetSymbols | Get the symbols as a leptonica-style Boxa, Pixa pair, in reading order. |
| GetTextFromImage(Bitmap) | Recognize a rectangle from an image and return the result as a string. |
| GetTextFromImage(String) | Recognize a rectangle from an image and return the result as a string. |
| GetTextFromImage(OcrPix) | Recognize a rectangle from an image and return the result as a string. |
| GetTextFromImage(Bitmap, Rectangle) | Recognize a rectangle from an image and return the result as a string. |
| GetTextFromImage(String, Rectangle) | Recognize a rectangle from an image and return the result as a string. |
| GetTextFromImage(OcrPix, Rectangle) | Recognize a rectangle from an image and return the result as a string. |
| GetTextFromImage(Bitmap, Point, Size) | Recognize a rectangle from an image and return the result as a string. |
| GetTextFromImage(String, Point, Size) | Recognize a rectangle from an image and return the result as a string. |
| GetTextFromImage(OcrPix, Point, Size) | Recognize a rectangle from an image and return the result as a string. |
| GetTextFromImage(Bitmap, Int32, Int32, Int32, Int32) | Recognize a rectangle from an image and return the result as a string. |
| GetTextFromImage(String, Int32, Int32, Int32, Int32) | Recognize a rectangle from an image and return the result as a string. |
| GetTextFromImage(OcrPix, Int32, Int32, Int32, Int32) | Recognize a rectangle from an image and return the result as a string. |
| GetTextlines(OcrBoxa, OcrPixa, Int32) | Get the textlines as a leptonica-style Boxa, Pixa pair, in reading order. |
| GetTextlines(Boolean, Int32, OcrBoxa, OcrPixa, Int32, Int32) | Get the textlines as a leptonica-style Boxa, Pixa pair, in reading order. |
| GetTsvText | Make a TSV-formatted string from the internal data structures. |
| GetUNLVText | The recognized text is returned as a string coded in UNLV format (Latin-1) with specific reject and suspect codes. |
| GetUtf8Text | The recognized text is returned as a string encoded as UTF-8. |
| GetWords | Get the words as a leptonica-style Boxa, Pixa pair, in reading order. |
| GetWordStrBoxText | The recognized text is returned as a string coded in the same format as a WordStr box file used in training. |
| Init(Languages, String, OcrEngineMode, String, String, String, Boolean) | Initialize the OCR SDK library. |
| Init(String, String, OcrEngineMode, String, String, String, Boolean, Boolean) | Initialize the OCR SDK library. |
| InitForAnalysePage | Init only for page layout analysis. |
| InitLang | Init only the lang model component of Tesseract. |
| IsValidWord | Check whether a word is valid according to Tesseract's language model. |
| PrintVariablesToFile | Print Tesseract parameters to the given file. |
| ProcessPage | Turn a single image into symbolic text. |
| ProcessPages | Turn images into symbolic text. |
| ReadConfigFiles | Read a "config" file containing a set of parameter name, value pairs. |
| ReadDebugConfigFiles | Same as ReadConfigFiles(String), but sets only debug parameters from the given config file. |
| Recognize | Recognize the image from SetImage, generating Tesseract internal structures. |
| RecognizeForChopTest | Variant of Recognize used for testing the chopper. |
| Release | Close down Tesseract and free up all memory. After Release() has been called, no API function other than Init may be used. |
| SetImage(Bitmap) | Provide an image for Tesseract to recognize. |
| SetImage(OcrPix) | Provide an image for Tesseract to recognize. |
| SetVariable | Set the value of an internal "parameter." |
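
As a rough illustration of how the members listed on this page fit together, the C# sketch below creates the API object with Create, initializes it with Init, and reads text with GetTextFromImage(String). It is a sketch under assumptions, not code taken from this page: the `Patagames.Ocr.Enums` namespace, the `Languages.English` value, the remaining Init parameters being optional, OcrApi implementing IDisposable (suggested by the Dispose member), and the placeholder path `sample.png` are all assumed and should be checked against the Init and Languages reference pages.

```csharp
using System;
using Patagames.Ocr;
using Patagames.Ocr.Enums; // assumed namespace of the Languages enum

class OcrApiSketch
{
    static void Main()
    {
        // Create() returns the API object; the using block calls Dispose
        // when the block exits (assumes OcrApi implements IDisposable).
        using (var api = OcrApi.Create())
        {
            // Assumption: the parameters after the language are optional.
            api.Init(Languages.English);

            // "sample.png" is a placeholder; substitute a real image path.
            string text = api.GetTextFromImage("sample.png");
            Console.WriteLine(text);
        }
    }
}
```

Wrapping the object in a `using` block keeps the native Tesseract handle from outliving the call. When processing many images, reusing one initialized instance and calling Clear between images is likely cheaper than re-running Init each time, since Clear is described as keeping the recognition data loaded.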