CLASSIFIER_TESTER

Section:   (1)
Updated: 01/11/2023
Index Return to Main Contents
 

NAME

classifier_tester - for *legacy tesseract* engine.  

SYNOPSIS

classifier_tester -U unicharset_file -F font_properties_file -X xheights_file -classifier x -lang lang [-output_trainer trainer] *.tr  

DESCRIPTION

classifier_tester(1) runs Tesseract in a special mode. It takes a list of .tr files and tests a character classifier on data as formatted for training, but it doesn't have to be the same as the training data.  

IN/OUT ARGUMENTS

a list of .tr files  

OPTIONS

-l lang

(Input) three character language code; default value eng.

-classifier x

(Input) One of "pruner", "full".

-U unicharset

(Input) The unicharset for the language.

-F font_properties_file

(Input) font properties file, each line is of the following form, where each field other than the font name is 0 or 1:

*font_name* *italic* *bold* *fixed_pitch* *serif* *fraktur*

-X xheights_file

(Input) x heights file, each line is of the following form, where xheight is calculated as the pixel x height of a character drawn at 32pt on 300 dpi. [ That is, if base x height + ascenders + descenders = 133, how much is x height? ]

*font_name* *xheight*

-output_trainer trainer

(Output, Optional) Filename for output trainer.
 

SEE ALSO

tesseract(1)  

COPYING

Copyright (C) 2012 Google, Inc. Licensed under the Apache License, Version 2.0  

AUTHOR

The Tesseract OCR engine was written by Ray Smith and his research groups at Hewlett Packard (1985-1995) and Google (2006-present).


 

Index

NAME
SYNOPSIS
DESCRIPTION
IN/OUT ARGUMENTS
OPTIONS
SEE ALSO
COPYING
AUTHOR

This document was created by man2html, using the manual pages.
Time: 01:10:55 GMT, April 24, 2024