classifier_tester(1) - Man pages

dwww Home | Manual pages | Find package
CLASSIFIER_TESTER(1)                                      CLASSIFIER_TESTER(1)

NAME
       classifier_tester - for *legacy tesseract* engine.

SYNOPSIS
       classifier_tester -U unicharset_file -F font_properties_file -X
       xheights_file -classifier x -lang lang [-output_trainer trainer] *.tr

DESCRIPTION
       classifier_tester(1) runs Tesseract in a special mode. It takes a list
       of .tr files and tests a character classifier on data as formatted for
       training, but it doesn’t have to be the same as the training data.

IN/OUT ARGUMENTS
       a list of .tr files

OPTIONS
       -l lang
           (Input) three character language code; default value eng.

       -classifier x
           (Input) One of "pruner", "full".

       -U unicharset
           (Input) The unicharset for the language.

       -F font_properties_file
           (Input) font properties file, each line is of the following form,
           where each field other than the font name is 0 or 1:

               *font_name* *italic* *bold* *fixed_pitch* *serif* *fraktur*

       -X xheights_file
           (Input) x heights file, each line is of the following form, where
           xheight is calculated as the pixel x height of a character drawn at
           32pt on 300 dpi. [ That is, if base x height + ascenders +
           descenders = 133, how much is x height? ]

               *font_name* *xheight*

       -output_trainer trainer
           (Output, Optional) Filename for output trainer.

SEE ALSO
       tesseract(1)

COPYING
       Copyright (C) 2012 Google, Inc. Licensed under the Apache License,
       Version 2.0

AUTHOR
       The Tesseract OCR engine was written by Ray Smith and his research
       groups at Hewlett Packard (1985-1995) and Google (2006-present).

                                  01/11/2023              CLASSIFIER_TESTER(1)
Generated by dwww version 1.15 on Fri Jun 28 16:07:43 CEST 2024.