Class CharScript
java.lang.Object
org.apache.fop.complexscripts.util.CharScript
Script-related utilities.
This work was originally authored by Glenn Adams (gadams@apache.org).
-
Field Summary
Modifier and Type  Field  Description
static final int  SCRIPT_ARABIC  Arabic script constant
static final int  SCRIPT_BENGALI  Bengali script constant
static final int  SCRIPT_BENGALI_2  Bengali 2 script constant
static final int  SCRIPT_BOPOMOFO  Bopomofo script constant
static final int  SCRIPT_BURMESE  Burmese script constant
static final int  SCRIPT_CYRILLIC  Cyrillic script constant
static final int  SCRIPT_DEVANAGARI  Devanagari script constant
static final int  SCRIPT_DEVANAGARI_2  Devanagari 2 script constant
static final int  SCRIPT_ETHIOPIC  Ethiopic script constant
static final int  SCRIPT_GEORGIAN  Georgian script constant
static final int  SCRIPT_GREEK  Greek script constant
static final int  SCRIPT_GUJARATI  Gujarati script constant
static final int  SCRIPT_GUJARATI_2  Gujarati 2 script constant
static final int  SCRIPT_GURMUKHI  Gurmukhi script constant
static final int  SCRIPT_GURMUKHI_2  Gurmukhi 2 script constant
static final int  SCRIPT_HAN  Han script constant
static final int  SCRIPT_HANGUL  Hangul script constant
static final int  SCRIPT_HEBREW  Hebrew script constant
static final int  SCRIPT_HIRAGANA  Hiragana script constant
static final int  SCRIPT_KANNADA  Kannada script constant
static final int  SCRIPT_KANNADA_2  Kannada 2 script constant
static final int  SCRIPT_KATAKANA  Katakana script constant
static final int  SCRIPT_KHMER  Khmer script constant
static final int  SCRIPT_LAO  Lao script constant
static final int  SCRIPT_LATIN  Latin script constant
static final int  SCRIPT_MALAYALAM  Malayalam script constant
static final int  SCRIPT_MALAYALAM_2  Malayalam 2 script constant
static final int  SCRIPT_MATH  Math script constant
static final int  SCRIPT_MONGOLIAN  Mongolian script constant
static final int  SCRIPT_ORIYA  Oriya script constant
static final int  SCRIPT_ORIYA_2  Oriya 2 script constant
static final int  SCRIPT_SINHALESE  Sinhalese script constant
static final int  SCRIPT_SYMBOL  Symbol script constant
static final int  SCRIPT_TAMIL  Tamil script constant
static final int  SCRIPT_TAMIL_2  Tamil 2 script constant
static final int  SCRIPT_TELUGU  Telugu script constant
static final int  SCRIPT_TELUGU_2  Telugu 2 script constant
static final int  SCRIPT_THAI  Thai script constant
static final int  SCRIPT_TIBETAN  Tibetan script constant
static final int  SCRIPT_UNCODED  Uncoded script constant
static final int  SCRIPT_UNDETERMINED  Undetermined script constant
-
Method Summary
Modifier and Type  Method
    Description
static int  dominantScript(CharSequence cs)
    Determine the dominant script of a character sequence.
static boolean  isArabic(int c)
    Determine if character c belongs to the Arabic script.
static boolean  isBengali(int c)
    Determine if character c belongs to the Bengali script.
static boolean  isBopomofo(int c)
    Determine if character c belongs to the Bopomofo script.
static boolean  isBurmese(int c)
    Determine if character c belongs to the Burmese script.
static boolean  isCyrillic(int c)
    Determine if character c belongs to the Cyrillic script.
static boolean  isDevanagari(int c)
    Determine if character c belongs to the Devanagari script.
static boolean  isDigit(int c)
    Determine if character c is a digit.
static boolean  isEthiopic(int c)
    Determine if character c belongs to the Ethiopic (Amharic) script.
static boolean  isGeorgian(int c)
    Determine if character c belongs to the Georgian script.
static boolean  isGreek(int c)
    Determine if character c belongs to the Greek script.
static boolean  isGujarati(int c)
    Determine if character c belongs to the Gujarati script.
static boolean  isGurmukhi(int c)
    Determine if character c belongs to the Gurmukhi script.
static boolean  isHan(int c)
    Determine if character c belongs to the Han (unified CJK) script.
static boolean  isHangul(int c)
    Determine if character c belongs to the Hangul script.
static boolean  isHebrew(int c)
    Determine if character c belongs to the Hebrew script.
static boolean  isHiragana(int c)
    Determine if character c belongs to the Hiragana script.
static boolean  isIndicScript(int script)
    Determine if script code denotes an 'Indic' script, where a script is an 'Indic' script if it is intended to be processed by the generic 'Indic' Script Processor.
static boolean  isIndicScript(String script)
    Determine if script tag denotes an 'Indic' script, where a script is an 'Indic' script if it is intended to be processed by the generic 'Indic' Script Processor.
static boolean  isKannada(int c)
    Determine if character c belongs to the Kannada script.
static boolean  isKatakana(int c)
    Determine if character c belongs to the Katakana script.
static boolean  isKhmer(int c)
    Determine if character c belongs to the Khmer script.
static boolean  isLao(int c)
    Determine if character c belongs to the Lao script.
static boolean  isLatin(int c)
    Determine if character c belongs to the Latin script.
static boolean  isMalayalam(int c)
    Determine if character c belongs to the Malayalam script.
static boolean  isMongolian(int c)
    Determine if character c belongs to the Mongolian script.
static boolean  isOriya(int c)
    Determine if character c belongs to the Oriya script.
static boolean  isPunctuation(int c)
    Determine if character c is punctuation.
static boolean  isSinhalese(int c)
    Determine if character c belongs to the Sinhalese script.
static boolean  isTamil(int c)
    Determine if character c belongs to the Tamil script.
static boolean  isTelugu(int c)
    Determine if character c belongs to the Telugu script.
static boolean  isThai(int c)
    Determine if character c belongs to the Thai script.
static boolean  isTibetan(int c)
    Determine if character c belongs to the Tibetan script.
static int  scriptCodeFromTag(String tag)
    Determine the internal script code associated with a script tag.
static int  scriptOf(int c)
    Obtain the ISO 15924 numeric script code of a character.
static int[]  scriptsOf(CharSequence cs)
    Obtain the script codes of each character in a character sequence.
static String  scriptTagFromCode(int code)
    Determine the script tag associated with an internal script code.
static int  useV2IndicRules(int sc)
    Obtain the V2 Indic script code corresponding to V1 Indic script code SC if and only if V2 Indic rules apply; otherwise return SC.
-
Field Details
-
SCRIPT_HEBREW
public static final int SCRIPT_HEBREW
Hebrew script constant
-
SCRIPT_MONGOLIAN
public static final int SCRIPT_MONGOLIAN
Mongolian script constant
-
SCRIPT_ARABIC
public static final int SCRIPT_ARABIC
Arabic script constant
-
SCRIPT_GREEK
public static final int SCRIPT_GREEK
Greek script constant
-
SCRIPT_LATIN
public static final int SCRIPT_LATIN
Latin script constant
-
SCRIPT_CYRILLIC
public static final int SCRIPT_CYRILLIC
Cyrillic script constant
-
SCRIPT_GEORGIAN
public static final int SCRIPT_GEORGIAN
Georgian script constant
-
SCRIPT_BOPOMOFO
public static final int SCRIPT_BOPOMOFO
Bopomofo script constant
-
SCRIPT_HANGUL
public static final int SCRIPT_HANGUL
Hangul script constant
-
SCRIPT_GURMUKHI
public static final int SCRIPT_GURMUKHI
Gurmukhi script constant
-
SCRIPT_GURMUKHI_2
public static final int SCRIPT_GURMUKHI_2
Gurmukhi 2 script constant
-
SCRIPT_DEVANAGARI
public static final int SCRIPT_DEVANAGARI
Devanagari script constant
-
SCRIPT_DEVANAGARI_2
public static final int SCRIPT_DEVANAGARI_2
Devanagari 2 script constant
-
SCRIPT_GUJARATI
public static final int SCRIPT_GUJARATI
Gujarati script constant
-
SCRIPT_GUJARATI_2
public static final int SCRIPT_GUJARATI_2
Gujarati 2 script constant
-
SCRIPT_BENGALI
public static final int SCRIPT_BENGALI
Bengali script constant
-
SCRIPT_BENGALI_2
public static final int SCRIPT_BENGALI_2
Bengali 2 script constant
-
SCRIPT_ORIYA
public static final int SCRIPT_ORIYA
Oriya script constant
-
SCRIPT_ORIYA_2
public static final int SCRIPT_ORIYA_2
Oriya 2 script constant
-
SCRIPT_TIBETAN
public static final int SCRIPT_TIBETAN
Tibetan script constant
-
SCRIPT_TELUGU
public static final int SCRIPT_TELUGU
Telugu script constant
-
SCRIPT_TELUGU_2
public static final int SCRIPT_TELUGU_2
Telugu 2 script constant
-
SCRIPT_KANNADA
public static final int SCRIPT_KANNADA
Kannada script constant
-
SCRIPT_KANNADA_2
public static final int SCRIPT_KANNADA_2
Kannada 2 script constant
-
SCRIPT_TAMIL
public static final int SCRIPT_TAMIL
Tamil script constant
-
SCRIPT_TAMIL_2
public static final int SCRIPT_TAMIL_2
Tamil 2 script constant
-
SCRIPT_MALAYALAM
public static final int SCRIPT_MALAYALAM
Malayalam script constant
-
SCRIPT_MALAYALAM_2
public static final int SCRIPT_MALAYALAM_2
Malayalam 2 script constant
-
SCRIPT_SINHALESE
public static final int SCRIPT_SINHALESE
Sinhalese script constant
-
SCRIPT_BURMESE
public static final int SCRIPT_BURMESE
Burmese script constant
-
SCRIPT_THAI
public static final int SCRIPT_THAI
Thai script constant
-
SCRIPT_KHMER
public static final int SCRIPT_KHMER
Khmer script constant
-
SCRIPT_LAO
public static final int SCRIPT_LAO
Lao script constant
-
SCRIPT_HIRAGANA
public static final int SCRIPT_HIRAGANA
Hiragana script constant
-
SCRIPT_ETHIOPIC
public static final int SCRIPT_ETHIOPIC
Ethiopic script constant
-
SCRIPT_HAN
public static final int SCRIPT_HAN
Han script constant
-
SCRIPT_KATAKANA
public static final int SCRIPT_KATAKANA
Katakana script constant
-
SCRIPT_MATH
public static final int SCRIPT_MATH
Math script constant
-
SCRIPT_SYMBOL
public static final int SCRIPT_SYMBOL
Symbol script constant
-
SCRIPT_UNDETERMINED
public static final int SCRIPT_UNDETERMINED
Undetermined script constant
-
SCRIPT_UNCODED
public static final int SCRIPT_UNCODED
Uncoded script constant
-
-
Method Details
-
isPunctuation
public static boolean isPunctuation(int c)
Determine if character c is punctuation.
Parameters:
c - a character represented as a Unicode scalar value
Returns:
true if the character is punctuation
-
isDigit
public static boolean isDigit(int c)
Determine if character c is a digit.
Parameters:
c - a character represented as a Unicode scalar value
Returns:
true if the character is a digit
-
isHebrew
public static boolean isHebrew(int c)
Determine if character c belongs to the Hebrew script.
Parameters:
c - a character represented as a Unicode scalar value
Returns:
true if the character belongs to the Hebrew script
-
isMongolian
public static boolean isMongolian(int c)
Determine if character c belongs to the Mongolian script.
Parameters:
c - a character represented as a Unicode scalar value
Returns:
true if the character belongs to the Mongolian script
-
isArabic
public static boolean isArabic(int c)
Determine if character c belongs to the Arabic script.
Parameters:
c - a character represented as a Unicode scalar value
Returns:
true if the character belongs to the Arabic script
-
isGreek
public static boolean isGreek(int c)
Determine if character c belongs to the Greek script.
Parameters:
c - a character represented as a Unicode scalar value
Returns:
true if the character belongs to the Greek script
-
isLatin
public static boolean isLatin(int c)
Determine if character c belongs to the Latin script.
Parameters:
c - a character represented as a Unicode scalar value
Returns:
true if the character belongs to the Latin script
-
isCyrillic
public static boolean isCyrillic(int c)
Determine if character c belongs to the Cyrillic script.
Parameters:
c - a character represented as a Unicode scalar value
Returns:
true if the character belongs to the Cyrillic script
-
isGeorgian
public static boolean isGeorgian(int c)
Determine if character c belongs to the Georgian script.
Parameters:
c - a character represented as a Unicode scalar value
Returns:
true if the character belongs to the Georgian script
-
isHangul
public static boolean isHangul(int c)
Determine if character c belongs to the Hangul script.
Parameters:
c - a character represented as a Unicode scalar value
Returns:
true if the character belongs to the Hangul script
-
isGurmukhi
public static boolean isGurmukhi(int c)
Determine if character c belongs to the Gurmukhi script.
Parameters:
c - a character represented as a Unicode scalar value
Returns:
true if the character belongs to the Gurmukhi script
-
isDevanagari
public static boolean isDevanagari(int c)
Determine if character c belongs to the Devanagari script.
Parameters:
c - a character represented as a Unicode scalar value
Returns:
true if the character belongs to the Devanagari script
-
isGujarati
public static boolean isGujarati(int c)
Determine if character c belongs to the Gujarati script.
Parameters:
c - a character represented as a Unicode scalar value
Returns:
true if the character belongs to the Gujarati script
-
isBengali
public static boolean isBengali(int c)
Determine if character c belongs to the Bengali script.
Parameters:
c - a character represented as a Unicode scalar value
Returns:
true if the character belongs to the Bengali script
-
isOriya
public static boolean isOriya(int c)
Determine if character c belongs to the Oriya script.
Parameters:
c - a character represented as a Unicode scalar value
Returns:
true if the character belongs to the Oriya script
-
isTibetan
public static boolean isTibetan(int c)
Determine if character c belongs to the Tibetan script.
Parameters:
c - a character represented as a Unicode scalar value
Returns:
true if the character belongs to the Tibetan script
-
isTelugu
public static boolean isTelugu(int c)
Determine if character c belongs to the Telugu script.
Parameters:
c - a character represented as a Unicode scalar value
Returns:
true if the character belongs to the Telugu script
-
isKannada
public static boolean isKannada(int c)
Determine if character c belongs to the Kannada script.
Parameters:
c - a character represented as a Unicode scalar value
Returns:
true if the character belongs to the Kannada script
-
isTamil
public static boolean isTamil(int c)
Determine if character c belongs to the Tamil script.
Parameters:
c - a character represented as a Unicode scalar value
Returns:
true if the character belongs to the Tamil script
-
isMalayalam
public static boolean isMalayalam(int c)
Determine if character c belongs to the Malayalam script.
Parameters:
c - a character represented as a Unicode scalar value
Returns:
true if the character belongs to the Malayalam script
-
isSinhalese
public static boolean isSinhalese(int c)
Determine if character c belongs to the Sinhalese script.
Parameters:
c - a character represented as a Unicode scalar value
Returns:
true if the character belongs to the Sinhalese script
-
isBurmese
public static boolean isBurmese(int c)
Determine if character c belongs to the Burmese script.
Parameters:
c - a character represented as a Unicode scalar value
Returns:
true if the character belongs to the Burmese script
-
isThai
public static boolean isThai(int c)
Determine if character c belongs to the Thai script.
Parameters:
c - a character represented as a Unicode scalar value
Returns:
true if the character belongs to the Thai script
-
isKhmer
public static boolean isKhmer(int c)
Determine if character c belongs to the Khmer script.
Parameters:
c - a character represented as a Unicode scalar value
Returns:
true if the character belongs to the Khmer script
-
isLao
public static boolean isLao(int c)
Determine if character c belongs to the Lao script.
Parameters:
c - a character represented as a Unicode scalar value
Returns:
true if the character belongs to the Lao script
-
isEthiopic
public static boolean isEthiopic(int c)
Determine if character c belongs to the Ethiopic (Amharic) script.
Parameters:
c - a character represented as a Unicode scalar value
Returns:
true if the character belongs to the Ethiopic (Amharic) script
-
isHan
public static boolean isHan(int c)
Determine if character c belongs to the Han (unified CJK) script.
Parameters:
c - a character represented as a Unicode scalar value
Returns:
true if the character belongs to the Han (unified CJK) script
-
isBopomofo
public static boolean isBopomofo(int c)
Determine if character c belongs to the Bopomofo script.
Parameters:
c - a character represented as a Unicode scalar value
Returns:
true if the character belongs to the Bopomofo script
-
isHiragana
public static boolean isHiragana(int c)
Determine if character c belongs to the Hiragana script.
Parameters:
c - a character represented as a Unicode scalar value
Returns:
true if the character belongs to the Hiragana script
-
isKatakana
public static boolean isKatakana(int c)
Determine if character c belongs to the Katakana script.
Parameters:
c - a character represented as a Unicode scalar value
Returns:
true if the character belongs to the Katakana script
scriptOf
public static int scriptOf(int c)
Obtain the ISO 15924 numeric script code of a character. If the script is not, or cannot be, determined, the script code 998 ('zyyy') is returned.
Parameters:
c - the character whose script is to be obtained
Returns:
an ISO 15924 script code
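To make the 998 fallback concrete, the following standalone sketch maps a code point to an ISO 15924 numeric code using the JDK's Character.UnicodeScript as a stand-in classifier. It is not the CharScript implementation: the class Iso15924Sketch, the method scriptCodeOf, and the five-entry table are assumptions for illustration only. The numeric codes themselves (125, 160, 200, 215, 220, 998) are the published ISO 15924 values.

```java
import java.lang.Character.UnicodeScript;
import java.util.EnumMap;
import java.util.Map;

final class Iso15924Sketch {
    // ISO 15924 numeric code for 'Zyyy' (undetermined script).
    static final int ZYYY = 998;

    // Deliberately small table; a real mapping would cover many more scripts.
    private static final Map<UnicodeScript, Integer> CODES =
        new EnumMap<>(UnicodeScript.class);
    static {
        CODES.put(UnicodeScript.HEBREW, 125);
        CODES.put(UnicodeScript.ARABIC, 160);
        CODES.put(UnicodeScript.GREEK, 200);
        CODES.put(UnicodeScript.LATIN, 215);
        CODES.put(UnicodeScript.CYRILLIC, 220);
    }

    // Returns the numeric code for c, or 998 when the script is not in the table.
    static int scriptCodeOf(int c) {
        if (!Character.isValidCodePoint(c)) {
            return ZYYY; // UnicodeScript.of would throw for such input
        }
        return CODES.getOrDefault(UnicodeScript.of(c), ZYYY);
    }

    public static void main(String[] args) {
        System.out.println(scriptCodeOf('A')); // prints "215"
        System.out.println(scriptCodeOf('1')); // prints "998"
    }
}
```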
-
useV2IndicRules
public static int useV2IndicRules(int sc)
Obtain the V2 Indic script code corresponding to V1 Indic script code SC if and only if V2 Indic rules apply; otherwise return SC.
Parameters:
sc - a V1 Indic script code
Returns:
either SC or the V2 flavor of SC if V2 Indic rules apply
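The "V2 flavor or unchanged" contract can be pictured with a small lookup. This is a sketch under stated assumptions, not the library's logic: the class IndicV2Sketch, the method toV2IfEnabled, the boolean v2RulesApply parameter, and every numeric code below are hypothetical. This page does not state the values of the SCRIPT_* constants or how the class decides whether V2 rules apply.

```java
import java.util.HashMap;
import java.util.Map;

final class IndicV2Sketch {
    // Hypothetical codes for illustration; NOT the values of the SCRIPT_* constants.
    static final int DEVANAGARI = 1;
    static final int DEVANAGARI_2 = 2;
    static final int BENGALI = 3;
    static final int BENGALI_2 = 4;
    static final int LATIN = 5;

    private static final Map<Integer, Integer> V1_TO_V2 = new HashMap<>();
    static {
        V1_TO_V2.put(DEVANAGARI, DEVANAGARI_2);
        V1_TO_V2.put(BENGALI, BENGALI_2);
    }

    // Returns the V2 code when V2 rules apply and a V2 flavor exists; otherwise sc.
    static int toV2IfEnabled(int sc, boolean v2RulesApply) {
        if (!v2RulesApply) {
            return sc;
        }
        return V1_TO_V2.getOrDefault(sc, sc);
    }

    public static void main(String[] args) {
        System.out.println(toV2IfEnabled(DEVANAGARI, true));  // prints "2"
        System.out.println(toV2IfEnabled(DEVANAGARI, false)); // prints "1"
    }
}
```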
-
scriptsOf
public static int[] scriptsOf(CharSequence cs)
Obtain the script codes of each character in a character sequence. If the script is not, or cannot be, determined for some character, the script code 998 ('zyyy') is returned for it.
Parameters:
cs - the character sequence
Returns:
a (possibly empty) array of script codes
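As a rough picture of collecting scripts over a whole sequence, the sketch below walks the code points of a CharSequence and gathers the scripts it sees. It is a standalone stand-in built on the JDK's Character.UnicodeScript, not a call into CharScript. The names ScriptsOfSketch and scriptsIn are hypothetical, and returning each script once is an assumption: this page does not say whether the real method collapses duplicates.

```java
import java.lang.Character.UnicodeScript;
import java.util.EnumSet;
import java.util.Set;

final class ScriptsOfSketch {
    // Returns the distinct scripts that occur in cs (empty array for empty input).
    static UnicodeScript[] scriptsIn(CharSequence cs) {
        Set<UnicodeScript> seen = EnumSet.noneOf(UnicodeScript.class);
        // codePoints() yields int scalar values, joining surrogate pairs.
        cs.codePoints().forEach(cp -> seen.add(UnicodeScript.of(cp)));
        return seen.toArray(new UnicodeScript[0]);
    }

    public static void main(String[] args) {
        // Two Latin letters followed by one Cyrillic letter.
        for (UnicodeScript s : scriptsIn("ab\u0416")) {
            System.out.println(s);
        }
    }
}
```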
-
dominantScript
public static int dominantScript(CharSequence cs)
Determine the dominant script of a character sequence.
Parameters:
cs - the character sequence
Returns:
the dominant script or SCRIPT_UNDETERMINED
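One plausible reading of "dominant" is "most frequent". The sketch below counts scripts per code point and returns the most common one, falling back to an "unknown" value in the same spirit as SCRIPT_UNDETERMINED. It is a standalone illustration built on Character.UnicodeScript: the names DominantScriptSketch and dominant are hypothetical, and both the decision to ignore COMMON, INHERITED and UNKNOWN characters and the tie-breaking order are assumptions rather than documented behavior of CharScript.

```java
import java.lang.Character.UnicodeScript;
import java.util.EnumMap;
import java.util.Map;

final class DominantScriptSketch {
    // Returns the most frequent script in cs, or UNKNOWN when none qualifies.
    static UnicodeScript dominant(CharSequence cs) {
        Map<UnicodeScript, Integer> counts = new EnumMap<>(UnicodeScript.class);
        cs.codePoints().forEach(cp -> {
            UnicodeScript s = UnicodeScript.of(cp);
            // Assumption: digits, spaces, punctuation and combining marks do not vote.
            if (s != UnicodeScript.COMMON
                    && s != UnicodeScript.INHERITED
                    && s != UnicodeScript.UNKNOWN) {
                counts.merge(s, 1, Integer::sum);
            }
        });
        UnicodeScript best = UnicodeScript.UNKNOWN;
        int bestCount = 0;
        for (Map.Entry<UnicodeScript, Integer> e : counts.entrySet()) {
            if (e.getValue() > bestCount) { // first script in enum order wins ties
                best = e.getKey();
                bestCount = e.getValue();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(dominant("abc \u0416")); // prints "LATIN"
        System.out.println(dominant("123 !?"));     // prints "UNKNOWN"
    }
}
```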
-
isIndicScript
public static boolean isIndicScript(String script)
Determine if script tag denotes an 'Indic' script, where a script is an 'Indic' script if it is intended to be processed by the generic 'Indic' Script Processor.
Parameters:
script - a script tag
Returns:
true if script tag is a designated 'Indic' script
-
isIndicScript
public static boolean isIndicScript(int script)
Determine if script code denotes an 'Indic' script, where a script is an 'Indic' script if it is intended to be processed by the generic 'Indic' Script Processor.
Parameters:
script - a script code
Returns:
true if script code is a designated 'Indic' script
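A tag-based check of this kind usually reduces to set membership. The sketch below is standalone and hedged accordingly: the names IndicTagSketch and isIndicTag are hypothetical, and the tag set is an assumption built from the OpenType tags of the nine scripts that have a "2" variant in the field list above. The set the library actually designates as 'Indic' is not given on this page and may differ.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

final class IndicTagSketch {
    // Assumption: OpenType script tags (old and "2" variants) for nine scripts.
    private static final Set<String> INDIC_TAGS = new HashSet<>(Arrays.asList(
        "beng", "bng2", "deva", "dev2", "gujr", "gjr2",
        "guru", "gur2", "knda", "knd2", "mlym", "mlm2",
        "orya", "ory2", "taml", "tml2", "telu", "tel2"));

    // Returns true when tag is in the assumed set; null is treated as "not Indic".
    static boolean isIndicTag(String tag) {
        return tag != null && INDIC_TAGS.contains(tag);
    }

    public static void main(String[] args) {
        System.out.println(isIndicTag("dev2")); // prints "true"
        System.out.println(isIndicTag("latn")); // prints "false"
    }
}
```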
-
scriptTagFromCode
public static String scriptTagFromCode(int code)
Determine the script tag associated with an internal script code.
Parameters:
code - the script code
Returns:
a script tag
-
scriptCodeFromTag
public static int scriptCodeFromTag(String tag)
Determine the internal script code associated with a script tag.
Parameters:
tag - the script tag
Returns:
a script code
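The two conversions above are inverses of each other, which suggests a pair of maps kept in sync. The following standalone sketch shows that shape only. The names ScriptTagMapSketch, codeFromTag and tagFromCode are hypothetical, the three registered pairs are illustrative (four-letter tags with ISO 15924 numbers), and the fallback to 998 / "zyyy" for unknown input is an assumption: this page does not say what the real methods return in that case.

```java
import java.util.HashMap;
import java.util.Map;

final class ScriptTagMapSketch {
    static final int UNDETERMINED = 998; // assumed fallback code ('zyyy')

    private static final Map<String, Integer> CODE_BY_TAG = new HashMap<>();
    private static final Map<Integer, String> TAG_BY_CODE = new HashMap<>();

    // Registers one pair in both directions so the maps cannot drift apart.
    private static void register(String tag, int code) {
        CODE_BY_TAG.put(tag, code);
        TAG_BY_CODE.put(code, tag);
    }

    static {
        register("hebr", 125);
        register("arab", 160);
        register("latn", 215);
    }

    static int codeFromTag(String tag) {
        return CODE_BY_TAG.getOrDefault(tag, UNDETERMINED);
    }

    static String tagFromCode(int code) {
        return TAG_BY_CODE.getOrDefault(code, "zyyy");
    }

    public static void main(String[] args) {
        System.out.println(codeFromTag("latn")); // prints "215"
        System.out.println(tagFromCode(160));    // prints "arab"
    }
}
```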
-