SCRIPT_PROPERTIES structure (usp10.h)
Contains information about special processing for each script.
Syntax
typedef struct {
  DWORD langid : 16;
  DWORD fNumeric : 1;
  DWORD fComplex : 1;
  DWORD fNeedsWordBreaking : 1;
  DWORD fNeedsCaretInfo : 1;
  DWORD bCharSet : 8;
  DWORD fControl : 1;
  DWORD fPrivateUseArea : 1;
  DWORD fNeedsCharacterJustify : 1;
  DWORD fInvalidGlyph : 1;
  DWORD fInvalidLogAttr : 1;
  DWORD fCDM : 1;
  DWORD fAmbiguousCharSet : 1;
  DWORD fClusterSizeVaries : 1;
  DWORD fRejectInvalid : 1;
} SCRIPT_PROPERTIES;
Members
langid
Language identifier for the language associated with the script. When a script is used for many languages, this member represents a default language. For example, Western script is represented by LANG_ENGLISH although it is also used for French, German, and other European languages.
fNumeric
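Value indicating if a script contains only digits and the other characters used in writing numbers by the rules of the Unicode bidirectional algorithm. For example, currency symbols, the thousands separator, and the decimal point are classified as numeric when adjacent to or between digits. Possible values for this member are defined in the following table.
Value | Meaning |
---|---|
TRUE | The script contains only digits and the other characters used in writing numbers by the rules of the Unicode bidirectional algorithm. |
FALSE | The script does not contain only digits and the other characters used in writing numbers by the rules of the Unicode bidirectional algorithm. |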
fComplex
Value indicating a complex script for a language that requires special shaping or layout. Possible values are defined in the following table.
Value | Meaning |
---|---|
TRUE | The script requires special shaping or layout. |
FALSE | The script contains no combining characters and requires no contextual shaping or reordering. |
fNeedsWordBreaking
Value indicating the type of word break placement for a language. Possible values are defined in the following table.
Value | Meaning |
---|---|
TRUE | The language has word break placement that requires the application to call ScriptBreak and that includes character positions marked by the fWordStop member in SCRIPT_LOGATTR. |
FALSE | Word break placement is identified by scanning for characters marked by the fWhiteSpace member in SCRIPT_LOGATTR, or for glyphs marked by the value SCRIPT_JUSTIFY_BLANK or SCRIPT_JUSTIFY_ARABIC_BLANK for the uJustification member of SCRIPT_VISATTR. |
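The two strategies in the table can be contrasted in a short, portable sketch. LogAttrSketch is a hypothetical stand-in that mirrors only the fWhiteSpace and fWordStop bits of SCRIPT_LOGATTR, and next_word_start is an illustrative helper, not a Uniscribe function. Treating the first non-white-space character that follows white space as a word start is an assumption of this sketch.

```c
#include <assert.h>

/* Hypothetical stand-in for the two SCRIPT_LOGATTR bits used here. */
typedef struct {
    unsigned int fWhiteSpace : 1;
    unsigned int fWordStop   : 1;
} LogAttrSketch;

/* Returns the index of the next word start after `from`, or cChars if none.
   When fNeedsWordBreaking is nonzero, only fWordStop marks are trusted;
   otherwise a word is assumed to start where white space ends. */
int next_word_start(const LogAttrSketch *attr, int cChars, int from,
                    int fNeedsWordBreaking)
{
    assert(cChars >= 0 && from >= 0);
    for (int i = from + 1; i < cChars; i++) {
        if (fNeedsWordBreaking) {
            if (attr[i].fWordStop)
                return i;
        } else if (attr[i - 1].fWhiteSpace && !attr[i].fWhiteSpace) {
            return i;
        }
    }
    return cChars;
}
```

In a real application the attribute array would come from ScriptBreak, and the flag would be read from the SCRIPT_PROPERTIES entry for the script of the item.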
fNeedsCaretInfo
Value indicating if a language, for example, Thai or an Indic language, restricts caret placement to cluster boundaries. To determine valid caret positions, the application inspects the fCharStop value in the logical attributes retrieved by ScriptBreak, or compares adjacent values in the pwLogClust array retrieved by ScriptShape. Possible values are defined in the following table.
Value | Meaning |
---|---|
TRUE | The language restricts caret placement to cluster boundaries. |
FALSE | The language does not restrict caret placement to cluster boundaries. |
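The comparison of adjacent pwLogClust values mentioned above can be sketched as follows. This is a minimal, portable illustration rather than Uniscribe code: mark_cluster_starts is a hypothetical helper, and the input is assumed to hold one WORD-sized cluster value per character, as ScriptShape produces.

```c
#include <assert.h>

/* Hypothetical helper: sets isStart[i] to 1 when character i begins a
   cluster (a valid caret position) and to 0 otherwise. A character begins
   a cluster when its cluster value differs from the previous character's.
   Returns the number of cluster starts found. */
int mark_cluster_starts(const unsigned short *logClust, int cChars,
                        unsigned char *isStart)
{
    int count = 0;
    assert(cChars >= 0);
    for (int i = 0; i < cChars; i++) {
        isStart[i] = (unsigned char)(i == 0 || logClust[i] != logClust[i - 1]);
        count += isStart[i];
    }
    return count;
}
```

Because the test is for inequality rather than for an increase, the same loop should work whether cluster values rise or fall along the run.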
bCharSet
Nominal character set associated with the script. During creation of a font suitable for displaying the script, this character set can be used as the value of the lfCharSet member of LOGFONT.
For a new script having no character set defined, the application should typically set bCharSet to DEFAULT_CHARSET. See the description of member fAmbiguousCharSet.
fControl
Value indicating if the script consists only of control characters. Note that not every control character ends up in a SCRIPT_CONTROL structure. Possible values are defined in the following table.
Value | Meaning |
---|---|
TRUE | The script contains only control characters. |
FALSE | The script does not contain only control characters. |
fPrivateUseArea
Value indicating the use of a private use area, a special set of characters that is privately defined for the Unicode range U+E000 through U+F8FF. Possible values are defined in the following table.
Value | Meaning |
---|---|
TRUE | Use a private use area. |
FALSE | Do not use a private use area. |
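For reference, the range quoted above can be tested with a one-line check. The helper name is_private_use is hypothetical, and the sketch assumes the input is a single UTF-16 code unit.

```c
#include <assert.h>

/* Returns 1 if the UTF-16 code unit lies in U+E000 through U+F8FF,
   the private use range named in the member description. */
int is_private_use(unsigned int codeUnit)
{
    assert(codeUnit <= 0xFFFFu);  /* assumption: one UTF-16 code unit */
    return codeUnit >= 0xE000u && codeUnit <= 0xF8FFu;
}
```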
fNeedsCharacterJustify
Value indicating if justification for the script is achieved by increasing the space between all letters, not just the spaces between words. When performing inter-character justification, Uniscribe inserts extra space only after glyphs marked with the SCRIPT_JUSTIFY_CHARACTER value for the uJustification member of SCRIPT_VISATTR. Possible values are defined in the following table.
Value | Meaning |
---|---|
TRUE | Use character justification. |
FALSE | Do not use character justification. |
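As a rough illustration of inter-character justification, the sketch below spreads extra width over the glyphs whose justification class matches a caller-supplied value (in real code, SCRIPT_JUSTIFY_CHARACTER from the uJustification member of SCRIPT_VISATTR). It does not claim to show how Uniscribe is implemented: spread_extra_width and its parameters are hypothetical, and in practice the ScriptJustify function performs this work.

```c
#include <assert.h>

/* Hypothetical helper: adds `extra` units of width to the advances of the
   glyphs whose class equals `charClass`, as evenly as possible.
   Returns the number of glyphs that received space. */
int spread_extra_width(const unsigned char *justClass, int cGlyphs,
                       unsigned char charClass, int extra, int *advance)
{
    int marked = 0;
    assert(cGlyphs >= 0 && extra >= 0);
    for (int i = 0; i < cGlyphs; i++)
        if (justClass[i] == charClass)
            marked++;
    if (marked == 0)
        return 0;                     /* nothing to widen */

    int base = extra / marked;        /* share given to every marked glyph */
    int remainder = extra % marked;   /* leftover units go to the first glyphs */
    for (int i = 0; i < cGlyphs; i++) {
        if (justClass[i] != charClass)
            continue;
        advance[i] += base + (remainder > 0 ? 1 : 0);
        if (remainder > 0)
            remainder--;
    }
    return marked;
}
```

Widening a glyph's advance places the added space after that glyph in a left-to-right run, which matches the behavior described for glyphs marked SCRIPT_JUSTIFY_CHARACTER.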
fInvalidGlyph
Value indicating if ScriptShape generates an invalid glyph for a script to represent invalid sequences. The application can obtain the glyph index of the invalid glyph for a particular font by calling ScriptGetFontProperties. Possible values are defined in the following table.
Value | Meaning |
---|---|
TRUE | Generate an invalid glyph to represent invalid sequences. |
FALSE | Do not generate an invalid glyph to represent invalid sequences. |
fInvalidLogAttr
Value indicating if ScriptBreak marks invalid combinations for a script by setting fInvalid in the logical attributes buffer. Possible values are defined in the following table.
Value | Meaning |
---|---|
TRUE | Mark invalid combinations for the script. |
FALSE | Do not mark invalid combinations for the script. |
fCDM
Value indicating if a script contains an item that has been analyzed by ScriptItemize as including Combining Diacritical Marks (U+0300 through U+036F). Possible values are defined in the following table.
Value | Meaning |
---|---|
TRUE | The script contains an item that includes combining diacritical marks. |
FALSE | The script does not contain an item that includes combining diacritical marks. |
fAmbiguousCharSet
Value indicating if a script contains characters that are supported by more than one character set. Possible values are defined in the following table.
Value | Meaning |
---|---|
TRUE | The script contains characters that are supported by more than one character set. In this case, the bCharSet member of this structure should be ignored, and the lfCharSet member of LOGFONT should be set to DEFAULT_CHARSET. See the Remarks section for more information. |
FALSE | The script does not contain characters that are supported by more than one character set. |
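The rule in the table can be expressed as a small selection function. This is a sketch only: pick_lf_charset is a hypothetical helper, and DEFAULT_CHARSET is defined locally with the value used in wingdi.h so that the fragment compiles without the Windows headers.

```c
#include <assert.h>

#ifndef DEFAULT_CHARSET
#define DEFAULT_CHARSET 1   /* same value as in wingdi.h */
#endif

/* Hypothetical helper: chooses the value to store in LOGFONT.lfCharSet.
   When the script's character set is ambiguous, bCharSet is ignored and
   DEFAULT_CHARSET is used instead. */
unsigned char pick_lf_charset(unsigned int bCharSet,
                              unsigned int fAmbiguousCharSet)
{
    assert(bCharSet <= 0xFFu);   /* bCharSet is an 8-bit field */
    return fAmbiguousCharSet ? (unsigned char)DEFAULT_CHARSET
                             : (unsigned char)bCharSet;
}
```

An application that needs a more specific character set for an ambiguous run would still have to do the further processing described in the Remarks section.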
fClusterSizeVaries
Value indicating if a script, such as Arabic, might use contextual shaping that causes a string to increase in size during removal of characters. Possible values are defined in the following table.
Value | Meaning |
---|---|
TRUE | Use a variable cluster size for contextual shaping. |
FALSE | Do not use a variable cluster size for contextual shaping. |
fRejectInvalid
Value indicating if a script, for example, Thai, should reject invalid sequences that conventionally cause an editor program, such as Notepad, to beep and ignore keystrokes. Possible values are defined in the following table.
Value | Meaning |
---|---|
TRUE | Reject invalid sequences. |
FALSE | Do not reject invalid sequences. |
Remarks
This structure is filled by the ScriptGetProperties function.
Many Uniscribe scripts do not correspond directly to 8-bit character sets. When some of the characters in a script are supported by more than one character set, the fAmbiguousCharSet member is set. The application should do further processing to determine the character set to use when requesting a font suitable for the run. For example, it might determine that the run consists of multiple languages and split the run so that a different font is used for each language.
The application uses the following code during initialization to get a pointer to the SCRIPT_PROPERTIES array.
const SCRIPT_PROPERTIES **ppScriptProperties; // Array of pointers to properties
int iMaxScript;
HRESULT hr;
hr = ScriptGetProperties(&ppScriptProperties, &iMaxScript);
Then the application can inspect the properties of the script of an item as shown in the next example.
hr = ScriptItemize(pwcInChars, cInChars, cMaxItems, psControl, psState, pItems, pcItems);
//...
if (ppScriptProperties[pItems[iItem].a.eScript]->fNeedsCaretInfo)
{
    // Use ScriptBreak to restrict the caret from entering clusters (for example).
}
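Because the script index of an item is used directly as a subscript into the array of pointers, a defensive lookup may be worthwhile. The following portable sketch shows one way to do that. ScriptPropsSketch is a hypothetical stand-in that mirrors the bit fields of SCRIPT_PROPERTIES so that the fragment compiles without usp10.h, and lookup_props and needs_script_break are illustrative helpers. Treating the value returned through the second parameter of ScriptGetProperties as the number of entries is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in mirroring the SCRIPT_PROPERTIES bit fields. */
typedef struct {
    unsigned int langid : 16;
    unsigned int fNumeric : 1;
    unsigned int fComplex : 1;
    unsigned int fNeedsWordBreaking : 1;
    unsigned int fNeedsCaretInfo : 1;
    unsigned int bCharSet : 8;
    unsigned int fControl : 1;
    unsigned int fPrivateUseArea : 1;
    unsigned int fNeedsCharacterJustify : 1;
    unsigned int fInvalidGlyph : 1;
    unsigned int fInvalidLogAttr : 1;
    unsigned int fCDM : 1;
    unsigned int fAmbiguousCharSet : 1;
    unsigned int fClusterSizeVaries : 1;
    unsigned int fRejectInvalid : 1;
} ScriptPropsSketch;

/* Returns the entry for eScript, or NULL if the index is out of range. */
const ScriptPropsSketch *lookup_props(const ScriptPropsSketch *const *table,
                                      int cScripts, int eScript)
{
    assert(cScripts >= 0);
    if (table == NULL || eScript < 0 || eScript >= cScripts)
        return NULL;
    return table[eScript];
}

/* Nonzero when the script relies on ScriptBreak output for either
   caret placement or word break placement. */
int needs_script_break(const ScriptPropsSketch *props)
{
    return props != NULL &&
           (props->fNeedsCaretInfo || props->fNeedsWordBreaking);
}
```

With the real API, the table and count would come from ScriptGetProperties and the index from the eScript member of the SCRIPT_ANALYSIS structure of each item.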
Requirements
Requirement | Value |
---|---|
Minimum supported client | Windows 2000 Professional [desktop apps only] |
Minimum supported server | Windows 2000 Server [desktop apps only] |
Header | usp10.h |