
Precomposed character




From Wikipedia, the free encyclopedia

A precomposed character (alternatively decomposable character) is a Unicode entity that can be decomposed into an equivalent string of other characters. Typically, a precomposed character decomposes into a base character and one or more combining diacritical marks.

Precomposed characters are included in Unicode to aid computer systems with incomplete Unicode support, where the equivalent decomposed sequences may render incorrectly.

Similarly, ligatures are precompositions of their constituent letters or graphemes.
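As a small illustration of the ligature case, the sketch below uses Python's standard unicodedata module to inspect the "fi" ligature (U+FB01). It assumes nothing beyond the standard library; the constant name LIGATURE_FI is chosen here for illustration. Note that Unicode records this ligature as a compatibility decomposition rather than a canonical one, so only the compatibility normalization form (NFKD) splits it into its letters.

```python
import unicodedata

# Illustrative constant: LATIN SMALL LIGATURE FI
LIGATURE_FI = "\ufb01"

# Raw decomposition mapping from the Unicode Character Database.
# The "<compat>" tag marks a compatibility (not canonical) decomposition.
mapping = unicodedata.decomposition(LIGATURE_FI)

# Canonical decomposition (NFD) leaves the ligature untouched,
# while compatibility decomposition (NFKD) yields the letters "f" and "i".
canonical = unicodedata.normalize("NFD", LIGATURE_FI)
compat = unicodedata.normalize("NFKD", LIGATURE_FI)

print(mapping)                   # <compat> 0066 0069
print(canonical == LIGATURE_FI)  # True
print(compat)                    # fi
```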

For example, the two strings

ḱṷṓn (U+006B U+0301 U+0075 U+032D U+006F U+0304 U+0301 U+006E) and
ḱṷṓn (U+1E31 U+1E77 U+1E53 U+006E)

are canonically equivalent and should render identically. In practice, however, some Unicode implementations still have difficulty rendering the decomposed sequences correctly.
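The equivalence of the two strings can be checked with Unicode normalization. The following is a minimal sketch using Python's standard unicodedata module; the names DECOMPOSED, PRECOMPOSED and code_points are illustrative and not part of any API. The strings are written with escape sequences so that the code points are exactly those listed above, regardless of how an editor or web page may have normalized the text.

```python
import unicodedata

# The two example strings, spelled out code point by code point.
DECOMPOSED = "\u006b\u0301\u0075\u032d\u006f\u0304\u0301\u006e"
PRECOMPOSED = "\u1e31\u1e77\u1e53\u006e"


def code_points(text: str) -> str:
    """Format a string as a space-separated list of U+XXXX code points."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


# A plain comparison sees two different code point sequences.
print(DECOMPOSED == PRECOMPOSED)  # False

# NFC composes wherever a precomposed character exists;
# NFD fully decomposes into base characters and combining marks.
composed = unicodedata.normalize("NFC", DECOMPOSED)
decomposed = unicodedata.normalize("NFD", PRECOMPOSED)

print(code_points(composed))    # U+1E31 U+1E77 U+1E53 U+006E
print(code_points(decomposed))  # U+006B U+0301 U+0075 U+032D U+006F U+0304 U+0301 U+006E
```

Comparing strings only after normalizing both to the same form (NFC or NFD) is the usual way to treat such canonically equivalent strings as equal.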

OpenType has the ccmp feature tag (glyph composition/decomposition), which fonts use to define glyphs that are compositions or decompositions involving combining characters.

In theory, most Chinese characters as encoded by Han unification and similar schemes could be treated as precomposed characters, since they can be decomposed into their constituent strokes or described by ideographic description sequences. Unicode does not take this approach. Doing so could potentially reduce the number of characters in the character set from tens of thousands to just a few hundred. On the other hand, each character would then need several code points, so documents encoded this way would be considerably larger in bytes, perhaps as much as tenfold, than the same text in Unicode.
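The size trade-off can be illustrated with a single character. The sketch below compares the UTF-8 size of 好 (U+597D) with that of an ideographic description sequence for it, ⿰女子 (U+2FF0 U+5973 U+5B50). This is only a rough illustration under stated assumptions: the description breaks the character into two components rather than individual strokes, so a true stroke-level encoding would be longer still, and the names WHOLE, DESCRIBED and utf8_size are made up for this example.

```python
# Illustrative constants: one encoded ideograph versus a description of it.
WHOLE = "\u597d"                # the character as a single code point
DESCRIBED = "\u2ff0\u5973\u5b50"  # left-to-right operator + two components


def utf8_size(text: str) -> int:
    """Return the number of bytes the text occupies in UTF-8."""
    return len(text.encode("utf-8"))


print(len(WHOLE), utf8_size(WHOLE))          # 1 3
print(len(DESCRIBED), utf8_size(DESCRIBED))  # 3 9
```

Even at the component level the described form takes three times the space here, and the two strings are not treated as equivalent by Unicode normalization.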
