In Written Chinese, components are building blocks of characters, composed of strokes. In most cases, a component consists of more than one stroke, and is smaller than the whole of the character. For example, the character consists of two components: and . These can be further decomposed: can be analyzed as the sequence of strokes, and as the sequence .
There are two methods of Chinese character component analysis: hierarchical dividing and plane dividing. Hierarchical dividing separates a character layer by layer, from larger to smaller components, until the primitive components are reached. Plane dividing separates out the primitive components in a single step.
The structure of a Chinese character is the pattern by which the character is formed from its first-level components. Chinese character structures include the single-component, left-right, up-down, and surrounding structures.
Chinese characters may be analyzed in terms of smaller components. This analysis is generally based on graphical forms, without considering aspects like pronunciation and meaning.
Component analysis is very helpful for learning Chinese characters. For example:
Through component analysis, one may learn characters in an easier way. If a student learns first, the knowledge will help with the learning or review of,, and . Obviously, learning by component analysis is much more efficient than learning by analyzing each character to strokes. Component analysis is also used in Chinese character encoding for computer input.
There are two methods for dividing Chinese characters into components: hierarchical dividing and plane dividing. Hierarchical dividing separates a character layer by layer, from larger to smaller components, until the primitive components are reached. Plane dividing separates out the primitive components in a single step. Hierarchical dividing displays the external structure of a character, while plane dividing can be regarded as omitting the intermediate levels and writing out the final result, the primitive components, directly.
The rules for hierarchical dividing include:
The hierarchical analysis of the character 戇 can be written in (1) bracketed representation:
戇(贛(章(立+早(曰+十))+(⿱夂貢)(夂+貢(工+貝(目+八))))+心), with 5 layers of components,
or (2) in tree structure:
戇
├─ 贛
│  ├─ 章
│  │  ├─ 立
│  │  └─ 早
│  │     ├─ 曰
│  │     └─ 十
│  └─ (⿱夂貢)
│     ├─ 夂
│     └─ 貢
│        ├─ 工
│        └─ 貝
│           ├─ 目
│           └─ 八
└─ 心
The level to which a Chinese character is to be analyzed or divided depends on actual applications.
In plane analysis, only the components on the leaves of the tree are presented, i.e., 戇: 立, 曰, 十, 夂, 工, 目, 八, 心.
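The relationship between the two methods can be sketched in Python. This is a minimal illustration, not a standard implementation: the table `DECOMPOSITION` and the function names `plane_divide` and `depth` are hypothetical, and the data is transcribed from this article's analysis of 戇. Plane dividing is modeled as collecting the leaves of the hierarchical tree.

```python
# Hierarchical decomposition of 戇, transcribed from this article's example.
# Each key is a compound component; its value lists the parts one level down.
# Components that never appear as keys are treated as primitive (an assumption
# of this sketch, not a rule taken from any standard).
DECOMPOSITION = {
    "戇": ["贛", "心"],
    "贛": ["章", "⿱夂貢"],
    "章": ["立", "早"],
    "早": ["曰", "十"],
    "⿱夂貢": ["夂", "貢"],
    "貢": ["工", "貝"],
    "貝": ["目", "八"],
}


def plane_divide(component, table=DECOMPOSITION):
    """Return the primitive components (tree leaves) in left-to-right order."""
    parts = table.get(component)
    if parts is None:  # not divisible any further: a primitive component
        return [component]
    leaves = []
    for part in parts:
        leaves.extend(plane_divide(part, table))
    return leaves


def depth(component, table=DECOMPOSITION):
    """Return the number of component layers below the given component."""
    parts = table.get(component)
    if parts is None:
        return 0
    return 1 + max(depth(part, table) for part in parts)


print(plane_divide("戇"))
print(depth("戇"))
```

Keeping only the parent-to-parts table means the plane result is derived from the hierarchical one rather than stored separately, which mirrors the description of plane dividing as omitting the higher levels.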
The following is the component analysis data for Cihai (辭海), with a character set of 16,339 traditional and simplified Chinese characters.
Component level | Distinct components | Total components |
---|---|---|
1 | 3061 | 32065 |
2 | 1302 | 34296 |
3 | 539 | 16777 |
4 | 195 | 3872 |
5 | 48 | 396 |
6 | 12 | 184 |
7 | 3 | 6 |
A component that can independently form a character is a character component, or a component of independent character formation (成字部件). For example, the component 口 forms the character 口 by itself and is a component of the characters 另, 洁 and 唱; the component 相 is also a character by itself and a component of 湘, 箱 and 想.
A component that cannot independently form a character is a non-character component, or a component of dependent character formation (非成字部件). Examples are the component 冂 in the characters 同, 钢 and 岗, and the component 疒 in 疾, 病 and 痛. Neither 冂 nor 疒 is a character in modern Chinese.
A component that cannot be divided further into smaller components by the rules is a primitive component, or basic component (基礎部件, 基础部件). Primitive components are the final-level components of hierarchical dividing. Examples are the components 田 and 力 in the character 男, and 氵 in the character 河.
A component composed of two or more primitive components is a compound component (合成部件). Examples are the component 咅 (立+口) in the characters 陪, 部 and 菩, and the component 相 (木+目) in 厢, 霜 and 孀.
A component divided out at the first level is called a level-one component, a component divided out at the second level is called a level-two component, and so on. A component divided out at the final level is called a final-level component, i.e., a primitive component. For example, for the character 戇:
戇
├─ 贛 (level-one component)
│  ├─ 章 (level-two component)
│  │  ├─ 立 (level-three component)
│  │  └─ 早 (level-three component)
│  │     ├─ 曰 (level-four component)
│  │     └─ 十 (level-four component)
│  └─ (⿱夂貢) (level-two component)
│     ├─ 夂 (level-three component)
│     └─ 貢 (level-three component)
│        ├─ 工 (level-four component)
│        └─ 貝 (level-four component)
│           ├─ 目 (level-five component)
│           └─ 八 (level-five component)
└─ 心 (level-one component)
where the leaf components 立, 曰, 十, 夂, 工, 目, 八 and 心 are the final-level, or primitive, components.
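The level numbering can be reproduced with a breadth-first walk over the same decomposition. The sketch below is illustrative only: `TREE` and `components_by_level` are hypothetical names, and the data is again transcribed from the 戇 example above.

```python
# Parent -> parts table for 戇, transcribed from the example above.
TREE = {
    "戇": ["贛", "心"],
    "贛": ["章", "⿱夂貢"],
    "章": ["立", "早"],
    "早": ["曰", "十"],
    "⿱夂貢": ["夂", "貢"],
    "貢": ["工", "貝"],
    "貝": ["目", "八"],
}


def components_by_level(character, table=TREE):
    """Map each level number (1, 2, ...) to the components divided out there."""
    levels = {}
    frontier = [character]
    level = 0
    while True:
        # Divide every component on the current level one step further.
        next_frontier = []
        for component in frontier:
            next_frontier.extend(table.get(component, []))
        if not next_frontier:  # nothing left to divide
            return levels
        level += 1
        levels[level] = next_frontier
        frontier = next_frontier


for level, components in components_by_level("戇").items():
    print(level, components)
```

Note that a primitive component such as 心 or 立 appears only on the level where it is first divided out; it is not repeated on deeper levels.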
A component formed by one stroke is called a single-stroke component. Examples are the stroke 一 in the character 丛, the stroke ㇑ in 引, the stroke ㇓ in 系, the stroke ㇔ in 良, and the stroke ㇆ in 司.
A component formed by more than one stroke is called a multi-stroke component. Examples are the component 从 in the character 丛, 弓 in 引, and 艮 in 良.
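The three distinctions above (character vs. non-character, primitive vs. compound, single-stroke vs. multi-stroke) are independent of one another, which a small data model makes explicit. This is a sketch under stated assumptions: the `Component` class and its fields are hypothetical, and the stroke counts and the treatment of 冂 as primitive are supplied for illustration rather than taken from a standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    form: str
    strokes: int         # stroke count (illustrative values)
    is_character: bool   # can the component stand alone as a character?
    parts: tuple = ()    # primitive parts; empty for a primitive component

    def labels(self):
        """Classify the component along the three axes described above."""
        return (
            "character component" if self.is_character else "non-character component",
            "compound" if self.parts else "primitive",
            "single-stroke" if self.strokes == 1 else "multi-stroke",
        )


examples = [
    Component("口", 3, True),
    Component("相", 9, True, parts=("木", "目")),
    Component("冂", 2, False),   # assumed primitive for this sketch
    Component("一", 1, True),
]

for component in examples:
    print(component.form, component.labels())
```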
Among the 16,339 characters in Cihai (traditional, simplified and unsimplified), there are 675 primitive components; among the 11,834 characters that remain after excluding the traditional characters that have been simplified, there are 648 primitive components. In the Chinese Character Information Dictionary, a total of 623 primitive components have been divided out from the 7,785 Mainland China standard characters.
Serial number | Component | Characters composed | Frequency |
---|---|---|---|
1 | 口 | 2409 | 20.3579% |
2 | 一 | 1279 | 10.8089% |
3 | 艹 | 812 | 6.8622% |
4 | 木 | 791 | 6.6841% |
5 | 人 | 774 | 6.5404% |
6 | 日 | 766 | 6.4736% |
7 | 氵 | 691 | 5.8391% |
8 | 亻 | 679 | 5.7383% |
9 | 八 | 642 | 5.4252% |
10 | 土 | 597 | 5.0457% |
(Divided from 11,834 simplified and unsimplified characters from Cihai).
Chinese character components are widely used in keyboard encoding input methods for Chinese characters. Different input methods separate components in different ways, so it is necessary to formulate norms or standards for Chinese character components.
"Chinese Character Component Standard of GB13000.1 Character Set for Information Processing" (信息处理用GB13000.1字符集汉字部件规范) is a standard released on February 1, 1997, by the National Language Commission of China. It includes a "List of Chinese Character Primitive Components" containing 560 primitive components. All 20,902 CJK Chinese characters in the GB13000.1 character set can be formed from these components. This standard is mainly intended for Chinese information processing.
Another important standard is the "Specification of Common Modern Chinese Character Components and Component Names" (現代常用字部件及部件名稱規範), formulated by the National Language Commission in 2009. It includes a list of 514 primitive components of commonly used characters, together with their names. This standard is mainly intended for Chinese character education and dictionary collation.
The rules for component naming include the following:
If the component is a character, it is called by that character's reading, for example 口 (kǒu) and 土 (tǔ). If the character has more than one pronunciation, the more common one is used: the component 中 is called zhōng, not zhòng.
If the component is not a character but already has a name, the existing name is used, for example 扌 (tí shǒu, 提手) and 宀 (bǎo gài, 宝盖). If the component has more than one name, the more commonly used one is chosen: 彳 is called shuāng lì rén (双立人) rather than shuāngrén páng (双人旁).
For a component without a name, a colloquial and reasonable name should be determined. One way is to refer to the component by its position in a common character, for example "the head of the character 青" (龶, 青字头) and "the frame of the character 国" (囗, 国字框).
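The naming rules form a simple cascade, which can be sketched as a lookup that tries each rule in turn. Everything below is hypothetical scaffolding: the three tables and the function `component_name` are invented for illustration, they hold only the examples quoted above, and each list is ordered with the more common reading or name first.

```python
# Rule 1: components that are characters, with readings (most common first).
CHARACTER_READINGS = {
    "口": ["kǒu"],
    "土": ["tǔ"],
    "中": ["zhōng", "zhòng"],
}

# Rule 2: non-character components with existing names (most common first).
EXISTING_NAMES = {
    "扌": ["tí shǒu"],
    "宀": ["bǎo gài"],
    "彳": ["shuāng lì rén", "shuāngrén páng"],
}

# Rule 3: unnamed components, described by (host character, position word).
POSITION_IN_CHARACTER = {
    "龶": ("青", "头"),   # head of 青
    "囗": ("国", "框"),   # frame of 国
}


def component_name(component):
    """Return a name for the component by applying the three rules in order."""
    if component in CHARACTER_READINGS:
        return CHARACTER_READINGS[component][0]
    if component in EXISTING_NAMES:
        return EXISTING_NAMES[component][0]
    if component in POSITION_IN_CHARACTER:
        host, position = POSITION_IN_CHARACTER[component]
        return f"{host}字{position}"
    raise KeyError(f"no naming rule covers {component!r}")


for c in ["中", "彳", "龶", "囗"]:
    print(c, component_name(c))
```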
The structure of a Chinese character is the pattern by which the character is formed from its first-level components. Chinese character structures include the single-component structure, left-right structure, up-down structure and surrounding structure.
The principles of first-level structure analysis can be extended to other levels. For example, the character 部 has a left-right structure, and its left component in turn has an up-down structure.
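Applying structure analysis at every level yields a nested description of the character. The sketch below records the first-level structure of each component and expands it recursively, writing the result with the Unicode ideographic description characters ⿰ (left to right) and ⿱ (above to below). The table `STRUCTURE` and the function `describe` are hypothetical; the division of 咅 into 立 and 口 follows this article, while taking 阝 as the right component of 部 is an assumption added for the example.

```python
# First-level structure of each component: (structure name, parts in order).
STRUCTURE = {
    "部": ("left-right", ("咅", "阝")),   # right part 阝 assumed for this sketch
    "咅": ("up-down", ("立", "口")),
}

# Unicode ideographic description characters for the two structures used here.
IDC = {"left-right": "⿰", "up-down": "⿱"}


def describe(component, table=STRUCTURE):
    """Expand a component into a prefix-notation description sequence."""
    entry = table.get(component)
    if entry is None:  # single-component structure: nothing to expand
        return component
    structure, parts = entry
    return IDC[structure] + "".join(describe(part, table) for part in parts)


print(describe("咅"))
print(describe("部"))
```

Prefix notation keeps the nesting unambiguous without brackets, because each structure symbol used here takes exactly two operands.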
Sometimes, to make a glyph better balanced and more attractive, a component needs to change form according to its position in the character. Components can be deformed in two ways:
Stroke deformation includes the following situations:
The narrowing or flattening of components serves to make the structure of the whole character harmonious and well proportioned. Take 犬 (dog) as an example:
Pianpang (偏旁, piānpáng) and radicals are components.
Originally, the left side of a combined Chinese character was called pian and the right side pang. Nowadays it is customary to refer to the left and right, upper and lower, and outer and inner parts of combined characters as pianpang. The pianpang analysis of combined characters is therefore similar to first-level component analysis. Pianpang generally carry sound or meaning information and are called the "sound side" (or "sound symbol") and the "meaning side" (or "meaning symbol") respectively.
Radicals are components used for sorting and retrieving Chinese characters. Based on glyph structure, a component shared by a group of characters is taken as the basis for sorting and looking up those characters; such components are called radicals. In pictophonetic characters, the radical is usually the pianpang that represents the meaning.
Hu Qiaomu said: "The (primitive) components of Chinese characters should be reduced, and as many components as possible should be independent characters; those that cannot be characters should be universal and easy to say. This may be more important than reducing the number of strokes and characters. Some simplified characters have added new components of Chinese characters, for example '书农长' and so on. Although the traditional character 農 has more strokes, it is very clear to say: '曲+辰 農'. When we simplify Chinese characters, we should avoid new unspeakable and uncommon components."
Components are important structural units of Chinese characters. Optimizing the components of Chinese characters to make them more concise, standardized, and easy to learn and use is an important task for Chinese character optimization, and there is a long way to go.