The YES stroke alphabetical order (一二三漢字筆順排檢法), also called YES stroke-order sorting, or simply YES order or YES sorting, is a Chinese character sorting method based on a stroke alphabet and stroke orders. It is a simplified stroke-based sorting method that requires no stroke counting or grouping.[1] [2]
"YES" in the English name is an acronym of "Yi Er San", the pinyin romanization of the Chinese name. The Chinese name "Yi Er San" (一二三; literally "one, two, three") in turn consists of the first three Chinese characters in YES order (because the stroke "一" lies at the top of the alphabet).
YES order has been applied to the indexing of the Xinhua Character Dictionary and the Xiandai Hanyu Word Dictionary. In this joint index, the user can look up a Chinese character alphabetically to find its pinyin and Unicode code point, in addition to its page numbers in the two popular dictionaries.[3]
In the Oxford Advanced Learner's Dictionary, the word alphabet is defined as "a set of letters or symbols in a fixed order used for writing a language".[4] The YES "alphabet" is a list of Chinese character strokes in the order of ㇐ ㇕ ㇅ ㇎ ㇡ ㇋ ㇊ ㇍ ㇈ 乙 ㇆ ㇇ ㇌ ㇀ ㇑ ㇗ ㇞ ㇉ ㄣ ㇙ ㇄ ㇟ ㇚ ㇓ ㇜ ㇛ ㇢ ㇔ ㇏ ㇂ for writing and sorting Chinese. This stroke alphabet is built on the basis of Unicode CJK Strokes[5] and the Standard of Chinese Character Bending Strokes of the GB13000.1 Character Set.[6] There are 30 strokes in total, sorted by the standard basic stroke order of “heng (橫, 横, 一), ti (提, ㇀), shu (豎, 竖, 丨), pie (撇, 丿), dian (點, 点, 丶), na (捺, ㇏)” and the bending point order of “zhe (折), wan (彎, 弯) and gou (鉤, 钩)”.
The English name of each stroke is formed from the initial pinyin letters of the characters in its Chinese name, similar to the naming of CJK strokes in Unicode, i.e., H: heng, T: ti/tiao, S: shu, P: pie, D: dian, N: na; z: zhe, w: wan and g: gou.
| Stroke | English name | Chinese name | Example |
|---|---|---|---|
| ㇐ | H | | |
| ㇕ | HzS | | • Second stroke of • First stroke of |
| ㇅ | HzSzH | | • Second stroke of |
| ㇎ | HzSzHzS | | • Fourth stroke of |
| ㇡ | HzSzHzSg | | • First stroke of • Fifth stroke of |
| ㇋ | HzSzHzP | | • Second stroke of • Fifth stroke of |
| ㇊ | HzSzT | | • Second stroke of • Second stroke of |
| ㇍ | HzSwH | | • Second stroke of • Fifth stroke of |
| ㇈ | HzSwHg | | • Second stroke of • Last stroke of |
| ㇆ | HzSg | | • Second stroke of • First stroke of |
| ㇇(乛) | HzP | | • First stroke of • Third stroke of |
| ㇌ | HzPzPg | | • First stroke of • Ninth stroke of |
| | HzNg | | • First stroke of • Second stroke of |
| ㇀ | T | | • Third stroke of • Third stroke of • Third stroke of |
| ㇑ | S | | • Second stroke of • Second stroke of |
| ㇗(㇜) | SzH | | • Second stroke of • Second stroke of |
| ㇞ | SzHzS | | • First stroke of • Fourth stroke of |
| ㇉ | SzHzSg | | • Second stroke of • Third stroke of |
| ㄣ | SzHzP | | • Third stroke of • Seventh stroke of |
| ㇙ | SzT | | • Third stroke of • First stroke of |
| ㇄ | SwH | | • Fourth stroke of • Fifth stroke of |
| ㇟ | SwHg | | • Third stroke of • Last stroke of • Second stroke of |
| ㇚ | Sg | | • First stroke of • Second stroke of |
| ㇓ | P | | • First stroke of • First stroke of • First stroke of |
| ㇜ | PzT | | • Sixth stroke of • First and second strokes of |
| ㇛ | PzD | | • First stroke of • First, second and third strokes of |
| ㇢ | Pg | | • Second stroke of • First stroke of |
| ㇔ | D | | • First and second strokes of • First and second strokes of |
| ㇏(〇) | N | | • Second stroke of • Last stroke of • Last stroke of |
| ㇂(㇃) | Ng | | • Second stroke of • Fourth stroke of • Second stroke of in Regular font |
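The naming scheme in the table can be illustrated with a short Python sketch that expands an English stroke name into its pinyin syllables. The function name `expand_stroke_name` is hypothetical and introduced only for this illustration; the letter meanings are the ones listed above, with T read as "ti".

```python
# Letter meanings as given in the text: capital letters are basic strokes,
# lowercase letters are bending points.
BASIC = {"H": "heng", "T": "ti", "S": "shu", "P": "pie", "D": "dian", "N": "na"}
BEND = {"z": "zhe", "w": "wan", "g": "gou"}


def expand_stroke_name(name: str) -> str:
    """Expand an English stroke name such as 'HzSg' into pinyin syllables."""
    parts = []
    for letter in name:
        if letter in BASIC:
            parts.append(BASIC[letter])
        elif letter in BEND:
            parts.append(BEND[letter])
        else:
            raise ValueError(f"unknown letter {letter!r} in stroke name {name!r}")
    return "-".join(parts)


print(expand_stroke_name("HzSg"))  # heng-zhe-shu-gou
print(expand_stroke_name("SwHg"))  # shu-wan-heng-gou
```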
Strokes are the most basic writing units of Chinese.[7] Chinese characters are written stroke by stroke in a fixed order. The standard stroke orders of Taiwan and mainland China are quite similar.[8] [9] [10] For example, the stroke orders of the distinct characters in "一二三笔顺排检法 一二三筆順排檢法" are: 一 (一) 二 (一 一) 三 (一 一 一) 笔 (㇓㇐㇔㇓㇐㇔㇓㇐㇐㇟) 顺 (㇓㇑㇑㇐㇓㇑㇕㇓㇔) 排 (㇐㇚㇀㇑㇐㇐㇐㇑㇐㇐㇐) 检 (㇐㇑㇓㇔㇓㇏㇐㇔㇔㇓㇐) 法 (㇔㇔㇀㇐㇑㇐㇜㇔) 筆 (㇓㇐㇔㇓㇐㇔㇕㇐㇐㇐㇐㇑) 順 (㇓㇑㇑㇐㇓㇑㇕㇐㇐㇐㇓㇔) 檢 (㇐㇑㇓㇔㇓㇏㇐㇑㇕㇐㇑㇕㇐㇓㇔㇓㇔),
where the stroke order of each character is given as a string of strokes in parentheses. In the rare cases where more than one glyph or stroke order exists for a Chinese character, the current implementations of YES follow the fonts and stroke orders in the Standard of GB13000.1 Character Set Chinese Character Order (Stroke-Based Order),[11] because this standard covers all 20,902 Unicode CJK characters and has a larger user population. In theory, any stroke order standard can be used with YES.
With knowledge of the stroke alphabet and stroke order, the user is ready to sort (or look up) Chinese characters and words alphabetically.
To arrange two Chinese characters in YES order, the user follows the same rules as in Latin alphabetical order. First, compare the first strokes of the two characters' stroke orders. If they differ, arrange the characters according to the strokes' order in the alphabet; for example, "土 (㇐㇑㇐)" comes before "日 (㇑㇕㇐㇐)", because the initial stroke "㇐" precedes the initial stroke "㇑" in the alphabet. If the first strokes are the same, compare the second strokes, and so on, until a pair of differing strokes is found, and order the characters accordingly; for example, "土 (㇐㇑㇐)" comes before "木 (㇐㇑㇓㇏)" because the third stroke "㇐" precedes "㇓". If one character runs out of strokes while all the strokes compared so far are the same, the character with the shorter stroke order string comes first; for example, "二 (一 一)" comes before "三 (一 一 一)".
The YES order of the different characters in "一二三笔顺排检法 一二三筆順排檢法" is: 一 (一) 二 (一 一) 三 (一 一 一) 檢 (㇐㇑㇓㇔㇓㇏㇐㇑㇕㇐㇑㇕㇐㇓㇔㇓㇔) 检 (㇐㇑㇓㇔㇓㇏㇐㇔㇔㇓㇐) 排 (㇐㇚㇀㇑㇐㇐㇐㇑㇐㇐㇐) 筆 (㇓㇐㇔㇓㇐㇔㇕㇐㇐㇐㇐㇑) 笔 (㇓㇐㇔㇓㇐㇔㇓㇐㇐㇟) 順 (㇓㇑㇑㇐㇓㇑㇕㇐㇐㇐㇓㇔) 顺 (㇓㇑㇑㇐㇓㇑㇕㇓㇔) 法 (㇔㇔㇀㇐㇑㇐㇜㇔).
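The comparison rule amounts to ordinary lexicographic comparison of stroke strings under the YES alphabet, which can be sketched in Python as follows. This is a minimal illustration and not the implementation used in any published index. The names `YES_ALPHABET`, `RANK`, `STROKE_ORDER` and `yes_key` are hypothetical, and the stroke orders are the ones quoted above, with heng written as ㇐ throughout.

```python
# The 30-stroke YES alphabet, copied from the list given earlier.
YES_ALPHABET = "㇐ ㇕ ㇅ ㇎ ㇡ ㇋ ㇊ ㇍ ㇈ 乙 ㇆ ㇇ ㇌ ㇀ ㇑ ㇗ ㇞ ㇉ ㄣ ㇙ ㇄ ㇟ ㇚ ㇓ ㇜ ㇛ ㇢ ㇔ ㇏ ㇂".split()
RANK = {stroke: position for position, stroke in enumerate(YES_ALPHABET)}

# Stroke orders quoted in the text (only these characters are covered).
STROKE_ORDER = {
    "一": "㇐",
    "二": "㇐㇐",
    "三": "㇐㇐㇐",
    "土": "㇐㇑㇐",
    "木": "㇐㇑㇓㇏",
    "日": "㇑㇕㇐㇐",
    "笔": "㇓㇐㇔㇓㇐㇔㇓㇐㇐㇟",
    "顺": "㇓㇑㇑㇐㇓㇑㇕㇓㇔",
    "排": "㇐㇚㇀㇑㇐㇐㇐㇑㇐㇐㇐",
    "检": "㇐㇑㇓㇔㇓㇏㇐㇔㇔㇓㇐",
    "法": "㇔㇔㇀㇐㇑㇐㇜㇔",
    "筆": "㇓㇐㇔㇓㇐㇔㇕㇐㇐㇐㇐㇑",
    "順": "㇓㇑㇑㇐㇓㇑㇕㇐㇐㇐㇓㇔",
    "檢": "㇐㇑㇓㇔㇓㇏㇐㇑㇕㇐㇑㇕㇐㇓㇔㇓㇔",
}


def yes_key(char):
    """Map a character to the tuple of alphabet positions of its strokes."""
    return tuple(RANK[stroke] for stroke in STROKE_ORDER[char])


# Tuples compare element by element, and a proper prefix sorts first,
# which matches the rules described in the text.
print("".join(sorted("日木土", key=yes_key)))  # 土木日
print("".join(sorted("三二", key=yes_key)))  # 二三
print("".join(sorted("一二三笔顺排检法筆順檢", key=yes_key)))  # 一二三檢检排筆笔順顺法
```

Using a tuple of alphabet positions as the sort key means no stroke counting is involved: the length of a stroke string matters only when one string is a prefix of the other.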
Words of multiple characters are sorted by their first characters in YES order. If the first characters are the same, the second characters are compared, and so on. Non-Chinese characters come after Chinese characters, in alphabetical/Unicode order.[12] For example: 覺, 覺醒, 觉, 觉醒, 觉悟, B超, T恤.
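Word-level sorting can be sketched by building one key per character and comparing the resulting sequences. The sketch below is an illustration under stated assumptions: `char_key` and `word_key` are hypothetical names, the small stroke order table holds only characters whose stroke orders are quoted in this article, non-Chinese characters are ranked by code point, and the sample word list is made up for the demonstration.

```python
YES_ALPHABET = "㇐ ㇕ ㇅ ㇎ ㇡ ㇋ ㇊ ㇍ ㇈ 乙 ㇆ ㇇ ㇌ ㇀ ㇑ ㇗ ㇞ ㇉ ㄣ ㇙ ㇄ ㇟ ㇚ ㇓ ㇜ ㇛ ㇢ ㇔ ㇏ ㇂".split()
RANK = {stroke: position for position, stroke in enumerate(YES_ALPHABET)}

# Illustrative table; a real index would need stroke orders for every character.
STROKE_ORDER = {"三": "㇐㇐㇐", "土": "㇐㇑㇐", "木": "㇐㇑㇓㇏", "日": "㇑㇕㇐㇐"}


def char_key(char):
    """Chinese characters sort first (tier 0), everything else after (tier 1)."""
    if char in STROKE_ORDER:
        return (0, tuple(RANK[stroke] for stroke in STROKE_ORDER[char]))
    if "\u4e00" <= char <= "\u9fff":
        # A Chinese character without stroke data cannot be placed correctly.
        raise KeyError(f"no stroke order on file for {char!r}")
    return (1, (ord(char),))  # assumption: non-Chinese characters by code point


def word_key(word):
    """Compare words character by character, first character first."""
    return tuple(char_key(char) for char in word)


words = ["T", "日", "土木", "B", "土", "三日"]
print(sorted(words, key=word_key))  # ['三日', '土', '土木', '日', 'B', 'T']
```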
YES order has been applied to the compilation of several books and lists, including:
Compared with traditional stroke-based sorting, the advantages of YES include: (a) no stroke counting or grouping (such as into the five heng-shu-pie-dian-zhe groups) is needed; (b) it uses stroke alphabetical order; (c) it is free of labelling.[18]
According to experimental results, YES's one-tiered stroke-order sorting is more accurate than the traditional two-tiered sorting by stroke count and then stroke order. For example, in the traditional method, the 9 characters of "" cannot be ordered, because they all have 3 strokes and share the same stroke order code 354 (pie-zhe-dian, 撇-折-点, ㇓㇕㇔).[11] The YES method can sort them into 6 groups "". The code duplicating rate (重码率) of the traditional method on the set of 20,902 CJK characters is 10.31%; in YES order it is reduced to 2.75%. The maximal number of characters sharing a code is reduced to 4, as with 甲 曱 叶 申. (Characters with duplicate codes, i.e., characters sharing a stroke order code, are sorted by the positions of the starting and ending points of corresponding strokes, higher before lower and left before right.)
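How duplicate codes might be detected and counted can be sketched as below. The text does not define the duplicating rate precisely, so this sketch assumes one plausible reading: the share of characters whose code is shared with at least one other character. The function names are hypothetical, and the stroke string given for 甲 曱 叶 申 is an assumption that is merely consistent with the statement that the four share one code.

```python
from collections import defaultdict


def duplicate_groups(codes):
    """Return the groups of characters that share the same stroke order code."""
    groups = defaultdict(list)
    for char, code in codes.items():
        groups[code].append(char)
    return [chars for chars in groups.values() if len(chars) > 1]


def duplicating_rate(codes):
    """Assumed definition: characters in shared-code groups / all characters."""
    duplicated = sum(len(group) for group in duplicate_groups(codes))
    return duplicated / len(codes)


# Tiny illustrative sample, not the 20,902-character set discussed in the text.
codes = {
    "甲": "㇑㇕㇐㇐㇑",  # assumed stroke string, shared by the next three
    "曱": "㇑㇕㇐㇐㇑",
    "叶": "㇑㇕㇐㇐㇑",
    "申": "㇑㇕㇐㇐㇑",
    "土": "㇐㇑㇐",
    "木": "㇐㇑㇓㇏",
    "日": "㇑㇕㇐㇐",
}

print(duplicate_groups(codes))  # [['甲', '曱', '叶', '申']]
print(max(len(group) for group in duplicate_groups(codes)))  # 4
```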