Standard Chinese phonology explained

The phonology of Standard Chinese has historically derived from the Beijing dialect of Mandarin. However, pronunciation varies widely among speakers, who may introduce elements of their local varieties. Television and radio announcers are chosen for their ability to affect a standard accent. Elements of the sound system include not only the segments (the vowels and consonants) of the language, but also the tones applied to each syllable. In addition to its four main tones, Standard Chinese has a neutral tone that appears on weak syllables.

This article uses the International Phonetic Alphabet (IPA) to compare the phonetic values corresponding to syllables romanized with pinyin.

Consonants

The sounds shown in parentheses are sometimes not analyzed as separate phonemes; for more on these, see below. Excluding these, and excluding the glides /j/, /ɥ/, and /w/, there are 19 consonant phonemes in the inventory.

	Labial	Denti-alveolar	Retroflex	Alveolo-palatal	Velar
Nasal	m	n			ŋ
Plosive (unaspirated)	p	t			k
Plosive (aspirated)	pʰ	tʰ			kʰ
Affricate (unaspirated)		t͡s	ʈ͡ʂ	(t͡ɕ)
Affricate (aspirated)		t͡sʰ	ʈ͡ʂʰ	(t͡ɕʰ)
Fricative	f	s	ʂ	(ɕ)	x ~ h
Liquid		l	ɻ ~ ʐ

Between pairs of plosives or affricates having the same place and manner of articulation, the primary distinction is not voiced vs. voiceless (as in French or Russian), but unaspirated vs. aspirated (as in Scottish Gaelic or Icelandic). The unaspirated plosives and affricates may, however, become voiced in weak syllables (see below). Such pairs are represented in the pinyin system mostly using letters which in Romance languages generally denote voiceless/voiced pairs (for example [p] and [b]), and which in Germanic languages often denote fortis/lenis pairs (for example, initial aspirated voiceless vs. unaspirated voiced pairs such as [pʰ] and [b]). In pinyin, however, an aspirated/unaspirated pair such as /pʰ/ and /p/ is written p and b respectively.

More details about the individual consonant sounds are given in the following table.

Phoneme or sound	Approximate description	Pinyin	Zhuyin	Wade–Giles*	Notes
/p/	Like English p but unaspirated, as in spy	b	ㄅ	p
/pʰ/	Like an aspirated English p, as in pie	p	ㄆ	p῾
/m/	Like English m	m	ㄇ	m
/f/	Like English f	f	ㄈ	f
/t/	Like English t but unaspirated, as in sty	d	ㄉ	t	See Denti-alveolar and retroflex series.
/tʰ/	Like an aspirated English t, as in tie	t	ㄊ	t῾	See Denti-alveolar and retroflex series.
/n/	Like English n	n	ㄋ	n	See Denti-alveolar and retroflex series. Can occur in the onset and/or coda of a syllable.
/l/	Like English clear l, as in RP lay (never dark, i.e. velarized)	l	ㄌ	l
/k/	Like English k, but unaspirated, as in scar	g	ㄍ	k
/kʰ/	Like an aspirated English k, as in car	k	ㄎ	k῾
/ŋ/	Like ng in English sing	ng	-	ng	Occurs only in the syllable coda.
/x/ ([h ~ x])	Varies between h in English hat and ch in Scottish loch.	h	ㄏ	h
/t͡ɕ/	Like an unaspirated English ch, but with an alveolo-palatal pronunciation	j	ㄐ	ch	See Alveolo-palatal series.
/t͡ɕʰ/	As /t͡ɕ/ (pinyin j), but with aspiration	q	ㄑ	ch῾	See Alveolo-palatal series.
/ɕ/	Similar to English sh, but with an alveolo-palatal pronunciation	x	ㄒ	hs	See Alveolo-palatal series.
/ʈ͡ʂ/	Similar to ch in English chat, but with a retroflex articulation and no aspiration	zh	ㄓ	ch	See Denti-alveolar and retroflex series.
/ʈ͡ʂʰ/	As /ʈ͡ʂ/ (pinyin zh), but with aspiration	ch	ㄔ	ch῾	See Denti-alveolar and retroflex series.
/ʂ/	Similar to English sh, but with a retroflex articulation	sh	ㄕ	sh	See Denti-alveolar and retroflex series.
/ɻ/ ([ʐ ~ ɻ])	Similar to z in English zoo, but with a retroflex articulation. L2 learners may pronounce it as an English r, but the lips are unrounded.	r	ㄖ	j	For pronunciation in syllable-final position, see Rhotic coda.
/t͡s/	Like English ts in cats, without aspiration	z	ㄗ	ts	See Denti-alveolar and retroflex series.
/t͡sʰ/	As /t͡s/ (pinyin z), but with aspiration	c	ㄘ	ts῾	See Denti-alveolar and retroflex series.
/s/	Like English s, but usually with the tongue on the lower teeth.	s	ㄙ	s	See Denti-alveolar and retroflex series.
* In Wade–Giles, the distinction between the retroflex and alveolo-palatal affricates, which are both written ch and ch῾, is indicated by the following vowel, since the two consonant series occur in complementary distribution; for example, chi and chü correspond to pinyin ji and ju, respectively, whereas chih and chu correspond to pinyin zhi and zhu.
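The correspondences in the table can be written down as a lookup. The following Python sketch is illustrative only: the names `PINYIN_INITIAL_TO_IPA`, `split_initial`, and `wade_giles_ch_series` are hypothetical, the IPA values follow the usual transcription of the pinyin initials, and the Wade–Giles rule (ch before i or ü is alveolo-palatal, except in chih) is an assumed generalization of the four examples given in the note above.

```python
# Pinyin initial -> IPA (illustrative lookup; all names are hypothetical).
PINYIN_INITIAL_TO_IPA = {
    "b": "p", "p": "pʰ", "m": "m", "f": "f",
    "d": "t", "t": "tʰ", "n": "n", "l": "l",
    "g": "k", "k": "kʰ", "h": "x",
    "j": "t͡ɕ", "q": "t͡ɕʰ", "x": "ɕ",
    "zh": "ʈ͡ʂ", "ch": "ʈ͡ʂʰ", "sh": "ʂ", "r": "ɻ",
    "z": "t͡s", "c": "t͡sʰ", "s": "s",
}


def split_initial(syllable):
    """Split a toneless pinyin syllable into (initial, rest).

    Two-letter initials (zh, ch, sh) are tried before one-letter ones.
    Syllables that start with a vowel or with y/w get an empty initial.
    """
    for length in (2, 1):
        head = syllable[:length]
        if head in PINYIN_INITIAL_TO_IPA:
            return head, syllable[length:]
    return "", syllable


def wade_giles_ch_series(syllable):
    """Classify a Wade-Giles syllable spelled with ch or ch῾.

    Assumed rule: ch + i (but not the special final ih) or ch + ü is
    alveolo-palatal; any other following vowel means retroflex.
    """
    if not syllable.startswith("ch"):
        raise ValueError("expected a syllable beginning with ch")
    rest = syllable[2:].lstrip("῾'")  # drop the aspiration mark, if any
    if rest.startswith("\u00fc") or (rest.startswith("i") and rest != "ih"):
        return "alveolo-palatal"
    return "retroflex"
```

For instance, `split_initial("zhi")` separates the retroflex initial zh from the rest of the syllable, and `wade_giles_ch_series` distinguishes Wade–Giles chi (pinyin ji) from chih (pinyin zhi).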

All of the consonants may occur as the initial sound of a syllable, with the exception of /ŋ/ (unless the zero initial is assigned to this phoneme; see below). Excepting the rhotic coda, the only consonants that can appear in syllable coda (final) position are /n/ and /ŋ/ (although [m] may occur as an allophone of /n/ before labial consonants in fast speech). Final /n/ and /ŋ/ may be pronounced without complete oral closure, resulting in a syllable that in fact ends with a long nasalized vowel. See also Syllable reduction, below.

Denti-alveolar and retroflex series

The consonants listed in the first table above as denti-alveolar are sometimes described as alveolars, and sometimes as dentals. The affricates and the fricative are particularly often described as dentals; these are generally pronounced with the tongue on the lower teeth.

The retroflex consonants (like those of Polish) are actually apical rather than subapical, and so are considered by some authors not to be truly retroflex; they may be more accurately called post-alveolar.^[1] Some speakers not from Beijing may lack the retroflexes in their native dialects, and may thus replace them with dentals.

Alveolo-palatal series

The alveolo-palatal consonants (pinyin j, q, x) have the standard pronunciations [t͡ɕ, t͡ɕʰ, ɕ]. Some speakers realize them as palatalized dentals [t͡sʲ], [t͡sʰʲ], [sʲ]; this is claimed to be especially common among children and women, although officially it is regarded as substandard and as a feature specific to the Beijing dialect.^[2]

In phonological analysis, it is often assumed that, when not followed by one of the high front vowels [i] or [y], the alveolo-palatals consist of a consonant followed by a palatal glide ([j] or [ɥ]). That is, syllables represented in pinyin as beginning ji-, qi-, xi-, ju-, qu-, xu- (followed by a vowel) are taken to begin [t͡ɕj], [t͡ɕʰj], [ɕj], [t͡ɕɥ], [t͡ɕʰɥ], [ɕɥ]. The actual pronunciations are more like [t͡ɕ], [t͡ɕʰ], [ɕ], [t͡ɕʷ], [t͡ɕʰʷ], [ɕʷ] (or, for speakers using the dental variants, [t͡sʲ], [t͡sʰʲ], [sʲ], [t͡sᶣ], [t͡sʰᶣ], [sᶣ]). This is consistent with the general observation (see under Glides) that medial glides are realized as palatalization and/or labialization of the preceding consonant (palatalization already being inherent in the case of the palatals).

On the above analysis, the alveolo-palatals are in complementary distribution with the dentals [t͡s, t͡sʰ, s], with the velars [k, kʰ, x], and with the retroflexes [ʈ͡ʂ, ʈ͡ʂʰ, ʂ], as none of these can occur before high front vowels or palatal glides, whereas the alveolo-palatals occur only before high front vowels or palatal glides. Therefore, linguists often prefer to classify [t͡ɕ, t͡ɕʰ, ɕ] not as independent phonemes, but as allophones of one of the other three series. The existence of the above-mentioned dental variants inclines some to identify the alveolo-palatals with the dentals, but identification with any of the three series is possible (unless the empty rime /ɨ/ is identified with /i/, in which case the velars become the only candidate). The Yale and Wade–Giles systems mostly treat the alveolo-palatals as allophones of the retroflexes; Tongyong Pinyin mostly treats them as allophones of the dentals; and Mainland Chinese Braille treats them as allophones of the velars. In standard pinyin and bopomofo, however, they are represented as a separate series.

The alveolo-palatals arose historically from a merger of the dentals [t͡s, t͡sʰ, s] and velars [k, kʰ, x] before high front vowels and glides. Previously, some instances of modern [t͡ɕ(ʰ)i] were instead [k(ʰ)i], and others were [t͡s(ʰ)i]; distinguishing these two sources of [t͡ɕ(ʰ)i] is known as the round-sharp distinction. The change took place in the last two or three centuries at different times in different areas. This explains why some European transcriptions of Chinese names (especially in postal romanization) contain k, ts, or s where an alveolo-palatal might be expected in modern Chinese. Examples are Peking for Beijing ([kiŋ] → [tɕiŋ]), Chungking for Chongqing ([kʰiŋ] → [tɕʰiŋ]), Fukien for Fujian (cf. Hokkien), Tientsin for Tianjin ([tsin] → [tɕin]), Sinkiang for Xinjiang ([sinkiaŋ] → [ɕintɕiaŋ]), and Sian for Xi'an ([si] → [ɕi]). The complementary distribution with the retroflex series arose when syllables that had a retroflex consonant followed by a medial glide lost the medial glide.

Zero onset

A full syllable such as ai, in which the vowel is not preceded by any of the standard initial consonants or glides, is said to have a null initial or zero onset. This may be realized as a consonant sound: [ʔ] and [ɣ] are possibilities, as are [ŋ] and [ɦ] in some non-standard varieties. San Duanmu has suggested that such an onset be regarded as a special phoneme, or as an instance of the phoneme /ŋ/, although it can also be treated as no phoneme (absence of onset). By contrast, in the case of the particle 啊 a, which is a weak onset-less syllable, linking occurs with the previous syllable (as described under Syllable reduction, below).

When a stressed vowel-initial Chinese syllable follows a consonant-final syllable, the consonant does not directly link with the vowel. Instead, the zero onset seems to intervene. For example, 棉袄 mián'ǎo ("cotton jacket") becomes [mjɛnʔau] or [mjɛnɣau]. However, in connected speech neither of these output forms is natural. Instead, when the words are spoken together the most natural pronunciation is rather similar to [mjɛ̃ːau], in which there is no nasal closure or any version of the zero onset, and the vowel is nasalized instead.

Glides

The glides [j], [ɥ], and [w] sound respectively like the y in English yes, the (h)u in French huit, and the w in English we. (Beijing speakers often replace initial [w] with a labiodental [ʋ], except when it is followed by [o] or [u].) The glides are commonly analyzed not as independent phonemes, but as consonantal allophones of the high vowels: [i̯, y̯, u̯]. This is possible because there is no ambiguity in interpreting a sequence like yao/-iao as /iau/, and potentially problematic sequences such as */iu/ do not occur.

The glides may occur in initial position in a syllable. This occurs with [ɥ] in the syllables written yu, yuan, yue, and yun in pinyin; with [j] in other syllables written with initial y in pinyin (ya, yi, etc.); and with [w] in syllables written with initial w in pinyin (wa, wu, etc.). When a glide is followed by the vowel of which that glide is considered an allophone, the glide may be regarded as epenthetic (automatically inserted), and not as a separate realization of the phoneme. Hence the syllable yi, pronounced [ji], may be analyzed as consisting of the single phoneme /i/, and similarly yin may be analyzed as /in/, yu as /y/, and wu as /u/. Both realizations may also be heard from the same speaker, even in the same conversation. For example, one may hear the number "one" (yī) as either [ji] or [i].

The glides can also occur in medial position, that is, after the initial consonant but before the main vowel. Here they are represented in pinyin as vowels: for example, the i in bie represents [j], and the u in huo represents [w]. There are some restrictions on the possible consonant-glide combinations: [w] does not occur after labials (except for some speakers in bo, po, mo, fo); [j] does not occur after retroflexes and velars (or after [f]); and [ɥ] occurs medially only in lüe and nüe and after alveolo-palatals (for which see above). A consonant-glide combination at the start of a syllable is articulated as a single sound: the glide is not in fact pronounced after the consonant, but is realized as palatalization [ʲ], labialization [ʷ], or both [ᶣ], of the consonant. (The same modifications of initial consonants occur in syllables where they are followed by a high vowel, although normally no glide is considered to be present there. Hence a consonant is generally palatalized [ʲ] when followed by /i/, labialized [ʷ] when followed by /u/, and both [ᶣ] when followed by /y/.)
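The realization of a medial glide as a secondary articulation on the initial consonant can be sketched as a small function. This is a minimal illustration, not a phonological model: the names `SECONDARY_ARTICULATION`, `ALVEOLO_PALATALS`, and `realize_onset` are hypothetical, and the co-occurrence restrictions listed above are not enforced except for the alveolo-palatals, whose inherent palatalization means only labialization is added.

```python
# Medial glide -> secondary-articulation diacritic on the initial consonant.
SECONDARY_ARTICULATION = {"j": "ʲ", "w": "ʷ", "ɥ": "ᶣ"}

# Alveolo-palatals are inherently palatalized.
ALVEOLO_PALATALS = {"t͡ɕ", "t͡ɕʰ", "ɕ"}


def realize_onset(consonant, glide):
    """Return the surface onset for an initial consonant plus medial glide."""
    if consonant in ALVEOLO_PALATALS:
        if glide == "j":
            return consonant          # palatalization already inherent
        if glide == "ɥ":
            return consonant + "ʷ"    # only the labial component is added
        raise ValueError("alveolo-palatals combine only with palatal glides")
    return consonant + SECONDARY_ARTICULATION[glide]
```

Under these assumptions, `realize_onset("t", "w")` gives a labialized [tʷ], and `realize_onset("ɕ", "ɥ")` gives [ɕʷ], matching the surface forms described above.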

The glides [j] and [w] are also found as the final element in some syllables. These are commonly analyzed as diphthongs rather than vowel-glide sequences. For example, the syllable bai is assigned the underlying representation /pai̯/. (In pinyin, the second element is generally written i or u, but /au̯/ is written as ao.)

Syllabic consonants

The syllables written in pinyin as zi, ci, si, zhi, chi, shi, ri may be described as a sibilant consonant (z, c, s, zh, ch, sh, r in pinyin) followed by a syllabic consonant (also known as an apical vowel in classic literature):

[ɹ̩ ~ z̩], a laminal denti-alveolar voiced continuant, in zi, ci, si;
[ɻ̩ ~ ʐ̩], an apical retroflex voiced continuant, in zhi, chi, shi, ri.

Alternatively, the nucleus may be described not as a syllabic consonant, but as a vowel:

[ɨ], similar to Russian ы and the vowel in American "roses", in zi, ci, si, zhi, chi, shi, ri.

Phonologically, these syllables may be analyzed as having their own vowel phoneme, /ɨ/. However, it is possible to merge this with the phoneme /i/ (to which it is historically related), since the two are in complementary distribution, provided that the alveolo-palatal series is either left un-merged, or is merged with the velars rather than the retroflex or alveolar series. (That is, [t͡ɕi], [t͡sɨ], and [ʈ͡ʂɨ] all exist, but *[ki] and *[kɨ] do not exist, so there is no problem merging both [i]~[ɨ] and [k]~[t͡ɕ] at the same time.)

Another approach is to regard the syllables assigned above to /ɨ/ as having an (underlying) empty nuclear slot ("empty rhyme"), i.e. as not containing a vowel phoneme at all. This is more consistent with the syllabic consonant description of these syllables, and is consistent with the view that phonological representations are minimal (underspecified).^[3] When this is the case, sometimes the phoneme is described as shifting from voiceless to voiced, e.g. sī becoming /sź̩/.

Syllabic consonants may also arise as a result of weak syllable reduction; see below. Syllabic nasal consonants are also heard in certain interjections; pronunciations of such words include [m], [n], [ŋ], [hm], [hŋ].

Vowels

Standard Chinese can be analyzed as having between two and six vowel phonemes.^[4] /i, u, y/ (which may also be analyzed as underlying glides) are high (close) vowels, /ə/ is mid, and /a/ is low (open).

The precise realization of each vowel depends on its phonetic environment. In particular, the vowel /ə/ has two broad allophones [e] and [o] (corresponding respectively to pinyin e and o in most cases). These sounds can be treated as a single underlying phoneme because they are in complementary distribution. The mid vowel phoneme may also be treated as an under-specified vowel, attracting features either from the adjacent sounds or from default rules resulting in /ə/. (Apparent counterexamples are provided by certain interjections, such as [ɔ], [ɛ], [jɔ], and [lɔ], but these are normally treated as special cases operating outside the normal phonemic system.)

Transcriptions of the vowels' allophones (the ways they are pronounced in particular phonetic environments) differ somewhat between sources. More details about the individual vowel allophones are given in the following table (not including the values that occur with the rhotic coda).

Phoneme	Allophone	Description	Example	Pinyin	Wade–Giles	Gwoyeu Romatzyh
Depends on analysis (see below)	[i]	Like English ee as in bee	比/bǐ	i	i	i
	[u]	Like English oo as in boo	不/bù	u	u	u
	[ʊ]	Like English oo in took (varies between [o]^[5] and [u] depending on the speaker)	空/kōng	o	u	o
	[y]	Like French u or German ü	女/nǚ	ü, u	ü	iu
/ə/	[e]	Somewhat like English ey as in prey	别/bié	e, ê	e, eh	e
	[o]	Somewhat like southern British English awe or Scottish English oh	火/huǒ	o	o	o
	[ɤ]	Pronounced as a sequence [ɰɤ̞].	和/hé	e	ê, o	e
	[ə]	Schwa, like English a as in about.	很/hěn	e	ê, u	e
/a/	[a]	Like English a as in palm	巴/bā	a	a	a
	[ɛ]	Like English e as in then (varies between [e] and [a] depending on the speaker)	边/biān	a	e, a	a

Zhuyin represents vowels differently from normal romanisation schemes, and as such is not displayed in the above table.

The vowel nuclei may be preceded by a glide /j, w, ɥ/, and may be followed by a coda /i, u, n, ŋ/. The various combinations of glide, vowel, and coda have different surface manifestations, as shown in the tables below. Any of the three positions may be empty, i.e. occupied by a null meta-phoneme ∅.

Five vowel analysis (pinyin-based)

See also: Pinyin. The following table provides a typical five vowel analysis. In this analysis, the high vowels /i, u, y/ are fully phonemic and may form sequences with the nasal codas /n, ŋ/.

Nucleus	Coda	Medial ∅	Medial /j/	Medial /w/	Medial /ɥ/
∅	∅	[ɹ̩~ɻ̩]
/i/	∅	[i] yi
/i/	/n/	[in] yin
/i/	/ŋ/	[iŋ] ying
/u/	∅	[u] wu
/u/	/ŋ/	[ʊŋ]	[jʊŋ] yong
/y/	∅	[y] yu
/y/	/n/	[yn] yun
/ə/	∅	[o, e, ɤ] o, ê, e	[je], [jo] ye, yo	[wo] wo	[ɥe] yue
/ə/	/i/	[ei̯] ei		[wei̯] wei
/ə/	/u/	[ou̯] ou	[jou̯] you
/ə/	/n/	[ən] en		[wən] wen
/ə/	/ŋ/	[əŋ] eng		[wəŋ] weng
/a/	∅	[a] a	[ja] ya	[wa] wa
/a/	/i/	[ai̯] ai		[wai̯] wai
/a/	/u/	[au̯] ao	[jau̯] yao
/a/	/n/	[an] an	[jɛn] yan	[wan] wan	[ɥɛn] yuan
/a/	/ŋ/	[aŋ] ang	[jaŋ] yang	[waŋ] wang

¹ ü is written as u after j, q, or x (the /u/ phoneme never occurs in these positions).

² uo is written as o after b, p, m, or f.

Two vowel analysis (bopomofo-based)

See also: Bopomofo. Some linguists prefer to reduce the number of vowel phonemes drastically (at the expense of including underlying glides in their systems). Edwin G. Pulleyblank has proposed a system which includes underlying glides, but no vowels at all. More common are systems with two vowels; for example, in Mantaro Hashimoto's system,^[6] there are just two vowel nuclei, /ə, a/. In this analysis, the high vowels [i, u, y] are analyzed as glides /j, w, ɥ/ which surface as vowels before ∅ or /ən, əŋ/.

Nucleus	Coda	Medial ∅	Medial /j/	Medial /w/	Medial /ɥ/
∅	∅	[ɹ̩~ɻ̩]	[i]	[u]	[y]
/ə/	∅	[o, ɤ, e]	[je, jo]	[wo]	[ɥe] ㄩㄝ
/ə/	/i/	[ei̯]		[wei̯]
/ə/	/u/	[ou̯]	[jou̯]
/ə/	/n/	[ən]	[in]	[wən]	[yn]
/ə/	/ŋ/	[əŋ]	[iŋ]	[wəŋ], [ʊŋ]	[jʊŋ]
/a/	∅	[a]	[ja]	[wa]
/a/	/i/	[ai̯]	[jai̯]*	[wai̯]
/a/	/u/	[au̯]	[jau̯]
/a/	/n/	[an]	[jɛn]	[wan]	[ɥɛn]
/a/	/ŋ/	[aŋ]	[jaŋ]	[waŋ]

Other notes

As a general rule, vowels in open syllables (those which have no coda following the main vowel) are pronounced long, while others are pronounced short. This does not apply to weak syllables, in which all vowels are short.

In Standard Chinese, the vowels [a] and [ə] harmonize in backness with the coda.^[7] [a] is fronted to [a̟] before /i, n/ and backed to [a̠] before /u, ŋ/; [ə] is fronted to [ə̟] before /n/ and backed to [ə̠] before /ŋ/.
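The harmony rule above is compact enough to state as a function. The sketch below is only an illustration of the rule as worded here; the function name `harmonize` is hypothetical, and cases the text does not mention (for example [ə] before /i/ or /u/, or an open syllable) are simply returned unchanged.

```python
ADVANCED = "\u031f"   # IPA "advanced" diacritic (combining plus sign below)
RETRACTED = "\u0320"  # IPA "retracted" diacritic (combining minus sign below)


def harmonize(vowel, coda):
    """Apply backness harmony of [a] and [ə] with the following coda."""
    if vowel == "a":
        if coda in ("i", "n"):
            return vowel + ADVANCED
        if coda in ("u", "ŋ"):
            return vowel + RETRACTED
    elif vowel == "ə":
        if coda == "n":
            return vowel + ADVANCED
        if coda == "ŋ":
            return vowel + RETRACTED
    return vowel  # no harmony stated for this combination
```

For example, `harmonize("a", "ŋ")` returns the retracted vowel, as in the rime of ang.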

Some native Mandarin speakers may pronounce [wei̯], [jou̯], and [wən] as [ui], [iu], and [un] respectively in the first or second tone.

Rhotic coda

See main article: Erhua. Standard Chinese features syllables that end with a rhotic coda /ɚ/. This feature, known in Chinese as erhua, is particularly characteristic of the Beijing dialect; many other dialects do not use it as much, and some not at all. It occurs in two cases:

In a small number of independent words or morphemes pronounced [ɚ] or [aɚ̯], written in pinyin as er (with some tone), such as 二 èr ("two"), 耳 ěr ("ear"), and 儿 ér ("son").
In syllables in which the rhotic coda is added as a suffix to another morpheme. This suffix is represented by the character 儿 ("son"), to whose meaning it is historically related, and in pinyin as r. The suffix combines with the final sound of the syllable, and regular but complex sound changes occur as a result (described in detail under erhua).

The r final is pronounced with a relatively lax tongue, and has been described as a "retroflex vowel".

In dialects that do not make use of the rhotic coda, it may be omitted in pronunciation, or in some cases a different word may be selected: for example, Beijing 这儿 zhèr ("here") and 那儿 nàr ("there") may be replaced by the synonyms 这里 zhèlǐ and 那里 nàlǐ.

Syllables

Syllables in Standard Chinese have the maximal form (CG)V(X)^T, traditionally analysed as an "initial" consonant C, a "final", and a tone T. The final consists of a "medial" G (which may be one of the glides [j, w, ɥ]), a vowel V, and a coda X, which may be one of [n, ŋ, ɚ̯, i̯, u̯]. The vowel and coda may also be grouped as the "rhyme", sometimes spelled "rime". Any of C, G, and X (and V, in some analyses) may be absent. However, in some analyses, C cannot be absent, due to the zero initial being considered a consonant.
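The (CG)V(X)^T template can be made concrete as a small record type. The following is a minimal sketch under stated assumptions: the class name `Syllable` and its field names are hypothetical, the vowel is treated as obligatory (one of the analyses mentioned above allows it to be absent), and tones are numbered 1 to 5 with 5 standing for the neutral tone.

```python
from dataclasses import dataclass
from typing import Optional

GLIDES = {"j", "w", "ɥ"}              # possible medials (G)
CODAS = {"n", "ŋ", "ɚ̯", "i̯", "u̯"}  # possible codas (X)


@dataclass(frozen=True)
class Syllable:
    vowel: str                      # V, required in this sketch
    tone: int                       # T, 1-4 plus 5 for the neutral tone
    initial: Optional[str] = None   # C, may be absent
    medial: Optional[str] = None    # G, may be absent
    coda: Optional[str] = None      # X, may be absent

    def __post_init__(self):
        if self.medial is not None and self.medial not in GLIDES:
            raise ValueError(f"invalid medial: {self.medial!r}")
        if self.coda is not None and self.coda not in CODAS:
            raise ValueError(f"invalid coda: {self.coda!r}")
        if self.tone not in range(1, 6):
            raise ValueError(f"invalid tone: {self.tone!r}")

    @property
    def rhyme(self):
        """Vowel plus coda."""
        return self.vowel + (self.coda or "")

    @property
    def final(self):
        """Medial plus rhyme, i.e. everything after the initial."""
        return (self.medial or "") + self.rhyme
```

For example, 边 biān could be represented as `Syllable(initial="p", medial="j", vowel="ɛ", coda="n", tone=1)`, whose `final` is the string for [jɛn].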

Many of the possible combinations under the above scheme do not actually occur. There are only some 35 final combinations (medial+rime) in actual syllables (see pinyin finals). In all, there are only about 400 different syllables when tone is ignored, and about 1300 when tone is included. This is a far smaller number of distinct syllables than in a language such as English. Since Chinese syllables usually constitute whole words, or at least morphemes, the smallness of the syllable inventory results in large numbers of homophones. However, in Standard Chinese, the average word length is actually almost exactly two syllables, practically eliminating most homophony issues even when tone is disregarded, especially when context is taken into account as well.^[8] ^[9] (Still, due to the limited phonetic inventory, homophonic puns in Mandarin Chinese are very common and important in Chinese culture.^[10] ^[11])

For a list of all Standard Chinese syllables (excluding tone and rhotic coda) see the pinyin table or zhuyin table.

Full and weak syllables

Syllables can be classified as full (or strong) and weak. Weak syllables are usually grammatical markers such as 了 le, or the second syllables of some compound words (although many other compounds consist of two or more full syllables).

A full syllable carries one of the four main tones, and some degree of stress. Weak syllables are unstressed, and have neutral tone. The contrast between full and weak syllables is distinctive; there are many minimal pairs such as 要事 yàoshì "important matter" and 钥匙 yàoshi "key", or 大意 dàyì "main idea" and (with the same characters) dàyi "careless", the second word in each case having a weak second syllable. Some linguists consider this contrast to be primarily one of stress, while others regard it as one of tone. For further discussion, see under Neutral tone and Stress, below. There is also a difference in syllable length. Full syllables can be analyzed as having two morae ("heavy"), the vowel being lengthened if there is no coda. Weak syllables, however, have a single mora ("light"), and are pronounced approximately 50% shorter than full syllables. Any weak syllable will usually be an instance of the same morpheme (and written with the same character) as some corresponding strong syllable; the weak form will often have a modified pronunciation, however, as detailed in the following section.

Syllable reduction

Apart from differences in tone, length, and stress, weak syllables are subject to certain other pronunciation changes (reduction).^[12]

If a weak syllable begins with an unaspirated obstruent (/p, t, k, t͡s, t͡ʂ, t͡ɕ/), that consonant may become voiced ([b, d, ɡ, d͡z, d͡ʐ, d͡ʑ] respectively). For example, in 嘴巴 zuǐba ("mouth"), the second syllable is likely to begin with a [b] sound, rather than an unaspirated [p].
The vowel of a weak syllable is often reduced, becoming more central. For example, in the word zuǐba just mentioned, the final vowel may become a schwa [ə].
The coda (final consonant or offglide) of a weak syllable is often dropped (this is linked to the shorter, single-mora nature of weak syllables, as referred to above). If the dropped coda was a nasal consonant, the vowel may be nasalized. For example, 脑袋 nǎodai ("head") may end with a monophthong [ɛ] rather than a diphthong, and 春天 chūntian ("spring") may end with a centralized and nasalized vowel [ə̃].
In some cases, the vowel may be dropped altogether. This may occur, particularly with the high vowels i, u, ü, when the unstressed syllable begins with a fricative (f, h, sh, r, x, s) or an aspirated consonant (p, t, k, q, ch, c); for example, 豆腐 dòufu ("tofu") may be said as dòu-f, and 问题 wènti ("question") as wèn-t (the remaining initial consonant is pronounced as a syllabic consonant). The same may even occur in full syllables that have low ("half-third") tone. The vowel (and coda) may also be dropped after a nasal, in such words as 我们 wǒmen ("we") and 什么 shénme ("what"), which may be said as wǒm and shém; these are examples of the merger of two syllables into one, which occurs in a variety of situations in connected speech.
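The first of the reductions listed above, voicing of unaspirated obstruents in weak syllables, amounts to a one-to-one mapping. A minimal sketch follows; the names `WEAK_ONSET_VOICING` and `voice_weak_onset` are hypothetical, and because the text says the consonant "may" become voiced, the function shows the voiced variant rather than claiming it is obligatory.

```python
# Unaspirated obstruent -> voiced counterpart, as listed in the text.
WEAK_ONSET_VOICING = {
    "p": "b",
    "t": "d",
    "k": "\u0261",   # IPA voiced velar plosive (script g)
    "t͡s": "d͡z",
    "t͡ʂ": "d͡ʐ",
    "t͡ɕ": "d͡ʑ",
}


def voice_weak_onset(onset):
    """Return the optional voiced realization of a weak-syllable onset.

    Aspirated consonants, fricatives, and sonorants are returned unchanged.
    """
    return WEAK_ONSET_VOICING.get(onset, onset)
```

For the second syllable of zuǐba, `voice_weak_onset("p")` returns "b", while an aspirated onset such as "pʰ" is left as it is.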

The example of shénme → shém also involves assimilation, which is heard even in unreduced syllables in quick speech (for example, in guǎmbō for 广播 guǎngbō "broadcast"). A particular case of assimilation is that of the sentence-final exclamatory particle 啊 a, a weak syllable, which has different characters for its assimilated forms:

Preceding sound	Form of particle (pinyin)	Character
[ŋ], [ɹ̩], [ɻ̩]	a	啊
[i], [y], [e], [o], [a]	ya (from ŋja)	呀
[u]	wa	哇
[n]	na	哪
了 le (grammatical marker)	combines to form la	啦
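The table above is effectively a lookup keyed on the final sound of the preceding syllable. The sketch below encodes the first four rows only (the fusion of 了 le with the particle into 啦 la is a separate case); the names `PARTICLE_FORMS` and `particle_a` are hypothetical.

```python
# (set of preceding sounds, (pinyin form, character)), one entry per table row.
PARTICLE_FORMS = [
    ({"ŋ", "ɹ̩", "ɻ̩"}, ("a", "啊")),
    ({"i", "y", "e", "o", "a"}, ("ya", "呀")),
    ({"u"}, ("wa", "哇")),
    ({"n"}, ("na", "哪")),
]


def particle_a(preceding_sound):
    """Return (pinyin, character) for the particle after the given sound."""
    for sounds, form in PARTICLE_FORMS:
        if preceding_sound in sounds:
            return form
    raise ValueError(f"no form listed for {preceding_sound!r}")
```

For example, after a syllable ending in [u] the function returns the form wa, written 哇.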

Tones

See also: Four tones (Middle Chinese). Standard Chinese, like all varieties of Chinese, is tonal. This means that in addition to consonants and vowels, the pitch contour of a syllable is used to distinguish words from each other. Many non-native Chinese speakers have difficulties mastering the tones of each character, but correct tonal pronunciation is essential for intelligibility because of the vast number of words in the language that only differ by tone (i.e. are minimal pairs with respect to tone). Statistically, tones are as important as vowels in Standard Chinese.^[13]

The following table shows the four main tones of Standard Chinese, together with the neutral (or fifth) tone. The pitch of each tone is described on a five-level scale, visualized with Chao tone letters. The pitch values described by Chao are traditionally considered standard; however, slight regional and idiolectal variations in tone pronunciation also occur.

Tone number		1	2	3	4	5
Description		high	rising	low (dipping)	falling	neutral
Pinyin diacritic		ā	á	ǎ	à	a
Pitch contour	per Chao (1968)	˥ 55	˧˥ 35	˨˩˦ 21(4)	˥˩ 51, ˥˧ 53	(various, see below)
	Common realization (Beijing)	˥ 55, ˦ 44	˨˥ 25	˨˩˨ 21(2)	˥˨ 52, ˥˧ 53
	Common realization (Taipei)	˦ 44	˧˨˧ 323	˧˩˨ 31(2)	˥˨ 52, ˥˧ 53
	Other substandard variants	˥˦ 54, ˦˥ 45	˧˨˥ 325, ˨˦ 24	˨˩˧ 21(3), ˨ 22	˦˨ 42
IPA diacritic		/á/	/ǎ/ [a᷄]	/à/ [à̰, a̰᷆, a̰᷉]	/â/
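
The numeric notation and the Chao tone letters in the table are two spellings of the same five-level scale. As a rough sketch of that correspondence, the snippet below converts a numeral string into tone letters; the function name `chao_letters` is hypothetical, and collapsing a level tone such as 55 into a single letter is an assumption that follows the way the table itself writes level tones.

```python
# Five-level scale: 1 is the lowest pitch, 5 the highest.
CHAO_LETTERS = {"1": "˩", "2": "˨", "3": "˧", "4": "˦", "5": "˥"}


def chao_letters(digits):
    """Convert a Chao numeral string such as '214' into tone letters.

    Level tones (all digits equal, e.g. '55') are written with a single
    letter, as in the table above.
    """
    if len(set(digits)) == 1:
        return CHAO_LETTERS[digits[0]]
    return "".join(CHAO_LETTERS[d] for d in digits)


# Citation contours per Chao (1968), keyed by tone number.
CITATION_CONTOURS = {1: "55", 2: "35", 3: "214", 4: "51"}
```

For instance, `chao_letters(CITATION_CONTOURS[3])` gives the three-letter dipping contour traditionally used for the third tone.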

The Chinese names of the main four tones are respectively 阴平 yīnpíng ('dark level'), 阳平 yángpíng ('light level'), 上 shǎng^[14] ^[15] or shàng^[16] ('rising'), and 去 qù ('departing'). As descriptions, they apply rather to the predecessor Middle Chinese tones than to the modern tones.

Most romanization systems, including pinyin, represent the tones as diacritics on the vowels, as does bopomofo. Some, like Wade–Giles, use superscript numbers at the end of each syllable. The tone marks and numbers are rarely used outside of language textbooks: in particular, they are usually absent in public signs, company logos, and so forth. Gwoyeu Romatzyh is a rare example of a system where tones are represented using normal letters of the alphabet (although without a one-to-one correspondence).

First tone

First tone is a high-level tone. It is a steady high sound, produced as if it were being sung instead of spoken. Its pitch is usually ˥ 55 or ˦ 44, at the level where the fourth tone starts or a little lower. Occasionally, a slightly falling or rising high pitch (˥˦ 54 or ˦˥ 45) is also possible.^[17]

In a few syllables, the quality of the vowel is changed when it carries first tone; see the vowel table above.

Second tone

Second tone is a rising tone. It is usually described as high-rising (˧˥ 35): the pitch rises from mid to high (as in the English "What?!"). It starts at around pitch level 3 or 2, and then rises towards the pitch of the first tone (5 or 4).

It may also start with a falling or flat segment, which is quite short in male speakers (a quarter of the total length of the second tone), but longer in female speakers, reaching nearly half of the total length. This initial dip is more apparent in Southern Chinese Mandarin accents, including Standard Taiwanese Mandarin, where the second tone is also lower and is alternatively described as dipping or low-rising, with an overall contour of ˧˨˧ 323 (its start is still slightly lower than its final pitch).^[18] ^[19] ^[20] ^[21]

This tone is usually one of the most difficult to master for Mandarin learners, as well as the speakers of non-Mandarin Chinese varieties, who often pronounce their second tone close to (full) third tone, especially in the word-final position before a pause.^[22] ^[23]

Third tone

Third tone is a low tone. It is also often termed a "dipping tone".

This tone is often demonstrated as having a rise in pitch after the low fall; however, third tone syllables that include the rise are significantly longer than other syllables. When a third-tone syllable is not said in isolation, this rise is normally heard only if it appears at the end of a sentence or before a pause, and then usually only on stressed monosyllables. The third tone without the rise is sometimes called half third tone.

The overall pitch contour of the third tone is traditionally described as ˨˩˦ 214, but for modern Standard Chinese speakers, the rise, if present, is not that high. The third tone starts at or below the starting point of the second tone. In Beijing, its value tends towards ˨˩˧ 213 or ˨˩˨ 212, while in Taiwan it is usually ˧˩˨ 312 (speakers of Taiwanese Standard Chinese also tend never to pronounce the rising part in any context). Unlike the other tones, third tone is usually pronounced with creaky voice.^[24]

Two consecutive third tones are avoided by changing the first to second tone; see below.

Fourth tone

Fourth tone is a falling tone. It features a sharp fall from high to lower pitch (as is heard in curt commands in English, such as "Stop!").

It starts at the same pitch level as the first tone or higher, and then drops to pitch 1 or 2. In connected speech, when followed by syllables with other full tones, it tends to fall only from high to mid-level. As with the third tone, the final part is pronounced only before a pause or an unstressed syllable. Two consecutive fourth tones are pronounced in a zigzag pattern, with the first one higher and the second one lower (˥˧ 53, then ˦˩ 41).^[25]

Neutral tone

Also called fifth tone or zeroth tone, the neutral tone is sometimes thought of as a lack of tone. It is associated with weak syllables, which are generally somewhat shorter than tonic syllables.

In Standard Chinese, about 15–20% of the syllables in written texts are considered unstressed, including certain suffixes, clitics, and particles. Second syllables of some disyllabic words are also unstressed in Northern Mandarin accents, but many Mandarin speakers in Southern China tend to preserve their inherent tone.

The pitch of a syllable with neutral tone is determined by the tone of the preceding syllable. Chao (1968) considered neutral-tone syllables to have no pitch contour, and introduced special dotted tone letters to denote their pitch. Later studies, however, found that neutral-tone syllables do have a pitch contour. The following table shows the pitch at which the neutral tone is pronounced in Standard Chinese after each of the four main tones. In the contoured analysis, the first column shows the pitch contour of a neutral-tone syllable directly after the full-tone syllable, and the second column shows the pitch contour after another neutral-tone syllable.^[26] ^[27] ^[28]

Tone of preceding syllable	Pitch of neutral tone (contourless)	Pitch of neutral tone (contoured, first syllable)	Pitch of neutral tone (contoured, second syllable)	Example (characters)	Pinyin	Meaning	Transcription
First ˥	˨ (꜋) 2	˦˩ 41	˨˩ 21	玻璃(的)	bōli (de)	'[of the] glass'	[pwo˥ li˦˩ də˨˩]
Second ˧˥	˧ (꜊) 3	˥˨ 52	˧˨ 32	伯伯(的)	bóbo (de)	'[of an] uncle'	[pwo˨˥ bwo˥˨ də˧˨]
Third ˨˩	˦ (꜉) 4	˧ 33 ~ ˨˧ 23	˧˨ 32	喇叭(的)	lǎba (de)	'[of a] horn'	[lä˨˩ bä˨˧ də˧˨]
Fourth ˥˩	˩ (꜌) 1	˨˩ 21	˩ 11	兔子(的)	tùzi (de)	'[of a] rabbit'	[tʰu˥˨ d͡zɨ˨˩ də˩]
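
Because the pitch of a neutral-tone syllable depends only on what precedes it, the table can be sketched as a small lookup. The snippet below is an illustration only: the dictionary layout and the function name `neutral_pitches` are assumptions, the values are copied from the contoured columns of the table, and sequences of more than two neutral-tone syllables are rejected because the table does not cover them.

```python
# Contoured pitch of neutral-tone syllables, keyed by the tone (1-4) of the
# preceding full syllable. "first" is the neutral syllable directly after the
# full tone; "second" is a further neutral syllable after that one.
NEUTRAL_PITCH = {
    1: {"contourless": "2", "first": "41", "second": "21"},
    2: {"contourless": "3", "first": "52", "second": "32"},
    3: {"contourless": "4", "first": "33~23", "second": "32"},
    4: {"contourless": "1", "first": "21", "second": "11"},
}


def neutral_pitches(preceding_tone, count=1):
    """Return the contoured pitches of `count` neutral syllables (1 or 2)."""
    if count not in (1, 2):
        raise ValueError("the table only covers one or two neutral syllables")
    row = NEUTRAL_PITCH[preceding_tone]
    return [row["first"], row["second"]][:count]
```

For 兔子的 (fourth tone followed by two neutral syllables), `neutral_pitches(4, 2)` returns the values 21 and 11 shown in the last row of the table.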

Although the contrast between weak and full syllables is often distinctive, the neutral tone is often not described as a full-fledged tone; some linguists feel that it results from a "spreading out" of the tone on the preceding syllable. This idea is appealing because without it, the neutral tone can only be accounted for with relatively complex tone sandhi rules; indeed, it would need four allotones, one for each of the four tones that could precede it. However, the "spreading" theory incompletely characterizes the neutral tone, especially in sequences where more than one neutral-tone syllable occurs in a row.^[29] In Modern Standard Mandarin as applied in A Dictionary of Current Chinese, the second syllable of words with a 'toneless final syllable variant' (重·次轻词语) can be read with either a neutral tone or with the normal tone.^[30] ^[31] ^[32]

Relationship between Middle Chinese and modern tones

The four tones of Middle Chinese are not in one-to-one correspondence with the modern tones. The following table shows the development of the traditional tones as reflected in modern Standard Chinese. The development of each tone depends on the initial consonant of the syllable: whether it was a voiceless consonant (denoted in the table by v−), a voiced obstruent (v+), or a sonorant (s). (The voiced–voiceless distinction has been lost in modern Standard Chinese.)

Middle Chinese tone	Middle Chinese initial	Standard Chinese tone	Tone contour
Level (平)	v−	1st	55
Level (平)	s	2nd	35
Level (平)	v+	2nd	35
Rising (上)	v−	3rd	21(4)
Rising (上)	s	3rd	21(4)
Rising (上)	v+	4th	51
Departing (去)	v−	4th	51
Departing (去)	s	4th	51
Departing (去)	v+	4th	51
Entering (入)	v−	redistributed with no pattern	redistributed with no pattern
Entering (入)	s	4th	51
Entering (入)	v+	2nd	35
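
The correspondence in the table depends on two inputs, the Middle Chinese tone category and the class of the initial consonant. A minimal sketch of it as a function follows; the English category labels ("level", "rising", "departing", "entering"), the initial labels, and the name `modern_tone` are choices made for this illustration, and `None` stands for the entering-tone syllables with voiceless initials, whose outcome the table gives as unpredictable.

```python
# Middle Chinese (tone category, initial class) -> Standard Chinese tone.
# Initial classes: "voiceless" (v−), "sonorant" (s), "voiced" (v+, obstruent).
MODERN_TONE = {
    ("level", "voiceless"): 1,
    ("level", "sonorant"): 2,
    ("level", "voiced"): 2,
    ("rising", "voiceless"): 3,
    ("rising", "sonorant"): 3,
    ("rising", "voiced"): 4,
    ("departing", "voiceless"): 4,
    ("departing", "sonorant"): 4,
    ("departing", "voiced"): 4,
    ("entering", "voiceless"): None,  # redistributed with no pattern
    ("entering", "sonorant"): 4,
    ("entering", "voiced"): 2,
}


def modern_tone(mc_tone, initial):
    """Return the modern tone number, or None where it is unpredictable."""
    return MODERN_TONE[(mc_tone, initial)]
```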

Tone sandhi

Pronunciation also varies with context according to the rules of tone sandhi. Some such changes have been noted above in the descriptions of the individual tones; however, the most prominent phenomena of this kind relate to consecutive sequences of third-tone syllables. There are also a few common words that have variable tone.

Third tone sandhi

The principal rule of third tone sandhi is:

When there are two consecutive third-tone syllables, the first of them is pronounced with second tone.

For example, 老鼠 lǎoshǔ ('mouse') is pronounced [lau̯˧˥ʂu˨˩], as if it were láoshǔ. It has been investigated whether the rising contour (˧˥) on the first syllable is in fact identical to a normal second tone; the conclusion is that it is identical, at least in terms of auditory perception.

When there are three or more third tones in a row, the situation becomes more complicated since a third tone that precedes a second tone resulting from third tone sandhi may or may not be subject to sandhi itself. The results may depend on word boundaries, stress, and dialectal variations. General rules for three-syllable third-tone combinations can be formulated as follows:

If the first word has two syllables and the second word has one syllable, the first two syllables become second tones. For example, 保管好 bǎoguǎn hǎo ('to take good care of') is pronounced [pau̯˧˥kwan˧˥xau̯˨˩˦].
If the first word has one syllable and the second word has two syllables, the second syllable becomes second tone, but the first syllable remains third tone. For example, 老保管 lǎo bǎoguǎn ('to always take care of') is pronounced [lau̯˨˩pau̯˧˥kwan˨˩˦].
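
The two-syllable rule and the two three-syllable patterns above can be reproduced by applying sandhi inside each word first and then across word boundaries. The sketch below does exactly that and nothing more: the function name `third_tone_sandhi` and the input format (a list of non-empty words, each a list of tone numbers) are assumptions, and the procedure is a simplification that is not meant to capture stress, dialectal variation, or the more comprehensive systems proposed in the literature.

```python
def third_tone_sandhi(words):
    """Apply a simplified third-tone sandhi to a phrase.

    `words` is a list of words; each word is a non-empty list of tone
    numbers (1-4). Returns a new list of words with surface tones.
    """
    # Step 1: inside each word, a third tone directly before another
    # (underlying) third tone becomes second tone.
    result = []
    for word in words:
        changed = [
            2 if tone == 3 and i + 1 < len(word) and word[i + 1] == 3 else tone
            for i, tone in enumerate(word)
        ]
        result.append(changed)
    # Step 2: across a word boundary, a word-final third tone changes only
    # if the next word still begins with a third tone after step 1.
    for i in range(len(result) - 1):
        if result[i][-1] == 3 and result[i + 1][0] == 3:
            result[i][-1] = 2
    return result
```

With a two-syllable word followed by a one-syllable word, `third_tone_sandhi([[3, 3], [3]])` yields `[[2, 2], [3]]`; with the opposite grouping, `third_tone_sandhi([[3], [3, 3]])` yields `[[3], [2, 3]]`, matching the two patterns described above.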

Some linguists have put forward more comprehensive systems of sandhi rules for multiple third tone sequences. For example, it has been proposed that modifications are applied cyclically, initially within rhythmic feet (trochees; see below) and that sandhi "need not apply between two cyclic branches".

Tones on special syllables

Special rules apply to the tones heard on the morphemes 不 (bù, 'not') and 一 (yī, 'one').

For 不:

不 is pronounced with second tone when followed by a fourth-tone syllable.

Example: 不是 (bù + shì, 'to not be') becomes [pu˧˥ʂɻ̩˥˩]

In other cases, 不 is pronounced with fourth tone. However, when used between words in an A-not-A question, it may become neutral in tone (e.g. 是不是).

For 一:

一 is pronounced with second tone when followed by a fourth-tone syllable.

Example: 一定 (yī + dìng, 'must') becomes [i˧˥tiŋ˥˩]

Before a first, second or third tone syllable, 一 is pronounced with fourth tone.

Examples: 一天 (yī + tiān, 'one day') becomes [i˥˩tʰjɛn˥], 一年 (yī + nián, 'one year') becomes [i˥˩njɛn˧˥], 一起 (yī + qǐ, 'together') becomes [i˥˩t͡ɕʰi˨˩˦].

When final, or when it comes at the end of a multi-syllable word (regardless of the first tone of the next word), 一 is pronounced with first tone. It also has first tone when used as an ordinal number (or part of one), and when it is immediately followed by any digit (including another 一; hence syllables of the word 一一 and its compounds have first tone).

When 一 is used between two reduplicated words, it may become neutral in tone, e.g. 看一看 (kàn yi kàn, 'to take a look').

The numbers 七 (qī, 'seven') and 八 (bā, 'eight') sometimes display tonal behavior similar to that of 一, but for most modern speakers they are always pronounced with first tone. All of these numbers, and 不, were historically Ru (entering) tones, and as noted above, that tone does not have predictable reflexes in modern Chinese; this may account for the variation in tone on these words.
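
The rules for the two special morphemes described in this section amount to a short decision procedure. The following sketch encodes them under stated assumptions: the function names and keyword flags are hypothetical, tone 5 stands for the neutral tone, and the optional neutral readings are selected explicitly by the caller because the text says only that they "may" occur.

```python
def bu_tone(next_tone, a_not_a=False):
    """Tone of the negator (citation tone 4) before a syllable of `next_tone`."""
    if a_not_a:
        return 5  # optional neutral tone inside an A-not-A question
    if next_tone == 4:
        return 2
    return 4


def yi_tone(next_tone=None, word_final=False, ordinal=False,
            before_digit=False, reduplicated=False):
    """Tone of the numeral 'one' (citation tone 1) in context.

    `next_tone` is None when nothing follows.
    """
    if reduplicated:
        return 5  # optional neutral tone between two reduplicated words
    if next_tone is None or word_final or ordinal or before_digit:
        return 1
    if next_tone == 4:
        return 2
    if next_tone in (1, 2, 3):
        return 4
    # A following neutral-tone syllable is not covered by the rules above.
    raise ValueError("context not covered by the rules in this section")
```

For example, `yi_tone(4)` gives 2 (as in the word for 'must'), while `yi_tone(1)` gives 4 (as in 'one day').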

Second and fourth tone change

In conversational speech, for the rising tone (tone 2) and falling tone (tone 4), there are some situations (based on which tones are used immediately before and after) where the pitch contours will change.^[33]

Tone 2 becomes higher and changes its direction, approaching the tone 1 pitch contour, when put between tone 1 or 2 and any other full tone.

Tone pattern	Nominal	Changed
1-2-1	˥ ˧˥ ˥	˥ ˥˦ ˥
1-2-4	˥ ˧˥ ˥˩	˥ ˥˦ ˥˩
2-2-1	˧˥ ˧˥ ˥	˧˥ ˥˦ ˥
2-2-4	˧˥ ˧˥ ˥˩	˧˥ ˥˦ ˥˩
1-2-2	˥ ˧˥ ˧˥	˥ ˥˦ ˧˥
1-2-3	˥ ˧˥ ˨˩	˥ ˥˦ ˨˩
2-2-2	˧˥ ˧˥ ˧˥	˧˥ ˥˦ ˧˥
2-2-3	˧˥ ˧˥ ˨˩	˧˥ ˥˦ ˨˩

Rising tone induced by the tone 3 sandhi also undergoes this transformation.

Tone pattern	Nominal	With sandhi	Changed
3-3-3	˨˩ ˨˩ ˨˩	˧˥ ˧˥ ˨˩	˧˥ ˥˦ ˨˩
2-3-3	˧˥ ˨˩ ˨˩	˧˥ ˧˥ ˨˩	˧˥ ˥˦ ˨˩
1-3-3	˥ ˨˩ ˨˩	˥ ˧˥ ˨˩	˥ ˥˦ ˨˩

The status of this tone change is ambiguous, and some authors consider it a tone sandhi akin to the third tone sandhi. Yuen Ren Chao considered the changed tone 2 to be identical to tone 1, and Cao Wen treated it as tone 1 (before tones 1 or 4) or tone 4 (before tones 2 or 3). Both views are generalizations; the exact pitch contour of the changed tone 2 varies between mid-level ˧ in isolated words or at a slower speaking rate, and slightly falling high ˥ in a carrier sentence, at a faster speaking rate.

Tone 4 becomes lower and flatter, but still slightly falling, akin to Cantonese tone 3, when put between tone 3 or 4 and tone 1 or 4.

Tone pattern	Pitch (nominal)	Pitch (changed)
4-4-1	˥˩ ˥˩ ˥	˥˧ ˧ ˥
4-4-4	˥˩ ˥˩ ˥˩	˥˧ ˧ ˥˩
3-4-1	˨˩ ˥˩ ˥	˨˩ ˧ ˥
3-4-4	˨˩ ˥˩ ˥˩	˨˩ ˧ ˥˩
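
The conditions for the changed tone 2 and changed tone 4 can be summarized in a few lines of code. The sketch below is an approximation built only from the tables in this section: the function name `conversational_contours` is hypothetical, the input is assumed to be a list of surface tones 1 to 4 after third tone sandhi has already been applied, and only the medial syllable of each three-syllable window is adjusted (so the shortened fall of an initial tone 4 shown in the last table is not modelled).

```python
# Nominal contours in Chao numerals (third tone in its low "half-third" form).
NOMINAL = {1: "55", 2: "35", 3: "21", 4: "51"}


def conversational_contours(tones):
    """Return contours for a list of surface tones (1-4).

    A medial tone 2 after tone 1 or 2 becomes 54; a medial tone 4 between
    tone 3 or 4 and tone 1 or 4 becomes 33. Everything else stays nominal.
    """
    contours = [NOMINAL[t] for t in tones]
    for i in range(1, len(tones) - 1):
        prev, cur, nxt = tones[i - 1], tones[i], tones[i + 1]
        if cur == 2 and prev in (1, 2):
            contours[i] = "54"  # changed tone 2 (any full tone may follow)
        elif cur == 4 and prev in (3, 4) and nxt in (1, 4):
            contours[i] = "33"  # changed tone 4
    return contours
```

A 3-3-3 sequence is first turned into 2-2-3 by third tone sandhi, and `conversational_contours([2, 2, 3])` then returns the contours 35, 54, 21, in line with the second table.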

Unlike the changed tone 2, the pitch contour of the changed tone 4 was only insignificantly influenced by speaking rate, provided the speech remained at conversational speed. The resulting pitch contours, especially that of the changed tone 4, are not associated with any phonemic tone in Mandarin. In perceptual experiments, native Beijing Mandarin speakers could easily recognize the intended tone in the original word, but could not recognize it when it was stripped of its context by replacing the adjacent syllables with white noise:

Changed tone 2 was perceived as tone 1 in over 70% of responses
Changed tone 4 was perceived as tone 1 in over 50% of responses
Both of them were properly recognized in only 20% of responses

Besides speech rate, the frequency of an expression may also play a role in triggering this tone change. The changed tone 2, which normally requires a preceding tone 1 or 2, is also said to occur in place of sandhi tone 3 after an initial tone 4, but it remains to be seen whether there are more such examples.

Stress, rhythm and intonation

Stress within words (word stress) is not felt strongly by Chinese speakers, although contrastive stress is perceived easily (and functions much the same as in other languages). One of the reasons for the weaker perception of stress in Chinese may be that variations in the fundamental frequency of speech, which in many other languages serve as a cue for stress, are used in Chinese primarily to realize the tones. Nonetheless, there is still a link between stress and pitch – the range of pitch variation (for a given tone) has been observed to be greater on syllables that carry more stress.

As discussed above, weak syllables have neutral tone and are unstressed. Although this property can be contrastive, the contrast is interpreted by some as being primarily one of tone rather than stress. (Some linguists analyze Chinese as lacking word stress entirely.)

Apart from this contrast between full and weak syllables, some linguists have also identified differences in levels of stress among full syllables. In some descriptions, a multi-syllable word or compound is said to have the strongest stress on the final syllable, and the next strongest generally on the first syllable. Others, however, reject this analysis, noting that the apparent final-syllable stress can be ascribed purely to natural lengthening of the final syllable of a phrase, and disappears when a word is pronounced within a sentence rather than in isolation. San Duanmu takes this view, and concludes that it is the first syllable that is most strongly stressed. He also notes a tendency for Chinese to produce trochees – feet consisting of a stressed syllable followed by one (or in this case sometimes more) unstressed syllables. On this view, if the effect of "final-lengthening" is factored out:

In words (compounds) of two syllables, the first syllable has the main stress, and the second lacks stress.
In words (compounds) of three syllables, the first syllable is stressed most strongly, the second lacks stress, and the third may lack stress or have secondary stress.
In words (compounds) of four syllables, the first syllable is stressed most strongly, the second lacks stress, and the third or fourth may lack stress or have secondary stress depending on the syntactic structure of the compound.

The positions described here as lacking stress are the positions in which weak (neutral-tone) syllables may occur, although full syllables frequently occur in these positions also.

There is a strong tendency for Chinese prose to employ four-syllable 'prosodic words' consisting of alternating stressed and unstressed syllables which are further subdivided into two trochaic feet. This structure, sometimes known as a 'four-character template', is particularly prevalent in chengyu, which are classical idioms that are usually four characters in length.^[34] Statistical analysis of chengyu and other idiomatic phrases in vernacular texts indicates that the four-syllable prosodic word had become an important metrical consideration by the Wei and Jin dynasties (4th century CE).^[35]

This preference for trochaic feet may even result in polysyllabic words in which the foot and word (morpheme) boundaries do not align. For example, 捷克斯洛伐克 'Czechoslovakia' is stressed as 捷克/斯洛/伐克 and 南斯拉夫 'Yugoslavia' is stressed as 南斯/拉夫, even though the morpheme boundaries are 捷克/斯洛伐克 'Czech[o]/slovak[ia]' and 南/斯拉夫 'South/slav[ia]', respectively. The preferred stress pattern also has a complex effect on tone sandhi for the various Chinese dialects.^[36]
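
The footing of the two country names can be imitated by grouping syllables in pairs from the left, ignoring morpheme boundaries. This is only a toy illustration of the trochaic preference, not an account of Chinese prosody: the function name `trochaic_feet` is hypothetical, the pinyin syllables in the usage note are supplied here for illustration, and a leftover final syllable is simply placed in a foot of its own.

```python
def trochaic_feet(syllables):
    """Group syllables into two-syllable feet from left to right.

    The first syllable of each foot is the stressed one; morpheme
    boundaries are deliberately ignored.
    """
    return [syllables[i:i + 2] for i in range(0, len(syllables), 2)]
```

Applied to the six syllables of 'Czechoslovakia' (jié, kè, sī, luò, fá, kè), the function returns three feet of two syllables each, which cuts across the morpheme boundary after the second syllable only by coincidence of length, whereas for 'Yugoslavia' (nán, sī, lā, fū) the first foot joins the morpheme for 'south' with the first syllable of the next morpheme.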

This preference for a trochaic metrical structure is also cited as a reason for certain phenomena of word order variation within complex compounds, and for the strong tendency to use disyllabic words rather than monosyllables in certain positions. Many Chinese monosyllables have alternative disyllabic forms with virtually identical meaning.

Another function of voice pitch is to carry intonation. Chinese makes frequent use of particles to express meanings such as doubt, query, and command, which reduces the need to use intonation. However, intonation is still present in Chinese (expressing meanings much as it does in standard English), although there are varying analyses of how it interacts with the lexical tones. Some linguists describe an additional intonation rise or fall at the end of the last syllable of an utterance, while others have found that the pitch of the entire utterance is raised or lowered according to the desired intonational meaning.

References

Works cited

Chao, Yuen Ren (1948). Mandarin Primer: an Intensive Course in Spoken Chinese. ISBN 978-0-674-73288-9.
Chao, Yuen Ren (1968). A Grammar of Spoken Chinese (2nd ed.). ISBN 978-0-520-00219-7.
Duanmu, San (2000). The Phonology of Standard Chinese. Oxford University Press. ISBN 978-0-199-25831-4.
Duanmu, San (2007). The Phonology of Standard Chinese (2nd ed.). Oxford University Press. ISBN 978-0-199-21579-9.
Lin, Yen-Hwei (2007). The Sounds of Chinese. Cambridge University Press. ISBN 978-0-521-60398-0.
Ladefoged, Peter; Wu, Zongji (1984). "Places of Articulation: An Investigation of Pekingese Fricatives". Journal of Phonetics. 12: 267–78. doi:10.1016/S0095-4470(19)30883-6.
Ladefoged, Peter; Maddieson, Ian (1996). The Sounds of the World's Languages. Oxford: Blackwell.
Lee, Wai-Sum; Zee, Eric (2003). "Standard Chinese (Beijing)". Journal of the International Phonetic Association. 33 (1): 109–112. doi:10.1017/S0025100303001208.
Norman, Jerry (1988). Chinese. Cambridge University Press. ISBN 978-0-521-29653-3.
Zhu, Xiaonong; Wang, Caiyu (2015). "Tone". In Wang, William S.-Y.; Sun, Chaofen (eds.). The Oxford Handbook of Chinese Linguistics. Oxford University Press. pp. 503–515. ISBN 978-0-199-85633-6.
Chinese Phonetics Textbook Editorial Committee, National Taiwan Normal University (2008). 國音學 [Mandarin Chinese Phonetics] (in Chinese) (8th ed.). Zhengzhong shuju. ISBN 978-9-570-91808-3.

Notes and References

^[1] Lee, Wai-Sum. "An articulatory and acoustical analysis of the syllable-initial sibilants and approximant in Beijing Mandarin". Proceedings of the 14th International Congress of Phonetic Sciences.
^[2] "普通话是以北京语音为标准音的。那么北京人为什么还要学习普通话语音呢？这是因为北京话也是一种汉语方言。普通话采取北京语音系统作为标准音，并不是不加分析、不加选择地采用，而是要排除北京话的特殊土语成分。北京话的特殊土语成分可以表现在语音的许多方面。例如：在声母上，一部分年轻女性把普通话舌面前音j、q、x读作舌尖前音z、c、s。[...] 所以，北京人也有一个学习普通话语音的问题。" (Translation: "Putonghua takes the Beijing pronunciation as the standard pronunciation. Then why would Beijingers need to learn Putonghua's pronunciation? Because Beijing Chinese is a Chinese dialect. Putonghua does not absorb the Beijing pronunciation indiscriminately, but has to exclude special dialectal constituents. Special dialectal constituents of Beijing may exhibit in many aspects in pronunciation. For instance, in terms of initials, some young females pronounce anterior dorsal j, q, and x as anterior coronal z, c, and s. [...] Therefore, Beijingers still have the problem of learning Putonghua pronunciation.") 北京市语言文字工作委员会办公室 (2005). 普通话水平测试指导用书（北京版）. Beijing: 商务印书馆. p. 6.
^[3] Richard Wiese.
^[4] Wu, Chen-Huei; Shih, Chilin (2009). "Mandarin Vowels Revisited: Evidence from Electromagnetic Articulography". Annual Meeting of the Berkeley Linguistics Society. 35 (1): 329–340. doi:10.3765/bls.v35i1.3622.
^[5] Wan, I-Ping; Jaeger, Jeri J. (2003). "The Phonological Representation of Taiwan Mandarin Vowels: A Psycholinguistic Study". Journal of East Asian Linguistics. 12 (3): 205–257. doi:10.1023/A:1023666819363.
^[6] Hashimoto, Mantaro (1970). "Notes on Mandarin Phonology". In Jakobson, Roman; Kawamoto, Shigeo (eds.). Studies in General and Oriental Linguistics. Tokyo: TEC. pp. 207–220. ISBN 978-0-404-20311-5.
^[7] Mou, Xiaomin (2006). Nasal Codas in Standard Chinese: A Study in the Framework of the Distinctive Feature Theory (Ph.D. thesis). Massachusetts Institute of Technology. hdl:1721.1/35283.
^[8] Mair, Victor H. (1990). Two Non-Tetragraphic Northern Sinitic Languages: a) Implications of the Soviet Dungan Script for Chinese Language Reform. Sino-Platonic Papers, no. 18. Philadelphia, PA: University of Pennsylvania. p. A-10.
^[9] Mair, Victor (November 29, 2014). [reply to comment in "Punning banned in China"]. Language Log. Retrieved 30 November 2014. "At the monosyllabic level, there are a lot of homophones, but the average length of a word is approximately two syllables. So, at the level of the word, there's no problem with homophony."
^[10] Pollack, John (2011). "Chapter 5 - More than Some Antics: Why Puns Matter". The Pun Also Rises: How the Humble Pun Revolutionized Language, Changed History, and Made Wordplay More Than Some Antics. New York, NY: Gotham Books. ISBN 978-1-592-40623-4.
^[11] Wines, Michael (11 March 2009). "A Dirty Pun Tweaks China's Online Censors". Retrieved 12 March 2009.
^[12] Yip, Po-ching (2000). The Chinese Lexicon: A Comprehensive Survey. London: Routledge. p. 29. ISBN 0-415-15174-0.
^[13] Surendran, Dinoj; Levow, Gina-Anne (2004). "The Functional Load of Tone in Mandarin Is as High as That of Vowels". Proceedings of the International Conference on Speech Prosody 2004. Nara, Japan. pp. 99–102. http://www.cs.uchicago.edu/~dinoj/research/fltonemandarin.pdf . Archived at https://web.archive.org/web/20210319202408/http://people.cs.uchicago.edu/~dinoj/research/fltonemandarin.pdf on March 19, 2021. Retrieved November 16, 2021.
^[14] 上聲 - 國語辭典 (in Chinese). 中華民國教育部. https://www.moedict.tw/%E4%B8%8A%E8%81%B2 . Retrieved 2018-11-28.
^[15] 古代汉语大词典大字本 (in Chinese). Beijing: 商务印书馆. 2002. p. 1369. ISBN 978-7-100-03515-6.
^[16] 中国社会科学院语言研究所词典编辑室 (2006). 现代汉语词典（第5版） (in Chinese). Beijing: 商务印书馆. p. 1193. ISBN 978-7-100-04385-4.
^[17] Cao, Wen (2007). 汉语语音教程 (in Chinese). Beijing Shi: 北京语言大学出版社. https://books.google.com/books?id=LSREzgEACAAJ .
^[18] Fon, Yee-Jean (1999). "What Does Chao Have to Say about Tones? A Case Study of Taiwan Mandarin". AH.
^[19] Shi, Feng 石鋒; Deng, Dan 鄧丹 (2006). "普通話與台灣國語的語音對比". In Ho, Dah-an 何大安; Cheung, H. Samuel 張洪年; Pan, Wuyun 潘悟雲; Wu, Fuxiang 吳福祥 (eds.). 山高水長：丁邦新先生七秩壽慶論文集 [Linguistic Studies in Chinese and Neighboring Languages: Festschrift in Honor of Professor Pang-hsin Ting on His 70th Birthday] (in Chinese). Vol. 1. pp. 371–393. http://www.ling.sinica.edu.tw/Files/LL/Docments/Monographs/Linguistics%20Studies%20in%20Chinese%20and%20Neighboring%20Languages/Volume%201/16-%E7%9F%B3%E9%8B%92%26%E9%84%A7%E4%B8%B9.pdf . Archived at https://web.archive.org/web/20210919220351/http://www.ling.sinica.edu.tw/Files/LL/Docments/Monographs/Linguistics%20Studies%20in%20Chinese%20and%20Neighboring%20Languages/Volume%201/16-%E7%9F%B3%E9%8B%92%26%E9%84%A7%E4%B8%B9.pdf on 2021-09-19. Retrieved 2021-12-11.
^[20] Sanders, Robert (2008). "Tonetic Sound Change in Taiwan Mandarin: The Case of Tone 2 and Tone 3 Citation Contours". In Chan, Marjorie K. M.; Kang, Hana (eds.). Proceedings of the 20th North American Conference on Chinese Linguistics (NACCL-20). Vol. 1. Columbus, OH: The Ohio State University. pp. 87–107. https://naccl.osu.edu/sites/naccl.osu.edu/files/00_sanders-r.pdf .
^[21] Zhang, Ling 張凌 (2018). 香港人學習普通話的聲調偏誤之聲學分析. https://repository.eduhk.hk/en/publications/%E9%A6%99%E6%B8%AF%E4%BA%BA%E5%AD%B8%E7%BF%92%E6%99%AE%E9%80%9A%E8%A9%B1%E7%9A%84%E8%81%B2%E8%AA%BF%E5%81%8F%E8%AA%A4%E4%B9%8B%E8%81%B2%E5%AD%B8%E5%88%86%E6%9E%90 .
^[22] Khoo, Hui-lu 許慧 (2020). "「台中腔」－台灣中部華語的聲調特徵及其成因初探" [A Preliminary Study of the Tonal Features of Central Taiwan Mandarin]. Taiwan Journal of Linguistics (in Chinese). 18 (1): 115–157. doi:10.6519/TJL.202001_18(1).0004. http://tjl.nccu.edu.tw/main/uploads/TJL_18.1_.4_version_11_.pdf .
^[23] Hsu, Huiju (2004). "Taiwan Mandarin - Does It Remain Homogeneous?". 2004 International Symposium on Chinese Spoken Language Processing: Proceedings: December 15-18, 2004, the Chinese University of Hong Kong, Hong Kong. Piscataway, NJ: IEEE. pp. 129–132. ISBN 0-7803-8678-7. https://scholar.lib.ntnu.edu.tw/en/publications/taiwan-mandarin-does-it-remain-homogeneous-2 .
^[24] Kuang, Jianjing (2017-09-01). "Covariation between voice quality and pitch: Revisiting the case of Mandarin creaky voice". The Journal of the Acoustical Society of America. 142 (3): 1693–1706. doi:10.1121/1.5003649.
^[25] Shen, Xiaonan Susan (1990). "On Mandarin Tone 4". Australian Journal of Linguistics. 10 (1): 41–59. doi:10.1080/07268609008599431.
^[26] Wang Jialing.
^[27] Zhang, Jie (2007). "A Directional Asymmetry in Chinese Tone Sandhi Systems". Journal of East Asian Linguistics. 16 (4): 259–302. doi:10.1007/s10831-007-9016-2.
^[28] Huang, C.-T. James; Li, Y.-H. Audrey; Simpson, Andrew (2014). The Handbook of Chinese Linguistics. Wiley Blackwell. ISBN 978-0-470-65534-4.
^[29] Yiya Chen.
^[30] 凡例, section 3.5: "一般读轻声、间或重读的字" ("characters that are usually read with neutral tone and occasionally with full stress"). 现代汉语词典 [A Dictionary of Current Chinese] (in Chinese) (7th ed.). Beijing: 商务印书馆. 2016. p. 3. ISBN 978-7-100-12450-8.
^[31] "重·次轻词语后一音节的读音不太稳定，有"轻化"倾向，是"可轻读词语"。" ("The pronunciation of the second syllable of 重·次轻 words is not very stable and tends towards weakening; these are words that may be read with a weak syllable.") Xing, Fuyi 邢福义 (2011). 普通话培训测试指要 (in Chinese) (2nd ed.). Wuhan: 华中师范大学出版社. p. 172. ISBN 978-7-562-24795-1.
^[32] "4一般读轻声、间或重读的音节" ("syllables that are usually read with neutral tone and occasionally with full stress"). 普通话水平测试实施纲要 (in Chinese). Beijing: 商务印书馆. 2004. p. 43. ISBN 7-100-03996-7.
^[33] Xu, Yi (1993). Contextual tonal variation in Mandarin Chinese (PhD thesis). The University of Connecticut.
^[34] Zhu, Saiping 朱赛萍 (2015). Duanmu, San; Feng, Shengli; Wang, Hongjun (eds.). 汉语的四字格 (in Chinese). Beijing Shi. ISBN 978-7-5619-4361-8.
^[35] Feng, Shengli (2018). Prosodic Morphology of Mandarin Chinese. Abingdon, Oxon: Routledge. ISBN 978-1-315-39276-9.
^[36] Duanmu, San (2005). "The Tone-Syntax Interface in Chinese: Some Recent Controversies". In Kaji, Sigeki (ed.). Proceedings of the Symposium "Cross-Linguistic Studies of Tonal Phenomena, Historical Development, Tone-Syntax Interface, and Descriptive Studies". Tokyo: Research Institute for Languages and Cultures of Asia and Africa (ILCAA). pp. 16–17.