Sotho phonology explained

Notes:
  • The orthography used in this and related articles is that of South Africa, not Lesotho. For a discussion of the differences between the two see the notes on Sesotho orthography.
The phonology of Sesotho and those of the other Sotho–Tswana languages are radically different from those of "older" or more "stereotypical" Bantu languages. Modern Sesotho in particular has very mixed origins (due to the influence of Difaqane refugees), inheriting many words and idioms from non-Sotho–Tswana languages.

There are in total 39 consonantal phonemes[1] (plus 2 allophones) and 9 vowel phonemes (plus two close raised allophones). The consonants include a rich set of affricates and palatal and postalveolar consonants, as well as three click consonants.

Historical sound changes

Probably the most radical sound innovation in the Sotho–Tswana languages is that the Proto-Bantu prenasalized consonants have become simple stops and affricates.[2] Thus isiZulu words such as entabeni ('on the mountain'), impuphu ('flour'), ezinkulu ('the big ones'), ukulanda ('to fetch'), ukulamba ('to become hungry'), and ukuthenga ('to buy') are cognate with Sesotho [tʰɑbeŋ̩] thabeng, [pʰʊfʊ] phofo, [t͡sʼexʊlʊ] tse kgolo, [hʊlɑtʼɑ] ho lata, [hʊlɑpʼɑ] ho lapa, and [hʊʀɛkʼɑ] ho reka, respectively (with the same meanings).

This is further intensified by the law of nasalization and nasal homogeneity, making derived and imported words have syllabic nasals followed by homogeneous consonants, instead of prenasalized consonants.

Another important sound change in Sesotho which distinguishes it from almost all other Sotho–Tswana languages and dialects is the chain shift from /x/ and /k͡xʰ/ to /h/ and /x/ (the shift of /k͡xʰ/ to /x/ is not yet complete).

In certain respects, however, Sesotho is more conservative than other Sotho–Tswana languages. For example, the language still retains the difference in pronunciation between /ɬ/, /t͡ɬʰ/, and /tʰ/.[3] Many other Sotho–Tswana languages have lost the fricative /ɬ/, and some Northern Sotho languages, possibly influenced by Tshivenda, have also lost the lateral affricate and pronounce all three historical consonants as /tʰ/ (they have also lost the distinction between /t͡ɬ/ and /t/; thus, for example, speakers of the Northern Sotho language commonly called Setlokwa call their language "Setokwa").[4]

The existence of (lightly) ejective consonants (all unvoiced unaspirated stops) is highly unusual for a Bantu language and is thought to be due to Khoisan influence. These consonants occur in the Sotho–Tswana and Nguni languages (being over four times more common in Southern Africa than anywhere else in the world), and the ejective quality is strongest in isiXhosa, which has been greatly influenced by Khoisan phonology.

As with most other Bantu languages, almost all palatal and postalveolar consonants are due to some form of palatalization or other related phenomena which result from a (usually palatal) approximant or vowel being "absorbed" into another consonant (with a possible subsequent nasalization).

The Southern Bantu languages have lost the Bantu distinction between long and short vowels. In Sesotho the long vowels have simply been shortened without any other effects on the syllables; while sequences of two dissimilar vowels have usually resulted in the first vowel being "absorbed" into the preceding consonant, and causing changes such as labialization and palatalization.

As with most Southern African Bantu languages, the "composite" or "secondary" vowels *e and *o have become /ɛ, e/ and /ɔ, o/. These usually behave as two phonemes (conditioned by vowel harmony), although there are enough exceptions to justify the claim that they have become four separate phonemes in the Sotho–Tswana languages.

Additionally, the first-degree (or "superclose", "heavy") and second-degree vowels have not merged as in many other Bantu languages, resulting in a total of 9 phonemic vowels.

Almost uniquely among the Sotho–Tswana languages, Sesotho has adopted clicks.[5] There is one place of articulation, alveolar, and three manners and phonations: tenuis, aspirated, and nasalized. These most probably came with loanwords from the Khoisan and Nguni languages, though they also exist in various words which don't exist in these languages and in various ideophones.

These clicks also appear in environments which are rare or non-existent in the Nguni and Khoisan languages, such as a syllabic nasal followed by a nasalized click ([ŋ̩ǃn], written (nnq), as in [ŋ̩ǃnɑnɪ] nnqane 'that other side'), a syllabic nasal followed by a tenuis click ([ŋ̩ǃ], also written (nq), as in [sɪŋ̩ǃɑŋ̩ǃɑnɪ] senqanqane 'frog'; this is not the same as the prenasalized radical click written (nkq) in the Nguni languages), and a syllabic nasal followed by an aspirated click ([ŋ̩ǃʰ], written (nqh), as in [sɪǃʰɪŋ̩ǃʰɑ] seqhenqha 'hunk').

Vowels

Sesotho has a large inventory of vowels compared with many other Bantu languages. However, the nine phonemic vowels are collapsed into only five letters in the Sesotho orthography. The two close vowels i and u (sometimes called "superclose" or "first-degree" by Bantuists) are very high (with advanced tongue root) and are better approximated by French vowels than English vowels. That is especially true for /u/, which, in English, is often noticeably more front and can be transcribed as [u̟] or [ʉ] in the IPA; such fronting is absent from Sesotho (and French).

Vowels[6] (phoneme: Sesotho example; approximate English or French equivalent)
  • /i/: [huˌbit͡sʼɑ] ho bitsa ('to call'); beet
  • /u/: [tʼumɔ] tumo ('fame'); boot
  • /ɪ/: [hʊlɪkʼɑ] ho leka ('to attempt'); pit
  • /ʊ/: [pʼʊt͡sʼɔ] potso ('query'); put
  • /e/: [hʊʒʷet͡sʼɑ] ho jwetsa ('to tell'); cafe
  • /o/: [pʼon̩t͡sʰɔ] pontsho ('proof'); oiseau
  • /ɛ/: [hʊʃɛbɑ] ho sheba ('to look'); bed
  • /ɔ/: [mʊŋɔlɔ] mongolo ('writing'); board
  • /ɑ/: [hʊˈɑbɛlɑ] ho abela ('to distribute'); spa
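The collapse of nine phonemic vowels into five letters can be illustrated with a small lookup table. This is only a sketch: the names `LETTER_TO_PHONEMES` and `possible_vowels` are hypothetical, and the letter-to-phoneme assignments are read off the example words in the vowel table (for instance, written e stands for [ɪ] in ho leka, [e] in ho jwetsa, and [ɛ] in ho sheba).

```python
# Hypothetical lookup: orthographic vowel letter -> phonemes it may spell.
# Assignments are inferred from the example words in the vowel table.
LETTER_TO_PHONEMES = {
    "i": ["i"],
    "e": ["ɪ", "e", "ɛ"],
    "a": ["ɑ"],
    "o": ["ʊ", "o", "ɔ"],
    "u": ["u"],
}


def possible_vowels(letter):
    """Return the phonemes that a written vowel letter may stand for."""
    return LETTER_TO_PHONEMES[letter.lower()]


# Total number of phonemes covered by the five letters.
phoneme_count = sum(len(v) for v in LETTER_TO_PHONEMES.values())
print(phoneme_count, "phonemes,", len(LETTER_TO_PHONEMES), "letters")
```

The practical consequence is that the spelling alone does not determine which of the three mid or near-close qualities a written e or o represents.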

Consonants

The Sotho–Tswana languages are unusual within the Bantu family in that most do not have any prenasalized consonants and have a rather large number of heterorganic compounds. Sesotho, uniquely among the recognised and standardised Sotho–Tswana languages, also has click consonants, which were acquired from Khoisan and Nguni languages.

Consonant phonemes by manner of articulation (places of articulation in the chart: labial, alveolar (central and lateral), post-alveolar/palatal, velar, uvular, glottal)
  • Click: glottalized /ǃkʼ/, aspirated /ǃʰ/, nasal /ᵑǃ/
  • Nasal: /m/, /n/, /ɲ/, /ŋ/
  • Stop, ejective: /pʼ/, /tʼ/, /kʼ/
  • Stop, aspirated: /pʰ/, /tʰ/, /kʰ/
  • Stop, voiced: /b/, ([d])1
  • Affricate, ejective: /t͡sʼ/, /t͡ɬʼ/, /t͡ʃʼ/
  • Affricate, aspirated: /t͡sʰ/, /t͡ɬʰ/, /t͡ʃʰ/, /k͡xʰ/ ~ /x/
  • Fricative, voiceless: /f/, /s/, /ɬ/, /ʃ/, /h/ ~ /ɦ/
  • Fricative, voiced: /ʒ/ ~ /d͡ʒ/
  • Approximant: /l/, /j/, /w/
  • Trill: /r/, /ʀ/
  1. [d] is an allophone of /l/, occurring only before the close vowels (/i/ and /u/). Dialectal evidence shows that in the Sotho–Tswana languages /l/ was originally pronounced as a retroflex flap [ɽ] before the two close vowels.

Sesotho makes a three-way distinction between lightly ejective, aspirated and voiced stops in several places of articulation.

Stops (place, IPA, orthography: notes; example)
  • bilabial /pʼ/, written p: unaspirated, as in spit; [pʼit͡sʼɑ] pitsa ('cooking pot')
  • bilabial /pʰ/, written ph: [pʰupʼut͡sʼɔ] phuputso ('investigation')
  • bilabial /b/, written b: this consonant is fully voiced; [lɪbɪsɪ] lebese ('milk')
  • alveolar /tʼ/, written t: unaspirated, as in stalk; [bʊtʼɑlɑ] botala ('greenness')
  • alveolar /tʰ/, written th: [tʰɑʀʊl̩lɔ] tharollo ('solution')
  • alveolar [d], written d: an allophone of /l/, only occurring before the close vowels (/i/ and /u/); [muˌdimʊ] Modimo ('God')
  • velar /kʼ/, written k: unaspirated, as in skill; [buˌˈikʼɑʀɑbɛlɔ] boikarabelo ('responsibility')
  • velar /kʰ/, written kh: fully aspirated, as in kill; occurring mostly in old loanwords from Nguni languages and in ideophones; [lɪkʰɔkʰɔ] lekhokho ('pap baked onto the pot')

Sesotho possesses four simple nasal consonants. All of these can be syllabic and the syllabic velar nasal may also appear at the end of words.

Nasals (place, IPA, orthography: notes; example)
  • bilabial /m/, written m: [hʊmɑmɑʀet͡sʼɑ] ho mamaretsa ('to glue')
  • bilabial /m̩/, written m: syllabic version of the above; [m̩pɑ] mpa ('stomach')
  • alveolar /n/, written n: [lɪnɑnɛˈɔ] lenaneo ('programme')
  • alveolar /n̩/, written n: syllabic version of the above; [n̩nɑ] nna ('I')
  • alveolo-palatal /ɲ/, written ny: a bit like Spanish el niño; [hʊɲɑlɑ] ho nyala ('to marry')
  • alveolo-palatal /ɲ̩/, written n: syllabic version of the above; [ɲ̩ɲeʊ] nnyeo ('so-and-so')
  • velar /ŋ/, written ng: can occur initially; [lɪŋɔlɔ] lengolo ('letter')
  • velar /ŋ̩/, written n: syllabic version of the above; [hʊŋ̩kʼɑ] ho nka ('to take')

The following approximants occur. All instances of /w/ and /j/ most probably come from original close /ʊ/, /ɪ/, /u/, and /i/ vowels or Proto-Bantu *u, *i, *û, and *î (under certain circumstances).

Note that when (w) appears as part of a syllable onset this actually indicates that the consonant is labialized.

Approximants (place, IPA, orthography: notes; example)
  • labial-velar /w/, written w: [sɪwɑ] sewa ('epidemic')
  • lateral /l/, written l: never occurs before close vowels (/i/ and /u/), where it becomes [d]; [sɪlɛpʼɛ] selepe ('axe')
  • lateral /l̩/, written l: a syllabic version of the above; note that if the sequence [l̩l] is followed by the close [i] or [u] then the second [l] is pronounced normally, not as a [d]; [mʊl̩lɔ] mollo ('fire')
  • palatal /j/, written y: [hʊt͡sʼɑmɑjɑ] ho tsamaya ('to walk')
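The distribution of [d] as an allophone of /l/ can be expressed as a single rewrite rule. The following is a minimal sketch, assuming phonemic forms are given as plain IPA strings; the function name `realize_l` and the input forms are hypothetical, and the close-vowel raising seen in forms such as [muˌdimʊ] is not modelled.

```python
import re

CLOSE_VOWELS = "iu"
SYLLABIC_MARK = "\u0329"  # combining vertical line below, as in syllabic l


def realize_l(phonemic):
    """Realize /l/ as [d] before /i/ or /u/, except directly after syllabic l."""
    # The negative lookbehind skips an l that follows a syllabic l,
    # since that second l is pronounced normally.
    pattern = "(?<!l" + SYLLABIC_MARK + ")l(?=[" + CLOSE_VOWELS + "])"
    return re.sub(pattern, "d", phonemic)


print(realize_l("mʊlimʊ"))   # assumed underlying form of Modimo
print(realize_l("sɪlɛpʼɛ"))  # selepe: no close vowel follows, so l stays
```

Because the syllabic l is itself followed by the combining mark rather than by a vowel, the rule never rewrites it.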

The following fricatives occur. The glottal fricative is often voiced between vowels, making it barely noticeable.[7] The alternative orthography used for the velar fricative is due to some loanwords from Afrikaans and ideophones which were historically pronounced with velar fricatives, distinct from the velar affricate. The voiced postalveolar affricate sometimes occurs as an alternative to the fricative.

Fricatives (place, IPA, orthography: notes; example)
  • labiodental /f/, written f: [huˌfumɑnɑ] ho fumana ('to find')
  • alveolar /s/, written s: [sɪsʊtʰʊ] Sesotho
  • postalveolar /ʃ/, written sh: [mʊʃʷɛʃʷɛ] Moshweshwe ('Moshoeshoe I')
  • postalveolar /ʒ/, written j: [mʊʒɑlɪfɑ] mojalefa ('heir')
  • lateral /ɬ/, written hl: [hʊɬɑɬʊbɑ] ho hlahloba ('to examine')
  • velar /x/, written kg, and also (g) in Gauta ('Gauteng') [xɑˈutʼɑ] and some ideophones such as gwa ('of extreme whiteness') [xʷɑ]: [sɪxɔ] sekgo ('spider')
  • glottal /h/, written h: [hʊˈɑhɑ] ho aha ('to build')

There is one trill consonant. Originally, this was an alveolar rolled lingual, but today most individuals pronounce it at the back of the tongue, usually at the uvular position. The uvular pronunciation is largely attributed to the influence of French missionaries at Morija in Lesotho. Just like the French version, the position of this consonant is somewhat unstable and often varies even in individuals, but it generally differs from the "r"'s of most other South African language communities. The most stereotypical French-like pronunciations are found in certain rural areas of Lesotho, as well as some areas of Soweto (where this has affected the pronunciation of Tsotsitaal).

Trills (place, IPA, orthography: notes; example)
  • alveolar /r/, written r: can also be a tap; similar to the Spanish perro; [ke.a u ɾata] kea o rata ('I love you')
  • uvular /ʀ/, written r: soft Parisian-type r; [muˌʀiʀi] moriri ('hair')

Sesotho has a relatively large number of affricates. The velar affricate, which was standard in Sesotho until the early 20th century, now only occurs in some communities as an alternative to the more common velar fricative.[8]

Affricates (place, IPA, orthography: notes; example)
  • alveolar /t͡sʼ/, written ts: [hʊt͡sʼʊkʼʊt͡sʼɑ] ho tsokotsa ('to rinse')
  • alveolar /t͡sʰ/, written tsh: aspirated; [hʊt͡sʰʊhɑ] ho tshoha ('to become frightened')
  • lateral /t͡ɬʼ/, written tl: [hʊt͡ɬʼɑt͡sʼɑ] ho tlatsa ('to fill')
  • lateral /t͡ɬʰ/, written tlh: occurs only as a nasalized form of hl or as an alternative to it; [t͡ɬʰɑhɔ] tlhaho ('nature')
  • postalveolar /t͡ʃʼ/, written tj: [ɲ̩t͡ʃʼɑ] ntja ('dog')
  • postalveolar /t͡ʃʰ/, written tjh: [hʊɲ̩t͡ʃʰɑfɑt͡sʼɑ] ho ntjhafatsa ('to renew')
  • postalveolar /d͡ʒ/, written j: this is an alternative to the fricative /ʒ/; [hʊd͡ʒɑ] ho ja ('to eat')
  • velar /k͡xʰ/, written kg: alternative to the velar fricative; [k͡xʰɑlɛ] kgale ('a long time ago')

The following click consonants occur.[9] In common speech they are sometimes substituted with dental clicks. Even in standard Sesotho the nasal click is usually substituted with the tenuis click. (nq) is also used to indicate a syllabic nasal followed by an ejective click (/ŋ̩ǃkʼ/), while (nnq) is used for a syllabic nasal followed by a nasal click (/ŋ̩ǃŋ/).

Clicks (place, IPA, orthography: notes; example)
  • postalveolar /ǃkʼ/, written q: ejective; [hʊǃkʼɔǃkʼɑ] ho qoqa ('to chat')
  • postalveolar /ᵑǃ/, written nq: nasal; this is often pronounced as an ejective click; [hʊᵑǃʊsɑ] ho nqosa ('to accuse')
  • postalveolar /ǃʰ/, written qh: aspirated; [lɪǃʰekʼu] leqheku ('an elderly person')

The following heterorganic compounds occur. They are often substituted with other consonants, although there are a few instances when some of them are phonemic and not just allophonic. These are not considered consonant clusters.

In non-standard speech these may be pronounced in a variety of ways. bj may be pronounced /bj/ (followed by a palatal glide) and pj may be pronounced /pjʼ/. pj may also sometimes be pronounced /ptʃʼ/, which may alternatively be written ptj, though this is not to be considered standard.

Heterorganic compounds (place, IPA, orthography: notes; example)
  • bilabial-palatal /pʃʼ/, written pj: alternative tj; [hʊpʃʼɑt͡ɬʼɑ] ho pjatla ('to cook well')
  • bilabial-palatal /pʃʰ/, written pjh: aspirated version of the above; alternative tjh; [m̩pʃʰe] mpjhe ('ostrich')
  • bilabial-palatal /bʒ/, written bj: alternative j; [hʊbʒɑʀɑnɑ] ho bjarana ('to break apart')
  • labiodental-palatal /fʃ/, written fj: only found in short passives of verbs ending with [fɑ] fa; alternative sh; [hʊbɔfʃʷɑ] ho bofjwa ('to be tied')

Syllable structure

Sesotho syllables tend to be open, with syllabic nasals and the syllabic approximant l also allowed. Unlike almost all other Bantu languages, Sesotho does not have prenasalized consonants (NC).

  1. The onset may be any consonant (C), a labialized consonant (Cw), an approximant (A), or a vowel (V).
  2. The nucleus may be a vowel, a syllabic nasal (N), or the syllabic l (L).
  3. No codas are allowed.

The possible syllables are therefore V, CV, CwV, AV, N, and L.

Note that heterorganic compounds count as single consonants, not consonant clusters.

Additionally, the following phonotactic restrictions apply:

  1. A consonant may not be followed by the palatal approximant /j/ (i.e. C+y is not a valid onset).[10]
  2. Neither the labio-velar approximant /w/ nor a labialized consonant may be followed by a back vowel at any time.
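The syllable template and the two restrictions can be combined into a small validity check. This is a sketch under stated assumptions: the function `is_valid_syllable` and its argument layout are hypothetical, syllabic nasals and syllabic l are assumed to stand alone without an onset, and [ɑ] is not counted as a back vowel for restriction 2 (forms such as [hʊt͡sʼʷɑ] ho tswa show labialized consonants before it).

```python
BACK_VOWELS = {"u", "ʊ", "o", "ɔ"}  # assumption: ɑ is excluded
VOWELS = {"i", "ɪ", "e", "ɛ", "ɑ"} | BACK_VOWELS
# Syllabic m, n, ɲ, ŋ and l, built with the combining syllabicity mark.
SYLLABIC = {c + "\u0329" for c in "mnɲŋl"}


def is_valid_syllable(onset, nucleus, labialized=False):
    """Check one syllable; onset is '' when the syllable has no onset."""
    if nucleus in SYLLABIC:
        # Assumption: syllabic consonants form a syllable on their own.
        return onset == "" and not labialized
    if nucleus not in VOWELS:
        return False  # no other nuclei exist, and codas are not allowed
    if labialized and onset in ("", "w", "j"):
        return False  # only a true consonant can be labialized (Cw)
    # Restriction 1: a consonant may not be followed by /j/.
    if len(onset) > 1 and onset.endswith("j"):
        return False
    # Restriction 2: no /w/ or labialized consonant before a back vowel.
    if (onset == "w" or labialized) and nucleus in BACK_VOWELS:
        return False
    return True


print(is_valid_syllable("n", "ɑ", labialized=True))  # as in ho nwa
print(is_valid_syllable("n", "ɔ", labialized=True))  # labialized + back vowel
```

Heterorganic compounds such as /pʃʼ/ would be passed as a single onset string, in line with their treatment as single consonants.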

Syllabic l occurs only due to a vowel being elided between two ls:

[mʊlɪlɔ] *molelo (Proto-Bantu *mu-dido) > [mʊl̩lɔ] mollo ('fire') (cf. Setswana molelo, isiZulu umlilo)

[hʊlɪlɑ] *ho lela (Proto-Bantu *-dida) > [hʊl̩lɑ] ho lla ('to cry') (cf. Setswana go lela, isiXhosa ukulila, Tshivenda u lila)

isiZulu ukuphuma ('to emerge') > ukuphumelela ('to succeed') > Sesotho [hʊpʰʊmɛl̩lɑ] ho phomella

There are no contrastive long vowels in Sesotho, the rule being that juxtaposed vowels form separate syllables (which may sound like long vowels with undulating tones during natural fast speech).[11] Originally there might have been a consonant between vowels which was eventually elided that prevented coalescence or other phonological processes (Proto-Bantu *g, and sometimes *j).

Other Bantu languages have rules against vowel juxtaposition, often inserting an intermediate approximant if necessary.

Sesotho [xɑˈutʼeŋ̩] Gauteng ('Gauteng') > isiXhosa Erhawudeni

Phonological processes

Vowels and consonants very often influence one another, resulting in predictable sound changes. Most of these changes are either vowels changing vowels, nasals changing consonants, or approximants changing consonants. The sound changes are nasalization, palatalization, alveolarization, velarization, vowel elision, vowel raising, and labialization. Sesotho nasalization and vowel raising are particularly unusual since, unlike most processes in most languages, they actually decrease the sonority of the phonemes.

Nasalization (alternatively Nasal permutation or Strengthening) is a process in Bantu languages by which, in certain circumstances, a prefixed nasal becomes assimilated to a succeeding consonant and causes changes in the form of the phone to which it is prefixed. In the Sesotho language series of articles it is indicated by (N).

In Sesotho it is a fortition process and usually occurs in the formation of class 9 and 10 nouns, in the use of the objectival concord of the first person singular, in the use of the adjectival and enumerative concords of some noun classes, and in the forming of reflexive verbs (with the reflexive prefix).

Very roughly speaking, voiced consonants become devoiced and fricatives (except /x/[12]) lose their fricative quality.

Vowels and the approximant /w/ get a /kʼ/ in front of them.[13]

/b/ > /pʼ/

/l/ > /tʼ/

/f/ > /pʰ/

/ʀ/ > /tʰ/

/s/ > /t͡sʰ/

/ʃ/ > /t͡ʃʰ/

/ɬ/ > /t͡ɬʰ/ (except with adjectives)

The syllabic nasal causing the change is usually dropped, except for monosyllabic stems and the first person objectival concord. Reflexive verbs don't show a nasal.

[hʊˈɑʀɑbɑ] ho araba ('to answer') > [kʼɑʀɑbɔ] karabo ('response'), [hʊŋ̩kʼɑʀɑbɑ] ho nkaraba ('to answer me'), and [huˌˈikʼɑʀɑbɑ] ho ikaraba ('to answer oneself')

[hʊfɑ] ho fa ('to give') > [m̩pʰɔ] mpho ('gift'), [hʊm̩pʰɑ] ho mpha ('to give me'), and [huˌˈipʰɑ] ho ipha ('to give oneself')
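The nasal permutation correspondences listed above amount to a lookup on the first segment of the stem. The sketch below is illustrative only: `STRENGTHEN` and `nasalize_stem` are hypothetical names, stems are assumed to be plain IPA strings, consonants not listed in the text are simply left unchanged, and the adjective exception for /ɬ/ is not modelled.

```python
# Outcomes listed in the text for a consonant under a prefixed nasal.
STRENGTHEN = {
    "b": "pʼ",
    "l": "tʼ",
    "f": "pʰ",
    "ʀ": "tʰ",
    "s": "t͡sʰ",
    "ʃ": "t͡ʃʰ",
    "ɬ": "t͡ɬʰ",
}
VOWELS = set("iɪeɛɑɔoʊu")


def nasalize_stem(stem):
    """Apply nasal permutation to the first segment of an IPA stem."""
    first = stem[0]
    if first in VOWELS or first == "w":
        return "kʼ" + stem  # vowels and /w/ get /kʼ/ in front of them
    if first in STRENGTHEN:
        return STRENGTHEN[first] + stem[1:]
    return stem  # assumption: other consonants are unaffected


print(nasalize_stem("ɑʀɑbɑ"))  # stem of ho araba
print(nasalize_stem("fɑ"))     # stem of ho fa
```

The results correspond to the strengthened stems seen in [hʊŋ̩kʼɑʀɑbɑ] ho nkaraba and [hʊm̩pʰɑ] ho mpha; the choice of the preceding nasal is a separate step.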

Other changes may occur due to contractions in verb derivations:

[hʊbɔnɑ] ho bona ('to see') > [hʊbon̩t͡sʰɑ] ho bontsha ('to cause to see') (causative [bɔn] -bon- + [isɑ] -isa)

Nasal homogeneity consists of two points:

  1. When a consonant is preceded by a (visible or invisible) nasal it will undergo nasalization, if it supports it.
  2. When a nasal is immediately followed by another consonant with no vowel between them, the nasal will change to a nasal at approximately the same place of articulation as the following consonant, after the consonant has undergone nasal permutation. If the consonant is already a nasal then the preceding nasal will simply change to the same nasal.
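The second point can be sketched as a place-of-articulation lookup. The grouping below is an assumption inferred from forms cited in this article, such as [m̩pʰɔ], [n̩nɑ], [bon̩t͡sʰɑ], [ɲ̩t͡ʃʼɑ], [ŋ̩kʼɑ], and [ŋ̩ǃ]; the names `PLACE_RULES` and `homorganic_nasal` are hypothetical, and the input is assumed to have already undergone nasal permutation.

```python
SYLLABIC = "\u0329"  # combining mark that makes a consonant syllabic

# (initial segment, nasal) pairs; longer prefixes come first so that
# the postalveolar affricate is not mistaken for a plain alveolar t.
PLACE_RULES = [
    ("t\u0361ʃ", "ɲ"),  # postalveolar affricates
    ("ɲ", "ɲ"),
    ("p", "m"),
    ("m", "m"),
    ("t", "n"),         # alveolar stops and affricates
    ("n", "n"),
    ("k", "ŋ"),
    ("ŋ", "ŋ"),
    ("ǃ", "ŋ"),         # clicks take the velar nasal
]


def homorganic_nasal(following):
    """Return the syllabic nasal that matches the following consonant."""
    for prefix, nasal in PLACE_RULES:
        if following.startswith(prefix):
            return nasal + SYLLABIC
    raise ValueError("no rule for: " + following)


print(homorganic_nasal("pʰɔ") + "pʰɔ")  # compare mpho
print(homorganic_nasal("kʼɑ") + "kʼɑ")  # compare ho nka
```

A fuller treatment would classify every consonant by place rather than matching on initial characters.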

----

Palatalization is a process in certain Bantu languages where a consonant becomes a palatal consonant.

In Sesotho it usually occurs with the short form of passive verbs and the diminutives of nouns, adjectives, and relatives.

/pʼ/ > /pʃʼ/ / /t͡ʃʼ/

/pʰ/ > /pʃʰ/ / /t͡ʃʰ/

/b/ > /bʒ/ / /ʒ/

/f/ > /fʃ/ / /ʃ/

/tʼ/ > /t͡ʃʼ/

/tʰ/ > /t͡ʃʰ/

/l/ > /ʒ/

/n/, /m/, and /ŋ/ > /ɲ/
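The correspondences listed above can be captured in a lookup table that keeps both outcomes where two are possible. This is a minimal sketch; `PALATALIZE` and `palatalized_forms` are hypothetical names, and the table says nothing about which morphological context selects which outcome.

```python
# Consonant -> possible palatalized outcomes, as listed in the text.
PALATALIZE = {
    "pʼ": ["pʃʼ", "t͡ʃʼ"],
    "pʰ": ["pʃʰ", "t͡ʃʰ"],
    "b": ["bʒ", "ʒ"],
    "f": ["fʃ", "ʃ"],
    "tʼ": ["t͡ʃʼ"],
    "tʰ": ["t͡ʃʰ"],
    "l": ["ʒ"],
    "n": ["ɲ"],
    "m": ["ɲ"],
    "ŋ": ["ɲ"],
}


def palatalized_forms(consonant):
    """Return the listed outcomes, or the consonant itself if none apply."""
    return PALATALIZE.get(consonant, [consonant])


print(palatalized_forms("f"))  # both the compound and the simple fricative
print(palatalized_forms("l"))
```

Keeping the outcomes as lists preserves the alternation between the heterorganic compound and the simple palatal consonant.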

For example:

[hʊlɪfɑ] ho lefa ('to pay') > [hʊlɪfʃʷɑ] ho lefjwa / [hʊlɪʃʷɑ] ho leshwa ('to be paid')

----

Alveolarization is a process whereby a consonant becomes an alveolar consonant. It occurs in noun diminutives, the diminutives of colour adjectives, and in the pronouns and concords of noun classes with a [di] di- or [di] di[N]- prefix. This results in either /t͡sʼ/ or /t͡sʰ/.

Examples:

[xʷɑdi] -kgwadi ('black with white spots') > [xʷɑt͡sʼɑnɑ] -kgwatsana (diminutive)

[dikʼet͡sʼɔ t͡sʼɑhɑˈʊ] diketso tsa hao ('your actions')

Other changes may occur due to phonological interactions in verbal derivatives:

[hʊbʊt͡sʼɑ] ho botsa ('to ask') > [hʊbʊt͡sʼet͡sʼɑ] ho botsetsa ('to ask on behalf of') (applied [bʊt͡sʼ] -bots- + [ɛlɑ] -ela)

The alveolarization which changes Sesotho /l/ to /t͡sʼ/ is by far the most commonly applied phonetic process in the language. It is regularly applied in the formation of some class 8 and 10 concords and in numerous verbal derivatives.

----

Velarization in Sesotho is a process whereby certain sounds become velar consonants due to the intrusion of an approximant. It occurs with verb passives, noun diminutives, the diminutives of relatives, and the formation of some class 1 and 3 prefixes.

For example:

[hʊsɪɲɑ] ho senya ('to destroy') > [hʊsɪŋ̩ŋʷɑ] ho senngwa ('to be destroyed') (short passive [sɪɲ] -seny- + [wɑ] -wa)

Class 1 [mʊ] mo- + [ɑhɑ] -aha > [ŋʷɑhɑ] ngwaha ('year') (cf. Kiswahili mwaka; from Proto-Bantu *-jaka)

----

Elision of vowels occurs in Sesotho less often than in those Bantu languages which have vowel "pre-prefixes" before the noun class prefixes (such as isiZulu), but there are still instances where it regularly and actively occurs.

There are two primary types of regular vowel elision:

  1. The vowels /ɪ/, /ɛ/, and /ʊ/ may be removed from between two instances of /l/, thereby causing the first /l/ to become syllabic. This actively occurs with verbs, and has historically occurred with some nouns.
  2. When forming class 1 or 3 nouns from noun stems beginning with /b/ the middle /ʊ/ is removed and the /b/ is contracted into the /m/, resulting in [m̩m]. This actively occurs with nouns derived from verbs commencing with [b] and has historically occurred with many other nouns.
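The two elision rules can be sketched as string rewrites over IPA forms. The function names are hypothetical, the pre-elision input forms are assumed for illustration, and the sketch applies each rule unconditionally even though the text describes the first as something that "may" happen.

```python
import re

SYLLABIC = "\u0329"  # combining mark that makes a consonant syllabic


def elide_between_l(form):
    """Rule 1: drop ɪ, ɛ or ʊ between two l's; the first l turns syllabic."""
    return re.sub("l[ɪɛʊ]l", "l" + SYLLABIC + "l", form)


def contract_mo_b(form):
    """Rule 2: word-initial mʊ + b contracts to syllabic m followed by m."""
    return re.sub("^mʊb", "m" + SYLLABIC + "m", form)


print(elide_between_l("bɑlɛlɑ"))  # -bala + applied suffix -ela
print(contract_mo_b("mʊbɑdi"))    # assumed mo- + -badi
```

Both rewrites leave forms that do not match the pattern untouched, so they can be applied safely to any stem.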

For example:

[bɑlɑ] -bala ('read') > [bɑl̩lɑ] -balla (applied verb suffix [ɛlɑ] -ela) ('read for'), and [m̩mɑdi] mmadi ('person who reads')

----

Vowel raising is an uncommon form of vowel harmony where a non-open vowel (i.e. any vowel other than /ɑ/) is raised in position by a following vowel (in the same phonological word) at a higher position. The first variety, in which the open-mid vowels become close-mid, is commonly found in most Southern African Bantu languages (where the Proto-Bantu "mixed" vowels have separated). In the 9-vowel Sotho–Tswana languages, a much less common process also occurs where the near-close vowels become raised to a position slightly lower than the close vowels (closer to the English beat and boot than the very high Sesotho vowels i and u) without ATR (or, alternatively, with both [+ATR] and [+RTR]).


Mid vowel raising is a process where /ɛ/ becomes /e/ and /ɔ/ becomes /o/ under the influence of close vowels or consonants that contain "hidden" close vowels.

ho tsheha ('to laugh') [hʊt͡sʰɛhɑ] > ho tshehisa ('to cause to laugh') [hʊt͡sʰehisɑ]

ke a bona ('I see') [kʼɪˈɑbɔnɑ] > ke bone ('I saw') [kʼɪbonɪ]

ho kena ('to enter') [hʊkʼɛnɑ] > ho kenya ('to insert') [hʊkʼeɲɑ]

These changes are usually recursive to varying depths within the word, though, because this is a left-spreading rule, the spread is often bounded by the difficulty of "foreseeing" the raising syllable:

diphoofolo ('animals') [dipʰɔˈɔfɔlɔ] > diphoofolong ('by the animals') [dipʰɔˈɔfoloŋ̩]

Additionally, a right-spreading form occurs when a close-mid vowel is on the penultimate syllable (that is, the stressed syllable) and, due to some inflection or derivational process, is followed by an open-mid vowel. In this case the vowel on the final syllable is raised. This does not happen if the penultimate syllable is close (/i/ or /u/).

-besa ('roast') [besɑ] > subjunctive ke bese ('so I may roast...') [kʼɪbese]

but

-thola ('find') [tʰɔlɑ] > subjunctive ke thole ('so I may find...') [kʼɪtʰɔlɛ]

These vowels can occur phonemically, however, and may thus be considered to be separate phonemes:

maele ('wisdom') [mɑˈele]

ho retla ('to dismantle') [hʊʀet͡ɬʼɑ]


Close vowel raising is a process which occurs under much less common circumstances. Near-close /ɪ/ becomes [iˌ] and near-close /ʊ/ becomes [uˌ][15] when immediately followed by a syllable containing the close vowels /i/ or /u/. Unlike mid vowel raising, this process is not iterative and is only caused directly by the close vowels (it cannot be caused by any hidden vowels or by other raised vowels).

[hʊt͡sʰɪlɑ] ho tshela ('to pass over') > [hʊt͡sʰiˌdisɑ] ho tshedisa ('to comfort')

[hʊlʊmɑ] ho loma ('to itch') > [sɪluˌmi] selomi ('period pains')

Since these changes are allophonic, the Sotho–Tswana languages are rarely said to have 11 vowels.
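Because close vowel raising is not iterative, it can be sketched as a single pass that inspects only the original form of the following syllable. The function name `raise_close` and the pre-syllabified input are assumptions made for illustration.

```python
RAISED = {"ɪ": "iˌ", "ʊ": "uˌ"}  # near-close vowel -> raised allophone
CLOSE = ("i", "u")


def raise_close(syllables):
    """Raise ɪ and ʊ in a syllable directly before one containing i or u."""
    out = list(syllables)
    for idx in range(len(syllables) - 1):
        # Look at the ORIGINAL next syllable, so that a vowel raised by
        # this rule never triggers further raising (non-iterative).
        following = syllables[idx + 1]
        if any(vowel in following for vowel in CLOSE):
            for plain, raised in RAISED.items():
                out[idx] = out[idx].replace(plain, raised)
    return out


print("".join(raise_close(["sɪ", "lʊ", "mi"])))        # selomi
print("".join(raise_close(["hʊ", "fu", "mɑ", "nɑ"])))  # ho fumana
```

In the first example only the syllable immediately before [mi] is raised; the initial [sɪ] is unaffected because the raised vowel does not itself act as a trigger.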

----

Labialization is a modification of a consonant due to the action of a bilabial /w/ element which persists throughout the articulation of the consonant and is not merely a following semivowel. This labialization results in the consonant being pronounced with rounded lips[16] (but, in Sesotho, with no velarization) and with attenuated high frequencies (especially noticeable with fricatives and aspirated consonants).

It may be traced to an original /ʊ/ or /u/ being "absorbed" into the preceding consonant when the syllable is followed by another vowel. The consonant is labialized and the transition from the labialized syllable onset to the nucleus vowel sounds like a bilabial semivowel (or, alternatively, like a diphthong). Unlike in languages such as Chishona and Tshivenda, Sesotho labialization does not result in "whistling" of any consonants.

Almost all consonants may be labialized (indicated in the orthography by following the symbol with (w)), the exceptions being labial stops and fricatives (which become palatalized), the bilabial and palatal nasals (which become velarized), and the voiced alveolar [d] allophone of /l/ (which would become alveolarized instead). Additionally, syllabic nasals (where nasalization results in a labialized [ŋ̩kʼ] instead) and the syllabic /l/ (which is always followed by the non-syllabic /l/) are never directly labialized. Note that the unvoiced heterorganic doubled articulant fricative /fʃ/ only occurs labialized (only as [fʃʷ]).

Due to the inherent bilabial semivowel, labialized consonants never appear before back vowels:

[hʊlɑt͡sʼʷɑ] ho latswa ('to taste') > [tʼɑt͡sʼɔ] tatso ('flavour')

[hʊt͡sʼʷɑ] ho tswa ('to emerge') > [lɪt͡sʼɔ] letso ('a derivation')

[hʊnʷɑ] ho nwa ('to drink') > [sɪnɔ] seno ('a beverage')

[hʊˈɛlɛl̩lʷɑ] ho elellwa ('to realise') > [kʼɛlɛl̩lɔ] kelello ('the mind')
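The exceptions to labialization and the loss of labialization before back vowels can be sketched as two small functions. Both names are hypothetical, and the set of "back" vowels is an assumption: [ɑ] is excluded because forms such as [hʊnʷɑ] show labialized consonants before it.

```python
BACK_VOWELS = {"u", "ʊ", "o", "ɔ"}  # assumption: ɑ is excluded
LABIALIZED = "ʷ"


def labialization_outcome(consonant):
    """Name the process a consonant undergoes where labialization is expected."""
    if consonant in {"pʼ", "pʰ", "b", "f"}:
        return "palatalized"   # labial stops and fricatives
    if consonant in {"m", "ɲ"}:
        return "velarized"     # bilabial and palatal nasals
    if consonant == "d":
        return "alveolarized"  # the [d] allophone of /l/
    return "labialized"


def attach_vowel(onset, vowel):
    """Join an onset and a vowel, dropping labialization before a back vowel."""
    if onset.endswith(LABIALIZED) and vowel in BACK_VOWELS:
        onset = onset[: -len(LABIALIZED)]
    return onset + vowel


print(attach_vowel("nʷ", "ɑ"))  # as in ho nwa
print(attach_vowel("nʷ", "ɔ"))  # as in seno
```

The second function reproduces the alternation in the pairs above, where the labialized onset of the verb appears plain in the derived noun ending in a back vowel.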

Tonology

See main article: Sesotho tonology.

Sesotho is a tonal language spoken using two contrasting tones, low and high. Further investigation reveals, however, that in reality it is only the high tones that are explicitly specified on the syllables in the speaker's mental lexicon, and that low tones appear when a syllable is tonally under-specified. Unlike the tonal systems of languages such as Mandarin, where each syllable basically has an immutable tone, the tonal systems of the Niger–Congo languages are much more complex in that several "tonal rules" are used to manipulate the underlying high tones before the words may be spoken. These include special rules ("melodies") which, like grammatical or syntax rules that operate on words and morphemes, may change the tones of specific words depending on the meaning one wishes to convey.

Stress

The word stress system of Sesotho (often called "penultimate lengthening" instead, though there are certain situations where it doesn't fall on the penultimate syllable) is quite simple. Each complete Sesotho word has exactly one main stressed syllable.

Except for the second form of the first demonstrative pronoun, certain formations involving certain enclitics, polysyllabic ideophones, most compounds, and a handful of other words, there is only one main stress falling on the penult.

The stressed syllable is slightly longer and has a falling tone. Unlike in English, stress does not affect vowel quality or height.

This type of stress system occurs in most of those Eastern and Southern Bantu languages which have lost contrastive vowel length.

The second form of the first demonstrative pronoun has the stress on the final syllable. Some proclitics can leave the stress of the original word in place, causing the resultant word to have the stress at the antepenultimate syllable (or even earlier, if the enclitics are compounded). Ideophones, which tend to not obey the phonetic laws which the rest of the language abides by, may also have irregular stress.

There is even at least one minimal pair: the adverb fela ('only') pronounced as /[ˈfɛlɑ]/ has regular stress, while the conjunctive fela ('but') pronounced as /[fɛˈlɑ]/ (like many other conjunctives) has stress on the final syllable. This is certainly not enough evidence to justify making the claim that Sesotho is a stress accent language, though.
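The default rule and the lexical exception illustrated by fela can be sketched as follows. This is a minimal sketch, not a full model of Sesotho stress: the function names `stress_index` and `mark_stress` and the `final_stress` flag are hypothetical, words are assumed to be pre-split into syllables, monosyllables are assumed to carry stress on their only syllable, and the enclitic cases described above are not handled.

```python
from typing import List


def stress_index(syllables: List[str], final_stress: bool = False) -> int:
    """Return the index of the main stressed syllable.

    Default: the penult. `final_stress=True` models lexical exceptions
    such as the conjunctive fela ('but').
    """
    if not syllables:
        raise ValueError("a word needs at least one syllable")
    if final_stress or len(syllables) == 1:
        return len(syllables) - 1
    return len(syllables) - 2


def mark_stress(syllables: List[str], final_stress: bool = False) -> str:
    """Join the syllables, putting the IPA stress mark before the stressed one."""
    i = stress_index(syllables, final_stress)
    return "".join("ˈ" + s if k == i else s for k, s in enumerate(syllables))


print(mark_stress(["fɛ", "lɑ"]))                     # adverb fela ('only')
print(mark_stress(["fɛ", "lɑ"], final_stress=True))  # conjunctive fela ('but')
```

Treating final stress as a per-word flag reflects the fact that the exceptions are lexical rather than predictable from the shape of the word.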

Because the stress falls on the penultimate syllable, Sesotho, like other Bantu languages (and unlike many closely allied Niger–Congo languages), tends to avoid monosyllabic words and often employs certain prefixes and suffixes to make the word disyllabic (such as the syllabic nasal in front of class 9 nouns with monosyllabic stems, etc.).

Notes

  1. Other authors may choose to include the labialized consonants as contrastive phonemes, potentially increasing the number by 26 to 75. Labialization does create minimal pairs, as is exemplified by the short passive suffix, but different authors seem to be divided on whether or not these should be counted as authentic phonemes (especially since Sotho–Tswana-type labialization caused by vowel "absorption" is a fairly strange and rare process).

    Besides the passives, there are still numerous minimal pairs differing only in the labialization of a single consonant (note that each of the following pairs has similar tonal patterns):

    pronounced as /[ʀɑlɑ]/ -rala ('design'), versus pronounced as /[ʀʷɑlɑ]/ -rwala ('carry on the head')

    pronounced as /[lɑlɑ]/ -lala ('lie down' [old fashioned or poetic]), versus pronounced as /[lʷɑlɑ]/ -lwala ('be sick' [old fashioned])

    pronounced as /[mʊʀɑ]/ mora ('son'), versus pronounced as /[mʊʀʷɑ]/ morwa ('a Khoisan person')

    pronounced as /[hɑmɑ]/ -hama ('milk an animal'), versus pronounced as /[hʷɑmɑ]/ -hwama ('[of fat] congeal')

    pronounced as /[t͡sʰɑsɑ]/ -tshasa ('smear'), versus pronounced as /[t͡sʰʷɑsɑ]/ -tshwasa ('capture prey')

    pronounced as /[mʊɬɑ]/ mohla ('day'), versus pronounced as /[mʊɬʷɑ]/ mohlwa ('termite')

    Normal consonants and their labialized forms do not contrast before back vowels (that is, a labialized consonant will lose its labialization before a back vowel).

  2. The Sotho–Tswana ejective stops pronounced as //pʼ//, pronounced as //tʼ//, and pronounced as //kʼ// come from the Proto-Bantu *mb, *nd, and *ŋg due to the radical effects of the nasalization process. The Proto-Bantu stops *p, *t, and *k have usually become pronounced as //f//, pronounced as //r//, and pronounced as //x// (pronounced as //ʀ// and pronounced as //h// in modern Sesotho) with *kû becoming pronounced as /[fu]/, and the nasalized forms of these (Proto-Bantu *mp, *nt, and *ŋk) are the two aspirated stops pronounced as //pʰ// and pronounced as //tʰ//, and the aspirated velar affricate pronounced as //k͡xʰ// (pronounced as //x// in most Sesotho speaking communities).

    Note that some Sotho–Tswana languages do have prenasalized consonants, or at least have less strict and varied nasalization rules, but this is almost certainly as a result of influence from neighbouring non-Sotho–Tswana languages.

  3. Strictly speaking, pronounced as //t͡ɬʰ// should be an allophone of pronounced as //ɬ// found only when pronounced as //ɬ// is nasalized. However, possibly due to the mixed origins of Sesotho, there are several instances of pronounced as //t͡ɬʰ// appearing without nasalization (as is the case in Setswana) or of pronounced as //ɬ// failing to nasalize when the nasalizing consonant is not visible (such as when forming polysyllabic class 9 nouns).

    Thus one finds:

    pronounced as /[hʊɬɑhɑ]/ ho hlaha ('to emerge') > class 9 pronounced as /[t͡ɬʰɑhɔ]/ tlhaho ('nature')

    pronounced as /[hʊɬɔm̩pʰɑ]/ ho hlompha ('to respect') > class 9 pronounced as /[ɬɔm̩pʰɔ]/ hlompho ('respect')

    where the nasalization is applied in the first noun but not the second.

  4. A further collapse occurred in Silozi, which has lost the generally unusual distinction between plain and aspirated consonants. Thus Sesotho pronounced as //ɬ//, pronounced as //t͡ɬʼ//, pronounced as //t͡ɬʰ//, pronounced as //tʼ//, and pronounced as //tʰ// all map to the single Silozi phoneme pronounced as //t//.
  5. Urban varieties of Pedi are currently acquiring clicks as well.
  6. The IPA symbols used for the near-close vowels in this and related articles are different from those that are often used in the literature. Often, the symbols pronounced as //ɨ// and pronounced as //ʉ// are used instead of the standard pronounced as //ɪ// and pronounced as //ʊ//, but they represent the close central unrounded vowel and the close central rounded vowel, respectively, in modern IPA.
  7. There are many historical instances in Sesotho which show an occasional confusion between the phonemes pronounced as //j//, pronounced as //ɦ//, and (no consonant). For example, the verb pronounced as /[ɑhɑ]/ -aha ('build') often appears as pronounced as /[hɑhɑ]/ -haha (cf. Silozi -yaha), though comparison with other languages (Setswana -aga, Nguni -akha, etc.) reveals its true form.

    Other examples include the changing of the original verbal focus marker *-ya- to pronounced as /[ɑ]/ -a-; the second person singular objectival concord (pronounced as /[ʊ]/ -o-, but Setswana -go- and Nguni -ku-); the verb pronounced as /[lɑjɑ]/ -laya ('to correct'; its Proto-Bantu form *-dag- should have given pronounced as /[lɑˈɑ]/ -laa, which does occur as a variant); verbs which end in the form pronounced as /[ijɑ]/ -iya (e.g. pronounced as /[sijɑ]/ -siya 'leave behind', pronounced as /[dijɑ]/ -diya 'cause to fall', etc.) being alternatively rendered as pronounced as /[iˈɑ]/ -ia; pronounced as /[lɪˈɪ]/ lee ('egg'; Proto-Bantu *di-gi) often appearing as pronounced as /[lɪhɪ]/ lehe; etc. It should also be noted that many verbal derivatives treat verbs ending with pronounced as /[jɑ]/ -ya as if they end with pronounced as /[ɑ]/ -a (that is, the suffix replaces the entire pronounced as /[jɑ]/ -ya, not just the final pronounced as /[ɑ]/ -a).

  8. In Setswana and most Northern Sotho languages these are two different phonemes. The Setswana velar fricative corresponds to the Sesotho glottal fricative, and the velar affricate corresponds to the Sesotho velar fricative/affricate, but before the close vowel pronounced as //u// u Setswana regularly uses the unvoiced glottal fricative.
  9. For completeness, this table uses a narrower (more detailed) transcription of clicks than usual in Bantu languages, but the rest of this article and other articles in the series use the less detailed system of click transcription. See the full consonant table above to see the usual transcriptions.
  10. Historically, in various Bantu languages, this has resulted in palatalization (giving the postalveolar and palatal consonants) and the alveolar fricative pronounced as //s//.
  11. This is not to say that the glottal stop is part of the phoneme inventory of Sesotho, nor is it correct to say that the language has diphthongs or triphthongs (or even longer: pronounced as /[hɑˈʊˈɑˈiˌˈut͡ɬʼʷɑ]/ ha o a e utlwa 'you did not hear it'). Sequences of vowels may be pronounced with hiatus (thus they are not diphthongs), but in fast speech they may simply flow into each other (thus the glottal stop is not a contrastive phoneme).
  12. Historically pronounced as //x// (kg) was an affricate pronounced as /[k͡xʰ]/ (this still appears as a variation) and was therefore not an exception.

    Some individuals nasalize pronounced as //x// and pronounced as //h// to pronounced as //kʰ// (possibly by analogy with the Setswana hu nasalizing to khu) and sometimes even pronounced as //kʼ// (perhaps due to the unstable nature of the voiced pronounced as /[ɦ]/, which is barely audible and may cause the syllable to sound as if it does not have an onset). Though this is certainly not to be considered standard, it is an understandable reaction to the frication ("weakening") of the affricate pronounced as /[k͡xʰ]/.

  13. Strangely, there are no polysyllabic verbs beginning with pronounced as //j//. The verb -ya pronounced as /[jɑ]/ cannot be used with an objectival concord (it may have an intransitive, locative, or instrumental import and an idiomatic passive, but is not transitive) and the approximant is removed in verbal derivations. There are also no adjectives beginning with pronounced as //j// or any other parts of speech which may be nasalized, so there are no instances of pronounced as //j// being nasalized.

    Note that if a pronounced as //j// were to nasalize by getting a pronounced as //kʼ// in front of it, the phonotactic restrictions and phonetic rules of the language would not allow the combination *pronounced as //kʼj//. In Silozi, which has many verbs with word-initial pronounced as //j// (many of which correspond to Sesotho vowel verbs), nasalization of pronounced as //j// results in pronounced as //t͡ʃ//, which has collapsed from original Sotho–Tswana pronounced as //ʒ//, pronounced as //t͡ʃʼ//, and pronounced as //t͡ʃʰ//. Since nasalization removes voicing and frication (and Sesotho palatalization preserves aspiration), one may then deduce that if Sesotho pronounced as //j// were to nasalize it would most probably become pronounced as //t͡ʃʼ// tj.

  14. This second change is very strange and does not occur in most other major Sotho–Tswana languages.
  15. The symbols used in this and related articles for the raised allophones of the near-close vowels are non-standard, though there are no real standard alternatives.

    The difficulty lies in acknowledging the role of ATR in this process. In the past, when they were recognised at all, they were often viewed as simply an extra vowel height, and the choice of symbols differed between authors since standard IPA does not recognise the possibility of so many contrastive close vowel heights.

  16. In Sesotho, when a consonant is followed by a vowel, the shape of the lips is changed to resemble the shape of the vowel while the consonant is being pronounced (or even before, when the syllable is the first after a pause) with the shaping being more severe the higher the vowel height. Thus, when a consonant is followed by a back vowel the lips are rounded when pronouncing the consonant, and the lips are spread when pronouncing a consonant followed by a front vowel. Labialization may be explained by saying that, for some reason, the lips are rounded in anticipation of a back vowel that is never pronounced.

    This also explains why labialization disappears before back vowels. Since the lips will already be rounded anyway in anticipation of the following vowel, there is no way to distinguish between a labialized consonant before a back vowel and a normal consonant before a back vowel (this is similar to the situation in English, where the sound written as (wh) is pronounced as //h// in words such as whom, whole, and whore).

    Note that it is also possible for labialization to simply disappear, even if any other modification of the consonant caused as a side-effect of labialization remains. One example is the tentative evolution of modern Sesotho pronounced as /[ɲ̩t͡ʃʼɑ]/ ntja ('dog') from Proto-Bantu *N-bua:

    Proto-Bantu *N-bua > (nasal homogeneity) *pronounced as /m̩bua/ > (labialization) *pronounced as /m̩bʷa/ > (palatalization) *pronounced as /m̩pʃʷa/ > (loss of labialization + gaining of ejective quality) *pronounced as /m̩pʃʼa/ (as found in Northern Sotho) > (heterorganic simplification + nasal homogeneity) modern pronounced as /[ɲ̩t͡ʃʼɑ]/
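The loss of labialization before back vowels described in notes 1 and 16 can be sketched as a simple string rewrite over IPA transcriptions. This is a rough illustration under stated assumptions: the function name `neutralize_labialization` is hypothetical, "back vowels" is taken here to mean the rounded vowels u, ʊ, o and ɔ (ɑ is excluded because the minimal pairs in note 1 show labialized consonants before it), and the second input form is invented purely to exercise the rule.

```python
import re

# Assumption: the back vowels that trigger the rule are u, ʊ, o and ɔ.
BACK_VOWELS = "uʊoɔ"
LABIALIZATION = "ʷ"


def neutralize_labialization(ipa: str) -> str:
    """Delete the labialization mark wherever a back vowel directly follows it."""
    return re.sub(f"{LABIALIZATION}(?=[{BACK_VOWELS}])", "", ipa)


print(neutralize_labialization("ʀʷɑlɑ"))  # -rwala: ɑ follows, so labialization stays
print(neutralize_labialization("ʀʷʊlɑ"))  # invented form: labialization is dropped
```

The lookahead leaves the vowel itself in place, so only the labialization mark is removed.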

References

Doke, Clement Martyn; Mofokeng, S. Machabe (1974). Textbook of Southern Sotho Grammar (3rd ed.). Cape Town: Longman Southern Africa. ISBN 0-582-61700-6.