Subject–object–verb word order explained

In linguistic typology, a subject–object–verb (SOV) language is one in which the subject, object, and verb of a sentence always or usually appear in that order. If English were SOV, "Sam oranges ate" would be an ordinary sentence, as opposed to the actual Standard English "Sam ate oranges" which is subject–verb–object (SVO).

The term is often loosely used for ergative languages like Adyghe and Basque that really have agents instead of subjects.
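The hypothetical English example above can be sketched in a few lines of Python. This is only an illustration: the `linearize` helper and the single-letter role labels are assumptions made for the sketch, not a standard linguistic tool.

```python
# Constituents of the example clause, keyed by grammatical role:
# S = subject, V = verb, O = object.
CLAUSE = {"S": "Sam", "V": "ate", "O": "oranges"}


def linearize(clause, order):
    """Arrange the constituents of `clause` in the given order, e.g. 'SOV'."""
    return " ".join(clause[role] for role in order)


print(linearize(CLAUSE, "SVO"))  # Sam ate oranges
print(linearize(CLAUSE, "SOV"))  # Sam oranges ate
```

The same three constituents appear in both outputs; only their linear order differs, which is all that the SVO and SOV labels describe.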

Incidence

Among natural languages with a word order preference, SOV is the most common type (followed by subject–verb–object; the two types account for more than 87% of natural languages with a preferred order).[1]

Languages that have SOV structure include

Standard Chinese is generally SVO but common constructions with verbal complements require SOV or OSV. Some Romance languages are SVO, but when the object is an enclitic pronoun, word order allows for SOV (see the examples below). German and Dutch are considered SVO in conventional typology and SOV in generative grammar. They can be considered SOV but with V2 word order as an overriding rule for the finite verb in main clauses, which results in SVO in some cases and SOV in others. For example, in German, a basic sentence such as "Ich sage etwas über Karl" ("I say something about Karl") is in SVO word order. Non-finite verbs are placed at the end, however, since V2 only applies to the finite verb: "Ich will etwas über Karl sagen" ("I want to say something about Karl"). In a subordinate clause, the finite verb is not affected by V2, and also appears at the end of the sentence, resulting in full SOV order: "Ich sage, dass Karl einen Gürtel gekauft hat" (word for word: "I say that Karl a belt bought has").
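The interaction between verb-final order and the V2 rule described above can be sketched as follows. This is a simplified illustration, not a grammar of German: the function names are hypothetical, and the German word forms are supplied here as an assumption, chosen to match the English translations quoted above.

```python
def main_clause(subject, obj, verbs):
    """Verb-final base order plus V2.

    `verbs` lists the verb cluster in clause-final order, finite verb last.
    The finite verb moves to second position; non-finite verbs stay final.
    """
    *nonfinite, finite = verbs
    return " ".join([subject, finite, obj, *nonfinite])


def subordinate_clause(subject, obj, verbs):
    """V2 does not apply: the whole verb cluster stays at the end (SOV)."""
    return " ".join([subject, obj, *verbs])


# Simple verb: the result looks like SVO.
print(main_clause("ich", "etwas über Karl", ["sage"]))
# Finite verb plus infinitive: only the finite verb moves.
print(main_clause("ich", "etwas über Karl", ["sagen", "will"]))
# Subordinate clause: full SOV order.
print(subordinate_clause("Karl", "einen Gürtel", ["gekauft", "hat"]))
```

In this sketch the only difference between the two clause types is whether the finite verb is pulled out of the clause-final cluster.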

A rare example of SOV word order in English is "I (subject) thee (object) wed (verb)" in the wedding vow "With this ring, I thee wed."[2]

Properties

SOV languages have a strong tendency to use postpositions rather than prepositions, to place auxiliary verbs after the action verb, to place genitive noun phrases before the possessed noun, to place a name before a title or honorific ("James Uncle" and "Johnson Doctor" rather than "Uncle James" and "Doctor Johnson") and to have subordinators appear at the end of subordinate clauses. They have a weaker but significant tendency to place demonstrative adjectives before the nouns they modify. Relative clauses preceding the nouns to which they refer usually signal SOV word order, but the reverse does not hold: SOV languages feature prenominal and postnominal relative clauses roughly equally. SOV languages also seem to exhibit a tendency towards using a time–manner–place ordering of adpositional phrases.

In linguistic typology, one can usefully distinguish two types of SOV languages in terms of their type of marking:

  1. The dependent-marking type has case markers to distinguish the subject and the object, which allows it to use the variant OSV word order without ambiguity. This type usually places adjectives and numerals before the nouns they modify, and is exclusively suffixing without prefixes. SOV languages of this first type include Japanese and Tamil.
  2. The head-marking type distinguishes subject and object by affixes on the verb rather than markers on the nouns. It also differs from the dependent-marking type in using prefixes as well as suffixes, usually for tense and possession. Adjectives in this type are much more verb-like than in dependent-marking SOV languages, and hence they usually follow the nouns. In most SOV languages with a significant level of head-marking or verb-like adjectives, numerals and related quantifiers (like "all", "every") also follow the nouns they modify. Languages of this type include Navajo and Seri.

In practice, of course, the distinction between these two types is far from sharp. Many SOV languages are substantially double-marking and tend to exhibit properties intermediate between the two idealised types above.

Many languages that have shifted from earlier SOV to SVO word order retain these properties, at least to some extent. Finnish, for example, still makes heavy use of postpositions.

Examples

Afroasiatic languages

The Ethio-Semitic, Cushitic and Omotic languages generally exhibit SOV order.

Somali

Somali generally uses the subject–object–verb structure when speaking formally.

Tigrinya

Basque

In short sentences, Basque usually places the subject or agent first, then the object, then the verb; in long sentences, the usual order is subject or agent, then verb, then objects.

Dravidian languages

The Dravidian languages commonly exhibit or prefer SOV order.

Malayalam

Tamil

Because Tamil is a strongly head-final language, its basic word order is SOV. However, since it is highly inflected, word order is flexible and is used for pragmatic purposes. That is, fronting a word in a sentence adds emphasis to it; for instance, a VSO order would indicate greater emphasis on the verb, the action, than on the subject or the object. However, such word orders are highly marked, and the basic order remains SOV.

Telugu

Georgian

The Georgian language is not extremely rigid with regard to word order, but is typically either SOV or SVO.

Indo-European languages

SOV word order is quite common among Indo-European languages, leading to a common hypothesis that this reflects the original preferred word order of the ancestral Proto-Indo-European language. However, the question remains unsettled.

Albanian

Albanian has free word order, but generally prefers SVO. SOV occurs only in poetic language.

Armenian

Armenian generally prefers SOV.

Germanic languages

Linguistic consensus holds that the Proto-Germanic language had free word order but preferred SOV. While some Germanic languages (including English and most North Germanic languages) have transitioned to SVO, SOV remains a feature of some major modern Germanic languages, including German and Dutch. However, these modern SOV Germanic languages also exhibit V2 word order, which supersedes the "default" SOV such that many sentences are rendered subject-verb-object.

Dutch

Dutch is SOV combined with V2 word order. The non-finite verb (infinitive or participle) remains in final position, but the finite (i.e. inflected) verb is moved to the second position. Sentences with a simple verb look like SVO, while those with non-finite verbs (participles, infinitives) and compound verbs follow this pattern:

Pure SOV order is found in subordinate clauses:

German

German is SOV combined with V2 word order. The non-finite verb (infinitive or participle) remains in final position, but the finite (i.e. inflected) verb is moved to the second position. Sentences with a simple verb look like SVO, while those with compound verbs follow this pattern:

The word order changes also depending on whether the phrase is a main clause or a dependent clause. In dependent clauses, the word order is always entirely SOV (cf. also Inversion):

Gothic

The Gothic language, an extinct East Germanic language, had free word order, but SOV constructions were common.

Greek (Classical)

Ancient Greek had free word order but generally preferred SOV sentences:

This is distinct from Modern Greek, where SVO is preferred.

Indo-Aryan languages

Vedic Sanskrit, the oldest known of the Indo-Aryan languages, was an inflected language and very flexible in word order, allowing all possible word combinations. Its descendant, Classical Sanskrit, shared this feature but generally preferred SOV sentences.

Most later Indo-Aryan languages continue to prefer SOV word order, for example:

Bengali

Hajong

're' is a particle that indicates the accusative case, and 'sei' indicates the past tense declarative. Here, 'e' is pronounced like the 'i' in 'girl' and 'ei' is pronounced like the 'ay' in 'say'.

Hindi

Marathi

Nepali

Odia

Urdu

This preference is not fixed in all Indo-Aryan languages. Punjabi, for instance, may be characterised as following a subject–object–verb typology overall, but some flexibility is permitted, and this tendency does not follow in sentences involving personal pronouns. Examples are shown here in both Shahmukhi (top, right-to-left) and Gurmukhi (bottom, left-to-right). The word forms used reflect those typical of spoken language. For Shahmukhi, vocalised forms with vowel diacritics have been used to explicitly indicate the forms used; in typical writing these are omitted in most words where regular patterns allow this information to be inferred contextually.

The following sentence exhibits the typical SOV word order tendency. The verb phrase is in retrospective perfect participle form, indicating completion of the action, and takes on the feminine plural suffixes in agreement with the gender and number of the object. The subject here is a masculine plural form; in this context it does not require agreement from the verb.

By contrast, in the following sentence the person involved, referred to by a first-person pronoun, is the object rather than the subject. The significance of people as a semantic category takes precedence over the SOV word order tendency, and the person typically comes first even in sentences where that person is the object. The pronoun "mainū̃" has the postposition "nū̃" agglutinated to it, approximately meaning "to." Abstract concepts like desires and emotions typically come "to" people as agentive subjects.

The copula in Punjabi is extraverbal in function. While it can constitute the predicate of a sentence on its own, it does not enter the verb phrase when used alongside a full lexical verb. Instead, it acts as a marker of existence remote to or near to the situation. Some western dialects such as Pothohari have forms of the copula to indicate occurrence of a situation in the future.

However, some Indo-Aryan languages exhibit V2 word order in combination with SOV, most prominently Kashmiri. The non-finite verb (infinitive or participle) remains in final position, but the finite (i.e. inflected) part of the verb appears in second position. Simple verbs look like SVO, whereas auxiliated verbs are discontinuous and adhere to this pattern:

Given that Kashmiri is a V2 language, if the word tsũũţh 'apple' comes first then the subject kuur 'girl' must follow the auxiliary chhi 'is': tsũũţh chhi kuur khyevaan [Lit. "Apples is girl eating."]
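The fronting behaviour just described can be sketched as a small function. This is an illustration under simplifying assumptions: the name `v2_main_clause` and its parameters are hypothetical, the Kashmiri words are the transliterations quoted above, and the subject-first output is produced by the rule rather than quoted from the text.

```python
# Kashmiri words as transliterated in the text above.
SUBJECT = "kuur"         # 'girl'
OBJECT = "tsũũţh"        # 'apple'
AUXILIARY = "chhi"       # finite auxiliary 'is'
PARTICIPLE = "khyevaan"  # non-finite 'eating'


def v2_main_clause(first, others, finite, nonfinite):
    """Build a V2 main clause.

    `first` fills position one, the finite verb fills position two,
    the remaining constituents follow, and the non-finite verb is last.
    """
    return " ".join([first, finite, *others, nonfinite])


# Object fronted: the subject must follow the auxiliary.
print(v2_main_clause(OBJECT, [SUBJECT], AUXILIARY, PARTICIPLE))
# Subject first: the same rule gives an order that looks like S-Aux-O-V.
print(v2_main_clause(SUBJECT, [OBJECT], AUXILIARY, PARTICIPLE))
```

Whichever constituent is chosen for first position, the finite auxiliary stays in second position and the participle stays at the end.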

Also, the word order changes depending on whether the phrase is in a main clause or in certain kinds of dependent clause. For instance, in relative clauses, the word order is SOVAux:

Main clause + subordinate clause: میے ان سوہ کور یوس ثونٹہ کہیوان چہے
Transcription: mye eny swa kuur / ywas tsũũţh khyevaan chhi
Gloss: I brought that girl / who apples eating is
Parts: main clause (subject, verb, object) / relative clause (subject, object, verb, auxiliary)
Translation: I brought the girl who is eating apples.

Iranian languages

The Iranian languages almost uniformly exhibit SOV word order:

Kurdish (Kurmanji):

Kurdish (Sorani):

Ossetian:

Pashto:

Persian:

Talysh:

The Zaza language usually uses a subject–object–verb structure,[3] but it sometimes uses subject–verb–object too.

Italic languages

Latin

See main article: Latin word order. Classical Latin was an inflected language and had a very flexible word order and sentence structure, but the most usual word order in formal prose was SOV.

Again, there are multiple valid translations (such as "a slave") that do not affect the overall analysis.

Romance languages

Although their common ancestor Latin had free word order and preferred SOV, the modern Romance languages lost the Latin declension that enabled free word order and in general require subject-verb-object structures. However, remnants of SOV remain, particularly the clitic object pronouns common in Romance grammar. For instance, in French:

And Portuguese:

And in Spanish:

Contrast this with the SVO structure of a sentence with an explicit object (again in Spanish):

The SOV tendency can also be seen when using auxiliary verbs, e.g. in Italian:

SOV also appears in Portuguese using a temporal adverb, optionally with the negative:

And in a suffix construction for the future and conditional tenses:


Japanese

The basic principle in Japanese word order is that modifiers come before what they modify. For example, in the sentence "こんな夢を見た。" (Konna yume o mita),[4] the direct object "こんな夢" (this sort of dream) modifies the verb "見た" (saw, or in this case had). Beyond this, the order of the elements in a sentence is relatively free. However, because the topic/subject is typically found in sentence-initial position and the verb is typically in sentence-final position, Japanese is considered an SOV language.[5]

A closely related quality of the language is that it is broadly head-final.[6]

Korean

-ga/-i is a particle that indicates the subject. -(r)eul is a particle that indicates the object. Na "I" changes to nae- before -ga, and the verb stem yeol- changes to yeo- before -nda (ㄴ다).

Quechua

Quechuan languages have standard SOV word order. The following example is from Bolivian Quechua.

Sino-Tibetan languages

SOV is believed to have been the "default" order of the protolanguage of the Sino-Tibetan family. Most Sino-Tibetan languages exhibit SOV order; however, the largest sub-branch of the family, the Sinitic or Chinese languages, is uniformly SVO, with some SOV-derived features.

Burmese

Burmese is an analytic language.

Chinese

Generally, Chinese varieties all feature SVO word order. However, especially in Standard Mandarin, SOV is tolerated as well. There is even a special particle 把 (bǎ) used to form an SOV sentence.[7]

The following example that uses 把 is controversially labelled as SOV. 把 may be interpreted as a verb, meaning "to hold". However, it does not mean to hold something literally or physically. Rather, the object is held figuratively, and then another verb acts on the object.

SOV structure is widely used in railway communications in order to make the objective of a command clear.

Yi

Tungusic languages

The Tungusic languages exhibit SOV word order by default.

Manchu

Turkic languages

The Turkic languages all exhibit flexibility in word order, so any order is possible. However, SOV is the "default" order, which does not connote particular emphasis on any part of the sentence; the alternative orders are used for emphasis. For instance, in Turkish, the following is the "default" way of saying "Murat ate the apple": Murat elmayı yedi.

However, this sentence could also be constructed as OSV (Elmayı Murat yedi.), OVS (Elmayı yedi Murat.), VSO (Yedi Murat elmayı.), VOS (Yedi elmayı Murat.), or SVO (Murat yedi elmayı.), to indicate the relative importance of the subject, object, or the verb.
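The six orders listed above are simply the permutations of the three constituents, which a short sketch can enumerate. The `all_orders` helper and the role labels are assumptions made for illustration; the Turkish words are those given in the text.

```python
from itertools import permutations

# Constituents of "Murat ate the apple", keyed by role.
WORDS = {"S": "Murat", "O": "elmayı", "V": "yedi"}


def all_orders(words):
    """Map each order label (e.g. 'SOV') to the sentence it produces."""
    sentences = {}
    for order in permutations("SOV"):
        text = " ".join(words[role] for role in order)
        # Capitalise only the first letter so that "Murat" keeps its capital.
        sentences["".join(order)] = text[0].upper() + text[1:] + "."
    return sentences


for label, sentence in all_orders(WORDS).items():
    print(label, sentence)
```

The sketch only generates the strings; which order a speaker actually chooses depends on the emphasis intended, as described above.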

Similarly, in Uzbek this SOV sentence is neutral: Anvar Xivaga ketdi ("Anvar went to Khiva").

(The marker "ga" is a dative case marker for the object that precedes it.)

But the sentence can be changed into OSV as well ("Xivaga Anvar ketdi") to change the emphasis ("It was Anvar who went to Khiva").

The same holds in Kazakh, where the below is neutral:

But an OSV sentence (Кітапті Дастан оқыды) can be used to change the emphasis.

Other examples of SOV sentences in Turkic:

Azerbaijani

Kyrgyz

Uralic languages

The "idealized" profile of the Uralic languages has subject–verb–object word order. However, some Uralic languages, including the most widely spoken (Hungarian), prefer SOV.

The protolanguage of the Uralic language family is understood to have exhibited SOV order.[8] [9]

Hungarian

Hungarian word order is free, although the meaning changes slightly with the order. Almost all permutations of the following sample are valid, but each stresses a different part of the meaning.

Udmurt

Zarma


Notes and References

  1. Crystal, David (1997). The Cambridge Encyclopedia of Language (2nd ed.). Cambridge: Cambridge University Press. ISBN 0-521-55967-7.
  2. Fischer, Andreas (1997). "'With this ring I thee wed': The verbs to wed and to marry in the history of English". In Raymond Hickey and Stanislaw Puppel (eds.), Language History and Linguistic Modelling: A Festschrift for Jacek Fisiak on his 60th Birthday. Trends in Linguistics, Studies and Monographs 101. Berlin, New York: Mouton de Gruyter. pp. 467–81.
  3. Ahmadi, S. (December 2020). "Building a Corpus for the Zaza–Gorani Language Family". In Proceedings of the 7th Workshop on NLP for Similar Languages, Varieties and Dialects. pp. 70–78.
  4. Sōseki, Natsume (July 26, 1988) [first published July 25, 1908]. 夢十夜 [Ten Nights of Dreams] (in Japanese). ISBN 4-480-02170-1. Aozora Bunko.
  5. Makino, Seiichi; Tsutsui, Michio (March 1999) [first published March 1986]. A Dictionary of Basic Japanese Grammar. The Japan Times, Ltd. p. 16. ISBN 4-7890-0454-6.
  6. Siegel, Melanie; Bender, Emily M. (2004). "Head-Initial Constructions in Japanese". In Stefan Müller (ed.), Proceedings of the 11th International Conference on Head-Driven Phrase Structure Grammar, Center for Computational Linguistics, Katholieke Universiteit Leuven. CSLI Publications. pp. 244–260.
  7. "Understanding 把 (bǎ) in ten minutes". ChineseBoost.com. 28 February 2015. Archived at https://web.archive.org/web/20220121201738/https://www.chineseboost.com/grammar/ba-ten-minutes/ on 2022-01-21.
  8. Bakró-Nagy, Marianne; Laakso, Johanna; Skribnik, Elena K. (2022). The Oxford guide to the Uralic languages. Oxford guides to the world's languages. Oxford: Oxford University Press. ISBN 978-0-19-876766-4. Quote: "As regards constituent order, Proto-Uralic was most obviously an SOV language with postpositions."
  9. Janhunen, Juha (1982). "On the structure of Proto-Uralic". Finno-Ugrische Forschungen 44. pp. 23–42. Cited in Katalin É. Kiss (2023), "The (non-)finiteness of subordination correlates with basic word order: Evidence from Uralic".