A language isolate is a language that has no demonstrable genetic relationship with any other language.[1] Basque in Europe, Ainu in Asia, Sandawe in Africa, Haida and Zuni in North America, Kanoê in South America, Tiwi in Australia and Burushaski in Pakistan are all examples of language isolates. The exact number of language isolates is not yet known because data on several languages are insufficient.[2]
One explanation for the existence of language isolates is that they might be the last remaining branch of a larger language family. The language possibly had relatives in the past that have since disappeared without being documented. Another explanation for language isolates is that they developed in isolation from other languages. This explanation mostly applies to sign languages that have arisen independently of other spoken or signed languages.[3]
Some languages once seen as isolates may be reclassified as small families if some of their dialects are judged to be sufficiently different from the standard to be seen as different languages. Examples include Japanese and Georgian: Japanese is now part of the Japonic language family with the Ryukyuan languages, and Georgian is the main language in the Kartvelian language family. Language isolates differ from unclassified languages, but the two can be difficult to tell apart when classifying extinct languages. If efforts to relate an isolate to other languages eventually prove fruitful, it may no longer be considered an isolate, as happened with the Yanyuwa language of northern Australia, which has been placed in the Pama–Nyungan family.[4] Since linguists do not always agree on whether a genetic relationship has been demonstrated, it is often disputed whether a language is an isolate.
A genetic relationship exists when two different languages are descended from a common ancestral language.[5] Such languages make up a language family, which is a set of languages for which sufficient evidence exists to demonstrate that they descend from a single ancestral language and are therefore genetically related. For example, English is related to other Indo-European languages and Mandarin Chinese is related to other Sino-Tibetan languages. By this criterion, each language isolate constitutes a family of its own.
In some situations, a language with no ancestor can arise. This frequently happens with sign languages—most famously in the case of Nicaraguan Sign Language, where deaf children with no language were placed together and developed a new language.[6]
Caution is required when speaking of extinct languages as language isolates. Despite their great age, Sumerian and Elamite can be safely classified as isolates, as the languages are well enough documented that, if modern relatives existed, they would be recognizably related.[7] A language thought to be an isolate may turn out to be related to other languages once enough material is recovered, but this is unlikely for extinct languages whose written records have not been preserved.
Many extinct languages are very poorly attested, which may lead to them being considered unclassified languages instead of language isolates. This occurs when linguists do not have enough information on a language to classify it as either a language isolate or as a part of another language family.
Unclassified languages differ from language isolates in that their lack of demonstrable genetic relationships to other languages is due to insufficient data. To be considered a language isolate, a language must be documented well enough that comparison with other languages, using the methods of historical-comparative linguistics, shows that it has no demonstrable genetic relationships.
Many languages, both extinct and living, are very poorly attested, and the fact that they cannot be linked to other languages may simply reflect how little is known about them. Hattic, Gutian, and Kassite are all considered unclassified languages, but their status is disputed by a minority of linguists.[8] Many extinct languages of the Americas such as Cayuse and Majena may likewise have been isolates.[9] Several unclassified languages could also be language isolates, but linguists cannot be sure of this without sufficient evidence.
A number of sign languages have arisen independently, without any ancestral language, and thus are language isolates. The most famous of these is Nicaraguan Sign Language, a well-documented case of what has happened in schools for the deaf in many countries.[6] In Tanzania, for example, there are seven schools for the deaf, each with its own sign language with no known connection to any other language.[10] Sign languages have also developed outside schools, in communities with high incidences of deafness, such as Kata Kolok in Bali, and half a dozen sign languages of the hill tribes in Thailand, including Ban Khor Sign Language.[11] [12]
These and others are all presumed to be isolates or small local families, because many deaf communities are made up of people whose hearing parents do not use sign language, and because the languages themselves show that they have not been borrowed from other deaf communities during their recorded history.
Some languages once seen as isolates may be reclassified as small families because their genetic relationship to other languages has been established. This happened with Japanese and the Ryukyuan languages, Korean and the Koreanic languages, Atakapa and Akokisa, Tol and Jicaque of El Palmar, and the Xincan language family of Guatemala, in which linguists have grouped the Chiquimulilla, Guazacapán, Jumaytepeque, and Yupiltepeque languages.
Below is a list of known language isolates, arranged by continent, along with notes on possible relations to other languages or language families.
The status column indicates the degree of endangerment of the language, according to the definitions of the UNESCO Atlas of the World's Languages in Danger.[13] "Vibrant" languages are those in full use by speakers of every generation, with consistent native acquisition by children. "Vulnerable" languages have a similarly wide base of native speakers, but a restricted use and the long-term risk of language shift. "Endangered" languages are either acquired irregularly or spoken only by older generations. "Moribund" languages have only a few remaining native speakers, with no new acquisition, highly restricted use, and near-universal multilingualism. "Extinct" languages have no native speakers, but are sufficiently documented to be classified as isolates.
With few exceptions, all of Africa's languages have been gathered into four major phyla: Afroasiatic, Niger–Congo, Nilo-Saharan and Khoisan.[14] However, the genetic unity of some language families, like Nilo-Saharan,[15] is questionable, and so there may be many more language families and isolates than currently accepted. Data for several African languages, like Kwisi, are not sufficient for classification. In addition, Jalaa, Shabo, Laal, Kujargé, and a few other languages within Nilo-Saharan and Afroasiatic-speaking areas may turn out to be isolates upon further investigation. Defaka and Ega are highly divergent languages located within Niger–Congo-speaking areas, and may also be language isolates.[16]
Language | Speakers | Status | Countries | Comments
---|---|---|---|---
Bangime | 2,000 | Vibrant | Mali | Spoken in the Bandiagara Escarpment. Used as an anti-language.[17]
Hadza | 1,000 | Vulnerable | Tanzania | Spoken on the southern shore of Lake Eyasi in the southwest of Arusha Region. Once listed as an outlier among the Khoisan languages.[18] Language use is vigorous, though there are fewer than 1,000 speakers.[19]
Jalaa | | Extinct | Nigeria | Strongly influenced by Dikaka, but most vocabulary is very unusual.[20]
Laal | 750 | Moribund | Chad | Spoken in three villages along the Chari River in Moyen-Chari Region. Poorly known. Also known as Gori. Possibly a distinct branch of Niger–Congo, Chadic of the Afroasiatic languages, or mixed.
Sandawe | 60,000 | Vibrant | Tanzania | Spoken in the northwest of Dodoma Region. Tentatively linked to the Khoe languages.
Shabo | 400 | Endangered | Ethiopia | Spoken in Anderaccha, Gecha, and Kaabo of the Southern Nations, Nationalities, and Peoples' Region. Linked to the Gumuz and Koman families in the proposed Komuz branch of the Nilo-Saharan languages.[21]
Language | Speakers | Status | Countries | Comments
---|---|---|---|---
Burushaski | 300,000[22] | Vulnerable | Pakistan | Spoken in the Yasin Valley and Hunza Valley of Gilgit-Baltistan. Linked to Caucasian languages,[23] Indo-European,[24] [25] and Na-Dene languages[26] [27] in various proposals.
Elamite | | Extinct | Iran | Formerly spoken in Elam, along the northeast coast of the Persian Gulf. Attested from around 2800 BC to 300 BC.[28] Some propose a relationship to the Dravidian languages (see Elamo-Dravidian), but this is not well-supported.[29]
Kusunda | At least 1 (2023)[30] | Moribund | Nepal | Spoken in Gandaki Province. The recent discovery of a few speakers shows that it is not demonstrably related to anything else.[31]
Puroik[32] | 20,000 | Vulnerable | India |
Nihali | 2,000 | Endangered | India | Also known as Nahali. Spoken in northeastern Maharashtra and southwestern Madhya Pradesh, along the Tapti River. Strong lexical Munda influence from Korku.[33] Used as an anti-language by speakers.[34]
Nivkh | 200 | Moribund | Russia | Also known as Gilyak. Spoken in the lower Amur River basin and in the northern part of Sakhalin. Dialects sometimes considered two languages.[35] Has been linked to Chukotko-Kamchatkan languages.[36]
Sumerian | | Extinct | Iraq | Spoken in Mesopotamia until around 1800 BC, but used as a classical language until 100 AD.[37] Long-extinct but well-attested language of ancient Sumer.
Tambora | | Extinct | Indonesia | Poorly documented. Extinct since the 1815 eruption of Mount Tambora. Basic vocabulary points towards it being an isolate.
Current research considers that the "Papuasphere" centered in New Guinea includes as many as 37 isolates.[38] (As more becomes known about these languages, they become more likely to be assigned to a known language family.) To these, one must add several isolates found among the non-Pama-Nyungan languages of Australia:
Language | Speakers | Status | Countries | Comments
---|---|---|---|---
Abinomn | 300 | Vibrant | Indonesia | Spoken in the far north of New Guinea. Also known as Bas or Foia. Language is considered safe by UNESCO but endangered by Ethnologue.[39]
Anêm | 800 | Vibrant | Papua New Guinea | Spoken on the northwest coast of New Britain.[40] Perhaps related to Yélî Dnye and Ata.[41]
Ata | 2,000 | Vibrant | Papua New Guinea | Spoken in the central highlands of New Britain. Also known as Wasi. Perhaps related to Yélî Dnye and Anem.
Busa | 370 | Vibrant | Papua New Guinea | Spoken in Sandaun Province, northwestern Papua New Guinea. Added to Senu River.[42]
Giimbiyu | | Extinct | Australia | Spoken in the northern part of Arnhem Land until the early 1980s. Sometimes considered a small language family consisting of Mengerrdji, Urningangk and Erre.[43] Part of a proposal for the undemonstrated Arnhem Land language family.
Kol | 4,000 | Vibrant | Papua New Guinea | Spoken in the northeastern part of New Britain. Possibly related to the poorly known Sulka, or the Baining languages, suggested as part of the East Papuan languages.[44] [45]
Kuot | 2,400 | Vulnerable | Papua New Guinea | Spoken on New Ireland. Also known as Panaras. Suggested to form part of the East Papuan family.
Malak-Malak | 10 | Moribund | Australia | Spoken in northern Australia. Often considered part of one Northern Daly family together with Tyeraity. Used to be considered genetically related to the Wagaydyic languages, but nowadays they are considered genetically distinct.[46]
Murrinh-patha | 1,973 | Vibrant | Australia | Spoken on the eastern coast of Joseph Bonaparte Gulf in the Top End. The proposed linkage to Ngan'gityemerri in one Southern Daly family[47] is generally accepted to be valid.
Mpur | 5,000 | Vibrant | Indonesia | Spoken in the Mpur and Amberbaken Districts, Tambrauw Regency on the north coast of the Bird's Head Peninsula.
Ngan'gityemerri | 26 | Moribund | Australia | Spoken in the Top End along the Daly River. The proposed linkage to Murrinh-patha in one Southern Daly family is generally accepted to be valid.
Pyu | 250 | Vibrant | Papua New Guinea | Spoken in Green River Rural LLG in Sandaun Province, near the Indonesian border. Linked to neighboring Left May and Amto-Musan in a proposed Arai-Samaia family.[48]
Sulka | 2,500–3,000 | Vibrant | Papua New Guinea | Possible language isolate spoken across the eastern end of New Britain. Poorly attested. Suggested to form part of the East Papuan family.
Tayap | >50 | Moribund | Papua New Guinea | Formerly spoken in the village of Gapun. Links to Lower Sepik languages and Torricelli languages have been explored, but the general consensus among linguists is that it is an isolate unrelated to surrounding languages.[49]
Tiwi | 2,040 | Vulnerable | Australia | Spoken in the Tiwi Islands in the Timor Sea. Traditionally Tiwi is polysynthetic, but the Tiwi spoken by younger generations is not.[50]
Wagiman | 11 | Moribund | Australia | Spoken in the southern part of the Top End. May be distantly related to the Yangmanic languages,[51] which might in turn be a member of the Macro-Gunwinyguan family,[52] but neither link has been demonstrated.
Wardaman | 50 | Moribund | Australia | Spoken in the southern part of the Top End. The extinct and poorly attested Dagoman and Yangman dialects are sometimes treated as separate languages, forming a Yangmanic family, to which Wagiman may be distantly related. Possibly a member of the Macro-Gunwinyguan family, but this has yet to be demonstrated.
Language | Speakers | Status | Countries | Comments
---|---|---|---|---
Alsea | | Extinct | United States | Poorly attested. Spoken along the central coast of Oregon until the early 1950s.[58] Sometimes regarded as two separate languages. Often included in the Penutian hypothesis in a Coast Oregon Penutian branch.[59]
Atakapa | | Extinct | United States | Spoken on the Gulf coast of eastern Texas and southwestern Louisiana until the early 1900s. Often linked to Muskogean in a Gulf hypothesis.[60]
Chimariko | | Extinct | United States | Spoken in northern California until the 1950s.[61] Part of the Hokan hypothesis.[62]
Chitimacha | | Extinct | United States | Well-attested. Spoken along the Gulf coast of southeastern Louisiana until 1940.[63] Possibly in the Totozoquean family of Mesoamerica.
Coahuilteco | | Extinct | United States, Mexico | Spoken in southern Texas and northeastern Mexico until the 1700s. Part of the Pakawan hypothesis,[64] which has been linked to the hypothesised Hokan languages in a larger group.[65]
Cuitlatec | | Extinct | Mexico | Spoken in northern Guerrero until the 1960s.[66] Has been proposed to be part of Macro-Chibchan[67] and Uto-Aztecan.
Esselen | | Extinct | United States | Poorly known. Spoken in the Big Sur region of California until the early 1800s. Part of the Hokan hypothesis.[68]
Haida | 24 | Moribund | Canada, United States | Spoken in the Haida Gwaii archipelago off the northwest coast of British Columbia, and the southern islands of the Alexander Archipelago in southeastern Alaska. Some proposals connect it to the Na-Dené languages, but these have fallen into disfavor.[69]
Huave | 20,000 | Endangered | Mexico | Spoken in the Isthmus of Tehuantepec, in the southeast of Oaxaca state. Has been linked to various language families, but is still generally considered an isolate.[70]
Karuk | 12 | Moribund | United States | Spoken along the Klamath River in northwestern California. Part of the Hokan hypothesis, but there is little evidence for this.
Keres | 13,190 | Endangered | United States | Spoken in several pueblos throughout New Mexico, including Cochiti and Acoma Pueblos. Has two main dialects: Eastern and Western. Sometimes those two dialects are separated into languages in a Keresan family.[71]
Kutenai | 345 | Moribund | Canada, United States | Spoken in the Rockies of northeastern Idaho, northwestern Montana and southeastern British Columbia. Attempts have been made to place it in a Macro-Algic or Macro-Salishan family, but these have not gained significant support.
Natchez | | Extinct | United States | Spoken in southern Mississippi and eastern Louisiana until 1957.[72] Often linked to Muskogean in a Gulf hypothesis.[73] Attempts at revival have produced six people with some fluency.[74]
Purépecha | 140,000 | Endangered | Mexico | Spoken in the north of Michoacán state. Language of the ancient Tarascan kingdom. Sometimes regarded as two languages.
Salinan | | Extinct | United States | Spoken along the south-central coast of California. Part of the Hokan hypothesis.[75]
Seri | 720 | Vulnerable | Mexico | Spoken along the coast of the Gulf of California, in the southwest of Sonora state. Part of the Hokan hypothesis.[76]
Siuslaw | | Extinct | United States | Spoken on the southwest coast of Oregon until 1960. Likely related to Alsea, Coosan languages, or possibly the Wintuan languages. Part of the Penutian hypothesis.
Takelma | | Extinct | United States | Spoken in western Oregon until the mid-20th century.[77] Part of the Penutian hypothesis.
Timucua | | Extinct | United States | Well attested. Spoken in northern Florida and southern Georgia until the mid- to late 1700s. Briefly spoken in Cuba by a migrant community established in 1763. A connection with the poorly known Tawasa language has been suggested, but this may be a dialect.[78]
Tonkawa | | Extinct | United States | Spoken in central and northern Texas until the early 1940s.
Tunica | | Extinct | United States | Spoken in western Mississippi, northeastern Louisiana, and southeastern Arkansas until 1948. Attempts at revitalization have produced 32 second-language speakers.
Washo | 20 | Moribund | United States | Spoken along the Truckee River in the Sierra Nevada of eastern California and northwestern Nevada. Part of the Hokan hypothesis.[79]
Yana | | Extinct | United States | Well-attested. Spoken in northern California until 1916. Part of the Hokan hypothesis.[80]
Yuchi | | Extinct | United States | Spoken in Oklahoma, but formerly spoken in eastern Tennessee. A connection to the Siouan languages has been proposed.[81] The last native speaker died in 2021, but an ongoing revitalization project has trained a small number of second-language speakers.
Zuni | 9,620 | Vulnerable | United States | Spoken in Zuni Pueblo in northwestern New Mexico. Links to Penutian[82] and Keres[83] have been proposed.
Language | Speakers | Status | Countries | Comments
---|---|---|---|---
Aikanã | 200 | Endangered | Brazil | Spoken in the Amazon of eastern Rondônia. Links to Kanoê and Kwaza have been tentatively proposed.[84] Arawakan has been suggested.
Andoque | 370 | Endangered | Colombia, Peru | Spoken on the upper reaches of the Japurá River. Extinct in Peru. Possibly Witotoan.[85]
Betoi | | Extinct | Venezuela | Spoken in the Apure River basin near the Colombian border until the 18th century. Paezan has been suggested.
Candoshi-Shapra | 1,100 | Endangered | Peru | Spoken along the Chapuli, Huitoyacu, Pastaza, and Morona river valleys in southwestern Loreto. Has been linked to various language families, but no agreement exists on its classification.[86]
Canichana | | Extinct | Bolivia | Spoken in the Llanos de Moxos region of Beni Department until around 2000. Connections with various language families have been proposed, none widely accepted.[87]
Cayuvava | 4 | Moribund | Bolivia | Spoken in the Amazon west of Mamore River, north of Santa Ana del Yacuma in the Beni Department.[88]
Chimane | 5,300 | Vulnerable | Bolivia | Spoken along the Beni river in Beni Department. Also spelled Tsimané. Sometimes split into multiple languages in a Moséten family. Linked to the Chonan languages in a Moseten-Chonan hypothesis.[89]
Chiquitano | 5,900 | Endangered | Bolivia, Brazil | Spoken in the eastern part of Santa Cruz department and the southwestern part of Mato Grosso state. Has been linked to the Macro-Jê family.[90] [91]
Cofán | 2,400 | Endangered | Colombia, Ecuador | Spoken in northern Sucumbíos Province and southern Putumayo Department. Also called A'ingae.[92] Sometimes classified as Chibchan, but the similarities appear to be due to borrowings. Seriously endangered in Colombia.[93]
Fulniô | 1,000 | Moribund | Brazil | Spoken in the states of Paraíba, Pernambuco, Alagoas, Sergipe, and the northern part of Bahia. Divided into two dialects, Fulniô and Yatê.[94] Sometimes classified as a Macro-Jê language.[95] [96]
Guató | 6 | Moribund | Brazil | Spoken in the far south of Mato Grosso near the Bolivian border. Has been classified as Macro-Jê, but this is disputed.[97]
Itonama | 5 | Moribund | Bolivia | Spoken in the far-eastern part of Beni Department. A relationship to Paezan has been suggested.[98]
Kamëntsá | 4,000 | Endangered | Colombia | Spoken in Sibundoy in the Putumayo Department. Also known as Camsa, Coche, Sibundoy, Kamentxa, Kamse, or Camëntsëá.
Kanoê | 5 | Moribund | Brazil | Spoken in southeastern Rondônia. Also known as Kapishana. Tentatively linked to Kwaza and Aikanã. Part of a Macro-Paesan proposal.[99]
Kunza | | Extinct | Chile | Spoken in areas near Salar de Atacama until the 1950s. Also known as Atacameño. Part of a Macro-Paesan proposal.
Kwaza | 54 | Moribund | Brazil | Spoken in eastern Rondônia. Connections have been proposed with Aikanã and Kanoê.
Leco | 20 | Moribund | Bolivia | Spoken at the foot of the Andes in the department of La Paz.[100]
Mapuche | 260,000 | Vulnerable | Chile, Argentina | Spoken in areas of the far-southern Andes and in the Chiloé Archipelago. Also known as Mapudungun, Araucano or Araucanian.[101] Variously part of Andean, Macro-Panoan, or Mataco–Guaicuru[102] proposals. Sometimes Huilliche is treated as a separate language, reclassifying Mapuche into an Araucanian family.[103]
Munichi | | Extinct | Peru | Spoken in the southern part of Loreto Region until the late 1990s. Possibly evolved either from a mixed language or a sister language to Proto-Arawak.[104]
Movima | 1,400 | Vulnerable | Bolivia | Spoken in the Llanos de Moxos, in the north of Beni Department. Affiliations with Canichana, Chibcha and Macro-Tucanoan have been proposed, but none has been proven.[105]
Oti | | Extinct | Brazil | Spoken in São Paulo until the early 1900s. Macro-Jê has been suggested.[106]
Páez | 60,000 | Vulnerable | Colombia | Spoken in the northern part of Cauca Department. Several relationships have been proposed in the Paezan hypothesis, but nothing is conclusive.[107]
Puelche | | Extinct | Argentina, Chile | Spoken in the Pampas region. The last speaker died around 1960.[108] Sometimes linked to Het, as part of the Chonan languages.[109] Included in a proposed Macro-Jibaro family.[110]
Tequiraca | | Extinct | Peru | Spoken in the central part of Loreto until the 1950s. Also known as Auishiri. A connection with Canichana has been proposed.
Trumai | 50 | Moribund | Brazil | Speakers settled on the upper Xingu River and currently reside in the Xingu National Park in the northern part of Mato Grosso.[111]
Urarina | 3,000 | Vulnerable | Peru | Spoken in the central part of the Loreto Region.[112] Part of the Macro-Jibaro proposal.[113]
Waorani | 2,000 | Vulnerable | Ecuador, Peru | Also known as Sabela. Spoken between the Napo and Curaray rivers. Could be spoken by several groups living in isolation.[114]
Warao | 28,000 | Endangered | Guyana, Suriname and Venezuela | Spoken in the Orinoco Delta. Sometimes linked to Paezan.
Yaghan | | Extinct | Chile | Spoken in far-southern Tierra del Fuego until 2022. Also called Yámana.[115]
Yaruro | 7,900 | Vibrant | Venezuela | Spoken along the Orinoco, Cinaruco, Meta, and Apure rivers. Linked to the extinct Esmeralda language.[116]
Yuracaré | 2,700 | Endangered | Bolivia | Spoken in the foothills of the Andes, in Cochabamba and Beni Departments. Connections to Mosetenan, Pano–Tacanan, Arawakan, and Chonan have been suggested.[117]
Eberhard, David M.; Simons, Gary F.; Fennig, Charles D. (2019). Ethnologue: Languages of the World (22nd ed.). Dallas: SIL International.