Tamil script

Tamil
Also known as: தமிழ்
Type: Abugida
Time period: c. 400 CE – present[1][2]
Languages: Tamil, Kanikkaran, Badaga, Irula, Paniya, Saurashtra
Parent systems: Egyptian, Proto-Sinaitic, Phoenician, Aramaic, Brahmi script, Tamil Brahmi, Pallava script
Sister systems: Grantha, Old Mon, Khmer, Cham, Kawi
ISO 15924: Taml

The Tamil script (Tamil: தமிழ் அரிச்சுவடி, pronounced /tamiɻ ˈaɾitːɕuʋaɽi/) is an abugida used by Tamils and Tamil speakers in India, Sri Lanka, Malaysia, Singapore, Indonesia and elsewhere to write the Tamil language.[3] It is one of the official scripts of the Indian Republic. Certain minority languages such as Saurashtra, Badaga, Irula and Paniya are also written in the Tamil script.

Characteristics

The Tamil script has 12 vowels (உயிரெழுத்து, uyireḻuttu, "soul-letters"), 18 consonants (மெய்யெழுத்து, meyyeḻuttu, "body-letters") and one special character, the ஃ (ஆய்த எழுத்து, āyta eḻuttu). The ஃ is called "அக்கு", akku, and is classified in Tamil orthography as being neither a consonant nor a vowel.[4] However, it is listed at the end of the vowel set. The script is syllabic, not alphabetic. It is written from left to right.

History

See also: Tamil-Brahmi, Vatteluttu alphabet, Grantha script, Pallava script, Kolezhuthu and Arwi.

The Tamil script, like the other Brahmic scripts, is thought to have evolved from the original Brahmi script. The earliest inscriptions that are accepted examples of Tamil writing date to the Ashokan period. The script used in these inscriptions is commonly known as Tamil-Brahmi or the "Tamili script" and differs in many ways from standard Ashokan Brahmi. For example, early Tamil-Brahmi, unlike Ashokan Brahmi, had a system to distinguish between pure consonants (such as m) and consonants with an inherent vowel (such as ma). In addition, according to Iravatham Mahadevan, early Tamil-Brahmi used slightly different vowel markers, had extra characters to represent letters not found in Sanskrit and omitted letters for sounds not present in Tamil, such as voiced consonants and aspirates. Inscriptions from the 2nd century use a later form of Tamil-Brahmi, which is substantially similar to the writing system described in the Tolkāppiyam, an ancient Tamil grammar. Most notably, they used the puḷḷi to suppress the inherent vowel. The Tamil letters thereafter evolved towards a more rounded form, and by the 5th or 6th century they had reached a form called the early vaṭṭeḻuttu.

The modern Tamil script does not, however, descend from that script. In the 4th century,[5] the Pallava dynasty created a new script for Tamil, called the Pallava script, and the Grantha alphabet evolved from it, with Vaṭṭeḻuttu letters added for sounds not found in Sanskrit. Parallel to the Grantha alphabet, a new script (the Chola-Pallava script, which evolved into the modern Tamil script) emerged in the Pallava and Chola territories. Its glyphs developed along the same lines as those of Grantha, but its shapes were heavily reduced and it did not take on letters for sounds foreign to Tamil. By the 8th century, the new scripts had supplanted Vaṭṭeḻuttu in the Pallava and Chola kingdoms, which lay in the northern portion of the Tamil-speaking region. However, Vaṭṭeḻuttu continued to be used in the southern portion of the Tamil-speaking region, in the Chera and Pandyan kingdoms, until the 11th century, when the Pandyan kingdom was conquered by the Cholas, who had inherited the new script while briefly feudatories of the Pallavas.

With the fall of the Pallava kingdom, the Chola dynasty promoted the Chola-Pallava script as the de facto script. Over the next few centuries, the Chola-Pallava script evolved into the modern Tamil script. Grantha and its parent script influenced the Tamil script notably. The use of palm leaves as the primary medium for writing led to changes in the script. The scribe had to be careful not to pierce the leaves with the stylus while writing, because a leaf with a hole was more likely to tear and decayed faster. As a result, the use of the puḷḷi to distinguish pure consonants became rare, with pure consonants usually being written as if the inherent vowel were present. Similarly, the vowel marker for the kuṟṟiyal-ukaram, a half-rounded u which occurs at the end of some words and in the medial position in certain compound words, marking a shortened u sound, also fell out of use and was replaced by the marker for the simple u. The puḷḷi did not fully reappear until the introduction of printing, but the kuṟṟiyal-ukaram marker never came back into use for this purpose, although it is retained in certain grammatical terms, and the sound itself still exists and plays an important role in Tamil prosody.

The forms of some of the letters were simplified in the 19th century to make the script easier to typeset. In the 20th century, the script was simplified even further in a series of reforms, which regularised the vowel markers used with consonants by eliminating special markers and most irregular forms.

Relationship with other Indic scripts

The Tamil script differs from other Brahmi-derived scripts in a number of ways. Unlike every other Brahmic script, it does not regularly represent voiced or aspirated stop consonants, as these are not phonemes of the Tamil language, even though voiced and fricative allophones of stops do appear in spoken Tamil. Thus the character க் k, for example, represents /k/ but can also be pronounced [ɡ] or [x] based on the rules of Tamil phonology. A separate set of characters appears for these sounds when the Tamil script is used to write Sanskrit or other languages.

Also unlike other Brahmic scripts, the Tamil script rarely uses typographic ligatures to represent conjunct consonants, which are far less frequent in Tamil than in other Indian languages. Where they occur, conjunct consonants are written by writing the character for the first consonant, adding the puḷḷi to suppress its inherent vowel, and then writing the character for the second consonant. There are a few exceptions, namely க்ஷ kṣa and ஶ்ரீ śrī.
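The puḷḷi-based spelling of consonant clusters described above can be sketched in Python. This is a minimal illustration, not a reference implementation: the function name `write_cluster` is hypothetical, and the only assumption about encoding is that the puḷḷi is the Unicode character U+0BCD.

```python
PULLI = "\u0BCD"  # Tamil puḷḷi (virama): suppresses the inherent vowel


def write_cluster(consonants):
    """Join base consonant letters into a consonant cluster.

    Every consonant except the last is followed by a puḷḷi;
    the last one keeps its inherent vowel a.
    `write_cluster` is a hypothetical helper used for illustration.
    """
    return PULLI.join(consonants)


# pa + pa -> ppa: the first consonant carries the puḷḷi
ppa = write_cluster(["\u0BAA", "\u0BAA"])

# ka + ṣa -> kṣa: the same code point sequence; fonts draw it as a ligature
ksa = write_cluster(["\u0B95", "\u0BB7"])
```

The exception kṣa is stored in the same way as any other cluster; whether it appears as a single ligature is decided by the font, not by the text.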

ISO 15919 is an international standard for the transliteration of Tamil and other Indic scripts into Latin characters. It uses diacritics to map the much larger set of Brahmic consonants and vowels to the Latin script.

Letters

Basic consonants

Consonants are called the "body" (mei) letters. The consonants are classified into three categories: vallinam (hard consonants), mellinam (soft consonants, including all nasals), and idaiyinam (medium consonants).

There are some lexical rules for the formation of words, which the Tolkāppiyam describes. For example, a word cannot end in certain consonants, and cannot begin with some consonants, including r-, l- and ḻ-. There are six nasal consonants in Tamil: a velar nasal ங், a palatal nasal ஞ், a retroflex nasal ண், a dental nasal ந், a bilabial nasal ம், and an alveolar nasal ன்.

The order of the letters in Tamil (strictly an abugida rather than an alphabet) closely matches that of the scripts of neighbouring and related languages, reflecting the common origin of these scripts in Brahmi.

The Tamil language has 18 consonants (mey eluttukkal). Traditional grammarians have classified these 18 into three groups of six letters each, based on the manner of articulation and hence the nature of the letters: vallinam (hard group), mellinam (soft group) and idaiyinam (medium group). All consonants are pronounced for half a unit (māttirai) of time when isolated; consonants combined with vowels are pronounced with the time length of the vowel.

Tamil consonants
Consonant   Category    IPA
க்   vallinam    /k/
ங்   mellinam    /ŋ/
ச்   vallinam    /t͡ʃ, s/
ஞ்   mellinam    /ɲ/
ட்   vallinam    /ʈ/
ண்   mellinam    /ɳ/
த்   vallinam    /t̪/
ந்   mellinam    /n̪/
ப்   vallinam    /p/
ம்   mellinam    /m/
ய்   idaiyinam   /j/
ர்   idaiyinam   /ɾ/
ல்   idaiyinam   /l/
வ்   idaiyinam   /ʋ/
ழ்   idaiyinam   /ɻ/
ள்   idaiyinam   /ɭ/
ற்   vallinam    /r/
ன்   mellinam    /n/
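
The three-way classification in the table above can be expressed as a small lookup. The sketch below is illustrative only; the names `GROUPS` and `classify` are hypothetical, and the letters are taken directly from the table.

```python
# The 18 consonants grouped as in the table above (base letters, no puḷḷi).
GROUPS = {
    "vallinam": "கசடதபற",
    "mellinam": "ஙஞணநமன",
    "idaiyinam": "யரலவழள",
}


def classify(letter):
    """Return the traditional group of a consonant, or None.

    Only the first code point is inspected, so a trailing puḷḷi
    or vowel sign is ignored. `classify` is a hypothetical helper.
    """
    base = letter[:1]
    if not base:
        return None
    for group, letters in GROUPS.items():
        if base in letters:
            return group
    return None
```

Letters outside the Tolkāppiyam set, such as the Grantha consonants, fall into none of the three groups, so the function returns `None` for them.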

Extra consonants used in Tamil

Spoken Tamil has incorporated many phonemes that were not part of the Tolkāppiyam classification. The letters used to write these sounds, known as Grantha letters, are used as part of Tamil. They are taught from elementary school and are incorporated in Tamil All Character Encoding (TACE16).

Grantha consonants in Tamil
Consonant   IPA
ஜ்    /d͡ʒ/
ஶ்    /ʃ/
ஷ்    /ʂ/
ஸ்    /s/
ஹ்    /h/
க்ஷ்   /kʂ/

There is also the compound ஶ்ரீ, equivalent to श्री in Devanagari.

Combinations of consonants with the ஃ (ஆய்த எழுத்து, āyta eḻuttu, equivalent to the nuqta) are occasionally used to represent phonemes of foreign languages, especially to write Islamic and Christian texts. For example: asif = அசிஃப், azārutīn̠ = அஃஜாருதீன், Genghis Khan = கெங்கிஸ் ஃகான்.

A nuqta-like diacritic is used when writing the Badaga language, and a double-dot nuqta when writing the Irula language, to transcribe their sounds.[6]

There have also been efforts to differentiate voiced and voiceless consonants through subscripted numbers: two, three, and four, which stand for the unvoiced aspirated, voiced, and voiced aspirated consonants respectively. This was used to transcribe Sanskrit words in Sanskrit–Tamil books, as shown in the table below.[7][8]

Unvoiced   Unvoiced aspirated   Voiced   Voiced aspirated
க   க₂   க₃   க₄
ச   ச₂   ஜ    ஜ₂
ட   ட₂   ட₃   ட₄
த   த₂   த₃   த₄
ப   ப₂   ப₃   ப₄

The Unicode Standard uses superscripted digits for the same purpose, as in ப², ப³, and ப⁴.[9]

Vowels

Vowels are also called the 'life' (uyir) or 'soul' letters. Together with the consonants (mei, which are called 'body' letters), they form compound, syllabic (abugida) letters that are called 'living' or 'embodied' letters (uyir mei, i.e. letters that have both 'body' and 'soul').

The Tamil language has 12 vowels: five short vowels, five long vowels, and two diphthongs.

Tamil vowels
Independent   Vowel sign   IPA
அ   (none)   /ɐ/
ஆ   ◌ா   /aː/
இ   ◌ி   /i/
ஈ   ◌ீ   /iː/
உ   ◌ு   /u/
ஊ   ◌ூ   /uː/
எ   ◌ெ   /e/
ஏ   ◌ே   /eː/
ஐ   ◌ை   /ɐi̯/
ஒ   ◌ொ   /o/
ஓ   ◌ோ   /oː/
ஔ   ◌ௌ   /ɐu̯/

Compound form

Using the consonant 'k' as an example:

Formation   Compound form   IPA
க் + அ   க    /kɐ/
க் + ஆ   கா   /kaː/
க் + இ   கி   /ki/
க் + ஈ   கீ   /kiː/
க் + உ   கு   /ku/
க் + ஊ   கூ   /kuː/
க் + எ   கெ   /ke/
க் + ஏ   கே   /keː/
க் + ஐ   கை   /kɐi̯/
க் + ஒ   கொ   /ko/
க் + ஓ   கோ   /koː/
க் + ஔ   கௌ   /kɐu̯/
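
The formation rule in the table above can be sketched in Python as a simple substitution: drop the puḷḷi from the pure consonant and append the dependent sign of the vowel. The names `VOWEL_SIGNS` and `combine` are hypothetical; the code points are those of the Unicode Tamil block.

```python
PULLI = "\u0BCD"  # puḷḷi, marks a pure consonant

# Independent vowel -> dependent vowel sign.
# The inherent vowel a has no sign of its own.
VOWEL_SIGNS = {
    "\u0B85": "",        # அ a
    "\u0B86": "\u0BBE",  # ஆ ā
    "\u0B87": "\u0BBF",  # இ i
    "\u0B88": "\u0BC0",  # ஈ ī
    "\u0B89": "\u0BC1",  # உ u
    "\u0B8A": "\u0BC2",  # ஊ ū
    "\u0B8E": "\u0BC6",  # எ e
    "\u0B8F": "\u0BC7",  # ஏ ē
    "\u0B90": "\u0BC8",  # ஐ ai
    "\u0B92": "\u0BCA",  # ஒ o
    "\u0B93": "\u0BCB",  # ஓ ō
    "\u0B94": "\u0BCC",  # ஔ au
}


def combine(pure_consonant, vowel):
    """Combine a pure consonant (ending in puḷḷi) with an independent vowel."""
    if not pure_consonant.endswith(PULLI):
        raise ValueError("expected a pure consonant ending in puḷḷi")
    base = pure_consonant[: -len(PULLI)]
    return base + VOWEL_SIGNS[vowel]


ka = combine("\u0B95\u0BCD", "\u0B85")   # க் + அ
kai = combine("\u0B95\u0BCD", "\u0B90")  # க் + ஐ
```

The result is always stored with the consonant first, even for vowels whose sign is drawn to the left of the consonant.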

The special letter ஃ, represented by three dots, is called āytam or aḵ. It originally represented an archaic Tamil retention of the Dravidian sound ḥ, which has been lost in almost all modern Dravidian languages, and in Tamil it traditionally serves a purely grammatical function, but in modern times it has come to be used as a diacritic to represent foreign sounds. For example, ஃப is used for the English sound f, which is not found in Tamil. Before palm leaves became the primary writing medium, it also served as a pronunciation marker for the short u, called kuṟṟiyal-ukaram, in words ending in a consonant plus u. The following syllables showed this behaviour: கு, சு, டு, து, பு, று. Instead of being written without any marker, as in modern usage, such a word was written with a preceding ஃ.

Another archaic Tamil letter, ஂ, represented by a small hollow circle, is the Anusvara. It was traditionally used as a homorganic nasal when in front of a consonant, and either as a bilabial nasal (/m/) or an alveolar nasal (/n/) at the end of a word, depending on the context.

The long vowels are about twice as long as the short vowels. The diphthongs are usually pronounced about one and a half times as long as the short vowels, though some grammatical texts place them with the long vowels.

As can be seen in the compound form, the vowel sign can be added to the right, left or both sides of the consonants. It can also form a ligature. These rules are evolving and older use has more ligatures than modern use. What you actually see on this page depends on your font selection; for example, Code2000 will show more ligatures than Latha.

There are proponents of script reform who want to eliminate all ligatures and let all vowel signs appear on the right side.

Unicode encodes the character in logical order (always the consonant first), whereas legacy 8-bit encodings (such as TSCII) prefer the written order. This makes it necessary to reorder when converting from one encoding to another; it is not sufficient simply to map one set of code points to the other.
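
The reordering step can be illustrated with a short Python sketch. It does not produce TSCII bytes; it only moves the left-hand parts of vowel signs in front of the preceding consonant, which is the step that a plain code point mapping would miss. The names `to_written_order`, `LEFT_SIGNS` and `TWO_PART_SIGNS` are hypothetical, and the sketch assumes each vowel sign directly follows a single consonant letter (ligatures such as க்ஷ are not handled).

```python
# Vowel signs drawn entirely to the left of the consonant.
LEFT_SIGNS = {"\u0BC6", "\u0BC7", "\u0BC8"}  # e, ē, ai

# Vowel signs drawn on both sides: (left part, right part).
TWO_PART_SIGNS = {
    "\u0BCA": ("\u0BC6", "\u0BBE"),  # o  = e sign + ā sign
    "\u0BCB": ("\u0BC7", "\u0BBE"),  # ō  = ē sign + ā sign
    "\u0BCC": ("\u0BC6", "\u0BD7"),  # au = e sign + au length mark
}


def to_written_order(text):
    """Rearrange logical-order text into left-to-right written order."""
    out = []
    for ch in text:
        if ch in TWO_PART_SIGNS and out:
            left, right = TWO_PART_SIGNS[ch]
            out.insert(len(out) - 1, left)  # left part goes before the consonant
            out.append(right)               # right part stays after it
        elif ch in LEFT_SIGNS and out:
            out.insert(len(out) - 1, ch)
        else:
            out.append(ch)
    return "".join(out)


# Logical order: ka + ō sign; written order: ē sign, ka, ā sign
written = to_written_order("\u0B95\u0BCB")
```

Signs written to the right of or attached to the consonant, such as those for ā, i and u, are left in place, so only syllables with e, ē, ai, o, ō or au change.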

Compound table of Tamil letters

The following table lists vowel (or life) letters across the top and consonant (or body) letters along the side, the combination of which gives all Tamil compound letters.

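Tolkāppiyam consonants (rows) and vowels (columns)
Consonant        a   ā   i   ī   u   ū   e   ē   ai  o   ō   au
Vowel sign       (none)  ◌ா  ◌ி  ◌ீ  ◌ு  ◌ூ  ◌ெ  ◌ே  ◌ை  ◌ொ  ◌ோ  ◌ௌ
∅ (Independent)  அ  ஆ  இ  ஈ  உ  ஊ  எ  ஏ  ஐ  ஒ  ஓ  ஔ
க் k   க  கா  கி  கீ  கு  கூ  கெ  கே  கை  கொ  கோ  கௌ
ங் ṅ   ங  ஙா  ஙி  ஙீ  ஙு  ஙூ  ஙெ  ஙே  ஙை  ஙொ  ஙோ  ஙௌ
ச் c   ச  சா  சி  சீ  சு  சூ  செ  சே  சை  சொ  சோ  சௌ
ஞ் ñ   ஞ  ஞா  ஞி  ஞீ  ஞு  ஞூ  ஞெ  ஞே  ஞை  ஞொ  ஞோ  ஞௌ
ட் ṭ   ட  டா  டி  டீ  டு  டூ  டெ  டே  டை  டொ  டோ  டௌ
ண் ṇ   ண  ணா  ணி  ணீ  ணு  ணூ  ணெ  ணே  ணை  ணொ  ணோ  ணௌ
த் t   த  தா  தி  தீ  து  தூ  தெ  தே  தை  தொ  தோ  தௌ
ந் n   ந  நா  நி  நீ  நு  நூ  நெ  நே  நை  நொ  நோ  நௌ
ப் p   ப  பா  பி  பீ  பு  பூ  பெ  பே  பை  பொ  போ  பௌ
ம் m   ம  மா  மி  மீ  மு  மூ  மெ  மே  மை  மொ  மோ  மௌ
ய் y   ய  யா  யி  யீ  யு  யூ  யெ  யே  யை  யொ  யோ  யௌ
ர் r   ர  ரா  ரி  ரீ  ரு  ரூ  ரெ  ரே  ரை  ரொ  ரோ  ரௌ
ல் l   ல  லா  லி  லீ  லு  லூ  லெ  லே  லை  லொ  லோ  லௌ
வ் v   வ  வா  வி  வீ  வு  வூ  வெ  வே  வை  வொ  வோ  வௌ
ழ் ḻ   ழ  ழா  ழி  ழீ  ழு  ழூ  ழெ  ழே  ழை  ழொ  ழோ  ழௌ
ள் ḷ   ள  ளா  ளி  ளீ  ளு  ளூ  ளெ  ளே  ளை  ளொ  ளோ  ளௌ
ற் ṟ   ற  றா  றி  றீ  று  றூ  றெ  றே  றை  றொ  றோ  றௌ
ன் ṉ   ன  னா  னி  னீ  னு  னூ  னெ  னே  னை  னொ  னோ  னௌ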

Grantha compound table
Grantha consonants (rows) and vowels (columns)
Consonant    a   ā   i   ī   u   ū   e   ē   ai  o   ō   au
Vowel sign   (none)  ◌ா  ◌ி  ◌ீ  ◌ு  ◌ூ  ◌ெ  ◌ே  ◌ை  ◌ொ  ◌ோ  ◌ௌ
ஶ் ś    ஶ  ஶா  ஶி  ஶீ  ஶு  ஶூ  ஶெ  ஶே  ஶை  ஶொ  ஶோ  ஶௌ
ஜ் j    ஜ  ஜா  ஜி  ஜீ  ஜு  ஜூ  ஜெ  ஜே  ஜை  ஜொ  ஜோ  ஜௌ
ஷ் ṣ    ஷ  ஷா  ஷி  ஷீ  ஷு  ஷூ  ஷெ  ஷே  ஷை  ஷொ  ஷோ  ஷௌ
ஸ் s    ஸ  ஸா  ஸி  ஸீ  ஸு  ஸூ  ஸெ  ஸே  ஸை  ஸொ  ஸோ  ஸௌ
ஹ் h    ஹ  ஹா  ஹி  ஹீ  ஹு  ஹூ  ஹெ  ஹே  ஹை  ஹொ  ஹோ  ஹௌ
க்ஷ் kṣ  க்ஷ  க்ஷா  க்ஷி  க்ஷீ  க்ஷு  க்ஷூ  க்ஷெ  க்ஷே  க்ஷை  க்ஷொ  க்ஷோ  க்ஷௌ
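
The grid of Tolkāppiyam compound letters can be generated rather than typed out, since in Unicode each cell is a base consonant followed by one dependent vowel sign. The following is a minimal sketch under that assumption; `BASES`, `SIGNS` and `compound_table` are hypothetical names.

```python
# The 18 base consonants, in the order of the table rows.
BASES = "கஙசஞடணதநபமயரலவழளறன"

# Dependent signs for a, ā, i, ī, u, ū, e, ē, ai, o, ō, au.
# The inherent vowel a is written with no sign.
SIGNS = [
    "", "\u0BBE", "\u0BBF", "\u0BC0", "\u0BC1", "\u0BC2",
    "\u0BC6", "\u0BC7", "\u0BC8", "\u0BCA", "\u0BCB", "\u0BCC",
]


def compound_table():
    """Map each base consonant to its 12 compound letters."""
    return {base: [base + sign for sign in SIGNS] for base in BASES}


table = compound_table()
total = sum(len(row) for row in table.values())  # 18 rows of 12 cells
for base, row in table.items():
    print(base + "\u0BCD", " ".join(row))
```

With 18 consonants and 12 vowels the grid holds 18 × 12 = 216 compound letters; the irregular shapes seen for some u and ū forms come from the font, not from the stored sequence.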

Notes and References

  1. Rajan, K. (December 2001). Territorial Division as Gleaned from Memorial Stones. East and West, 51(3/4), 363. Istituto Italiano per l'Africa e l'Oriente (IsIAO). 29757518. (Table showing Tamil in row for the 601–800 period.)
  2. Diringer, David. (1948). Alphabet a key to the history of mankind. p. 385.
  3. at p. 324
  4. University of Madras Tamil Lexicon, page 148: "அலியெழுத்து [aliyeḻuttu] n. < அலி¹ +. 1. The letter ஃ, as being regarded as neither a vowel nor a consonant; ஆய்தம். (வெண்பாப். முதன்மொ. 6, உரை.) 2. Consonants; மெய்யெ ழுத்து. (பிங்.)." https://dsal.uchicago.edu/cgi-bin/romadict.pl?table=tamil-lex&page=148&display=utf8
  5. Griffiths, Arlo. (2014). Early Indic Inscriptions of Southeast Asia.
  6. The Unicode Standard Version 13.0 – Core Specification, South and Central Asia-I, Official Scripts of India, p. 498.
  7. Sharma, Shriramana. (2010a). Proposal to encode characters for Extended Tamil.
  8. Sharma, Shriramana. (2010c). Follow-up #2 to Extended Tamil proposal.
  9. Unicode Consortium. (2019). Tamil. In The Unicode Standard Version 12.0 (pp. 489–498).
  10. Selvakumar, V. (2016). History of Numbers and Fractions and Arithmetic Calculations in the Tamil Region: Some Observations. HuSS: International Journal of Research in Humanities and Social Sciences, 3(1), 27–35. https://doi.org/10.15613/HIJRH/2016/V3I1/111730
  11. Sharma, Shriramana. (2010b). Follow-up to Extended Tamil proposal L2/10-256R.
  12. Eraiyarasan, B. Dr. B. Eraiyarasan's comments on Tamil Unicode And Grantham proposals.
  13. Nalankilli, Thanjai. (2018). Attempts to "Pollute" Tamil Unicode with Grantha Characters. Tamil Tribune. Retrieved 12 March 2019 from http://www.tamiltribune.com/18/1201.html
  14. Government of India. (2010). Unicode Standard for Grantha Script.
  15. Sharma, Shriramana. (2012). Proposal to encode Tamil fractions and symbols.
  16. ICTA of Sri Lanka. (2014). Comments on the Proposals to Encode Tamil Symbols and Fractions.
  17. Government of Tamil Nadu. (2017). Finalized proposal to encode Tamil fractions and symbols.
  18. Pournader, Roozbeh. (24 January 2018). The two ways to represent Tamil Shri. Unicode. Archived at https://web.archive.org/web/20230404201743/https://www.unicode.org/L2/L2018/18054-tamil-shri.txt on 4 April 2023.
  19. Open-Tamil 0.65: Python Package Index.