Slovene alphabet explained

Slovene alphabet
Type:	Alphabet
Languages:	Slovene
Time:	early 19th century – present
Fam1:	Egyptian hieroglyphs
Fam2:	Proto-Sinaitic alphabet
Fam3:	Phoenician alphabet
Fam4:	Greek alphabet
Fam5:	Old Italic scripts
Fam6:	Latin alphabet
Fam7:	Czech alphabet
Fam8:	Gaj's Latin alphabet
Children:	Slovene national phonetic transcription
Unicode:	Subset of Latin (Basic Latin and Latin Extended-A)
Sample:	Slovenska abeceda.png

The Slovene alphabet (Slovene: slovenska abeceda or slovenska gajica) is an extension of the Latin script used to write Slovene. The standard language uses a Latin alphabet that is a slight modification of the Croatian Gaj's Latin alphabet, consisting of 25 lower- and upper-case letters.

Characters

The following Latin letters are also found separately alphabetized in words of non-Slovene origin: Ć (mehki č), Đ (dže), Q (ku), W (dvojni ve), X (iks), and Y (ipsilon).

Letter	Name	IPA	English approx.
A, a	a	/a/	arm
B, b	be	/b/	bat
C, c	ce	/ts/	cats
Č, č	če	/tʃ/	charge
D, d	de	/d/	day
E, e	e	/ɛ/, /e/, /ə/	bed, sleigh
F, f	ef	/f/	fat
G, g	ge	/ɡ/	gone
H, h	ha	/x/	(Scottish English) loch
I, i	i	/i/	me
J, j	je	/j/	yes
K, k	ka	/k/	cat
L, l	el	/l/, /w/	lid
M, m	em	/m/	month
N, n	en	/n/	nose
O, o	o	/ɔ/, /o/	void, so
P, p	pe	/p/	poke
R, r	er	/r/	(trilled) risk
S, s	es	/s/	sat
Š, š	eš	/ʃ/	shin
T, t	te	/t/	took
U, u	u	/u/	sooth
V, v	ve	/v/, /w/	virus
Z, z	ze	/z/	zoo
Ž, ž	že	/ʒ/	parmesan, vision
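
The letter order in the table matters for alphabetization: č, š and ž sort directly after c, s and z, which plain code-point sorting does not reproduce. A minimal sketch of a sort key built from the 25-letter order follows; the names SLOVENE_ALPHABET and slovene_sort_key are illustrative, and placing characters outside the alphabet last is an assumption of this sketch, not a rule from any orthographic standard.

```python
# The 25 letters of the standard alphabet, in the order given in the table.
SLOVENE_ALPHABET = "abcčdefghijklmnoprsštuvzž"

# Rank of each letter; anything outside the alphabet is ranked last
# (an assumption of this sketch).
_RANK = {letter: index for index, letter in enumerate(SLOVENE_ALPHABET)}


def slovene_sort_key(word):
    """Return a key that orders words by the Slovene alphabet, ignoring case."""
    return [_RANK.get(ch, len(SLOVENE_ALPHABET)) for ch in word.lower()]


words = ["žaba", "zima", "cesta", "čaj", "šola", "sol"]

# Code-point order pushes č, š, ž to the end.
print(sorted(words))
# ['cesta', 'sol', 'zima', 'čaj', 'šola', 'žaba']

# Alphabet-aware order keeps č after c, š after s, ž after z.
print(sorted(words, key=slovene_sort_key))
# ['cesta', 'čaj', 'sol', 'šola', 'zima', 'žaba']
```

Production code would more likely rely on a locale-aware collation library; the explicit table is used here only to make the ordering visible.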

Diacritics

To compensate for the shortcomings of the standard orthography, Slovene also uses standardized diacritics or accent marks to denote stress, vowel length and pitch accent, much like the closely related Serbo-Croatian. However, as in Serbo-Croatian, such accent marks are restricted to dictionaries, language textbooks and linguistic publications. In normal writing, the diacritics are almost never used, except in a few minimal pairs where real ambiguity could arise.

Two different and mutually incompatible systems of diacritics are used. The first is the simpler non-tonemic system, which can be applied to all Slovene dialects. It is more widely used and is the standard representation in dictionaries such as SSKJ. The tonemic system also includes tone as part of the representation. However, neither system reliably distinguishes schwa /ə/ from the front mid-vowels, nor vocalised l /w/ from regular l /l/. Some sources write these as ə and ł, respectively, but this is not as common.

Non-tonemic diacritics

In the non-tonemic system, the distinction between the two mid-vowels is indicated, as well as the placement of stress and length of vowels:

Long stressed vowels are notated with an acute diacritic: á é í ó ú ŕ (IPA: /aː eː iː oː uː ər/).
However, the rarer long stressed low-mid vowels /ɛː/ and /ɔː/ are notated with a circumflex: ê ô.
Short stressed vowels are notated with a grave: à è ì ò ù (IPA: /a ɛ i ɔ u/). Some systems may also include ə̀ for /ə/.
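
Because these accent marks are normally absent from everyday text, a program comparing dictionary headwords with ordinary spelling may need to remove them while keeping the carons of č, š and ž. The following is a minimal sketch using Unicode normalization; the function name strip_accents is hypothetical, and only the three marks listed above (grave, acute, circumflex) are handled.

```python
import unicodedata

# Combining marks used by the non-tonemic system:
# U+0300 grave, U+0301 acute, U+0302 circumflex.
ACCENT_MARKS = {"\u0300", "\u0301", "\u0302"}


def strip_accents(text):
    """Remove stress/length marks but keep other diacritics such as the caron."""
    # NFD splits each accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    kept = "".join(ch for ch in decomposed if ch not in ACCENT_MARKS)
    # NFC recombines what is left, so c + caron becomes č again.
    return unicodedata.normalize("NFC", kept)


print(strip_accents("jêsen"))   # jesen
print(strip_accents("jesén"))   # jesen
print(strip_accents("gòl"))     # gol
print(strip_accents("čšž"))     # čšž (carons are untouched)
```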

Tonemic diacritics

The tonemic system uses the diacritics somewhat differently from the non-tonemic system. The high-mid vowels /eː/ and /oː/ are written ẹ ọ with a subscript dot, while the low-mid vowels /ɛː/ and /ɔː/ are written as plain e o.

Pitch accent and length are indicated by four diacritical marks:

The acute (´) indicates long and low pitch: á é ẹ́ í ó ọ́ ú ŕ (IPA: /àː ɛ̀ː èː ìː ɔ̀ː òː ùː ə̀r/).
The inverted breve (̑) indicates long and high pitch: ȃ ȇ ẹ̑ ȋ ȏ ọ̑ ȗ ȓ (IPA: /áː ɛ́ː éː íː ɔ́ː óː úː ə́r/).
The grave (`) indicates short and low pitch. This occurs only on è (IPA: /ə̀/), optionally written as ə̀.
The double grave (̏) indicates short and high pitch: ȁ ȅ ȉ ȍ ȕ (IPA: /á ɛ́ í ɔ́ ú/). ȅ is also used for /ə́/, optionally written as ə̏.
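
The four marks above amount to a small lookup table from diacritic to length and pitch. A minimal sketch of that table, assuming the input is a single accented letter in tonemic notation; the names TONEMIC_MARKS and describe_accent are illustrative only.

```python
import unicodedata

# Combining mark -> (length, pitch), following the four rules listed above.
TONEMIC_MARKS = {
    "\u0301": ("long", "low"),    # acute
    "\u0311": ("long", "high"),   # inverted breve
    "\u0300": ("short", "low"),   # grave
    "\u030F": ("short", "high"),  # double grave
}


def describe_accent(letter):
    """Return (length, pitch) for an accented letter, or None if unaccented."""
    # NFD exposes the combining marks; a dot below (as in ẹ, ọ) is skipped
    # because it marks vowel quality, not accent.
    for ch in unicodedata.normalize("NFD", letter):
        if ch in TONEMIC_MARKS:
            return TONEMIC_MARKS[ch]
    return None


print(describe_accent("á"))  # ('long', 'low')
print(describe_accent("ȃ"))  # ('long', 'high')
print(describe_accent("è"))  # ('short', 'low')
print(describe_accent("ȁ"))  # ('short', 'high')
print(describe_accent("e"))  # None
```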

The schwa vowel /ə/ is written ambiguously as e, but its accentuation will sometimes distinguish it: a long vowel mark can never appear on a schwa, while a grave accent can appear only on a schwa. Thus, only ȅ and unstressed e are truly ambiguous.

Others

Ordinary writing also uses accent marks occasionally, to disambiguate similar words with different meanings. For example:

gòl (naked) | gól (goal),
jêsen (ash (tree)) | jesén (autumn),
kót (angle, corner) | kot (as, like),
kózjak (goat's dung) | kozják (goat-shed),
med (between) | méd (brass) | méd (honey),
pól (pole) | pól (half (of)) | pôl (expresses a half an hour before the given hour),
prècej (at once) | precéj (a great deal (of)),
remí (draw) | rémi (rummy, a card game),
je (he/she is) | jé (he/she eats).

Foreign words

There are 5 letters for vowels (a, e, i, o, u) and 20 for consonants. The letters q, w, x, y are excluded from the standard spelling, as are some Serbo-Croatian graphemes (ć, đ); however, they are collated as independent letters in some encyclopedias and dictionary listings. Foreign proper nouns and toponyms are often not adapted to Slovene orthography, unlike in some other Slavic languages, where they are adapted partly (as in Russian) or entirely (as in the Serbian standard of Serbo-Croatian).

In addition, the graphemes ö and ü are used in certain non-standard dialect spellings (usually representing loanwords from German, Hungarian or Turkish) – for example, dödöli (Prekmurje potato dumplings) and Danilo Türk (a politician).

Encyclopedic listings (such as in the 2001 Slovenski pravopis and the 2006 Leksikon SOVA) use this alphabet:

a, b, c, č, ć, d, đ, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, š, t, u, v, w, x, y, z, ž.

Therefore, Newton and New York remain the same and are not transliterated to Njuton or Njujork; transliterated forms would seem very odd to a Slovene. However, the unit of force is written as njuton as well as newton. Some place names are transliterated (e.g. Philadelphia – Filadelfija; Hawaii – Havaji). Other names from non-Latin languages are transliterated in a fashion similar to that used by other European languages, albeit with some adaptations. Japanese, Indonesian and Arabic names such as Kajibumi, Jakarta and Jabar are written as Kadžibumi, Džakarta and Džabar, where j is replaced with dž. Except for ć and đ, graphemes with diacritical marks from other foreign alphabets (e.g., ä, å, æ, ç, ë, ï, ń, ö, ß, ş, ü) are not used as independent letters.

History

The modern alphabet (abeceda) was standardised in the mid-1840s from an arrangement of the Croatian national reviver and leader Ljudevit Gaj which would become the Croatian alphabet, and was in turn patterned on the Czech alphabet. Before the current alphabet became standard, š was, for example, written as ʃ, ʃʃ or ſ; č as tʃch, cz, tʃcz or tcz; i sometimes as y as a relic of the letter now rendered as Ы (yery) in modern Russian; j as y; l as ll; v as w; ž as ʃ, ʃʃ or ʃz.

In the old alphabet used by most distinguished writers, the Bohorič alphabet (bohoričica), developed by Adam Bohorič, the characters č, š and ž would be spelt as zh, ſh and sh respectively, and c, s and z would be spelt as z, ſ and s respectively. To remedy this, so that there was a one-to-one correspondence between sounds and letters, Jernej Kopitar urged the development of a new alphabet.
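
The six correspondences just described can be applied mechanically, as long as the digraphs zh, ſh and sh are matched before the single letters. The sketch below does only that; the function name bohoric_to_gaj is hypothetical, the input strings are constructed for illustration and are not quoted from historical texts, and real Bohorič orthography had further conventions (including capital letters) that are not covered here.

```python
# Bohorič spelling -> modern letter, as described in the paragraph above.
BOHORIC_TO_GAJ = {
    "zh": "č", "ſh": "š", "sh": "ž",  # digraphs
    "z": "c", "ſ": "s", "s": "z",     # single letters
}


def bohoric_to_gaj(text):
    """Convert lower-case Bohorič spelling to the modern alphabet (sketch)."""
    out = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if len(pair) == 2 and pair in BOHORIC_TO_GAJ:
            # Digraphs take priority so that "sh" is not read as s + h.
            out.append(BOHORIC_TO_GAJ[pair])
            i += 2
        else:
            # Single letters are mapped if listed, otherwise copied unchanged.
            out.append(BOHORIC_TO_GAJ.get(text[i], text[i]))
            i += 1
    return "".join(out)


# Constructed inputs, not attested spellings.
print(bohoric_to_gaj("shaba"))  # žaba
print(bohoric_to_gaj("zeſta"))  # cesta
print(bohoric_to_gaj("ſheſt"))  # šest
print(bohoric_to_gaj("sima"))   # zima
```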

In 1825, Franc Serafin Metelko proposed his version of the alphabet (the Metelko alphabet, metelčica). However, it was banned in 1833 in favour of the Bohorič alphabet after the so-called "Suit of the Letters" (Črkarska pravda) (1830–1833), which was won by France Prešeren and Matija Čop. Another alphabet, the Dajnko alphabet (dajnčica), was developed by Peter Dajnko in 1824, but did not catch on as widely as the Metelko alphabet; it was banned in 1838 because it mixed Latin and Cyrillic characters, which was seen as a poor way to handle missing characters.

Gaj's Latin alphabet (gajica) was adopted afterwards, although it still fails to distinguish all the phonemes of Slovene.

Computer encoding

The preferred character encodings for Slovene texts are the Unicode encodings UTF-8 and UTF-16, and ISO/IEC 8859-2 (Latin-2), which generally supports Central and Eastern European languages that are written in the Latin script.
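
As a quick illustration of how these encodings differ, the sketch below encodes the three caron letters with Python's built-in codecs. The codec names ("utf-8", "utf-16-le", "iso8859-2") are Python's own aliases; the little-endian UTF-16 variant without a byte-order mark is chosen here only to keep the output short.

```python
text = "čšž"

for codec in ("utf-8", "utf-16-le", "iso8859-2"):
    data = text.encode(codec)
    # Show the byte count and the bytes as space-separated hex pairs.
    print(f"{codec:10} {len(data)} bytes: {data.hex(' ')}")

# utf-8      6 bytes: c4 8d c5 a1 c5 be
# utf-16-le  6 bytes: 0d 01 61 01 7e 01
# iso8859-2  3 bytes: e8 b9 be
```

Latin-2 stores each letter in a single byte, while both Unicode encodings need two bytes for each of these letters.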

Within the original 7-bit ASCII character set, which has no č, š or ž, the following conventions for writing Slovene text can be found:

a, b, c, *c, d, e, f, g, h, i, j, k, l, m, n, o, p, r, s, *s, t, u, v, z, *z

a, b, c, "c, d, e, f, g, h, i, j, k, l, m, n, o, p, r, s, "s, t, u, v, z, "z

a, b, c, c(, d, e, f, g, h, i, j, k, l, m, n, o, p, r, s, s(, t, u, v, z, z(

a, b, c, c^, d, e, f, g, h, i, j, k, l, m, n, o, p, r, s, s^, t, u, v, z, z^

a, b, c, cx, d, e, f, g, h, i, j, k, l, m, n, o, p, r, s, sx, t, u, v, z, zx
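
The suffix-style conventions above (c(, c^, cx) can be produced and reversed with a simple substitution. A minimal sketch, assuming a suffix marker and handling only č, š, ž and their capitals; the function names to_ascii and from_ascii are hypothetical. Note that the reverse direction is only safe when the marker cannot otherwise follow c, s or z in the text, which is an assumption of this sketch.

```python
# Caron letter -> ASCII base letter.
CARON_LETTERS = {"č": "c", "š": "s", "ž": "z", "Č": "C", "Š": "S", "Ž": "Z"}


def to_ascii(text, marker="^"):
    """Replace each caron letter with its base letter followed by a marker."""
    return "".join(
        CARON_LETTERS[ch] + marker if ch in CARON_LETTERS else ch
        for ch in text
    )


def from_ascii(text, marker="^"):
    """Reverse to_ascii for the same marker."""
    for accented, base in CARON_LETTERS.items():
        text = text.replace(base + marker, accented)
    return text


print(to_ascii("Črkarska pravda"))       # C^rkarska pravda
print(to_ascii("šž", marker="x"))        # sxzx
print(from_ascii("C^rkarska pravda"))    # Črkarska pravda
```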

In ISO/IEC 8859-1 (Latin-1), typical workarounds for the missing characters Č (č), Š (š) and Ž (ž) are C~ (c~), S~ (s~) and Z~ (z~), or conventions similar to those used for ASCII.

Under DOS and Microsoft Windows, code page 852 and Windows-1250, respectively, also fully support the Slovene alphabet.
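
Which of the encodings mentioned in this section can represent the full alphabet can be checked directly. A minimal sketch using Python's built-in codecs; the helper name can_encode is illustrative.

```python
ALPHABET = "abcčdefghijklmnoprsštuvzž"


def can_encode(text, codec):
    """Return True if every character of text exists in the given codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


# Test both lower- and upper-case letters.
sample = ALPHABET + ALPHABET.upper()

for codec in ("ascii", "latin-1", "iso8859-2", "cp852", "cp1250", "utf-8"):
    print(f"{codec:10} {can_encode(sample, codec)}")

# ascii      False
# latin-1    False
# iso8859-2  True
# cp852      True
# cp1250     True
# utf-8      True
```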

In TeX notation, č, š and ž become \v c, \v s, \v z or \v{c}, \v{s}, \v{z}, or in their macro versions "c, "s and "z, or in other representations as \~, \