ISO/IEC 8859-9 explained

ISO/IEC 8859-9
MIME: ISO-8859-9
Standard: TS 5881, ECMA-128, ISO/IEC 8859
Alias: iso-ir-148, latin5, l5, csISOLatin5
Classification: ISO 8859 (extended ASCII, ISO 4873 level 1)
Extends: US-ASCII
Preceded by: ISO/IEC 8859-3
Based on: ISO/IEC 8859-1
Other related: Windows-1254

ISO/IEC 8859-9:1999, Information technology — 8-bit single-byte coded graphic character sets — Part 9: Latin alphabet No. 5, is part of the ISO/IEC 8859 series of ASCII-based standard character encodings; its first edition was published in 1989. It is designated ECMA-128 by Ecma International and TS 5881 as a Turkish standard.[1] It is informally referred to as Latin-5 or Turkish. It was designed to cover the Turkish language and was intended to be more useful than the ISO/IEC 8859-3 encoding; the vast majority of its users use it for Turkish, although it can also be used for some other languages. It is identical to ISO/IEC 8859-1 except that six Icelandic characters (Ðð, Ýý, Þþ) are replaced with characters needed for the Turkish alphabet (Ğğ, İ, ı, Şş). In Turkish, the uppercase of i is İ and the lowercase of I is ı.
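The Turkish case pairs i/İ and I/ı are not produced by locale-independent case conversion, which pairs i with I. The following is a minimal Python sketch of how the pairs might be handled before encoding text as ISO-8859-9. The helper names `turkish_upper` and `turkish_lower` are hypothetical and not part of any standard library; `iso8859_9` is the codec name used by Python.

```python
# Hypothetical helpers (not a library API): Python's str.upper() and
# str.lower() are locale-independent, so the Turkish i/İ and I/ı pairs
# are remapped by hand before the default case conversion is applied.
def turkish_upper(text: str) -> str:
    return text.replace("i", "İ").replace("ı", "I").upper()


def turkish_lower(text: str) -> str:
    return text.replace("İ", "i").replace("I", "ı").lower()


word = "istanbul"
upper = turkish_upper(word)

# Each character of the result fits in a single ISO-8859-9 byte.
encoded = upper.encode("iso8859_9")
print(upper, encoded.hex(" "))
```

Replacing the Turkish letters before calling `upper()` or `lower()` keeps the rest of the string on the default Unicode case mapping; a production system would more likely rely on a locale-aware library.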

ISO-8859-9 is the IANA preferred charset name for this standard when supplemented with the C0 and C1 control codes from ISO/IEC 6429. In modern applications Unicode and UTF-8 are preferred; authors of new web pages and designers of new protocols are instructed to use UTF-8 instead.[2] As of 2023, less than 0.05% of all web pages use ISO-8859-9,[3][4] while 2.1% of web pages located in Turkey declare use of ISO-8859-9.[5] However, the WHATWG Encoding Standard, which specifies the character encodings that are permitted in HTML5 and that compliant browsers must support,[6] requires that web pages marked as ISO-8859-9 be handled as Windows-1254. Windows-1254 differs from ISO-8859-9 by using the CR range (bytes 0x80 to 0x9F), which ISO-8859-9 reserves for C1 control codes, for additional graphical characters instead; this is analogous to the relationship between ISO-8859-1 and Windows-1252.
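The difference between the two encodings in the 0x80 to 0x9F range can be illustrated with Python's built-in codecs. This is a sketch that assumes Python's `iso8859_9` and `cp1254` codecs as stand-ins for ISO-8859-9 and Windows-1254; the sample bytes are chosen for illustration, and Python's `cp1254` may treat unassigned bytes differently from the WHATWG index.

```python
# Four sample bytes from the 0x80-0x9F range (chosen for illustration).
sample = bytes([0x80, 0x85, 0x93, 0x94])

# ISO-8859-9 with ISO/IEC 6429 controls: each byte maps to the C1 control
# code with the same number (U+0080, U+0085, ...).
as_iso = sample.decode("iso8859_9")

# Windows-1254: the same bytes are printable characters.
as_win = sample.decode("cp1254")

print([f"U+{ord(ch):04X}" for ch in as_iso])
print(as_win)

# From 0xA0 upwards the two codecs decode every byte identically.
same_upper_half = all(
    bytes([b]).decode("iso8859_9") == bytes([b]).decode("cp1254")
    for b in range(0xA0, 0x100)
)
print(same_upper_half)
```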

Microsoft has assigned code page 28599, also known as Windows-28599, to ISO-8859-9 in Windows. IBM has assigned code page 920 (CCSID 920) to ISO-8859-9.[7][8]

Codepage layout

Differences from ISO-8859-1 have the Unicode code point number below the character.
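The positions that differ from ISO-8859-1, together with their Unicode code points, can be listed by comparing the two decodings byte by byte. This is a minimal sketch that assumes Python's `latin_1` and `iso8859_9` codecs; the variable names are illustrative only.

```python
# Compare how every byte value decodes under ISO-8859-1 and ISO-8859-9
# and collect the positions where the two encodings disagree.
differences = []
for byte in range(0x100):
    raw = bytes([byte])
    latin1_char = raw.decode("latin_1")
    latin5_char = raw.decode("iso8859_9")
    if latin1_char != latin5_char:
        code_point = f"U+{ord(latin5_char):04X}"
        differences.append((byte, latin1_char, latin5_char, code_point))

# Print one row per differing position:
# byte value, ISO-8859-1 character, ISO-8859-9 character, code point.
for byte, latin1_char, latin5_char, code_point in differences:
    print(f"0x{byte:02X}  {latin1_char}  {latin5_char}  {code_point}")
```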

Notes and References

  1. Web site: Latin-5: A list of the Latin-5 client and server CCSIDs, which includes Turkey. https://archive.today/20220213152735/https://www.ibm.com/docs/en/cics-tg-zos/9.3.0?topic=conversions-latin-5. 2022-02-13.
  2. Web site: van Kesteren, Anne. Names and labels. Encoding Standard. WHATWG.
  3. Web site: Historical trends in the usage of character encodings for websites. w3techs.com.
  4. Web site: Frequently Asked Questions. w3techs.com.
  5. Web site: Distribution of character encodings among websites that use Turkey. w3techs.com.
  6. Web site: 8.2.2.3. Character encodings. HTML 5.1 2nd Edition. "User agents must support the encodings defined in the WHATWG Encoding standard, including, but not limited to […]."
  7. Web site: Code page 920 information document. https://web.archive.org/web/20170116144609/https://www-01.ibm.com/software/globalization/cp/cp00920.html. 2017-01-16.
  8. Web site: CCSID 920 information document. https://web.archive.org/web/20160327100212/http://www-01.ibm.com/software/globalization/ccsid/ccsid920.html. 2016-03-27.