Latin-1 Supplement Explained

Block name: Latin-1 Supplement (C1 Controls and Latin-1 Supplement)
Range: U+0080 to U+00FF
Symbols: Punctuation, Mathematics, Currency
Alphabets: French, German, Icelandic, Portuguese, Spanish
Version 1.0.0: 128
Controls: 33
Sources: ISO/IEC 8859-1
Note: [1] [2]

The Latin-1 Supplement (also called C1 Controls and Latin-1 Supplement) is the second block in the Unicode Standard. It encodes the upper range of ISO 8859-1, 80 to FF, as U+0080 to U+00FF. The block contains 128 characters: the 32 C1 controls (U+0080 to U+009F), which are not graphic characters, 32 Latin-1 punctuation marks and symbols, 30 pairs of majuscule and minuscule accented Latin letters plus two additional minuscule letters (ß and ÿ), and 2 mathematical operators.
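The correspondence with the upper range of ISO 8859-1 can be checked with a short sketch. It assumes Python's built-in `latin-1` codec as a stand-in for ISO 8859-1; the helper name `latin1_byte_to_char` is invented for this illustration.

```python
# Assumption: Python's "latin-1" codec is used here as a stand-in for ISO 8859-1.
BLOCK_START, BLOCK_END = 0x80, 0xFF


def latin1_byte_to_char(byte_value: int) -> str:
    """Decode a single byte with the 'latin-1' codec (hypothetical helper)."""
    return bytes([byte_value]).decode("latin-1")


# Decode every byte in the upper range and compare it with its own value.
block_chars = [latin1_byte_to_char(b) for b in range(BLOCK_START, BLOCK_END + 1)]
identity_holds = all(
    ord(ch) == b for b, ch in zip(range(BLOCK_START, BLOCK_END + 1), block_chars)
)

print(len(block_chars), identity_holds)  # 128 True
```

Each byte value in the range 80 to FF decodes to the code point with the same number, which is why the block spans exactly U+0080 to U+00FF.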

The C1 Controls and Latin-1 Supplement block has been included in its present form, with the same character repertoire, since version 1.0 of the Unicode Standard.[3] Its block name in Unicode 1.0 was simply Latin1.[4]

Character table

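Code | Result | Description | Acronym
C1 Controls
U+0080 |  | Padding Character | PAD
U+0081 |  | High Octet Preset | HOP
U+0082 |  | Break Permitted Here | BPH
U+0083 |  | No Break Here | NBH
U+0084 |  | Index | IND
U+0085 |  | Next Line | NEL
U+0086 |  | Start of Selected Area | SSA
U+0087 |  | End of Selected Area | ESA
U+0088 |  | Character (Horizontal) Tabulation Set | HTS
U+0089 |  | Character (Horizontal) Tabulation with Justification | HTJ
U+008A |  | Line (Vertical) Tabulation Set | LTS
U+008B |  | Partial Line Forward (Down) | PLD
U+008C |  | Partial Line Backward (Up) | PLU
U+008D |  | Reverse Line Feed (Index) | RI
U+008E |  | Single-Shift Two | SS2
U+008F |  | Single-Shift Three | SS3
U+0090 |  | Device Control String | DCS
U+0091 |  | Private Use One | PU1
U+0092 |  | Private Use Two | PU2
U+0093 |  | Set Transmit State | STS
U+0094 |  | Cancel Character | CCH
U+0095 |  | Message Waiting | MW
U+0096 |  | Start of Protected Area | SPA
U+0097 |  | End of Protected Area | EPA
U+0098 |  | Start of String | SOS
U+0099 |  | Single Graphic Character Introducer | SGCI
U+009A |  | Single Character Introducer | SCI
U+009B |  | Control Sequence Introducer | CSI
U+009C |  | String Terminator | ST
U+009D |  | Operating System Command | OSC
U+009E |  | Private Message | PM
U+009F |  | Application Program Command | APC
Latin-1 Punctuation and Symbols
U+00A0 |  | No-break space | NBSP
U+00A1 | ¡ | Inverted exclamation mark |
U+00A2 | ¢ | Cent sign |
U+00A3 | £ | Pound sign |
U+00A4 | ¤ | Currency sign |
U+00A5 | ¥ | Yen sign |
U+00A6 | ¦ | Broken bar |
U+00A7 | § | Section sign |
U+00A8 | ¨ | Diaeresis |
U+00A9 | © | Copyright sign |
U+00AA | ª | Feminine ordinal indicator |
U+00AB | « | Left-pointing double angle quotation mark |
U+00AC | ¬ | Not sign |
U+00AD |  | Soft hyphen | SHY
U+00AE | ® | Registered sign |
U+00AF | ¯ | Macron |
U+00B0 | ° | Degree sign |
U+00B1 | ± | Plus-minus sign |
U+00B2 | ² | Superscript two |
U+00B3 | ³ | Superscript three |
U+00B4 | ´ | Acute accent |
U+00B5 | µ | Micro sign |
U+00B6 | ¶ | Pilcrow sign |
U+00B7 | · | Middle dot |
U+00B8 | ¸ | Cedilla |
U+00B9 | ¹ | Superscript one |
U+00BA | º | Masculine ordinal indicator |
U+00BB | » | Right-pointing double angle quotation mark |
U+00BC | ¼ | Vulgar fraction one quarter |
U+00BD | ½ | Vulgar fraction one half |
U+00BE | ¾ | Vulgar fraction three quarters |
U+00BF | ¿ | Inverted question mark |
Letters
U+00C0 | À | Latin capital letter A with grave |
U+00C1 | Á | Latin capital letter A with acute |
U+00C2 | Â | Latin capital letter A with circumflex |
U+00C3 | Ã | Latin capital letter A with tilde |
U+00C4 | Ä | Latin capital letter A with diaeresis |
U+00C5 | Å | Latin capital letter A with ring above |
U+00C6 | Æ | Latin capital letter AE |
U+00C7 | Ç | Latin capital letter C with cedilla |
U+00C8 | È | Latin capital letter E with grave |
U+00C9 | É | Latin capital letter E with acute |
U+00CA | Ê | Latin capital letter E with circumflex |
U+00CB | Ë | Latin capital letter E with diaeresis |
U+00CC | Ì | Latin capital letter I with grave |
U+00CD | Í | Latin capital letter I with acute |
U+00CE | Î | Latin capital letter I with circumflex |
U+00CF | Ï | Latin capital letter I with diaeresis |
U+00D0 | Ð | Latin capital letter Eth |
U+00D1 | Ñ | Latin capital letter N with tilde |
U+00D2 | Ò | Latin capital letter O with grave |
U+00D3 | Ó | Latin capital letter O with acute |
U+00D4 | Ô | Latin capital letter O with circumflex |
U+00D5 | Õ | Latin capital letter O with tilde |
U+00D6 | Ö | Latin capital letter O with diaeresis |
Mathematical operator
U+00D7 | × | Multiplication sign |
Letters
U+00D8 | Ø | Latin capital letter O with stroke |
U+00D9 | Ù | Latin capital letter U with grave |
U+00DA | Ú | Latin capital letter U with acute |
U+00DB | Û | Latin capital letter U with circumflex |
U+00DC | Ü | Latin capital letter U with diaeresis |
U+00DD | Ý | Latin capital letter Y with acute |
U+00DE | Þ | Latin capital letter Thorn |
U+00DF | ß | Latin small letter sharp s |
U+00E0 | à | Latin small letter a with grave |
U+00E1 | á | Latin small letter a with acute |
U+00E2 | â | Latin small letter a with circumflex |
U+00E3 | ã | Latin small letter a with tilde |
U+00E4 | ä | Latin small letter a with diaeresis |
U+00E5 | å | Latin small letter a with ring above |
U+00E6 | æ | Latin small letter ae |
U+00E7 | ç | Latin small letter c with cedilla |
U+00E8 | è | Latin small letter e with grave |
U+00E9 | é | Latin small letter e with acute |
U+00EA | ê | Latin small letter e with circumflex |
U+00EB | ë | Latin small letter e with diaeresis |
U+00EC | ì | Latin small letter i with grave |
U+00ED | í | Latin small letter i with acute |
U+00EE | î | Latin small letter i with circumflex |
U+00EF | ï | Latin small letter i with diaeresis |
U+00F0 | ð | Latin small letter eth |
U+00F1 | ñ | Latin small letter n with tilde |
U+00F2 | ò | Latin small letter o with grave |
U+00F3 | ó | Latin small letter o with acute |
U+00F4 | ô | Latin small letter o with circumflex |
U+00F5 | õ | Latin small letter o with tilde |
U+00F6 | ö | Latin small letter o with diaeresis |
Mathematical operator
U+00F7 | ÷ | Division sign |
Letters
U+00F8 | ø | Latin small letter o with stroke |
U+00F9 | ù | Latin small letter u with grave |
U+00FA | ú | Latin small letter u with acute |
U+00FB | û | Latin small letter u with circumflex |
U+00FC | ü | Latin small letter u with diaeresis |
U+00FD | ý | Latin small letter y with acute |
U+00FE | þ | Latin small letter thorn |
U+00FF | ÿ | Latin small letter y with diaeresis |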

Subheadings

The C1 Controls and Latin-1 Supplement block has four subheadings within its character collection: C1 controls, Latin-1 Punctuation and Symbols, Letters, and Mathematical operator(s).[5]

C1 controls

The C1 controls subheading contains 32 supplementary control codes inherited from ISO/IEC 8859-1 and many other 8-bit character standards. The alias names for the C0 and C1 control codes are taken from .
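The non-graphic status of the C1 range can be inspected with a minimal sketch. It assumes Python's standard `unicodedata` module, whose data reflects whichever Unicode version ships with the interpreter; the variable names are illustrative only.

```python
import unicodedata

# The 32 code points U+0080 to U+009F covered by the C1 controls subheading.
c1_controls = [chr(cp) for cp in range(0x80, 0xA0)]

# Collect the general category of every C1 code point; "Cc" means "control".
categories = {unicodedata.category(ch) for ch in c1_controls}

# Control characters are not printable, so none of them should report True here.
any_printable = any(ch.isprintable() for ch in c1_controls)

print(len(c1_controls), categories, any_printable)  # 32 {'Cc'} False
```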

Latin-1 punctuation and symbols

The Latin-1 Punctuation and Symbols subheading contains 32 characters: common international punctuation, such as the inverted question and exclamation marks and the middle dot, and symbols such as currency signs, spacing diacritic marks, vulgar fractions, and superscript numbers.

Letters

The Letters subheading contains 30 pairs of majuscule and minuscule accented or novel Latin characters for western European languages, and two extra minuscule characters (ß and ÿ) not commonly used as the first letter of words.
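The pairing of majuscule and minuscule letters can be illustrated with a short sketch. It relies on an observation about the code chart rather than anything stated above: each capital letter in U+00C0 to U+00DE (excluding the multiplication sign at U+00D7) sits exactly 0x20 below its small counterpart. Python's built-in case mapping is used for the check, and all names in the snippet are hypothetical.

```python
# Capital letters occupy U+00C0 to U+00DE, except U+00D7 (multiplication sign).
uppercase = [chr(cp) for cp in range(0xC0, 0xDF) if cp != 0xD7]

# Assumption checked below: the small letter sits 0x20 above its capital.
pairs = [(upper, chr(ord(upper) + 0x20)) for upper in uppercase]
all_pairs_match = all(
    upper.lower() == lower and lower.upper() == upper for upper, lower in pairs
)

print(len(pairs), all_pairs_match)  # 30 True

# The two extra minuscule letters have no single-character capital in this block.
sharp_s_upper = "\u00df".upper()     # Python maps sharp s to the two letters "SS"
y_diaeresis_upper = "\u00ff".upper()  # maps to a code point outside the block
print(sharp_s_upper, hex(ord(y_diaeresis_upper)))
```

In Python's case mapping, ß uppercases to the two-letter string "SS" and ÿ uppercases to a character beyond U+00FF, which is consistent with the block holding 30 pairs plus two unpaired small letters.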

Mathematical operator

The Mathematical operator subheading contains two characters: the multiplication sign (×, U+00D7) and the division sign (÷, U+00F7).

Number of symbols, letters and control codes

The table below shows the number of letters, symbols and control codes in each of the subheadings in the C1 Controls and Latin-1 Supplement block.

Type of subheading | Number of symbols | Range of characters
C1 controls | 32 control codes | U+0080 to U+009F
Latin-1 punctuation and symbols | 32 punctuation and symbols | U+00A0 to U+00BF
Letters | 30 pairs of majuscule and minuscule accented Latin characters, plus ß and ÿ | U+00C0 to U+00D6, U+00D8 to U+00F6 and U+00F8 to U+00FF
Mathematical operators | The × and ÷ symbols | U+00D7 and U+00F7
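The counts in the table above can be cross-checked against the block size with simple arithmetic. The following sketch takes the ranges from the table; the dictionary and function names are invented for the illustration.

```python
# Inclusive code point ranges for each subheading, as listed in the table.
SUBHEADING_RANGES = {
    "C1 controls": [(0x80, 0x9F)],
    "Latin-1 punctuation and symbols": [(0xA0, 0xBF)],
    "Letters": [(0xC0, 0xD6), (0xD8, 0xF6), (0xF8, 0xFF)],
    "Mathematical operators": [(0xD7, 0xD7), (0xF7, 0xF7)],
}


def count_code_points(ranges):
    """Sum the sizes of inclusive (start, end) ranges."""
    return sum(end - start + 1 for start, end in ranges)


counts = {name: count_code_points(r) for name, r in SUBHEADING_RANGES.items()}
total = sum(counts.values())

for name, n in counts.items():
    print(f"{name}: {n}")
print("Total:", total)  # Total: 128
```

The Letters ranges add up to 62 code points, that is, 30 pairs plus ß and ÿ, and the four subheadings together account for all 128 code points of the block.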

Emoji

The Latin-1 Supplement block contains two emoji: U+00A9 and U+00AE.[6] [7]

The block has four standardized variants defined to specify emoji-style (U+FE0F VS16) or text presentation (U+FE0E VS15) for the two emoji, both of which default to a text presentation.[8]

Emoji variation sequences
U+ | 00A9 | 00AE
base code point | © | ®
base+VS15 (text) | ©︎ | ®︎
base+VS16 (emoji) | ©️ | ®️
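The four variation sequences can be built by appending a variation selector to each base code point. This is a minimal sketch; the constant and function names are hypothetical, and whether a sequence actually renders differently depends on the font and platform.

```python
VS15 = "\ufe0e"  # variation selector-15: requests text presentation
VS16 = "\ufe0f"  # variation selector-16: requests emoji presentation

# The two emoji base code points in the block: U+00A9 and U+00AE.
EMOJI_BASES = ["\u00a9", "\u00ae"]


def variation_sequences(base: str) -> dict:
    """Return the text-style and emoji-style sequences for one base character."""
    return {"text": base + VS15, "emoji": base + VS16}


sequences = {f"U+{ord(base):04X}": variation_sequences(base) for base in EMOJI_BASES}

for code, variants in sequences.items():
    for style, seq in variants.items():
        # Show each sequence as a list of code points rather than as glyphs.
        print(code, style, " ".join(f"U+{ord(ch):04X}" for ch in seq))
```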

History

The following Unicode-related documents record the purpose and process of defining specific characters in the Latin-1 Supplement block:

Notes and References

  1. "Unicode character database". The Unicode Standard. 2023-07-26.
  2. "Enumerated Versions of The Unicode Standard". The Unicode Standard. 2023-07-26.
  3. The Unicode Standard Version 1.0, Volume 1. 1990. 1991. Addison-Wesley Publishing Company, Inc. ISBN 0-201-56788-1.
  4. "3.8: Block-by-Block Charts". The Unicode Standard, version 1.0. Unicode Consortium.
  5. "Unicode 6.2 code charts". The Unicode Standard. 1 April 2013.
  6. "UTR #51: Unicode Emoji". Unicode Consortium. 2023-09-05.
  7. "UCD: Emoji Data for UTR #51". Unicode Consortium. 2023-02-01.
  8. "UTS #51 Emoji Variation Sequences". The Unicode Consortium.