Regional indicator symbol explained

The regional indicator symbols are a set of 26 alphabetic Unicode characters (A–Z) intended to be used to encode ISO 3166-1 alpha-2 two-letter country codes in a way that allows optional special treatment.

These were defined as part of the Unicode 6.0 support for emoji, as an alternative to encoding separate characters for each country flag. Although they can be displayed as Roman letters, implementations may choose to display them in other ways, such as by using national flags.[1] [2] The Unicode FAQ indicates that this mechanism should be used and that symbols for national flags will not be directly encoded.[3]

They are encoded in the range U+1F1E6 to U+1F1FF within the Enclosed Alphanumeric Supplement block in the Supplementary Multilingual Plane.[4]
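
Since the 26 symbols are contiguous and alphabetical, mapping a Latin letter to its regional indicator symbol is a fixed offset. The following Python sketch illustrates this. It assumes the symbols run from U+1F1E6 (letter A) through U+1F1FF (letter Z); the name `to_regional_indicators` is hypothetical and not part of any library.

```python
# Assumption: REGIONAL INDICATOR SYMBOL LETTER A is U+1F1E6, and the
# 26 symbols are contiguous through U+1F1FF (letter Z).
RI_BASE = 0x1F1E6

def to_regional_indicators(code: str) -> str:
    """Map each ASCII letter A-Z in `code` to its regional indicator symbol."""
    code = code.upper()
    if not (code.isascii() and code.isalpha()):
        raise ValueError(f"expected ASCII letters only, got {code!r}")
    return "".join(chr(RI_BASE + ord(ch) - ord("A")) for ch in code)

jp = to_regional_indicators("JP")
print([f"U+{ord(ch):04X}" for ch in jp])  # ['U+1F1EF', 'U+1F1F5']
```

Whether the resulting pair is drawn as a flag or as two boxed letters depends on the font and platform, not on the code above.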

Emoji flag sequences

A pair of regional indicator symbols is referred to as an emoji flag sequence (although it represents a specific region, not a specific flag for that region).
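
Because each symbol corresponds to one letter, a pair can be read back as a two-letter region code. The sketch below is a minimal illustration, assuming the symbols occupy U+1F1E6 to U+1F1FF; `decode_flag_sequences` is a hypothetical helper, and it does not check whether a decoded code is a valid region.

```python
from typing import List

RI_FIRST, RI_LAST = 0x1F1E6, 0x1F1FF  # assumed range of the 26 symbols

def is_regional_indicator(ch: str) -> bool:
    return RI_FIRST <= ord(ch) <= RI_LAST

def decode_flag_sequences(text: str) -> List[str]:
    """Return the two-letter code for each pair of adjacent regional indicators.

    Pairs are formed left to right; an unpaired symbol is skipped.
    """
    codes = []
    i = 0
    while i < len(text):
        if (is_regional_indicator(text[i])
                and i + 1 < len(text)
                and is_regional_indicator(text[i + 1])):
            pair = text[i:i + 2]
            codes.append("".join(chr(ord(c) - RI_FIRST + ord("A")) for c in pair))
            i += 2
        else:
            i += 1
    return codes

print(decode_flag_sequences("\U0001F1EF\U0001F1F5 and \U0001F1E9\U0001F1EA"))
# ['JP', 'DE']
```

Pairing left to right matters when several flags are adjacent with no separator, since a run of four symbols is two regions rather than three overlapping ones.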

Out of the 676 possible pairs of regional indicator symbols (26 × 26), only 270 are considered valid Unicode region codes. These are a subset of the region sequences in the Common Locale Data Repository (CLDR):[5] [6]

Region codes,[7] as of Unicode 16.0:
A: AC, AD, AE, AF, AG, AI, AL, AM, AO, AQ, AR, AS, AT, AU, AW, AX, AZ
B: BA, BB, BD, BE, BF, BG, BH, BI, BJ, BL, BM, BN, BO, BQ, BR, BS, BT, BV, BW, BY, BZ
C: CA, CC, CD, CF, CG, CH, CI, CK, CL, CM, CN, CO, CP, CQ, CR, CU, CV, CW, CX, CY, CZ
D: DE, DG, DJ, DK, DM, DO, DZ
E: EA, EC, EE, EG, EH, ER, ES, ET, EU
F: FI, FJ, FK, FM, FO, FR
G: GA, GB, GD, GE, GF, GG, GH, GI, GL, GM, GN, GP, GQ, GR, GS, GT, GU, GW, GY
H: HK (Hong Kong SAR China), HM, HN, HR, HT, HU
I: IC, ID, IE, IL, IM, IN, IO, IQ, IR, IS, IT
J: JE, JM, JO, JP
K: KE, KG, KH, KI, KM, KN, KP, KR, KW, KY, KZ
L: LA, LB, LC, LI, LK, LR, LS, LT, LU, LV, LY
M: MA, MC, MD, ME, MF, MG, MH, MK, ML, MM, MN, MO, MP, MQ, MR, MS, MT, MU, MV, MW, MX, MY, MZ
N: NA, NC, NE, NF, NG, NI, NL, NO, NP, NR, NU, NZ
O: OM
P: PA, PE, PF, PG, PH, PK, PL, PM, PN, PR, PS, PT, PW, PY
Q: QA
R: RE, RO, RS, RU, RW
S: SA, SB, SC, SD, SE, SG, SH, SI, SJ, SK, SL, SM, SN, SO, SR, SS, ST, SV, SX, SY, SZ
T: TA, TC, TD, TF, TG, TH, TJ, TK, TL, TM, TN, TO, TR, TT, TV, TW, TZ
U: UA, UG, UM, UN, US, UY, UZ
V: VA, VC, VE, VG, VI, VN, VU
W: WF, WS
X: XK
Y: YE, YT
Z: ZA, ZM, ZW
Deprecated region codes and their replacements:[8] [9]
AN → CW, SX, BQ
BU (Burma) → MM
CS → RS, ME
DD → DE
FX → FR
NT → SA, IQ
QU (European Union) → EU
SU → RU, AM, AZ, BY, EE, GE, KZ, KG, LV, LT, MD, TJ, TM, UA, UZ
TP (East Timor) → TL
YD → YE
YU → RS, ME
ZR → CD
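
A lookup of this kind is naturally a mapping from one deprecated code to one or more current codes. The Python sketch below hard-codes a few rows from the deprecated-code table; the dictionary and function names are illustrative only, and a real implementation would presumably load the full list from the CLDR supplemental metadata rather than from a literal.

```python
# A few rows copied from the deprecated-code table; not the complete list.
DEPRECATED_REGIONS = {
    "AN": ["CW", "SX", "BQ"],
    "BU": ["MM"],
    "CS": ["RS", "ME"],
    "DD": ["DE"],
    "TP": ["TL"],
    "ZR": ["CD"],
}

def replacements(code: str) -> list:
    """Return replacement codes for a deprecated code, or [code] if it is not listed."""
    code = code.upper()
    # Copy the list so callers cannot mutate the table by accident.
    return list(DEPRECATED_REGIONS.get(code, [code]))

print(replacements("an"))  # ['CW', 'SX', 'BQ']
print(replacements("JP"))  # ['JP']
```

A deprecated code can map to several regions, so the function returns a list even in the one-to-one case.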

A separate mechanism (emoji tag sequences) is used for regional flags, such as England, Scotland, Wales, Texas or California.[10] It uses U+1F3F4 (WAVING BLACK FLAG) and formatting tag characters instead of regional indicator symbols. The tag characters spell an ISO 3166-2 region code with the hyphen removed and the letters lowercased, e.g. GB-ENG → gbeng, and the sequence is terminated with U+E007F (CANCEL TAG). The flag of England is therefore represented by the sequence U+1F3F4, U+E0067, U+E0062, U+E0065, U+E006E, U+E0067, U+E007F. In the tenth revision the Unicode Consortium was considering U+1F3F3 (WAVING WHITE FLAG) as the base character instead,[11] but from the eleventh revision onwards it is the black flag.[12] Some vendors include custom zero-width joiner sequences that only display on their own platform, such as WhatsApp's Refugee Nation Flag.[13]
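
The construction of a tag sequence can be sketched in Python as follows. This assumes the base character is U+1F3F4, that each tag character is the corresponding ASCII code point plus 0xE0000, and that the terminator is U+E007F, all of which agree with the England sequence quoted above. The name `tag_sequence` is hypothetical.

```python
BLACK_FLAG = 0x1F3F4   # base character of the sequence
TAG_OFFSET = 0xE0000   # assumed: tag character = ASCII code point + this offset
CANCEL_TAG = 0xE007F   # terminator

def tag_sequence(subdivision: str) -> str:
    """Build an emoji tag sequence from an ISO 3166-2 code such as 'GB-ENG'."""
    tag_spec = subdivision.replace("-", "").lower()  # 'GB-ENG' -> 'gbeng'
    if not (tag_spec.isascii() and tag_spec.isalnum()):
        raise ValueError(f"unexpected subdivision code: {subdivision!r}")
    tags = "".join(chr(TAG_OFFSET + ord(ch)) for ch in tag_spec)
    return chr(BLACK_FLAG) + tags + chr(CANCEL_TAG)

england = tag_sequence("GB-ENG")
print(", ".join(f"U+{ord(ch):04X}" for ch in england))
# U+1F3F4, U+E0067, U+E0062, U+E0065, U+E006E, U+E0067, U+E007F
```

The function builds a sequence for any well-formed input, but only sequences that a platform actually supports will render as a flag; others typically fall back to a plain black flag.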

Unicode block

See main article: Enclosed Alphanumeric Supplement (Unicode block).

Background

In 2007 a draft proposal was presented to the Unicode Technical Committee to encode emoji symbols, specifically those in widespread use on mobile phones by Japanese telecommunications companies DoCoMo, KDDI, and SoftBank.[14] The proposed symbols included ten national flags:[15] China, Germany, Spain, France, the UK, Italy, Japan, South Korea, Russia, and the United States. Encoding these flags but not other countries' flags was considered by some to be prejudicial.[16] One rejected solution was to encode the ten flags but call them "EMOJI COMPATIBILITY SYMBOL-n" and represent them visually in the Standard as "EC n" instead of showing the flags they represent.[17] Another rejected solution would have allocated 676 code points (26 × 26), one for each possible two-letter combination of A–Z. They would represent political entities based on ISO 3166, such as "JP" for Japan, or Internet ccTLDs (country code top-level domains), such as "EU" for the European Union.[18]

The accepted solution was to add 26 letter characters for representing regional indicators; used in pairs, they would represent the ten national flags and allow for possible future extensions.[2] The Unicode Standard[19] ties these pairs specifically to the ten national flags:[20] CN, DE, ES, FR, GB, IT, JP, KR, RU, and US.

Notes and References

  1. Web site: What's new in Unicode 6.0. Andrew West. Babelstone. 2014-08-18. 2014-04-06. https://web.archive.org/web/20140406061417/http://babelstone.blogspot.com/2009/11/whats-new-in-unicode-60.html. dead.
  2. Web site: N3727: Proposal to encode Regional Indicator Symbols in the UCS. Michael Everson and Ken Whistler. Working Group Document, ISO/IEC JTC1/SC2/WG2 and UTC. 2014-08-18.
  3. Web site: Unicode FAQ: Emoji and Dingbats . The Unicode Consortium . 2009-10-28 . 2014-08-18.
  4. Web site: Enclosed Alphanumeric Supplement, Range 1F100 - 1F1FF, The Unicode Standard, Version 6.0. Unicode Consortium. 2010. 2014-08-18.
  5. Web site: UTR #35: Unicode Locale Data Markup Language (LDML), Validity Data . Unicode Consortium .
  6. Web site: 2020-10-28 . CLDR v38 Region Validity Data . Unicode Common Locale Data Repository (CLDR) . https://web.archive.org/web/20190502054108if_/http://unicode.org/repos/cldr/tags/latest/common/validity/region.xml
  7. Web site: UCD: Emoji Sequence Data for UTR #51 . Unicode Consortium . 2023-06-05.
  8. Web site: UTR #35: Unicode Locale Data Markup Language (LDML), Supplemental Metadata. Unicode Consortium.
  9. Web site: 2020-10-28 . CLDR v38 Supplemental Metadata . Unicode Common Locale Data Repository (CLDR) . https://web.archive.org/web/20190328031414if_/http://unicode.org/repos/cldr/tags/latest/common/supplemental/supplementalMetadata.xml
  10. Web site: UTR #51: Unicode Emoji. 2017-05-18 . Unicode Consortium.
  11. Web site: UTR #51: Unicode Emoji.
  12. Web site: UTS #51: Unicode Emoji.
  13. Web site: 2020. WhatsApp Portal. live. https://web.archive.org/web/20210622150842/https://c.r74n.com/whatsapp/. 22 June 2021. 23 June 2021. Copy Paste Dump. R74n.
  14. Web site: Kat . Momoi . Mark . Davis . Markus . Scherer . L2/07-257: Working Draft Proposal for Encoding Emoji Symbols . 2007-08-03 . 2014-08-18.
  15. Web site: Unicode Mapping for Emoji with Reference to Japanese Carriers, AU/KDDI, DoCoMo, and Softbank . ZIP archive file format . 2014-08-18.
  16. Web site: L2/09-114 N3607: Towards an encoding of symbol characters used as emoji . 2009-04-06 . 2014-08-18.
  17. Web site: INCITS/L2/09-304: Comments accompanying the U.S. negative vote on PDAM 8 to ISO/IEC 10646:2003 (SC2 N4078) . 2009-08-15 . 2014-08-18.
  18. Web site: Karl . Pentzlin . L2/08-305: Some suggestions about the encoding of national flags as requested by the Emoji proposal . 2008-08-09 . 2014-08-18.
  19. Book: The Unicode Standard, Version 6.2, Chapter 15: Symbols . Unicode, Inc . September 2012 . 534 . 978-1-936213-07-8.
  20. Web site: Emoji Sources . Unicode, Inc. . 2013-12-17 . plain text . 2014-08-18.