Geocode Explained

Geocode should not be confused with Address geocoding.

A geocode is a code that represents a geographic entity (location or object). It is a unique identifier of the entity, to distinguish it from others in a finite set of geographic entities. In general the geocode is a human-readable and short identifier.

Typical geocodes and the entities they represent:

The ISO 19112:2019 standard (section 3.1.2) adopted the term "geographic identifier" instead of geocode, to encompass long labels: "spatial reference in the form of a label or code that identifies a location". For example, for ISO, the country name "People's Republic of China" is a label.

Geocodes are mainly used (in general as an atomic data type) for labelling, data integrity, geotagging and spatial indexing.

In theoretical computer science a geocode system is a locality-preserving hashing function.
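The locality-preserving behaviour can be illustrated with Geohash, one of the grid-based systems discussed later in this article. The following is a minimal sketch of the usual Geohash bisection encoding, not a reference implementation; the function name `geohash_encode` is chosen here for illustration only. Nearby points usually receive codes with a common prefix, although two close points on opposite sides of a cell border can still differ early in the code.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # Geohash digit alphabet


def geohash_encode(lat: float, lon: float, length: int = 6) -> str:
    """Encode a latitude/longitude pair as a Geohash of `length` digits."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    digits = []
    value, nbits = 0, 0
    use_lon = True  # bits alternate between axes, starting with longitude
    while len(digits) < length:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            bit = 1 if lon >= mid else 0
            if bit:
                lon_lo = mid
            else:
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            bit = 1 if lat >= mid else 0
            if bit:
                lat_lo = mid
            else:
                lat_hi = mid
        value = (value << 1) | bit
        nbits += 1
        use_lon = not use_lon
        if nbits == 5:  # every 5 bits form one base32 digit
            digits.append(BASE32[value])
            value, nbits = 0, 0
    return "".join(digits)


# Two nearby points fall in the same 5-digit cell.
print(geohash_encode(42.6, -5.6, 5))
print(geohash_encode(42.61, -5.61, 5))
```

A shorter code is always a prefix of a longer code for the same point, which is what makes prefix comparison usable as a coarse proximity test.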

Classification

There are some common aspects of many geocodes (or geocode systems) that can be used as classification criteria:

Geocode system

The set of all geocodes used as unique identifiers of the cells of a full coverage of the geographic surface (or of any well-defined area, like a country or the oceans) is a geocode system (also named geocode scheme). The syntax and semantics of the geocodes are also components of the system definition:

Many syntax and semantic characteristics are also summarized by classification.

Encode and decode

Any geocode can be translated from a formal (and expanded) expression of the geographical entity or, vice versa, from the geocode back to the entity. The first process is named encode, the second decode. The actors and processes involved, as defined by OGC,[3] are:

geocoder: A software agent that transforms the description of a geographic entity (e.g. location name or latitude/longitude coordinates) into normalized data and encodes it as a geocode.
geocoder service: A geocoder implemented as a web service (or similar service interface) that accepts a set of geographic entity descriptors as input. The request is "sent" to the geocoder service, which processes it and returns the resulting geocodes. More general services can also return the geographic features (e.g. a GeoJSON object) represented by the geocodes.
geocoding: The assignment of geocodes or coordinates to geographically referenced data provided in a textual format. Examples are the two-letter country codes and coordinates computed from addresses.
Note: when a physical addressing scheme (street name and house number) is expressed in a standardized and simplified way, it can be conceived as a geocode. So the term geocoding (used for addresses) is sometimes generalized to geocodes.

In spatial indexing applications the geocode can also be translated between human-readable (e.g. hexadecimal) and internal (e.g. binary 64-bit unsigned integer) representations.
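As a minimal sketch of this translation, assuming a system whose cell identifier is a plain 64-bit unsigned integer shown to users as 16 hexadecimal digits (the fixed width and the function names are assumptions made for this example, not part of any particular geocode system):

```python
def to_hex(cell_id: int) -> str:
    """Render a 64-bit unsigned cell identifier as 16 hexadecimal digits."""
    if not 0 <= cell_id < 2**64:
        raise ValueError("cell_id does not fit in an unsigned 64-bit integer")
    return format(cell_id, "016x")


def from_hex(text: str) -> int:
    """Parse the hexadecimal form back into the internal integer."""
    value = int(text, 16)
    if value >= 2**64:
        raise ValueError("value does not fit in an unsigned 64-bit integer")
    return value


print(to_hex(255))                   # human-readable form
print(from_hex("00000000000000ff"))  # internal form
```

Keeping the width fixed makes the textual form sort in the same order as the integers, which is convenient when the codes are used as index keys.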

Systems of standard names

See main article: Toponym resolution.

Geocodes like country codes, city codes, etc. come from a table of official names and the corresponding official codes and geometries (typically polygons of administrative areas). "Official" is meant in the sense of control and consensus: typically the table is controlled by a standards organization or governmental authority. So the most general case is a table of standard names and the corresponding standard codes (and their official geometries).

Strictly speaking, the "name" related to a geocode is a toponym, and the table (e.g. toponym to standard code) is the resource for toponym resolution: the process, usually carried out by a software agent, of relating a toponym to "an unambiguous spatial footprint of the same place".[4] Any standardized system of toponym resolution that has codes or encoded abbreviations can be used as a geocode system. The "resolver" agent in this context is also a geocoder.

Sometimes names are translated into numeric codes, to be compact or machine-readable. Since the numbers in this case are name identifiers, they can be considered "numeric names", so this set of codes is a kind of "system of standard names".

Hierarchical naming

In the geocode context, space partitioning is the process of dividing a geographical space into two or more disjoint subsets, resulting in a mosaic of subdivisions. Each subdivision can be partitioned again, recursively, resulting in a hierarchical mosaic.

When the subdivisions' names are expressed as codes, and the code syntax can be decomposed into parent-child relations through a well-defined syntactic scheme, the geocode set forms a hierarchical system. A geocode fragment (associated with a subdivision name) can be an abbreviation or a numeric or alphanumeric code.

A popular example is the ISO 3166-2 geocode system, representing country names and the names of their administrative subdivisions, separated by a hyphen. For example, DE is Germany, a simple geocode, and its subdivisions are DE-BW for Baden-Württemberg, DE-BY for Bayern, ..., DE-NW for Nordrhein-Westfalen, etc. The scope is only the first level of the hierarchy. For more levels there are other conventions, like HASC codes.[5] HASC codes are alphabetic and their fragments have constant length (2 letters). Examples:

DE.NW - North Rhine-Westphalia. A two-level hierarchical geocode.

DE.NW.CE - Kreis Coesfeld. A 3-level hierarchical geocode.

Two geocodes of a hierarchical geocode system with the same prefix represent different parts of the same location. For instance, DE.NW.CE and DE.NW.BN represent geographically interior parts of DE.NW, the common prefix.
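The prefix relation can be checked mechanically. The sketch below assumes only the dot-separated syntax shown in the HASC examples above; the helper names are hypothetical. Fragments are compared one by one, rather than as raw string prefixes, so that the test stays correct for code syntaxes whose fragments do not all have the same length.

```python
def hasc_ancestors(code: str) -> list[str]:
    """Return every level of a dot-separated code, from broadest to narrowest."""
    parts = code.split(".")
    return [".".join(parts[:i]) for i in range(1, len(parts) + 1)]


def hasc_contains(parent: str, child: str) -> bool:
    """True if `child` equals `parent` or lies below it in the hierarchy."""
    p, c = parent.split("."), child.split(".")
    return len(p) <= len(c) and c[: len(p)] == p


print(hasc_ancestors("DE.NW.CE"))
print(hasc_contains("DE.NW", "DE.NW.BN"))
print(hasc_contains("DE.NW", "DE.BY"))
```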

Changing the subdivision criteria yields other hierarchical systems. For example, for hydrological criteria there is the US hydrologic unit code (HUC), a numeric representation of basin names in a hierarchical syntax schema. HUC 17 is the identifier of "Pacific Northwest Columbia basin"; HUC 1706 identifies "Lower Snake basin", a spatial subset of HUC 17 and a superset of 17060102 ("Imnaha River").

Systems of regular grids

Figure: Each cell of a regular grid is labeled by a geocode. The non-global grids were the most used before the 2000s.
Figure: This hierarchical system of local grids, used since the 1930s as the British National Grid, generates hierarchical geocodes. Each cell recursively subdivides its area into a new 10x10 grid.

Inspired by the classic alphanumeric grids, a discrete global grid (DGG) is a regular mosaic which covers the entire Earth's surface (the globe). The regularity of the mosaic is defined by the use of cells of the same shape throughout the grid, or of "nearly the same shape and nearly the same area" in a region of interest, like a country.

All cells of the grid have an identifier (DGG's cell ID), and the center of the cell can be used as reference for cell ID conversion into geographical point. When a compact human-readable expression of the cell ID is standardized, it becomes a geocode.

Geocodes of different geocode systems can represent the same position on the globe, with the same shape and precision, but differ in string length, digit alphabet, separators, etc. Non-global grids also differ in scope, and in general are geometrically optimized (avoiding overlaps, gaps or loss of uniformity) for local use.

Hierarchical grids

Each cell of a grid can be transformed into a new local grid, in a recursive process. In the illustrated example, the cell TQ 2980 is a sub-cell of TQ 28 (the easting digits 29 and the northing digits 80 are each shortened by one digit), which is a sub-cell of TQ. A system of geographic regular grid references is the base of a hierarchical geocode system.

Two geocodes of a hierarchical geocode grid system can use the prefix rule: geocodes with the same prefix represent different parts of the same broader location. Using the side illustration again: TQ 28 and TQ 61 represent geographically interior parts of TQ, the common prefix.
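For British National Grid references the parent cell is not a plain string prefix of the digits: the digits after the two letters are half easting and half northing, so the parent is found by dropping the last digit of each half. The sketch below assumes references written as two letters followed by an even number of digits; the function name is hypothetical.

```python
from typing import Optional


def bng_parent(ref: str) -> Optional[str]:
    """Return the enclosing cell of a British National Grid reference."""
    compact = ref.replace(" ", "")
    letters, digits = compact[:2], compact[2:]
    if not digits:
        return None  # a two-letter square is the top level in this sketch
    if len(digits) % 2:
        raise ValueError("expected an even number of digits")
    half = len(digits) // 2
    easting, northing = digits[:half], digits[half:]
    parent_digits = easting[:-1] + northing[:-1]  # drop one digit per axis
    return f"{letters} {parent_digits}" if parent_digits else letters


print(bng_parent("TQ 2980"))
print(bng_parent("TQ 28"))
```

The two-letter part, by contrast, is a true common prefix of every reference inside that 100 km square.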

A hierarchical geocode can be split into keys. The Geohash 6vd23gq is the key q of the cell 6vd23g, which is a cell of 6vd23 (key g), and so on, with per-digit keys. The OLC 58PJ642P is the key 2P of the cell 58PJ64, which is a cell of 58PJ (key 64), and so on, with two-digit keys. In OLC the + separator follows the eighth digit and the next pair continues the two-digit schema: 58PJ642P+48 is the key 48 of the cell 58PJ642P. Digits beyond the tenth follow a second schema with one-digit keys, so OLC uses two key schemas. Some geocode systems (e.g. S2 geometry) also use an initial prefix with a non-hierarchical key schema.
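Splitting a code into fixed-width keys, and listing the chain of enclosing cells, can be sketched as follows. The helper names are hypothetical, and only the fixed-width part of a code is handled (no separators and no second key schema).

```python
def split_keys(code: str, width: int) -> list[str]:
    """Split a hierarchical geocode into keys of `width` digits each."""
    if len(code) % width:
        raise ValueError("code length is not a multiple of the key width")
    return [code[i : i + width] for i in range(0, len(code), width)]


def cell_path(code: str, width: int) -> list[str]:
    """List the enclosing cells of `code`, from broadest to the code itself."""
    keys = split_keys(code, width)
    return ["".join(keys[:i]) for i in range(1, len(keys) + 1)]


print(split_keys("6vd23gq", 1))   # Geohash: per-digit keys
print(split_keys("58PJ642P", 2))  # OLC: two-digit keys
print(cell_path("58PJ642P", 2))
```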

In general, as an optional technical and non-compact representation, geocode systems based on hierarchical grids also offer the possibility of expressing their cell identifiers with a finer-grained schema, through a longer path of keys. For example, the Geohash 6vd2, which is a base32 code, can be expanded to the base4 code 0312312002, which is also a schema with per-digit keys. Geometrically, each Geohash cell is a rectangle that is recursively subdivided into 32 new rectangles, so base4, which subdivides into 4, is the encoding-expansion limit.[6]
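The expansion from base32 to base4 is a regrouping of the same bits: each base32 digit contributes 5 bits, and each pair of bits becomes one base4 digit. A minimal sketch, with a hypothetical function name, that accepts only codes with an even number of bits:

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # Geohash digit alphabet


def geohash_to_base4(code: str) -> str:
    """Re-express a base32 Geohash as base4 digits (2 bits per digit)."""
    bits = "".join(format(BASE32.index(ch), "05b") for ch in code)
    if len(bits) % 2:
        # An odd number of base32 digits leaves one unpaired bit.
        raise ValueError("code has an odd number of bits; no exact base4 form")
    return "".join(str(int(bits[i : i + 2], 2)) for i in range(0, len(bits), 2))


print(geohash_to_base4("6vd2"))
```

Each base4 digit covers one longitude bit and one latitude bit, so every key in the expanded path quarters its parent cell.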

The uniformity of shape and area of the cells in a grid can be important for other uses, like spatial statistics. There are standard ways to build a grid covering the entire globe with cells of equal area, regular shape and other properties: a Discrete Global Grid System (DGGS) is a series of discrete global grids satisfying all the standardized requirements defined in 2017 by the OGC.[7] When the human-readable codes obtained from the cell identifiers of a DGGS are also standardized, the result can be classified as a DGGS-based geocode system.

Name-and-grid systems

There are also mixed systems, using a syntactical partition, where for example the first part (code prefix) is a name-code and the other part (code suffix) is a grid-code. Example:

The Mapcode of the entrance to the elevator of the Eiffel Tower in Paris is FR-4J.Q2, where FR is the name-code[8] and 4J.Q2 is the grid-code. Semantically, France is the context used to obtain its local grid.

For mnemonic, coherent semantics in fine-grained geocode applications, mixed solutions are the most suitable.

Shortening grid-based codes by context

Any geocode system based on a regular grid is, in general, also a shorter way to express a latitude/longitude coordinate. But a geocode with more than 6 characters is difficult to remember. On the other hand, a geocode based on a standard name (an abbreviation or the complete name) is easier to remember.

This suggests that a "mixed code" can solve the problem, reducing the number of characters when a name can be used as the "context" for the grid-based geocode. For example, a book author might state that "all geocodes here are contextualized by the chapter's city". In the chapter about Paris, where all places have a Geohash with prefix u09, that prefix can be removed. For instance, Geohash u09tut can be reduced to tut or, with an explicit code for the context, to "FR-Paris tut". This is only possible when the context resolution (e.g. translation from "FR-Paris" to the prefix u09) is well known.
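The shortening and its inverse can be sketched with a small lookup table. The table below contains only the example from the text, and both the table and the function names are illustrative assumptions; a real system would need an agreed, published context table.

```python
# Context name -> Geohash prefix. Illustrative entry taken from the example above.
CONTEXT_PREFIX = {"FR-Paris": "u09"}


def shorten(geohash: str, context: str) -> str:
    """Replace the known prefix of `geohash` by its context name."""
    prefix = CONTEXT_PREFIX[context]
    if not geohash.startswith(prefix):
        raise ValueError(f"{geohash!r} is not inside context {context!r}")
    return f"{context} {geohash[len(prefix):]}"


def expand(mixed: str) -> str:
    """Recover the full Geohash from a 'context suffix' mixed reference."""
    context, suffix = mixed.rsplit(" ", 1)
    return CONTEXT_PREFIX[context] + suffix


print(shorten("u09tut", "FR-Paris"))
print(expand("FR-Paris tut"))
```

Because `expand` restores the original code exactly, the mixed reference loses no information as long as both sides share the same context table.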

In fact, a methodology exists for hierarchical grid-based geocodes of fixed size, where the code prefix describes a broader area that can be associated with a name. So it is possible to shorten a code by replacing the prefix with the associated context. The most usual context is an official name. Examples:

Standards mixed | Grid-based | Mixed reference
Grid OLC and country's official names | 796RWF8Q+WF | Cape Verde, Praia, WF8Q+WF
Grid Geohash and ISO 3166-2 hierarchical abbreviations | e6xkbgxed | CV-PR, bgxed

The examples in the Mixed reference column are significantly easier to remember than those in the Grid-based column. The methods vary; for example, OLC can be shortened by eliminating its first four digits and attaching a suitable, sufficiently close locality.[9]

When the mixed reference is also short (9 characters in the second example) and there is a syntax convention to express it (suppose CV-PR~bgxed), this convention generates a new name-and-grid geocode system. This is not the case for the first example because, strictly speaking, "Cape Verde, Praia" is not a code.

To be both a name-and-grid system and a mixed reference convention, the system must be reversible. A pure name-and-grid system like Mapcode, which offers no way to transform its codes into a global code, is not a mixed reference, because there is no algorithm to transform the mixed geocode into a grid-based geocode.

Cataloged examples

In use, general scope

Geocodes in use and with general scope:

Geocode | Inception | Coverage | Formation | Ownership | Rep. entity | Context and description
ISO 3166 (alpha-2 and alpha-3) | 1974 | globe/only nations | Name abbreviation | free | polygon | Administrative divisions. Country codes and codes of their subdivisions. Two letters (alpha-2) or three letters (alpha-3).
 | 1970 | globe/only nations | Serial number | free | polygon | Administrative divisions. Country codes expressed by serial numbers.
 | ~1970 | globe/only nations | Serial number | free | polygon | Administrative divisions. Region codes, area codes, continents, countries (re-using ISO 3166-1 numeric codes).
Geohash | 2008 | globe | encode(latLon,precision) | free | grid cell | Hash notation for locations. See also Geohash and its variants, like OpenStreetMap's short-link.[10]
Open Location Code (OLC) | 2014 | globe | encode(latLon,precision) | free | grid cell | See also PlusCodes.[11]
What3words | 2013 | globe | encode(latLon) | patented | grid cell | Patent-restricted system that converts 3x3 meter squares into 3 words.[12] It is in use at Mongol Post.[13]
Mapcode | 2001 | globe | encode(latLon) | patented | point | A mapcode is a code consisting of two groups of letters and digits, separated by a dot.
Geopeg | 2020 | globe/only nations | encode(latLon) | open standard | grid cell | Geopeg is a word-based GPS address, using simple words like London.RedFish. It is a combination of a city and two simple words. It is an open standard geocoding of Earth, currently in development.

In use, alternative address

Geocodes can be used in place of official street names and/or house numbers, particularly when a given location has not been assigned an address by authorities. They can also be used as an "alternative address" if they can be converted to a Geo URI. Even if the geocode is not the official designation for a location, it can be used as a "local standard" to allow homes to receive deliveries, access emergency services, register to vote, etc.
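As an illustration of the conversion to a Geo URI, the sketch below decodes a Geohash to the center of its cell and writes it in the `geo:latitude,longitude` form. Geohash is used only as an example of a grid-based geocode; the function names and the choice of five decimal places are assumptions of this sketch.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # Geohash digit alphabet


def geohash_center(code: str) -> tuple[float, float]:
    """Return (latitude, longitude) of the center of a Geohash cell."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    use_lon = True  # bits alternate between axes, starting with longitude
    for ch in code:
        value = BASE32.index(ch)
        for shift in range(4, -1, -1):  # 5 bits per digit, high bit first
            bit = (value >> shift) & 1
            if use_lon:
                mid = (lon_lo + lon_hi) / 2
                if bit:
                    lon_lo = mid
                else:
                    lon_hi = mid
            else:
                mid = (lat_lo + lat_hi) / 2
                if bit:
                    lat_lo = mid
                else:
                    lat_hi = mid
            use_lon = not use_lon
    return (lat_lo + lat_hi) / 2, (lon_lo + lon_hi) / 2


def geohash_to_geo_uri(code: str) -> str:
    """Express the cell center of a Geohash as a geo URI."""
    lat, lon = geohash_center(code)
    return f"geo:{lat:.5f},{lon:.5f}"


print(geohash_to_geo_uri("ezs42"))
```

The URI carries only the cell center, so the cell size (the precision of the original geocode) is lost unless it is conveyed separately.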

Geocode | Inception | Coverage | Formation | Ownership | Rep. entity | Context and description
Local OLC (Cape Verde) | 2016 | globe | encode(latLon,precision) | free | grid cell | OLC is used to provide postal services.[14]
Eircode (Ireland) | 2014[15] | Ireland | encode(latLon,precision) | copyrighted[16] | grid cell | It is used officially as alternative address and as postal code. Limited database and algorithm access. It is a kind of fine-grained postal code.

In use, postal codes

Geocodes in use as postal codes. A geocode recognized by the Universal Postal Union and adopted as "official postal code" by a country is also a valid postal code. Not all postal codes are geographic, and in some postal code systems there are codes that are not geocodes (e.g. in the UK system). Samples, not a complete list:

Geocode | Inception | Coverage | Formation | Ownership | Rep. entity | Context and description
CEP (Brazil) | 1970? | cities or streets | Hierarchical serial number | proprietary | (variable) | ... The CEP5 is geographic and CEP8 can be a city (polygon), a street (also street side or a fragment of street side) or a point (specific address).
Postal Index Number (India) | ? | postal regions | Hierarchical serial number? | proprietary? | (undefined?) | ...
ZIP Code (United States) | ? | postal regions | Hierarchical serial number? | proprietary? | (undefined?) | ...

In use, telephony and radio

Geocodes in use for telephony or radio broadcasting scope:

In use, others

Geocodes in use and with specific scope:

Geocode | Inception | Scope | Coverage | Formation | Ownership | Rep. entity | Context and description
ONS code | 2001 | UK only | UK/themes | Serial number | free | polygon | Administrative divisions. Geographical areas of the UK, for use in tabulating census.
NUTS area code | 2003 | EU only | Europe | Hierarchical | free | polygon | Administrative divisions. Partially administrative, worldwide (countries) and Europe (country to community).
MARC country codes | 1971 | USA only? | globe/only nations | Name abbreviation | free | polygon | Administrative divisions. Country codes.
 | ? | Canada only | ? | Serial number | free | polygon | Administrative divisions, numeric codes. ... Statistical, like ONS.
 | ? | trade and transport | globe | Serial number | free | polygon | Administrative divisions. UN codes for trade and transport locations.
IATA airport codes | 1930s | airport | globe | ? | free | polygon | Administrative divisions. Area/point codes, airports and 3-letter city codes.
ICAO airport codes | 1950s | airport | globe | ? | free | polygon | Administrative divisions. Area/point codes, airports.
IANA country codes | 1994 | Internet | globe | ? | free | polygon | Administrative divisions. Similar to ISO 3166-1 alpha-2, see Country code top-level domain, List and Internationalized country codes.
IOC country codes | ~1960 | Sport | globe | abbreviation | free | polygon | Administrative divisions. Codes of IOC members; uses three-letter abbreviation country codes, like ISO 3166-1 alpha-3.
 | ? | Environment | globe | ? | free | polygon | Administrative divisions. A set of four-letter codes used in ecological/geographic regions in oceanography.
 | ? | sport/football | global | ? | free | polygon | Administrative divisions.
FIPS country codes | 1994? | scope | | ? | free | polygon | Administrative divisions. (FIPS 10-4) area code.
FIPS place codes | ? | place | | ? | free | polygon | (FIPS 55). Administrative divisions.
FIPS country codes | ? | globe/nations | | ? | free | polygon | (FIPS 6-4). Administrative divisions.
FIPS state codes | ? | ? | | ? | free | polygon | (FIPS 5-2). Administrative divisions.

Historical or less widely used

Geocode | Inception | Scope | Coverage | Formation | Ownership | Rep. entity | Context and description
HASC | ? | general | nations and subdivs. | Name abbreviation | free | polygon | Administrative divisions. HASC stands for "Hierarchical Administrative Subdivision Codes".
 | ? | general | ? | ? | free | grid cell | ?
 | ? | general | ? | ? | free | grid cell | Based on UTM zones and latitude bands of MGRS.
 | ~2005? | Meteorology | globe | grid | free | grid cell | ... replaced by modern DGGS's ...
 | 2002 | general | globe | ? | free | grid cell | Compact encoding of geographic coordinate bounds (latitude-longitude). Uses WMO squares as starting point for hierarchical subdivision.
 | ? | general | ? | ? | free | polygon | World Geographic Reference System, a military / air navigation coordinate system for point and area identification.
 | ~2007? | general | ? | ? | free | polygon | Reference system developed by the National Geospatial-Intelligence Agency (NGA).
 | ~1960s | general | ? | ? | free | grid cell | Military Grid Reference System. Derived from UTM and UPS grids by NATO with a unique naming convention.

Other examples

Other geocodes:

Other standards

Some standards and name servers include: ISO 3166, FIPS, INSEE, Geonames, IATA and ICAO.

A number of commercial solutions have also been proposed:

See also

Notes and References

  1. The OGC's standard "Discrete Global Grid Systems" definition.
  2. For internet formats and protocols, WGS84 is the de facto and de jure standard: see the geo URI protocol and the GeoJSON, GML and KML formats.
  3. Definitions of the OGC's "Glossary of Terms".
  4. DeLozier, Jochen L. (2007). Toponym resolution in text: annotation, evaluation and applications of spatial grounding (PhD thesis). University of Edinburgh. 1842/1849.
  5. Gwillim Law (2016). Administrative Subdivisions of Countries: A Comprehensive World Reference, 1900 Through 1998. McFarland. ISBN 978-0-7864-0729-3.
  6. Note: in practical use Geohash can expand to base2, but geometrically it is based on latitude and longitude (2+2) partitions, so base2 can result in loss of symmetry. Strictly, Geohash base32 also needs two-digit keys for base4 compatibility.
  7. "Topic 21: Discrete Global Grid Systems Abstract Specification", Open Geospatial Consortium (2017). https://docs.opengeospatial.org/as/15-104r5/15-104r5.html
  8. See formal use of ISO country codes in Mapcode at https://www.mapcode.com/territory
  9. Guidance for shortening codes. google/Open-location-code Wiki. GitHub.
  10. OpenStreetMap's short link, documented at wiki.openstreetmap.org, was released in 2009; its source code was nearly unchanged 10 years later. It is strongly based on Morton's interlace algorithm.
  11. Home. plus.codes.
  12. What3words: Find and share very precise locations via Google Maps with just 3 words. 2 July 2013. 8 July 2014.
  13. Mongolia adopts what3words as national addressing system. Geospatial Solutions. June 2016.
  14. (2016-09-08) "Correios de Cabo Verde testam novo sistema de endereçamento da Google", https://web.archive.org/web/20170209155133/http://aicep.pt/?%2Fnoticias%2F1%2F2534
  15. Dept of Communications. Minister Rabbitte launches Eircode the new location codes for Irish addresses. 28 April 2014. 2015-07-15.
  16. Eircode Terms of Use.
  17. Overview. s2geometry.io. 2018-05-11.
  18. Kreiss, Sven (2016-07-27). S2 cells and space-filling curves: Keys to building better digital map tools for cities. Medium. 2018-05-11.
  19. Uber Blog announcing h3. uber.com. 2023-02-08.
  20. h3 open source code. github.com. 2023-02-08.
  21. h3 documentation. h3geo.org. 2023-02-08.
  22. Navipedia / ESA. https://gssc.esa.int/navipedia/index.php/Step_By_Step_Navigation
  23. Second Administrative Level Boundaries. 2020-04-09. Archived 2021-04-04 at https://web.archive.org/web/20210404034644/https://www.unsalb.org/
  24. OpenPostcode.org. 10 June 2012.
  25. Shortlink. OpenStreetMap Wiki.
  26. Understanding Geographic Identifiers (GEOIDs). March 3, 2016.