A geographic information system (GIS) consists of integrated computer hardware and software that store, manage, analyze, edit, output, and visualize geographic data.[1] [2] Much of this often happens within a spatial database; however, this is not essential to meet the definition of a GIS.[1] In a broader sense, one may consider such a system also to include human users and support staff, procedures and workflows, the body of knowledge of relevant concepts and methods, and institutional organizations.
The uncounted plural, geographic information systems, also abbreviated GIS, is the most common term for the industry and profession concerned with these systems. It is roughly synonymous with geoinformatics. The academic discipline that studies these systems and their underlying geographic principles may also be abbreviated as GIS, but the unambiguous GIScience is more common.[3] GIScience is often considered a subdiscipline of geography within the branch of technical geography.
Geographic information systems are used in multiple technologies, processes, techniques and methods. They support operations and applications in engineering, planning, management, transport/logistics, insurance, telecommunications, and business.[4] For this reason, GIS and location intelligence applications are at the foundation of location-enabled services, which rely on geographic analysis and visualization.
GIS provides the capability to relate previously unrelated information through the use of location as the "key index variable". Locations and extents found in the Earth's spacetime can be recorded through the date and time of occurrence, along with x, y, and z coordinates representing longitude (x), latitude (y), and elevation (z). All Earth-based, spatial–temporal, location and extent references should be relatable to one another, and ultimately, to a "real" physical location or extent. This key characteristic of GIS has begun to open new avenues of scientific inquiry and studies.
While digital GIS dates to the mid-1960s, when Roger Tomlinson first coined the phrase "geographic information system",[5] many of the geographic concepts and methods that GIS automates date back decades earlier.
One of the first known instances in which spatial analysis was used came from the field of epidemiology in the "Rapport sur la marche et les effets du choléra dans Paris et le département de la Seine" (1832).[6] French geographer and cartographer Charles Picquet created a map outlining the forty-eight districts of Paris, using halftone color gradients to represent the number of reported deaths due to cholera per 1,000 inhabitants.
In 1854, John Snow, an epidemiologist and physician, was able to determine the source of a cholera outbreak in London through the use of spatial analysis. Snow achieved this through plotting the residence of each casualty on a map of the area, as well as the nearby water sources. Once these points were marked, he was able to identify the water source within the cluster that was responsible for the outbreak. This was one of the earliest successful uses of a geographic methodology in pinpointing the source of an outbreak in epidemiology. While the basic elements of topography and theme existed previously in cartography, Snow's map was unique due to his use of cartographic methods, not only to depict, but also to analyze clusters of geographically dependent phenomena.
The early 20th century saw the development of photozincography, which allowed maps to be split into layers, for example one layer for vegetation and another for water. This was particularly used for printing contours – drawing these was a labour-intensive task but having them on a separate layer meant they could be worked on without the other layers to confuse the draughtsman. This work was initially drawn on glass plates, but later plastic film was introduced, with the advantages of being lighter, using less storage space and being less brittle, among others. When all the layers were finished, they were combined into one image using a large process camera. Once color printing came in, the layers idea was also used for creating separate printing plates for each color. While the use of layers much later became one of the typical features of a contemporary GIS, the photographic process just described is not considered a GIS in itself, as the maps were just images with no database to link them to.
Two additional developments are notable in the early days of GIS: Ian McHarg's publication "Design with Nature"[7] and its map overlay method and the introduction of a street network into the U.S. Census Bureau's DIME (Dual Independent Map Encoding) system.[8]
The first publication detailing the use of computers to facilitate cartography was written by Waldo Tobler in 1959.[9] Further computer hardware development spurred by nuclear weapon research led to more widespread general-purpose computer "mapping" applications by the early 1960s.[10]
In 1963, the world's first true operational GIS was developed in Ottawa, Ontario, Canada, by the federal Department of Forestry and Rural Development. Developed by Roger Tomlinson, it was called the Canada Geographic Information System (CGIS) and was used to store, analyze, and manipulate data collected for the Canada Land Inventory, an effort to determine the land capability for rural Canada by mapping information about soils, agriculture, recreation, wildlife, waterfowl, forestry and land use at a scale of 1:50,000. A rating classification factor was also added to permit analysis.[11] [12]
CGIS was an improvement over "computer mapping" applications as it provided capabilities for data storage, overlay, measurement, and digitizing/scanning. It supported a national coordinate system that spanned the continent, coded lines as arcs having a true embedded topology and it stored the attribute and locational information in separate files. As a result of this, Tomlinson has become known as the "father of GIS", particularly for his use of overlays in promoting the spatial analysis of convergent geographic data.[13] CGIS lasted into the 1990s and built a large digital land resource database in Canada. It was developed as a mainframe-based system in support of federal and provincial resource planning and management. Its strength was continent-wide analysis of complex datasets. The CGIS was never available commercially.
In 1964, Howard T. Fisher formed the Laboratory for Computer Graphics and Spatial Analysis at the Harvard Graduate School of Design (LCGSA 1965–1991), where a number of important theoretical concepts in spatial data handling were developed, and which by the 1970s had distributed seminal software code and systems, such as SYMAP, GRID, and ODYSSEY, to universities, research centers and corporations worldwide.[14] These programs were the first examples of general-purpose GIS software that was not developed for a particular installation, and were very influential on future commercial software, such as Esri ARC/INFO, released in 1983.
By the late 1970s two public domain GIS systems (MOSS and GRASS GIS) were in development, and by the early 1980s, M&S Computing (later Intergraph) along with Bentley Systems Incorporated for the CAD platform, Environmental Systems Research Institute (ESRI), CARIS (Computer Aided Resource Information System), and ERDAS (Earth Resource Data Analysis System) emerged as commercial vendors of GIS software, successfully incorporating many of the CGIS features, combining the first generation approach to separation of spatial and attribute information with a second generation approach to organizing attribute data into database structures.[15]
In 1986, Mapping Display and Analysis System (MIDAS), the first desktop GIS product,[16] was released for the DOS operating system. This was renamed in 1990 to MapInfo for Windows when it was ported to the Microsoft Windows platform. This began the process of moving GIS from the research department into the business environment.
By the end of the 20th century, the rapid growth in various systems had been consolidated and standardized on relatively few platforms, and users were beginning to explore viewing GIS data over the Internet, requiring data format and transfer standards. More recently, a growing number of free, open-source GIS packages run on a range of operating systems and can be customized to perform specific tasks. The major trend of the 21st century has been the integration of GIS capabilities with other information technology and Internet infrastructure, such as relational databases, cloud computing, software as a service (SAAS), and mobile computing.[17]
See main article: Geographic information system software. The distinction must be made between a singular geographic information system, which is a single installation of software and data for a particular use, along with associated hardware, staff, and institutions (e.g., the GIS for a particular city government); and GIS software, a general-purpose application program that is intended to be used in many individual geographic information systems in a variety of application domains. Starting in the late 1970s, many software packages have been created specifically for GIS applications. Esri's ArcGIS, which includes ArcGIS Pro and the legacy software ArcMap, currently dominates the GIS market. Other examples of GIS include Autodesk and MapInfo Professional and open source programs such as QGIS, GRASS GIS, MapGuide, and Hadoop-GIS.[18] These and other desktop GIS applications include a full suite of capabilities for entering, managing, analyzing, and visualizing geographic data, and are designed to be used on their own.
Starting in the late 1990s with the emergence of the Internet, as computer network technology progressed, GIS infrastructure and data began to move to servers, providing another mechanism for providing GIS capabilities. This was facilitated by standalone software installed on a server, similar to other server software such as HTTP servers and relational database management systems, enabling clients to have access to GIS data and processing tools without having to install specialized desktop software. These networks are known as distributed GIS.[19] [20] This strategy has been extended through the Internet and development of cloud-based GIS platforms such as ArcGIS Online and GIS-specialized software as a service (SAAS). The use of the Internet to facilitate distributed GIS is known as Internet GIS.[19] [20]
An alternative approach is the integration of some or all of these capabilities into other software or information technology architectures. One example is a spatial extension to object-relational database software, which defines a geometry datatype so that spatial data can be stored in relational tables, and adds extensions to SQL for spatial analysis operations such as overlay. Another example is the proliferation of geospatial libraries and application programming interfaces (e.g., GDAL, Leaflet, D3.js) that extend programming languages to enable the incorporation of GIS data and processing into custom software, including web mapping sites and location-based services in smartphones.
The core of any GIS is a database that contains representations of geographic phenomena, modeling their geometry (location and shape) and their properties or attributes. A GIS database may be stored in a variety of forms, such as a collection of separate data files or a single spatially-enabled relational database. Collecting and managing these data usually constitutes the bulk of the time and financial resources of a project, far more than other aspects such as analysis and mapping.[21]
GIS uses spatio-temporal (space-time) location as the key index variable for all other information. Just as a relational database containing text or numbers can relate many different tables using common key index variables, GIS can relate otherwise unrelated information by using location as the key index variable. The key is the location and/or extent in space-time.
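The idea of location as a join key can be illustrated with a minimal sketch. Everything here is hypothetical: the two tables, their coordinates, the 0.5-degree cell size, and the helper names `cell_key` and `join_by_location`. Snapping coordinates to a grid cell is only one crude way to derive a shared key; real systems typically use spatial indexes and true geometric predicates.

```python
import math
from collections import defaultdict

def cell_key(lon, lat, size=0.5):
    """Snap a coordinate to the index of the grid cell that contains it."""
    return (math.floor(lon / size), math.floor(lat / size))

def join_by_location(left, right, size=0.5):
    """Pair rows from two tables whose coordinates fall in the same grid cell."""
    index = defaultdict(list)
    for row in right:
        index[cell_key(row["lon"], row["lat"], size)].append(row)
    pairs = []
    for row in left:
        for match in index.get(cell_key(row["lon"], row["lat"], size), []):
            pairs.append((row, match))
    return pairs

# Two hypothetical tables that share no attribute except location.
wetlands = [
    {"wetland": "wetland A", "lon": -75.51, "lat": 45.40},
    {"wetland": "wetland B", "lon": -74.83, "lat": 45.48},
]
rain_gauges = [
    {"station": "airport", "lon": -75.67, "lat": 45.32, "rain_mm": 12.0},
    {"station": "school", "lon": -75.10, "lat": 45.90, "rain_mm": 30.0},
]

pairs = join_by_location(wetlands, rain_gauges)
for wetland, gauge in pairs:
    print(wetland["wetland"], "<->", gauge["station"], gauge["rain_mm"])
```

A grid key misses neighbours that sit just across a cell boundary, which is one reason production systems prefer distance or containment tests over simple cell matching.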
Any variable that can be located spatially, and increasingly also temporally, can be referenced using a GIS. Locations or extents in Earth space–time may be recorded as dates/times of occurrence, and x, y, and z coordinates representing longitude, latitude, and elevation, respectively. These GIS coordinates may represent other quantified systems of temporo-spatial reference (for example, film frame number, stream gage station, highway mile-marker, surveyor benchmark, building address, street intersection, entrance gate, water depth sounding, POS or CAD drawing origin/units). Units applied to recorded temporal-spatial data can vary widely (even when using exactly the same data, see map projections), but all Earth-based spatial–temporal location and extent references should, ideally, be relatable to one another and ultimately to a "real" physical location or extent in space–time.
Related by accurate spatial information, an incredible variety of real-world and projected past or future data can be analyzed, interpreted and represented.[22] This key characteristic of GIS has begun to open new avenues of scientific inquiry into behaviors and patterns of real-world information that previously had not been systematically correlated.
See main article: Data model (GIS) and GIS file formats. GIS data represents phenomena that exist in the real world, such as roads, land use, elevation, trees, waterways, and states. The most common types of phenomena that are represented in data can be divided into two conceptualizations: discrete objects (e.g., a house, a road) and continuous fields (e.g., rainfall amount or population density). Other types of geographic phenomena, such as events (e.g., location of World War II battles), processes (e.g., extent of suburbanization), and masses (e.g., types of soil in an area) are represented less commonly or indirectly, or are modeled in analysis procedures rather than data.
Traditionally, there are two broad methods used to store data in a GIS for both kinds of abstractions: raster images and vector data. Vector data represent mapped features as points, lines, and polygons with attached attributes.
A new hybrid method of storing data is that of identifying point clouds, which combine three-dimensional points with RGB information at each point, returning a "3D color image". GIS thematic maps then are becoming more and more realistically visually descriptive of what they set out to show or determine.
GIS data acquisition includes several methods for gathering spatial data into a GIS database, which can be grouped into three categories: primary data capture, the direct measurement of phenomena in the field (e.g., remote sensing, the global positioning system); secondary data capture, the extraction of information from existing sources that are not in a GIS form, such as paper maps, through digitization; and data transfer, the copying of existing GIS data from external sources such as government agencies and private companies. All of these methods can consume significant time, finances, and other resources.[21]
Survey data can be directly entered into a GIS from digital data collection systems on survey instruments using a technique called coordinate geometry (COGO). Positions from a global navigation satellite system (GNSS) such as the Global Positioning System can also be collected and then imported into a GIS. A current trend in data collection is the use of field computers that can edit live data over wireless connections or in disconnected editing sessions.[23] A related trend is mobile GIS: the use of applications available on smartphones and PDAs. This has been enhanced by the availability of low-cost mapping-grade GPS units with decimeter accuracy in real time, which eliminates the need to post-process, import, and update the data in the office after fieldwork. Positions collected using a laser rangefinder can also be incorporated. New technologies also allow users to create maps and perform analysis directly in the field, making projects more efficient and mapping more accurate.
Remotely sensed data also play an important role in data collection; they are gathered by sensors attached to a platform. Sensors include cameras, digital scanners and lidar, while platforms usually consist of aircraft and satellites. In England in the mid-1990s, hybrid kite/balloons called helikites pioneered the use of compact airborne digital cameras as airborne geo-information systems. Aircraft measurement software, accurate to 0.4 mm, was used to link the photographs and measure the ground. Helikites are inexpensive and gather more accurate data than aircraft. Helikites can be used over roads, railways and towns where unmanned aerial vehicles (UAVs) are banned.
Recently aerial data collection has become more accessible with miniature UAVs and drones. For example, the Aeryon Scout was used to map a 50-acre area with a ground sample distance of 1 inch in only 12 minutes.[24]
The majority of digital data currently comes from photo interpretation of aerial photographs. Soft-copy workstations are used to digitize features directly from stereo pairs of digital photographs. These systems allow data to be captured in two and three dimensions, with elevations measured directly from a stereo pair using principles of photogrammetry. Analog aerial photos must be scanned before being entered into a soft-copy system; for high-quality digital cameras, this step is skipped.
Satellite remote sensing provides another important source of spatial data. Here satellites use different sensor packages to passively measure the reflectance from parts of the electromagnetic spectrum or radio waves that were sent out from an active sensor such as radar. Remote sensing collects raster data that can be further processed using different bands to identify objects and classes of interest, such as land cover.
The most common method of data creation is digitization, where a hard copy map or survey plan is transferred into a digital medium through the use of a CAD program, and geo-referencing capabilities. With the wide availability of ortho-rectified imagery (from satellites, aircraft, Helikites and UAVs), heads-up digitizing is becoming the main avenue through which geographic data is extracted. Heads-up digitizing involves the tracing of geographic data directly on top of the aerial imagery instead of by the traditional method of tracing the geographic form on a separate digitizing tablet (heads-down digitizing). Heads-down digitizing, or manual digitizing, uses a special magnetic pen, or stylus, that feeds information into a computer to create an identical, digital map. Some tablets use a mouse-like tool, called a puck, instead of a stylus.[25] [26] The puck has a small window with cross-hairs which allows for greater precision and pinpointing map features. Though heads-up digitizing is more commonly used, heads-down digitizing is still useful for digitizing maps of poor quality.
Existing data printed on paper or PET film maps can be digitized or scanned to produce digital data. A digitizer produces vector data as an operator traces points, lines, and polygon boundaries from a map. Scanning a map results in raster data that could be further processed to produce vector data.
When data is captured, the user should consider if the data should be captured with either a relative accuracy or absolute accuracy, since this could not only influence how information will be interpreted but also the cost of data capture.
After data is entered into a GIS, it usually requires editing to remove errors, or further processing. Vector data must be made "topologically correct" before it can be used for some advanced analysis. For example, in a road network, lines must connect with nodes at an intersection. Errors such as undershoots and overshoots must also be removed. For scanned maps, blemishes on the source map may need to be removed from the resulting raster. For example, a fleck of dirt might connect two lines that should not be connected.
See main article: Spatial reference system. The earth can be represented by various models, each of which may provide a different set of coordinates (e.g., latitude, longitude, elevation) for any given point on the Earth's surface. The simplest model is to assume the earth is a perfect sphere. As more measurements of the earth have accumulated, the models of the earth have become more sophisticated and more accurate. In fact, there are models called datums that apply to different areas of the earth to provide increased accuracy, like North American Datum of 1983 for U.S. measurements, and the World Geodetic System for worldwide measurements.
The latitude and longitude on a map made against a local datum may not be the same as one obtained from a GPS receiver. Converting coordinates from one datum to another requires a datum transformation such as a Helmert transformation, although in certain situations a simple translation may be sufficient.[27]
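As a rough illustration of what a datum transformation computes, the sketch below applies a seven-parameter Helmert transformation to geocentric (X, Y, Z) coordinates. It assumes the small-angle, position-vector sign convention; the coordinate-frame convention flips the signs of the rotations. The function name `helmert` and every parameter value are placeholders, not published values for any real datum pair.

```python
import math

def helmert(xyz, translation, scale_ppm, rotation_arcsec):
    """Seven-parameter Helmert transformation (position-vector, small-angle form).

    xyz             -- geocentric coordinates in metres
    translation     -- (tx, ty, tz) in metres
    scale_ppm       -- scale change in parts per million
    rotation_arcsec -- (rx, ry, rz) in arc-seconds
    """
    x, y, z = xyz
    tx, ty, tz = translation
    rx, ry, rz = (math.radians(a / 3600.0) for a in rotation_arcsec)
    m = 1.0 + scale_ppm * 1e-6
    return (
        tx + m * (x - rz * y + ry * z),
        ty + m * (rz * x + y - rx * z),
        tz + m * (-ry * x + rx * y + z),
    )

# With zero rotation and zero scale change, the transformation reduces to
# the simple translation mentioned above (placeholder offsets).
shifted = helmert((1.0, 2.0, 3.0), (10.0, 20.0, 30.0), 0.0, (0.0, 0.0, 0.0))
print(shifted)
```

Converting between latitude/longitude and geocentric coordinates is a separate step that depends on the ellipsoid of each datum and is not shown here.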
In popular GIS software, data stored as latitude/longitude is often described as being in a geographic coordinate system (GCS). For example, latitude/longitude data whose datum is the 'North American Datum of 1983' is denoted by 'GCS North American 1983'.
While no digital model can be a perfect representation of the real world, it is important that GIS data be of a high quality. In keeping with the principle of homomorphism, the data must be close enough to reality so that the results of GIS procedures correctly correspond to the results of real world processes. This means that there is no single standard for data quality, because the necessary degree of quality depends on the scale and purpose of the tasks for which it is to be used. Several elements of data quality are important to GIS data:
The quality of a dataset is very dependent upon its sources, and the methods used to create it. Land surveyors have been able to provide a high level of positional accuracy utilizing high-end GPS equipment, but GPS locations on the average smartphone are much less accurate.[30] Common datasets such as digital terrain and aerial imagery[31] are available in a wide variety of levels of quality, especially spatial precision. Paper maps, which have been digitized for many years as a data source, can also be of widely varying quality.
A quantitative analysis of maps brings accuracy issues into focus. The electronic and other equipment used to make measurements for GIS is far more precise than the machines of conventional map analysis. All geographical data are inherently inaccurate, and these inaccuracies will propagate through GIS operations in ways that are difficult to predict.[32]
Data restructuring can be performed by a GIS to convert data into different formats. For example, a GIS may be used to convert a satellite image map to a vector structure by generating lines around all cells with the same classification, while determining the cell spatial relationships, such as adjacency or inclusion.
More advanced data processing can occur with image processing, a technique developed in the late 1960s by NASA and the private sector to provide contrast enhancement, false color rendering and a variety of other techniques including use of two dimensional Fourier transforms. Since digital data is collected and stored in various ways, the two data sources may not be entirely compatible. So a GIS must be able to convert geographic data from one structure to another. In so doing, the implicit assumptions behind different ontologies and classifications require analysis.[33] Object ontologies have gained increasing prominence as a consequence of object-oriented programming and sustained work by Barry Smith and co-workers.
Spatial ETL tools provide the data processing functionality of traditional extract, transform, load (ETL) software, but with a primary focus on the ability to manage spatial data. They provide GIS users with the ability to translate data between different standards and proprietary formats, whilst geometrically transforming the data en route. These tools can come in the form of add-ins to existing wider-purpose software such as spreadsheets.
GIS spatial analysis is a rapidly changing field, and GIS packages are increasingly including analytical tools as standard built-in facilities, as optional toolsets, as add-ins or 'analysts'. In many instances these are provided by the original software suppliers (commercial vendors or collaborative non commercial development teams), while in other cases facilities have been developed and are provided by third parties. Furthermore, many products offer software development kits (SDKs), programming languages and language support, scripting facilities and/or special interfaces for developing one's own analytical tools or variants. The increased availability has created a new dimension to business intelligence termed "spatial intelligence" which, when openly delivered via intranet, democratizes access to geographic and social network data. Geospatial intelligence, based on GIS spatial analysis, has also become a key element for security. GIS as a whole can be described as conversion to a vectorial representation or to any other digitisation process.
Geoprocessing is a GIS operation used to manipulate spatial data. A typical geoprocessing operation takes an input dataset, performs an operation on that dataset, and returns the result of the operation as an output dataset. Common geoprocessing operations include geographic feature overlay, feature selection and analysis, topology processing, raster processing, and data conversion. Geoprocessing allows for definition, management, and analysis of information used to form decisions.[34]
See main article: Geomorphometry.
See also: Surface gradient. Many geographic tasks involve the terrain, the shape of the surface of the earth, such as hydrology, earthworks, and biogeography. Thus, terrain data is often a core dataset in a GIS, usually in the form of a raster Digital elevation model (DEM) or a Triangulated irregular network (TIN). A variety of tools are available in most GIS software for analyzing terrain, often by creating derivative datasets that represent a specific aspect of the surface. Some of the most common include:
Most of these are generated using algorithms that are discrete simplifications of vector calculus. Slope, aspect, and surface curvature in terrain analysis are all derived from neighborhood operations using elevation values of a cell's adjacent neighbours.[38] Each of these is strongly affected by the level of detail in the terrain data, such as the resolution of a DEM, which should be chosen carefully.[39]
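A neighborhood operation of this kind can be sketched with plain central differences over a cell's four adjacent neighbours. This is a simplification: GIS packages often use weighted eight-neighbour schemes instead. The function name `slope_aspect`, the sample grid, and the convention that row 0 is the northern edge are assumptions made for the illustration, and only interior cells are handled.

```python
import math

def slope_aspect(dem, row, col, cell_size=1.0):
    """Slope (degrees) and aspect (compass bearing of the downslope direction)
    for an interior cell, using central differences on its four neighbours.
    Row 0 of the grid is assumed to be the northern edge."""
    west, east = dem[row][col - 1], dem[row][col + 1]
    north, south = dem[row - 1][col], dem[row + 1][col]
    dz_dx = (east - west) / (2.0 * cell_size)    # rate of change toward east
    dz_dy = (north - south) / (2.0 * cell_size)  # rate of change toward north
    slope = math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))
    if dz_dx == 0 and dz_dy == 0:
        return slope, None  # aspect is undefined on flat ground
    # Downslope vector is (-dz_dx, -dz_dy); bearing is clockwise from north.
    aspect = math.degrees(math.atan2(-dz_dx, -dz_dy)) % 360.0
    return slope, aspect

# Hypothetical 3x3 DEM that rises by one unit per cell toward the east.
dem = [
    [0.0, 1.0, 2.0],
    [0.0, 1.0, 2.0],
    [0.0, 1.0, 2.0],
]
slope, aspect = slope_aspect(dem, 1, 1)
print(slope, aspect)
```

Because both gradients are divided by the cell size, the same terrain sampled at a coarser resolution generally yields gentler slopes, which is the sensitivity to DEM resolution noted above.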
See main article: Proximity analysis. Distance is a key part of solving many geographic tasks, usually due to the friction of distance. Thus, a wide variety of analysis tools analyze distance in some form, such as buffers, Voronoi or Thiessen polygons, cost distance analysis, and network analysis.
It is difficult to relate wetlands maps to rainfall amounts recorded at different points such as airports, television stations, and schools. A GIS, however, can be used to depict two- and three-dimensional characteristics of the Earth's surface, subsurface, and atmosphere from information points. For example, a GIS can quickly generate a map with isopleth or contour lines that indicate differing amounts of rainfall. Such a map can be thought of as a rainfall contour map. Many sophisticated methods can estimate the characteristics of surfaces from a limited number of point measurements. A two-dimensional contour map created from the surface modeling of rainfall point measurements may be overlaid and analyzed with any other map in a GIS covering the same area. This GIS derived map can then provide additional information - such as the viability of water power potential as a renewable energy source. Similarly, GIS can be used to compare other renewable energy resources to find the best geographic potential for a region.[40]
Additionally, from a series of three-dimensional points, or digital elevation model, isopleth lines representing elevation contours can be generated, along with slope analysis, shaded relief, and other elevation products. Watersheds can be easily defined for any given reach, by computing all of the areas contiguous and uphill from any given point of interest. Similarly, an expected thalweg of where surface water would want to travel in intermittent and permanent streams can be computed from elevation data in the GIS.
A GIS can recognize and analyze the spatial relationships that exist within digitally stored spatial data. These topological relationships allow complex spatial modelling and analysis to be performed. Topological relationships between geometric entities traditionally include adjacency (what adjoins what), containment (what encloses what), and proximity (how close something is to something else).
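Two of these relationships can be tested with very little code. The sketch below is illustrative only: `contains` uses the standard ray-casting (even-odd) test for a point in a simple polygon, and `adjacent` treats two polygons as adjoining only when they share an identical edge, which is a much stricter rule than a real GIS topology engine applies. The function names and the two square parcels are hypothetical.

```python
def contains(polygon, point):
    """Containment: even-odd ray-casting test for a point in a simple polygon."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge crosses the horizontal line through y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def edges(polygon):
    """Undirected edges of a polygon given as a list of (x, y) tuples."""
    n = len(polygon)
    return {frozenset((polygon[i], polygon[(i + 1) % n])) for i in range(n)}

def adjacent(a, b):
    """Adjacency (simplified): the polygons share at least one identical edge."""
    return bool(edges(a) & edges(b))

# Two hypothetical square parcels that share the edge from (4, 0) to (4, 4).
parcel_a = [(0, 0), (4, 0), (4, 4), (0, 4)]
parcel_b = [(4, 0), (8, 0), (8, 4), (4, 4)]

print(contains(parcel_a, (2, 2)), contains(parcel_a, (5, 2)))
print(adjacent(parcel_a, parcel_b))
```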
See main article: Transport network analysis.
Geometric networks are linear networks of objects that can be used to represent interconnected features, and to perform special spatial analysis on them. A geometric network is composed of edges, which are connected at junction points, similar to graphs in mathematics and computer science. Just like graphs, networks can have weight and flow assigned to their edges, which can be used to represent various interconnected features more accurately. Geometric networks are often used to model road networks and public utility networks, such as electric, gas, and water networks. Network modeling is also commonly employed in transportation planning, hydrology modeling, and infrastructure modeling.
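Because a geometric network is a weighted graph, standard graph algorithms apply to it directly. As one possible illustration, the sketch below finds a least-cost route with Dijkstra's algorithm. The junction names, the edge weights (imagined as travel times), and the function name `shortest_path` are all hypothetical, and edges are treated as one-way.

```python
import heapq

def shortest_path(network, source, target):
    """Dijkstra's algorithm on a network stored as
    {junction: [(neighbouring junction, edge weight), ...]}."""
    best = {source: 0.0}
    previous = {}
    queue = [(0.0, source)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == target:
            break
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry; a cheaper route was already found
        for neighbour, weight in network.get(node, []):
            new_cost = cost + weight
            if new_cost < best.get(neighbour, float("inf")):
                best[neighbour] = new_cost
                previous[neighbour] = node
                heapq.heappush(queue, (new_cost, neighbour))
    if target not in best:
        return None, float("inf")
    path = [target]
    while path[-1] != source:
        path.append(previous[path[-1]])
    return path[::-1], best[target]

# Hypothetical road network: junctions A to D, weights as travel minutes.
roads = {
    "A": [("B", 4.0), ("C", 1.0)],
    "B": [("D", 1.0)],
    "C": [("B", 2.0), ("D", 5.0)],
    "D": [],
}
route, minutes = shortest_path(roads, "A", "D")
print(route, minutes)
```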
See main article: Map algebra.
Dana Tomlin coined the term "cartographic modeling" in his PhD dissertation (1983); he later used it in the title of his book, Geographic Information Systems and Cartographic Modeling (1990).[41] Cartographic modeling refers to a process where several thematic layers of the same area are produced, processed, and analyzed. Tomlin used raster layers, but the overlay method (see below) can be used more generally. Operations on map layers can be combined into algorithms, and eventually into simulation or optimization models.
See main article: Vector overlay and Map algebra. The combination of several spatial datasets (points, lines, or polygons) creates a new output vector dataset, visually similar to stacking several maps of the same region. These overlays are similar to mathematical Venn diagram overlays. A union overlay combines the geographic features and attribute tables of both inputs into a single new output. An intersect overlay defines the area where both inputs overlap and retains a set of attribute fields for each. A symmetric difference overlay defines an output area that includes the total area of both inputs except for the overlapping area.
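The set logic behind these three overlays can be shown with a toy model. In the sketch below each layer is a dictionary from unit grid cells to an attribute record, so the overlays reduce to set operations on the cell keys. This is a stand-in for illustration only: real vector overlay computes the geometric intersection of polygon boundaries rather than comparing cells. The layer contents and the helper names are hypothetical.

```python
def make_layer(xmin, ymin, xmax, ymax, **attrs):
    """Build a toy layer: every unit cell in the rectangle gets the same attributes."""
    return {(x, y): dict(attrs) for x in range(xmin, xmax) for y in range(ymin, ymax)}

def union(a, b):
    """All cells of either input, with the attributes of both where present."""
    return {c: {**a.get(c, {}), **b.get(c, {})} for c in a.keys() | b.keys()}

def intersect(a, b):
    """Only cells covered by both inputs, keeping attribute fields from each."""
    return {c: {**a[c], **b[c]} for c in a.keys() & b.keys()}

def symmetric_difference(a, b):
    """Cells covered by exactly one input, with that input's attributes."""
    return {c: (a[c] if c in a else b[c]) for c in a.keys() ^ b.keys()}

# Two hypothetical 4 x 4 layers that overlap in a 2 x 2 block.
soil = make_layer(0, 0, 4, 4, soil="clay")
zoning = make_layer(2, 2, 6, 6, zone="rural")

print(len(union(soil, zoning)), len(intersect(soil, zoning)),
      len(symmetric_difference(soil, zoning)))
print(intersect(soil, zoning)[(2, 2)])
```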
Data extraction is a GIS process similar to vector overlay, though it can be used in either vector or raster data analysis. Rather than combining the properties and features of both datasets, data extraction involves using a "clip" or "mask" to extract the features of one data set that fall within the spatial extent of another dataset.
In raster data analysis, the overlay of datasets is accomplished through a process known as "local operation on multiple rasters" or "map algebra", through a function that combines the values of each raster's matrix. This function may weigh some inputs more than others through use of an "index model" that reflects the influence of various factors upon a geographic phenomenon.
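A local operation on multiple rasters can be sketched as a cell-by-cell weighted average. The example assumes all rasters are aligned grids of the same shape and that each cell already holds a suitability score. The scores, the 3:1 weighting, and the function name `weighted_overlay` are hypothetical.

```python
def weighted_overlay(rasters, weights):
    """Combine aligned rasters cell by cell as a weighted average (a simple index model)."""
    rows, cols = len(rasters[0]), len(rasters[0][0])
    total = sum(weights)
    return [
        [
            sum(w * r[i][j] for r, w in zip(rasters, weights)) / total
            for j in range(cols)
        ]
        for i in range(rows)
    ]

# Hypothetical suitability scores on a 2 x 2 grid.
slope_score = [[1, 2], [3, 4]]
soil_score = [[4, 3], [2, 1]]

# Slope is assumed to matter three times as much as soil.
suitability = weighted_overlay([slope_score, soil_score], [3, 1])
print(suitability)
```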
See main article: Geostatistics.
Geostatistics is a branch of statistics that deals with field data, spatial data with a continuous index. It provides methods to model spatial correlation, and predict values at arbitrary locations (interpolation).
When phenomena are measured, the observation methods dictate the accuracy of any subsequent analysis. Due to the nature of the data (e.g. traffic patterns in an urban environment; weather patterns over the Pacific Ocean), a constant or dynamic degree of precision is always lost in the measurement. This loss of precision is determined by the scale and distribution of the data collection.
To determine the statistical relevance of the analysis, an average is determined so that points (gradients) outside of any immediate measurement can be included to determine their predicted behavior. Because of the limitations of the applied statistical and data collection methods, interpolation is required to predict the behavior of particles, points, and locations that are not directly measurable.
Interpolation is the process by which a surface is created, usually a raster dataset, through the input of data collected at a number of sample points. There are several forms of interpolation, each of which treats the data differently, depending on the properties of the data set. In comparing interpolation methods, the first consideration should be whether or not the source data will change (exact or approximate). Next is whether the method is subjective (a human interpretation) or objective. Then there is the nature of transitions between points: are they abrupt or gradual? Finally, there is whether a method is global (it uses the entire data set to form the model) or local, where an algorithm is repeated for a small section of terrain.
Interpolation is justified by the principle of spatial autocorrelation, which recognizes that data collected at any position will be highly similar to, or influenced by, data at locations in its immediate vicinity.
Digital elevation models, triangulated irregular networks, edge-finding algorithms, Thiessen polygons, Fourier analysis, (weighted) moving averages, inverse distance weighting, kriging, spline, and trend surface analysis are all mathematical methods to produce interpolative data.
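Of the methods listed, inverse distance weighting is among the simplest to write down. The sketch below assumes sample points given as `(x, y, value)` tuples and a distance exponent of 2; the function name `idw` and the sample values are hypothetical.

```python
import math


def idw(samples, x, y, power=2.0):
    """Estimate a value at (x, y) as a distance-weighted average of the samples."""
    numerator = 0.0
    denominator = 0.0
    for sx, sy, value in samples:
        distance = math.hypot(x - sx, y - sy)
        if distance == 0.0:
            # Exact interpolator: at a sample location, return the measured value.
            return value
        weight = 1.0 / distance ** power
        numerator += weight * value
        denominator += weight
    return numerator / denominator


# Hypothetical measurements at the four corners of a 10 x 10 square.
samples = [(0.0, 0.0, 10.0), (10.0, 0.0, 20.0), (0.0, 10.0, 30.0), (10.0, 10.0, 40.0)]

center = idw(samples, 5.0, 5.0)    # equidistant from all samples
near_first = idw(samples, 1.0, 1.0)  # dominated by the sample at (0, 0)
print(center, near_first)
```

In terms of the criteria above, this version is exact (it returns the measured value at a sample point), objective, gradual, and global, since every sample contributes to every estimate. Restricting the loop to the nearest few samples would make it local.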
See main article: Geocoding. Geocoding is interpolating spatial locations (X,Y coordinates) from street addresses or any other spatially referenced data such as ZIP Codes, parcel lots and address locations. A reference theme is required to geocode individual addresses, such as a road centerline file with address ranges. The individual address locations have historically been interpolated, or estimated, by examining address ranges along a road segment. These address ranges are usually provided in the form of a table or database. The software will then place a dot approximately where that address belongs along the segment of centerline. For example, an address point of 500 will be approximately at the midpoint of a line segment that starts with address 1 and ends with address 1,000. Geocoding can also be applied against actual parcel data, typically from municipal tax maps. In this case, the result of the geocoding will be an actually positioned space as opposed to an interpolated point. This approach is being increasingly used to provide more precise location information.
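The address-range interpolation described above amounts to a linear interpolation along the segment. The sketch below assumes a straight segment with known end coordinates and ignores refinements such as side-of-street offsets; the function name `geocode_along_segment` and the coordinates are hypothetical.

```python
def geocode_along_segment(house_number, range_start, range_end, start_xy, end_xy):
    """Place a house number along a straight segment by linear interpolation."""
    low, high = sorted((range_start, range_end))
    if not low <= house_number <= high:
        raise ValueError("house number is outside the segment's address range")
    fraction = (house_number - range_start) / (range_end - range_start)
    x = start_xy[0] + fraction * (end_xy[0] - start_xy[0])
    y = start_xy[1] + fraction * (end_xy[1] - start_xy[1])
    return x, y


# Hypothetical segment running 999 units east, with address range 1 to 1,000.
x, y = geocode_along_segment(500, 1, 1000, (0.0, 0.0), (999.0, 0.0))
print(x, y)
```

Address 500 lands about 499 units along the 999-unit segment, just short of the exact midpoint at 499.5, which is why the result is only an estimate of the true location.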
See main article: Reverse geocoding. Reverse geocoding is the process of returning an estimated street address number as it relates to a given coordinate. For example, a user can click on a road centerline theme (thus providing a coordinate) and have information returned that reflects the estimated house number. This house number is interpolated from a range assigned to that road segment. If the user clicks at the midpoint of a segment that starts with address 1 and ends with 100, the returned value will be somewhere near 50. Note that reverse geocoding does not return actual addresses, only estimates of what should be there based on the predetermined range.
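Reverse geocoding inverts the same interpolation: the clicked coordinate is projected onto the segment, and the resulting fraction is mapped back into the address range. This is a sketch under the assumption of a single straight segment; the function name `reverse_geocode` and the coordinates are hypothetical.

```python
def reverse_geocode(click_xy, start_xy, end_xy, range_start, range_end):
    """Estimate a house number for a coordinate near a straight road segment."""
    dx = end_xy[0] - start_xy[0]
    dy = end_xy[1] - start_xy[1]
    length_squared = dx * dx + dy * dy
    if length_squared == 0:
        raise ValueError("segment has zero length")
    # Fraction of the way along the segment of the point closest to the click.
    t = ((click_xy[0] - start_xy[0]) * dx + (click_xy[1] - start_xy[1]) * dy) / length_squared
    t = max(0.0, min(1.0, t))  # clamp clicks beyond either end of the segment
    return round(range_start + t * (range_end - range_start))


# Hypothetical click slightly off the midpoint of a segment with range 1 to 100.
estimate = reverse_geocode((50.0, 3.0), (0.0, 0.0), (100.0, 0.0), 1, 100)
print(estimate)
```

The returned number is an estimate derived from the range alone; nothing in the calculation checks whether a building with that number exists.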
See main article: Multiple-criteria decision analysis. Coupled with GIS, multi-criteria decision analysis methods support decision-makers in analysing a set of alternative spatial solutions, such as the most likely ecological habitat for restoration, against multiple criteria, such as vegetation cover or roads. MCDA uses decision rules to aggregate the criteria, which allows the alternative solutions to be ranked or prioritised.[42] GIS MCDA may reduce costs and time involved in identifying potential restoration sites.
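One common decision rule is the weighted linear combination, in which each alternative's criterion scores are multiplied by weights and summed. The source does not specify a rule, so the choice here is an assumption, as are the function name `rank_sites`, the candidate sites, the criterion scores (normalized so that higher is better), and the equal weights.

```python
def rank_sites(sites, weights):
    """Rank alternatives by the weighted sum of their normalized criterion scores."""
    scores = {
        name: sum(weights[criterion] * value for criterion, value in criteria.items())
        for name, criteria in sites.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical restoration sites with criterion scores between 0 and 1.
sites = {
    "site_a": {"vegetation_cover": 0.75, "distance_from_roads": 0.5},
    "site_b": {"vegetation_cover": 0.5, "distance_from_roads": 1.0},
    "site_c": {"vegetation_cover": 0.25, "distance_from_roads": 0.25},
}
weights = {"vegetation_cover": 0.5, "distance_from_roads": 0.5}

ranking = rank_sites(sites, weights)
print(ranking)
```

In a GIS the same calculation is usually applied per raster cell or per polygon, so that the ranked alternatives are locations on a map rather than rows in a table.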
GIS or spatial data mining is the application of data mining methods to spatial data. Data mining, which is the partially automated search for hidden patterns in large databases, offers great potential benefits for applied GIS-based decision making. Typical applications include environmental monitoring. A characteristic of such applications is that spatial correlation between data measurements requires the use of specialized algorithms for more efficient data analysis.[43]
See main article: Cartographic design and Digital mapping. Cartography is the design and production of maps, or visual representations of spatial data. The vast majority of modern cartography is done with the help of computers, usually using GIS, but production of quality cartography is also achieved by importing layers into a design program to refine them. Most GIS software gives the user substantial control over the appearance of the data.
Cartographic work serves two major functions:
First, it produces graphics on the screen or on paper that convey the results of analysis to the people who make decisions about resources. Wall maps and other graphics can be generated, allowing the viewer to visualize and thereby understand the results of analyses or simulations of potential events. Web Map Servers facilitate distribution of generated maps through web browsers using various implementations of web-based application programming interfaces (AJAX, Java, Flash, etc.).
Second, other database information can be generated for further analysis or use. An example would be a list of all addresses within one mile (1.6 km) of a toxic spill.
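The toxic-spill query can be sketched as a distance filter over an address table. The example assumes addresses stored with latitude and longitude, a spherical Earth of radius 6,371 km, and great-circle distance computed with the haversine formula; the function names, addresses, and coordinates are all hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius; a spherical approximation
ONE_MILE_KM = 1.609344


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def addresses_within(addresses, center, radius_km=ONE_MILE_KM):
    """Return the addresses whose distance from center (lat, lon) is within radius_km."""
    return [
        a["address"] for a in addresses
        if haversine_km(a["lat"], a["lon"], center[0], center[1]) <= radius_km
    ]


# Hypothetical spill location and address table.
spill = (40.0, -75.0)
addresses = [
    {"address": "12 Elm St", "lat": 40.005, "lon": -75.0},
    {"address": "98 Pine Rd", "lat": 40.02, "lon": -75.0},
    {"address": "7 Oak Ave", "lat": 40.0, "lon": -75.01},
]

nearby = addresses_within(addresses, spill)
print(nearby)
```

A GIS would normally answer this with a buffer and a spatial index rather than a scan of every row, but the result is the same kind of derived list.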
An archeochrome is a new way of displaying spatial data. It is a thematic display on a 3D map that is applied to a specific building or a part of a building. It is suited to the visual display of heat-loss data.
See main article: Terrain cartography. Traditional maps are abstractions of the real world, a sampling of important elements portrayed on a sheet of paper with symbols to represent physical objects. People who use maps must interpret these symbols. Topographic maps show the shape of land surface with contour lines or with shaded relief.
Today, graphic display techniques such as shading based on altitude in a GIS can make relationships among map elements visible, heightening one's ability to extract and analyze information. For example, two types of data were combined in a GIS to produce a perspective view of a portion of San Mateo County, California.
A GIS was used to register and combine the two images to render the three-dimensional perspective view looking down the San Andreas Fault, using the Thematic Mapper image pixels, but shaded using the elevation of the landforms. The GIS display depends on the viewing point of the observer and time of day of the display, to properly render the shadows created by the sun's rays at that latitude, longitude, and time of day.
See main article: Web mapping.
In recent years there has been a proliferation of free-to-use and easily accessible mapping software such as the proprietary web applications Google Maps and Bing Maps, as well as the free and open-source alternative OpenStreetMap. These services give the public access to huge amounts of geographic data, perceived by many users to be as trustworthy and usable as professional information.[44] For example, during the COVID-19 pandemic, web maps hosted on dashboards were used to rapidly disseminate case data to the general public.[45]
Some of them, like Google Maps and OpenLayers, expose an application programming interface (API) that enables users to create custom applications. These toolkits commonly offer street maps, aerial/satellite imagery, geocoding, searches, and routing functionality. Web mapping has also uncovered the potential of crowdsourcing geodata in projects like OpenStreetMap, which is a collaborative project to create a free editable map of the world. These mashup projects have been proven to provide a high level of value and benefit to end users beyond that possible through traditional geographic information.[46] [47]
Web mapping is not without its drawbacks. Web mapping allows for the creation and distribution of maps by people without proper cartographic training.[48] This has led to maps that ignore cartographic conventions and are potentially misleading, with one study finding that more than half of United States state government COVID-19 dashboards did not follow these conventions.[49] [50]
Since its origin in the 1960s, GIS has been used in an ever-increasing range of applications, corroborating the widespread importance of location and aided by the continuing reduction in the barriers to adopting geospatial technology. The perhaps hundreds of different uses of GIS can be classified in several ways:
The implementation of a GIS is often driven by jurisdictional (such as a city), purpose, or application requirements. Generally, a GIS implementation may be custom-designed for an organization. Hence, a GIS deployment developed for an application, jurisdiction, enterprise, or purpose may not necessarily be interoperable or compatible with a GIS that has been developed for some other application, jurisdiction, enterprise, or purpose.[61]
GIS is also diverging into location-based services, which allow GPS-enabled mobile devices to display their location in relation to fixed objects (nearest restaurant, gas station, fire hydrant) or mobile objects (friends, children, police car), or to relay their position back to a central server for display or other processing.
GIS is also used in digital marketing and SEO for audience segmentation based on location.[62] [63]
The use of digital maps generated by GIS has also influenced the development of an academic field known as spatial humanities.[64]
See main article: Open Geospatial Consortium. The Open Geospatial Consortium (OGC) is an international industry consortium of 384 companies, government agencies, universities, and individuals participating in a consensus process to develop publicly available geoprocessing specifications. Open interfaces and protocols defined by OpenGIS Specifications support interoperable solutions that "geo-enable" the Web, wireless and location-based services, and mainstream IT, and empower technology developers to make complex spatial information and services accessible and useful with all kinds of applications. Open Geospatial Consortium protocols include Web Map Service and Web Feature Service.[65]
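As an illustration of what such an open interface looks like in practice, a Web Map Service client requests a rendered map by sending a GetMap query over HTTP. The sketch below only builds the request URL, using the parameter names defined for WMS version 1.3.0; the endpoint `https://example.org/wms`, the layer name `land_cover`, and the function name `build_getmap_url` are placeholders, not a real service.

```python
from urllib.parse import urlencode


def build_getmap_url(endpoint, layer, bbox, width, height):
    """Build a WMS 1.3.0 GetMap URL for one layer, rendered as a PNG image."""
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",          # empty value requests the server's default style
        "CRS": "CRS:84",       # longitude/latitude axis order
        "BBOX": ",".join(str(v) for v in bbox),  # min_lon, min_lat, max_lon, max_lat
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": "image/png",
    }
    return endpoint + "?" + urlencode(params)


# Placeholder endpoint and layer; the extent is an arbitrary example.
url = build_getmap_url("https://example.org/wms", "land_cover", (-122.6, 37.2, -122.0, 37.8), 512, 512)
print(url)
```

Because the parameters are standardized, the same request shape should work against any server that implements the specification, which is the interoperability the consortium aims for.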
GIS products are broken down by the OGC into two categories, based on how completely and accurately the software follows the OGC specifications.
Compliant products are software products that comply with OGC's OpenGIS Specifications. When a product has been tested and certified as compliant through the OGC Testing Program, the product is automatically registered as "compliant" on the OGC site.
Implementing products are software products that implement OpenGIS Specifications but have not yet passed a compliance test. Compliance tests are not available for all specifications. Developers can register their products as implementing draft or approved specifications, though OGC reserves the right to review and verify each entry.
See also: Historical geographic information system and Time geography. The condition of the Earth's surface, atmosphere, and subsurface can be examined by feeding satellite data into a GIS. GIS technology gives researchers the ability to examine the variations in Earth processes over days, months, and years through the use of cartographic visualizations.[66] As an example, the changes in vegetation vigor through a growing season can be animated to determine when drought was most extensive in a particular region. The resulting graphic represents a rough measure of plant health. Working with two variables over time would then allow researchers to detect regional differences in the lag between a decline in rainfall and its effect on vegetation.
GIS technology and the availability of digital data on regional and global scales enable such analyses. The satellite sensor output used to generate a vegetation graphic is produced, for example, by the advanced very-high-resolution radiometer (AVHRR). This sensor system detects the amounts of energy reflected from the Earth's surface across various bands of the spectrum for surface areas of about 1 square kilometer. The satellite sensor produces images of a particular location on the Earth twice a day. AVHRR and more recently the moderate-resolution imaging spectroradiometer (MODIS) are only two of many sensor systems used for Earth surface analysis.
In addition to the integration of time in environmental studies, GIS is also being explored for its ability to track and model the progress of humans throughout their daily routines. A concrete example of progress in this area is the recent release of time-specific population data by the U.S. Census. In this data set, the populations of cities are shown for daytime and evening hours, highlighting the pattern of concentration and dispersion generated by North American commuting patterns. The manipulation and generation of data required to produce this data set would not have been possible without GIS.
Using models to project the data held by a GIS forward in time has enabled planners to test policy decisions using spatial decision support systems.
Tools and technologies emerging from the World Wide Web Consortium's Semantic Web are proving useful for data integration problems in information systems. Correspondingly, such technologies have been proposed as a means to facilitate interoperability and data reuse among GIS applications and also to enable new analysis mechanisms.[67] [68] [69] [70]
Ontologies are a key component of this semantic approach as they allow a formal, machine-readable specification of the concepts and relationships in a given domain. This in turn allows a GIS to focus on the intended meaning of data rather than its syntax or structure. For example, reasoning that a land cover type classified as deciduous needleleaf trees in one dataset is a specialization or subset of land cover type forest in another more roughly classified dataset can help a GIS automatically merge the two datasets under the more general land cover classification. Tentative ontologies have been developed in areas related to GIS applications, for example the hydrology ontology[71] developed by the Ordnance Survey in the United Kingdom and the SWEET ontologies[72] developed by NASA's Jet Propulsion Laboratory. Also, simpler ontologies and semantic metadata standards are being proposed by the W3C Geo Incubator Group[73] to represent geospatial data on the web. GeoSPARQL is a standard developed by the Ordnance Survey, United States Geological Survey, Natural Resources Canada, Australia's Commonwealth Scientific and Industrial Research Organisation and others to support ontology creation and reasoning using well-understood OGC literals (GML, WKT), topological relationships (Simple Features, RCC8, DE-9IM), RDF and the SPARQL database query protocols.
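The land cover example can be sketched as a walk up a subclass hierarchy. The dictionary below is a toy stand-in for an ontology, not an excerpt from any of the ontologies named above; the class names other than those in the example, and the function name `generalize`, are hypothetical.

```python
# Toy class hierarchy: each concept maps to its more general parent concept.
SUBCLASS_OF = {
    "deciduous needleleaf trees": "forest",
    "evergreen broadleaf trees": "forest",
    "forest": "land cover",
    "cropland": "land cover",
}


def generalize(label, vocabulary):
    """Return the nearest ancestor of label (or label itself) found in vocabulary."""
    current = label
    while current is not None:
        if current in vocabulary:
            return current
        current = SUBCLASS_OF.get(current)
    return None  # no shared concept, so the label cannot be merged automatically


# The coarser dataset only distinguishes these classes.
coarse_vocabulary = {"forest", "cropland"}

detailed_labels = ["deciduous needleleaf trees", "cropland", "evergreen broadleaf trees"]
merged_labels = [generalize(label, coarse_vocabulary) for label in detailed_labels]
print(merged_labels)
```

A real system would express the hierarchy in RDF and let a reasoner or a SPARQL query infer the subclass relationships, but the merging logic follows the same idea.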
Recent research results in this area can be seen in the International Conference on Geospatial Semantics[74] and the Terra Cognita – Directions to the Geospatial Semantic Web[75] workshop at the International Semantic Web Conference.
See main article: Neogeography and Public participation GIS. With the popularization of GIS in decision making, scholars have begun to scrutinize the social and political implications of GIS.[76] [77] GIS can also be misused to distort reality for individual and political gain.[78] [79] It has been argued that the production, distribution, utilization, and representation of geographic information are largely related to the social context and have the potential to increase citizen trust in government.[80] Other related topics include discussion on copyright, privacy, and censorship. A more optimistic social approach to GIS adoption is to use it as a tool for public participation.
See also: Esri Education User Conference. At the end of the 20th century, GIS began to be recognized as tools that could be used in the classroom.[81] [82] [83] The benefits of GIS in education seem focused on developing spatial cognition, but there is not enough published literature or statistical data to show the concrete scope of the use of GIS in education around the world, although the expansion has been faster in those countries where the curriculum mentions them.[84]
GIS seem to provide many advantages in teaching geography because they allow for analyses based on real geographic data and also help raise many research questions from teachers and students in classrooms. They also contribute to improvement in learning by developing spatial and geographical thinking and, in many cases, student motivation.[84]
Courses in GIS have also been offered by educational institutions.[85] [86]
GIS is proven as an organization-wide, enterprise and enduring technology that continues to change how local government operates.[87] Government agencies have adopted GIS technology as a method to better manage the following areas of government organization:
The Open Data initiative is pushing local government to take advantage of technology such as GIS, as it encompasses the requirements to fit the Open Data/Open Government model of transparency. With Open Data, local government organizations can implement Citizen Engagement applications and online portals, allowing citizens to see land information, report potholes and signage issues, view and sort parks by assets, view real-time crime rates and utility repairs, and much more.[89] [90] The push for open data within government organizations is driving the growth in local government GIS technology spending and database management.