Schema crosswalk explained

A schema crosswalk is a table that shows equivalent elements (or "fields") in more than one database schema. It maps the elements in one schema to the equivalent elements in another.

Crosswalk tables are often employed within or in parallel to enterprise systems, especially when multiple systems are interfaced or when a system includes legacy data. In the context of interfaces, they function as an internal extract, transform, load (ETL) mechanism.

For example, this is a metadata crosswalk from MARC standards to Dublin Core:

MARC field                                        | Dublin Core element
260$c (Date of publication, distribution, etc.)   | Date.Created
522 (Geographic Coverage Note)                    | Coverage.Spatial
300$a (Physical Description)                      | Format.Extent
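
The table above can be read as a lookup structure. The following is a minimal sketch of that idea, not a real MARC parser: the record shape (a list of tag, subfield, value tuples), the function name `crosswalk`, and the sample values are all assumptions made for illustration.

```python
# Hypothetical in-memory form of the crosswalk table above.
# Keys are (MARC tag, subfield code); None stands for "the whole field".
MARC_TO_DC = {
    ("260", "c"): "Date.Created",
    ("522", None): "Coverage.Spatial",
    ("300", "a"): "Format.Extent",
}


def crosswalk(marc_record):
    """Translate a simplified MARC record into Dublin Core elements.

    marc_record is a list of (tag, subfield, value) tuples. This shape is
    an assumption for the example, not a real MARC serialization.
    Returns the translated record and the fields that had no mapping.
    """
    dc = {}
    unmapped = []
    for tag, subfield, value in marc_record:
        element = MARC_TO_DC.get((tag, subfield))
        if element is None:
            # No row in the crosswalk table covers this field.
            unmapped.append((tag, subfield, value))
        else:
            dc.setdefault(element, []).append(value)
    return dc, unmapped


# Placeholder values, chosen only to show the mechanics.
record = [
    ("260", "c", "date value"),
    ("300", "a", "extent value"),
    ("522", None, "coverage value"),
    ("500", "a", "note with no row in the table"),
]
dc_record, leftovers = crosswalk(record)
```

Returning the unmapped fields separately keeps the decision about what to do with them (discard, log, or map by hand) outside the lookup itself.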

Crosswalks show where the data from one scheme belongs in a different scheme. They are often used by libraries, archives, museums, and other cultural institutions to translate data to or from MARC standards, Dublin Core, Text Encoding Initiative (TEI), and other metadata schemes. For example, an archive has a MARC record in its catalog describing a manuscript. Suppose the archive makes a digital copy of that manuscript and wants to display it on the web along with the information from the catalog. In that case, it will have to translate the data from the MARC catalog record into a different format, such as Metadata Object Description Schema (MODS), that is viewable on a webpage. Because MARC has different fields than MODS, decisions must be made about where to put the data in MODS. This type of "translating" from one format to another is often called "metadata mapping" or "field mapping," and is related to "data mapping" and "semantic mapping."

Crosswalks also have several technical capabilities. They help databases using different metadata schemes to share information. They help metadata harvesters create union catalogs. They enable search engines to search multiple databases simultaneously with a single query.
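
As a rough illustration of these capabilities, the sketch below harvests records from two catalogs into one union catalog and then answers a single query across both. Everything in it is hypothetical: the catalogs, their field names, the shared target scheme (`title`, `date`), and the helper functions are assumptions, not any real harvester's API.

```python
# Two hypothetical catalogs holding the same kind of information
# under different field names (the names are illustrative only).
CATALOG_A = [{"245": "first title", "260c": "1900"}]
CATALOG_B = [{"titleInfo": "second title", "dateIssued": "1900"}]

# One crosswalk per source scheme, each mapping into a shared target scheme.
CROSSWALKS = {
    "A": {"245": "title", "260c": "date"},
    "B": {"titleInfo": "title", "dateIssued": "date"},
}


def to_common(record, crosswalk):
    """Rename a record's fields to the shared scheme, dropping unmapped ones."""
    return {crosswalk[k]: v for k, v in record.items() if k in crosswalk}


def build_union_catalog():
    """Harvest every record from every catalog into one list."""
    union = []
    for name, records in (("A", CATALOG_A), ("B", CATALOG_B)):
        for record in records:
            common = to_common(record, CROSSWALKS[name])
            common["source"] = name  # remember where the record came from
            union.append(common)
    return union


def search(union, field, value):
    """Run a single query against every harvested record."""
    return [r for r in union if r.get(field) == value]


union_catalog = build_union_catalog()
hits = search(union_catalog, "date", "1900")
```

Because both catalogs are translated into the same target scheme first, the query only has to know one set of field names.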

Challenges for crosswalks

One of the biggest challenges for crosswalks is that no two metadata schemes are 100% equivalent. One scheme may have a field that doesn't exist in another scheme or a field that is split into two different fields in another scheme; this is why data is often lost when mapping from a complex scheme to a simpler one. For example, when mapping from MARC to Simple Dublin Core, the distinction between types of titles is lost:

MARC field              | Dublin Core element
210 Abbreviated Title   | Title
222 Key Title           | Title
240 Uniform Title       | Title
242 Translated Title    | Title
245 Title Statement     | Title
246 Variant Title       | Title

Simple Dublin Core only has one "Title" element, so all of the different types of MARC titles get lumped together without further distinctions. A future attempt to convert the metadata back into MARC would enter the information in the basic MARC 245 Title Statement field, with none of the original distinctions.[1]

Dublin Core element | MARC field
Title               | 245 Title Statement
Title               | 245 Title Statement
Title               | 245 Title Statement
Title               | 245 Title Statement
Title               | 245 Title Statement
Title               | 245 Title Statement

This is why crosswalks are said to be "lateral" (one-way) mappings from one scheme to another. Separate crosswalks would be required to map from scheme A to scheme B and from scheme B to scheme A.[2]
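
The loss can be demonstrated with two separate one-way crosswalks. This is a minimal sketch under stated assumptions: records are simplified to lists of (field, value) pairs, the function `apply_crosswalk` is hypothetical, and the values are placeholders.

```python
# Forward crosswalk: six MARC title fields collapse into one element.
MARC_TO_SIMPLE_DC = {
    "210": "Title",
    "222": "Title",
    "240": "Title",
    "242": "Title",
    "245": "Title",
    "246": "Title",
}

# Reverse crosswalk: a separate table, since the forward one
# cannot be inverted (one "Title" has six possible sources).
SIMPLE_DC_TO_MARC = {"Title": "245"}


def apply_crosswalk(fields, crosswalk):
    """Translate (name, value) pairs, skipping names the crosswalk lacks."""
    return [(crosswalk[name], value) for name, value in fields if name in crosswalk]


# Placeholder values that record which MARC field each one started in.
marc = [
    ("240", "uniform title value"),
    ("245", "title statement value"),
    ("246", "variant title value"),
]

dublin_core = apply_crosswalk(marc, MARC_TO_SIMPLE_DC)
round_trip = apply_crosswalk(dublin_core, SIMPLE_DC_TO_MARC)
```

After the round trip the values survive, but every one of them sits in field 245, so `round_trip` no longer matches the original `marc` list.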

Difficulties in mapping

Other mapping problems arise when:

Some of these problems are not fixable. As Karen Coyle says in "Crosswalking Citation Metadata: The University of California's Experience,"

"The more metadata experience we have, the more it becomes clear that metadata perfection is not attainable, and anyone who attempts it will be sorely disappointed. When metadata is crosswalked between two or more unrelated sources, there will be data elements that cannot be reconciled in an ideal manner. The key to a successful metadata crosswalk is intelligent flexibility. It is essential to focus on the important goals and be willing to compromise to reach a practical conclusion to projects."[3]

Notes and References

  1. "Dublin Core to MARC Crosswalk," https://www.loc.gov/marc/dccross.html
  2. Caplan, Priscilla. Metadata Fundamentals for All Librarians. Chicago: American Library Association, 2003. ISBN 0838908470. p. 39.
  3. Coyle, Karen. "Crosswalking Citation Metadata: The University of California's Experience." In Metadata in Practice, Diane I. Hillmann and Elaine L. Westbrooks, eds. Chicago: American Library Association, 2004. p. 91.