EIDR explained

Entertainment ID Registry Association
Formation: 2010
Type: 501(c)(6) not-for-profit membership corporation
Headquarters: Redwood City, CA
Leader Title: Executive Director
Leader Name: Hollie Choi

The Entertainment Identifier Registry, or EIDR, is a global unique identifier system for a broad array of audiovisual objects, including motion pictures, television, and radio programs. The identification system resolves an identifier to a metadata record that is associated with top-level titles, edits, DVDs, encodings, clips, and mashups. EIDR also provides identifiers for video service providers, such as broadcast and cable networks.

As of June 2020, EIDR contained over two million records, including almost 400,000 movies and almost one million episodes from over 40,000 TV series.[1]

EIDR is an implementation of a digital object identifier (DOI).

History

Media asset identification systems have existed for decades. The common motivation for their creation is to enable the management of media assets by assigning a unique ID to a set of metadata representing the salient characteristics of each asset. Over time such systems tend to proliferate, with each arising to deal with a specific set of issues. As a result, there is considerable variation between systems in terms of which assets are categorized, which metadata is associated with each asset, and the very definition of an asset. To name a few examples, should a "director's cut" of a film be distinct from the original theatrical release? How should regional variations (e.g. translation of the title or dialog into foreign languages) be accounted for? Further complications include the procedures (and required credentials) for adding new assets, editing existing assets, and creating derivative assets.

EIDR was created to address these issues, as well as others encountered in video asset workflows, both in a business-to-business context and the intramural post-production activities of content producers. EIDR has the following characteristics:

EIDR is intended to supplement, not replace, existing asset identification systems. On the contrary, a key feature is that an EIDR record can include references to the same asset's ID under other systems. This feature is particularly useful for film and television archives, making it easy for them to cross-reference their holdings with other sources for the work and metadata about it. By design, EIDR does not replicate features of other asset ID systems, such as commercial systems that seek to add value through enhanced metadata (e.g. plot summaries, production details). Tracking ownership and rights information is also a non-goal, although this can be implemented by applications that use the EIDR ID.

Content model

EIDR is built on a collection of records (which are further sub-divided into fields) that are stored in a central registry. These records are referenced externally by DOIs, which are assigned when a record is created, and each identifier is immutable thereafter. The identifier resolution system underlying DOIs is the Handle System and so each native EIDR Content ID is a handle formatted, in increasing specificity, to handle, DOI and EIDR standards.

Content ID format

The canonical form of an EIDR Content ID is an instance of a handle and has the format:

10.5240/XXXX-XXXX-XXXX-XXXX-XXXX-C

where each X is a hexadecimal digit and C is a check character.
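The canonical form can be checked mechanically. The following is a minimal sketch, not the registry's own validator. It assumes that the check character is the ISO/IEC 7064 Mod 37,36 character computed over the 20 hexadecimal digits of the suffix (hyphens and prefix excluded) and that hexadecimal digits are written in upper case. These assumptions are consistent with the IDs quoted in this article; the function names are illustrative only.

```python
import re

# Characters 0-9 then A-Z carry the values 0..35 in ISO/IEC 7064 Mod 37,36.
ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"

# Prefix, five hyphen-terminated groups of four hex digits, one check character.
EIDR_RE = re.compile(r"^10\.5240/((?:[0-9A-F]{4}-){5})([0-9A-Z])$")


def check_character(hex_digits: str) -> str:
    """Compute the ISO/IEC 7064 Mod 37,36 check character (hybrid system)."""
    p = 36
    for ch in hex_digits:
        s = (p % 37) + ALPHABET.index(ch)
        r = s % 36 or 36          # a remainder of 0 is replaced by 36
        p = r * 2
    return ALPHABET[(37 - p % 37) % 36]


def is_valid_content_id(eidr_id: str) -> bool:
    """Return True if the string has the canonical shape and a matching check character."""
    m = EIDR_RE.match(eidr_id)
    if m is None:
        return False
    digits = m.group(1).replace("-", "")   # the 20 hex digits of the suffix
    return check_character(digits) == m.group(2)
```

For example, `is_valid_content_id("10.5240/EA73-79D7-1B2B-B378-3A73-M")` returns `True`, while changing the final character to `N` makes it return `False`.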

There is also a 96-bit compact binary form that is intended for embedding in small payloads such as watermarks. This form is generated from the canonical format as follows:

The Uniform Resource Name (URN) form of an EIDR ID is defined in a separate specification.

For use on the web an EIDR content ID can be represented as a URI in one of these forms:

Record types

There are four types of content records, each associated with a reserved prefix:

The sub-prefixes 5237, 5238, 5239, and 5240 are all assigned to the EIDR Association.

Content Records

Content records are objects categorized by their types and relationships. Each has three different (orthogonal) kinds of type:

Basic metadata

The following fields (taken from a larger set) comprise the base object data of a content record:

Deleted content records

An EIDR ID must always be resolvable, so under normal circumstances the corresponding Content Record is permanent. There are two mechanisms available to deal with errors or other unusual circumstances. The preferred one is aliasing, whereby an EIDR ID is transparently redirected to another content record. Aliasing is commonly employed to deal with an asset being registered twice.

The other mechanism is the use of tombstone records. This is employed when the Content Record is corrupted, or an otherwise invalid asset was accidentally registered. In this case the ID will be aliased to a special tombstone record. The tombstone can be recognized by applications because its EIDR ID field will be set to the distinguished value "10.5240/0000-0000-0000-0000-0000-X". Note that "X" means the 24th letter of the Latin alphabet (ASCII 0x58 or Unicode U+0058).
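An application that resolves an ID can distinguish the three outcomes described above. The sketch below assumes a resolved record is available as a dictionary whose `"ID"` key holds the record's EIDR ID field; that representation, and the function and label names, are hypothetical. It also assumes that an aliased ID resolves to a record carrying a different ID from the one requested.

```python
# Distinguished value of the EIDR ID field in the tombstone record (from the text above).
TOMBSTONE_ID = "10.5240/0000-0000-0000-0000-0000-X"


def resolution_status(requested_id: str, record: dict) -> str:
    """Classify a resolved record as 'tombstone', 'aliased', or 'direct'.

    `record` is a hypothetical dict form of a content record.
    """
    resolved_id = record.get("ID")
    if resolved_id == TOMBSTONE_ID:
        return "tombstone"   # the requested ID was registered in error
    if resolved_id != requested_id:
        return "aliased"     # the requested ID redirects to another record
    return "direct"
```

Checking for the tombstone first matters, because a tombstoned ID is itself an alias to the tombstone record.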

Alternate ID

Having a rich set of alternate IDs for content is one of the primary goals of EIDR. This allows EIDR IDs to be used everywhere in content workflows; if an alternate ID is needed it can be found in the metadata for the EIDR ID. EIDR supports the inclusion of both proprietary and standard (e.g. ISAN) ID references. Additional alternate IDs can be added when needed (e.g. by parties wanting to support new workflows). Below is an example of alternate IDs for the EIDR asset 10.5240/EA73-79D7-1B2B-B378-3A73-M (the movie Blade Runner). If an alternate ID is resolvable algorithmically, for example by placing it appropriately in a template URL, EIDR makes that link available.

Alternate IDs for 10.5240/EA73-79D7-1B2B-B378-3A73-M

#    Alternate ID                        Type          Domain / Relation
1    0000-0000-14A9-0000-K-0000-0000-E   ISAN
2    89                                  IVA
3    B000SW4DLM                          Proprietary   Domain: amazon.com
4    12886                               Proprietary   Domain: flixster.com
5    15042                               Proprietary   Domain: thecinemasource.com
6    tt0083658                           IMDB          Relation: IsSameAs
7    E0087486000                         Proprietary   Domain: spe.sony.com/MPM
8    3929                                Proprietary   Domain: spe.sony.com/ProductID
9    2002029                             Proprietary   Domain: warnerbros.com/MPM
10   389785                              Proprietary   Domain: veronicamagazine.nl
11   B001EC2J1G                          Proprietary   Domain: amazon.com
12   150002645                           Proprietary   Domain: bfi.org.uk

Alternate IDs are partitioned into non-proprietary and proprietary. The former have distinguished, predefined types (e.g. those issued by ISAN, IMDb, and IVA), whereas proprietary IDs are all of type "Proprietary", and are further distinguished by an associated DNS domain. As of July 2017, there are over 2 million alternate IDs directly available through EIDR.
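The partition into typed and proprietary IDs maps naturally onto a small data structure. The sketch below is illustrative only: the class and function names are hypothetical, not part of any EIDR API, and the sample values are a subset of the Blade Runner example above (read on the assumption that each entry's running number is separate from its value).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class AlternateID:
    value: str
    id_type: str                   # e.g. "ISAN", "IMDB", "IVA", or "Proprietary"
    domain: Optional[str] = None   # DNS domain; only meaningful for proprietary IDs

    @property
    def is_proprietary(self) -> bool:
        return self.id_type == "Proprietary"


# A subset of the alternate IDs listed for 10.5240/EA73-79D7-1B2B-B378-3A73-M.
BLADE_RUNNER_ALT_IDS = [
    AlternateID("0000-0000-14A9-0000-K-0000-0000-E", "ISAN"),
    AlternateID("tt0083658", "IMDB"),
    AlternateID("B000SW4DLM", "Proprietary", "amazon.com"),
    AlternateID("B001EC2J1G", "Proprietary", "amazon.com"),
]


def find_alternate_ids(ids: List[AlternateID], id_type: str,
                       domain: Optional[str] = None) -> List[str]:
    """Return the values of all alternate IDs with the given type (and domain, if given)."""
    return [a.value for a in ids
            if a.id_type == id_type and (domain is None or a.domain == domain)]
```

Note that a proprietary domain can carry more than one ID for the same asset, so a lookup returns a list rather than a single value.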

Relationships between objects

Content objects can be related to each other according to the following table. These relations are expressed as additional fields in the content record and are thus relative to that object. Note that the subject object is the child and the target is the parent (e.g. subject isOf parent). Additional constraints are noted in the table.

Inheritance relationships: The object on which the relationship exists can inherit basic metadata fields from the object to which the relationship refers. Only one inheritance relationship may exist on an object. These relationships produce a tree structure rooted in the EIDR ID for an abstraction.

  isSeasonOf: A group of series episodes released over a contiguous span of time (e.g. broadcast year). For example, 10.5240/AB95-8734-5D98-A282-2DF0-C ("Season 9") is a season of 10.5240/C272-DA64-E2B5-0A78-2AC3-Z ("The X-Files").
  isEpisodeOf: For example, 10.5240/E008-224D-0397-0560-6300-8 ("Sunshine Days") is an episode of 10.5240/AB95-8734-5D98-A282-2DF0-C ("Season 9").
  isEditOf: An instance of a title with unique characteristics that differentiate it from any other version. For example, 10.5240/7290-C8AD-12BA-4F93-3B07-7 ("Blade Runner: The Director's Cut") is an edit of 10.5240/EA73-79D7-1B2B-B378-3A73-M.
  isManifestationOf: A manifestation is a more specific instance of a work that can be sold, transmitted, transferred or played. The parent of a manifestation should be an edit. For example, 10.5240/9CE1-DE39-5F3E-073D-4307-7 is the Ultraviolet Standard CFF (standard definition, English audio and subtitles) for "Blade Runner: The Director's Cut". It is a manifestation of the abstract work 10.5240/EA73-79D7-1B2B-B378-3A73-M.
  isClipOf: One (and only one) contiguous fragment of an asset.

Dependence relationships: The objects to which the relationship refers have a strong bearing on the basic nature of the object on which the relationship exists. This means that the objects referred to in the relationship must be taken into account when checking for duplicates when an object is created or modified. These relationships produce directed graphs within and across trees.

  isCompositeOf: A single work composed of parts of multiple other records.
  isCompilationOf: A collection of multiple whole works that is not more precisely describable.

Lightweight relationships: There is no inheritance; the objects to which they refer do not influence the underlying nature of the object on which the relationship exists. These relationships are used primarily when moving around within the object tree and connecting object trees to each other, producing a directed graph across elements of those trees.

  isPackagingOf: For creating a collection of assets that are released together. For example, 10.5240/F219-975E-5990-4570-BA75-2 ("Hannah Montana and Miley...") is a packaging of 10.5240/9ABE-2BF1-ACE7-EBA2-8E57-N.
  isPromotionOf: Promotional objects such as a trailer.
  isSupplementTo: Ancillary material that might be found on a DVD, such as an outtake or behind-the-scenes feature.
  isAlternateContentFor: Content that is synchronized to the main asset, such as audio or an alternate camera angle.
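Because an object may carry at most one inheritance relationship, the inheritance relationships can be modeled as a simple child-to-parent map, and the root of a tree is found by walking upward. The sketch below uses the example IDs from the relationships described above; the in-memory dictionary and the function names are assumptions for illustration and do not reflect how the registry stores or exposes these links.

```python
# child ID -> (relationship, parent ID); at most one inheritance link per object.
INHERITS_FROM = {
    "10.5240/E008-224D-0397-0560-6300-8": ("isEpisodeOf", "10.5240/AB95-8734-5D98-A282-2DF0-C"),
    "10.5240/AB95-8734-5D98-A282-2DF0-C": ("isSeasonOf", "10.5240/C272-DA64-E2B5-0A78-2AC3-Z"),
    "10.5240/7290-C8AD-12BA-4F93-3B07-7": ("isEditOf", "10.5240/EA73-79D7-1B2B-B378-3A73-M"),
}


def ancestry(eidr_id: str) -> list:
    """Return the chain of IDs from the given object up to the root of its tree."""
    chain = [eidr_id]
    while chain[-1] in INHERITS_FROM:
        _relationship, parent = INHERITS_FROM[chain[-1]]
        chain.append(parent)
    return chain


def root(eidr_id: str) -> str:
    """Return the root (abstraction) of the tree containing the given object."""
    return ancestry(eidr_id)[-1]
```

Here the episode "Sunshine Days" walks up through "Season 9" to "The X-Files", which has no inheritance link of its own and is therefore the root.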

Use in standards and applications

EIDR has been incorporated into many standards. A few of the more significant ones are listed here:

EIDR identifiers have found their way into an increasing number of commercial applications. The following are illustrative of some of the advantages of using EIDR:

Operations & Administrative

EIDR is administered by the non-profit EIDR Association, which was founded in October 2010 by MovieLabs, CableLabs, Comcast and Rovi. Membership has grown steadily since then: as of late 2014 it had 79 members, divided between the Industry Promoter and Industry Contributor levels. The fastest-growing category at that time was non-US companies, which accounted for about 20% of membership.

The EIDR Association operates two EIDR registries: Production and Sandbox. The former is the official site, and the latter is reserved for test and development. Both systems are available publicly online, but the contents of the sandbox are not guaranteed to be correct, complete, or even to refer to assets that exist. Only members of the EIDR Association may modify the registry.

Registration

Registration of new assets can be done individually or in bulk (up to 100,000 assets at a time). In either case, the workflow comprises a combination of automated (to perform well-defined but tedious tasks) and manual (where human judgment is called for) processes. It is also iterative, as the initial matching process may identify a variety of gaps and errors that need to be dealt with.

Registering new assets is a complex process that requires some preparation, particularly in the case of bulk submission. The automated processes will check syntax, make sure that the basic metadata is supplied, and that any dependencies (e.g. series records created before constituent episodes) are honored. Manual steps include making sure the correct Parties are associated with the asset. One of the most important steps is ensuring that a new asset does not already exist in the registry: this is covered in the next section.
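One preparation step for a bulk submission is ordering the batch so that every parent record (e.g. a series) is submitted before the records that depend on it (e.g. its episodes). The following is a minimal sketch of such an ordering, assuming the batch is described by a hypothetical mapping from each record's local key to its parent's local key (or `None`). It is not the registry's own dependency check.

```python
from typing import Dict, List, Optional


def registration_order(parent_of: Dict[str, Optional[str]]) -> List[str]:
    """Order batch records so that each parent precedes its children.

    Parents that are not part of the batch (for example, records already
    in the registry) are ignored for ordering purposes.
    """
    ordered: List[str] = []
    seen = set()

    def visit(key: Optional[str]) -> None:
        if key not in parent_of or key in seen:
            return
        seen.add(key)
        visit(parent_of[key])   # place the parent first
        ordered.append(key)

    for key in parent_of:
        visit(key)
    return ordered


# Hypothetical batch keyed by local names rather than EIDR IDs.
batch = {
    "episode-1": "season-9",
    "season-9": "series",
    "series": None,
    "episode-2": "season-9",
}
order = registration_order(batch)
```

With this batch, `order` lists the series first, then the season, then the two episodes.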

In order to register a new asset a user must be associated with a party that has been granted the "Registrant" role by the EIDR operator. A registrant may be a principal agent, such as a studio or an encoding house, but it may also be a Party doing bulk registration of back-catalogue items, or a Party acting on behalf of someone else. It is also a requirement that a registrant be an EIDR member. In general, content ownership, metadata authority, and registration capability are separate and unrelated concepts.

Deduplication

This refers to flagging assets being submitted to the registry as falling into one of the following three categories:

This assessment is based on applying a (large) set of rules to the candidate asset, which results in a numerical score. Bucketing is done by comparing the score to two thresholds:

Assets falling between the low and high threshold are deemed to have a high possibility of being a duplicate: the proposed record addition/modification will not proceed until manually reviewed by EIDR operations staff.
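The two-threshold bucketing can be sketched as follows. This is only an illustration under stated assumptions: that a higher score means a more likely duplicate, that scores at or above the high threshold are treated as duplicates and scores at or below the low threshold as new assets, and that the threshold values and bucket labels shown are hypothetical. The rule set that produces the score is not modeled.

```python
def bucket(score: float, low: float, high: float) -> str:
    """Place a deduplication score into one of three buckets."""
    if low > high:
        raise ValueError("low threshold must not exceed high threshold")
    if score >= high:
        return "duplicate"      # assumed: treated as already registered
    if score <= low:
        return "unique"         # assumed: treated as a new asset
    return "manual_review"      # between the thresholds: held for operations staff


# Hypothetical threshold values, for illustration only.
LOW, HIGH = 0.30, 0.80
```

For instance, `bucket(0.55, LOW, HIGH)` returns `"manual_review"`, reflecting the case where the proposed change does not proceed until it has been reviewed.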

Architecture

The components of the EIDR system are shown below.

The principal functional blocks are as follows:

Relation to DOI and Handle System

An EIDR ID is a specialized example of a Digital Object Identifier (DOI), which in turn is built on top of the Handle System developed by the Corporation for National Research Initiatives (CNRI). The EIDR-specific aspects of the lower layers are described in more detail below.

Digital Object Identifier (EIDR Aspects)

A Digital Object Identifier, standardized as ISO 26324,[16] seeks to uniquely identify a wide range of digital artifacts including books, recordings, research data, and other digital content. The goal is for the IDs to be not just unique, but also persistent and immutable. Unlike URLs, DOI identifiers stay the same even if the objects move to another location or become owned by another organization. Here are some of the characteristics of DOI:

The DOI data model provides the means to associate metadata with each object, as well as policies governing its use. In the words of the DOI Handbook, metadata may include "names, identifiers, descriptions, types, classifications, locations, times, measurements, relationships and any other kind of information related to [an object]." Metadata flows between the following entities:

To foster interoperability between RAs, DOI has the concept of a metadata Kernel. This is a core set of metadata that all objects stored within the DOI framework should have. The full set may be found in the DOI handbook. Interoperability is a large topic extending beyond the scope of EIDR, but the following subset is particularly relevant to EIDR assets:

EIDR metadata is available in standard DOI kernel metadata format as well as EIDR-specific formats. The DOI metadata schema is itself identified by a DOI.

Handle System (EIDR Aspects)

DOI is in turn implemented on top of the Handle System, a distributed, highly scalable, name resolution service. A handle is defined as:

<Handle> ::= <Handle Naming Authority> "/" <Handle Local Name>

The Naming Authority is globally unique and defines both an administrative space and the syntax of the Handle Local Name. For an EIDR Content ID, "10.5240" is the Naming Authority; it is responsible for resolving the suffix, including verifying that the suffix conforms to the expected syntax for an EIDR asset. The range of allowable Naming Authorities is more general than is employed by DOI (or EIDR).
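Splitting a handle into its naming authority and local name is a matter of cutting at the first "/" character. The helper below is a small sketch of that split; the function name is illustrative and the code does not talk to the Handle System itself.

```python
from typing import Tuple


def split_handle(handle: str) -> Tuple[str, str]:
    """Split a handle into (naming authority, local name) at the first '/'."""
    authority, separator, local_name = handle.partition("/")
    if not separator or not authority or not local_name:
        raise ValueError(f"not a well-formed handle: {handle!r}")
    return authority, local_name


authority, local_name = split_handle("10.5240/EA73-79D7-1B2B-B378-3A73-M")
```

Cutting at the first slash only means that any further "/" characters stay in the local name, which matters for handles in general even though EIDR Content ID suffixes do not contain one.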

The distributed nature of the Handle System allows each local namespace to be hosted on multiple geographically distributed service sites. This is a federated model where each local name space has complete control over the placement and operation of its service sites. Furthermore, each service site may contain multiple resolution servers: requests directed to a particular service site will be dispatched evenly across its constituent servers.

The data model of the Handle System is simple but flexible. An arbitrary number of values may be associated with each handle. Over time, these values may be created, modified, and destroyed. Each such datum has the following attributes:

Accessing the Handle System is done via a wire protocol defined in RFC 3652; EIDR applications don't have to be concerned with this because of the layering of protocols.

Further reading

  1. R. Kroon, R. Drewry, A. Leigh, S. McConnachie. "Content Identification for Audiovisual Archives". International Association of Sound and Audiovisual Archives Journal, Summer 2015 (No. 45).
  2. R. Kroon. "Bringing Order to Digital Identifiers". Media and Entertainment Journal Winter 2014-2015: 148–150.
  3. R. Drewry, D. Dulchinos. "Transforming Entertainment Through Technology". Media and Entertainment Journal Winter 2013-2014: 81–88.
  4. D. Agranoff, W. Michel, T. Wakai. "Streamlined Content Metadata Integration and Management Using Entertainment ID Registry (EIDR)". SCTE Cable-Tec Expo 2012.

References

  1. "Batman comic fetches $1.075 million, rewrites record". Reuters. 2010-02-26. Retrieved 2023-08-18.
  2. http://www.iso.org/iso/catalogue_detail.htm?csnumber=31531 ISO/IEC 7064:2003
  3. http://www.w3.org/TR/xmlschema-2/#duration W3C XML Schema Part 2: Datatypes Second Edition
  4. https://kws.smpte.org/kws/public/projects/project/details?project_id=162 SMPTE RP 2079
  5. Advanced Media Workflow Association AS-03 MXF Program Delivery Specification.
  6. Advanced Media Workflow Association AS-11 MXF for Contribution Specification.
  7. http://standards.smpte.org/content/978-1-61482-774-0/rp-2021-5-2013/SEC1.refs SMPTE RP 2021-5:2013
  8. https://tech.ebu.ch/docs/tech/tech3293.pdf EBU TECH 3293
  9. https://www.dvb.org/resources/public/standards/a167-2_dvb-css_part_2-content-id-and-media-sync.zip DVB Document A167-2
  10. http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber=66430 ISO/IEC CD 23000-15
  11. http://www.cablelabs.com/wp-content/uploads/specdocs/MD-SP-AMIv3.0-I02-121210.pdf MD-SP-AMIv3.0-I02-121210
  12. http://www.scte.org/documents/pdf/standards/ANSI_SCTE%2035%202013.pdf ANSI/SCTE 35 2013
  13. http://www.scte.org/documents/pdf/standards/SCTE%20130-10%202013.pdf SCTE 130-10 2013
  14. http://filmstandards.org/fsc/index.php/How_EN_15744_and_EN_15907_came_into_being TC 372 Workshop Compendium
  15. http://eidr.org/swisscom-ppsmedia-press-tv Press Release
  16. http://www.iso.org/iso/catalogue_detail?csnumber=43506 ISO 26324:2012