Annodex is a digital media format developed by CSIRO to provide annotation and indexing of continuous media, such as audio and video.
It is based on the Ogg container format, with an XML language called CMML (Continuous Media Markup Language) providing additional metadata. It is intended to create a Continuous Media Web (CMWeb), whereby continuous media can be manipulated in a similar manner to text media on the World Wide Web, including searching and dynamic arrangement of elements.
The specific design of the elements of the Continuous Media Web project was devised by Silvia Pfeiffer and Conrad Parker at CSIRO Australia in mid-2001. Some of the ideas behind CMML and the generic addressing of temporal offsets were proposed in a 1997 paper by Bill Simpson-Young and Ken Yap.
In January 2002 the Annodex team took on two students, Andrew Nesbit and Andre Pang, along with Simon Lai, who became the first person to author meaningful content in CMML. During this time the basics of the Annodex technology were designed, including temporal URI fragments, the basic DTDs, the choice of the Ogg encapsulation format and the initial design of the libraries.
By late 2004, Andre Pang had developed the Annodex plug-in for the Mozilla Firefox browser, allowing playback of Annodex media encoded with the Ogg Theora video codec and the Ogg Vorbis audio codec. Time URIs entered in the Location Bar provide server-side seeking on Annodex media, and a table-of-contents clip list for CMML content enables hyperlinking into and out of Annodex media.
Over time, the open-source community contributed increasingly to Annodex technology, starting with Debian packages by Jamie Wilkinson, Python bindings by Ben Leslie, and Perl bindings by Angus Lees. The command-line authoring tools were completed early in 2001 and were continually updated, adhering to Version 3 of the Annodex annotation standards by 2005.[1]
In November 2005, CSIRO wanted to focus on closed-source research and build products on top of the technology, and so lost interest in its open-source standard components. A decision was therefore made to separate the open-source components into their own organisation by creating an Annodex Foundation, similar in spirit to the many foundations that have been created around other FOSS technologies.[2]
The core technical specification documents on Annodex are being developed through the Annodex community. They consist of the following components:
Continuous Media Markup Language is an XML markup language for time-continuous data such as audio and video. The main principles of CMML are as follows:
<cmml>
  <stream timebase="0">
    <import src="galaxies.mpg" contenttype="video/mpeg"/>
  </stream>
  <head>
    <title>Hidden Galaxies</title>
    <meta name="author" content="CSIRO"/>
  </head>
  <clip id="findingGalaxies" start="15">
    <a href="http://www.aao.gov.au/galaxies.anx#radio">
      Related video on detection of galaxies
    </a>
    <img src="galaxy.jpg"/>
    <desc>What's out there?</desc>
    <meta name="KEYWORDS" content="Radio Telescope"/>
  </clip>
</cmml>
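Because CMML is plain XML, a document like the one above can be read with any general-purpose XML parser. The following is a minimal sketch using Python's standard library, not one of the Annodex tools or bindings; the function name `list_clips` and the shape of the returned dictionaries are assumptions made for illustration, and the `start` value is kept as the raw attribute string rather than interpreted.

```python
import xml.etree.ElementTree as ET

# The CMML example from the text, reproduced so the sketch is self-contained.
CMML_SAMPLE = """<cmml>
  <stream timebase="0">
    <import src="galaxies.mpg" contenttype="video/mpeg"/>
  </stream>
  <head>
    <title>Hidden Galaxies</title>
    <meta name="author" content="CSIRO"/>
  </head>
  <clip id="findingGalaxies" start="15">
    <a href="http://www.aao.gov.au/galaxies.anx#radio">Related video on detection of galaxies</a>
    <img src="galaxy.jpg"/>
    <desc>What's out there?</desc>
    <meta name="KEYWORDS" content="Radio Telescope"/>
  </clip>
</cmml>"""


def list_clips(cmml_text):
    """Return one dict per <clip> element (hypothetical helper, not an Annodex API)."""
    root = ET.fromstring(cmml_text)
    clips = []
    for clip in root.findall("clip"):
        link = clip.find("a")
        clips.append({
            "id": clip.get("id"),
            "start": clip.get("start"),  # raw attribute value, left uninterpreted
            "href": link.get("href") if link is not None else None,
            "desc": (clip.findtext("desc") or "").strip(),
        })
    return clips


if __name__ == "__main__":
    for clip in list_clips(CMML_SAMPLE):
        print(clip["id"], clip["start"], clip["href"])
```

A list of this kind is roughly what a table-of-contents view needs: each clip carries an identifier, a start position and an outgoing hyperlink.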
The origin of the CMML document, along with further documentation and standards, can be found at Annodex CMML Standard Version 2.1.
Annodex is an encapsulation format that interleaves time-continuous data with CMML markup in a streamable manner. The Annodex format is built on the Ogg encapsulation format to allow internet servers and proxies to manage temporal subparts and reconstruct files from annodexed clips. This introduces the following stream types:
Further information can be found at Annodex Annotation Format for Time-continuous Bitstreams, Version 3.0.
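The idea of interleaving markup with media in time order can be pictured with a small sketch. This is a conceptual illustration only and does not reproduce the Ogg page layout or any Annodex library; the tuple shape `(time, kind, payload)` and the function name `interleave` are assumptions chosen for the example.

```python
import heapq


def interleave(media_packets, clip_tags):
    """Merge two time-sorted sequences of (time, kind, payload) tuples.

    Clip tags are listed first so that, when a tag and a media packet share
    the same timestamp, the tag precedes the media it annotates
    (heapq.merge keeps equal keys in input-iterable order).
    """
    return list(heapq.merge(clip_tags, media_packets, key=lambda item: item[0]))


if __name__ == "__main__":
    media = [(0.0, "media", "p0"), (10.0, "media", "p1"), (20.0, "media", "p2")]
    clips = [(15.0, "clip", "findingGalaxies")]
    for time, kind, payload in interleave(media, clips):
        print(time, kind, payload)
```

Keeping annotations next to the media they describe is what makes the result streamable in this picture: a receiver that starts part-way through can pick up markup and media from the same position.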
To integrate time-continuous content such as audio and video into the Web, hyperlinks must be able to point into that content at temporal offsets. Further information can be found at Annodex Time Intervals in URI Queries and Fragments.
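As a rough illustration of addressing a temporal offset, the sketch below extracts an offset in seconds from a URI. It assumes the offset is carried in a `t=` parameter in either the fragment or the query, optionally prefixed with `npt:`; that syntax, the `example.com` URLs and the function name `temporal_offset` are assumptions for the example rather than details stated here, and only a single numeric offset (not an interval or other time scheme) is handled.

```python
from urllib.parse import parse_qs, urlsplit


def temporal_offset(uri):
    """Return the offset in seconds from a 't=' fragment or query, else None.

    Hypothetical helper: assumes values such as 't=30' or 't=npt:15.2'.
    """
    parts = urlsplit(uri)
    # Check the fragment first, then the query string.
    for component in (parts.fragment, parts.query):
        values = parse_qs(component).get("t")
        if values:
            value = values[0]
            if value.startswith("npt:"):
                value = value[len("npt:"):]
            return float(value)
    return None


if __name__ == "__main__":
    print(temporal_offset("http://example.com/video.anx?t=npt:15.2"))
    print(temporal_offset("http://example.com/video.anx#t=30"))
    print(temporal_offset("http://www.aao.gov.au/galaxies.anx#radio"))
```

In this sketch a named fragment such as `#radio` yields no offset, since resolving a clip name to a time would require the CMML markup itself.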