MSML explained

The Media Server Markup Language (MSML) is used to control and invoke many different types of services on IP Media Servers and is described in RFC 5707.[1] Clients can use it to define how multimedia sessions interact on a Media Server and to apply services to individuals or groups of users. MSML can be used, for example, to control Media Server conferencing features such as video layout and audio mixing, to create sidebar conferences or personal mixes, and to set the properties of media streams. Clients can also use MSML to define media processing dialogs, which may form part of an application's interactions with users or conferences. Examples of such interactions specified using MSML include IVR dialogs and the transformation of media streams to and from users or conferences. MSML clients may also use VoiceXML to invoke dialogs with individual users or with groups of conference participants.

The fundamental model in MSML is that the Media Server is an appliance specialized in controlling and manipulating media streams (usually RTP), while the application server is a separate unit that sets up and tears down call connections and controls the application (or business) logic. For example, the application server would deal with the billing engine and logging systems. The application server establishes a control 'tunnel' (through SIP or IP), which it uses to exchange requests and responses with the media server. In the case of MSML media servers, the messages are coded in MSML, a control language that uses XML syntax. MSML is designed so that an application server can interact with a number of different media servers at the same time, and these can be distributed across a wide geography as long as they are reachable via IP. The converse is also true: a media server can have more than one application server talking to it, which allows for resilience to failure.
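As a rough illustration of the request side of this exchange, the following Python sketch builds an MSML document that an application server might send to create a conference and join two connections to it. The element and attribute names (msml, createconference, audiomix, join, id1, id2) are modeled on the examples in RFC 5707 and should be checked against the RFC before use. The function name, the conference name, and the connection identifiers are hypothetical, and the sketch does not cover transporting the message over the control channel.

```python
import xml.etree.ElementTree as ET


def build_conference_request(conf_name, connection_ids):
    """Return an MSML request as a string (hypothetical helper).

    The request creates an audio conference and joins each given
    connection to it. Element names are assumed from RFC 5707 examples.
    """
    root = ET.Element("msml", {"version": "1.1"})

    # Ask the media server to create a conference with an audio mixer.
    conference = ET.SubElement(root, "createconference", {"name": conf_name})
    ET.SubElement(conference, "audiomix")

    # Join each call leg (connection) to the new conference.
    for conn_id in connection_ids:
        ET.SubElement(
            root,
            "join",
            {"id1": f"conn:{conn_id}", "id2": f"conf:{conf_name}"},
        )

    return ET.tostring(root, encoding="unicode")


# Illustrative values only; real connection identifiers come from the
# signaling between the application server and the media server.
request = build_conference_request("example", ["abc123", "def456"])
print(request)
```

Because the request is plain XML, the same application server could send an equivalent document to any number of media servers, which is what makes the distributed arrangement described above straightforward.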

MSML was originally created by Convedia (now part of RadiSys) and is an open standard, meaning that companies can use the technology without licensing intellectual property. A number of companies have adopted MSML, including Intel (now Dialogic), NMS and AudioCodes.

MSML covers some of the same ground as the earlier MSCML markup language (originally from Snowshore), and both languages are important references for the IETF MediaCTRL (media control) working group, which aims to standardize control of media servers. MSML creator Adnan Saleem acknowledged[2] that MSCML had "shown the way" for driving media servers via scripting, and so a family line can be seen from MSCML through MSML to today's MediaCTRL[3] working group at the IETF.

Notes and References

  1. http://tools.ietf.org/html/rfc5707 Media Server Markup Language (MSML), Feb 2010, A.Saleem, Y.Xin, G.Sharratt
  2. http://www.tmcnet.com/ims/0207/ims-feature-article-processing-ip-media-with-msml.htm Processing IP Media with MSML, M.Davies, IMS Magazine, Feb 2007
  3. http://www.ietf.org/html.charters/mediactrl-charter.html MediaCTRL charter, Burger, Dawkins