Lightweight markup language explained

A lightweight markup language (LML), also termed a simple or humane markup language, is a markup language with simple, unobtrusive syntax. It is designed to be easy to write using any generic text editor and easy to read in its raw form. Lightweight markup languages are used in applications where it may be necessary to read the raw document as well as the final rendered output.

For instance, a person downloading a software library might prefer to read the documentation in a text editor rather than a web browser. Another application for such languages is to provide for data entry in web-based publishing, such as blogs and wikis, where the input interface is a simple text box. The server software then converts the input into a common document markup language like HTML.
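
The text-box scenario above can be illustrated with a minimal sketch. It assumes the simplest possible convention, in which blank lines separate paragraphs; the function name `render_plain_input` is hypothetical and not taken from any particular blog or wiki engine.

```python
import html
import re


def render_plain_input(text: str) -> str:
    """Convert raw text-box input into HTML paragraphs (illustrative only)."""
    # Escape first, so that characters typed by the user (<, >, &)
    # are shown literally instead of being interpreted as HTML.
    escaped = html.escape(text.replace("\r\n", "\n"), quote=False)
    # Assumed convention: one or more blank lines separate paragraphs.
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", escaped)]
    return "\n".join(f"<p>{p}</p>" for p in paragraphs if p)


print(render_plain_input("Hello <world>\n\nSecond & last"))
```

Real server software typically layers inline and block rules on top of a step like this; the sketch only shows the escape-then-wrap order.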

History

Lightweight markup languages were originally used on text-only displays which could not display characters in italics or bold, so informal methods to convey this information had to be developed. This formatting choice was naturally carried forth to plain-text email communications. Console browsers may also resort to similar display conventions.

In 1986, the international standard SGML provided facilities to define and parse lightweight markup languages using grammars and tag implication. XML, published by the W3C in 1998, is a profile of SGML that omits these facilities. However, no SGML document type definition (DTD) is known for any of the languages listed below.

Types

Lightweight markup languages can be categorized by their tag types. Like HTML (<b>bold</b>), some languages use named elements that share a common format for start and end tags (e.g. BBCode [b]bold[/b]), whereas proper lightweight markup languages restrict their tags to ASCII punctuation marks and other non-letter symbols. Some mix both styles (e.g. Textile bq. ) or allow embedded HTML (e.g. Markdown), possibly extended with custom elements (e.g. MediaWiki <ref>source</ref>).
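
Because named-element tags share one format for start and end tags, a single pattern with a backreference can handle a whole family of them. The following is a rough sketch, assuming a small whitelist of BBCode tags (b, i, u) that map to HTML elements of the same name; the function name `bbcode_to_html` is hypothetical, and real BBCode processors also handle attributes, escaping and malformed input.

```python
import re

# One pattern covers every whitelisted tag: the backreference \1 requires
# the closing tag to carry the same name as the opening tag.
_TAG_PAIR = re.compile(r"\[(b|i|u)\](.*?)\[/\1\]", re.DOTALL)


def bbcode_to_html(text: str) -> str:
    """Translate [b]/[i]/[u] pairs to <b>/<i>/<u> (illustrative only)."""
    while True:
        # Each pass converts the outermost matched pairs; repeating until
        # nothing changes also converts pairs nested inside them.
        converted = _TAG_PAIR.sub(r"<\1>\2</\1>", text)
        if converted == text:
            return converted
        text = converted


print(bbcode_to_html("[b]bold[/b] and [i]italic[/i]"))
```

Punctuation-based tags cannot be handled this generically, since each symbol has its own meaning and pairing rules.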

Most languages distinguish between markup for lines or blocks and for shorter spans of texts, but some only support inline markup.

Some markup languages are tailored for a specific purpose, such as documenting computer code (e.g. POD, reST, RD) or being converted to a certain output format (usually HTML or LaTeX) and nothing else; others are more general in application. Languages also differ in whether they are oriented toward textual presentation or toward data serialization.

Presentation-oriented languages include AsciiDoc, atx, BBCode, Creole, Crossmark, Djot, Epytext, Haml, JsonML, MakeDoc, Markdown, Org-mode, POD (Perl), reST (Python), RD (Ruby), Setext, SiSU, SPIP, Xupl, Texy!, Textile, txt2tags, UDO and Wikitext.

Data-serialization-oriented languages include Curl (homoiconic, but also reads JSON; every object serializes), JSON, and YAML.

Comparison of language features

Language	HTML export tool	HTML import tool	Tables	Link titles	Release date
AsciiDoc					2002-11-25^[1]
BBCode					1998
Creole					2007-07-04^[2]
Djot		^[3]			2022-07-30^[4]
DokuWiki					2004-07-04^[5]
Gemtext					2020
GitHub Flavored Markdown					2011-04-28+
Jira Formatting Notation					2002+^[6]
Markdown					2004-03-19^[7] ^[8]
Markdown Extra			^[9]		2013-04-11^[10]
MediaWiki					2002^[11]
MultiMarkdown					2009-07-13
Org-mode		^[12]			2003^[13]
PmWiki	^[14]				2002-01
POD					1994
reStructuredText					2002-04-02^[15]
setext					1992^[16]
Slack					2013+^[17] ^[18]
TiddlyWiki					2004-09^[19]
Textile					2002-12-26^[20]
Texy					2004^[21]
txt2tags		^[22]	^[23]		2001-07-26^[24]
WhatsApp					2016-03-16^[25]

Markdown's own syntax does not support class attributes or id attributes; however, since Markdown supports the inclusion of native HTML code, these features can be implemented using direct HTML. (Some extensions may support these features.)

txt2tags' own syntax does not support class attributes or id attributes; however, since txt2tags supports inclusion of native HTML code in tagged areas, these features can be implemented using direct HTML when saving to an HTML target.^[26]

DokuWiki does not support HTML import natively, but HTML to DokuWiki converters and importers exist and are mentioned in the official documentation.^[27] DokuWiki does not support class or id attributes, but it can be set up to support HTML code, which does support both features. HTML code support was built in before release 2023-04-04.^[28] In later versions, HTML code support can be achieved through plugins, though this is discouraged.

Comparison of implementation features


	Language	Implementations					DOC (X)	LMLs	Other	License
	AsciiDoc	Python, Ruby, JavaScript, Java							Man page etc.	GNU GPL, MIT
	BBCode	Perl, PHP, C#, Python, Ruby								Public Domain
	Creole	PHP, Python, Ruby, JavaScript^[29]	Depends on implementation							CC BY-SA 1.0
	Djot	Lua (originally), JavaScript, Prolog, Rust								MIT
	GitHub Flavored Markdown	Haskell (Pandoc)							OPML	GPL
	GitHub Flavored Markdown	Java,^[30] JavaScript,^[31] ^[32] ^[33] PHP,^[34] ^[35] Python,^[36] Ruby^[37]								Proprietary
	Markdown	Perl (originally), C,^[38] ^[39] Python,^[40] JavaScript, Haskell, Ruby,^[41] C#, Java, PHP								BSD-style & GPL (both)
	Markdown Extra	PHP (originally), Python, Ruby								BSD-style & GPL (both)
	MediaWiki	Perl, PHP, Haskell, Python								GNU GPL
	MultiMarkdown	C, Perl							OPML	GPL, MIT
	Org-mode	Emacs Lisp, Ruby (parser only), Perl, OCaml					^[42]	Markdown	TXT, XOXO, iCalendar, Texinfo, man, contrib: groff, s5, deck.js, Confluence Wiki Markup,^[43] TaskJuggler, RSS, FreeMind	GPL
	PmWiki	PHP								GNU GPL
	POD	Perl							Man page, plain text	Artistic License, Perl's license
	reStructuredText	Python,^[44] ^[45] Haskell (Pandoc), Java							man, S5, Devhelp, QT Help, CHM, JSON	Public Domain
	Textile	PHP, JavaScript, Java, Perl, Python, Ruby, ASP, C#, Haskell								Textile License
	Texy!	PHP, C#								GNU GPL v2 License
	txt2tags	Python,^[46] PHP^[47]							roff, man, MagicPoint, Lout, PageMaker, ASCII Art, TXT	GPL

Comparison of lightweight markup language syntax

Inline span syntax

Although usually documented as yielding italic and bold text, most lightweight markup processors output the semantic HTML elements em and strong instead. Monospaced text may result in either semantic code elements or presentational tt elements. Few languages make a distinction (e.g. Textile) or allow the user to configure the output easily (e.g. Texy).

LMLs sometimes differ in multi-word markup: some require the markup characters to replace the inter-word spaces (infix). Some languages require a single character as prefix and suffix; others need doubled or even tripled ones, or support both with slightly different meanings, e.g. different levels of emphasis.
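
As a sketch of how single and doubled delimiters can map to different levels of emphasis, the following assumes a Markdown-like convention (`*text*` for em, `**text**` for strong, backticks for code). The function name `render_inline` is hypothetical, and the regular expressions deliberately ignore escaping, nesting and the word-boundary rules that real processors apply.

```python
import re


def render_inline(text: str) -> str:
    """Render a few Markdown-like inline spans as semantic HTML (illustrative only)."""
    # Doubled delimiters are handled first; otherwise "**x**" would be
    # consumed by the single-delimiter rule.
    text = re.sub(r"\*\*(.+?)\*\*", r"<strong>\1</strong>", text)
    # Single delimiters yield the weaker level of emphasis.
    text = re.sub(r"\*(.+?)\*", r"<em>\1</em>", text)
    # Backticks yield the semantic code element rather than presentational tt.
    text = re.sub(r"`(.+?)`", r"<code>\1</code>", text)
    return text


print(render_inline("*light* and **heavy** `mono`"))
```

The ordering of the substitutions is the main design point: the longest delimiter has to be tried before its shorter prefix.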

Notes and References

[1] AsciiDoc ChangeLog. 2017-02-24.
[2] WikiCreole Versions. 2017-02-24.
[3] djot. 2023-08-26.
[4] djot 0.1.0. 2023-08-26.
[5] DokuWiki old_changes. 2024-11-26.
[6] Text Formatting Notation Help. Atlassian. Jira. 2020-12-22.
[7] Markdown. 2004-03-19. Aaron Swartz: The Weblog.
[8] Daring Fireball: Markdown. https://web.archive.org/web/20040402182332/http://daringfireball.net/projects/markdown/. 2004-04-02. 2014-04-25.
[9] PHP Markdown Extra. Michel Fortin. 2013-10-08.
[10] PHP Markdown: History. Michel Fortin. 2020-12-23.
[11] MediaWiki history. 2017-02-24.
[12] Pandoc. http://johnmacfarlane.net/pandoc/
[13] Org mode for Emacs – Your Life in Plain Text. orgmode.org. OrgMode team. 2016-12-09.
[14] PmWiki Cookbook - Export addons. 2018-01-07.
[15] An Introduction to reStructuredText. 2017-02-24.
[16] 1992-01-06. TidBITS in new format. 2022-07-01. TidBITS.
[17] Slack Help Center > Using Slack > Send messages > Format your messages. 2018-08-07.
[18] Slack API documentation: Basic message formatting. 2018-08-07.
[19] History of TiddlyWiki. tiddlywiki.com.
[20] Textism › Tools › Textile. https://web.archive.org/web/20021226035527/http://textism.com/tools/textile/. 2002-12-26. textism.com.
[21] What is Texy. 2017-02-24.
[22] Html2wiki txt2tags module. cpan.org. 2014-01-30.
[23] Txt2tags User Guide. Txt2tags.org. 2017-02-24.
[24] txt2tags changelog. 2017-02-24.
[25] WhatsApp FAQ: Formatting your messages. 2017-11-21.
[26] Txt2tags User Guide. Txt2tags.org. 2017-02-24.
[27] DokuWiki Tips htmltowiki. 2024-11-26.
[28] DokuWiki FAQ html. 2024-11-26.
[29] Converters. WikiCreole. 2013-10-08.
[30] pegdown. https://github.com/sirthias/pegdown
[31] gfms. https://github.com/ypocat/gfms
[32] marked. https://github.com/chjj/marked
[33] node-gfm. https://github.com/gagle/Node-GFM
[34] Parsedown. http://parsedown.org/
[35] Ciconia. https://github.com/kzykhys/Ciconia
[36] Grip. https://github.com/joeyespo/grip
[37] github-markdown. https://rubygems.org/gems/github-markdown
[38] peg-markdown. https://github.com/jgm/peg-markdown
[39] Discount. http://www.pell.portland.or.us/~orc/Code/discount/
[40] Python-Markdown. Github.com. 2013-10-08.
[41] Bruce Williams, for Ruby Central. kramdown: Project Info. RubyForge. 2013-10-08. Dead link; archived at https://web.archive.org/web/20130807011316/http://rubyforge.org/projects/kramdown. 2013-08-07.
[42] Via ox-pandoc and pandoc itself.
[43] Atlassian. Confluence 4.0 Editor - What's Changed for Wiki Markup Users (Confluence Wiki Markup is dead). 2018-03-28.
[44] Docutils. http://docutils.sourceforge.net/docs/index.html
[45] Sphinx (documentation generator).
[46] Aurelio Jargas (www.aurelio.net). txt2tags. txt2tags. 2012-01-11. 2013-10-08.
[47] txt2tags.class.php - online convertor [sic]. Txt2tags.org. 2013-10-08.
[48] Markdown Syntax. Daringfireball.net. 2013-10-08.
[49] Textile Syntax. http://textile.thresholdstate.com/
[50] "atx, the true structured text format" by Aaron Swartz (2002). http://www.aaronsw.com/2002/atx/intro
[51] The Org Manual: section "A Cleaner Outline View". 2020-06-14. orgmanual-cleaner-outline-view.
[52] using org-adapt-indentation.
[53] using org-indent-mode or org-indent.