JMdict explained

JMdict
Company Type: Nonprofit
Type: Japanese dictionary
Language: English
Commercial: No
Current Status: Perpetual work-in-progress

JMdict (Japanese–Multilingual Dictionary) is a large machine-readable multilingual Japanese dictionary. As of March 2023, it contains Japanese–English translations for around 199,000 entries, representing 282,000 unique headword-reading combinations.[1] [2] [3] The dictionary files are free to use with attribution (Creative Commons Attribution-ShareAlike[4]). They have been widely adopted on the Internet and are used in many computer and smartphone applications. The project is considered a standard Japanese–English reference on the Internet and is used by the Unihan Database and several other Japanese–English projects.

History

The JMdict project was started by computational linguist Jim Breen in 1991 with the creation of EDICT, a plain text flat file in EUC-JP encoding. In 1999, EDICT was expanded into JMdict, a UTF-8-encoded XML file. The XML format allows an entry to carry multiple surface forms of a lexeme and multiple readings, as well as cross-references and annotations. It also permits glosses in languages other than English, and many entries include French, German, Russian, and other translations.
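The entry structure described above can be sketched with a short Python example. This is a minimal illustration, not the project's own tooling. It assumes the element names keb (surface form), reb (reading), xref (cross-reference) and gloss, with the gloss language given in an xml:lang attribute that defaults to English. The sample entry, including its sequence number, is made up for illustration and is not copied from the dictionary files.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under the XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical sample entry; element names are assumed, content is illustrative.
SAMPLE = """
<JMdict>
<entry>
<ent_seq>0000000</ent_seq>
<k_ele><keb>林檎</keb></k_ele>
<k_ele><keb>苹果</keb></k_ele>
<r_ele><reb>りんご</reb></r_ele>
<r_ele><reb>リンゴ</reb></r_ele>
<sense>
<xref>果物</xref>
<gloss>apple</gloss>
<gloss xml:lang="fre">pomme</gloss>
<gloss xml:lang="ger">Apfel</gloss>
</sense>
</entry>
</JMdict>
"""


def parse_entry(entry):
    """Collect surface forms, readings, cross-references and glosses by language."""
    glosses = {}
    for gloss in entry.findall("sense/gloss"):
        # Assumption: a gloss without xml:lang is English ("eng").
        lang = gloss.get(XML_LANG, "eng")
        glosses.setdefault(lang, []).append(gloss.text)
    return {
        "headwords": [k.text for k in entry.findall("k_ele/keb")],
        "readings": [r.text for r in entry.findall("r_ele/reb")],
        "xrefs": [x.text for x in entry.findall("sense/xref")],
        "glosses": glosses,
    }


root = ET.fromstring(SAMPLE)
entries = [parse_entry(e) for e in root.findall("entry")]
```

Here entries[0] holds two headwords, two readings, one cross-reference, and glosses grouped under the language codes eng, fre and ger. The real distribution file may use a DTD and entity references that this sketch does not handle.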

The original EDICT format is still being generated for systems that rely on that format.[5] An expanded version, EDICT2, which permits an entry to contain multiple headwords and readings as well as cross-references and additional fields, is also produced and is used by several systems including the server for WWWJDIC, Breen's own online dictionary search tool. Versions of JMdict are also produced in the XML format used by Apple's "Dict" application and in the EPWING/JIS X 4081 format used by many Japanese electronic dictionary systems.
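As a rough illustration of the flat-file formats, the following Python sketch splits one EDICT2-style line into headwords, readings and glosses. The line layout is an assumption, not taken from this article: semicolon-separated headwords, then semicolon-separated readings in square brackets, then glosses delimited by slashes. The function name parse_edict2_line and the sample lines are hypothetical.

```python
import re

# Assumed layout: HEAD1;HEAD2 [READ1;READ2] /gloss 1/gloss 2/
EDICT2_LINE = re.compile(
    r"^(?P<heads>[^\[/]+?)\s*"        # headwords up to the bracket or first slash
    r"(?:\[(?P<reads>[^\]]*)\])?\s*"  # optional bracketed readings
    r"/(?P<body>.*)/\s*$"             # slash-delimited glosses
)


def parse_edict2_line(line):
    """Return (headwords, readings, glosses) for one EDICT2-style line."""
    match = EDICT2_LINE.match(line)
    if match is None:
        raise ValueError(f"not an EDICT2-style line: {line!r}")
    headwords = [h.strip() for h in match.group("heads").split(";")]
    reads = match.group("reads")
    # Kana-only entries have no bracketed reading field.
    readings = [r.strip() for r in reads.split(";")] if reads else []
    glosses = [g for g in match.group("body").split("/") if g]
    return headwords, readings, glosses


# Illustrative lines, not copied from the dictionary file.
multi = parse_edict2_line("林檎;苹果 [りんご;リンゴ] /(n) apple/apple tree/")
kana_only = parse_edict2_line("りんご /(n) apple/")
```

A line in the original EDICT format, with a single headword and a single reading, would pass through the same function under these assumptions, since each list then has one item.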

Since 1991, JMdict has been updated and expanded by many contributors. Since 2000, the JMdict project has been managed by the Electronic Dictionary Research and Development Group (EDRDG).[6] In 2010, maintenance of the dictionary was moved to an online database system. The dictionary is managed by an editorial board including Breen and eight other editors.[7]

Since June 2021, a version of JMdict has included example sentences selected from the Tatoeba Corpus.[8]

Influence

EDICT has inspired other projects, including the CEDICT Chinese dictionary project started by Paul Denisowski in 1997,[9] and the Japanese–German dictionary WaDokuJT.[10]

Notes and References

  1. Web site: JMdict Entry Count . 7 March 2022.
  2. News: Morales . Daniel . At 180,000 entries, Jim Breen's freeware Japanese dictionary is still growing . 11 April 2019 . The Japan Times . 25 June 2018.
  3. Web site: The EDICT Dictionary File . Breen . Jim . Electronic Dictionary Research and Development Group . 8 October 2014.
  4. Web site: General Dictionary Licence Statement . EDRDG . 5 November 2020.
  5. Book: Lunde . Ken . CJKV information processing . 13 January 2009 . O'Reilly Media, Inc . 978-0-596-51447-1 . 674 . 2nd.
  6. Web site: Electronic Dictionary Research and Development Group File . 20 June 2011.
  7. Web site: JMdict Editorial Board . Breen . Jim . 8 October 2014.
  8. Web site: Tanaka Corpus - EDRDG Wiki . 2023-02-01 . www.edrdg.org.
  9. Web site: CC-CEDICT Home [CC-CEDICT WIKI] . cc-cedict.org.
  10. Ulrich Apel: Neueste Informationen zum elektronischen japanisch-deutschen Wörterbuch WaDokuJT. In: Referate des 12. Deutschsprachigen Japanologentages, Band III – Sprache, Sprachwissenschaft, Sprachlehrforschung. Bier'sche Verlagsanstalt, Bonn, 2006, pp. 141–159.