Czech National Corpus Explained

The Czech National Corpus (CNC) (Czech: Český národní korpus) is a large electronic corpus of written and spoken Czech, developed by the Institute of the Czech National Corpus (ICNC) in the Faculty of Arts at Charles University in Prague. The collection is used for teaching and research in corpus linguistics.[1] The ICNC collaborates with over 200 researchers and students (mainly on the acquisition of spoken and parallel data), 270 publishers (as text providers), and similar research projects.

Areas of focus

The Czech National Corpus focuses systematically on the following areas:[2]

Notes and References

  1. Institute of the Czech National Corpus. Institute of the Czech National Corpus. Retrieved 8 January 2019. Archived 9 January 2019 at https://web.archive.org/web/20190109062744/https://www.ff.cuni.cz/home/about/organisation/institute-of-the-czech-national-corpus/ (original link dead).
  2. Křen, Michal. Recent Developments in the Czech National Corpus. Publication Server of the Institute for German Language. Retrieved 8 January 2019.
  3. M. Hnátková, M. Křen, P. Procházka, and H. Skoumalová. The SYN-series corpora of written Czech. Proceedings of LREC2014. 2014. pp. 160–164. 2586912.
  4. L. Válková, M. Waclawičová, and M. Křen. Balanced data repository of spontaneous spoken Czech. Proceedings of LREC2012. 2012. pp. 3345–3349. Retrieved 9 January 2019.
  5. F. Čermák and A. Rosen. The case of InterCorp, a multilingual parallel corpus. International Journal of Corpus Linguistics. 2012. 17 (3). pp. 411–427. doi:10.1075/ijcl.17.3.05cer. Retrieved 9 January 2019.
  6. K. Kučera and M. Stluka. Corpus of 19th century Czech texts: Problems and solutions. Proceedings of LREC2014. 2014. pp. 165–168. Retrieved 9 January 2019.