LinguaStream is a generic platform for natural language processing, based on the incremental enrichment of electronic documents. It has been developed since 2001 at GREYC (French: Groupe de recherche en informatique, image, automatique et instrumentation de Caen), a computer science research group at the Université de Caen. It is available free of charge for private use and research purposes.[1] [2]
LinguaStream allows complex processing streams to be designed and evaluated by assembling analysis components of various types and levels: part-of-speech, syntactic, semantic, discourse-level or statistical. Each stage of the processing stream discovers and produces new information, on which the subsequent stages can rely. At the end of the stream, several tools allow the analysed documents and their annotations to be visualised.[3]
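The principle of incremental enrichment can be illustrated with a minimal Java sketch. It does not use LinguaStream's own API: the names `EnrichmentSketch`, `Document`, `Annotation`, `tokenize` and `tagCapitalised` are hypothetical and chosen only for illustration. The sketch assumes that a document is a fixed text plus a growing list of annotations, and that each stage reads the annotations left by earlier stages and adds its own.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.stream.Collectors;

// Illustrative only: NOT the LinguaStream API. All names are hypothetical.
public class EnrichmentSketch {

    /** A labelled span of text, attributed to a named annotation layer. */
    static final class Annotation {
        final String layer;
        final int start;
        final int end;
        final String value;

        Annotation(String layer, int start, int end, String value) {
            this.layer = layer;
            this.start = start;
            this.end = end;
            this.value = value;
        }

        @Override
        public String toString() {
            return layer + "[" + start + "," + end + "]=" + value;
        }
    }

    /** The text never changes; stages only append annotations. */
    static final class Document {
        final String text;
        final List<Annotation> annotations = new ArrayList<>();

        Document(String text) {
            this.text = text;
        }

        /** Returns a copy of the annotations belonging to one layer. */
        List<Annotation> layer(String name) {
            return annotations.stream()
                    .filter(a -> a.layer.equals(name))
                    .collect(Collectors.toList());
        }
    }

    /** Stage 1: splits the text on whitespace and adds "token" annotations. */
    static void tokenize(Document doc) {
        int i = 0;
        int n = doc.text.length();
        while (i < n) {
            while (i < n && Character.isWhitespace(doc.text.charAt(i))) i++;
            int start = i;
            while (i < n && !Character.isWhitespace(doc.text.charAt(i))) i++;
            if (i > start) {
                doc.annotations.add(
                        new Annotation("token", start, i, doc.text.substring(start, i)));
            }
        }
    }

    /** Stage 2: relies on the "token" layer and marks capitalised tokens. */
    static void tagCapitalised(Document doc) {
        for (Annotation t : doc.layer("token")) {
            if (Character.isUpperCase(t.value.charAt(0))) {
                doc.annotations.add(
                        new Annotation("capitalised", t.start, t.end, t.value));
            }
        }
    }

    /** Runs the stages in order over a fresh document. */
    static Document run(String text, List<Consumer<Document>> stages) {
        Document doc = new Document(text);
        for (Consumer<Document> stage : stages) {
            stage.accept(doc);
        }
        return doc;
    }

    public static void main(String[] args) {
        Document doc = run("LinguaStream is developed in Caen",
                List.of(EnrichmentSketch::tokenize, EnrichmentSketch::tagCapitalised));
        for (Annotation a : doc.annotations) {
            System.out.println(a);
        }
    }
}
```

In this sketch the order of the stages matters: `tagCapitalised` finds nothing unless `tokenize` has run first, which mirrors the way later stages of a processing stream depend on the information produced by earlier ones.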
LinguaStream is a virtual laboratory aimed at researchers in natural language processing. It allows complex experiments on corpora to be carried out conveniently using various types of declarative formalisms, which considerably reduces development costs. Its uses range from corpus exploration to the development of fully functional automatic analysers. The platform includes an integrated environment in which every step of an experiment can be carried out.
As a platform, LinguaStream provides an extensive Java API. For example, it can be integrated with Java EE servers to develop web applications based on processing streams. It is also used for teaching and provides modules designed specifically for students.