Presentation on Corpus Linguistics

From HLT@INESC-ID

Revision as of 16:19, 7 December 2006 by Paula
Max Silberztein

Date

  • December 12, 2006
  • Location: Room 336

Speaker

Abstract

NooJ is a linguistic development environment that includes large-coverage dictionaries and grammars and parses corpora in real time. It provides tools to create and maintain large-coverage lexical resources, as well as morphological and syntactic grammars. Dictionaries and grammars are applied to texts to locate morphological, lexical, and syntactic patterns and to tag simple and compound words. NooJ can build complex concordances for all types of finite-state and context-free patterns. Users can easily develop extractors that identify semantic units in large texts, such as names of persons, locations, dates, and technical financial expressions.
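
To make the idea of a concordance over a finite-state pattern concrete, here is a minimal sketch in Python. It is not NooJ and does not use NooJ's grammar formalism or API: a regular expression stands in for a finite-state pattern, and the function name `concordance`, the `DATE_PATTERN` expression, and the sample sentence are all hypothetical, chosen only for illustration.

```python
import re

# Assumption: English month names, used to build an illustrative date pattern.
MONTHS = (
    "January|February|March|April|May|June|July|"
    "August|September|October|November|December"
)

# Hypothetical pattern for dates such as "March 3, 2005".
# A regular expression is a stand-in for a finite-state grammar here.
DATE_PATTERN = rf"\b(?:{MONTHS}) \d{{1,2}}, \d{{4}}\b"

# Made-up sample text, not taken from any real corpus.
SAMPLE = "The meeting was moved from March 3, 2005 to April 14, 2005."


def concordance(text, pattern, width=30):
    """Return (left context, match, right context) for each pattern match."""
    rows = []
    for m in re.finditer(pattern, text):
        left = text[max(0, m.start() - width):m.start()]
        right = text[m.end():m.end() + width]
        rows.append((left, m.group(0), right))
    return rows


if __name__ == "__main__":
    # Print one keyword-in-context line per match.
    for left, match, right in concordance(SAMPLE, DATE_PATTERN, width=20):
        print(f"{left:>20} [{match}] {right}")
```

Each row pairs a matched unit with its left and right context, which is the basic shape of a keyword-in-context concordance. Context-free patterns cannot be expressed this way and would need a real parser.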