A minimally supervised approach for question generation: what can we learn from a single seed?

From HLT@INESC-ID

Date

  • 15:00, Friday, October 7th, 2011
  • Room 336

Speaker

  • Sérgio Curto

Abstract

In this paper, we investigate how many quality natural language questions can be generated from a single question/answer pair (a seed). In our approach, we learn patterns that relate the various levels of linguistic information in the question/answer seed to the same levels of information in text. These patterns contain lexical, syntactic and semantic information and, when matched against a target document, can be used to generate new question/answer pairs. Here, we focus specifically on the task of generating questions. Several works, for instance in Question Answering, explore the rewriting of questions to create (usually lexical) patterns; instead, we use several levels of linguistic information: lexical, syntactic and semantic (through the use of named entities). Moreover, such patterns are commonly hand-crafted, whereas in our strategy they are learned automatically from a single seed. Preliminary results show that with the single question/answer seed pair “When was Leonardo da Vinci born?”/1452 we manage to generate several questions (from documents related to 25 personalities), of which 80% were evaluated as plausible.
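
To make the idea of seed-based pattern learning concrete, here is a toy sketch in Python. It is not the method presented in the talk: it works at the lexical level only, and two regular expressions stand in for the named-entity recognition that the abstract mentions. The function names (`learn_pattern`, `generate_questions`), the support sentence and the target text are all hypothetical and chosen purely for illustration.

```python
import re

# Assumption: crude stand-ins for named-entity recognition.
# A "person" is a run of capitalized words, optionally joined by a short
# lowercase particle ("da", "van"); a "year" is any four-digit number.
PERSON_SLOT = r"(?P<person>[A-Z]\w*(?: (?:[a-z]{1,3} )?[A-Z]\w*)*)"
YEAR_SLOT = r"(?P<year>\d{4})"


def learn_pattern(question, answer, entity, sentence):
    """Learn one lexical pattern from a seed and a sentence that supports it.

    Returns a compiled regex for matching new sentences and a question
    template with a {person} placeholder.
    """
    if entity not in question or entity not in sentence or answer not in sentence:
        raise ValueError("sentence does not support the seed")
    # Escape the literal sentence, then swap the seed's entity and answer
    # for typed slots.
    regex = re.escape(sentence)
    regex = regex.replace(re.escape(entity), PERSON_SLOT)
    regex = regex.replace(re.escape(answer), YEAR_SLOT)
    template = question.replace(entity, "{person}")
    return re.compile(regex), template


def generate_questions(pattern, template, text):
    """Apply a learned pattern to target text, yielding (question, answer) pairs."""
    return [
        (template.format(person=m.group("person")), m.group("year"))
        for m in pattern.finditer(text)
    ]


# The seed from the abstract, plus a hypothetical support sentence.
pattern, template = learn_pattern(
    question="When was Leonardo da Vinci born?",
    answer="1452",
    entity="Leonardo da Vinci",
    sentence="Leonardo da Vinci was born in 1452.",
)

# Hypothetical target document.
target = (
    "Wolfgang Amadeus Mozart was born in 1756. "
    "Ludwig van Beethoven was born in 1770. "
    "Mozart died in 1791."
)
pairs = generate_questions(pattern, template, target)
for question, answer in pairs:
    print(question, "/", answer)
```

In this sketch only sentences with exactly the surface form of the support sentence are matched, so the last sentence of the target yields nothing. Patterns that also draw on syntactic and semantic information, as described in the abstract, would presumably generalize beyond such exact lexical matches.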


Note: This seminar will be held in English.