Learning with Rich Prior Knowledge

== Date ==
* 14:30, Friday, May 27th, 2011
* Room 336

== Speaker ==
João Graça

== Abstract ==


We possess a wealth of prior knowledge about most prediction problems,
and particularly so for many of the fundamental tasks in natural
language processing.  Unfortunately, it is often difficult to make
use of this type of information during learning, as it typically does
not come in the form of labeled examples, may be difficult to encode
as a prior on parameters in a Bayesian setting, and may be impossible
to incorporate into a tractable model. Instead, we usually have prior
knowledge about the values of output variables.  For example, linguistic
knowledge or an out-of-domain parser may provide the locations of
likely syntactic dependencies for grammar induction.  Motivated by
the prospect of being able to naturally leverage such knowledge, four
different groups have recently developed similar, general frameworks
for expressing and learning with side information about output variables.
These frameworks are Constraint-Driven Learning (UIUC), Posterior
Regularization (UPenn), Generalized Expectation Criteria (UMass Amherst),
and Learning from Measurements (UC Berkeley).
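
As a rough sketch of how such knowledge can be expressed (one common formulation, not a quote from the tutorial), Posterior Regularization augments maximum-likelihood training with a term that keeps the model posterior close to a set <math>\mathcal{Q}</math> of distributions whose feature expectations respect the side information:

<math>
\max_{\theta}\; \mathcal{L}(\theta) \;-\; \min_{q \in \mathcal{Q}} \mathrm{KL}\!\left(q(Y)\,\|\,p_{\theta}(Y \mid X)\right),
\qquad
\mathcal{Q} = \left\{\, q : \mathbf{E}_{q}[\boldsymbol{\phi}(X,Y)] \le \mathbf{b} \,\right\}
</math>

where <math>\mathcal{L}(\theta)</math> is the data log-likelihood, <math>\boldsymbol{\phi}</math> are constraint features defined over the output variables, and <math>\mathbf{b}</math> encodes the prior knowledge as bounds on their expected values.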


This tutorial describes how to encode side information about output
variables, and how to leverage this encoding and an unannotated
corpus during learning.  We survey the different frameworks, explaining
how they are connected and the trade-offs between them. We also survey
several applications that have been explored in the literature,
including applications to grammar and part-of-speech induction, word
alignment, information extraction, text classification, and multi-view
learning.  Prior knowledge used in these applications ranges from
structural information that cannot be efficiently encoded in the model,
to knowledge about the approximate expectations of some features, to
knowledge of some incomplete and noisy labellings.  These applications
also address several different problem settings, including unsupervised,
lightly supervised, and semi-supervised learning, and utilize both
generative and discriminative models. The diversity of tasks, types of
prior knowledge, and problem settings explored demonstrates the generality
of these approaches, and suggests that they will become an important tool
for researchers in natural language processing.
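
To make the idea of constraining the approximate expectations of some features concrete, the snippet below is a minimal, hypothetical sketch (plain NumPy, not the API of any of the toolkits above) of turning a belief such as "roughly 30% of tokens should be tagged as nouns" into a penalty on the model's expected label counts, in the spirit of Generalized Expectation and Posterior Regularization:

<pre>
# Hypothetical illustration: encode prior knowledge about label frequencies
# as a penalty on the model's expected (posterior) label counts.
import numpy as np

def expectation_penalty(posteriors, label_idx, target, strength=1.0):
    """Squared deviation of the expected label frequency from a target value.

    posteriors : (n_tokens, n_labels) array of per-token label posteriors
    label_idx  : index of the label the prior knowledge refers to (e.g. NOUN)
    target     : believed fraction of tokens carrying that label
    strength   : weight of this term when added to the training objective
    """
    expected_fraction = posteriors[:, label_idx].mean()
    return strength * (expected_fraction - target) ** 2

# Toy example: the current model expects about 12% of tokens to be nouns,
# while prior knowledge says the figure should be closer to 30%.
posteriors = np.array([[0.90, 0.10],
                       [0.85, 0.15],
                       [0.90, 0.10]])
print(expectation_penalty(posteriors, label_idx=1, target=0.3))
</pre>

Generalized Expectation criteria typically score the deviation with a KL-style divergence rather than a squared penalty, but the ingredients are the same: unlabeled data, a constraint feature, and a target expectation supplied as prior knowledge.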


The tutorial will provide the audience with the theoretical background to
understand why these methods have been so effective, as well as practical
guidance on how to apply them.  Specifically, we discuss issues that come
up in implementation, and describe a toolkit that provides "out-of-the-box"
support for the applications described in the tutorial, and is extensible
to other applications and new types of prior knowledge.






'''Note:''' This seminar will be held in English.




[[category:Seminars]]
[[category:Seminars 2011]]

