<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://www.hlt.inesc-id.pt/wiki/index.php?action=history&amp;feed=atom&amp;title=MSc_defense_practice_20081014</id>
	<title>MSc defense practice 20081014 - Revision history</title>
	<link rel="self" type="application/atom+xml" href="https://www.hlt.inesc-id.pt/wiki/index.php?action=history&amp;feed=atom&amp;title=MSc_defense_practice_20081014"/>
	<link rel="alternate" type="text/html" href="https://www.hlt.inesc-id.pt/wiki/index.php?title=MSc_defense_practice_20081014&amp;action=history"/>
	<updated>2026-05-14T22:23:47Z</updated>
	<subtitle>Revision history for this page on the wiki</subtitle>
	<generator>MediaWiki 1.41.0</generator>
	<entry>
		<id>https://www.hlt.inesc-id.pt/wiki/index.php?title=MSc_defense_practice_20081014&amp;diff=5455&amp;oldid=prev</id>
		<title>Joana at 11:04, 1 June 2009</title>
		<link rel="alternate" type="text/html" href="https://www.hlt.inesc-id.pt/wiki/index.php?title=MSc_defense_practice_20081014&amp;diff=5455&amp;oldid=prev"/>
		<updated>2009-06-01T11:04:05Z</updated>

		<summary type="html">&lt;p&gt;&lt;/p&gt;
&lt;table style=&quot;background-color: #fff; color: #202122;&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;Revision as of 11:04, 1 June 2009&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l47&quot;&gt;Line 47:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 47:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;[[category:Seminars]]&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;[[category:Seminars]]&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;[[category:Seminars 2008]]&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;[[category:Seminars 2008]]&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;[[category:MSc Proposal]]&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;[[category:Rehearsals]]&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>Joana</name></author>
	</entry>
	<entry>
		<id>https://www.hlt.inesc-id.pt/wiki/index.php?title=MSc_defense_practice_20081014&amp;diff=5064&amp;oldid=prev</id>
		<title>Rdmr at 09:46, 14 October 2008</title>
		<link rel="alternate" type="text/html" href="https://www.hlt.inesc-id.pt/wiki/index.php?title=MSc_defense_practice_20081014&amp;diff=5064&amp;oldid=prev"/>
		<updated>2008-10-14T09:46:52Z</updated>

		<summary type="html">&lt;p&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;New page&lt;/b&gt;&lt;/p&gt;&lt;div&gt;__NOTOC__&lt;br /&gt;
{{infobox|name=Tiago Luís|username=tmcl|contact=tmcl&lt;br /&gt;
|phone=+351-213-100-300 ext. 2514|fax=+351-213-145-843&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
== Title ==&lt;br /&gt;
&lt;br /&gt;
* Parallelization of Natural Language Processing Algorithms on Distributed Systems&lt;br /&gt;
&lt;br /&gt;
== Date ==&lt;br /&gt;
&lt;br /&gt;
* 17:30, October 14, 2008&lt;br /&gt;
* Room 336&lt;br /&gt;
&lt;br /&gt;
== Speaker ==&lt;br /&gt;
&lt;br /&gt;
* [[Tiago Luís]]&lt;br /&gt;
&lt;br /&gt;
== Abstract ==&lt;br /&gt;
&lt;br /&gt;
Natural language processing (NLP) is a subfield of artificial intelligence and&lt;br /&gt;
linguistics that studies the problems inherent to the processing of natural&lt;br /&gt;
language. This area deals with large collections of data that require&lt;br /&gt;
significant resources, both in terms of space and processing time. Persistent&lt;br /&gt;
storage costs have declined, allowing, on the one hand, larger and richer&lt;br /&gt;
descriptions of the data and, on the other hand, an increase in the amount of&lt;br /&gt;
data to process. Despite the fall in persistent storage costs, processing&lt;br /&gt;
these materials remains computation-heavy, and some algorithms continue to&lt;br /&gt;
take weeks to produce their results.&lt;br /&gt;
&lt;br /&gt;
One of the computation-heavy linguistic tasks is annotation. Annotation is&lt;br /&gt;
both the process of adding linguistic information to language data and the&lt;br /&gt;
resulting linguistic information itself. Moreover, when annotation tools are&lt;br /&gt;
integrated, several problems related to information flow between them may&lt;br /&gt;
arise. For example, a given tool may need an annotation previously produced by&lt;br /&gt;
another tool, but some of this linguistic information can be lost in&lt;br /&gt;
conversions between the different tool data formats due to differences in&lt;br /&gt;
expressiveness.&lt;br /&gt;
&lt;br /&gt;
The developed framework simplifies the integration of independent, existing&lt;br /&gt;
NLP tools into NLP systems without information loss between them. It also&lt;br /&gt;
allows the development of scalable and language-independent NLP systems on top&lt;br /&gt;
of the Hadoop framework, offering a friendly programming environment and&lt;br /&gt;
transparent handling of distributed computing problems, such as fault tolerance&lt;br /&gt;
and task scheduling. With this framework we achieved speedup values around 40 on a&lt;br /&gt;
cluster with 80 cores.&lt;br /&gt;
&lt;br /&gt;
[[category:Seminars]]&lt;br /&gt;
[[category:Seminars 2008]]&lt;/div&gt;</summary>
		<author><name>Rdmr</name></author>
	</entry>
</feed>