Detecting new events in press reviews


dc.contributor.advisor	Orlando, Salvatore	it_IT
dc.contributor.author	Pizzolon, Francesco <1988>	it_IT
dc.date.accessioned	2013-02-10	it_IT
dc.date.accessioned	2013-04-30T09:40:39Z
dc.date.available	2014-06-05T11:51:37Z
dc.date.issued	2013-03-01	it_IT
dc.identifier.uri	http://hdl.handle.net/10579/2463
dc.description.abstract	Over the last two decades, a huge amount of data has become available due to the exponential growth of the World Wide Web. Most of this data consists of unstructured or semi-structured text, which often contains references to structured information (e.g., person names, contact records, etc.). Information Extraction (IE) is the discipline that aims to discover structured information in unstructured or semi-structured text corpora. In this report we focus on two IE-related tasks, namely Named-Entity Recognition (NER) and Relation Extraction (RE). Solutions to these tasks have been successfully applied in several domains. For example, Web search engines have recently started rendering structured answers on their result pages while still drawing on largely unstructured Web documents. Concretely, we propose a novel method to infer relations among entities, which has been tested and evaluated on a real-world application scenario: entertainment event news, where, starting from a generic press review, we try to discover new events hidden in it. Our method is divided into two steps, each addressing a specific IE task: the first step concerns NER and uses a supervised learning technique to automatically identify named entities in unstructured news text; the second step deals with the RE task and introduces a novel, unsupervised learning strategy to automatically infer relations between the entities detected during the first step. Finally, well-known measures over a real dataset have been used to evaluate the two parts of the system. Concerning the first part, results highlight the quality of our NER approach, which performs consistently with existing state-of-the-art solutions.
Regarding the RE approach, experimental results indicate that if enough relevant material can be found on the Web (in our case, documents concerning the candidate event), it is possible to infer correct relations that lead to the discovery of new events.	it_IT
dc.language.iso	en	it_IT
dc.publisher	Università Ca' Foscari Venezia	it_IT
dc.rights	© Francesco Pizzolon, 2013	it_IT
dc.title	Detecting new events in press reviews	it_IT
dc.title.alternative	SEED: A Framework for Extracting Social Events from Press Reviews	it_IT
dc.type	Master's Degree Thesis	it_IT
dc.degree.name	Informatica	it_IT
dc.degree.level	Laurea magistrale	it_IT
dc.degree.grantor	Dipartimento di Scienze Ambientali, Informatica e Statistica	it_IT
dc.description.academicyear	2011/2012, sessione straordinaria	it_IT
dc.rights.accessrights	openAccess	it_IT
dc.thesis.matricno	816511	it_IT
dc.subject.miur	INF/01 INFORMATICA	it_IT
dc.description.note		it_IT
dc.degree.discipline		it_IT
dc.contributor.co-advisor		it_IT
dc.provenance.upload	Francesco Pizzolon (816511@stud.unive.it), 2013-02-10	it_IT
dc.provenance.plagiarycheck	Salvatore Orlando (orlando@unive.it), 2013-02-11	it_IT