Detecting new events in press reviews



dc.contributor.advisor Orlando, Salvatore it_IT
dc.contributor.author Pizzolon, Francesco <1988> it_IT
dc.date.accessioned 2013-02-10 it_IT
dc.date.accessioned 2013-04-30T09:40:39Z
dc.date.available 2014-06-05T11:51:37Z
dc.date.issued 2013-03-01 it_IT
dc.identifier.uri http://hdl.handle.net/10579/2463
dc.description.abstract Over the last two decades, the exponential growth of the World Wide Web has made a huge amount of data available. Most of this data consists of unstructured or semi-structured text, which often contains references to structured information (e.g., person names, contact records, etc.). Information Extraction (IE) is the discipline that aims to discover structured information in unstructured or semi-structured text corpora. In this report we focus on two IE-related tasks, namely Named-Entity Recognition (NER) and Relation Extraction (RE). Solutions to these tasks have been successfully applied in several domains. For example, Web search engines have recently started rendering structured answers on their result pages while still drawing on largely unstructured Web documents. Concretely, we propose a novel method to infer relations among entities, which has been tested and evaluated on a real-world application scenario: entertainment event news, where, starting from a generic press review, we try to discover new events hidden in it. Our method is divided into two steps, each addressing one IE task: the first step concerns NER and uses a supervised learning technique to automatically identify named entities in unstructured news text; the second step deals with the RE task and introduces a novel, unsupervised learning strategy to automatically infer relations between the entities detected during the first step. Finally, well-known measures over a real dataset have been used to evaluate the two parts of the system. For the first part, results highlight the quality of our NER approach, which performs consistently with existing state-of-the-art solutions. For the RE approach, experimental results indicate that if enough relevant material can be found on the Web (in our case, documents concerning the candidate event), it is possible to infer correct relations that lead to the discovery of new events. it_IT
dc.language.iso en it_IT
dc.publisher Università Ca' Foscari Venezia it_IT
dc.rights © Francesco Pizzolon, 2013 it_IT
dc.title Detecting new events in press reviews it_IT
dc.title.alternative SEED: A Framework for Extracting Social Events from Press Reviews it_IT
dc.type Master's Degree Thesis it_IT
dc.degree.name Computer Science (Informatica) it_IT
dc.degree.level Master's degree (Laurea magistrale) it_IT
dc.degree.grantor Dipartimento di Scienze Ambientali, Informatica e Statistica it_IT
dc.description.academicyear 2011/2012, extraordinary session it_IT
dc.rights.accessrights openAccess it_IT
dc.thesis.matricno 816511 it_IT
dc.subject.miur INF/01 INFORMATICA it_IT
dc.description.note it_IT
dc.degree.discipline it_IT
dc.contributor.co-advisor it_IT
dc.provenance.upload Francesco Pizzolon (816511@stud.unive.it), 2013-02-10 it_IT
dc.provenance.plagiarycheck Salvatore Orlando (orlando@unive.it), 2013-02-11 it_IT

