An Automatic Tool for Labeling Web Search Missions Using Wikipedia Categories

dc.contributor.advisor Orlando, Salvatore it_IT
dc.contributor.author Gezzele, Marco <1987> it_IT
dc.date.accessioned 2014-02-04 it_IT
dc.date.accessioned 2014-03-29T10:42:17Z
dc.date.available 2015-04-07T13:58:25Z
dc.date.issued 2014-03-11 it_IT
dc.identifier.uri http://hdl.handle.net/10579/4143
dc.description.abstract Mission detection aims to identify the sets of queries that users submit to a Web search engine to satisfy a common information need. It is generally easier to determine the search topic a user is interested in by leveraging signals from several queries (i.e., a mission) than by looking at each query separately. In this work, we present a system that automatically labels search missions previously discovered in a search engine log using a predefined set of semantic categories provided by Wikipedia. These well-known categories have already been proposed in previous work because they cover almost every topic underlying search missions. Our solution consists of three steps. First, we extract the set of Wikipedia articles (i.e., entities) from each search mission using a state-of-the-art entity linking technique. To do so, we represent a search mission as a virtual text document made of the queries composing the mission and the text of the web pages, if any, that the user clicked on in response to each query. Second, we retrieve the set of candidate Wikipedia categories that correspond to the entities extracted in the previous step. Finally, we rank the predefined target categories with respect to these candidates using an unsupervised approach and assign the highest-ranked target category to each search mission. In our experiments, we use a dataset of 8,800 queries sampled from a real-world search engine log. These queries had already been manually grouped into individual search missions, which we use as input to our system. To evaluate the quality of the proposed solution, we conduct a user study in which users manually judge the correctness of the labels assigned to the missions. it_IT
dc.language.iso en it_IT
dc.publisher Università Ca' Foscari Venezia it_IT
dc.rights © Marco Gezzele, 2014 it_IT
dc.title An Automatic Tool for Labeling Web Search Missions Using Wikipedia Categories it_IT
dc.title.alternative it_IT
dc.type Master's Degree Thesis it_IT
dc.degree.name Informatica - computer science it_IT
dc.degree.level Laurea magistrale it_IT
dc.degree.grantor Dipartimento di Scienze Ambientali, Informatica e Statistica it_IT
dc.description.academicyear 2012/2013, extraordinary session it_IT
dc.rights.accessrights openAccess it_IT
dc.thesis.matricno 810845 it_IT
dc.subject.miur INF/01 INFORMATICA it_IT
dc.description.note it_IT
dc.degree.discipline it_IT
dc.contributor.co-advisor it_IT
dc.provenance.upload Marco Gezzele (810845@stud.unive.it), 2014-02-04 it_IT
dc.provenance.plagiarycheck Salvatore Orlando (orlando@unive.it), 2014-02-17 it_IT

