dc.contributor.advisor |
Orlando, Salvatore |
it_IT |
dc.contributor.author |
Tolomei, Gabriele <1980> |
it_IT |
dc.date.accessioned |
2012-07-07T08:13:20Z |
it_IT |
dc.date.accessioned |
2012-07-30T16:05:45Z |
|
dc.date.available |
2012-07-07T08:13:20Z |
it_IT |
dc.date.available |
2012-07-30T16:05:45Z |
|
dc.date.issued |
2011-11-17 |
it_IT |
dc.identifier.uri |
http://hdl.handle.net/10579/1231 |
it_IT |
dc.description.abstract |
Il World Wide Web è la più grande sorgente dati mai realizzata dall’uomo. Ciò ha fatto sì che il Web divenisse sempre più il “luogo” di riferimento per accedere a qualsiasi tipo di informazione, attraverso l’uso dei motori di ricerca. Infatti, gli utenti tendono a rivolgersi ai motori di ricerca non solo per consultare pagine Web ma per eseguire vere e proprie attività (ad es., per organizzare vacanze, ottenere un visto, organizzare una festa, etc.). In questa tesi di dottorato, si descrivono e affrontano due sfide fondamentali tese a migliorare l’esperienza di ricerca sul Web offerta dagli attuali motori di ricerca, ovvero la scoperta e la raccomandazione di cosiddetti “Web tasks”. Entrambe queste sfide si basano su una reale comprensione dei comportamenti di ricerca degli utenti, che può essere raggiunta mediante l’applicazione di tecniche di query log mining. I processi di ricerca degli utenti sono analizzati ad un più alto livello di astrazione, ovvero da una prospettiva “task-by-task” anziché “query-by-query”. In questo modo è possibile realizzare un modello di attività di ricerca che fornisca adeguato supporto alla “vita sul Web” degli utenti. |
it_IT |
dc.description.abstract |
The World Wide Web is the biggest and most heterogeneous database that humans have ever built, making it the place of choice where people search for any sort of information through Web search engines. Indeed, users increasingly turn to Web search engines to carry out their daily tasks (e.g., "planning holidays", "obtaining a visa", "organizing a birthday party"), rather than simply to look for Web pages. In this Ph.D. dissertation, we outline and address two core research challenges that we claim next-generation Web search engines should tackle to enhance the user search experience, namely Web task discovery and Web task recommendation. Both challenges rely on a genuine understanding of user search behavior, which can be achieved by mining knowledge from query logs. The search processes of many users are analyzed at a higher level of abstraction, namely from a "task-by-task" rather than a "query-by-query" perspective, thereby producing a model of user search tasks, which in turn can be used to support people during their daily "Web lives". |
it_IT |
dc.format.medium |
Tesi cartacea |
it_IT |
dc.language.iso |
en |
it_IT |
dc.publisher |
Università Ca' Foscari Venezia |
it_IT |
dc.rights |
© Gabriele Tolomei, 2011 |
it_IT |
dc.subject |
Web search |
it_IT |
dc.subject |
Web mining |
it_IT |
dc.subject |
Query log mining |
it_IT |
dc.subject |
Task recommendation |
it_IT |
dc.title |
Enhancing web search user experience : from document retrieval to task recommendation |
it_IT |
dc.type |
Doctoral Thesis |
it_IT |
dc.degree.name |
Informatica |
it_IT |
dc.degree.level |
Dottorato di ricerca |
it_IT |
dc.degree.grantor |
Scuola di dottorato in Scienze e tecnologie (SDST) |
it_IT |
dc.description.academicyear |
2009/2010 |
it_IT |
dc.description.cycle |
23 |
it_IT |
dc.degree.coordinator |
Salibra, Antonino |
it_IT |
dc.location.shelfmark |
D001161 |
it_IT |
dc.location |
Venezia, Archivio Università Ca' Foscari, Tesi Dottorato |
it_IT |
dc.rights.accessrights |
openAccess |
it_IT |
dc.thesis.matricno |
955515 |
it_IT |
dc.format.pagenumber |
[10], VI, 148 p. |
it_IT |
dc.subject.miur |
INF/01 INFORMATICA |
it_IT |
dc.description.tableofcontent |
1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1 Contribution . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.2 Organization . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2 Web Search Engines . . . . . . . . . . . . . . . . . . . . . . 7
2.1 The Big Picture . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.2 Crawling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.2.1 The Crawling Algorithm . . . . . . . . . . . . . . . . . 11
2.2.2 The Crawl Frontier . . . . . . . . . . . . . . . . . . . . . 12
2.2.3 Web Page Fetching . . . . . . . . . . . . . . . . . . . . . 13
2.2.4 Web Page Parsing . . . . . . . . . . . . . . . . . . . . . . 14
2.2.5 Web Page Storing . . . . . . . . . . . . . . . . . . . . . . 15
2.3 Indexing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2.3.1 Text-based Indexing . . . . . . . . . . . . . . . . . . . . 16
2.3.2 Link-based Indexing . . . . . . . . . . . . . . . . . . . . 18
2.4 Query Processing . . . . . . . . . . . . . . . . . . . . . . . . 18
2.4.1 Text-based Ranking . . . . . . . . . . . . . . . . . . . . 20
2.4.2 Link-based Ranking . . . . . . . . . . . . . . . . . . . . 22
3 Query Log Mining . . . . . . . . . . . . . . . . . . . . . . . . . 27
3.1 What is a Query Log? . . . . . . . . . . . . . . . . . . . . . 28
3.2 A Characterization of Web Search Queries . . . . . 30
3.3 Time Analysis of Query Logs . . . . . . . . . . . . . . . 35
3.4 Time-series Analysis of Query Logs . . . . . . . . . . 41
3.5 Privacy Issues in Query Logs . . . . . . . . . . . . . . . . 44
3.6 Applications of Query Log Mining . . . . . . . . . . . . 45
3.6.1 Search Session Discovery . . . . . . . . . . . . . . . . . 46
3.6.2 Query Suggestion . . . . . . . . . . . . . . . . . . . . . . 49
4 Search Task Discovery . . . . . . . . . . . . . . . . . . . . . . 57
4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
4.1.1 Contribution . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
4.1.2 Organization . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
4.2 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
4.3 Query Log Analysis . . . . . . . . . . . . . . . . . . . . . . . . 65
4.3.1 Session Size Distribution . . . . . . . . . . . . . . . . . . 65
4.3.2 Query Time-Gap Distribution . . . . . . . . . . . . . . . 66
4.4 Task Discovery Problem . . . . . . . . . . . . . . . . . . . . . 67
4.4.1 Theoretical Model . . . . . . . . . . . . . . . . . . . . . . . . 67
4.5 Ground-truth: Definition and Analysis . . . . . . . . . . . 69
4.6 Task-based Query Similarity . . . . . . . . . . . . . . . . . 74
4.6.1 Time-based Approach . . . . . . . . . . . . . . . . . . . . 75
4.6.2 Unsupervised Approach . . . . . . . . . . . . . . . . . . . 75
4.6.3 Supervised Approach . . . . . . . . . . . . . . . . . . . . . 78
4.7 Task Discovery Methods . . . . . . . . . . . . . . . . . . . . 83
4.7.1 TimeSplitting-t . . . . . . . . . . . . . . . . . . . . . . . . . 84
4.7.2 QueryClustering-m . . . . . . . . . . . . . . . . . . . . . . 85
4.8 Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
4.8.1 Validity Measures . . . . . . . . . . . . . . . . . . . . . . . 88
4.8.2 Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
4.9 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
5 Search Task Recommendation . . . . . . . . . . . . . . . . 101
5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
5.1.1 Contribution . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
5.1.2 Organization . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
5.2 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
5.3 Anatomy of a Task Recommender System . . . . . . 106
5.4 Task Synthesis . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
5.4.1 Basic Task Representation . . . . . . . . . . . . . . . . . 109
5.4.2 Task Document Clustering . . . . . . . . . . . . . . . . 109
5.5 Task Modeling . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
5.5.1 Random-based (baseline) . . . . . . . . . . . . . . . . . 110
5.5.2 Sequence-based . . . . . . . . . . . . . . . . . . . . . . . . 111
5.5.3 Association-Rule based . . . . . . . . . . . . . . . . . . . 111
5.6 Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
5.6.1 Experimental Setup . . . . . . . . . . . . . . . . . . . . . . 112
5.6.2 Evaluating Recommendation Precision . . . . . . . . 121
5.6.3 User Study . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126
5.6.4 Anecdotal Evidences . . . . . . . . . . . . . . . . . . . . . 127
5.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128
Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . 135 |
it_IT |
dc.identifier.bibliographiccitation |
Tolomei, Gabriele. "Enhancing web search user experience : from document retrieval to task recommendation", Università Ca' Foscari Venezia, Tesi di Dottorato, XXIII Ciclo, 2011 |
it_IT |