On Determining the Number of Dominant-Set Clusters

dc.contributor.advisor	Pelillo, Marcello	it_IT
dc.contributor.author	Dulecha, Tinsae Gebrechristos <1985>	it_IT
dc.date.accessioned	2014-10-08	it_IT
dc.date.accessioned	2014-12-13T10:18:09Z
dc.date.available	2014-12-13T10:18:09Z
dc.date.issued	2014-10-31	it_IT
dc.identifier.uri	http://hdl.handle.net/10579/5393
dc.description.abstract	Cluster analysis (clustering) is the process of finding groups of objects such that objects in the same group are similar (related) to one another and dissimilar from objects in other groups. A fundamental problem in cluster analysis is deciding how many clusters are appropriate for describing a given system, a number that many clustering algorithms require as input. In this thesis we build a new method, called “On Determining the Number of Dominant-Set Clusters”, for automatically estimating the number of clusters in unlabeled data sets. The method is based on the Motzkin-Straus theorem, whose authors showed a connection between the clique number ω(G) and the global optimal value of a certain quadratic function over the standard simplex. Moreover, using the definition of stability number, they showed that the result of this maximization equals the stability number in the unweighted case. Inspired by this theorem, we extend it to the weighted case in order to detect the number of maximal cliques (clusters). The result is a two-step method for determining the number of clusters. In the first step, we take a dissimilarity matrix as input and, by minimizing it with replicator dynamics, detect the minimum number of clusters based on our definition of stability number. In the second step, we check for undetected clusters, building on the idea from the paper on efficient out-of-sample extension of dominant-set clusters. After determining the number of clusters (cluster representatives), we check whether our approach finds the right number of clusters by propagating class labels to the unlabeled instances with graph transduction, a popular semi-supervised learning algorithm, and evaluating the accuracy of the resulting clusters. To assess the performance of our approach, we ran several tests on computer-generated (toy) data sets and on real-world data sets taken from the UCI data repository. We also tested our approach on some social network data sets to further extend our work. The experiments on these data sets show promising results.	it_IT
dc.language.iso	en	it_IT
dc.publisher	Università Ca' Foscari Venezia	it_IT
dc.rights	© Tinsae Gebrechristos Dulecha, 2014	it_IT
dc.title	On Determining the Number of Dominant-Set Clusters	it_IT
dc.title.alternative		it_IT
dc.type	Master's Degree Thesis	it_IT
dc.degree.name	Informatica - computer science	it_IT
dc.degree.level	Laurea magistrale	it_IT
dc.degree.grantor	Dipartimento di Scienze Ambientali, Informatica e Statistica	it_IT
dc.description.academicyear	2013/2014, autumn session	it_IT
dc.rights.accessrights	openAccess	it_IT
dc.thesis.matricno	843867	it_IT
dc.subject.miur	INF/01 INFORMATICA	it_IT
dc.description.note	Master's thesis on determining the number of dominant-set clusters	it_IT
dc.degree.discipline		it_IT
dc.contributor.co-advisor		it_IT
dc.date.embargoend		it_IT
dc.provenance.upload	Tinsae Gebrechristos Dulecha (843867@stud.unive.it), 2014-10-08	it_IT
dc.provenance.plagiarycheck	Marcello Pelillo (pelillo@unive.it), 2014-10-20	it_IT
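
The abstract refers to the Motzkin-Straus theorem, which for an unweighted graph G with adjacency matrix A states that the maximum of x^T A x over the standard simplex equals 1 - 1/ω(G). The sketch below illustrates only that classical unweighted result, using discrete-time replicator dynamics started from the barycenter of the simplex; it is not the thesis's weighted, dissimilarity-based method. The function names and the example graph are assumptions made for illustration. Replicator dynamics converges to a local maximizer, so in general the recovered value gives only a lower bound on the clique number.

```python
def quadratic_value(adjacency, x):
    """Return x^T A x for a square adjacency matrix given as nested lists."""
    n = len(adjacency)
    return sum(x[i] * adjacency[i][j] * x[j] for i in range(n) for j in range(n))


def replicator_dynamics(adjacency, iterations=1000, tol=1e-12):
    """Run discrete-time replicator dynamics from the simplex barycenter."""
    n = len(adjacency)
    x = [1.0 / n] * n
    for _ in range(iterations):
        # payoff[i] is the i-th component of A x
        payoff = [sum(adjacency[i][j] * x[j] for j in range(n)) for i in range(n)]
        value = sum(x[i] * payoff[i] for i in range(n))
        if value == 0:
            break  # graph without edges: nothing to update
        new_x = [x[i] * payoff[i] / value for i in range(n)]
        shift = sum(abs(new_x[i] - x[i]) for i in range(n))
        x = new_x
        if shift < tol:
            break
    return x


def clique_size_estimate(adjacency):
    """Invert f = 1 - 1/omega at the point reached by replicator dynamics."""
    x = replicator_dynamics(adjacency)
    value = quadratic_value(adjacency, x)
    return round(1.0 / (1.0 - value))


# Hypothetical example graph: a triangle {0, 1, 2} plus vertex 3 attached to vertex 2.
example_graph = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]

x_final = replicator_dynamics(example_graph)
f_final = quadratic_value(example_graph, x_final)
omega_estimate = clique_size_estimate(example_graph)
```

On this small graph the dynamics concentrate the weight on the triangle, so the quadratic value approaches 2/3 and the estimate is 3. On larger graphs the starting point and the presence of several maximal cliques can lead to a smaller clique being found.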