On Determining the Number of Dominant-Set Clusters

dc.contributor.advisor Pelillo, Marcello it_IT
dc.contributor.author Dulecha, Tinsae Gebrechristos <1985> it_IT
dc.date.accessioned 2014-10-08 it_IT
dc.date.accessioned 2014-12-13T10:18:09Z
dc.date.available 2014-12-13T10:18:09Z
dc.date.issued 2014-10-31 it_IT
dc.identifier.uri http://hdl.handle.net/10579/5393
dc.description.abstract Cluster analysis (clustering) is the process of finding groups of objects such that objects in the same group are similar (related) to one another and dissimilar from objects in other groups. A fundamental problem in cluster analysis is deciding how many clusters are appropriate for describing a given system, a number that many clustering algorithms require as input. In this thesis we build a new method, called “On Determining the Number of Dominant-Set Clusters”, for automatically estimating the number of clusters in unlabeled data sets. The method is based on the Motzkin-Straus theorem, which establishes a connection between the clique number ω(G) of a graph and the global optimal value of a certain quadratic function over the standard simplex. Moreover, using the definition of the stability number, the same result shows that in the unweighted case this optimization yields the stability number. Inspired by this theorem, we extend it to the weighted case in order to detect the number of maximal cliques (clusters). The result is a two-step method for determining the number of clusters. In the first step, we take a dissimilarity matrix as input and, by minimizing the corresponding quadratic form with replicator dynamics, detect the minimum number of clusters based on our definition of the stability number. In the second step, we examine whether undetected clusters exist, building on the idea of the efficient out-of-sample extension of dominant-set clusters. After determining the number of clusters (cluster representatives), we check whether our approach finds the right number by propagating the class labels to the unlabeled instances with graph transduction, a popular semi-supervised learning algorithm, and evaluating the accuracy of the clusters formed. To assess the performance of our approach, we performed several tests on computer-generated (toy) data sets and on real-world data sets taken from the UCI data repository. We also tested our approach on some social network data sets to further extend our work. The experiments on these data sets show promising results. it_IT
dc.language.iso en it_IT
dc.publisher Università Ca' Foscari Venezia it_IT
dc.rights © Tinsae Gebrechristos Dulecha, 2014 it_IT
dc.title On Determining the Number of Dominant-Set Clusters it_IT
dc.title.alternative it_IT
dc.type Master's Degree Thesis it_IT
dc.degree.name Informatica - computer science it_IT
dc.degree.level Laurea magistrale it_IT
dc.degree.grantor Dipartimento di Scienze Ambientali, Informatica e Statistica it_IT
dc.description.academicyear 2013/2014, sessione autunnale it_IT
dc.rights.accessrights openAccess it_IT
dc.thesis.matricno 843867 it_IT
dc.subject.miur INF/01 INFORMATICA it_IT
dc.description.note Master's thesis on determining the number of dominant-set clusters it_IT
dc.degree.discipline it_IT
dc.contributor.co-advisor it_IT
dc.date.embargoend it_IT
dc.provenance.upload Tinsae Gebrechristos Dulecha (843867@stud.unive.it), 2014-10-08 it_IT
dc.provenance.plagiarycheck Marcello Pelillo (pelillo@unive.it), 2014-10-20 it_IT
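
The abstract refers to the Motzkin-Straus theorem and to replicator dynamics without showing either. The sketch below illustrates only the classical unweighted case: for a graph with adjacency matrix A, the maximum of x^T A x over the standard simplex equals 1 - 1/ω(G), and applying the same computation to the complement graph gives the stability number. It is not the thesis's weighted method. The function names, the toy graph, and the stopping tolerance are assumptions made for illustration, and replicator dynamics is only guaranteed to reach a local maximizer, so on larger graphs the value it returns may underestimate ω(G).

```python
def replicator_dynamics(A, max_iter=10_000, tol=1e-12):
    """Discrete-time replicator dynamics: x_i <- x_i * (Ax)_i / (x^T A x)."""
    n = len(A)
    x = [1.0 / n] * n  # start from the barycenter of the simplex
    for _ in range(max_iter):
        Ax = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        xAx = sum(x[i] * Ax[i] for i in range(n))
        if xAx == 0.0:  # no edges inside the current support
            break
        new_x = [x[i] * Ax[i] / xAx for i in range(n)]
        change = sum(abs(new_x[i] - x[i]) for i in range(n))
        x = new_x
        if change < tol:
            break
    return x


def quadratic_form(A, x):
    """Value of x^T A x."""
    n = len(A)
    return sum(x[i] * A[i][j] * x[j] for i in range(n) for j in range(n))


def clique_number_estimate(A):
    """Invert the Motzkin-Straus relation f = 1 - 1/omega at the point reached.

    Assumption: the dynamics reached a global maximizer, which holds for the
    toy graph below but not for arbitrary graphs.
    """
    x = replicator_dynamics(A)
    f = quadratic_form(A, x)
    return round(1.0 / (1.0 - f)), x


def complement(A):
    """Adjacency matrix of the complement graph (no self-loops)."""
    n = len(A)
    return [[0 if i == j else 1 - A[i][j] for j in range(n)] for i in range(n)]


# Toy graph (hypothetical): triangle {0, 1, 2} plus a pendant vertex 3 on 2.
A = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]

omega, x = clique_number_estimate(A)              # clique number of G
alpha, _ = clique_number_estimate(complement(A))  # stability number of G
print(omega, alpha)  # prints: 3 2
```

On this graph the dynamics concentrate the mass of x on the triangle, so the support of the limit point also identifies which vertices form the clique. Plain Python lists are used here to keep the sketch dependency-free; a matrix library would be the natural choice for larger inputs.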

