Abstract:
This thesis focuses on semi-supervised learning (SSL) algorithms, a family of methods that lies between supervised and unsupervised learning. The main characteristic of SSL algorithms is that they exploit both the structure of the data (their features) and the available labeling information to estimate the boundaries of the classes or clusters. For this reason, they are particularly suitable when labeled data are scarce or when data annotation is expensive or time-consuming. Here, we exploit a recent algorithm rooted in evolutionary game theory, named “Graph Transduction Games” (GTG). The GTG algorithm explicitly models an SSL problem as a non-cooperative game in which the players represent the data points and the strategies represent the possible labels. Each player chooses a strategy and receives a payoff that depends on the choices of the other players and on its similarity to them. The game is iterated until all the players have chosen their best strategy and no one has any incentive to change their choice. The final labeling is then a property that emerges from the players' interactions, and hence from the data. During the labeling process, the similarities between all the data points are taken into account, creating a context in which similar points influence each other in deciding the final label assignment. The neighboring players (data points), and hence the context, help in situations in which intrinsic ambiguities in the data may lead to inconsistent class assignments. In this thesis, the GTG algorithm and the context in which the players play are explored in applications spanning bioinformatics, natural language processing, computer vision, and pure machine learning problems.
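
To make the game described above concrete, the following is a minimal sketch rather than the implementation used in this thesis. It assumes the game is played with discrete-time replicator dynamics, a common choice in evolutionary game theory that the abstract itself does not specify. The function name `graph_transduction_game`, its parameters, and the toy similarity matrix are hypothetical and chosen only for illustration.

```python
def graph_transduction_game(similarity, labels, n_classes, max_iter=100, tol=1e-6):
    """Illustrative sketch of a graph transduction game.

    similarity: n x n list of lists with non-negative pairwise similarities.
    labels: list of length n; a class index for labeled points, None otherwise.
    Assumption: strategies evolve by discrete-time replicator dynamics.
    """
    n = len(similarity)

    # Each player holds a mixed strategy (a probability distribution over labels).
    # Labeled players start on their known label; unlabeled players start uniform.
    strategies = []
    for label in labels:
        if label is None:
            strategies.append([1.0 / n_classes] * n_classes)
        else:
            strategies.append([1.0 if h == label else 0.0 for h in range(n_classes)])

    for _ in range(max_iter):
        updated = []
        total_change = 0.0
        for i in range(n):
            # Payoff of label h for player i: similarity-weighted support
            # that the other players currently give to h.
            payoff = [
                sum(similarity[i][j] * strategies[j][h] for j in range(n))
                for h in range(n_classes)
            ]
            average = sum(strategies[i][h] * payoff[h] for h in range(n_classes))
            if average > 0:
                # Replicator step: labels with above-average payoff gain probability.
                row = [strategies[i][h] * payoff[h] / average for h in range(n_classes)]
            else:
                # No informative neighbors: keep the current strategy.
                row = list(strategies[i])
            total_change += sum(abs(a - b) for a, b in zip(row, strategies[i]))
            updated.append(row)
        strategies = updated
        if total_change < tol:
            # No player is changing its choice any more.
            break

    # Final labeling: the most probable label for each player.
    return [max(range(n_classes), key=lambda h: strategies[i][h]) for i in range(n)]


# Toy example: points 0 and 1 are similar, points 2 and 3 are similar.
# Only points 0 and 2 are labeled.
toy_similarity = [
    [0.0, 0.9, 0.1, 0.1],
    [0.9, 0.0, 0.1, 0.1],
    [0.1, 0.1, 0.0, 0.9],
    [0.1, 0.1, 0.9, 0.0],
]
toy_labels = [0, None, 1, None]
predicted = graph_transduction_game(toy_similarity, toy_labels, n_classes=2)
print(predicted)
```

In this toy example the labeled players never change their strategy, while each unlabeled player drifts toward the label of its most similar labeled neighbor, which is the contextual effect the abstract describes.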