Abstract:
With advances in technology and logistics capabilities, e-commerce companies have drastically changed customer behavior, shifting it from classical in-store purchases to a selective and comparative purchasing strategy. In addition, to improve their internal processes and increase customer satisfaction, companies manage their customers by analyzing their behavior and uncovering purchasing patterns. To this end, companies adopt customer segmentation algorithms, which partition customers into homogeneous segments so that marketing campaigns can be planned and forecast more effectively.
Customer segmentation is usually performed through clustering algorithms, a class of unsupervised machine learning methods that discover regularities in the data without needing prior information. As a result, clustering algorithms play a central role in company decision-making and carry considerable responsibility.
This thesis aims to determine the robustness of these algorithms against adversarial attacks that alter the clustering results by injecting specially crafted samples into the dataset. A successful attack might have profound implications for critical company decisions.
All experiments were carried out on a real dataset of customers and products provided during my internship at WWG Srl. After data preparation, we segmented the customers using DBSCAN and K-means. Finally, we tested the robustness of these algorithms by poisoning the dataset using a bridge-based strategy.