Load balancing and fault early detection for Apache Kafka clusters

dc.contributor.advisor	Marin, Andrea	it_IT
dc.contributor.author	Burato, Dario <1993>	it_IT
dc.date.accessioned	2019-06-20	it_IT
dc.date.accessioned	2019-11-20T07:07:47Z
dc.date.available	2020-12-02T08:02:14Z
dc.date.issued	2019-07-10	it_IT
dc.identifier.uri	http://hdl.handle.net/10579/15159
dc.description.abstract	Apache Kafka is a publish-subscribe messaging system: producers publish data to a cluster, and clients subscribe to receive it. Messages sent by producers are stored in partitions, and load balancing is achieved by distributing those partitions among the cluster's nodes. The component that assigns a message to a partition is called the partitioner, and it is located inside every producer. When partitions lack intrinsic meaning and are used purely for load-balancing purposes, the default partitioners available with Apache Kafka aim only to spread the same number of messages across partitions. The most common Apache Kafka cluster configuration is based on multiple identical systems; when a cluster is upgraded with new, better-performing components, the old ones are usually removed. Even though re-balancing tools exist, they would take time to adapt properly to a hybrid cluster configuration, because partitioners focus on the amount of data rather than on node performance. The problem could be solved by changing the number of partitions on each old and new system to match their performance ratio, thus tricking the default partitioner logic, but this could hurt client performance. A partitioner that knows the performance of each cluster node is a proper solution. This document presents a formal method to detect problematic scenarios and a custom partitioner that adapts to them.	it_IT
dc.language.iso	en	it_IT
dc.publisher	Università Ca' Foscari Venezia	it_IT
dc.rights	© Dario Burato, 2019	it_IT
dc.title	Load balancing and fault early detection for Apache Kafka clusters	it_IT
dc.title.alternative	Load balancing and early fault detection for Apache Kafka clusters	it_IT
dc.type	Master's Degree Thesis	it_IT
dc.degree.name	Informatica - computer science	it_IT
dc.degree.level	Laurea magistrale	it_IT
dc.degree.grantor	Dipartimento di Scienze Ambientali, Informatica e Statistica	it_IT
dc.description.academicyear	2018/2019_sessione_estiva	it_IT
dc.rights.accessrights	embargoedAccess	it_IT
dc.thesis.matricno	843238	it_IT
dc.subject.miur	INF/01 INFORMATICA	it_IT
dc.description.note		it_IT
dc.degree.discipline		it_IT
dc.contributor.co-advisor		it_IT
dc.provenance.upload	Dario Burato (843238@stud.unive.it), 2019-06-20	it_IT
dc.provenance.plagiarycheck	Andrea Marin (marin@unive.it), 2019-07-08	it_IT