Load balancing and fault early detection for Apache Kafka clusters

dc.contributor.advisor Marin, Andrea it_IT
dc.contributor.author Burato, Dario <1993> it_IT
dc.date.accessioned 2019-06-20 it_IT
dc.date.accessioned 2019-11-20T07:07:47Z
dc.date.available 2020-12-02T08:02:14Z
dc.date.issued 2019-07-10 it_IT
dc.identifier.uri http://hdl.handle.net/10579/15159
dc.description.abstract Apache Kafka is a publish-subscribe messaging system: producers publish data to a cluster, and clients subscribe to receive it. Messages sent by producers are stored in partitions, and load balancing is achieved by distributing those partitions among the cluster's nodes. The component that assigns a message to a partition is called the partitioner and is located inside every producer. When partitions lack intrinsic meaning and are used purely for load balancing, the default partitioners shipped with Apache Kafka aim only to spread the same number of messages across partitions. The most common Apache Kafka cluster configuration is based on multiple identical systems; when a cluster is upgraded with new, better-performing components, the old ones are usually removed. Even though rebalancing tools exist, it takes time to adapt properly to a hybrid cluster configuration, because partitioners focus on the amount of data rather than on node performance. The problem could be worked around by changing the number of partitions on each old and new system to match their performance ratio, thus tricking the default partitioner logic, but this could hurt client performance. A partitioner that knows the performance of each of the cluster's nodes is a proper solution. This document presents a formal method to detect problematic scenarios and a custom partitioner that adapts to them. it_IT
dc.language.iso en it_IT
dc.publisher Università Ca' Foscari Venezia it_IT
dc.rights © Dario Burato, 2019 it_IT
dc.title Load balancing and fault early detection for Apache Kafka clusters it_IT
dc.title.alternative Load balancing and early fault detection for Apache Kafka clusters it_IT
dc.type Master's Degree Thesis it_IT
dc.degree.name Informatica - computer science it_IT
dc.degree.level Laurea magistrale it_IT
dc.degree.grantor Dipartimento di Scienze Ambientali, Informatica e Statistica it_IT
dc.description.academicyear 2018/2019_sessione_estiva it_IT
dc.rights.accessrights embargoedAccess it_IT
dc.thesis.matricno 843238 it_IT
dc.subject.miur INF/01 INFORMATICA it_IT
dc.description.note it_IT
dc.degree.discipline it_IT
dc.contributor.co-advisor it_IT
dc.provenance.upload Dario Burato (843238@stud.unive.it), 2019-06-20 it_IT
dc.provenance.plagiarycheck Andrea Marin (marin@unive.it), 2019-07-08 it_IT
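
The abstract describes a partitioner that takes node performance into account instead of only equalizing message counts. The thesis text is not part of this record, so the following is not the author's partitioner. It is a minimal sketch of the general idea, assuming a hypothetical table of relative weights per partition (a higher weight meaning the partition sits on a faster node) and using smooth weighted round-robin as a stand-in selection rule. The class name `WeightedPartitioner` and its `partition` method are illustrative and do not belong to the Apache Kafka API.

```python
class WeightedPartitioner:
    """Illustrative only: picks partitions in proportion to assumed weights.

    `weights` maps a partition id to a positive relative weight. How such
    weights would be measured on a real cluster is outside this sketch.
    """

    def __init__(self, weights):
        if not weights:
            raise ValueError("at least one partition is required")
        if any(w <= 0 for w in weights.values()):
            raise ValueError("weights must be positive")
        self._weights = dict(weights)
        self._total = sum(self._weights.values())
        # Running score per partition, used by smooth weighted round-robin.
        self._current = {p: 0 for p in self._weights}

    def partition(self):
        """Return the partition id for the next message."""
        # Every partition gains its own weight on each call.
        for p, w in self._weights.items():
            self._current[p] += w
        # The partition with the highest running score is chosen...
        chosen = max(self._current, key=self._current.get)
        # ...and pays back the total weight, so picks stay interleaved.
        self._current[chosen] -= self._total
        return chosen


if __name__ == "__main__":
    # Hypothetical hybrid cluster: partition 2 is on a node assumed to be
    # twice as fast as the nodes hosting partitions 0 and 1.
    partitioner = WeightedPartitioner({0: 1, 1: 1, 2: 2})
    print([partitioner.partition() for _ in range(8)])
```

With integer weights, each partition is chosen exactly as many times as its weight within every cycle of `sum(weights)` messages, so a partition weighted 2 receives twice the messages of a partition weighted 1 without changing the number of partitions per node.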