String analysis for software verification

dc.contributor.advisor Cortesi, Agostino it_IT
dc.contributor.author Olliaro, Martina <1991> it_IT
dc.date.accessioned 2021-02-10 it_IT
dc.date.accessioned 2021-06-22T06:37:43Z
dc.date.available 2021-06-22T06:37:43Z
dc.date.issued 2021-03-24 it_IT
dc.identifier.uri http://hdl.handle.net/10579/18470
dc.description.abstract This thesis investigates string manipulation with security implications in different programming languages and advances the state of the art by applying abstract interpretation theory to string analysis. Erroneous string manipulation is a challenging problem in software verification; it is one of the major causes of program vulnerabilities that malicious users can exploit, with severe consequences for the affected systems. By string analysis we mean statically computing the set of string values that may be assigned to a variable. As with other analysis problems, computing this set exactly is undecidable. Thus, a certain degree of approximation is necessary in order to find evidence of bugs and vulnerabilities in string-manipulating code. We rely on Abstract Interpretation, a mathematical theory that enables us to define approximations and prove their soundness. The five main contributions of this thesis are:
(1) We introduce a new abstract domain for C strings, called M-String. The way the domain is conceived allows it to be tailored to specific verification tasks (e.g., detection of buffer overflows). We describe the concrete and the abstract semantics of basic string operations and formally prove their soundness. Furthermore, we provide an executable implementation of the abstract operations. Using a tool that automatically lifts existing programs into the M-String domain, along with an explicit-state model checker, we experimentally evaluate the accuracy of the proposed domain on real-case test programs.
(2) We combine abstract domains through the reduced product of string shape abstraction and string content abstraction, in order to improve the ability to detect inconsistent states leading to program errors without a major impact on efficiency. In particular, the combinations pair string abstract domains introduced in the literature with the segmentation domain, which we instantiate for string analysis.
(3) Completeness, in Abstract Interpretation, ensures that the analysis does not lose information with respect to the property of interest. We provide a systematic and constructive approach for generating the completion of string domains for dynamic languages, and we apply it to the refinement of existing string abstractions. Indeed, for dynamic languages, the lack of completeness in string analysis is a key security issue, as poorly managed string manipulation code may easily lead to significant security flaws. We also provide an effective procedure to measure the precision improvement obtained when lifting the analysis to complete domains.
(4) Almost all existing string abstract domains track information about single variables in a program (e.g., whether a string contains a certain character) without inspecting their relationships with other values, which causes the loss of relevant knowledge about their possible values. Thus, we introduce a generic framework that allows us to formalize relational string abstract domains based on ordering relationships, and we instantiate this framework with several domains built upon different well-known string orders (e.g., substring relationships). We implemented the domain based on substring ordering, and we provide an experimental evaluation of its effectiveness on some case studies.
(5) We manipulate string values in the context of relational database watermarking. We propose a semantic-driven watermarking approach for relational textual databases, which marks multi-word textual attributes by combining the synonym substitution technique for text watermarking with notions from semantic similarity analysis, and which deals with the semantic perturbations caused by the watermark embedding. We show the effectiveness of our approach through an experimental evaluation. We also prove the resilience of our approach with respect to the random synonym substitution attack. it_IT
dc.language.iso en it_IT
dc.publisher Università Ca' Foscari Venezia it_IT
dc.rights © Martina Olliaro, 2021 it_IT
dc.title String analysis for software verification it_IT
dc.title.alternative it_IT
dc.type Doctoral Thesis it_IT
dc.degree.name Computer Science (Informatica) it_IT
dc.degree.level PhD (Dottorato di ricerca) it_IT
dc.degree.grantor Dipartimento di Scienze Ambientali, Informatica e Statistica it_IT
dc.description.academicyear Dottorato_appello_150321_33 con proroga it_IT
dc.description.cycle 33 it_IT
dc.degree.coordinator Cortesi, Agostino it_IT
dc.location.shelfmark D002118 it_IT
dc.location Venezia, Archivio Università Ca' Foscari, Tesi Dottorato it_IT
dc.rights.accessrights openAccess it_IT
dc.thesis.matricno 834397 it_IT
dc.format.pagenumber [18], 217 p. it_IT
dc.subject.miur INF/01 INFORMATICA it_IT
dc.description.note Joint supervision (cotutelle) with Masarykova Univerzita it_IT
dc.degree.discipline it_IT
dc.contributor.co-advisor Matyas, Vashek it_IT
dc.provenance.upload Martina Olliaro (834397@stud.unive.it), 2021-02-10 it_IT
dc.provenance.plagiarycheck Agostino Cortesi (cortesi@unive.it), 2021-03-15 it_IT
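
The abstract describes string analysis as a sound over-approximation of the strings a variable may hold, and mentions non-relational abstractions that track, for example, whether a string contains a certain character. The following is a minimal sketch of that idea, assuming a simplified character-inclusion abstraction. It is not the thesis's M-String domain or any other domain defined in the thesis, and the class and method names (`CharInclusion`, `of`, `join`, `concat`, `may_contain`, `must_contain`) are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CharInclusion:
    """Abstract string value (illustrative only, not taken from the thesis).

    certain: characters that occur in every concrete string represented.
    maybe:   characters that occur in at least one possible concrete string.
    Invariant: certain is a subset of maybe.
    """
    certain: frozenset
    maybe: frozenset

    @staticmethod
    def of(s: str) -> "CharInclusion":
        # Abstraction of a single known string: both sets are exact.
        chars = frozenset(s)
        return CharInclusion(chars, chars)

    def join(self, other: "CharInclusion") -> "CharInclusion":
        # Upper bound used where control-flow paths merge: a character is
        # certain only if certain on both paths, possible if possible on either.
        return CharInclusion(self.certain & other.certain,
                             self.maybe | other.maybe)

    def concat(self, other: "CharInclusion") -> "CharInclusion":
        # Abstract concatenation: the result contains the characters of both operands.
        return CharInclusion(self.certain | other.certain,
                             self.maybe | other.maybe)

    def may_contain(self, ch: str) -> bool:
        return ch in self.maybe

    def must_contain(self, ch: str) -> bool:
        return ch in self.certain


# Models: query = "id=" + ("42" if cond else "7'")
prefix = CharInclusion.of("id=")
value = CharInclusion.of("42").join(CharInclusion.of("7'"))
query = prefix.concat(value)
```

In this sketch, `query.may_contain("'")` holds while `query.must_contain("'")` does not, so an analysis built on it could flag a possible quote character reaching the query without claiming it always does. The approximation is deliberately coarse: it forgets character order and string length, which is the kind of precision loss that richer domains aim to reduce.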

