Bringing Model Checking Closer to Practical Software Engineering

Remenska, Daniela

Repository landing page

oai:cds.cern.ch:2130792

Bringing Model Checking Closer to Practical Software Engineering

Authors: Daniela Remenska
Publication date: 10 February 2016
Publisher

Abstract

Software grows in size and complexity, making it increasingly challenging to ensure that it behaves correctly. This is especially true for distributed systems, where a multitude of components are running concurrently, making it dicult to anticipate all the possible behaviors emerging in the system as a whole. Certain design errors, such as deadlocks and race-conditions, can often go unnoticed when testing is the only form of verication employed in the software engineering life-cycle. Even when bugs are detected in a running software, revealing the root cause and reproducing the behavior can be time consuming (and even impossible), given the lack of control the engineer has over the execution of the concurrent components, as well as the number of possible scenarios that could have produced the problem. This is especially pronounced for large-scale distributed systems such as the Worldwide Large Hadron Collider Computing Grid. Formal verication methods oer more rigorous means of determining whether a system satises certain behavioral requirements, and one highly eective veri- cation technique is model checking. Model checking is a mathematically-rooted algorithmic procedure, nowadays automated by many actively-maintained and mature tools. The drawback, however, is the necessity to be procient in formal language notations such as process algebras and temporal logics, essential for describing a model of the system and the behavioral requirements to be veried by these tools. In this thesis we address the question of how to integrate model check- ing into the common software development cycle of realistic-scale, distributed, data-driven software, by automating the aspects that require formal methods expertise. For this purpose, we chose a formalism powerful enough for modeling and addressing design errors that are common for such software. We are mo- tivated by the distributed grid framework called DIRAC, developed and used by a physics community at the European Organization for Nuclear Research (CERN). We examine the feasibility of using the mCRL2 language, by system- atically abstracting the source code and modeling the behavior of two DIRAC subsystems. The case studies indicate that mCRL2 has the necessary concur- rency, data abstraction and manipulation mechanisms to faithfully model the subsystems. By simulating, visualizing, and model checking the resulting mod- els with the mCRL2 toolset, we were also able to gain a better insight into the system behavior, and replaying the counter-example traces helped in localizing the detected problems. Although some of the faulty behaviors already mani- fested in real life before the subsystems were formally modeled, they were not localized within the timespan of the modeling and verication we performed. Manually constructing a formal model based a software implementation can be time consuming. In addition to the risk of making modeling mistakes, in or- der to be useful, the formal model must be kept up to date with the continuous software updates. Software implementations contain too many language-specic details to serve as footprints for deriving formal models. In software engineer- ing, higher-level visual designs are created for communicating and validating the requirements, before the implementation and testing takes place. UML is generally accepted as a visual modeling language for this purpose. We present 1 a transformation approach for automatically deriving mCRL2 formal models, based on UML sequence and activity diagrams. We discuss the semantic choices we made with respect to some ambiguities in the ocial semi-formal UML se- mantics, and compare our approach to existing ones, in the context of a well- known classication. The transformation preserves the object-oriented structure of the system, facilitating a straightforward round-trip approach in which the verication counter-examples can also be visually presented as sequence dia- grams. To provide some empirical evidence and condence in the correctness of the translation, we apply the tool-supported approach on DIRACs new work- load system functionality. We discover the root cause of a logical aw leading to no-progress, which had been observed earlier in the testing phase of this functionality. In model checking, behavioral properties must be expressed as formulas in temporal logic. The mCRL2 toolset requires the use of the modal -calculus for this; a very expressive logic, but not very intuitive nor accessible. Behavioral requirements for software are typically expressed in natural language, and as such can be subjects to dierent interpretations. In the context of the main re- search question, to bring the process of correctly eliciting behavioral properties closer to software engineers, we introduce a property assistant tool - PASS, as part of a UML-based front-end to the mCRL2 toolset. The tool provides assis- tance to non-experts in eliciting properties which capture the required behavior precisely and unambiguously. It is based on a well-known property pattern classication, which we extend with new pattern variations for the event-based modal -calculus. Besides a -calculus formula, the tool outputs a natural language summary, and a UML sequence diagram depicting the property. In addition, for a subclass of properties, PASS automatically generates monitoring structures which can be used for a (potentially) more ecient property-driven runtime verication. We demonstrate the usage of PASS on DIRACs behavior. The property pattern classication is based on a large literature survey of how specication formalisms are used in practise. Since we did not encounter any -calculus formulas in the collection repository, we were curious whether its usage diers signicantly from other formalisms. We surveyed 25 published works that use -calculus to express system properties from dierent domains. From the 178 properties in the survey, our pattern extensions improve the stan- dard patterns classication coverage by 10compared to the more commonly used logics (like LTL- Linear Temporal Logic, en CTL - Computation Tree Logic), is able to capture certain property patterns which either LTL or CTL cannot. We observed a distribution of patterns which is rather consistent with similar sur- veys focusing on other formalisms. The results also indicate that subtle mistakes can easily be made when constructing temporal logic formulas manually. As one possible future direction, it would be interesting to extend this work beyond discovery of behavioral problems, in particular in the area of perfor- mance assessment of UML models. Although this type of analysis is a primary concern for the domain of real-time and embedded systems, valuable insight can also be gained about communication costs, delays, and various bottlenecks in distributed systems

Similar works

Full text

CERN Document Server

oai:cds.cern.ch:2130792

Last time updated on 09/08/2016

This paper was published in CERN Document Server.

Having an issue?

Is data on this page outdated, violates copyrights or anything else? Report the problem now and we will take corresponding actions after reviewing your request.