Published: 6th August 2021
DOI: 10.4204/EPTCS.338
ISSN: 2075-2180


Proceedings of the 6th Workshop on
Formal Integrated Development Environment
Held online, 24-25th May 2021

Edited by: José Proença and Andrei Paskevich

Invited Presentation: An Overview of the mCRL2 Modelling and Verification Toolset
Jan Friso Groote
Invited Presentation: Challenges of a Point & Click Mathematical Proof Builder Interface
Benoit Rognier
The Specification Language Server Protocol: A Proposal for Standardised LSP Extensions
Jonas Kjær Rask, Frederik Palludan Madsen, Nick Battle, Hugo Daniel Macedo and Peter Gorm Larsen
Dezyne: Paving the Way to Practical Formal Software Engineering
Rutger van Beusekom, Bert de Jonge, Paul Hoogendijk and Jan Nieuwenhuizen
Analysis of Source Code Using UPPAAL
Mitja Kulczynski, Axel Legay, Dirk Nowotka and Danny Bøgsted Poulsen
Plotting in a Formally Verified Way
Guillaume Melquiond
A Logic Theory Pattern for Linearized Control Systems
Andrea Domenici and Cinzia Bernardeschi
Implicit and Explicit Proof Management in KeYmaera X
Stefan Mitsch
Verifying Time Complexity of Binary Search using Dafny
Shiri Morshtein, Ran Ettinger and Shmuel Tyszberowicz
Explaining Counterexamples with Giant-Step Assertion Checking
Benedikt Becker, Cláudio Belo Lourenço and Claude Marché
Deductive Verification via the Debug Adapter Protocol
Gidon Ernst, Johannes Blau and Toby Murray
How the Analyzer can Help the User Help the Analyzer
Yannick Moy
VeriFly: On-the-fly Assertion Checking with CiaoPP (extended abstract)
Miguel A. Sanchez-Ordaz, Isabel Garcia-Contreras, Víctor Pérez, Jose F. Morales, Pedro Lopez-Garcia and Manuel V. Hermenegildo


F-IDE 2021 is the sixth international workshop on Formal Integrated Development Environment, held online on May 24-25, 2021, as part of the 13th NASA Formal Methods Symposium, NFM 2021.

High levels of safety, security and also privacy standards require the use of formal methods to specify and develop compliant software (sub)systems. Any standard comes with an assessment process, which requires a complete documentation of the application to ease the justification of design choices and the review of code and proofs. Ideally, an F-IDE dedicated to such developments should comply with several requirements. The first one is to associate a logical theory with a programming language, in a way that facilitates the tightly coupled handling of specification properties and program constructs. The second is to offer a language/environment simple enough to be usable by most developers, even if they are not fully acquainted with higher-order logics or set theory, in particular by making development of proofs as easy as possible. The third is to offer automated management of application documentation. It may also be expected that developments done with such an F-IDE are reusable and modular. Tools for testing and static analysis may be embedded within F-IDEs to support the assessment process. The workshop is a forum of exchange on different features related to F-IDEs.

We solicited multiple kinds of contributions: research papers providing new concepts and results, position papers and research perspectives, experience reports, tool presentations. The workshop was open to contributions on all aspects of a system development process, including specification, design, implementation, analysis, and documentation. The current edition is a two-day online workshop with eleven communications, offering a large variety of approaches, techniques, and tools. Each submission was reviewed by three reviewers.

We had the honour of welcoming as keynote speakers Jan Friso Groote of the Eindhoven University of Technology and Benoit Rognier of the Edukera company. The abstracts of their talks open the present proceedings.

The committees of F-IDE 2021 were composed of:

Steering Committee PC Chairs Program Committee Invited Reviewers

We would like to thank all authors who submitted their work to F-IDE 2021. We are grateful to the PC members and invited reviewers for writing high-quality reviews and participating in the subsequent discussion. We thank the organizers of the 13th NASA Formal Methods Symposium who hosted our workshop and provided all the technical support for the online presentations and discussions. Our job as Program Chairs was facilitated by the EasyChair platform and we thank the EPTCS editors who accepted the proceedings of F-IDE 2021 for publication.

An Overview of the mCRL2 Modelling and Verification Toolset

Jan Friso Groote (Eindhoven University of Technology, the Netherlands)

mCRL2 is a process algebraic language to specify the behaviour of communicating systems. It allows to specify parallel behaviour with data, time and probabilities. It provides various behavioural reduction and visualisation algorithms to help understand behaviour. It uses the modal mu-calculus with data and time to specify properties. Both languages are restricted to their essential necessities, and mathematical elegance was more a driving factor in their design than user acceptance. But they are both extremely expressive and versatile. mCRL2 and its modal logic are very effective in modelling and analysing core protocols and distributed algorithms. But — due to its powerful algorithms — they are also used as the verification backend for various industrial software development environments. mCRL2 comes with the Boost license meaning that it is open source and free to use for any purpose, both commercially and researchwise.

Challenges of a Point & Click Mathematical Proof Builder Interface

Benoit Rognier (Edukera, France)

One of the key features of Edukera is the ergonomic design of the numerical paper which allows students to build a mathematical proof with point and click interactions. This numerical paper has been developed and improved over several years of experiment with students' and teachers' feedback. This session will address the key challenges a formal proof interface may pose:

This session will show the experiments that have been conducted over the years and their outcomes.

VeriFly: On-the-fly Assertion Checking with CiaoPP (extended abstract)

Miguel A. Sanchez-Ordaz (IMDEA Software Institute and Universidad Politecnica de Madrid (UPM))
Isabel Garcia-Contreras (IMDEA Software Institute and Universidad Politecnica de Madrid (UPM))
Víctor Pérez (IMDEA Software Institute and Universidad Politecnica de Madrid (UPM))
Jose F. Morales (IMDEA Software Institute and Universidad Politecnica de Madrid (UPM))
Pedro Lopez-Garcia (IMDEA Software Institute and Spanish Council for Scientific Research (CSIC))
Manuel V. Hermenegildo (IMDEA Software Institute and Universidad Politecnica de Madrid (UPM))

The tight integration of global analysis and verification at early stages of the software development cycle can greatly help developers detect high-level, semantic errors in programs or certify their absence. Particularly useful is the application of such tools during code development, simultaneously with the code writing process. This requires fast response times and source-level presentation of the results within the code development environment, in order to provide timely and useful feedback to the programmer. However, performing a full and precise semantic analysis every time a change is made in a project can be prohibitively expensive, specially for complex properties and large, realistic code bases.

CiaoPP (Hermenegildo et al. 2005; Garcia-Contreras, Morales, and Hermenegildo 2021) is a program development framework which performs combined static and dynamic program analysis, assertion checking, and program transformations, based on computing provably safe approximations of properties, generally using the technique of abstract interpretation (Cousot and Cousot 1977). CiaoPP can be applied to programs expressed as (constrained) Horn clauses (and in particular, in the Ciao programming language1), as well as in other high- and low-level languages, using the now well-understood technique of semantic translation into intermediate Horn clause-based representation (Peralta, Gallagher, and Sağlam 1998; Henriksen and Gallagher 2006; Méndez-Lojo, Navas, and Hermenegildo 2007; Gómez-Zamalloa, Albert, and Puebla 2009; Grebenshchikov et al. 2012; Gurfinkel et al. 2015; De Angelis et al. 2015; Kahsai et al. 2016; Gallagher et al. 2020).2 The framework has many uses, but one of the main ones is precisely as an aid for the programmer during program development, since it can capture semantic errors that are significantly higher-level than those detected by classical compilers, as well as produce certificates that programs do not violate their assertions, eliminate run-time assertion tests, etc. In CiaoPP, the requirements for fast response time stemming from interactive use pointed out above are addressed through a number of techniques, and in particular by an efficient fixpoint engine, which performs context- and path-sensitive inter-procedural analysis incrementally. I.e., it reacts to fine-grain editions, avoiding reanalyses where possible, both within modules and across the modular organization of the code into separate compilation units.

In this extended abstract we present "VeriFly," an integration of the CiaoPP static analysis and verification framework within an integrated development environment (IDE) in which analysis and verification results are reflected directly on the program text as colorings and tooltips. The concrete implementation described builds on an existing Emacs-based development environment for the Ciao language. Emacs was chosen because it is a solid platform and preferred by many experienced users. The integration reuses off-the-shelf "on-the-fly" syntax checking capabilities offered by the Emacs flycheck package.3 However, this low-maintanance approach should be easily reproducible in other modern extensible editors. We also discuss how our integration takes advantage of the incremental and modular features of CiaoPP to achieve a high level of reactivity.

The CiaoPP Framework

We start by providing an informal overview of the components and operation of the Ciao/CiaoPP framework (represented by the yellow shaded part of Figure 1).

Integration of the CiaoPP framework in the Emacs-based
Figure 1. Integration of the CiaoPP framework in the Emacs-based IDE.

In order to support different input languages, CiaoPP translates input programs to a language-independent intermediate representation, which in this case is (constrained) Horn clause-based (Méndez-Lojo, Navas, and Hermenegildo 2007) -an approach used nowadays in many analysis and verification tools. We refer to this intermediate representation as the "CHC IR." This CHC IR is handled uniformly by the analyzers, independently of the input language. The translation is performed by the "Front-end" (Figure 1). Techniques such as partial evaluation and program specialization offer powerful methods to obtain such translations with provable correctness. Using this approach, CiaoPP has been applied to the analysis, verification, and optimization of a number of languages (besides Ciao) ranging from very high-level ones to bytecode and machine code, such as Java, XC (C like) (Lopez-Garcia et al. 2018), Java bytecode (Navas, Méndez-Lojo, and Hermenegildo 2009; Navas, Méndez-Lojo, and Hermenegildo 2008), ISA (Liqat et al. 2014), LLVM IR (Liqat et al. 2016), Michelson (Perez-Carrasco et al. 2020), ..., and properties ranging from pointer aliasing and heap data structure shapes to execution time, energy, or smart contract "gas" consumption (Mera et al. 2008; Liqat et al. 2018). However, for simplicity and concreteness, in our examples herein we will use logic programs, because of their proximity to the CHC IR. In particular, the examples will be written in Ciao's (Hermenegildo et al. 2012) logic programming subset, with Prolog-style syntax and operational semantics.4 The framework itself is also written in Ciao.

An important element of the framework is its assertion language (Bueno et al. 1996; Hermenegildo, Puebla, and Bueno 1999; Puebla, Bueno, and Hermenegildo 2000a). Such assertions can express a wide range of properties, including functional (state) properties (e.g., shapes, modes, sharing, aliasing, ...) as well as non-functional (i.e., global, computational) properties such as resource usage (energy, time, memory, ...), determinacy, non-failure, or cardinality. The set of properties is extensible and new abstract domains (see the later discussion of the analysis) can be defined as "plug-ins" to support them. Assertions associate these properties to different program points, and are used for multiple purposes, including writing specifications, reporting static analysis and verification results to the programmer, providing assumptions, describing unknown code, or generating test cases automatically.

We will use for simplicity (a subset of) the "pred"-type assertions, which allow describing sets of preconditions and conditional postconditions on the state for a given procedure (predicate). The syntax that we present below is that of the CHC IR(and the Ciao language); assertions in the different input languages are translated by the front end to and from this format (left of Figure 1). A pred assertion is of the form:

:- [ Status ] pred Head [: Pre ] [=> Post ].

where Head is a predicate descriptor that denotes the predicate that the assertion applies to, and Pre and Post are conjunctions of property literals, i.e., literals corresponding to predicates meeting certain conditions (Puebla, Bueno, and Hermenegildo 2000a). There can be multiple pred assertions for a predicate. Pre expresses properties that hold when Head is called, namely, at least one Pre must hold for each call to Head. Post states properties that hold if Head is called in a state compatible with Pre and the call succeeds. Both Pre and Post can be empty conjunctions (meaning true), and in that case they can be omitted. Status is a qualifier of the meaning of the assertion. Here we consider (in the context of static assertion checking (Puebla, Bueno, and Hermenegildo 2000b)) the following Statuses:

For example, the following assertions describe different behaviors of the pow(X,N,P) predicate, that computes \(P=X^N\): (1) is stating that if the exponent of a power is an even number, the result (P) is non-negative, (2) states that if the base is a non-negative number and the exponent is a natural number the result P also is non-negative:

:- pred pow(X,N,P) : (int(X), even(N)) => P >= 0.  % (1)
:- pred pow(X,N,P) : (X >= 0,  nat(N)) => P >= 0.  % (2)
pow(_, 0, 1).
pow(X, N, P) :-
    N > 0,
    N1 is N - 1, 
    pow(X, N1, P0), 
    P is X * P0.

In order to perform assertion verification, the CiaoPP framework uses analyses based on abstract interpretation (the "Static Analyzer" in Figure 1), in order to statically compute safe approximations of the program semantics at different relevant program points. Given a program, a number of abstract domains are automatically selected, as determined by their relevance to the properties that appear in the assertions written by the programmer or present in libraries, and a combined analysis is performed using this set of abstract domains. The resulting safe approximations are compared directly with these assertions (the "Static Checker" in Figure 1) in order to prove the program correct or incorrect with respect to them. For each assertion originally with status check, the result of this process (boxes on the right of Figure 1) can be: that it is verified (the new status is checked), that a violation is detected (the new status is false), or that it is not possible to decide either way, in which case the assertion status remains as check. In such cases, optionally, a warning may be displayed and/or a run-time test generated for (the part of) the assertion that could not be discharged at compile-time, test cases generated, etc. We refer the reader to (Bueno et al. 1997; Puebla, Bueno, and Hermenegildo 2000b; Lopez-Garcia et al. 2018; Hermenegildo et al. 2005), for a detailed description of sufficient conditions used to prove correctness (or incorrectness) w.r.t. some specification, or intended behavior. Note that, in our framework, the intended semantics are also specified by using predicates. This means that some of the properties are undecidable and also not exactly representable in the abstract domains.

As to performing the static analysis, CiaoPP builds an analysis result in the shape of an abstract graph, representing how the clauses are executed. Nodes in this graph abstract how predicates are called (i.e., how they are used in the program). A predicate may have several nodes if there are different calling situations (also known as context-sensitivity). For each calling situation, properties that hold if the predicate succeeds are also inferred; these are represented by (true) assertions. The values for the call and success properties are abstractions of the state in the concrete execution of the program and are defined in terms of the abstract domain(s) selected.5 The edges in the graph capture how predicates call each other. Hence this analysis graph also provides an abstraction of the paths explored by the concrete executions through the program (also known as path-sensitivity). The analysis graph thus embodies two different abstractions (two different abstract domains): the graph itself is a regular approximation of the paths through the program, using a domain of regular structures. Separately, the abstract values (call and success patterns) contained in the graph nodes are finite representations of the states occurring at each point in these program paths, by means of one or more data-related abstract domains.

Finally, in order to support incrementality, the analysis graph produced by the static analyzer is made persistent ("Analysis DB" in Figure 1), storing an abstraction of the behavior of each predicate and predicate (abstract) call dependencies. In turn, the "Front-end" (Figure 1) keeps track of and translates source code changes into, e.g., clause and assertion additions, deletions, and changes in the intermediate representation (\(\Delta\) CHC in the figure). These changes are transmitted to the static analyzer, which performs incremental fixpoint computation. This process consists in finding the parts of the graph that need to be deleted or recomputed, following their dependencies, and updating the fixpoint (Garcia-Contreras, Morales, and Hermenegildo 2021, 2020; Puebla and Hermenegildo 1996; Hermenegildo et al. 2000). The key point here is that a tight relation between the analysis results and the predicates in the program is kept, allowing reducing the re-computation to the part of the analysis that corresponds to the affected predicates, and only propagating it to the rest of the analysis graph if necessary.

VeriFly: The On-the-fly IDE Integration

Figure 1 shows the overall architecture of the CiaoPP verification framework integrated with the new IDE components, represented by the new box to the left, and the communication to and from CiaoPP. As mentioned before, the tool interface is implemented within Emacs and the on-the-fly support is provided by the Emacs "flycheck" package. Flycheck is an extension developed for GNU Emacs originally designed for on-the-fly syntax checking, but we use it here in a semantic context. However, as also mentioned before, a similar integration is possible with any reasonably extensible IDE.

The CiaoPP framework runs in the background in daemon mode. When a file is opened, modified (and after some small period of inactivity), or saved, an editor checking event is triggered. Edit events notify CiaoPP(via a lightweight client) about program changes. The CiaoPP daemon will receive these changes, and, behind the scenes, transform them into changes at the CHC IR level (also checking for syntactic errors), and then incrementally (re-)analyze the code and (re-)check any reachable assertions. This can be code and assertions in libraries, other modules, language built-in specifications, or user-provided assertions. The results (errors, verifications, and warnings), from static (and possibly also dynamic) checks are returned to the IDE and presented as colorings and tooltips directly on the program text. This overall behavior is what we have called in our tool the "VeriFly" mode of operation.

In general, CiaoPP can be run fully automatically and does not require the user to change the configuration or provide invariants or assertions, and, for example, selects automatically abstract domains, as mentioned before. However, the system does have a configuration interface that allows manual selection of abstract domains, and of many parameters such as whether passes are incremental or not, the level of context sensitivity, error and warning levels, the type of output, etc.

The option browsing interface of the tool is shown in (Sanchez-Ordaz et al. 2021), as well as some options (abstract domain selections) in the menus for the cost analysis, shape and type analysis, pointer (logic variable) aliasing analysis, and numeric analysis.

An Example

Static verification of (moded) types, determinacy, termination, and cost (errors detected).
Figure 2. Static verification of (moded) types, determinacy, termination, and cost (errors detected).

As an example of the system in action.6 Figure 2 shows naive reverse written using the functional notation library and illustrates the detection of an error regarding unintended behavior w.r.t. cost. The assertion in lines 5 and 6 of Figure 2 (left) states that predicate nrev should be linear in the length of the (input) argument A. This is expressed by the property steps_o(length(A)), meaning that the cost, in terms of resolution steps, of any call to nrev(A, B) with A bound to a list and B a free variable, is in \(O(\)length(A)\()\). However, this worst case asymptotic complexity stated in the user-provided assertion is incompatible with a safe, quadratic lower bound on the cost of such calls (\(\frac{1}{2} \ length{(A)}^2 + \frac{3}{2} \ length(A) + 1\)) inferred by the static analyzer. In contrast, assertion in lines 5 and 6 of Figure 3 (left) states that predicate nrev should have a quadratic worst case asymptotic complexity in the length of A, which is proved to hold by means of the upper-bound cost function inferred by analysis (which coincides with the lower bound above). Figure 3 also shows the verification of (moded) types, determinacy, non-failure, and termination properties. Additional examples can be found in (Sanchez-Ordaz et al. 2021).

Static verification of (moded) types, determinacy, termination, and cost (verified).
Figure 3. Static verification of (moded) types, determinacy, termination, and cost (verified).

Some Performance Results and Conclusions

Our initial experience with the integrated tool shows quite promising results, with low latency times that provide early, continuous, and precise "on-the-fly" semantic feedback to programmers during the development process. This allows detecting many types of errors including swapped variables, property incompatibilities, illegal calls to library predicates, violated numeric constraints, unintended behaviour w.r.t. termination, resource usage, determinism, covering and failure (exceptions), etc.

As an example, on the well-known chat-80 program7, which contains \(5.2\)k lines of Prolog code across 27 files, and uses a number of system libraries containing different Prolog built-ins and library predicates, we performed experiments consisting in opening a specific module in the IDE, and activating the checking of assertions with global analysis, i.e., analyzing the whole application as well as the libraries, and then performing a series of small edits, observing the total, roundtrip response time, i.e., the time from edit to graphical update of assertion coloring in the IDE. To study whether incrementality improved response times significantly, we included experiments enabling and disabling it. The experiments were performed in a MacBook Air with the Apple M1 chip and 16 GB of RAM. We evaluated the tool with sharing/groundness domains because they are known to be costly and at the same time necessary to ensure correctness of most other analyses in (C)LP systems that support logical variables, and furthermore in any language that has pointers (aliasing).

In all of the experiments incrementality provided significant speedups. When performing edits both in predicates and assertions, incrementality allowed keeping the response time, crucial for the interactive use case of the analyzer that we propose, under 2 s, achieving speedups greater than \(\times 3.5\). When performing edits only in assertions, the tool detects that no changes are required in the analysis results, and the assertions can be rechecked w.r.t. the available (previous) analysis, improving the performance significantly (more than \(\times 9\)) but, more importantly, the response times are around 2 s. All in all, the incremental features allow using many domains and, at least in some cases, even the most expensive domains. The experiments also revealed that an interesting configuration of this tool is to run different analyses in a portfolio, where which analyses to run is decided depending on the kind of change occurred. In addition, we observed a constant overhead of \(0.4\) s for loading the code-parsing and prior transformations-in the tool. This is currently still not fully incremental and has not been optimized yet to load only the parts that change. Verification times are negligible w.r.t. analysis times and are approx. \(0.1\) s; this is also non incremental, since we found that it is not currently a bottleneck, although we plan to make it incremental in the future to push the limits of optimizations forward. We also plan to continue to work to achieve further reactivity and scalability improvements, enhanced presentations of verification results, and improved diagnosis.



  2. Space in this tool presentation extended abstract does not permit covering other related work properly (such as, e.g., (Rustan, Leino, and Wüstholz 2015; Tschannen et al. 2011)), but additional references can be found in the CiaoPP overview papers and the other citations provided.


  4. Ciao is a general-purpose programming language that integrates a number of programming paradigms including functional, logic, and constraint programming.

  5. As mentioned before, abstract domains are defined as plug-ins which provide the basic abstract domain lattice operations and transfer functions, and are made accessible to the generic fixpoint computation component.

  6. The system is demonstrated in the workshop talk, showing additional examples.


  8. This document is an extended abstract of Technical Report CLIP-1/2021.0 (Sanchez-Ordaz et al. 2021). Partially funded by MICINN PID2019-108528RB-C21 ProCode and Madrid P2018/TCS-4339 BLOQUES-CM. We would like to thank the anonymous reviewers for their very useful comments.