I am interested in learning if anyone has experience with using Coq for code verification for high-assurance/safety-critical/high-code-complexity software code verification. If so, are there levels of software assurance and of code complexity for which Coq would not be recommended relative to software code verification.

This is a large and complex question, but I see at least three levels of assurance, with each level increasing effort substantially from the previous one:

- Verification in Coq of algorithms/protocols abstractly
- Verification in Coq of executable code, extracted to OCaml or Haskell
- Verification in Coq of executable code down to assembly level

Note that 3 is possible because real-world programming languages and processor ISAs and their semantics can be (and have been) embedded in Coq.

For level 1, there is a lot of work on, e.g., cryptography and security.

Level 2 has been done for C compilers, fault-tolerant distributed key-value stores, and crash-safe file systems.

Level 3 has been done for cryptographic code, e.g., elliptic curves, SHA-256, and the OpenSSL HMAC.

To my knowledge, Coq and other proof assistants with small proof checkers can give higher assurance than all alternative forms of software verification regardless of which abstraction level verification is applied at.

Karl:

Thank you very much for this detailed reply. This is definitely helpful.

My interest is primarily regarding what you have referred to as levels 1 and 3. The executable code for our applications include several types of high-assurance code.

The techniques used for the verification of the software types you listed may though be applicable to our area. So, I would welcome any further information you may have on use of Coq for code verification of software types beyond those you listed for levels

1 and 3.

It sounds as if from your description that substantially more work has been done for level 3, so perhaps it will be easier to focus on level 3 as regarding applications for other areas beyond those you listed. In that vein, I also wanted to ask which programming

languages have been embedded in Coq.

Again, thank you very much for this excellent information and assistance.

Gerald Prichard

Level 3, i.e., program assembly verification, is actually the least explored option in the research literature. This is both related to the novelty and for the significant effort required. The most developed Coq framework that gives assembly-level guarantees, VST, is built for verification of (only) C programs. VST uses CompCertâ€™s embedding of C into Coq. Other languages with realistic subsets embedded in Coq include Rust.

Karl:

Thank you sincerely for all of this insight. So, given that Level 3 is program assembly verification and given that substantially more work has been done for Level 1 than Level 2 and more for Level 2 than Level 3 (note: so sorry I had interpreted your original â€śin reverseâ€ť relative to the amount of work for verification of high-assurance SW and that I had misunderstood the definition of Level 3), it looks like my main interest is Level 1 and Level 2.

Given this, can you re-address my question as it would apply to Level 2 code verification? Specifically, what other areas beyond listed in original reply has Coq been applied to for Level 2 code verification?

Thank you!

Gerald

A seemingly popular application of Coq extraction is for various forms of distributed systems:

- Verdi Raft, a verified implementation of the Raft distributed consensus protocol, including a strongly consistent, fault-tolerant, distributed key-value store (mentioned above)
- Disel: includes a verified implementation of a distributed two-phase commmit protocol
- Velisarios: implementation of Byzantine fault-tolerant state machine replication via the PBFT algorithm
- Chapar: verified implementations of causally consistent key-value store algorithms

Outside of distributed systems, verification of various Haskell libraries, and runtime monitors come to mind.

Karl:

This is very helpful.

Our applications are largely physics-based software, though the types of SW you have listed as being verified via Coq are of some interest to us.

It seems that we should first focus on those areas with more proven success (cyber and distributed systems) and then use that base of knowledge to extend to our physics-based applications.

Thank you sincerely for your very kind and detailed notes, Karl.

Gerald

Karl:

I have begun learning about Why3 and wanted to ask if you think Coq has significant advantages compared to Why3 for formal-methods code verification. It appears that Why3 works with C, Java, and Ada, which are the three main production-level codes we use, though technically C is actually C++ for us. {Note: For C++ I am investigating the Microsoft development called Lean.}

Thank you,

Gerald

Why3 is an intermediate verification language with imperative features, designed mainly by a previous Coq developer. Its specification language is, as far as I can recall, first-order logic with some extensions such as inductive predicates (intuitively, less powerful than Coqâ€™s). The selling point of Why3 is its automation for verification-condition generation and output to solvers (which are mostly automatic SMT solvers, but may still include Coq). Why3 gives more out-of-the-box automation for verification of imperative programs, but similar machinery for imperative verification is available in Coq with a much smaller trusted base, e.g., via Iris.

Lean is a proof assistant quite similar to Coq (same logical foundations and high-level design principles). As a new system, there arenâ€™t all that many developments out there for most domains, although Leanâ€™s mathematics library is quite impressive. Lean doesnâ€™t really give any C++ verification capabilities directly, and to my knowledge doesnâ€™t yet support extraction or a similar mechanism for obtaining executable code.

Karl:

This is very helpful â€“ truly!

I had read something different about Lean for C++, but am glad you provided this insight as well.

Does Coq work with Java and Ada?

Gerald

I would look at the formal SPARK project for Ada program verification. Various fragments of Java have been formalized in Coq, but the best current formal approximation to Java is Jinja for the Isabelle/HOL proof assistant, which I believe has a toolchain for converting real Java programs to Jinja equivalents.

Donâ€™t use Lean, unless you are a researcher or a hobbyist. All the feedback I got on Lean is that it is *not mature*. Relying on it for industrial use would be suicidal until it has stabilized. No attention is yet given to backward compatibility. Lean 3 is currently frozen and the development of Lean 4 is happening behind closed doors. On the other hand, Coq does have industrial users and the Coq development team cares very much about providing a good compatibility story.

Besides, I think calling Lean a Microsoft development is abusive. It is a Microsoft Research project. In fact, it is a single man project and this man happens to be a Microsoft Research employee. But Microsoft Research is like a lab. You shouldnâ€™t be fooled by the name. There are no guarantees made about the future of Lean. Another Microsoft Research project for program verification is F*. It has had industrial applications. But even F* is far from stable, and less mature than Why3 for instance.

To come back to Lean, it has a number of very nice features which make it very appealing, and less clunky than Coq. But there is a real risk that after trying it and finding out how difficult it is in practice, people conclude that formal proofs are not a mature domain. It is not formal proofs who are not a mature domain, it is Lean which is not a mature proof assistant.

Theo:

Thank you sincerely for this information.

Our community is fairly committed to trying Coq, and I think it seems to provide the best chance of us implementing formal methods for our domain, though I am still looking for others who have applied Coq to the type of technical domain we have for software

(which is very high-assurance, highly physics-based code).

Very best,

Gerald

@gprichard can you say something more about what â€śphysics-basedâ€ť means in this context?

For example, there are several lines of work on hybrid systems, and some people involved in the Mathematical Components project have explored robot manipulators. But hard to say whether those would be considered physics proper.

Karl:

Most of our applications involve RF/IR/communications sensing.

Thank you for all of your help, Karl.

Gerald

For anyone using Coq for modeling continuous systems, Iâ€™d recommend taking a look at MathCompâ€™s analysis subproject and infotheo, which includes relevant definitions from analysis and probability theory and beyond. These projects are set to be merged in the future.

You may want to look at

This is a collection of C functions and data types whose behavior has been formally specified with ACSL and formally verified with Frama-C/WP.