CoqHammer 1.1.1 for Coq 8.9

We are happy to announce the release of CoqHammer 1.1.1 for Coq 8.9.

CoqHammer is a general-purpose automated reasoning tool that combines learning from previous proofs with the translation of problems to the logics of automated provers and the reconstruction of successfully found proofs via tactics.

In a typical workflow, the user first tries the hammer tactic which invokes one or more automated provers, and, if successful, prints a sequence of calls to custom reconstruction tactics bundled with CoqHammer. The user can then substitute this sequence for the hammer invocation. The reconstruction tactics can also be used standalone.

Previous work has shown that around 40% of the lemmas in the Coq standard library and around 25% of the lemmas in a large Coq dataset can be automatically proven by CoqHammer.

The key changes in this version are as follows:

  • Separate packaging in OPAM of the reconstruction tactics (coq-hammer-tactics) and the Coq plugin (coq-hammer).
  • Quick plugin and tactics tests which do not require automated provers installed (make quicktest, make test-plugin, make test-tactics).
  • Machine-learning features now take into account the polarity (positive/negative) of symbol occurrences (opt_feature_polarity).
  • Opaqueness information now taken into account with constant unfolding.

The easiest way to install CoqHammer is via OPAM:

opam repo add coq-released https://coq.inria.fr/opam/released
opam install coq-hammer

Note that for the best experience using the hammer, users should install all supported automated provers (Vampire, CVC4, Eprover, and Z3) according to the instructions.

For more information about hammers and CoqHammer, see the papers Hammering towards QED and Hammer for Coq.

7 Likes

Interesting!
Is there a reason that big developments like math-components and corn/math-classes are missing in coq-gym?

MathComp was included in the CoqGym dataset. This probably lowered the overall CoqHammer success rate though, since boolean reflection is currently not handled well (doing this in the TODO).

However, I don’t know why Math Classes and Corn were omitted. @yangky11 could you comment?

We only manually build a few Coq projects and tried to build the rest via simple commands such as make && make install. The reason that CoqGym doesn’t include math-classes may be that it failed to compile due to an unsatisfied dependency or some miscellaneous errors.

@palmskog Our numbers in the paper are results on the testing set of CoqGym while math-comp is in the training set. So I don’t think math-comp will affect the reported success rate for CoqHammer. But indeed, we can evaluate CoqHammer on the entire CoqGym to provide more informative results.

I would definitely be interested in the results of CoqHammer on the whole training set (minus MathComp, which CoqHammer authors already tried to some extent). I think it should be reasonably straightforward for someone familiar with a project to tell whether CoqHammer has a chance of a high success rate (e.g., not much use of complicated inductive reasoning).

That’s strange. math-classes and corn have been in the coq testing suite for years.
They are also in the coq-community.

It would be very interesting to see how coq-hammer behaves on such big libraries.

Corn depends on math-classes, and math-classes build steps require an initial call to ./configure.sh. If the proess the authors of the paper tried is just to run make && make install (instead of, e.g., reusing the build instructions from opam), then it makes sense that it didn’t work for math-classes (and thus corn).

@yangky11 does that help?

We actually tried not only make && make install but also some variants including ./configure.sh && make && make install. Here the reason is that we crawled urls to Github repos from the Coq Package Index. However, unlike many other projects, the links for corn and math-classes do not point to their Git repos directly. But it shouldn’t be hard to include them using the CoqGym code, just build these two projects manually after the cd coq_projects && make in the installation step, and extract proofs from them.

How much work is required to update CoqHammer for new releases of Coq? Coq 8.9 shipped in January, about 5 months before CoqHammer 1.1.1, while Coq 8.10 is almost ready to ship. What could Coq do to shorten the delay?

CoqHammer did have a release for Coq 8.9 on Feb 11, 2019. Additionally, CoqHammer is part of Coq’s CI and receives steady updates from Coq developers. However, the initial release wasn’t advertised widely, which I tried to rectify here with this minor release. I also think the separate packaging of tactics is a notable qualify-of-life improvement.

1 Like

That’s interesting. I’m not entirely sure how the package index is supposed to be used.
I would expect there to be a possibility to add two urls: one for information about the project, one for the sources.
@Zimmi48 do you know more ?

Crawling the website for source URLs sure doesn’t look like a good idea. All packages in opam have meta-data pointing to the source (given that they are built from sources). I think @palmskog and @maximedenes can better answer than me how to get this meta-data. One possibility will soon be to extract the data from https://gitlab.com/coq/opam-coq-archive/-/jobs/artifacts/master/raw/coq-packages.json?job=json-data (this link is a 404 and will stay so until https://github.com/coq/opam-coq-archive/pull/798 gets merged).

EDIT: actually this still doesn’t contain the URLs of the source. See https://github.com/coq/opam-coq-archive/issues/804.

1 Like

For anyone following the discussion on the OPAM archive metadata, the current state as of Coq 8.11.0 is as follows.

A reasonably complete JSON version of the archive with URLs is continuously available: https://coq.inria.fr/opam/coq-packages.json

However, a few data items necessary for full automation (e.g., when creating machine learning datasets as above) is currently not present in the JSON file, as outlined in an issue on the OPAM archive.

2 Likes

Did anyone rerun the experiments on the enhanced dataset?