What does it mean that Isabelle has better automation than Coq?

I’ve read before (and heard before) that Isabelle has better automation than Coq. But I don’t really know what that means. Can someone clarify this to me?

My guesses for what this might means:

  • Isabelle’s sledgehammer is more powerful than coq’s coqhammer (which is intuitively weird to me since Coq could just use the same ATPs as Isabelle so I don’t see why this should be the case)
  • Isabelle’s tactic’s do more simplifying by default

besides that I can’t see why that should be true.

Thanks in advance.

1 Like

Others can give deeper reasons, but I believe a key reason for the difference is that Isabelle is strictly for classical logic, and it doesn’t support dependent types. Coq is for constructive logic, and has dependent types. Of course, one can take excluded middle as an axiom in Coq, but a lot of people prefer not to. And having dependent types means that Coq can deal with many types of proofs that Isabelle cannot. So Isabelle has a more “powerful” logical core (classical logic eg excluded middle, in the sense of a powerful axiom to work with esp for automation), and can reason about fewer things.

Duly noting the quotes, I should nonetheless say that, if anything, it’s rather the opposite. Classical HOL can embedded into Coq, but not the converse. The fact that the Isabelle logical system is simpler is precisely the reason it’s easier to work with.

Thanks for the feedback.

What do you mean that it’s “simpler to work with”?

Is automation power have anything to do with the power of it’s ATP and tactics? What does better automation really mean?

@ppedrot there is an embedding of lean in Coq. Does that extend to an embedding of isabelle in Coq?
I don’t think anyone has tried that seriously since the introduction of sProp (?)

It should be noted that Sledgehammer is now well implanted in the Isabelle ecosystem, being distributed by default with Isabelle and used by most of its users. On the other hand, Coqhammer is still quite recent and not as easily usable. So when you hear people say Isabelle has better automation, they might in fact mean “out of the box”.

Why is Coqhammer not easily usable?

Because it’s not trivial to install it together with its dependencies (ATPs). Once you get over this step, it should be easy to use. In the future, we hope to distribute it as part of the Coq Platform and therefore to make it easy to rely on.

2 Likes

does the Tactician paper (https://arxiv.org/abs/2008.00120) solve this usability problem (+ provide their own sort of hammer)?

The authors of Tactician took a lot of care to make their plugin easier to install. They also had impressive results which were comparable to those of CoqHammer. But I was very surprised by this, given that CoqHammer rely on very advanced ATPs and Tactician only imitates what humans have done while proving previous results. In the end, it is likely that these tools will be complementary and non-mutually-exclusive.

1 Like

Because it’s not trivial to install it together with its dependencies (ATPs).

Last time I checked, bundling the ATPs may be difficult because of their licences (especially Vampire, which is the most effective one).

They also had impressive results which were comparable to those of CoqHammer. But I was very surprised by this, given that CoqHammer rely on very advanced ATPs and Tactician only imitates what humans have done while proving previous results.

I am not surprised. The current implementation of the “hammer” tactic in CoqHammer has many limitations - it is really a first try at a “hammer for Coq”. Among others, it doesn’t deal properly with “non-first-order” features of Coq, so it won’t work very well with developments that heavily depend on them (this includes boolean reflection, higher-order functions/predicates, dependently typed data structures). Also, “hammer” insists on using intuitionistic logic for proof reconstruction, which makes it more likely to fail (ultimately, classical logic should be optionally used in reconstruction, with a first-order reconstruction prover comparable in strength to Isabelle’s metis).

Another thing is that by design CoqHammer will never try induction, leaving this to the user. I think Tactician can use the induction tactic, so when evaluated on any reasonably large collection of Coq developments it will automatically prove many “trivial” inductive lemmas that CoqHammer won’t have any hope of proving (by design).

The main issues with CoqHammer and rough ideas of how to solve them are listed in the TODO.md file in CoqHammer sources. Tackling them would improve the usability substantially, but many require substantial engineering work and/or extensive experimentation, some with relatively low “research interest” to “software development time” ratio. Which is why I probably won’t solve these issues in any near future, and there currently don’t seem to be any other people seriously interested in getting their hands dirty with CoqHammer development.

It would be interesting to isolate “first-order” problems without induction and compare Tactician and CoqHammer on them. Then I think CoqHammer would unequivocally win. Though even on some “first-order” problems CoqHammer may fare worse than Tactician, for many reasons which I all believe to be purely technical and relatively easily solvable: the rough machine-learning premise preselection might not be good enough (it hasn’t been changed much from HOLyHammer to better adapt it to Coq), the translation might be suboptimal even for some first-order problems, intuitionistic proof reconstruction may fail with dependencies for a classical proof.

In the end, it is likely that these tools will be complementary and non-mutually-exclusive.

To some extent they already are. The advantage of the “Tactician” approach is that it’s not limited by the translation to first-order ATPs. And the mismatch between the logic of Coq and pure classical first-order logic seems to be greater than between classical higher-order logic of Isabelle and classical first-order logic.

It may be the case that CoqHammer will ever be very good only for a “first-order-like” fragment of CIC (that can still be substantially improved), and trying to adapt the “translation to ATPs” approach to broader fragments of CIC, while possible, might quickly reach a point where you get relatively little benefit for a lot of work. I think we’re still far from this point, though, and solving some of the issues in the TODO.md file could improve CoqHammer substantially.

1 Like

On this note, the recent “sauto” tactic in CoqHammer 1.3 is not (in practice) limited to a “first-order” fragment and it does not depend on external ATPs. It’s not entirely push-button in that it sometimes requires specifying appropriate options for effective use. But when used proficiently, I think it comes close to the relatively strong automation (in comparison to Coq’s native tactics) provided by “auto”/“blast” in Isabelle/HOL, sometimes even surpassing it, while still staying within intuitionistic logic (disclaimer: I have used Isabelle only sporadically - this is just my impression when trying to redo some Isabelle/HOL proofs and replacing “auto”/“blast” by “sauto” with appropriate options; I would expect “blast” to be generally better on purely first-order problems).

Last time I checked, bundling the ATPs may be difficult because of their licences (especially Vampire, which is the most effective one).

The best way to enable regular Coq users to use the hammer and sauto is likely by making it part of the Coq Platform. I have already opened an issue about this on GitHub, which may be a good place to continue the discussion on getting CoqHammer to a general audience.

@MSoegtropIMC has previously confirmed that building and installing the TPTP version of Z3 (which is open source) can be scripted as part of the Coq Platform. Since we have no strict rule against non-open source projects, there could probably be installation scripts for Vampire as well.

As far as I know the Vampire licence is not open source and it explicitly forbids redistribution. It seems then that packaging it with anything or even putting on a webpage is not allowed.

Providing an installation script (which downloads the software from its original location) is usually not considered redistribution. But even if we end up including only, say, z3 and Eprover in the Coq Platform, this would still be an improvement on the current Coq automation packaging situation.

1 Like

Good news!

Since November 2020 we have moved Vampire to a permissive Modified BSD licence (see below). This supersedes all previous licences.

1 Like