"Qed." takes a long time

I am writing a chess program in Coq and encountered a strange problem. The “Qed.” command after one of my proofs seemed not to terminate. The proof itself is ok (all green in CoqIDE).

I simplified the code to isolate the problem and found that the time it takes for the “Qed.” to finish depends on the size of the chess board. If the board is 4x4, it takes a few seconds. If the board is 5x5, it takes 16 minutes. How long it would take for the actual code with an 8x8 board I couldn’t begin to imagine. I used Coq version 8.15.2.

If anyone wants to take a look at the code, here are the two versions for a 4x4 and a 5x5 board:

The entry for the Qed command in the manual cautions that “one may have to wait a while when the proof is large”, but my proof is rather simple.

Any hints as to what causes this behavior and what I might do to solve the problem would be greatly appreciated.

It has trouble rechecking the first unfold find_king in * (which is doing find_kind in H)
Indeed if you do

Lemma find_king_correct : forall pp c king_rank king_file,
  In (Loc king_rank king_file) (find_king pp c) <->
  (location_valid (Loc king_rank king_file)) /\
  (get_square_by_index pp king_rank king_file =
  Occupied c King).
Proof.
  intros pp c king_rank king_file. split.
  - unfold find_king;intro H. apply find_piece_correct in H. HdAnd. split; auto.
    apply valid_squares_suf in H. auto.
  - intro H;unfold find_king. HdAnd. apply valid_squares_nec in H.
    apply find_piece_correct. split; auto.
Qed.

(intro H after the unfold in the first bullet) then it is fast, but if you do


Lemma find_king_correct : forall pp c king_rank king_file,
  In (Loc king_rank king_file) (find_king pp c) <->
  (location_valid (Loc king_rank king_file)) /\
  (get_square_by_index pp king_rank king_file =
  Occupied c King).
Proof.
  intros pp c king_rank king_file. split.
  - intro H;unfold find_king in H. apply find_piece_correct in H. HdAnd. split; auto.
    apply valid_squares_suf in H. auto.
  - intro H;unfold find_king. HdAnd. apply valid_squares_nec in H.
    apply find_piece_correct. split; auto.
Qed.

(intro H before unfold in the first bullet) then it is slow.
This is because unfold inserts a trace of the goal change in the produced term, but there is no trace when unfolding in an hypothesis.

Specifically it needs to do


Check
  fun (pp : PiecePlacements) (c : Color)
    (king_rank king_file : nat)
    (H : @In SquareLocation (Loc king_rank king_file)
             (find_king pp c))
  =>
    (H : (@In SquareLocation (Loc king_rank king_file)
              (find_piece pp c King valid_locations))).

which is slow. It compares find_king ... and find_piece ... valid_locations, and decides to unfold find_piece (and then valid_locations) first. This is a mistake as the size of the unfolded find_piece ... valid_locations is about 2^(n^2): find_piece looks like

Fixpoint find_piece ... locations := match locations with 
| [] -> []
| elt :: rest -> if cond then elt :: find_piece ... rest else find_piece ... rest
end

and we are not unfolding the cond (they are on variables so unfolding it won’t pick a side).

So these ways make this work:

  • delay the intro until after the unfold: the unfold trace makes Qed check the conversion in the opposite order so it unfolds find_king first and finds out that it’s convertible quickly
  • put Strategy expand [find_king]. which makes conversion always unfold find_king first
  • put Strategy opaque [valid_locations]. (which is where the n^2 part appears) so that valid_locations is unfolded last
  • define valid_locations using combinators, eg Definition valid_locations := concat (map (fun l => map (fun r => Loc l r) (seq 0 5)) (seq 0 5)). (for the 5x5 case), this allows unfolding to be less eager

Thank you very much. Even the original 8x8 case works instantly when I unfold first and introduce the hypothesis afterward.

I am not quite sure I understand the cause completely. I guess I need to be wary of unfolding in hypotheses. Couldn’t Coq keep track of unfolds in hypotheses in the same way it does for unfolds in the goal?

The way it keeps track of unfolds in the goal is by doing refine (_ : OLD GOAL), you can see eg if you do
unfold find_king (without having introd) then Show Proof says

(fun (pp : PiecePlacements) (c : Color) (king_rank king_file : nat) =>
 conj
   (?Goal0
    :
    In (Loc king_rank king_file) (find_king pp c) ->
    location_valid (Loc king_rank king_file) /\
    get_square_by_index pp king_rank king_file = Occupied c King) ?Goal)

then if you intro H Show Proof says

(fun (pp : PiecePlacements) (c : Color) (king_rank king_file : nat) =>
 conj
   ((fun H : In (Loc king_rank king_file) (find_piece pp c King valid_locations) => ?Goal0)
    :
    In (Loc king_rank king_file) (find_king pp c) ->
    location_valid (Loc king_rank king_file) /\
    get_square_by_index pp king_rank king_file = Occupied c King) ?Goal)

The fun H has the unfolded type.

I guess @SkySkimmer 's answer means Coq cannot do that in exactly the same way; but (EDIT) it seems unfold foo in H could be implemented as revert H; unfold foo at <occurrences in H>; intro H, aka
refine (((fun H : new_type) => _) : OLD GOAL) H. Is there some blocker to that? The usual backward compatibility questions don’t seem huge here…

Today, we encountered the same issue in our project. Here are some additional observations:

  1. Coq’s kernel unification algorithm is not symmetric. The following proof terms from the derivation in the chess example show it:
Definition foo1 :=
(fun (pp : PiecePlacements) (c : Color) (king_rank king_file : nat)
    (H : In (Loc king_rank king_file) (find_king pp c)) =>
  let H0 : In (Loc king_rank king_file) (find_piece pp c King valid_locations) := H in
  True).

Definition foo2 :=
(fun (pp : PiecePlacements) (c : Color) (king_rank king_file : nat)
    (H0 : In (Loc king_rank king_file) (find_piece pp c King valid_locations)) =>
  let H : In (Loc king_rank king_file) (find_king pp c) := H0 in
  True).

foo1 and foo2 only differ by that foo1 is casting the folded form H to the unfolded form H0 while foo2 is casting in the opposite direction. foo1 takes a long time to typecheck (if it does), while foo2’s typechecking is instant. The same phenomenon happens for foo3 and foo4:

Definition foo3 :=
(fun (pp : PiecePlacements) (c : Color) (king_rank king_file : nat)
    (H : In (Loc king_rank king_file) (find_king pp c)) =>
  H : In (Loc king_rank king_file) (find_piece pp c King valid_locations)).

Definition foo4 :=
(fun (pp : PiecePlacements) (c : Color) (king_rank king_file : nat)
    (H0 : In (Loc king_rank king_file) (find_piece pp c King valid_locations)) =>
  H0 : In (Loc king_rank king_file) (find_king pp c)).
  1. Manual unfolding may be not even necessary. Directly apply after intros works, while Qed is still slow:
Lemma find_king_correct_if : forall pp c king_rank king_file,
  In (Loc king_rank king_file) (find_king pp c) -> 
  (location_valid (Loc king_rank king_file)) /\
  (get_square_by_index pp king_rank king_file = 
  Occupied c King).
Proof.
  intros pp c king_rank king_file.
  - intros H. apply find_piece_correct in H. HdAnd. split; auto.
    apply valid_squares_suf in H. auto.
Qed.

That implies the unification algorithm in apply is powerful enough to figure out the right way, while the algorithm in Qed cannot. I understand that a more conservative algorithm is used in Qed, but it is worth thinking about improvements.
3. The issue in our project is more complicated and it seems not achievable by tuning the unification procedure by Strategy expand [find_king] and Strategy opaque [valid_locations]. We are still trying to find a nice solution.

Another example similar to foo4, but from a different proof script, the second Qed takes a long time:

Lemma find_king_correct_if : forall pp c king_rank king_file,
  In (Loc king_rank king_file) (find_king pp c) -> 
  (location_valid (Loc king_rank king_file)) /\
  (get_square_by_index pp king_rank king_file = 
  Occupied c King).
Proof.
  intros pp c king_rank king_file.
  - unfold find_king. intros H. apply find_piece_correct in H. HdAnd. split; auto.
    apply valid_squares_suf in H. auto.
Qed.

Lemma find_king_correct_if' : forall pp c king_rank king_file,
  In (Loc king_rank king_file) (find_piece pp c King valid_locations) -> 
  (location_valid (Loc king_rank king_file)) /\
  (get_square_by_index pp king_rank king_file = 
  Occupied c King).
Proof.
  intros *.
  apply find_king_correct_if.
Qed.

I find replacing simpl in H with the following tactic solves our problem:

(* simpl in H with a easy-to-check proof term. *)
Ltac simpl_in H :=
  let type_of_H := type of H in
  let type_of_H' := eval simpl in type_of_H in
  lazymatch goal with
  | |- ?Goal =>
      revert H;
      refine ((_ : type_of_H' -> Goal) : type_of_H -> Goal);
      intros H
  end.

Note that (1) simpl can be replaced by other evaluation tactics; (2) it doesn’t work when H is used in other assumptions.