Experiment: Coq Code Interpreter in ChatGPT

Hello!

I created a Custom GPT that is able to interact with a Coq runtime environment. This is a new way to interact with Coq. It works much like the built-in Python code interpreter: ChatGPT can check the code for correctness and fix problems before presenting the result to the user.

It is possible to ask for help, for example:

Help me understand this Coq code [COQ]Check S (S 0).[\COQ]
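
For readers new to Coq, a quick note on what that one-liner does:

Check S (S 0).
(* S is the successor constructor of nat, so S (S 0) is the Peano
   numeral for 2; Check reports the type of the term, here nat. *)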

But it can also support you with definitions, examples, and (small) proofs, for example:

Define the hamming distance.
A codeword should be a list nat.
Define three examples.
Execute everything to check for correctness.
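
For illustration, here is the kind of answer I would hope for from such a prompt (a sketch of my own; the names hamming_distance and c1, c2, c3 are just examples, not actual GPT output):

Require Import List.
Import ListNotations.

(* Hamming distance of two codewords (lists of nat): the number of
   positions at which the elements differ. Positions beyond the end of
   the shorter list are ignored in this simple sketch. *)
Fixpoint hamming_distance (xs ys : list nat) : nat :=
  match xs, ys with
  | x :: xs', y :: ys' =>
      (if Nat.eqb x y then 0 else 1) + hamming_distance xs' ys'
  | _, _ => 0
  end.

(* Three example codewords *)
Definition c1 : list nat := [0; 1; 1; 0].
Definition c2 : list nat := [0; 0; 1; 1].
Definition c3 : list nat := [1; 1; 1; 1].

Compute hamming_distance c1 c2.  (* = 2 *)
Compute hamming_distance c1 c3.  (* = 2 *)
Compute hamming_distance c2 c3.  (* = 2 *)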

It is, as of now, very limited because ChatGPT has some shortcomings when using Coq: it often did not even get the syntax right.

As of now, this is more of an idea than a tool, but I’m very interested in your opinions and experiments.

The Custom GPT can be found here (a ChatGPT Plus plan is needed):
Math Formal Proof Assistant

A short description, which also includes the shortcomings I encountered:
Introduction to GPT-Coq Assistant

Kind regards,

Andre


Very nice! Is this running through the coq-lsp?

One thing I’ve noticed when trying to use it is that it doesn’t recognize when it has mislabelled its variables/hypotheses. Consider this attempted proof of the division theorem:

From Coq Require Import Arith.Arith.
From Coq Require Import Psatz.
From Coq Require Import Bool_nat.

Theorem division_theorem : forall a b : nat, b > 0 ->
  exists q r : nat, a = b * q + r /\ 0 <= r < b.
Proof.
  intros a b Hb.
  induction a as [| a' IH].
  - exists 0, 0.
    lia.
  - destruct IH as [q [H1 H2]].
    destruct (lt_ge_dec (S a') b).
    + exists 0, (S a').
      split; [ simpl; lia | split; auto with arith; lia ].
    + exists (S q), (a' - b + 1).
      split; [ rewrite H1; simpl; lia | lia ].
Qed.

It accidentally named the remainder H1 (or rather, it left out r, so H1 is a nat rather than an equality) and wasn’t able to correct itself after attempting to rewrite with H1 and repeatedly getting

Error: Cannot find a relation to rewrite.
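
For reference, here is one way the proof does go through once the remainder is named explicitly in the destruct pattern (my own sketch, not something the GPT produced; I also replaced the lt_ge_dec case split with le_lt_dec on b and S r):

From Coq Require Import Arith Lia.

Theorem division_theorem : forall a b : nat, b > 0 ->
  exists q r : nat, a = b * q + r /\ 0 <= r < b.
Proof.
  intros a b Hb.
  induction a as [| a' IH].
  - (* a = 0: quotient and remainder are both 0 *)
    exists 0, 0. lia.
  - (* a = S a': naming the witness r makes H1 the equation *)
    destruct IH as [q [r [H1 H2]]].
    destruct (le_lt_dec b (S r)) as [Hge | Hlt].
    + (* S r reaches b: the remainder rolls over, bump the quotient *)
      exists (S q), 0.
      split; [ rewrite Nat.mul_succ_r; lia | lia ].
    + (* S r < b: keep the quotient, bump the remainder *)
      exists q, (S r). lia.
Qed.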

Is there a way for it to inspect the proof state? I’ve asked it to try Show 0 (it kept trying Show Goal, which sadly doesn’t do what you would expect), which has helped a little, but it shouldn’t be necessary. (If it is needed, it’s worth including “Try Show 0 when you get an unexplained error” as an instruction to the GPT.)

Thanks a lot for your reply.

Currently, coqc is executed on the complete source code, and the output and errors are pushed back to GPT. The problem is that GPT sometimes misses the location and the exact nature of the problem. I was already looking for a way to provide more detailed information, such as the current hypotheses and goals, but have not yet found an easy way. The LSP cannot be used because GPT actions are REST calls and, to my knowledge, rate-limited (and very slow).

I updated the custom GPT; it now uses coqtop and therefore has much more detailed information about hypotheses, goals, and errors on a line-by-line basis.
IMHO, not much more can be done on the Coq and custom-GPT integration side at the moment. ChatGPT does not react very predictably and is often unable to come up with a (good) proof strategy; sometimes it even runs in circles. But for small proofs and support it is already helpful, I think.
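
To illustrate what the line-by-line feedback from coqtop provides: every command is evaluated separately, so the hypotheses and the goal after each step can be fed back to the model, roughly as in the comments of this small made-up example (not actual GPT output):

Theorem add_0_r_example : forall n : nat, n + 0 = n.
Proof.
  intros n.
  (* coqtop now reports the state:
       n : nat
       ============================
       n + 0 = n  *)
  induction n as [| n' IHn'].
  - reflexivity.
  - simpl. rewrite IHn'. reflexivity.
Qed.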
