Experiment: Coq Code Interpreter in ChatGPT

Hello!

I created a Custom GPT that is able to interact with a Coq runtime environment. This is a new way to interact with Coq. It works much like the built-in Python code interpreter: ChatGPT can check the code for correctness and fix problems before presenting the result to the user.

It is possible to ask for help, for example:

Help me understand this Coq code [COQ]Check S (S 0).[\COQ]
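
For readers new to Coq, a quick note on what that one-liner does:

Check S (S 0).
(* S is the successor constructor of nat, so S (S 0) is the Peano
   numeral for 2; Check reports the type of the term, here nat. *)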

But it can also support you with definitions, examples, and (small) proofs, for example:

Define the hamming distance.
A codeword should be a list nat.
Define three examples.
Execute everything to check for correctness.
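
For illustration, here is the kind of answer I would hope for from such a prompt (a sketch of my own; the names hamming_distance and c1, c2, c3 are just examples, not actual GPT output):

Require Import List.
Import ListNotations.

(* Hamming distance of two codewords (lists of nat): the number of
   positions at which the elements differ. Positions beyond the end of
   the shorter list are ignored in this simple sketch. *)
Fixpoint hamming_distance (xs ys : list nat) : nat :=
  match xs, ys with
  | x :: xs', y :: ys' =>
      (if Nat.eqb x y then 0 else 1) + hamming_distance xs' ys'
  | _, _ => 0
  end.

(* Three example codewords *)
Definition c1 : list nat := [0; 1; 1; 0].
Definition c2 : list nat := [0; 0; 1; 1].
Definition c3 : list nat := [1; 1; 1; 1].

Compute hamming_distance c1 c2.  (* = 2 *)
Compute hamming_distance c1 c3.  (* = 2 *)
Compute hamming_distance c2 c3.  (* = 2 *)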

It is, as of now, very limited because ChatGPT has some shortcomings when using Coq: it often did not even get the syntax right.

As of now, this is more of an idea than a tool, but I’m very interested in your opinions and experiments.

The Custom GPT can be found here (a ChatGPT Plus plan is needed):
Math Formal Proof Assistant

A short description, which also includes the shortcomings I encountered:
Introduction to GPT-Coq Assistant

Kind regards,

Andre


Very nice! Is this running through the coq-lsp?

One thing I’ve noticed when trying to use it is that it doesn’t recognize when it has mislabelled its variables/hypotheses. Consider this attempted proof of the division theorem:

From Coq Require Import Arith.Arith.
From Coq Require Import Psatz.
From Coq Require Import Bool_nat.

Theorem division_theorem : forall a b : nat, b > 0 ->
  exists q r : nat, a = b * q + r /\ 0 <= r < b.
Proof.
  intros a b Hb.
  induction a as [| a' IH].
  - exists 0, 0.
    lia.
  - destruct IH as [q [H1 H2]].
    destruct (lt_ge_dec (S a') b).
    + exists 0, (S a').
      split; [ simpl; lia | split; auto with arith; lia ].
    + exists (S q), (a' - b + 1).
      split; [ rewrite H1; simpl; lia | lia ].
Qed.

It accidentally named the remainder H1 (or rather, it left out r, so H1 is a nat rather than an equality) and wasn’t able to correct itself after attempting to rewrite with H1 and repeatedly getting

Error: Cannot find a relation to rewrite.
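
For reference, here is one way the proof does go through once the remainder is named explicitly in the destruct pattern (my own sketch, not something the GPT produced; I also replaced the lt_ge_dec case split with le_lt_dec on b and S r):

From Coq Require Import Arith Lia.

Theorem division_theorem : forall a b : nat, b > 0 ->
  exists q r : nat, a = b * q + r /\ 0 <= r < b.
Proof.
  intros a b Hb.
  induction a as [| a' IH].
  - (* a = 0: quotient and remainder are both 0 *)
    exists 0, 0. lia.
  - (* a = S a': naming the witness r makes H1 the equation *)
    destruct IH as [q [r [H1 H2]]].
    destruct (le_lt_dec b (S r)) as [Hge | Hlt].
    + (* S r reaches b: the remainder rolls over, bump the quotient *)
      exists (S q), 0.
      split; [ rewrite Nat.mul_succ_r; lia | lia ].
    + (* S r < b: keep the quotient, bump the remainder *)
      exists q, (S r). lia.
Qed.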

Is there a way for it to inspect the proof state? I’ve asked it to try Show 0 (it kept trying Show Goal, which sadly doesn’t do what you would expect), which has helped a little, but it shouldn’t be necessary. (If it is needed, it’s worth including “Try Show 0 when you get an unexplained error” as an instruction to the GPT.)

Thanks a lot for your reply.

Currently, coqc is executed on the complete source code, and the output and errors are pushed back to GPT. The problem is that GPT sometimes misses the location and the exact nature of the problem. I was already looking for a way to provide more detailed information, such as the current hypotheses and goals, but have not yet found an easy way. The LSP cannot be used because GPT actions are REST calls and, to my knowledge, rate-limited (and very slow).

I updated the custom GPT; it now uses coqtop and therefore has much more detailed information about hypotheses, goals, and errors on a line-by-line basis.
IMHO, not much more can be done on the Coq and custom-GPT integration side at the moment. ChatGPT does not react very predictably and is often unable to come up with a (good) proof strategy; sometimes it even runs in circles. But for small proofs and support it is already helpful, I think.
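
To illustrate what the line-by-line feedback from coqtop provides: every command is evaluated separately, so the hypotheses and the goal after each step can be fed back to the model, roughly as in the comments of this small made-up example (not actual GPT output):

Theorem add_0_r_example : forall n : nat, n + 0 = n.
Proof.
  intros n.
  (* coqtop now reports the state:
       n : nat
       ============================
       n + 0 = n  *)
  induction n as [| n' IHn'].
  - reflexivity.
  - simpl. rewrite IHn'. reflexivity.
Qed.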
