Is there a full documentation of Coq's grammar?

Looking for the grammar at the doc requires too much time. Is there a nice page or pdf that full grammar of coq is listed?

1 Like

The closest thing is in the files fullGrammar and orderedGrammar in the source code. The first shows the grammar without the actions and minimal changes compared to the .mlg files. The second shows a grammar that has been edited for readability. These are big files, over a thousand lines each. But neither is intended to be read by end users.

We’re in the midst of updating the grammar in the doc, which was very much out of date. Look at the Terms section of the Gallina chapter to see what the updated doc looks like. I think the hyperlinks of nonterminals make that easier to read.

The updates to the Terms section that I mentioned don’t appear in the HTML doc linked from the website. You would need the next release or to build from the source code to see them.

The updates Jim is talking about appear in the “development” version of the reference manual, which is continuously deployed at https://coq.github.io/doc/master/refman/.

But more importantly the question is: why do you need to see the full Coq grammar? If this is to write your own parser of Coq’s grammar, that’s something that is very much advised against. I could write more about why it is so and what are the alternative ways of parsing a Coq document if you want.

It should also be mentioned that the Coq grammar is extensible, some more information about this here.

2 Likes

I was just hoping if I could get better understandings in coq with the grammar, but I would like to try what others don’t try at all. Why do you not recommend it, and what are the alternative ways?

Apologies, I read it but 100% understand it. Does it mean that Coqs language is ever evolving?

I am not a FM/PL expert but somehow had the impressions grammars where fixed a prior.

First, I should mention that the manual now contains some warning regarding parsing your own grammar (cf. the note and warning at https://coq.inria.fr/refman/language/core/basic.html#syntax-and-lexical-conventions).

Now, to answer Brando’s question, the language of Coq is not like any standard programming language. Standard programming languages do indeed generally have a small grammar that is rarely extended and most of the features are provided, not through special parsing constructs, but through primitive functions. On the contrary, the grammar of Coq is very large and constantly expanding. Most commands / tactics have their own parsing constructs, plugins add new ones, and the notation system can be used to expand the grammar even further (and that’s what most Coq libraries do). [Even the simple task of splitting a Coq document into sentences is something complex that most Coq editors do not actually handle properly unless they delegate this task to Coq.]

3 Likes

Thanks Theo! That was very useful.

I have two question:

  1. I am very curious to understand why Coq has such a seemingly complicated grammar lifestyle. Why does it seem so different?
  2. How do you “parse” a Coq file into it’s individual sentences? I was interested in extracting all tactics of a Coq script (and execute them with serapi, basically execute the script with serapi). I know tactics end with a .. Is that enough to parse a proof script?

Thanks for the informative feedback!

I could explain why it’s useful to have such complicated grammar but the question of “why” should rather be answered by people that were there in the beginnings of Coq. But for instance, the very powerful notation system is generally considered to be an ingredient of the success of Coq.

If you are planning to rely on SerAPI, then it is best that you let SerAPI do the splitting into sentences for you (this is something that it supports). Splitting sentences based on . is an approximation that works relatively well but there will be edge cases. An example (which I didn’t invent) is the following:

Notation "( a . b )" := (a, b).
Check (1 . 2).

It is an extreme example because most IDEs (including CoqIDE) won’t accept it (because they try to do the sentence splitting by themselves but are not aware of the notation system) but coqc will. And yet, the sentence splitting code in CoqIDE is already quite complex (cf. https://github.com/coq/coq/blob/51ecccef0308eceec1ddd9776a03fd993b3ea71a/ide/coqide/coq_lex.mll#L102-L154).

2 Likes

Thanks Theo. Let’s hope we someone can chip in for the rich lifestyle that Coq’s grammar has. I am very curious about it!

Also, is there a difference between the (evolving) tactic grammar and the (evolving) proof script grammar and (evolving) formula Coq term grammar? Or are they all the same under the (evolving) proof script grammar?

Thanks again!