Use notations to define an embedded language inside Coq

Hello everyone,
I’m trying to use the notation system to build an embedded simple programming language inside Coq that could be translated to an inductive tree structure that represent the terms of my language.
The dream would be something like :

Check (
def fibo(n):
    if (n < 2) return 1
    else return fibo(n-1) + fibo(n-2)

That would be understood by Coq as a Tree of my language.
The problem that I have is that a lot of notations I would like to use actually conflict with Coq notations.
Does that sound possible to you ?
What advice would you give for an intelligent approach to building the set of notations? (if something more clever than just writing all of them at the same place)
Would you recommend using custom entries (and in such case could you explain how it works) ?

Thank you for your help !

1 Like

I don’t know much about the topic but custom entries are used for a similar task in the ConCert project to help writing Acorn programs (the deeply embedded language they use). You can find an example of the result here, the file defining the notations and the AST. (Maybe @spitters can say a bit more on this particular example)

1 Like

It seems that you’re after a shallow embedding of a language in Coq but with some bells and whistles for notations. See the general discussion on deep vs. shallow embeddings here.

My view is that there is no general advantage to having a shallow embedding instead of a deep one. Coq supports printing-only notations that allow deeply embedded languages to be shown to the user as in the example code. MetaCoq also makes it possible to do automatic verified conversions between shallow and deep embeddings of programs - Acorn is one example of this.

If you are willing to mess with Coq internals, it is definitely possible to parse arbitrary syntax by writing a plugin. Unfortunately, it is not an easy task for a non-expert, and this is a vast understatement.

“Inductive tree structure” sounds like a deep embedding.

If you already have a deep embedding, why would you need an explicit translation to an “inductive tree structure”? In any case, instead of fiddling endlessly with notations for defining something, one can always define a real parser for surface syntax and then map that to a deep embedding, like CompCert’s clightgen, and then use a printing-only notation for printing back the surface syntax for deeply embedded programs inside Coq.

Hi cbl,

As @kyod pointed out, we use Custom Entries in ConCert to embed the syntax of a functional language into Coq. The idea is precisely to isolate parsing rules from the usual notations of Coq.

For example, one can write something like

Definition x := "x".
Definition y := "y".
Definition fact := "fact".

Definition factorial_syn :=
 [| fix fact (x : Nat) : Nat :=
      case x : Nat return Nat of
      | Zero -> 1
      | Suc y -> x * (fact y)

It will be parsed into an inductive type expr representing the AST. But Custom Entries are quite experimental and there are many open issues at the Coq issue tracker. Some of them should be fixed already in Coq 8.11 (like recursive notations with the “…” pattern).

You can find simple examples here. Notations , AST and more involved examples

Also, see the official Coq documentation on Custom Entries.

I have simpler example that might help you to get started. Here I define notations for simply-typed lambda calculus. It consists of notations for types and for terms. The idea is to declare required custom entries and then define notations living in particular custom entry.

Declare Custom Entry ty.

Notation "'*'" := tU (in custom ty).
Notation "'ℕ'" := tNat (in custom ty at level 1).
Notation " A -> B" := (tArr A B) 
                        (in custom ty at level 4, right associativity,
                            A custom ty,
                            B custom ty at level 4).

Declare Custom Entry lambda.
Notation "[! e !]" := e (e custom lambda at level 2).
Notation "()" := (Star) (in custom lambda).
Notation "'v0'" := (Var (here _)) (in custom lambda).
Notation " 'λ' e : τ -> σ" := (Lam τ σ e) (in custom lambda at level 1,
                                              e custom lambda at level 2,
                                              τ custom ty at level 2,
                                              σ custom ty at level 2).
Notation "e1  e2" := (App _ _ e1 e2) (in custom lambda at level 1,
                                                e1 custom lambda,
                                                e2 custom lambda at level 2).

Now, everything in the brackets “[! … !]” will be parsed as a lambda term. E.g. an identity function of type * -> * (meaning unit to unit)

Definition id_unit : Exp [] (tU :-> tU) :=
  [! λ v0 : * -> * !].

@cbl the Iris library defines notations for a fancy language without custom notations but with usable results. To avoid conflicts, they (1) use scopes a bit, bound to types so you needn’t use %scope all the time and (2) add : to some Coq keywords; (3) for readability, they use strings for variable names.

@cbl just wants their example to produce a Tree by using notations — so the “translation” is in fact at parse-time (and neither of us would use the word “translation”).

Actually the idea is to be able to test easily and in a more “user-friendly” way the functions that I have defined on my language (like a typechecker for example). The aim is not to build a real usable language, but more of a toy one so I can improve the rest of the tools around it.

Thank you very much ! I used what I saw in the Notations.v file on the github to build the language, and ended up close to what your presenting. Even though the topic might still be open to new ideas, it seems to me that this is a solution to the problem that I presented. Thanks!