I was working on defining formal semantics. I would like to write a function, that computes the list of the variables used in an expression (for the sake of simplicity, in the example code, only empty list, variable and map constructions are present). I’m having problems with the map constructor. When I try to define this function with fmap/map or mutual recursion (as seen in variables2 below) I get an ill-formed recursion error. The only way I was able to construct such a function was with inline fix (as seen in variables).

Inductive Expression : Type :=
| EEmptyList
| EVar (s : string)
| EMap (l : list (Expression * Expression)).
Fixpoint variables (e : Expression) : list string :=
match e with
| EEmptyList => []
| EVar x => [x]
| EMap l => (fix fp l := match l with
| [] => []
| (a,b)::xs => app (app (variables a) (variables b)) (fp xs)
end) l
end.
Fixpoint variables2 (e : Expression) : list string :=
match e with
| EEmptyList => []
| EVar x => [x]
| EMap l => fp l
end
with fp l := match l with
| [] => []
| (a, b)::xs => app (app (variables2 a) (variables2 b)) (fp xs)
end.

How should I modify the definition of variables2 not to get this error?

The problem you are encountering is that Fixpoint requires its body to do all its recursive calls on an argument that is syntactically a subterm of its main argument. This is called the guard condition and there is some documentation here. In your first variable Coq can determine that all recursive call in fp are indeed done syntactically on a subterm of EMap l but in the second it cannot (I don’t understand the guard condition checker well enough to try to explain this behaviour).

However if you really want to have the second version, it is possible to do so by explaining to Coq why it is a well-defined recursive definition. In this case you are doing an induction on the subterms of your main argument e : Expression. You could define the relation between subterms of type Expression, prove that this relation is well founded and finally do an induction on this proof of well-foundedness. There are tools to help you to do that such as Program Fixpoint or Equations. Here is the solution using the latter (that needs to be installed separately):

From Coq Require Import String List.
From Equations Require Import Equations.
Import ListNotations.
Inductive Expression : Type :=
| EEmptyList
| EVar (s : string)
| EMap (l : list (Expression * Expression)).
Equations variables2 (e : Expression) : list string :=
variables2 EEmptyList := [] ;
variables2 (EVar x) := [x] ;
variables2 (EMap l) := fp l
with
fp (l : list (Expression * Expression)) : list string :=
fp [] := [] ;
fp ((a, b) :: xs) := variables2 a ++ variables2 b ++ fp xs.

EDIT: The example above actually does not use the subterm relation on Expression (the one derived by Equations with Derive Subterm for Expression. is actually empty so useless…)

That’s already a correct solution; for extra convenience, you can also use recursive functions like fold_right there; Coq’s termination checker will inline fold_right producing equivalent code. Example (and equivalence proof) below.

More examples appear in Cpdt.InductiveTypes under the heading “Nested Inductive Types”, which recommends this approach.

While @kyod is correct that you can use Equations, and that’s great, it seems overkill in this scenario, as proofs about the resulting definitions learning require special (and slower) tactics (discussed in the Equations documentation).

(* Your code, self-contained *)
From Coq Require Import String List.
Import ListNotations.
Inductive Expression : Type :=
| EEmptyList
| EVar (s : string)
| EMap (l : list (Expression * Expression)).
Fixpoint variables (e : Expression) : list string :=
match e with
| EEmptyList => []
| EVar x => [x]
| EMap l => fold_right (fun '(a, b) r => app (app (variables a) (variables b)) r) [] l
end.
Lemma equiv l :
fold_right (fun '(a, b) r => app (app (variables a) (variables b)) r) [] l =
(fix fp l :=
match l with
| [] => []
| (a,b)::xs => app (app (variables a) (variables b)) (fp xs)
end) l.
Proof. now induction l; [|destruct a; rewrite <-IHl]. Qed.

Thank you very much for your replies. Maybe for our purpose the second solution is better, because we have some proofs about the existing functions with fix, but both are useful to us.