Defining and working with trivial finite sets like {x, y, z} easily


Not sure of the best name for these kinds of sets but it’d be nice if there was a way of defining very trivial finite sets like {x, y, z} easily.

These sort of sets are useful for handling free variables. The free polynomial on two variables x, y N[{x, y}] is an example of the sort of thing I’m interested in. I’m really interested in figuring out variable binders and quantifiers in programming languages and I think this could be useful.

Not sure how to work with these kind of sets easily in Coq though.

From a theoretical perspective just using a nameless representation such as Fin from the vector package works but it’s kind of “icky” to work with directly. Also it just devolves into de Bruijn levels which I wanted to avoid.

Are there any packages for working with/declaring these kinds of sets?

Any advice?

IIRC you can use tactics to automate a little of this but I still don’t understand that sort of thing.

Ideally something like

Definition types: {x, y, z} -> typ := fun e =>
  match e with
  | x => pt
  | y => exp N N
  | z => N

Would be possible but obvs that’s a lot to ask for

Definitively you want to get familiar with the state of the art first, for that, see the multinomials package which is based on math-comp and the fset library.


Thanks. I’ll take a closer look later but I think this isn’t really what I’m interested in.

I am interested in indeterminates/free variables not in polynomials. I just gave polynomials as an example of where one might use indeterminates.

From what it initially appears the package uses a finite type indexed by size for variables. This is perfectly reasonable but not what I’m interested in myself.

From a theoretical perspective there is no difference between something like

Definition poly2 : poly (xs 2) := v 0 * v 0 + v 1.


Definition poly_named: poly (xs ["x", "y"]) := v "x" * v "x" + v "y".

But names are qualitatively different to work with than indexes.

It’s a bit like the difference between de Bruijn levels and variable names. In fact, that’s one of the cases I want to figure out. A lambda term indexed by N variable names feels different than a term indexed by a finite type of size N.

I should really explain more what I’m trying to figure out. I’ve been trying to figure out algebraic approaches to variables for use with notations for the language of a presheaf topos.

So I looked at cylindrical algebras and polyadic algebras but of course the formalization is a nameless style. So what I need is a toolkit to convert back and forth between nameless and named finite sets. Because I can’t read a serious of numbers at all when working with the code.

If you’re really interested it’s all a mess and I haven’t sorted out variance issues but what I’m thinking is that for intermediate manipulation I’ll need a presheaf over a finite groupoid of N-indeterminates. I believe non-trivial morphisms would correspond to homotopy type theory style equivalences. Of course if you don’t have nontrivial morphisms than that is just a discrete fibration over Set and then you get something like the category of presheaves over Set. It’s all kind of a mess and I really am developing an allergy to numbers.

I’d love to hear a good answer to this question. Here’s what I’ve done in the past:

  • When you have a known, constant set of cases, use an inductive type; this is what we do for defining the registers of a hardware design in Kôika: koika/uart.v at da174b5215ddb6c151efc589ab0c1f3ea4b72242 · mit-plv/koika · GitHub

    Inductive reg_t :=
    | state
    | in_byte
    | in_byte_offset
    | out_bit
    | delay
    | last_write_ack.

    The problem with this approach is that you don’t have a nice way to enumerate the elements of the type. We use a FiniteType typeclass for this with automatically derived instances, but the derivation process is very inefficient (it’s hard to balance good reduction behavior and fast derivation).

  • When you need have different sets of variables, what I’ve done is a mix of:

    1. Avoid the problem entirely by using a mixed embedding with shallow binders;

    2. Not enforcing that variables be bound (you give a default value to unbound variables). If you can do this you avoid a lot of pain with dependent types.

    3. Using dependent De Bruijn indices; this is what we do for typed ASTs in Kôika: koika/TypedSyntax.v at da174b5215ddb6c151efc589ab0c1f3ea4b72242 · mit-plv/koika · GitHub . See in particular the definition of the member and context in koika/Member.v at master · mit-plv/koika · GitHub and koika/Environments.v at da174b5215ddb6c151efc589ab0c1f3ea4b72242 · mit-plv/koika · GitHub . This is basically a dependent typed list of pairs, and you index into it by giving a proof that the key you’re looking up belongs to the list of keys in the type. Your function in this style would look like this:

      Require Import Koika.Frontend.
      Inductive typ := pt | N | exp: typ -> typ -> typ.
      Definition types {v} (m: member v ["x"; "y"; "z"]): typ :=
        cassoc m #{ "x" => pt; "y" => exp N N; "z" => N }#.

      … and of course you can use any type that has decidable equality:

      Definition types' {v} (m: member v [1; 14; 39]): typ :=
        cassoc m #{ 1 => pt; 14 => exp N N; 39 => N }#.

      If you want you can make things a touch nicer by inferring the “member” instances:

      Inductive error := Error (err: string).
      Definition types_ (v: string) :=
        match mem v _ as m return (if m then typ else error) with
        | inl m => types m
        | inr _ => Error "not found"
      Compute (types_ "x"). (* = pt *)
      Compute (types_ "v"). (* = Error "not found" *)

      Beware that this particular implementation really kills the performance of simpl. I’m sure there’s a better way to do this, but I don’t know what it is.

Hope you get better answers ^^

1 Like


This looks very interesting.

I’ve been meaning to take a look at Koika for a while but I don’t know exactly why I haven’t really yet.

On point 2 it’s funny because that’s a very similar solution to one I came up with.

If you index into a variable context and you don’t find an entry the simplest solution is to return something like a tt value of unit type.

About 3, I don’t store the scoping information in the term structure itself I keep the variable context stuff in the separately specced typing rules. I tried this style initially on suspicion keeping the binding information in the term AST wasn’t appropriately “mathy” enough but I’m still not sure if it was a good idea or not.

I’ve been trying to figure out the STLC as kind of like the free functor to something, the same way a polynomial is a free commutative semiring? I still haven’t really figured this out yet but it strikes me as less ad-hoc.

I never use simpl myself but really only use cbn. A random guess but maybe using a different tactic than simpl might give better performance for some of your cases?

cbn tends to be slower, in my experience.

Would make sense. cbn is usually slow.

Could you do some kind of custom reflection based simplifier maybe?

I kind of want to figure out how to do a custom evaluation scheme (especially for setoids.)

Given something like

Class Stlc := {
  s: Type ;
  t: s -> t ;
  exp: s -> s -> s ;
  lam {a b}: (t a -> t b) -> t (exp a b) ;
  app {a b}: t (exp a b) -> t a -> t b ;

  Stlc_Setoid a: Setoid (t a) ;
  lam_app {a b} (f: t (exp a b)): lam (app f) == f ;
  app_lam {a b} (f: t a -> b): app (lam f) x == f x ;

It should be possible to use reflection to simplify everything.
I didn’t want to bother with an explicit environment context but obviously a real implementation would use one.
This sort of tagless final higher order abstract syntax at first looks attractive but in practice is hell to work with.

Anyway it would be really nice if this sort of thing was better supported in Coq or maybe I just need to get better at reflection/solvers.

I can figure out how to simplify simple things like categories but more complicated stuff is way beyond me.