[Very basic] stack overflow and computability

I’m new to Coq. I have two questions for the following code.

Fixpoint plus (n : nat) (m : nat) : nat :=
  match n with
  | O => m
  | S n' => S (plus n' m)

Notation "x + y" := (plus x y)
                       (at level 50, left associativity)
                       : nat_scope.

Check 10000 + 10000 = 20000. (** stack overflow occurs **)

This is the simple implementation of addition and checks a unit test via Check 10000 + 10000 = 20000.

  1. The computation 100 + 100 = 200 works, but for large numbers, a stack overflow error occurs. Is there a way to compute deeply nested types without encountering this issue?

  2. In some programming languages, 10000 + 10000 is executed efficiently with a few instructions. However, in Coq, it seems the compiler unpacks types one by one (e.g., S (S ...) -> S (...)) leading to potential overhead. Does the Coq compiler perform this unpacking sequentially, or does it have optimizations based on mathematical or algorithmic techniques?

Thank you.

I am not particularly a Coq expert myself, anyway this is one I have asked at some point, so I will try and share at least a first approximation.

for large [nat] numbers, a stack overflow error occurs

nat indeed is not meant for efficient computation, on the contrary, it is the “natural” and simplest definition we can give (it’s Peano’s definition), and we would mostly use it for specification as opposed to implementation, since we (tend to) trust our judgment as to whether e.g. a function in nat computes what its name says, while an efficient implementation is harder to assess and indeed it is rather proven correct by proving equivalence with the “natural” definition.

Is there a way to compute deeply nested types without encountering this issue?

I think the short answer is no, although for the technical details I can only refer you to the “Core language” chapter of the reference documentation, which I’d say is a must-read (to begin with) for the Coq programmer: Core language — Coq 8.18.0 documentation

I cannot say more since I am myself not sure whether the “stack overflow” is due to excess recursion or the plain size of the term(s): just maybe notice that 10000 in nat is notation for a literal term that has 10000 Ss in it, so computation is not even involved there (AFAICT).

In some programming languages, 10000 + 10000 is executed efficiently

For efficient computation, we have to go with different constructions than nat. For example, the standard library has NArith, where the natural numbers are represented by N which is a binary encoding, akin to the usual machine representations except it has arbitrary size. Other libraries offer types that are even nearer to machine types, e.g. Uint63 still in the standard library, and of course many more.

The “drawback” is that writing recursive functions on (for example) N is not as straightforward as writing recursions on nat. So, we would then prove, not just test, that our implementation in N is equivalent to a corresponding implementation in nat, whence correct, here mostly courtesy of the Nnat library, still in the NArith group, that provides the bridge.

But eventually, as I get it, where proper efficiency is a requirement, the goal anyway is not to let Coq do heavy work, the implementation itself is rather extracted to code that can be compiled for some target architecture: OCaml, Haskell, Wasm, C/C++…

1 Like