Very good questions. I hope the thread continue to receive answers as obviously no single answer to this will be sufficient.
Let me try to answer what I can but a big disclaimer is that I’m not a kernel expert at all:
It evolves a lot: Recent changes — Coq 8.19+alpha documentation
- release 8.10 introduced the new sort
SProp
, primitive integers and a new unfolding heuristic in termination checking. The last one is not supposed to have any effect on the logical foundations. The primitive integers are primarily there for performance: technically, they add a new logical object, but this object could be simulated in the previous version of the logic. Finally, theSProp
sort is really novel from the theoretical point of view and comes with an associated paper which justifies consistency relative to the previous theory: https://dl.acm.org/citation.cfm?id=3290316 - release 8.9 introduced mutually recursive records as you noted. In theory, this expands the logical foundations, but records can be emulated using inductive types, so in practice, it does not change the expressive power. There could still be subtle bugs due to the implementation of reduction rules that could differ between inductive types and records…
- release 8.8 restricted what is accepted in the kernel to recover a property that was lost (subject reduction)
- release 8.7 introduced universe cumulativity for inductive types. There is a paper about it explaining why it is supposed to be consistent: https://pdfs.semanticscholar.org/78d3/c32c9db513fcf15368fd377e688bcd84d9e4.pdf
- release 8.6 contained a much more efficient (and much more complex) cycle detection algorithm for checking the consistency of universe constraints: this does not change anything on the foundational level but it could have introduced bugs. This algorithm has recently been formalized http://gallium.inria.fr/~agueneau/publis/gueneau-jourdan-chargueraud-pottier-2019.pdf and the formally verified version has been integrated in Dune: Use a formally verified incremental cycle detection algorithm by Armael · Pull Request #1955 · ocaml/dune · GitHub. It is probably going to be integrated in Coq too, or at least in MetaCoq.
This is all without counting the many critical bug fixes (that for the most part were related to implementation problems).
Some developers have branches adding new stuff to the kernel (such as Inductive-Inductive) and some users are really pressing the developers to merge them, but they like to take their time to be sure of that it doesn’t introduce problems. On the other hand, there are already very complex stuff in the kernel, that have been there for a long time and that threaten its consistency (see also my answer here).
The CIC chapter of the reference manual should be the canonical reference: Typing rules — Coq 8.18.0 documentation. As you mentioned, there are also a lot of papers, none of which tries to be complete with what is actually implemented (and if they were doing that, they wouldn’t last before it became outdated).
The most promising effort (but a hard one) is the MetaCoq project that was mentioned above. There was an attempt at providing an alternative checker in Rust but it hasn’t seen much progress recently: GitHub - ppedrot/kravanenn: A set of tools for Coq written in Rust
Currently there is not nearly enough documentation of the vo
format for someone external to the development team to implement a new kernel from scratch I think. And it would be a huge effort.
You can disable the VM and the native compute machinery, you can turn off several options related to universe polymorphism and to primitive projections (actually most are already off by default), and you can use the Coq checker, which defunctorialize the Coq code it is fed (which can lead to performance blow up in some cases). However, there currently is no generic mechanism for doing this.
What some researchers would like to do is to build the kernel in layers and to provide transformations that allow you to use one advanced feature without relying on its implementation in the kernel, but instead compiling it down to more basic stuff (at the price of lost of performance), like what the checker does with functors.
They use mostly the same code, but with some differences in the way some stuff is handled (such as functors). The checker loads object-files and does not need to know about plugins, nor about notations, etc.
Don’t worry. At the current time, it would be unfortunately very wrong to have 100% trust in the Coq kernel. However, that doesn’t make Coq unsafe for verification challenges (in non-adversarial contexts). The efforts going on towards a formally verified checker should help lift these limitations.