Coq Community Survey 2022 Results: Part I

Coq Community Survey 2022 Results: Part I

In this post, we give an introduction to the survey and then provide the first part of a result summary. This part is about who is using Coq and in what context. See also the second part on how people are using Coq, the third part on usage of Coq features, plugins, tools, and libraries, and the fourth and last part with the rest of the results, the links to the associated research paper and Zenodo artifact.

We invite the Coq community to provide feedback on our summary as replies to this forum post. In particular, we are interested in hearing if something was unclear about survey questions and plots or if you have suggestions for improvements or questions you’d like to see answered with the survey data. For those who prefer, we also welcome feedback in this Zulip topic.

Introduction to the Survey

The Coq Community Survey 2022 was an online public survey conducted during February 2022. Its main goal was to obtain an updated picture of the Coq user community and inform future decisions taken by the Coq team and other Coq ecosystem participants. In particular, the survey aimed to enable effective decision making about the development of the Coq software, and also about matters pertaining to Coq ecosystem software maintained by users in academia and industry.

Survey Design, Deployment, and Analysis Process

The survey was designed, deployed, managed, and analyzed by members of a working group (WG), which was formed following a public call for participation by the Coq core team.

Survey questions were created by WG members collaboratively. The WG took inspiration from the previous Coq Community Survey in 2014, and similar more recent surveys for OCaml and other programming languages. Efforts were made to adhere to survey best practices, e.g., on ordering demographic questions late to avoid their answers affecting other questions.

The deployed survey was advertised to Coq users primarily via the following Coq community forums: Discourse, Zulip chat, Coq-club mailing list, Coq Twitter, and the Coq website. The WG also advertised in more general venues and other related communities, e.g., the Types-announce mailing list, Agda and Lean Zulip chats, and on language-specific forums in Chinese and in French (mailing lists of GDR IM and GPL).

After the survey closed, WG members collaboratively analyzed the response data, primarily using Jupyter. For GDPR conformance, the raw survey data was kept within the EU, using only Inria-hosted tools.

Scope of the survey

The survey - available in English and Chinese - has 109 questions, some of which were only visible depending on answers to previous questions. Broadly, the survey asked about (i) use of Coq features, IDEs, libraries, plugins, and tools, (ii) views on renaming Coq, on Coq improvements, and issues such as inclusion, and (iii) demographic data, such as age, gender, and experience with Coq and other proof assistants or programming languages. Answering was optional for all survey questions, and most questions allowed free-text answers. In particular, it was possible to answer the survey without having used Coq, albeit with many questions hidden.

Summary Part I: Who is using Coq and in what context?

In this summary part, the WG looked at the demographics of the Coq users who took the survey, and in what context they are using Coq. Context here means users’ experience before they started to use Coq, their reasons for using Coq, and their experience with similar languages and tools.

Demographics of Coq users

Where do survey respondents live?

Top seven answers

This plot shows the top respondent locations, with a breakdown for two subpopulations: newcomers and long-time users. Newcomers are defined as respondents who have declared they have used Coq for less than 2 years and less than 1 year ago, long-time users are defined as respondents who have declared they have used Coq for more than 5 years.

WG members expected, based on interactions in the various Coq community forums, the two top locations of respondents to be France (which is where Coq was initially created and where most of its core developers are still located) and the United States. However, some were surprised by the relatively high number of respondents from the United Kingdom, which already hosts separate communities around other proof assistants based on the HOL logic and Lean. The subpopulation breakdown highlights that the United Kingdom is indeed not a historical stronghold of Coq but hosts quite a number of newcomers (and similarly for China, Canada, and Russia).

The number of respondents from China mainland was likely boosted by the availability of a Chinese version of the survey. 44 respondents used the Chinese version, including the 30 respondents from China mainland, but also several outside China mainland, along with some who did not provide their location.

We provide a separate bar chart with the full list of locations of respondents, i.e., not just the top seven.

Age of respondents

Age of respondents

Coq is a demanding tool to learn, and people are primarily exposed to it during (university) classes. Hence, it is in line with the WG’s expectations that the majority of users are in the 20-39 age range. Having many young respondents (20-29) may be a result of community growth in that age group.

Gender of respondents

Gender of respondents

The gender distribution confirms the striking gender imbalance in the Coq community, something that is also observed in other software-related communities, like JavaScript or StackOverflow.

Employment of respondents

Employment of respondents

Employment of respondents overlap

Based on the WG’s observations of Coq community channels such as the Zulip chat, the Coq-club mailing list, and participation at academic conferences such as POPL, the expectation was that the majority of respondents would be academic researchers and students. However, we observe that the private sector represents a quite significant share as well.

Highest education completed by respondents

Education of respondents

Historically, Coq has been used mainly by researchers and graduate students, which is reflected in a high number of respondents with PhD degrees. In many universities, courses that use or teach Coq are considered advanced, and may thus not be available to undergraduate students (with high school or equivalent as highest obtained education).

Students and academics among respondents

Students and academics

By combining the responses on education and employment, it is possible to categorize respondents into “undergraduate students” (students with a high school diploma), “graduate students” (students with bachelor’s or master’s degrees) and “PhD academics” (respondents employed in academia who have a PhD degree). Note that the percentages are relative to the total number of respondents that fit in one of these three categories.

Research topics

Research topics

Research topics overlap

The 13 other answers were:

  1. Computer aided geometric design
  2. Quantum Computing
  3. digital hardware design
  4. Computational Choice theory
  5. Software Security
  6. Operating Systems, Compilers
  7. AI / Machine Learning
  8. Geometry
  9. Geometric Algebra
  10. real-time systems
  11. Formal security analysis
  12. Distributed computing
  13. Molecular biology

Research areas

Research areas

Research areas overlap

By combining research topics into larger areas, we obtain the plots above, which highlight that the Coq research community is mainly made of computer scientists, something that is later confirmed by looking at the programming experience and the Coq application domains of respondents.

Context of Coq use

Prior programming experience

Programming experience

Programming styles

Programming styles overlap

Most respondents have experience with all the three major programming styles. Logic programming (which can be emulated in Coq using inductive predicates and other techniques and tools) is a minority style, but still prevalent among respondents.

Prior functional programming languages experience

Functional programmming language experience

Functional programmming language experience overlaps

OCaml was developed in part as the implementation language of Coq, is primarily developed in France, and is taught to French undergraduate students in several French universities. Consequently, the WG expected many respondents would have experience with OCaml. However, Haskell experience turned out to be more prevalent among respondents. Nevertheless, many respondents have experience with both Haskell and OCaml, along with other languages such as Scheme.

The other answers were the following (note that the question asked about functional programming languages, yet some of the languages below are generally not considered functional programming languages):

Lisp 8
C 4
Clean 3
Java 3
Prolog 2
Caml light 2
Lean4 2
Mathematica 1
C++ 1
Rust 1
Kotlin 1
Pie 1
Miranda 1
The lambda calculus 1
K 1
R 1

Hearing about Coq

Hear about Coq

Starting to learn Coq

Learning Coq

Learning Coq overlaps

Length of Coq experience

Length of Coq experience

Last use of Coq

Last use of Coq

Purpose of using Coq

Purpose of using Coq

Purpose of using Coq overlaps

Coq application domains

Coq application domains

Coq application domains overlaps

Coq use in the workplace

Coq use in the workplace

The follow-up second question below was only shown to people who answered that Coq was well-established in their workplace.

Coq use trend in the workplace

Experience with other proof assistants

Other proof assistants experience

Other proof assistants experience share

Other proof assistants experience overlaps

Agda is the most-used other proof assistant among respondents. This may be due to several factors, such as (i) shared heritage with Coq, (ii) shared foundations with Coq, and (iii) Agda community forum outreach for the survey.

The other answers are the following:

Arend 7
Metamath 5
Dafny 4
Dedukti 3
Abella 3
Easycrypt 3
Why3 2
Hol 1
Cedille 1
Verifast 1
Pie (the little typer) 1
Cubicaltt 1
Anders 1
Granule 1
3 Likes