Coq Community Survey 2022 Results: Part I
In this post, we give an introduction to the survey and then provide the first part of a result summary. This part is about who is using Coq and in what context. See also the second part on how people are using Coq, the third part on usage of Coq features, plugins, tools, and libraries, and the fourth and last part with the rest of the results, the links to the associated research paper and Zenodo artifact.
We invite the Coq community to provide feedback on our summary as replies to this forum post. In particular, we are interested in hearing if something was unclear about survey questions and plots or if you have suggestions for improvements or questions you’d like to see answered with the survey data. For those who prefer, we also welcome feedback in this Zulip topic.
Introduction to the Survey
The Coq Community Survey 2022 was an online public survey conducted during February 2022. Its main goal was to obtain an updated picture of the Coq user community and inform future decisions taken by the Coq team and other Coq ecosystem participants. In particular, the survey aimed to enable effective decision making about the development of the Coq software, and also about matters pertaining to Coq ecosystem software maintained by users in academia and industry.
Survey Design, Deployment, and Analysis Process
The survey was designed, deployed, managed, and analyzed by members of a working group (WG), which was formed following a public call for participation by the Coq core team.
Survey questions were created by WG members collaboratively. The WG took inspiration from the previous Coq Community Survey in 2014, and similar more recent surveys for OCaml and other programming languages. Efforts were made to adhere to survey best practices, e.g., on ordering demographic questions late to avoid their answers affecting other questions.
The deployed survey was advertised to Coq users primarily via the following Coq community forums: Discourse, Zulip chat, Coq-club mailing list, Coq Twitter, and the Coq website. The WG also advertised in more general venues and other related communities, e.g., the Types-announce mailing list, Agda and Lean Zulip chats, and on language-specific forums in Chinese and in French (mailing lists of GDR IM and GPL).
After the survey closed, WG members collaboratively analyzed the response data, primarily using Jupyter. For GDPR conformance, the raw survey data was kept within the EU, using only Inria-hosted tools.
Scope of the survey
The survey - available in English and Chinese - has 109 questions, some of which were only visible depending on answers to previous questions. Broadly, the survey asked about (i) use of Coq features, IDEs, libraries, plugins, and tools, (ii) views on renaming Coq, on Coq improvements, and issues such as inclusion, and (iii) demographic data, such as age, gender, and experience with Coq and other proof assistants or programming languages. Answering was optional for all survey questions, and most questions allowed free-text answers. In particular, it was possible to answer the survey without having used Coq, albeit with many questions hidden.
Summary Part I: Who is using Coq and in what context?
In this summary part, the WG looked at the demographics of the Coq users who took the survey, and in what context they are using Coq. Context here means users’ experience before they started to use Coq, their reasons for using Coq, and their experience with similar languages and tools.
Demographics of Coq users
Where do survey respondents live?
This plot shows the top respondent locations, with a breakdown for two subpopulations: newcomers and long-time users. Newcomers are defined as respondents who have declared they have used Coq for less than 2 years and less than 1 year ago, long-time users are defined as respondents who have declared they have used Coq for more than 5 years.
WG members expected, based on interactions in the various Coq community forums, the two top locations of respondents to be France (which is where Coq was initially created and where most of its core developers are still located) and the United States. However, some were surprised by the relatively high number of respondents from the United Kingdom, which already hosts separate communities around other proof assistants based on the HOL logic and Lean. The subpopulation breakdown highlights that the United Kingdom is indeed not a historical stronghold of Coq but hosts quite a number of newcomers (and similarly for China, Canada, and Russia).
The number of respondents from China mainland was likely boosted by the availability of a Chinese version of the survey. 44 respondents used the Chinese version, including the 30 respondents from China mainland, but also several outside China mainland, along with some who did not provide their location.
We provide a separate bar chart with the full list of locations of respondents, i.e., not just the top seven.
Age of respondents
Coq is a demanding tool to learn, and people are primarily exposed to it during (university) classes. Hence, it is in line with the WG’s expectations that the majority of users are in the 20-39 age range. Having many young respondents (20-29) may be a result of community growth in that age group.
Gender of respondents
The gender distribution confirms the striking gender imbalance in the Coq community, something that is also observed in other software-related communities, like JavaScript or StackOverflow.
Employment of respondents
Based on the WG’s observations of Coq community channels such as the Zulip chat, the Coq-club mailing list, and participation at academic conferences such as POPL, the expectation was that the majority of respondents would be academic researchers and students. However, we observe that the private sector represents a quite significant share as well.
Highest education completed by respondents
Historically, Coq has been used mainly by researchers and graduate students, which is reflected in a high number of respondents with PhD degrees. In many universities, courses that use or teach Coq are considered advanced, and may thus not be available to undergraduate students (with high school or equivalent as highest obtained education).
Students and academics among respondents
By combining the responses on education and employment, it is possible to categorize respondents into “undergraduate students” (students with a high school diploma), “graduate students” (students with bachelor’s or master’s degrees) and “PhD academics” (respondents employed in academia who have a PhD degree). Note that the percentages are relative to the total number of respondents that fit in one of these three categories.
Research topics
The 13 other answers were:
- Computer aided geometric design
- Quantum Computing
- digital hardware design
- Computational Choice theory
- Software Security
- Operating Systems, Compilers
- AI / Machine Learning
- Geometry
- Geometric Algebra
- real-time systems
- Formal security analysis
- Distributed computing
- Molecular biology
Research areas
By combining research topics into larger areas, we obtain the plots above, which highlight that the Coq research community is mainly made of computer scientists, something that is later confirmed by looking at the programming experience and the Coq application domains of respondents.
Context of Coq use
Prior programming experience
Most respondents have experience with all the three major programming styles. Logic programming (which can be emulated in Coq using inductive predicates and other techniques and tools) is a minority style, but still prevalent among respondents.
Prior functional programming languages experience
OCaml was developed in part as the implementation language of Coq, is primarily developed in France, and is taught to French undergraduate students in several French universities. Consequently, the WG expected many respondents would have experience with OCaml. However, Haskell experience turned out to be more prevalent among respondents. Nevertheless, many respondents have experience with both Haskell and OCaml, along with other languages such as Scheme.
The other answers were the following (note that the question asked about functional programming languages, yet some of the languages below are generally not considered functional programming languages):
Lisp | 8 |
C | 4 |
Clean | 3 |
Java | 3 |
Prolog | 2 |
Caml light | 2 |
Lean4 | 2 |
Mathematica | 1 |
C++ | 1 |
Rust | 1 |
Kotlin | 1 |
Pie | 1 |
Miranda | 1 |
The lambda calculus | 1 |
K | 1 |
R | 1 |
Hearing about Coq
Starting to learn Coq
Length of Coq experience
Last use of Coq
Purpose of using Coq
Coq application domains
Coq use in the workplace
The follow-up second question below was only shown to people who answered that Coq was well-established in their workplace.
Experience with other proof assistants
Agda is the most-used other proof assistant among respondents. This may be due to several factors, such as (i) shared heritage with Coq, (ii) shared foundations with Coq, and (iii) Agda community forum outreach for the survey.
The other answers are the following:
Arend | 7 |
Metamath | 5 |
Dafny | 4 |
Dedukti | 3 |
Abella | 3 |
Easycrypt | 3 |
Why3 | 2 |
Hol | 1 |
Cedille | 1 |
Verifast | 1 |
Pie (the little typer) | 1 |
Cubicaltt | 1 |
Anders | 1 |
Granule | 1 |