Coq Community Survey 2022 Results: Part IV and ITP paper announcement

Coq Community Survey 2022 Results: Part IV and ITP paper announcement

TL;DR: This post completes a series of three previous posts (part I, part II, part III) with plots of answers to questions not yet analyzed in previous posts, and with the announcement of an article at ITP about the survey and the associated dataset.

The Coq Community Survey 2022 was organized by an ad hoc working group and conducted during February 2022. It contained 109 questions (in English and Chinese) and received 466 answers. See the first post in this series for more detailed information on the survey.

During the year that followed the survey, members of the Coq Community Survey 2022 working group analyzed the results and published a series of three blog posts. The first part focused on who respondents are, while the second part focused on how they use Coq. The third part then covered usage of Coq features, plugins, tools, and libraries.

Results contained in these three blog posts as well as results about renaming Coq were presented at the Coq workshop 2022. The abstract and the slides are available.

Finally, members of the working group published a peer-reviewed research paper in the proceedings of ITP 2023 (the Fourteenth Conference on Interactive Theorem Proving): https://doi.org/10.4230/LIPIcs.ITP.2023.12. This article contains novel analyses of survey results based on multiple regressions to uncover how different Coq users use Coq differently and express different needs. Some highlights from this paper: being an experienced user (> 5 years of experience) is associated with using the Coq package ecosystem and features more, but also with expressing lower satisfaction in IDEs and requesting more technical improvements to Coq; applying Coq to do software verification (67% of the respondents to the survey) is also associated with using the Coq package ecosystem and features more (these respondents are more likely to use external general libraries, more external tools, automation, SSReflect and extraction).

As an artifact to this paper, we have published a dataset on Zenodo that contains:

  • the Limesurvey structure (in a format that makes it easy to reuse, and under a CC-BY license),
  • the analysis code that was used to generate the plots and the regression analyses,
  • the plots for the answers to all closed questions (most of the plots that were not presented in our previous posts also appear below),
  • sanitized exports of all the answers to open text questions,
  • and a manual analysis of the answers to some open text questions.

The artifact does not contain the raw data that would be required to reproduce the plots and analysis results. As announced in the survey front page, these data have been deleted after one year. In retrospect, we believe that this retention period was too short and it has hindered doing the most with these data.

In the rest of this post, we provide plots summarizing the answers to the questions that were not analyzed in previous posts. It has 4 parts:

  1. Improvements to Coq and its documentation
  2. Contributing to Coq
  3. Code of conduct and community health
  4. Renaming Coq

Improvements to Coq and its documentation

Coq improvements

This question asked “In order to make you more productive in Coq and to encourage others to learn and use Coq, how important are improvements in the following areas (relative to their current state)? (Skip any item you don’t have an opinion on, or if you have no idea of what the current state is.)”

coq-improvements-majorityjudgmentplots.svg

The percentages correspond, in order, to the proportion of “Unwise”, “Unimportant”, “Worthwhile”, and “Essential” answers. Improvement areas are ordered according to the majority judgment methodology.

We can see that three improvement areas (fixing bugs, documentation, and refining existing features) are considered essential by a majority of respondents, while the rest of the improvement areas are considered worthwhile by a large majority, with a varying proportion of respondents considering them essential. The “major new features” area stands out as the one having the most respondents answering “unwise” (4%) even though they are a lot more respondents answering “essential”. Only three areas have more respondents answering “unwise” or “unimportant” than respondents answering “essential”, with “performance improvements” standing out as having the larger number of “unimportant” answers (26%), while also have a rather large number of “essential” answers (24%).

Documentation improvements

This question asked “How important are improvements to the following aspects of the Coq documentation (relative to their current state)? (Skip any item you don’t have an opinion on, or if you have no idea of what the current state is.)”

doc-improvements-majorityjudgmentplots.svg

Only one area of documentation improvement has a majority of respondents answering essential: “Examples of tactics and commands”. The rest of the areas have a majority of “worthwhile” answers, with the “essential” answer always coming second.

This question was complemented by an open text question “Feel free to elaborate on any of the items listed above, their importance, etc. Are there other improvements that you think would be important?” for which a manual analysis is available as part of the Zenodo artifact.

Languages to support

This question asked “Do you agree that the availability of more written materials in languages other than English would encourage more people to use Coq? (Skip if you don’t know.)”

non-english-doc-divergentplots.svg

This plot shows the responses as divergent bars centered in the middle of the “Neutral” answer, for various groups of respondents (based on demographics questions) whose size is given in parentheses. We can see that there is a larger share of respondents who agree or strongly agree with this statement in anglophone locations (the anglophone locations that were represented among our respondents are USA, UK, Canada, Australia and India) compared to other locations (but there are also less respondents from anglophone locations who answered this question). When we analyse (in the ITP paper) which population is most likely to provide answers to the next question (asking which language to support), then the respondents that are outside Europe and North America are the most likely to provide answers. In this plot, we can also see that newcomers are more numerous to agree or strongly agree with the need for more written materials in non-English languages compared to long-time Coq users.

The next question asked “What languages would be the most useful to support?”. This was an open text question, but which is suitable to quantitative analysis.

languages-barplot.svg

Chinese is a common answer, but recall that the survey was available in English and Chinese (and in no other language), so Chinese-speaking users who do not speak English are very likely over-represented when compared with non-English-speaking and non-Chinese-speaking users.

Contributing to Coq

The questions below were also part of the section on the survey on Coq improvements, but they asked more specifically about contributing to Coq and the contribution experience.

tried-contributing-multiple-barplot.svg

We can see that newcomers have very rarely tried contributing to the Coq GitHub project, while a large minority of long-time users have tried contributing.

For respondents who answered “Yes” to the previous question, a follow-up question was asked:

contributing-result-barplot.svg

For respondents who answered “No” to the question about whether they had every tried to contribute changes, a follow-up question was asked:

why-not-contribute-multiple-barplot.svg

The only reason not to contribute where we can see a large difference between newcomers and long-time users is “programming skills.”

why-not-contribute-upset.svg

For both contributors and non-contributors, a final question asked “How much do the following hinder your ability or desire to contribute changes to Coq?”

contributing-hindrances-divergentplots.svg

Of the six proposed answers, only two (insufficient assistance and review process) have a (small) majority of respondents answering “Does not hinder”. On the other hand, the lack of internal documentation, the complexity of the code, the difficulty to assess which tasks would be most helpful and less difficult are all cited as hindering noticeably the contributing experience by a majority of respondents.

The survey also contained an open text question “Feel free to elaborate on the contributing experience, what we can do better, or why you do not contribute.”, for which a manual coding of the answers is available as part of the Zenodo artifact. Combined with the other answers to the question “Why not (contribute to Coq)?”, we can highlight a few recurring themes of why people do not contribute changes to the Coq project:

  • Feel underqualified / intimidated (9 answers to the long text question and 10 other answers to the why not contribute question).
  • Unclear which changes are needed / which changes would be likely to be accepted (5 answers to the long text question and 2 other answers to the why not contribute question).
  • The need for more / better documentation for developers / contributors was highlighted by 7 respondents to the long text question. 2 respondents additionally highlighted the need for a clear roadmap.
  • Finally, in the other answers to the why not contribute question, some respondents clarify that they plan to contribute (3 answers) or that they are already contributing in other ways (by opening bug reports, by contributing to the ecosystem), 2 respondents feel like the project is unwelcoming, and 8 respondents have not encountered the need to improve anything yet.

Code of conduct and community health

Coq has had a Code of Conduct since 2018. This code, while being available in the Coq repository, applied to “all spaces managed by the Coq development team” (including “the GitHub repository, the Discourse forum, the Zulip chat, the mailing lists, physical events like Coq working groups and workshops”).

coc-awareness-multiple-barplot.svg

A majority of respondents to the survey were not aware of the existence of the Code of Conduct. This is particularly true of newcomers. While a small majority of long-time users were aware of the Code of Conduct, many of them had not read it. While the Code of Conduct was initially created to respond to a need that arised in the Coq GitHub project, its scope is supposed to be larger, but the answers to this question clearly show that it is not advertised broadly enough yet (e.g., until recently, there was no direct link to the Code of Conduct from the Coq website).

The next question asked “Have you ever, while contributing to Coq or participating in the Coq community (online or offline, on official or non-official channels), encountered situations in which you have felt unwelcome or treated unfairly?”.

unfairness-multiple-barplot.svg

Very few respondents (16) have encountered such situations. Perhaps unsurprisingly, women and non-binary people are overrepresented among them (4), compared to their overall share among the respondents. Yet, we cannot hope to draw reliable conclusions from such small numbers.

The survey continued by asking the respondents who had encountered such situations to provide some additional details about said situations. However, it didn’t include any open text questions in order to avoid risks of re-identification of respondents. Instead, it advised that respondents could write to coq-conduct@inria.fr to report these situations (even if they are ancient). The Coq Code of Conduct enforcement team did not, however, receive any such report following the survey. Most, but not all of the respondents who had answered yes to the previous question answered the following ones.

unfairness-how-often-barplot.svg

Most respondents experienced situations of unwelcomeness or unfairness less than once per year, but there are people who have experienced these situations yearly or even monthly.

unfairness-where-barplot.svg

unfairness-where-upset.svg

The most common location to encounter these situations was in-person events (6 respondents), although online occurrences represent a larger share, taken together.

Renaming Coq

One of the things that triggered the Coq Community Survey 2022 was to ask about a potential renaming of Coq (although the idea of organizing such a survey had been discussed before and independently). The results presented in this section were previously presented at the Coq Workshop 2022 (quantitative results only) and in presentations to the Coq team. At the present time, the Coq team is still working on the renaming question in close collaboration with Inria officials.

Should Coq be renamed?

The survey asked whether Coq should be renamed with responses on a 5-level Likert scale ranging from “Absolutely” to “Absolutely not”.

Using respondents’ answers to questions from the demographics section, it is possible to observe how answers to this first general question vary across groups of respondents.

General renaming views

This plot shows the responses as divergent bars centered in the middle of the neutral “I don’t care / I don’t know” answer, for various subpopulations whose size is given in parentheses.

Overall, the survey answers reveal that the Coq community slightly favors renaming Coq (42% in favor, 21% neutral, 36% against). However, there are more significant differences on renaming when demographic groups are considered in isolation.

Respondents in English-speaking locations are about 5 times more likely to favor renaming than respondents in other locations, who are slightly more likely to be against renaming than in favor of it.

In favor Against
English-speaking locations 99 28
Other locations 84 120
Fisher’s exact test
p-value 5.00e-11
odds ratio 5.05

We ran many regressions to learn which demographic characteristics were associated with a different opinion. The only characteristic that was statistically significant (at the 0.05 level) was whether the respondent is from an English-speaking or a non-English-speaking location.

Even though it can be observed on the plot above, the difference of opinion between men and women is too small to compensate for the small sample size. Therefore, from the data we have, we cannot confidently claim that women are more likely to favor renaming than men.

In favor Against
Women 15 10
Men 161 138
Fisher’s exact test
p-value 0.68
odds ratio 1.29

Also of note is that despite the large representation of respondents from the USA in our survey, they do not drive the results from the English-speaking locations. Responses across English-speaking countries (USA, UK, Canada, Australia and India) were similar.

In favor Against
USA 63 21
Other English-speaking locations 36 7
Fisher’s exact test
p-value 0.37
odds ratio 0.58

Open text answers

The survey included an open text question on whether to rename Coq: “If you wish to elaborate on why Coq should / should not be renamed, feel free to do it here.”

With 187 answers, this was the most popular open text question in the survey. The sanitized answers to this question (as well as every other open text question) are made available as part of our Zenodo artifact.

These answers were systematically coded using an open coding strategy. This means that the codes were derived by observing patterns in the data.

The codes that appear in at least 10% of the answers (18 or more answers) are:

Code Number of answers Frequency in the open text answers
Discomfort (it is uncomfortable or awkward to have to use the name Coq) 44 24%
No need (it is childish to change the name for how it sounds or the discomfort can be easily avoided) 37 20%
Cost (cost that the renaming will incur, in particular reputation cost and discontinuity cost) 33 18%
Valid pros (acknowledging that the concerns expressed in the mailing list or the web are valid) 30 16%
Inclusion (renaming will make the community more welcoming and remove entry barriers) 28 15%
Cultural domination (renaming Coq would mean giving up to American/English cultural domination) 24 13%
Does not solve the problem (renaming will not solve the issues raised by the people wanting to rename) 19 10%
No strong opinion/not affected 18 10%

Among people against renaming (83 answers), the most common answers are:

Code Number of answers Frequency
No need 35 42%
Cost 30 36%
Cultural domination 20 24%
Does not solve the problem 17 20%
Respect heritage 10 12%
Coq is a good name 10 12%

Among people in favor of renaming (87 answers), the most common answers are:

Code Number of answers Frequency
Discomfort 40 46%
Inclusion 26 30%
Valid pros 20 23%
Avoid harassment 14 11%

Among people that are neutral about renaming (15 answers):

Code Number of answers Frequency
No strong opinion/not affected 11 73%

Renaming options

The next question was: “If Coq is renamed, how would you rate the following options?”

Then a table was provided with various general strategies and specific renaming options, which the respondents could rate with a five-level scale, from “Love” to “Hate”.

The options were chosen from the names proposed on the wiki page that was created following the discussion that sparked this process. Unfortunately, this wiki page had so many suggestions that an arbitrary choice had to be made to reduce it to a reasonable selection for inclusion in the community survey.

For the answers to this question, we plot preferences according to the majority judgment ranking method. The number of respondents to each question is provided in parentheses and the percentage of each response is provided on the right-hand side.

General renaming strategies

Below, we plot the preferences regarding the general renaming strategies, first for the whole respondent population, then for specific subpopulations (in favor, neutral or against a renaming).

We observe that the preferred strategy is completely reversed from respondents in favor of the renaming to respondents against it, with respondents in favor of the renaming preferring a more radical change and respondents against it preferring the less disruptive pronunciation change. The neutral respondents are in favor of the compromise options (making the name sound more French or longer). We should note that for all the categories except “Radical change”, the question commented “will not require changing command-line tools”, and so the evaluation of the cost of renaming was lower.

Preferred strategy for all respondents

Preferred strategy for all respondents: Make the name longer, Radical change, Make the name sound more French, Change pronunciation

Preferred strategy for respondents in favor of a renaming

Preferred strategy for respondents in favor of a renaming: Radical change, Make the name longer, Make the name sound more French, Change pronunciation

Preferred strategy for neutral respondents

Preferred strategy for neutral respondents: Make the name sound more French, Make the name longer, Change pronunciation, Radical change

Preferred strategy for respondents against a renaming

Preferred strategy for respondents against a renaming: Change pronunciation, Make the name sound more French, Make the name longer, Radical change

New names

When we look at the specific proposals that were included in the survey, we obtain the following results (for all respondents):

Renaming options

Only 5 options (Rocq, Copa—Coq proof assistant—, Gallo, C.O.Q.—only a change in pronunciation—, and LPC—Le Prouveur Coq—) are at least found acceptable by a majority of respondents. No option receives more likes than dislikes.

When looking at the breakdown by subpopulation (below), we can notice that Rocq and Gallo are strongly supported by the respondents in favor of a renaming (Rocq is even liked by a majority of respondents in favor of a renaming), but are strongly rejected by those against a renaming, while LPC and Copa could be viewed as more consensual options, which no one ranks very high, but that no one rejects as much. Among these 5 options, only the pronunciation to C.O.Q. is rejected by a majority of the respondents in favor of a renaming.

Preferred names for respondents in favor of a renaming

Preferred names for respondents in favor of a renaming

Preferred names for neutral respondents

Preferred names for neutral respondents

Preferred names for respondents against a renaming

Preferred names for respondents against a renaming

Open text answers

The survey included an open text question on the renaming options: “If you wish to share any specific arguments in favor or against some specific name choices, please do so here.”

With 91 answers, this was the third most popular open text question from the survey.

People shared pros and cons about specific renaming options as well as about general renaming strategies. We limit the reporting here to the general strategies and the specific options that were not rejected by a majority of respondents. A more complete manual analysis is available as part of the Zenodo artifact.

Options that keep Coq as a substring
Pros Numbers of answers
No renaming of command line tools 2
Preserves recognizability / reputation 2
Cons Numbers of answers
Same problems as Coq 11
People will keep referring to it as Coq 3
Make the name sound more French
Cons Numbers of answers
Harder to pronounce 5
Sounds awful 1
Same problems as Coq 1
People might not adopt it 1
Make the name longer
Pros Numbers of answers
Minimize needed changes 3
Cons Numbers of answers
Same problems as Coq 4
People might not adopt it 2
It is going to get shortened, possibly to Coq 2
Radical change
Pros Numbers of answers
It won’t allow for uncomfortable wordplay, unlike the other options 1
People won’t have a way of referring to it as Coq 1
Change the pronunciation to C.O.Q.
Pros Numbers of answers
No need to change tool names 1
Cons Numbers of answers
People might not adopt it 4
“c’est au cul” in French 3
Acronyms are bad 3
People try to pronounce acronyms as words 2
Lampshades how embarrassing saying “Coq” is 2
LPC (Le Prouveur Coq)
Pros Numbers of answers
Easy to say the letters as non-French speaking 2
Sounds professional 1
Not a complete break with Coq, just hiding 1
Cons Numbers of answers
Acronyms are bad 2
LPC is also an old and dead programming language for MUDs 1
Copa (Coq Proof Assistant)
Pros Numbers of answers
Keeps the name intact within the community but not outside 1
Unique enough that it is googleable 1
Obvious spelling 1
Cons Numbers of answers
May overlap with COPPA, the Children’s Online Privacy Protection Act 1
It’s the name of an airline 1
Sounds like dirt or litter in Russian (cop) 1
Means bra size in Portuguese 1
Rocq
Pros Numbers of answers
Named after the origin 4
Still a bird 3
Short 3
Feels like Coq to say and write 3
Keeps the same letters as Coq 2
Has a nice ring to it 2
“Rock” has good connotations: rock solid, rock star 2
Echoes Coquand with the “q” 1
Roc is a nice reference from Heroes of Might and Magic 3 1
Does not sound like a euphemism 1
Relatively few Google hits 1
Memorable 1
Easy to pronounce 1
Cons Numbers of answers
Roc is already a programming language: https://www.roc-lang.org/ 2
ROCm is already an AMD project 1
Requires changing the tooling 1
Does not keep “coq” as a substring 1
“Rock” has bad connotations: playing with rocks 1
Gallo
Pros Numbers of answers
Coincides with Gallina 2
Unique enough that it is googleable 1
Obvious spelling 1
Cons Numbers of answers
Hides the French heritage 3
Very common word 2
Hard to pronounce 1
Sounds like cock in Finnish (kalu) 1
The Spanish and Italian pronunciations are different 1
1 Like