Conclusion
Graph comparison (⚠ different scales) from before and after excluding/blacklisting 58 repositories:
The mismatch in the numbers (1166 - 58 = 1108, not 1135) is a result of new repositories being created and indexed in the meantime.
Comparing previously displayed numbers in korma:
Data displayed for month ↓ on date ➝ | 2015-06-22 | 2015-12-02 after excluding 8 repos | 2016-04-08 after excluding 58 repos |
January 2014 ⚠ | 414 | 303 | 263 |
May 2014 | 382 | 270 | 245 |
August 2014 | 371 | 278 | 259 |
November 2014 ⚠ | 228 | 233 | 189 |
May 2015 | 225 | 235 | 213 |
Diff Jan 2014 to Nov 2014: | -45% | -23% | -28% |
Diff Jan 2014 to Jan 2015: | -19% | ||
Diff Feb 2014 to Feb 2015: | -10% | ||
Diff Oct 2014 to Oct 2015: | -12% | ||
Diff Nov 2014 to Nov 2015: | +1.6% | ||
Diff Jan 2015 to Jan 2016: | +0.5% | ||
Diff Mar 2015 to Mar 2016: | -21% | ||
Diff Jan 2014 to Jan 2016: | -19% | ||
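The percentages in the "Diff Jan 2014 to Nov 2014" row can be reproduced from the monthly counts in the table. A minimal sketch of that arithmetic, assuming each figure is the relative change rounded to the nearest percent (the function name `percent_change` is made up for illustration and is not part of korma):

```python
def percent_change(before: int, after: int) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100


# Author counts for January 2014 and November 2014, keyed by the date
# on which korma displayed them (values copied from the table).
snapshots = {
    "2015-06-22": (414, 228),
    "2015-12-02": (303, 233),
    "2016-04-08": (263, 189),
}

for date, (jan_2014, nov_2014) in snapshots.items():
    print(f"{date}: {percent_change(jan_2014, nov_2014):+.0f}%")
# 2015-06-22: -45%
# 2015-12-02: -23%
# 2016-04-08: -28%
```

The same function can be applied to any two months picked from the scm-contributors page.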
Note that there may still be more repositories that were imported or pulled from upstream at some point in the past and have not been updated since. We also have repositories that are "mixed".
So I'd say we indeed lost contributors in Wikimedia Git.
It's somewhere between "stable" and "losing a few". http://korma.wmflabs.org/browser/scm-contributors.html is available for everybody's interpretation and for picking up specific months to compare to each other.
Original description
According to the "Authors" graph at http://korma.wmflabs.org/browser/scm.html, in 12 months we have lost about 40% of code contributors (users that got their code merged in Wikimedia hosted repositories).
It is a significant number and according to the graph it's not an anti-spike but a consolidated number.
January 2014: 414; May 2014: 382; August 2014: 371; November 2014: 228 !!!; May 2015: 225
@Qgil is happy to highlight this number to WMF management and our community, but first we need to be sure that these numbers are correct and not the result of a software bug or some other misunderstanding, e.g. single users who previously committed from multiple addresses and now commit from only one. CCing here different people who are good at looking at raw data and stats. Your help is welcome!
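One way to test the multiple-addresses hypothesis is to compare the number of distinct commit emails with the number of distinct people once aliases are merged. A rough sketch, assuming commits are available as (name, email) pairs and that an identical normalized name means the same person. The sample records and the helper `count_authors` are invented for illustration, and this is not how korma itself merges identities:

```python
def count_authors(commits):
    """Return (distinct emails, distinct normalized names) for the commits."""
    emails = {email.lower() for _, email in commits}
    # Normalize names: lowercase and collapse runs of whitespace.
    names = {" ".join(name.lower().split()) for name, _ in commits}
    return len(emails), len(names)


# Hypothetical sample data: one person committing from two addresses.
commits = [
    ("Alice Example", "alice@example.org"),
    ("Alice Example", "alice@work.example.com"),
    ("Bob Example", "bob@example.org"),
]

by_email, by_name = count_authors(commits)
print(by_email, by_name)  # 3 2
```

If the gap between the two counts shrank sharply between two months, part of the apparent drop would be an artifact of identity handling rather than a real loss of contributors.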
The Engineering Community team review is on July 6 (draft materials to be presented on June 30), and this would be a good context in which to present these numbers.