User:Harej (WMF)/cloud
Executive summary
Introduction
Wikimedia Cloud Services expands on the core technical infrastructure of the Wikimedia projects by providing technical resources for tool developers and others working on software that benefits the Wikimedia movement. The Technical Engagement team at the Wikimedia Foundation wants to understand who uses Cloud Services, what they use Cloud Services for, and why. Since 2015 we have surveyed developers on our Toolforge platform to learn how to improve that service; this year, we expanded our survey to include both Toolforge and Cloud VPS. We also included questions on tool development and MediaWiki development as major use cases for our platforms. With this additional feedback, we hope to get a better sense of how Wikimedia Cloud Services can contribute to the Wikimedia movement.
Methodology
We prepared a questionnaire containing 24 questions, covering topics including basic demographic information, use of the Toolforge and Cloud VPS platforms, specific use cases of Cloud Services, and feedback around the Cloud Services. Respondents answered different parts of the survey depending on whether they have used Toolforge or Cloud VPS or whether they identify as a "tool developer."
The survey was distributed to 1,722 Wikitech users based on their membership in one or more Cloud VPS projects (including the Tools project). The survey was completed by 163 respondents, a response rate of 9.5%. As this is not a random sample but a self-selecting sample, the results of this survey may not necessarily be statistically representative of the Cloud Services user population as a whole; rather, they reflect the perspectives of those who were motivated enough to respond.
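The response rate quoted above can be reproduced with a few lines of Python. This is only an illustrative sketch: the function name `response_rate` is our own, and the two inputs are the figures stated in the paragraph above.

```python
def response_rate(completed: int, invited: int) -> float:
    """Return the response rate as a percentage of invited users."""
    if invited <= 0:
        raise ValueError("invited must be a positive number")
    return 100.0 * completed / invited


# Figures from the methodology text: 163 completed responses
# out of 1,722 Wikitech users who received the survey.
rate = response_rate(163, 1722)
print(f"Response rate: {rate:.1f}%")
```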
The response set was divided into these demographic cohorts:
- Tool developers vs. non-tool developers
- Users of the predecessor Toolserver service vs. otherwise
- Wikimedia Foundation staff vs. non-staff
- Hours per week spent on tool development
- Number of tools maintained
- Number of years using Toolforge
For free-text responses, we tagged each response (where possible) with one or more categories, based on the different topics covered by the responses.
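As a rough illustration of how tagged free-text responses can be tallied, the sketch below counts how many responses carry each category. The response IDs and their tags are hypothetical placeholders, not actual survey data, and the tagging itself is assumed to have been done by hand.

```python
from collections import Counter

# Hypothetical example data: each entry pairs a response ID with the
# set of categories assigned to it. An empty set stands for a response
# that could not be tagged.
tagged_responses = [
    ("r1", {"documentation"}),
    ("r2", {"software support", "ease of use"}),
    ("r3", {"documentation", "ease of use"}),
    ("r4", set()),
]


def tally_categories(responses):
    """Count how many responses were tagged with each category."""
    counts = Counter()
    for _response_id, categories in responses:
        # A response with several tags counts once toward each of them.
        counts.update(categories)
    return counts


counts = tally_categories(tagged_responses)
for category, n in counts.most_common():
    print(f"{category}: {n} responses")
```

Because a single response may carry several tags, the per-category counts can add up to more than the number of responses.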
Findings
Demographics
Wikimedia Cloud Services provides computing resources free of charge to members of the community while also acting as an internal service provider for Wikimedia Foundation software engineers. Surveyed users of Toolforge and Cloud VPS are predominantly tool developers from the community: 85.9% of survey respondents identified as tool developers, while only 23.3% of survey respondents reported working for the Wikimedia Foundation as an employee, contractor, vendor, or intern. Though relatively few in number, staff possibly have a significantly different experience from the volunteer population. Affiliation with the Wikimedia Foundation as a member of staff was found to be associated with the frequency with which the Cloud Services team was contacted for support (p = 0.01), with feeling supported by the Cloud Services team (p < 0.01), and the amount of work done locally vs. remotely on Toolforge (p = 0.04). This may suggest that the user experience for Toolforge and Cloud VPS favors those with strong technical skills, connections to Wikimedia Foundation staff, or both.
Most users of Toolforge have been using the service for years. Of the 132 respondents who stated they used Toolforge, 20.5% reported using Toolforge for one year, 28.0% reported using Toolforge for 2-3 years, and 51.5% reported using Toolforge for four or more years.
By number of tools developed (i.e. created), 5.3% of tool developers developed zero tools, 22.1% have developed one tool, 35.1% have developed 2-3 tools, and 37.4% have developed four or more tools.
By number of tools maintained (i.e. not created, but worked on), 11.5% of tool developers maintain zero tools, 31.3% maintain one tool, 26.0% maintain two tools, and 31.3% maintain three or more tools.
By number of hours/week spent developing/maintaining tools, 26.0% spend zero hours per week, 29.8% spend one hour per week, 34.4% spend between two and eight hours per week, and 10.0% spend nine or more hours per week.
Toolforge is a successor to the Toolserver platform run by Wikimedia Deutschland from 2005 to 2014. Only 33.3% of respondents reported being users of Toolserver, which is down from last year's 40.3%.
Motivation
Wikimedia Cloud Services is just one of many providers of cloud computing services. We asked respondents to describe why they chose Wikimedia Cloud Services over other options. The most significant factor is access to Wikimedia-specific resources, including the wiki replicas, with a plurality of 30.7% using Wikimedia Cloud Services for this reason. The next two biggest factors were cost at 22.6% and "philosophical or ideological reasons" at 19.7%. Ease of use, privacy, and security were also considerations for some.
Respondent-submitted answers to this question vary. The most popular text response revolves around the idea that Wikimedia Cloud Services is an extension of the Wikimedia projects, making it a natural destination for Wikimedia-related code. One respondent wrote "my code belongs to the Wikimedia universe." Other comments focused on Cloud Services' collaborative environment, the use of only free software, good ping to Wikimedia servers, software testing, discoverability of tools developed, as well as simply working for the same organization.
Support and satisfaction
In general, survey respondents reported having little contact with the Cloud Services team; 86.5% reported contacting the Cloud Services team once per month or less. It is unclear whether this is due to support not being needed or an inability to find support. Opinion is mixed as to whether people feel supported when they contact the Cloud Services team; while 55.2% strongly agreed or agreed with the statement that they "feel [they are] supported by the Cloud Services team," 39.2% reported neither agreeing nor disagreeing with that statement. (5.5% disagreed or strongly disagreed.) The large proportion of people who neither agreed nor disagreed could suggest a lack of familiarity with Cloud Services support options. There is a similarly mixed opinion regarding how easy it is to run code, with 65.6% agreeing or strongly agreeing with such a statement, 23.3% neither agreeing nor disagreeing, and 11.1% disagreeing or strongly disagreeing. Only 36.8% agreed (or strongly agreed) that information from the Cloud/Cloud Announce mailing lists is useful.
Survey respondents were broadly displeased with the state of Cloud Services documentation, with only 36.2% agreeing (or strongly agreeing) that they find Cloud Services documentation easy to find, 31.3% agreeing that it is comprehensive, and 36.8% agreeing that it is clear. Indeed, the subject of documentation came up frequently in comments, with complaints noting the documentation is geared toward advanced users, poorly organized, out of date, and not easy to find. One comment noted how documentation was spread between Wikitech, MediaWiki.org, and Meta, while another comment noted that it is difficult to distinguish between Cloud Services documentation and documentation intended for the production cluster. Documentation is a known issue and continues to be an area of focus for the Technical Engagement team.
There is broad agreement that Wikimedia Cloud Services has high uptime, with 89.0% agreeing or strongly agreeing with such a statement.
Four comments were submitted concerning access to support, including time zone issues that hinder online collaboration, complaints about Phabricator tasks taking too long to resolve, and one person feeling they were on their own. Miscellaneous comments concerned the ease of use of Cloud Services and issues with software versions.
Our platforms
- Most people have no opinion of Toolserver as it relates to Toolforge (most were not Toolserver users)
- Being a Toolserver user (or otherwise) was associated with opinion on tech support on WMCS vs. Toolserver (probably because people who didn't use Toolserver said they didn't have an opinion)
- Toolforge
- Programming languages used: predominantly Python and PHP
- Other languages submitted via write-in: Rust, Bash, Make, C++, PGSQL, Haskell, Awk
- Majority of Toolforge dev work is done locally on user's own machine
- Source control: majority use source control of some kind, usually Git.
- Number of years using Toolforge was associated with opinion on tech support on WMCS vs. Toolserver, as well as with whether respondents found useful information on the mailing lists
- We asked Toolforge users if we could do one thing in the next year to improve Toolforge, what would it be. Responses were clustered as follows:
- Software support (10 responses)
- Simpler deployment processes
- Update software versions
- Node.js, Java 8
- Support for GNU Screen
- Ease of use (7 responses)
- Different people have suggested UIs for different facets of running/managing tools
- or for git repos
- Improve documentation (7)
- Including documentation for bootstrapping projects and other how-to tutorials
- I/O frustration (4)
- NFS is very slow. People notice
- Others: better deployment workflows, concerns about there not being enough resources, bring-your-own-container requests, improvements to Grid Engine, better monitoring support, access to help, availability, backups for projects, better build processes, branding, cron for human users (and not just tools), joins between user tables and wiki replicas, better metrics, remote access to wiki replicas, deletion of tool accounts, a tool bootstrapping kit, tool discoverability
- Cloud VPS:
- Use of Cloud VPS: leading options are running tools, testing software, running MediaWiki instances
- Some people use NFS for accessing same files across different servers. Not a major use case; 20% didn't even know.
- Should probably rephrase this question
- We asked Cloud VPS users if we could do one thing in the next year to improve Cloud VPS, what would it be. Responses were clustered as follows:
- Top complaint: NFS speed (4 responses)
- Access to wiki content (3 responses) (referring largely to things like the text of articles, not available in the replicas)
- Documentation (3 responses)
- Easier ways to install/provision MediaWiki
- Puppet should be easier to use (3)
- Other requests: improvements to Horizon, monitoring support, object storage, backups, git-based workflows, making it easier to understand DB resource limits, and more flexible VM resource allocation
Tool developers
- Lots of tool types, but predominantly web apps and bots
- Text responses included: developer tooling, data processing tools, maps-related tools.
- Most common backing service: MySQL
- Text responses included: filesystem-based storage, AWS Lambda, Cron, Druid, ElasticSearch, Hadoop, Kafka, Kubernetes, LevelDB, Memcached
- In survey responses, being a tool developer (or otherwise) was associated with thinking the services have high uptime and thinking it is easy to run code on Wikimedia Cloud Services
- Number of hours per week spent on tools was associated with how often the WMCS team is contacted, how useful they think the mailing list is, whether the docs are clear, and amount of work done locally vs. remotely
- Higher number of hours per week spent on tool development suggests higher level of experience and therefore more comfort with WMCS offerings
- Number of tools maintained associated with whether they considered information from mailing lists to be useful and the amount of work done locally
MediaWiki development
- How MW is used:
- Usually running master branch (5 responses)
- Vagrant is also popular (5 responses)
- Stable branch (4 responses)
- Others: Docker, Ansible, WMF branch