
Moderator Tools/Automoderator/Multilingual testing

From mediawiki.org
Diagram demonstrating the Automoderator software decision process.

Automoderator is currently deployed on a number of Wikimedia projects using the Language-agnostic Revert Risk model, following a round of user testing which found that it was reliable enough to be used. One other finding of that research, confirmed by our ongoing analysis of Automoderator's behaviour, is that it doesn't currently handle a significant percentage of the patrolling workload. Because of this, we are investigating switching to the Multilingual Revert Risk model on the Wikipedias which it supports. Before we do that, we need more community input to test this model so that we can understand its strengths and weaknesses, and set appropriate revert thresholds.

How to test the Multilingual Revert Risk model

Screenshot of the spreadsheet, with example responses filled in.
  • If you have a Google account:
    1. Use the Google Sheet link below and make a copy of it
      • You can do this by clicking File > Make a Copy ... after opening the link.
    2. After your copy has loaded, click Share in the top corner, then grant access to swalton@wikimedia.org (any access level is fine; leave 'Notify' checked), so that we can aggregate your responses to collect data on Automoderator's accuracy.
      • Alternatively, you can change 'General access' to 'Anyone with the link' and share a link with us directly or on-wiki.
  • Alternatively, use the .ods file link to download the file to your computer.
    • After adding your decisions, please send the sheet back to us at swalton@wikimedia.org, so that we can aggregate your responses to collect data on Automoderator's accuracy.

After accessing the spreadsheet...

  1. Follow the instructions in the sheet to select a random dataset, review 30 edits, and then uncover what decisions Automoderator would make for each edit.
    • Feel free to explore the full data in the 'Edit data & scores' tab.
    • If you want to review another dataset, please make a new copy of the sheet to avoid conflicting data.
  2. Join the discussion on the talk page.

Alternatively, you can simply dive into the individual project tabs and start investigating the data directly.
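For readers who want to explore the data themselves, the comparison between reviewer decisions and the model can be sketched in a few lines of Python. This is only an illustrative sketch: the `ReviewedEdit` record, its field names, and the sample values are hypothetical and do not reflect the actual spreadsheet columns or how responses are aggregated.

```python
from dataclasses import dataclass


# Hypothetical record for one reviewed edit; the field names are
# illustrative and are not taken from the spreadsheet.
@dataclass
class ReviewedEdit:
    score: float                 # revert risk score from the model (0 to 1)
    reviewer_would_revert: bool  # the human reviewer's own decision


def agreement_at_threshold(edits, threshold):
    """Return (reverted_count, agreed_count) for edits scoring above threshold.

    An edit counts as reverted when its score exceeds the threshold, and as
    agreed when the reviewer also said they would revert it.
    """
    reverted = [e for e in edits if e.score > threshold]
    agreed = sum(1 for e in reverted if e.reviewer_would_revert)
    return len(reverted), agreed


# Made-up sample data, for illustration only.
sample = [
    ReviewedEdit(0.995, True),
    ReviewedEdit(0.985, True),
    ReviewedEdit(0.975, False),
    ReviewedEdit(0.40, False),
]

for threshold in (0.99, 0.97):
    reverted, agreed = agreement_at_threshold(sample, threshold)
    print(f"threshold {threshold}: {reverted} reverted, {agreed} agreed with reviewer")
```

With this sample, a lower threshold reverts more edits but also picks up one that the reviewer would have kept, which is the trade-off the testing is meant to measure.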


We welcome translations of this sheet. If you would like to submit a translation, please make a copy, translate the strings on the 'String translations' tab, and send it back to us at swalton@wikimedia.org.

If you want us to add data from another Wikipedia, please let us know and we would be happy to do so.

About Automoderator

Automoderator’s model is trained exclusively on Wikipedia’s main namespace pages, limiting its dataset to edits made to Wikipedia articles. Additionally, this model only supports 47 languages.

Further details on Automoderator's internal configuration and caution thresholds can be found on the original testing page.

The number of reverts expected at different caution levels can be seen below. These thresholds were selected following some brief internal testing, and are very likely to change following this testing process:

Caution levels and their score thresholds: Very cautious (>0.99), Cautious (>0.98), Somewhat cautious (>0.97), Low caution (>0.96), Not cautious (>0.95).

                                                       Average daily reverts by Automoderator
Project               Daily edits  Daily edit reverts   >0.99   >0.98   >0.97   >0.96   >0.95
English Wikipedia         140,000              14,600      84     273     468     660     884
French Wikipedia           23,200               1,400      20      43      62      81      97
German Wikipedia           23,000               1,670      35      69      96     126     163
Spanish Wikipedia          18,500               3,100     147     315     428     530     634
Russian Wikipedia          16,500               2,000      44     108     160     220     284
Japanese Wikipedia         14,500               1,000       3       7      11      16      20
Chinese Wikipedia          13,600                 890       2       3       7       8      10
Italian Wikipedia          13,400               1,600      13      37      58      79     102
Polish Wikipedia            5,900                 530      16      42      68      92     124
Hebrew Wikipedia            5,400                 710      17      35      50      64      79
Persian Wikipedia           5,200                 900      43     120     197     291     373
Korean Wikipedia            4,300                 430       2       6      11      16      22
Indonesian Wikipedia        3,900                 340       2       3       4       5       8
Turkish Wikipedia           3,800                 510       7      17      26      38      51
Arabic Wikipedia            3,600                 670      24      68     121     169     213
Romanian Wikipedia          1,300                 110       3       6       9      12      16
Croatian Wikipedia            500                  50       2       3       4       6       8
...                           ...                 ...     ...     ...     ...     ...     ...

This data can be viewed for other Wikimedia projects here.
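To make the thresholds concrete, the sketch below maps each caution level to its score threshold and works out what share of a wiki's daily reverts Automoderator would account for. The threshold values and the English Wikipedia figures come from the table above; the function names `would_revert` and `share_of_reverts` are hypothetical and are not part of Automoderator's actual configuration or code.

```python
# Thresholds as listed in the table above. An edit is reverted at a given
# caution level only if its score exceeds that level's threshold.
CAUTION_THRESHOLDS = {
    "Very cautious": 0.99,
    "Cautious": 0.98,
    "Somewhat cautious": 0.97,
    "Low caution": 0.96,
    "Not cautious": 0.95,
}


def would_revert(score, caution_level):
    """Hypothetical helper: True if the score exceeds the level's threshold."""
    return score > CAUTION_THRESHOLDS[caution_level]


def share_of_reverts(automoderator_reverts, daily_reverts):
    """Percentage of all daily reverts that Automoderator would perform."""
    return 100 * automoderator_reverts / daily_reverts


# An edit scored 0.975 is reverted at 'Somewhat cautious' but not at 'Cautious'.
print(would_revert(0.975, "Somewhat cautious"))  # True
print(would_revert(0.975, "Cautious"))           # False

# English Wikipedia row: 14,600 daily reverts, of which 84 at 'Very cautious'
# and 884 at 'Not cautious' would be made by Automoderator.
print(round(share_of_reverts(84, 14_600), 1))    # 0.6
print(round(share_of_reverts(884, 14_600), 1))   # 6.1
```

On these figures, Automoderator would account for roughly 0.6% to 6.1% of English Wikipedia's daily reverts, depending on the caution level chosen.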