Rationale
As part of the ongoing work on T296847, there is a need to understand how many Gadgets and User scripts would be impacted by the policy. This data will inform further discussions, especially during the upcoming policy consultation.
Initial findings
Methodology
The data was collected with a mix of methods: Logstash queries, global-search.toolforge.org, and a script that builds a list of gadgets loading non-production resources. In addition, the exploration below draws on a raw list of reported CSP violations obtained from a Logstash query, covering reports from February to April 2023. The number of Gadgets and User scripts involved in those CSP violations was determined by (a) trimming the URLs to obtain the list of domains involved in CSP violations, and (b) finding the occurrences of those domains across the Gadgets and User namespaces of all Wikimedia projects using https://global-search.toolforge.org and/or mwgrep, discarding noise such as "eval" and "data" results.
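Step (a) can be illustrated with a minimal sketch. It assumes the Logstash export has already been reduced to a plain list of blocked URI strings. The names `NOISE`, `domain_of` and `count_domains` are hypothetical and are not part of the actual tooling used for this investigation.

```python
from collections import Counter
from urllib.parse import urlsplit

# Assumption: non-URL values that show up as blocked URIs and carry no domain.
NOISE = {"eval", "inline", "data", "blob", "about", "self"}


def domain_of(blocked_uri):
    """Return the lowercase host of a blocked URI, or None for noise entries."""
    value = blocked_uri.strip().lower()
    if not value or value in NOISE:
        return None
    try:
        parts = urlsplit(value)
    except ValueError:
        # Malformed URL (for example a broken IPv6 literal): treat as noise.
        return None
    if parts.scheme in ("data", "blob", "about"):
        return None
    # hostname is None when the value has no "//host" part; such entries are dropped.
    return parts.hostname


def count_domains(blocked_uris):
    """Count CSP violation reports per domain, skipping noise entries."""
    counts = Counter()
    for uri in blocked_uris:
        host = domain_of(uri)
        if host:
            counts[host] += 1
    return counts
```

Calling `count_domains(uris).most_common()` on such a list would give a per-domain ranking. Each resulting domain can then be searched for with global-search.toolforge.org or mwgrep, as described in step (b).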
List of gadgets loading third-party resources across all projects
TBD
Top domains violating CSP restrictions
When grouped by origin domain, the URLs that most often violate CSP rules seem to originate from around 50 domains.
Observations on Gadgets loading third-party resources
Generally speaking, the domains of translation tools and WMCS-hosted applications seem to be among those most often involved in CSP violations. Around 90 gadgets appear to load resources from Wikimedia Cloud Services (WMCS), while around 80 use resources from non-WMCS origins, including Google Translate and Yandex APIs.
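The split between WMCS and non-WMCS origins could be reproduced roughly as follows. This is a sketch only: the suffix list (`toolforge.org`, `wmflabs.org`, `wmcloud.org`) is an assumption about which domains count as WMCS-hosted and may not match the list used for the figures above, and the function names are hypothetical.

```python
# Assumption: domain suffixes treated as Wikimedia Cloud Services hosting.
WMCS_SUFFIXES = ("toolforge.org", "wmflabs.org", "wmcloud.org")


def is_wmcs(host):
    """Return True if host is one of the assumed WMCS domains or a subdomain of one."""
    host = host.lower().rstrip(".")
    return any(host == s or host.endswith("." + s) for s in WMCS_SUFFIXES)


def split_by_hosting(hosts):
    """Split a collection of hostnames into (WMCS, non-WMCS) sorted lists."""
    unique = set(hosts)
    wmcs = sorted(h for h in unique if is_wmcs(h))
    other = sorted(h for h in unique if not is_wmcs(h))
    return wmcs, other
```

Matching on a leading dot keeps unrelated domains that merely end in the same characters from being counted as WMCS.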
Observations on User scripts loading third-party resources (in progress)
Most User scripts related to CSP violations load non-WMCS resources, including Facebook Connect and Google Analytics. It is also worth noting that Google Fonts is among the most frequently loaded external resources.