# Papers I still want to read
Check:
* Lovink, Tkacz - Wikipedia Reader
* Mayo Fuster Morell - Wikimedia Foundation and Governance of Wikipedia's infrastructure
* Shun-ling Chen - The Wikimedia Foundation and the Self-governing Wikipedia Community: A Dynamic Relationship Under Constant Negotiation
(possibly for the conclusion)
* Winner - Do artifacts have politics
* Gillespie - Politics of Platforms
* Harassment Survey Results Report
For algorithmic governance:
* Musiani - Governance by algorithms
For fun:
* Cathedral & Bazaar
* Lam et al - WP:Clubhouse
* Litman - The exclusive right to read - copyright commentary
* Lovink, Tkacz - Wikipedia Reader
* Liang - Brief History of the Internet, 15th - 18th century
* Wu - When code isn't law
(Where Wizards Stay Up Late)
(Hands - Introduction: politics, power and platformativity)
# Next steps
* make something out of this comment "Reset filter, it hit the 5% limit somehow. -- Shirik 14 May 2010" https://en.wikipedia.org/wiki/Special:AbuseFilter/history/79/diff/prev/5529 : what is the 5% limit? What does it mean to reset a filter?
* check, via the exposed history endpoint, which filters were introduced immediately (in the first couple of hours/days): https://en.wikipedia.org/wiki/Special:AbuseFilter/history?user=&filter= (see also the API sketch at the end of this section)
* are there further examples of such collaborations: consider scripting something that parses the bot descriptions from https://en.wikipedia.org/wiki/Category:All_Wikipedia_bots and looks for "abuse" and "filter" (a rough sketch is at the end of this section)
* consider adding permalinks with exact revision ID as sources!
* https://en.wikipedia.org/wiki/Category:Wikipedia_bot_operators
* an idea for the presentation/written text: begin and end every part (section/paragraph) with a question: what question do I want to answer here? what question is still open?
* How many of the edit filter managers also run bots? How do they decide in which case to implement a bot and in which a filter?
* Why are there mechanisms triggered before an edit gets published (such as edit filters), and others triggered afterwards (such as bots)? Is there a qualitative difference?
* do bots also check the entire article text and not only single edits? as a clever person with malicious intentions I could split my malicious stuff into several edits to make it more difficult to discover -- unclear; my feeling is that they are edit-based; confirmed by C.
* how stable is the edit filter managers group? how often are new editors accepted? (who/how nominates them? maybe there aren't very many accepted, but then again, if only 2 apply and both are granted the right, can you then claim it's exclusive?) -- I think it's somewhat stable; in the last 3 months nothing has changed. For instance, mid-to-end 2017 there was somewhat high traffic of people requesting edit-filter-helper permissions, since that right was newly introduced around that time (before that you could only get the full edit filter manager right or nothing at all); it also seems that since then the practice has been established that people request edit filter helper first and only then are perhaps promoted to edit filter manager
* I want to help people to do their work better using a technical system (e.g. the edit filters). How can I do this?
* The edit filter system can be embedded in the vandalism prevention frame. Are there other contexts/frames for which it is relevant?
* Read these pages
https://lists.wikimedia.org/mailman/listinfo
https://en.wikipedia.org/wiki/Wikipedia:Edit_warring
https://en.wikipedia.org/wiki/Wikipedia:Blocking_policy#Evasion_of_blocks
https://en.wikipedia.org/wiki/Wikipedia:Blocking_IP_addresses
https://meta.wikimedia.org/wiki/Vandalbot
https://en.wikipedia.org/wiki/Wikipedia:Most_vandalized_pages
https://en.wikipedia.org/wiki/Wikipedia:The_motivation_of_a_vandal
https://en.wikipedia.org/wiki/Wikipedia:Flagged_revisions
https://en.wikipedia.org/wiki/User:Emijrp/Anti-vandalism_bot_census
https://en.wikipedia.org/wiki/Wikipedia:Counter-Vandalism_Unit/Vandalism_studies/Study1
https://en.wikipedia.org/wiki/Wikipedia:Counter-Vandalism_Unit/Vandalism_studies/Study2
https://en.wikipedia.org/wiki/Wikipedia:Counter-Vandalism_Unit/Vandalism_studies/Obama_article_study
https://en.wikipedia.org/wiki/Wikipedia:TW
https://en.wikipedia.org/wiki/Wikipedia:Dispute_resolution
https://en.wikipedia.org/wiki/Wikipedia:Oversight
https://en.wikipedia.org/wiki/Wikipedia:Revision_deletion
https://en.wikipedia.org/wiki/Wikipedia:Linking_to_external_harassment
https://en.wikipedia.org/wiki/Wikipedia:Personal_security_practices
https://en.wikipedia.org/wiki/Wikipedia:On_privacy,_confidentiality_and_discretion
https://en.wikipedia.org/wiki/Wikipedia:How_to_not_get_outed_on_Wikipedia
https://en.wikipedia.org/wiki/Wikipedia:No_personal_attacks
https://en.wikipedia.org/wiki/Wikipedia:Newbies_aren%27t_always_clueless
https://en.wikipedia.org/wiki/Wikipedia:On_assuming_good_faith
* look at the AbuseFilter extension code: how is a filter trigger logged?
https://github.com/wikimedia/mediawiki-extensions-AbuseFilter/blob/master/includes/AbuseFilter.php
* understand how the stats are generated
* research filter development over time
* plot the number of filters over time (maybe grouped by week instead of by year)
* get a feeling for the actions the filters triggered over time
* ping aaron/amir for access to a backend DB to look at the filters; an explanation of how this helps the community is important
* questions from EN-state-of-the-art
"Non-admins in good standing who wish to review a proposed but hidden filter may message the mailing list for details."
// what is "good standing"?
// what are the arguments for hiding a filter? --> particularly obnoxious vandals can see how their edits are being filtered and circumvent them; (no written quote yet)
// are users still informed if their edit triggers a hidden filter?
Exemptions for "urgent situation" -- what/how are these defined?
Discussions may happen post factum here, and a filter may be applied before it has been thoroughly tested; in this case the corresponding editor is responsible for checking the logs regularly and making sure the filter acts as desired
"Because even the smallest mistake in editing a filter can disrupt the encyclopedia, only editors who have the required good judgment and technical proficiency are permitted to configure filters."
--> Who are these editors? Who decides they are qualified enough?
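
A minimal sketch for the API-related items above (checking which filters existed early on, getting per-filter stats): it assumes the AbuseFilter extension's `list=abusefilters` query module and the property/continuation names below behave as I remember them from the API docs, so the field names should be verified against a live response.

```python
# Sketch only: pull metadata for all (public) edit filters on enwiki via the
# list=abusefilters API module. Property and continuation names are assumptions
# from the API documentation and need checking against a real response.
import requests

API = "https://en.wikipedia.org/w/api.php"

def fetch_filters():
    """Yield one dict per filter: id, description, hit count, status, last edit."""
    params = {
        "action": "query",
        "list": "abusefilters",
        "abfprop": "id|description|status|hits|lasteditor|lastedittime",
        "abflimit": "500",
        "format": "json",
        "formatversion": "2",
    }
    session = requests.Session()
    while True:
        data = session.get(API, params=params).json()
        yield from data["query"]["abusefilters"]
        if "continue" not in data:
            break
        params.update(data["continue"])  # follow the API's continuation token

if __name__ == "__main__":
    filters = list(fetch_filters())
    enabled = [f for f in filters if f.get("enabled")]
    print(f"{len(filters)} filters in total, {len(enabled)} currently enabled")
```

Note that `lastedittime` is the last modification, not the creation date; for creation dates the filter history pages (or the `abuse_filter_history` table, see the query sketch under # Done) would be needed.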
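
And a rough sketch for the bot-description idea (scanning Category:All_Wikipedia_bots for mentions of "abuse" and "filter"): it assumes the category members are the bots' user pages and that the description lives in that page's wikitext; it fetches pages one by one, so it would need batching/throttling for a real run.

```python
# Sketch only: list members of Category:All Wikipedia bots and flag those whose
# user page mentions both "abuse" and "filter". One request per page -- fine
# for a prototype, but batch/throttle before running it for real.
import requests

API = "https://en.wikipedia.org/w/api.php"
session = requests.Session()

def category_members(category="Category:All Wikipedia bots"):
    params = {
        "action": "query", "list": "categorymembers", "cmtitle": category,
        "cmlimit": "500", "format": "json", "formatversion": "2",
    }
    while True:
        data = session.get(API, params=params).json()
        for member in data["query"]["categorymembers"]:
            yield member["title"]
        if "continue" not in data:
            break
        params.update(data["continue"])

def page_wikitext(title):
    params = {
        "action": "query", "prop": "revisions", "rvprop": "content",
        "rvslots": "main", "titles": title, "format": "json", "formatversion": "2",
    }
    page = session.get(API, params=params).json()["query"]["pages"][0]
    revisions = page.get("revisions")
    return revisions[0]["slots"]["main"]["content"] if revisions else ""

if __name__ == "__main__":
    for title in category_members():
        text = page_wikitext(title).lower()
        if "abuse" in text and "filter" in text:
            print(title)
```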
# Interesting pages
## Edit filters in different languages:
https://en.wikipedia.org/wiki/Wikipedia:Bots_are_annoying
https://de.wikipedia.org/wiki/Wikipedia:Bearbeitungsfilter
https://es.wikipedia.org/wiki/Wikipedia:Filtro_de_ediciones
https://ca.wikipedia.org/wiki/Viquip%C3%A8dia:Filtre_d%27edicions
https://ru.wikipedia.org/wiki/%D0%92%D0%B8%D0%BA%D0%B8%D0%BF%D0%B5%D0%B4%D0%B8%D1%8F:%D0%A4%D0%B8%D0%BB%D1%8C%D1%82%D1%80_%D0%BF%D1%80%D0%B0%D0%B2%D0%BE%D0%BA
(no Bulgarian)
https://de.wikipedia.org/wiki/Spezial:Missbrauchsfilter
## Others
https://en.wikipedia.org/wiki/Wikipedia:Vandalism
https://en.wikipedia.org/wiki/Wikipedia:Vandalism_types
https://en.wikipedia.org/wiki/Wikipedia:Edit_filter/Documentation
https://en.wikipedia.org/wiki/User:MusikBot/FilterMonitor/Recent_changes
https://en.wikipedia.org/w/index.php?title=Special:AbuseFilter&dir=prev
https://de.wikipedia.org/wiki/Hilfe:Bearbeitungsfilter
https://tools.wmflabs.org/ptwikis/Filters:dewiki
https://en.wikipedia.org/wiki/Wikipedia:Tags
https://en.wikipedia.org/wiki/Wikipedia:Usernames_for_administrator_attention (UAA)
https://en.wikipedia.org/wiki/Special:RecentChanges?hidebots=1&hidecategorization=1&hideWikibase=1&tagfilter=abusefilter-condition-limit&limit=50&days=7&urlversion=2
https://en.wikipedia.org/w/index.php?title=Wikipedia:Edit_filter&oldid=221158142 <--- 1st version ever of the Edit_filter page; created 23.06.2008
https://en.wikipedia.org/w/index.php?title=Wikipedia_talk:Abuse_filter&oldid=279022713#-modify_right_.28moved_from_AN.29
## In other languages
### Compare colors (actions) in the stats graphs!
https://tools.wmflabs.org/ptwikis/Filters:bgwiki <-- there are also global filters listed here! (what are those?)
https://tools.wmflabs.org/ptwikis/Filters:cawiki
https://tools.wmflabs.org/ptwikis/Filters:dewiki
https://tools.wmflabs.org/ptwikis/Filters:eswiki
## Software
## Questions
* not sure how relevant this is, but: how do the sighted versions ("gesichtete Versionen") work on the German Wikipedia?
# Questions to be asked to edit filter editors
* When did you join the edit filter group?
* How? (What was the process for joining?)
* Why?
* How active were you/what have you been doing since joining?
* What is an example of a typical case for which you will implement a filter?
* What is an example for a case you'd rather not implement a filter for but apply another process? Which process?
* Why does the edit filter mechanism exist? Aren't bots and ORES and semi-automatic tools such as Huggle or Twinkle enough for combating vandalism?
==========================================================
# Checked
## Edit filters in different languages:
https://en.wikipedia.org/wiki/Wikipedia:Edit_filter
## Others
https://en.wikipedia.org/wiki/Wikipedia:Edit_filter_noticeboard <-- announce new filters and put them up for discussion before approving them "for coordination and discussion of edit filter use and management."
https://en.wikipedia.org/wiki/Wikipedia:Edit_filter/Requested
https://de.wikipedia.org/wiki/Wikipedia:Bearbeitungsfilter/Antr%C3%A4ge
https://en.wikipedia.org/wiki/Wikipedia:Long-term_abuse
https://en.wikipedia.org/wiki/Special:AbuseLog (+DE/ES/CAT/BG)
https://en.wikipedia.org/wiki/Wikipedia:Edit_filter/False_positives
https://en.wikipedia.org/wiki/Special:AbuseFilter
https://en.wikipedia.org/wiki/Special:AbuseFilter/1
https://tools.wmflabs.org/ptwikis/Filters:enwiki:61
https://en.wikipedia.org/wiki/Wikipedia:Wikipedia_Signpost/2009-03-23/Abuse_Filter
https://en.wikipedia.org/wiki/Wikipedia:Edit_filter/Instructions
https://en.wikipedia.org/wiki/Wikipedia:No_original_research
https://en.wikipedia.org/wiki/Wikipedia_talk:Edit_filter/Archive_1
https://en.wikipedia.org/wiki/Category:Wikipedia_edit_filter
## Vandalism
https://en.wikipedia.org/wiki/Wikipedia:Vandalism
https://en.wikipedia.org/wiki/Wikipedia:Vandalism_types
https://en.wikipedia.org/wiki/Wikipedia:Administrator_intervention_against_vandalism
https://en.wikipedia.org/wiki/Wikipedia:Disruptive_editing
https://en.wikipedia.org/wiki/Wikipedia:Counter-Vandalism_Unit/Vandalism_studies
https://en.wikipedia.org/wiki/Wikipedia:Requests_for_page_protection
https://en.wikipedia.org/wiki/Wikipedia:Offensive_material
https://en.wikipedia.org/wiki/Wikipedia:Neutral_point_of_view
https://en.wikipedia.org/wiki/Wikipedia:Harassment
https://en.wikipedia.org/wiki/Wikipedia:Assume_good_faith
https://en.wikipedia.org/wiki/Wikipedia:STiki
https://en.wikipedia.org/wiki/Vandalism_on_Wikipedia
https://en.wikipedia.org/wiki/Category:Wikipedia_counter-vandalism_tools
https://en.wikipedia.org/wiki/Wikipedia:Counter-Vandalism_Unit
https://en.wikipedia.org/wiki/Wikipedia:Cleaning_up_vandalism
https://en.wikipedia.org/wiki/Wikipedia:Counter-Vandalism_Unit/Academy
https://en.wikipedia.org/wiki/Wikipedia:Long_term_abuse
https://en.wikipedia.org/wiki/Wikipedia:Sockpuppet_investigations
https://en.wikipedia.org/wiki/Wikipedia:Sock_puppetry
https://en.wikipedia.org/wiki/Wikipedia:Purpose
## Software
https://www.mediawiki.org/wiki/Extension:AbuseFilter/Actions
--> interestingly enough, this exists in all languages I'm interested in
https://www.mediawiki.org/wiki/Extension:AbuseFilter
https://www.mediawiki.org/wiki/Extension:AbuseFilter/Rules_format
https://phabricator.wikimedia.org/project/view/217/ <-- project tickets for the AbuseFilter extension
# Done
* Set up the CSCW LaTeX template
* add "af_deleted" column to filter list
* Look at filters: what different types of filters are there? how do we classify them?
* add a special tag for filters targeting spam bots? (!!! important: do research on the distinction/collaboration between bots and filters)
* consider all types of vandalism (https://en.wikipedia.org/wiki/Wikipedia:Vandalism#Types_of_vandalism) when refining the self assigned tags
(Abuse of tags; Account creation, malicious; Avoidant vandalism; Blanking, illegitimate; Copyrighted material, repeated uploading of; Edit summary vandalism; Format vandalism; Gaming the system; Hidden vandalism; Hoaxing vandalism; Image vandalism; Link vandalism; Page creation, illegitimate; Page lengthening; Page-move vandalism; Silly vandalism; Sneaky vandalism; Spam external linking; Stockbroking vandalism; talk page vandalism; Template vandalism; User and user talk page vandalism; Vandalbots;)
* consider also other forms of (unintentionally) disruptive behaviour: boldly editing; copyright violation; disruptive editing or stubbornness --> edit warring; edit summary omission; editing tests by experimenting users; harassment or personal attacks; Incorrect wiki markup and style; lack of understanding of the purpose of wikipedia; misinformation, accidental; NPOV contraventions (Neutral point of view); nonsense, accidental; Policy and guideline pages, good-faith changes to; Reversion or removal of unencyclopedic material, or of edits covered under the biographies of living persons policy; Deletion nominations;
-----
* classify in "vandalism"|"good_faith"|"biased_edits"|"misc" for now
* syntactic vs semantic vs ? (ALL CAPS is syntactic)
* are there ontologies?
* how is spam classified for example?
* add a README to github repo
// do the users notice the logging? or only "bigger" actions such as warnings/being blocked, etc.?
* look for db dumps (a hedged query sketch against the replicas is at the end of this section)
https://meta.wikimedia.org/wiki/Research:Quarry
https://meta.wikimedia.org/wiki/Toolserver
https://quarry.wmflabs.org/query/runs/all?from=7666&limit=50
https://upload.wikimedia.org/wikipedia/commons/9/94/MediaWiki_1.28.0_database_schema.svg
https://tools.wmflabs.org/
https://tools.wmflabs.org/admin/tools
https://www.mediawiki.org/wiki/API:Main_page
* create a developer account
Do something with this info:
Claudia: * A focus on the good-faith policies/guidelines is a historical development. After the huge surge in edits Wikipedia experienced starting in 2005, the community needed a means to handle these (and the proportional amount of vandalism). They opted for automation. Automated systems branded a lot of good-faith edits as vandalism, which drove newcomers away. A policy focus on good faith is part of the effort to fix this.
* We need a description of the technical workings of the edit filter system!
* How can we improve it from a computer scientist's/engineer's perspective?
* What task do the edit filters try to solve? Why does this task exist?/Why is it important?
* Think about: what's the computer science take on the field? How can we design a "better"/more efficient/more user-friendly system? A system that reflects particular values (cf. Code 2.0, Chapter 3, p. 34)?
* go over notes in the filter classification and think about interesting controversies, things that attract the attention
* what are useful categories
* GT is good for tackling controversial questions: e.g. are filters with a disallow action too severe an interference with the editing process, one with far too many negative consequences (e.g. driving away newcomers)?
* What can we study?
* Discussions on filter patterns? On filter repercussions?
* Whether filters work the desired way and help Wikipedia run more smoothly, or whether they are a lot of work to maintain and their usefulness is questionable?
* Question: Is it worth it to use a filter which has many side effects?
* What can we filter with a REGEX? And what not? Are regexes the suitable technology for the ends the community is trying to achieve?
* What other data sources can I explore?
* Interview with filter managers? with admins? with new editors?
* check filter rules for edits in the user/talk namespaces (may be an indication of filtering harassment)
* There was a section on what filters are suitable for; should we check the filters against this list?
* also add an "af_enabled" column to the filter list; it could be that a high hit count was caused by false positives, which will have led to the filter being disabled (TODO: that's a very interesting question actually; how do we know the high number of hits were actually legitimate problems the filter wanted to catch and not false positives?)
* https://ifex.org/international/2019/02/21/technology-block-internet/ <-- filters
* Geiger et al - Defense Mechanisms
* Halfaker et al - The rise and decline of an open collaboration system (possibly enough, don't have to read Suh et al. in detail)
* Urquhart - Bringing theory back to grounded theory
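
A sketch for the db-dump / "number of filters over time" items: count filter creations per month from the `abuse_filter_history` table on the Wiki Replicas. The host name, the `~/replica.my.cnf` credentials, the table/column names (`afh_filter`, `afh_timestamp`), and whether this table is exposed on the public replicas at all are assumptions to verify; the same SQL could also simply be pasted into Quarry.

```python
# Sketch only: approximate each filter's creation date as its earliest history
# entry and count new filters per month. Connection details and table/column
# names are assumptions (run from Toolforge, or paste the SQL into Quarry).
from collections import Counter

import pymysql

QUERY = """
SELECT afh_filter, MIN(afh_timestamp) AS created
FROM abuse_filter_history
GROUP BY afh_filter
"""

def month_of(ts):
    """MediaWiki timestamps look like 20090317123456; keep YYYYMM."""
    if isinstance(ts, (bytes, bytearray)):
        ts = ts.decode()
    return str(ts)[:6]

conn = pymysql.connect(
    host="enwiki.analytics.db.svc.wikimedia.cloud",  # assumed replica host
    database="enwiki_p",
    read_default_file="~/replica.my.cnf",            # Toolforge credentials
)
with conn.cursor() as cur:
    cur.execute(QUERY)
    rows = cur.fetchall()

per_month = Counter(month_of(created) for _, created in rows)
for month in sorted(per_month):
    print(month, per_month[month])
```

Plotting `per_month` (e.g. with matplotlib) would directly give the "filters over time" graph mentioned under # Next steps.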
# Feedback T
* are there comparable concerns to the gamification concerns around semi-automated tools for other mechanisms?
* highlight the difference: bots/semi-automated tools: similar: automatic detection of potential vandalism; different: a person must click (in the tools)
* filters: BEFORE an edit is published; everything else: AFTER
* filters: REGEX!
* mention the most important findings several times: intro, conclusion, and so on; otherwise they get lost because I can't see the forest for the trees
* do bots also check the entire article text and not only single edits? as a clever person with malicious intentions I could split my malicious stuff into several edits to make it more difficult to discover -- unclear; my feeling is that they are edit-based