Wikidata:Requests for comment/semi-protection to prevent vandalism on most used Items
An editor has requested the community to provide input on "semi-protection to prevent vandalism on most used Items" via the Requests for comment (RFC) process. This is the discussion page regarding the issue.
If you have an opinion regarding this issue, feel free to comment below. Thank you!
THIS RFC IS CLOSED. Please do NOT vote nor add comments.
The following discussion is closed. Please do not modify it. Subsequent comments should be made in a new section. A summary of the conclusions reached follows.
- consensus to semi-protect all items used by 500 Wikimedia pages or more. Semi-protection that is only justified by usage should be lifted once the item falls below the threshold. The criteria of Wikidata administrators about usage or visibility on Wikimedia projects may also be enough to semi-protect other items. --Pasleim (talk) 12:31, 24 June 2019 (UTC)
Data from a few Wikidata Items are shown on many Wikimedia pages (for instance, Wikipedia articles), so vandalism on these Items can cause much damage even when vandalism is quickly detected and reverted. Edits like the one that changed Iraq's label to "Iran", the one that changed Russia's label to "mainkra" or the one that vandalized the label for pneumonia made thousands of pages show wrong countries and absurd causes of death for hours, since problematic versions remain on Wikimedia pages after their reversion on Wikidata. The impact of this vandalism has been expressed on Wikipedias and on Wikidata and its prevention has been requested on Wikimedia's Community Wishlist Surveys in the current and past years. This vandalism usually extends beyond Wikimedia projects, affecting popular tools and services like Google and Siri and being echoed by the press.
This request for comments aims to compile the community's position on using semi-protection as a way of preventing vandalism on the most used Items. This request for comments does not aim to compile comments about other ways of combating vandalism, which may coexist and should be discussed on other pages, nor to discuss the implementation details (how) of the various preferences that can be reflected (what). Decisions should be based on the current resources of the community and the development team, not on those they might or might not have in the future.
Please read the questions carefully and sign under the options that come closest to the ones you consider best. Thanks for your feedback! --abián 14:23, 15 February 2019 (UTC)
Questions
1. Should some Wikidata Items be semi-protected according to their usage on Wikimedia pages?
Help: Advantages and disadvantages of semi-protecting Items according to their usage on Wikimedia pages
Pros | Cons |
---|---|
Semi-protection is the most effective mechanism available so far in Wikidata to prevent vandalism | Cannot become the general rule; openness is Wikimedia's key value |
- Stefan Heindorf, Martin Potthast, Benno Stein, and Gregor Engels. 2015. Towards Vandalism Detection in Knowledge Bases: Corpus Construction and Analysis. In Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '15). ACM, New York, NY, USA, 831-834. DOI: https://doi.org/10.1145/2766462.2767804.
- phab:T210664#4907866
1.A. Yes, high usage on Wikimedia projects justifies semi-protecting one or more Items.
If you prefer this option, please add your signature (*~~~~) below and, if you wish, write a brief comment next to it.
- Tiberius1701 (talk) 15:19, 16 February 2019 (UTC)
- GPSLeo (talk) 17:50, 16 February 2019 (UTC)
- --Epìdosis 20:28, 17 February 2019 (UTC)
- Taking into account the benefits, it might be a good solution for the vandalism in Wikidata that affects other Wikimedia projects. In addition, the quantity/percentage of items would be very low, so I don't think it would damage any value or pillar of the Wikimedia Movement. Ivanhercaz (Talk) 15:02, 18 February 2019 (UTC)
- Vandalism on Wikidata is getting more and more frequent. MarioFinale (talk) 16:58, 18 February 2019 (UTC)
- It'd be a huge help in improving data quality imo. Nicereddy (talk) 21:03, 18 February 2019 (UTC)
- Snipre (talk) 06:39, 19 February 2019 (UTC)
- Taking into account that vandalism persists several hours on many places despite reversion locally here, this is a necessary measure. Data quality is quite as important as our openness. Ammarpad (talk) 07:06, 20 February 2019 (UTC)
- Ayack (talk) 12:39, 20 February 2019 (UTC)
- Vandalism (especially the label vandalism) has a very bad influence on Wikidata's reputation within the Wikimedia projects. The benefit of anonymous IP addresses editing in these popular items is negligible, while their vandalism has a direct negative impact. Jklamo (talk) 18:15, 21 February 2019 (UTC)
- B25es (talk) 19:18, 24 February 2019 (UTC)
- Kristbaum (talk) 13:24, 28 February 2019 (UTC)
- This is comparable to the protection of the main pages in projects: we assume that it is a very visible content with a very high probability of being vandalized, and we do not wait to prove it to protect. In the future it may be possible to fine-tune the measures for a better openness/protection balance, but at the moment this is reasonable. -jem- (talk) 21:39, 6 March 2019 (UTC)
- --Trade (talk) 00:00, 26 March 2019 (UTC)
- Endorse with no circumstances --Liuxinyu970226 (talk) 00:51, 2 May 2019 (UTC)
- This absolutely sucks and I think we should have gone to pending changes years ago, but supporting per [1]. --Rschen7754 01:25, 19 June 2019 (UTC)
1.B. No, high usage on Wikimedia projects does not justify semi-protecting any Item.
If you prefer this option, please add your signature (*~~~~) below and, if you wish, write a brief comment next to it.
- I think this needs to wait at least for phab:T143486, and preferably instead of semi-protection, to protect more granularly (allow sitelink changes) with phab:T189412. Or we need some other solution that allows experienced users on other wikimedia sites to automatically be able to bypass semiprotection in Wikidata. ArthurPSmith (talk) 17:28, 16 February 2019 (UTC)
- MisterSynergy (talk) 17:28, 17 February 2019 (UTC); After thoughts about the matter in the past days, I do not think that there is a problem severe enough to restrict Wikidata’s openness. I am fine with (long-term, but definite) page protections for frequently vandalized pages, but this proposal seems way too much for me to be acceptable.
- Per MisterSynergy, and also on the principle that there is no simple metric that is an accurate predictor of the variable "needs protection".--Jasper Deng (talk) 23:05, 17 February 2019 (UTC)
- Frequently-used items should be more closely monitored, not locked down. Protected items will be lower quality, as they prevent contributions. --Yair rand (talk) 01:55, 18 February 2019 (UTC)
- Mar del Sur (talk) 03:04, 18 February 2019 (UTC) No, the anonymous contribution should not be prohibited "preventively". The only case in which the pages should be protected or semi-protected is that of repeated and persistent vandalism, really when it has been proven that it can not be controlled in any other way.
- I'm on the fence, but leaning towards not supporting for now. I would support this, but the sitelink issue could be problematic, as most Wikipedia editors are not autoconfirmed on Wikidata. If particular items are being vandalized a lot then they can usually be protected on a case-by-case basis; and a lot of potential vandalism targets can be watchlisted in large batches by adding them from Wikipedias' topviews reports (I would think vandalism is more closely correlated to Wikipedia page views than to item transclusions). Protecting important items would also negatively affect some edit-a-thons, particularly those focused on adding labels and descriptions in uncommon languages. I don't think I've ever come across more than one such edit-a-thon, though. Jc86035 (talk) 13:57, 21 February 2019 (UTC)
- I think high usage can be a valid reason to justify protection, but the remainder of this proposal seems to take discretion away from administrators altogether, and instead propose that any item with more than X transclusions must be protected and any item below the threshold may never be protected. I don't think it's a good idea to remove that flexibility. Deryck Chan (talk) 14:46, 8 April 2019 (UTC)
2. If so (1.A), should these Items be semi-protected according to a threshold based on the number of uses or according to the various criteria of Wikidata administrators?
2.A. Every Item above a certain number of uses should be semi-protected, while the criteria of Wikidata administrators about usage or visibility on Wikimedia projects may also be enough to semi-protect other Items.
If you prefer this option, please add your signature (*~~~~) below and, if you wish, write a brief comment next to it.
- GPSLeo (talk) 17:53, 16 February 2019 (UTC)
- --Epìdosis 20:29, 17 February 2019 (UTC)
- Ivanhercaz (Talk) 15:04, 18 February 2019 (UTC)
- MarioFinale (talk) 17:02, 18 February 2019 (UTC)
- Snipre (talk) 06:40, 19 February 2019 (UTC)
- Ammarpad (talk) 07:02, 20 February 2019 (UTC)
- Ayack (talk) 12:40, 20 February 2019 (UTC)
- Jklamo (talk) 18:17, 21 February 2019 (UTC)
- B25es (talk) 16:11, 24 February 2019 (UTC)
- Kristbaum (talk) 13:24, 28 February 2019 (UTC)
- -jem- (talk) 21:39, 6 March 2019 (UTC)
- Just nihil obstat (Q1994297) this --Liuxinyu970226 (talk) 00:52, 2 May 2019 (UTC)
2.B. Every Item above a certain number of uses should be semi-protected and the criteria of Wikidata administrators about usage or visibility on Wikimedia projects can not be a reason to semi-protect any other Items.
If you prefer this option, please add your signature (*~~~~) below and, if you wish, write a brief comment next to it.
2.C. No threshold based on the number of uses should be imposed, the various criteria of Wikidata administrators about usage or visibility on Wikimedia projects should be enough to determine what Items should be semi-protected for this reason.
If you prefer this option, please add your signature (*~~~~) below and, if you wish, write a brief comment next to it.
3. If you think that all Items above a certain number of uses should be semi-protected (2.A or 2.B), in terms of which metric should this threshold be set?
3.A. Every Item used on more than T pages on Wikimedia projects.
Since Wikidata usage grows over time, the number of Items semi-protected for this reason should also grow over time.
If you prefer this option, please add your signature (*~~~~) below and, if you wish, write a brief comment next to it.
- GPSLeo (talk) 17:54, 16 February 2019 (UTC)
- --Epìdosis 20:31, 17 February 2019 (UTC)
- Ivanhercaz (Talk) 15:05, 18 February 2019 (UTC)
- MarioFinale (talk) 19:39, 18 February 2019 (UTC)
- Nicereddy (talk) 21:04, 18 February 2019 (UTC)
- Snipre (talk) 06:41, 19 February 2019 (UTC)
- Ayack (talk) 12:41, 20 February 2019 (UTC)
- Jklamo (talk) 18:18, 21 February 2019 (UTC)
- Kristbaum (talk) 13:25, 28 February 2019 (UTC)
- -jem- (talk) 21:39, 6 March 2019 (UTC)
- SixTwoEight (talk) 18:44, 15 March 2019 (UTC)
- nihil obstat (Q1994297) --Liuxinyu970226 (talk) 00:53, 2 May 2019 (UTC)
3.B. T% of the Items used on the most pages on Wikimedia projects.
Since the total number of Items grows over time, the number of Items semi-protected for this reason should also grow over time.
If you prefer this option, please add your signature (*~~~~) below and, if you wish, write a brief comment next to it.
3.C. The T Items used on the most pages on Wikimedia projects.
Although both Wikidata usage and the total number of Items grow over time, the number of Items semi-protected for this reason should not change.
If you prefer this option, please add your signature (*~~~~) below and, if you wish, write a brief comment next to it.
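The three metrics in question 3 select different sets of Items from the same usage data. The following is a minimal sketch of that difference, assuming a hypothetical in-memory mapping from Item identifier to number of using pages; the function names and the sample data are illustrative and are not part of any Wikidata tool. Note that 3.A is worded as "more than T pages", while the options in question 4 say "T pages or more"; the sketch follows the 3.A wording.

```python
from typing import Dict, Set


def above_absolute(usage: Dict[str, int], t: int) -> Set[str]:
    """3.A: every Item used on more than t pages."""
    return {item for item, pages in usage.items() if pages > t}


def top_k(usage: Dict[str, int], k: int) -> Set[str]:
    """3.C: the k Items used on the most pages."""
    ranked = sorted(usage, key=usage.get, reverse=True)
    return set(ranked[:k])


def top_percent(usage: Dict[str, int], pct: float) -> Set[str]:
    """3.B: the pct% of Items used on the most pages (rounded down)."""
    k = int(len(usage) * pct / 100)
    return top_k(usage, k)


# Hypothetical usage counts, for illustration only.
sample = {"item-a": 1000, "item-b": 600, "item-c": 400, "item-d": 10}

print(above_absolute(sample, 500))
print(top_percent(sample, 50))
print(top_k(sample, 1))
```

With 3.A the selected set grows as usage grows, with 3.B it grows as the total number of Items grows, and with 3.C its size stays fixed, which matches the notes under each option.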
4. If you think that all Items above a certain number of uses should be semi-protected (2.A or 2.B), what should this threshold be?
4.A. Each Item is used by 500,000 Wikimedia pages or more (if 3.A) / 0.0001% most used Items (if 3.B) / 65 most used Items (if 3.C).
This threshold would currently mean unprotecting several semi-protected Items.
If you prefer this option, please add your signature (*~~~~) below and, if you wish, write a brief comment next to it.
4.B. Each Item is used by 100,000 Wikimedia pages or more (if 3.A) / 0.0005% most used Items (if 3.B) / 260 most used Items (if 3.C).
This threshold would currently mean unprotecting several semi-protected Items.
If you prefer this option, please add your signature (*~~~~) below and, if you wish, write a brief comment next to it.
4.C. Each Item is used by 50,000 Wikimedia pages or more (if 3.A) / 0.0009% most used Items (if 3.B) / 500 most used Items (if 3.C).
This threshold would currently mean unprotecting several semi-protected Items.
If you prefer this option, please add your signature (*~~~~) below and, if you wish, write a brief comment next to it.
4.D. Each Item is used by 10,000 Wikimedia pages or more (if 3.A) / 0.0025% most used Items (if 3.B) / 1400 most used Items (if 3.C).
This threshold would currently mean unprotecting several semi-protected Items.
If you prefer this option, please add your signature (*~~~~) below and, if you wish, write a brief comment next to it.
4.E. Each Item is used by 2500 Wikimedia pages or more (if 3.A) / 0.007% most used Items (if 3.B) / 3600 most used Items (if 3.C).
This threshold is close to the current state.
If you prefer this option, please add your signature (*~~~~) below and, if you wish, write a brief comment next to it.
4.F. Each Item is used by 1000 Wikimedia pages or more (if 3.A) / 0.014% most used Items (if 3.B) / 7450 most used Items (if 3.C).
This threshold would currently mean semi-protecting several Items.
If you prefer this option, please add your signature (*~~~~) below and, if you wish, write a brief comment next to it.
4.G. Each Item is used by 500 Wikimedia pages or more (if 3.A) / 0.029% most used Items (if 3.B) / 15,800 most used Items (if 3.C).
This threshold would currently mean semi-protecting several Items.
If you prefer this option, please add your signature (*~~~~) below and, if you wish, write a brief comment next to it.
- It might seem many items, but it is only 0.029%. Ivanhercaz (Talk) 15:46, 18 February 2019 (UTC)
- My opinion is essentially the same as Ivan's. Nicereddy (talk) 21:05, 18 February 2019 (UTC)
- MarioFinale (talk) 17:18, 19 February 2019 (UTC)
- Kristbaum (talk) 13:26, 28 February 2019 (UTC)
- -jem- (talk) 21:39, 6 March 2019 (UTC)
- --Liuxinyu970226 (talk) 00:54, 2 May 2019 (UTC)
4.H. Each Item is used by 300 Wikimedia pages or more (if 3.A) / 0.06% most used Items (if 3.B) / 33,200 most used Items (if 3.C).
This threshold would currently mean semi-protecting several Items.
If you prefer this option, please add your signature (*~~~~) below and, if you wish, write a brief comment next to it.
4.I. Each Item is used by 200 Wikimedia pages or more (if 3.A) / 0.10% most used Items (if 3.B) / 60,000 most used Items (if 3.C).
This threshold would currently mean semi-protecting several Items.
If you prefer this option, please add your signature (*~~~~) below and, if you wish, write a brief comment next to it.
4.J. Each Item is used by 150 Wikimedia pages or more (if 3.A) / 0.15% most used Items (if 3.B) / 80,000 most used Items (if 3.C).
This threshold would currently mean semi-protecting several Items.
If you prefer this option, please add your signature (*~~~~) below and, if you wish, write a brief comment next to it.
4.K. Each Item is used by 100 Wikimedia pages or more (if 3.A) / 0.21% most used Items (if 3.B) / 115,000 most used Items (if 3.C).
This threshold would currently mean semi-protecting several Items.
If you prefer this option, please add your signature (*~~~~) below and, if you wish, write a brief comment next to it.
4.L. Each Item is used by 50 Wikimedia pages or more (if 3.A) / 0.46% most used Items (if 3.B) / 250,000 most used Items (if 3.C).
This threshold would currently mean semi-protecting several Items.
If you prefer this option, please add your signature (*~~~~) below and, if you wish, write a brief comment next to it.
4.M. Each Item is used by 30 Wikimedia pages or more (if 3.A) / 0.84% most used Items (if 3.B) / 460,000 most used Items (if 3.C).
This threshold would currently mean semi-protecting several Items.
If you prefer this option, please add your signature (*~~~~) below and, if you wish, write a brief comment next to it.
4.N. Each Item is used by 15 Wikimedia pages or more (if 3.A) / 1.74% most used Items (if 3.B) / 950,000 most used Items (if 3.C).
This threshold would currently mean semi-protecting several Items.
If you prefer this option, please add your signature (*~~~~) below and, if you wish, write a brief comment next to it.
5. If you think that all Items above a certain number of uses should be semi-protected (2.A or 2.B), should a semi-protection that is only justified by usage be lifted once the Item falls below the threshold?
5.A. Yes, semi-protection should be lifted unless there are exceptional reasons to keep it.
If you prefer this option, please add your signature (*~~~~) below and, if you wish, write a brief comment next to it.
- GPSLeo (talk) 17:55, 16 February 2019 (UTC)
- --Epìdosis 20:32, 17 February 2019 (UTC)
- Ivanhercaz (Talk) 15:48, 18 February 2019 (UTC)
- MarioFinale (talk) 17:18, 19 February 2019 (UTC)
- Ammarpad (talk) 07:09, 20 February 2019 (UTC)
- Ayack (talk) 12:42, 20 February 2019 (UTC)
- A grace period of a month or two of semi-protection after an item falls below the threshold may be considered. Jklamo (talk) 18:23, 21 February 2019 (UTC)
- Kristbaum (talk) 13:26, 28 February 2019 (UTC)
- I remark that an item being used in fewer projects (that is, with its pages deleted) is going to be very rare. -jem- (talk) 21:39, 6 March 2019 (UTC)
- SixTwoEight (talk) 18:46, 15 March 2019 (UTC)
- No reason to oppose. --Liuxinyu970226 (talk) 00:54, 2 May 2019 (UTC)
5.B. No, semi-protection should remain unless there are exceptional reasons to lift it.
If you prefer this option, please add your signature (*~~~~) below and, if you wish, write a brief comment next to it.
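Option 5.A, combined with an absolute threshold as in 3.A and the 2.A allowance for administrator discretion, amounts to a simple rule. The following is a rough sketch of that rule under stated assumptions: the `Item` class, its field names and the function name are hypothetical, and the default of 500 is taken from option 4.G and the closing summary.

```python
from dataclasses import dataclass

# Assumed threshold: 500 using pages, as in option 4.G.
USAGE_THRESHOLD = 500


@dataclass
class Item:
    """Hypothetical record for one Item; not a real Wikidata data model."""
    qid: str
    page_usage: int                 # number of Wikimedia pages using the Item
    has_other_reason: bool = False  # e.g. administrator criteria, persistent vandalism


def should_be_semi_protected(item: Item, threshold: int = USAGE_THRESHOLD) -> bool:
    """Return True if the Item should be (or remain) semi-protected.

    Usage at or above the threshold is sufficient on its own. Below the
    threshold, protection is kept only when another reason applies (5.A).
    """
    if item.page_usage >= threshold:
        return True
    return item.has_other_reason


print(should_be_semi_protected(Item("item-a", 600)))
print(should_be_semi_protected(Item("item-b", 400)))
print(should_be_semi_protected(Item("item-c", 400, has_other_reason=True)))
```

Under 5.B the second case would instead keep its protection once granted, so the function would also need the Item's previous protection state as input.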
Discussion and other suggestions
If you have further comments and suggestions on the subject of this request for comments, semi-protection as a way of preventing vandalism on the most used Items, feel free to leave them below. Please don't forget your signature, ~~~~, and try not to repeat the comments you left above.
- I'd personally say that semi-protection should only occur if a page has a history of vandalism, or is this to protect pages being vandalised on these other Wikimedia websites which then spillover to Wikidata? -- Donald Trung/徵國單 (討論 🀄) (方孔錢 💴) 13:55, 16 February 2019 (UTC)
- No, that's not the main focus, although there's a Phabricator task for protection level propagation, phab:T205783. I would say 1.B is your option, you could add your comment next to your signature there. --abián 14:14, 16 February 2019 (UTC)
- I added a couple of related questions (e.g. should we change more efficient means to semi-protection) --- Jura 14:04, 17 February 2019 (UTC)
- Sorry, Jura1, but the structure of the RfC shouldn't be changed once the RfC has started and this page has been many days in draft status, open to receive any kinds of suggestions. I obviously have to revert the changes. --abián 15:49, 17 February 2019 (UTC)
- That's not really helpful. Anyways, we can do a part 2: Wikidata:Requests for comment/semi-protection to prevent vandalism on most used Items (part 2). --- Jura 16:49, 17 February 2019 (UTC)
- IMHO future developments should be aimed at a more granular protection (statements vs labels&descriptions vs sitelinks). Even protecting single statements (bot-imported from a credible source and sourced -> no need for changing that, except, maybe, its "ranks"). I do not have a strong opinion on this mass-protection yet. As an experiment (limited to a short set of items) it has some interest, but reaching previous community consensus would have been nice. Due to this "lacking-consensus" status, it doesn't matter to me whether these protections are mass-reverted or not, but at least they should have a temporal end (1-2 months, maybe). Some statistical evaluation of the results once that period is passed ...will be nice. strakhov (talk) 19:08, 17 February 2019 (UTC)
- I do think granular protection would be a far better solution. There are some claims that should almost never be changed, such as the atomic number of an element, but that should not lock down the whole item from editing.--Jasper Deng (talk) 03:13, 18 February 2019 (UTC)
- It'd be helpful to have a clear definition of what semi-protection entails in the RfC itself. Nicereddy (talk) 21:06, 18 February 2019 (UTC)
- @Nicereddy: Please read the introduction: "Semi-protection is the intermediate status of page protection, weaker than the full protection but stricter than the default unprotection. To change a semi-protected page on Wikidata you have to be a confirmed user, which means you have to own an account created at least 4 days ago with at least 50 edits or have a confirmed flag granted by an administrator". Snipre (talk) 13:14, 19 February 2019 (UTC)
- I agree this is a good idea. Lucywood (talk) 09:37, 24 February 2019 (UTC)
- In the future, these protections (and the rest) should be coordinated somehow with protections in the rest of the Wikimedia projects. We have to move towards increasing integration, and that aspect is one of many to consider. -jem- (talk) 21:39, 6 March 2019 (UTC)
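
The definition of semi-protection quoted by Snipre above (an account at least 4 days old with at least 50 edits, or a confirmed flag granted by an administrator) can be read as the check below. This is only an illustrative sketch of the quoted wording; the function and parameter names are hypothetical and do not correspond to any MediaWiki API.

```python
from datetime import datetime, timedelta

# Values taken from the definition quoted in the discussion.
MIN_ACCOUNT_AGE = timedelta(days=4)
MIN_EDIT_COUNT = 50


def can_edit_semi_protected(
    account_created: datetime,
    edit_count: int,
    has_confirmed_flag: bool,
    now: datetime,
) -> bool:
    """Return True if the user may change a semi-protected page."""
    old_enough = now - account_created >= MIN_ACCOUNT_AGE
    enough_edits = edit_count >= MIN_EDIT_COUNT
    # Either the automatic criteria are met, or an administrator granted the flag.
    return has_confirmed_flag or (old_enough and enough_edits)


# Hypothetical example: an account created 10 days ago with 60 edits.
now = datetime(2019, 2, 19, 12, 0)
print(can_edit_semi_protected(now - timedelta(days=10), 60, False, now))
```

This also illustrates the concern raised under 1.B: an experienced Wikipedia editor with few Wikidata edits would fail the edit-count condition and could not change sitelinks on a semi-protected Item.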