Steward requests/Bot status/2021-07

From Meta, a Wikimedia project coordination wiki

Global bot status requests

Global bot status for EncyclopedistBot

Not ending before 21 July 2021 16:07 UTC

(your remarks)

The discussion was not created

Энциклопедист свободного контента (talk) 16:07, 7 July 2021 (UTC)

Not done Fast track close, does not meet requirements listed in the policy. --Martin Urbanec (talk) 19:53, 7 July 2021 (UTC)

Global bot status for InternetArchiveBot

Not ending before 15 July 2021 15:57 UTC
  • Hello, I am Cyberpower678, a sysop on English Wikipedia. I am the original developer of InternetArchiveBot. I am studying for my Masters Degree in computer science, and I currently work for the Internet Archive.
  • Simply put, IABot finds a dead link on Wikipedia and checks the Internet Archive to see if the Wayback Machine has a backup version from when that website was live. In particular, it looks for a version whose date is close to when the link was added to Wikipedia. If it finds a backup version, it adds it to the reference.
  • IABot is currently enabled on 65 wikis. We have sought individual approval for each wiki, which has been a time-consuming and repetitive process. While we've worked with over 60 wikis, there are 250 more that should also have live links for readers.
  • IABot has made almost 17 million edits so far. It's been covered in the BBC and Wired for our sincere attempt to make sure no link on Wikipedia sends a reader to a dead webpage (404 error).
  • I started InternetArchiveBot in 2015 as a volunteer to solve the problem of link rot. When the bot got too important to handle alone, I was recruited by the Wayback Machine team at the Internet Archive. My team has 4-8 people supporting IABot working on everything from documentation, to making it easier for developers to work with the codebase, to figuring out how to solve long-term engineering challenges.
  • Over the years we have added many customization options, including different configuration options. We have also built a management interface so that bot operations, including starting and disabling the bot, can be managed by regular users. We are continuing to work on making it easier to use and control the bot, and for users to interact with our development team.
  • Any link to any page can break at any time. This year alone InternetArchiveBot has made over two million edits to links; there are potentially millions of broken links across Wikimedia projects; and, there will be more broken links as time goes on and more websites are taken offline. This is a problem at a scale that needs to be addressed with automated and semi-automated tools.
  • To be clear, this bot approval request is for the task of fixing and enhancing references (combating link rot). It is not for adding links to books.
  • I appreciate the community working with us over the years to improve the bot and now I ask the community to empower us to fix more dead links on every language version of Wikipedia. Thank you!

Meta bot user: https://meta.wikimedia.org/wiki/User:InternetArchiveBot

Global SUL: https://meta.wikimedia.org/wiki/Special:CentralAuth/InternetArchiveBot

Github: https://github.com/internetarchive/internetarchivebot

Dashboard: https://tools-static.wmflabs.org/botwikiawk/dashboard.html

Team: Cyberpower678, Mark Graham, Harej, GreenC, Ocaasi

CYBERPOWER (Chat) 15:57, 1 July 2021 (UTC)
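
The lookup described in the request (find a Wayback Machine snapshot dated close to when the link was added) can be illustrated with a minimal Python sketch. This is not IABot's actual code, and the function names are made up for illustration. It assumes the public Wayback Availability endpoint at archive.org/wayback/available with its `url` and `timestamp` parameters; the sample response at the bottom is illustrative, not a real result.

```python
import json
from datetime import datetime
from typing import Optional
from urllib.parse import urlencode

# Public Wayback Availability endpoint (assumed here; IABot may use other APIs).
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def build_query(dead_url: str, added_on: datetime) -> str:
    """Build a lookup URL asking for the snapshot closest to the given date."""
    params = {"url": dead_url, "timestamp": added_on.strftime("%Y%m%d")}
    return f"{AVAILABILITY_ENDPOINT}?{urlencode(params)}"


def closest_snapshot(response_body: str) -> Optional[str]:
    """Return the archived URL from an availability response, or None."""
    data = json.loads(response_body)
    closest = data.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


# Illustrative response body only; a real one would come from an HTTP GET
# on the URL returned by build_query().
SAMPLE_RESPONSE = json.dumps({
    "url": "http://example.com/page",
    "archived_snapshots": {
        "closest": {
            "status": "200",
            "available": True,
            "url": "http://web.archive.org/web/20150301000000/http://example.com/page",
            "timestamp": "20150301000000",
        }
    },
})

query = build_query("http://example.com/page", datetime(2015, 3, 1))
archive_url = closest_snapshot(SAMPLE_RESPONSE)
```

An empty `archived_snapshots` object means no snapshot was found, in which case the sketch returns `None` and the reference would be left unchanged.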

Comment Mass message sent, as required by policy. --Martin Urbanec (talk) 16:39, 1 July 2021 (UTC)
I don't like to be annoying, but could you perhaps finish the task at en:WT:RFPP, requiring a reconfiguration of one of your bots since January 2019 to match current consensus, before requesting new bot approvals – in this case, for 250 additional wikis? ToBeFree (talk) 22:15, 1 July 2021 (UTC)
Hello ToBeFree, please link to the specific discussion and I will take a look. harej (talk) 22:59, 1 July 2021 (UTC)
Hi harej, it's the discussion at the top of the linked page. It is unrelated to the Internet Archive, but it is currently blocked by the lack of an update for Cyberbot I, which is operated by Cyberpower678. ToBeFree (talk) 19:14, 2 July 2021 (UTC)
I believe there was a dispute on whether that task was even useful (the two VPR discussions found a consensus that the proposal was useless/meh). In any case, we're all volunteers, and we work on what we enjoy at a given time. ProcrastinatingReader (talk) 12:20, 2 July 2021 (UTC)
Ah. ProcrastinatingReader, that was about the "Implement a full archive" part. I'm mainly interested in "Reconfigure Cyberbot I: Work transparently on the subpages (/Increase, /Decrease, /Edit) too." and the related checkboxes. I've been asking for this for years, recognizing the volunteer nature of the project. However, there is a community consensus to change the underlying behavior of a page, and the bot needs to be updated for that change to be implemented. I can at least ask the bot operator to stop requesting further responsibilities before they comply with en:WP:BOTCOMM/en:WP:BOTISSUE on their existing one. ToBeFree (talk) 19:10, 2 July 2021 (UTC)
@ToBeFree: I've been committing a lot of time to this lately and was hoping to have a response for you by now about moving forward to transitioning the bot to the new format. I am happy to report that I am now testing the bot and should have a working update in about 4 days. I do have to ask, shall I just reformat the page when I'm ready, or should I slowly transition over when everyone else is ready?—CYBERPOWER (Chat) 17:08, 9 July 2021 (UTC)
Yayyy 💚😊 Thank you very much, Cyberpower678, and sorry for having raised the issue in this unrelated, high-traffic forum out of frustration. I had been waiting for this for years and am very happy to read this. Regarding the question, feel free to notify me on my talk page when you're about to run the update, and I'll be there within 12 hours to fill the non-bot checkboxes on the list. As the page will probably survive a day without bot clerking, if it makes the transition easier, you may like to message me and temporarily disable the bot; I'll do the non-bot work and notify you that the updated bot can be enabled. ToBeFree (talk) 19:01, 10 July 2021 (UTC)
(by the way, unless I'm overlooking something, the message above did not ping me – perhaps the interwiki link isn't correctly identified as a signature by Echo.) ToBeFree (talk) 19:04, 10 July 2021 (UTC)
  • Support Highly trusted user with a tool everyone would miss if it were unavailable for a few weeks. Should be granted for all wikis, which especially helps all readers/users, but I'm also thankful as an author of articles I created (where I look for new/better resources or at least have the Wayback Machine link) and of articles I read or expand (where the reference is still available, or I can fix it when I see something is missing). Please approve/support. --Mirer (talk) 00:52, 2 July 2021 (UTC)
  • Oppose This is a request from an employee of an external entity requesting permission to preemptively add millions of links, per year, to their company's service to 250 wikis that have not requested this bot task and may or may not want it. While the Wayback Machine is a great service for the average individual Internet user, and the Internet Archive a charity with laudable goals, their interests and priorities are not the same as the Wikimedia movement's. And as the recent book linking episode on enWP demonstrated aptly, those divergent interests do get practical expression through bot behaviour; and neither Cyberpower nor their manager at IA handled that issue in a way that was particularly reassuring (so no, Cyberpower no longer falls implicitly into the "highly trusted user" category due to their w:WP:PAID conflict of interest). In particular, in that whole episode I saw little interest in interacting with the community affected by the bot's edits, a complete inability or unwillingness to acknowledge the community's concerns (mostly they gave the impression of feeling put upon), and a complete disinterest in exploring alternate approaches or compromises.
    Note that on most individual projects where I participate I would most likely support a suitably scoped BRFA for IABot: Wayback is a great service, IABot does do a valuable job, and in no way shape or form am I saying that Cyberpower is a bad or inherently untrustworthy person. I just think this is a call that each project needs to make, and to rescind if they so choose. --Xover (talk) 05:33, 2 July 2021 (UTC)
    @Xover: Could you link the book linking episode you allude to? {{u|Sdkb}}talk 07:19, 2 July 2021 (UTC)
    w:Wikipedia:Village pump (policy)/Archive 159#Stop InternetArchiveBot from linking books * Pppery * it has begun 13:53, 2 July 2021 (UTC)
    And its aftermath at w:WP:Bots/Noticeboard/Archive 14#VPPOL discussion closed: linking by InternetArchiveBot. It's the same issue Cyberpower alludes to in the request up above. --Xover (talk) 15:02, 2 July 2021 (UTC)
    The Internet Archive isn't a "company"; it's a mission-aligned non-profit whose work has benefited Wikimedia for our entire existence. Ed [talk] [en] 17:43, 2 July 2021 (UTC)
    Indeed, the IA is, as I mentioned, roughly aligned with the Wikimedia movement's mission; but that doesn't mean their interests and priorities are identical or even always compatible. Just as the crudest and most obvious example, they partly compete for the same donors that the WMF does. A more subtle one is that they use raw outbound links from Wikipedia to their service as a success metric, but Wikipedia policy severely restricts the use of external links and regulates where they may be used in what way (in sometimes excruciating detail). That is, Cyberpower et al's next paycheck depends on them caring more about the number of such links than about the local policy for links. In the (hopefully very small) number of cases where these priorities are in direct conflict, this will become a real problem. This is exactly the issue WP:PAID and WP:COI try to address, and it being now a whole team with significant infrastructure and backing operating a bot with this problem makes it a lot more concerning. At this scale and intensity even small differences in priorities are more likely than not to get magnified until they become real problems; and the book linking episode and the handling of it by both Cyberpower and their manager at IA made me lose faith in their ability to counteract this problem to the movement's benefit. If you're a PAID editor that can't see that you have a COI you are exceedingly unlikely to be able to actually compensate for that COI and act accordingly. Xover (talk) 08:27, 3 July 2021 (UTC)
    That was about an entirely different task than what this approval is about. (Also, since you appear to be concerned with indirect effects of COIs, it should also be mentioned that that discussion, which did not see that much participation and was closed as "no consensus", had been initiated and heavily argued by a user with a clear and acknowledged COI themselves, who focused on perceived threats to the financial profitability of his industry way more than on what's best for Wikimedia projects.) Regards, HaeB (talk) 18:05, 2 July 2021 (UTC)
The COI concerns are a red herring here: If you think that keeping these links broken is in the interest of Wikimedia projects, then say so. But that doesn't seem to be your argument. If you happen to think that fixing these links is useful, then there is no conflict of interest about this particular bot task. Regards, HaeB (talk) 18:05, 2 July 2021 (UTC)
  • Support This is the first reasonable request submitted under the new global bot policy. Part of the reasoning behind the RfC was that we have tons of projects, and botops are not going to spend time configuring and requesting approval for each one manually (many of which may not even have approval policies). InternetArchiveBot was cited in the opening of that RfC as an example of a bot that could benefit from this new policy. It is worth noting that local communities opt into the global bot policy; it is not automatic. IABot has approval on many large projects, and 65 projects in total. This shows that communities of varying size, culture, language and purpose believe the bot is doing a helpful and uncontentious task. Considering these facts, I see no reason not to give this bot global bot status. ProcrastinatingReader (talk) 12:16, 2 July 2021 (UTC)
  • Support No concerns. There are no alternative archive services, so it makes sense to roll this out globally. My home wiki is Commons. --Schlurcher (talk) 12:20, 2 July 2021 (UTC)
    Comment Connection of the bot to the archive should be disclosed on the bot's user page, as on the page from the main developer. --Schlurcher (talk) 12:30, 2 July 2021 (UTC)
    Comment I get where you're coming from, but isn't the name of the bot fairly self-explanatory? Jackattack1597 (talk) 13:09, 2 July 2021 (UTC)
  • Support Very useful crosswiki bot that should not have to go through the bot approval process for wikis that opt in to global bots. Jackattack1597 (talk) 13:09, 2 July 2021 (UTC)
  • Oppose I don't think it's appropriate for bots that have had to be indefinitely blocked for violating local bot policy (in this case, on the Japanese Wikipedia) to be approved globally. * Pppery * it has begun 13:58, 2 July 2021 (UTC) (struck * Pppery * it has begun 22:49, 2 July 2021 (UTC))
    • There is a unique situation with Japanese Wikipedia. At first, the bot was not respecting rate limits. This has now been entirely fixed. Second, many websites block access to IP addresses that are outside of Japan leading to false positives from the bot. We are still working on fixing this. I have been in conversation with Ney, an administrator on Japanese Wikipedia, and he said he is open to unblocking the bot once we are ready to seek re-approval. harej (talk) 18:18, 2 July 2021 (UTC)
    (edit conflict) I understand that global bot status mainly affects the default assumption on wikis that have not yet formed a consensus one way or the other, and that it would not override local blocks. Besides, it seems that the block in question was about an issue with the particular edit rate limits of Japanese Wikipedia in a particular type of situation, rather than content concerns about the types of edits that this RfC is about, that the blocking admin afterwards stated that they saw that issue as resolved and no longer considered the block necessary, and that the bot operators filed and later closed a technical fix (phab:T254017) to prevent that situation from reoccurring. Regards, HaeB (talk) 18:20, 2 July 2021 (UTC)
  • Support No brainer. This bot has been a huge benefit to the Wikipedias where it is used. Gamaliel (talk) 17:02, 2 July 2021 (UTC)
  • Support per ProcrastinatingReader. Ed [talk] [en] 17:46, 2 July 2021 (UTC)
  • I've hesitated on supporting because there should be a clearly defined way for a wiki to opt out of this if desired. --Rschen7754 18:06, 2 July 2021 (UTC)
    @Rschen7754: Thanks for raising that concern Rschen. The global bot flag only works on opted-in wikis--any wiki can remove itself from opt-in status generally. More specifically, the bot itself can be stopped by any regular user by turning off the run page on their wiki here. If there is local consensus to fully disable the bot, we would of course respect that.—CYBERPOWER (Chat) 18:37, 2 July 2021 (UTC)
  • Support Immensely useful for readers and contributors alike, and the proposal makes a reasonable case that filing individual BRFAs on the long tail of the 250 remaining wikis is unrealistic. Regards, HaeB (talk) 18:25, 2 July 2021 (UTC)
  • Support, per Haeb. Enormously useful across languages and contexts. –SJ talk  19:38, 2 July 2021 (UTC)
  • Support -- Marcus Cyron (talk) 21:41, 2 July 2021 (UTC)
  • Support I had wished before that a bot like this existed in Konkani Wikipedia. We had problems with linkrot when a widely used source website URL was taken over by an entirely unrelated website. Our volunteers did not have the habit of adding archive links to citations; this bot will solve the problem. Having looked at how the bot operates in Hindi Wikipedia, I can reasonably believe it won't cause issues. ಮಲ್ನಾಡಾಚ್ ಕೊಂಕ್ಣೊ (talk) 10:51, 3 July 2021 (UTC)
  • Support Nothing prevents a Wikipedia language version from opting out, if they decide that this bot is not useful. Given the success of this bot where implemented, and the resources committed to its functioning well, it makes sense to presume that it will be beneficial to the vast majority of Wikipedia language versions where it has not yet been implemented. And if so, it benefits Wikipedia (and the bot operators) to avoid all the extra administrative work of asking at 200+ more places for permission to operate. John Broughton (talk) 16:19, 3 July 2021 (UTC)
  • Support No bot (nor operator) is ever perfect and it's unrealistic to expect that. The bot is actively maintained and has good disabling options in case something does go wrong. Thank you for fixing link rot! Legoktm (talk) 16:26, 3 July 2021 (UTC)
  • Support The bot works well on all projects where I saw it operating. I trust the bot operator(s) to work with local communities to make sure the bot doesn't do anything the community would not agree with, and it makes sense to me to issue a global bot approval in this case. The bot has a shutdown mechanism implemented, allowing enough people from the community to turn it off. The ja.wikipedia block was sufficiently clarified. Thanks for all you do. --Martin Urbanec (talk) 18:46, 4 July 2021 (UTC)
  • Support as a great example of very aligned organisations cooperating with mutual benefit. A couple of questions about making it work as well as it can:
  1. Is IA blocked in any countries? If so, are these countries strongly associated with any particular language Wikipedia? Also, I saw the issue with Japanese websites above; there's a similar issue with a lot of US-based websites blocking EU + UK visitors over GDPR.
  2. Wikipedia is in over 300 languages. Is it possible there could be technical or other issues with IA hosting content in any of these languages or any new ones added (which I assume would automatically be added to the bot's list)? If so, how could WMF and others support IA to get support for these languages? In addition, are there any other user experience issues with these languages in IA, e.g. interface labels, that language speakers could help with?
  3. Assuming that the languages this would be turned on for are the smaller wikis (by number of articles and number of contributors), is there a risk that IA will not archive the links which are likely to die? Is there a way to index which sites are used as references on these wikis and for IA to back them up?
Thanks
John Cummings (talk) 21:14, 4 July 2021 (UTC)
Hello John Cummings, thank you for the questions. (1) The Internet Archive is blocked in China. We are not aware of it being blocked in other countries. (2) The Wayback Machine archives all websites as they are, including web fonts if there are any. (3) For the past several years, the Internet Archive has been archiving all external links posted on all Wikimedia wikis. Whether a link is posted to a smaller wiki or larger wiki should not matter in this case. harej (talk) 18:48, 7 July 2021 (UTC)
Yes/and... not only do we archive URLs as provided by Wikipedia's EventStreamAPI to the Wayback Machine but we also archive all "outlinked" URL from those URLs. User:markjgraham_hmb —Preceding undated comment added 03:57, 8 July 2021 (UTC).
Thanks very much Harej and markjgraham_hmb for the explanation. John Cummings (talk) 09:10, 8 July 2021 (UTC)
  • Support Definitely. Even though Internet Archive (more specifically the Wayback Machine) is blocked where I am, the bot does a very valuable task. Leaderboard (talk) 14:29, 5 July 2021 (UTC)
  • Support InternetArchiveBot is trusted and useful for all wikis. -- कन्हाई प्रसाद चौरसिया (talk) 15:49, 5 July 2021 (UTC)
  • Support The service that InternetArchiveBot provides is fundamental to the long term operation of Wikimedia projects. This collaboration between Internet Archive and Wikipedia is becoming a defining feature and flagship project of both projects. I hear Xover's criticism about the book linking problem and I confirm its validity. Despite the usefulness of this bot and the archiving project, this is a tool which makes millions of edits, and its activity is not neutral but rather contains elements of editorial creativity. I myself have criticized the InternetArchiveBot for its talk page behavior, but after conversation, Cyberpower and the wiki community collaboratively fixed the problem. These various problems need conversation, not a prohibition of development of the tool and its benefits. Of all the external partners that we could have to test out bot relationships with humans, Internet Archive is an ideal collaborator for this because they are mission aligned, have excellent resources to share, and this conversation is sincere and not forced. I would support major Wikimedia Foundation funding going into this project because it is so useful for Wikipedia's citation infrastructure, and because I like the idea of separation of power with off-wiki communities and projects in our shared infrastructure. Blue Rasberry (talk) 22:15, 5 July 2021 (UTC)
    @Bluerasberry: I would have been a lot more comfortable if this was an actual formal joint project between the IA and the WMF, where the WMF put up at least half of the direct funding (most of it, preferably; and with long-term commitment), and hence goals, priorities, and success metrics were negotiated between the entities. I also dearly wish the two organisations would collaborate a lot more on citations and bibliographic data (there are some incredible synergies and massive impact to be had there), because the book linking thing didn't have to turn into a mess!
    In any case… Quibbles aside, I agree with the points you make. Wayback Machine linking has much much lower potential for causing a mess than the book linking thing. It's just that that incident was enough to crystallise the underlying PAID/COI issue (that applies to both tasks) for me. For global bot permission I fall down on the other side from you; for any suitably scoped local BRFA I probably wouldn't. Thank you for articulating these (counter)points so well. Xover (talk) 08:02, 6 July 2021 (UTC)
@Xover: The Wikimedia Foundation is not the leader here, the community is. If the community and the Internet Archive draft guidelines for a partnership between the IA and the Wikimedia Movement, then that is what makes the partnership, not permission from WMF corporate people. The Internet Archive has annual revenue of about US$20 million whereas WMF has annual revenue of US$180 million rising rapidly. The amount of funding that the WMF would have to put into this to make the tool highly effective is small, but the bigger cost is combining the brands and public images of the two organizations. If this relationship is worth developing, and I think it is, then the community can make public requests for money to go into this and the Wikimedia Foundation can answer that publicly. Putting all criticism on the table clearly is useful so that everyone can make an informed decision. It is not the place of the Wikimedia Foundation to decide which partners are ethically compatible with our movement.
Also about COI - remember that the Wikimedia Foundation has a COI too. The community's mission is to advance the Wikimedia Movement, and the Wikimedia Foundation has conflicting goals of both advancing the Wikimedia Movement and running a corporation. When there is a decision about ethics versus corporate interest, the Wikimedia community chooses ethics more often and the WMF chooses corporate interest more often. I hear that you want WMF to lead a formal joint partnership, but enough years have passed in the IA / Wikimedia experiment to observe that WMF staff are unwilling to talk in recorded media about partnerships and a lot of other things. If there is going to be a partnership conversation, then it should start with IA and the community and the WMF can enter at the community's invitation. There is an extent to which the Internet Archive is a marketplace competitor with the Wikimedia Foundation in terms of money and the hearts of activists. There are lots of reasons why the WMF would not give money to an external org without a lot of obvious Wikimedia community support for them doing so. A relationship between the Wikimedia Movement and IA would be unprecedented. I would like one though, because this is a great partnership and because I am not afraid of division of power in the Open Movement. Blue Rasberry (talk) 11:58, 6 July 2021 (UTC)
@Bluerasberry: We're getting rather far afield, and probably outside the scope of this particular thread. So I'll limit myself to saying I agree, at least roughly, with what you've expressed here. I would just add that there is nothing preventing the WMF from actually being proactive here, and initiating a dialogue about such a partnership (as they've done with numerous GLAMs, Google, and others over the years) both with IA and the community. I hear Brewster is a bit busy lately, but I'm sure they could find someone to talk to there all the same. :) Xover (talk) 15:51, 6 July 2021 (UTC)
FWIW, I manage the Turn All References Blue project at the Internet Archive (as well as the Wayback Machine) and have been in close communication, and collaboration, with many staff members of the WMF (including some who have the term "partnership" in their title) for years. They are very supportive. And, if they want to help fund our work we would say thank you! But, we have never waited for funding from others to try to help make the Web more useful and reliable. User:markjgraham_hmb 20:59, 6 July 2021 (UTC)
Done Ruslik (talk) 15:55, 17 July 2021 (UTC)

Removal of global bot status

Removal of global bot status for Invadibot

Per Bot policy: Inactivity is deemed as any global bot account not performing any edits on any project where global bot flag is allowed for a whole year. The only edits within the last year have been on eswiki which is not a global bot wiki. The operator has agreed with the removal. Rschen7754 00:16, 23 July 2021 (UTC)
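
The inactivity rule quoted above (no edits for a whole year on any project where the global bot flag is allowed) can be sketched as a small check. This is only an illustration of the rule as stated here, not a tool stewards actually use; the function name, the 365-day window, and the wiki names in the usage example are assumptions.

```python
from datetime import datetime, timedelta
from typing import Iterable, Set, Tuple


def is_inactive(
    edits: Iterable[Tuple[str, datetime]],
    global_bot_wikis: Set[str],
    now: datetime,
    window: timedelta = timedelta(days=365),  # "a whole year", approximated
) -> bool:
    """Return True if no edit in the window was made on a global-bot wiki.

    `edits` is a sequence of (wiki, timestamp) pairs for the bot account.
    Edits on wikis outside `global_bot_wikis` do not count as activity.
    """
    cutoff = now - window
    return not any(
        wiki in global_bot_wikis and timestamp >= cutoff
        for wiki, timestamp in edits
    )


# Hypothetical data: recent edits only on eswiki, which is not in the set of
# wikis that allow global bots, so the account counts as inactive.
recent_edits = [("eswiki", datetime(2021, 5, 1))]
allowed_wikis = {"examplewiki_a", "examplewiki_b"}  # placeholder names
inactive = is_inactive(recent_edits, allowed_wikis, now=datetime(2021, 7, 23))
```

Under these assumptions an account that edits only on wikis outside the global bot set is treated the same as one that has not edited at all, which matches the reasoning given in this request.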

Done, thanks - QuiteUnusual (talk) 13:20, 23 July 2021 (UTC)

Bot status requests

TU-G205@ja.wikivoyage

I applied for the permission 10 days ago; there has been no opposition, and 2 users support it. --Mario1257 (talk) 10:28, 1 July 2021 (UTC)

Is this account really a bot account? Ruslik (talk) 17:20, 1 July 2021 (UTC)
Done Ruslik (talk) 09:03, 3 July 2021 (UTC)

PastooshekBOT@plwikinews

Pastooshek (talk) 16:20, 9 July 2021 (UTC)

Done Martin Urbanec (talk) 17:22, 9 July 2021 (UTC)

Removal of bot status