Anti-Vandal Bots

With the introduction of AbuseFilter, pretty much all of the things the current batch of AntiVandal bots do can be implemented as AF rules. This also provides better fine-grained control of actions/responses. I propose we put a moratorium on considering any more Anti-Vandal bots. Q T C 23:43, 17 March 2009 (UTC)

I haven't seen any serious anti-vandal bot requests lately anyway; have I missed one? I have seen (and denied) several that seem to be wanting a vandal bot, though. Anomie 03:02, 18 March 2009 (UTC)
Anything important enough to "coast through" BRFA is IMO important enough to apply WP:IAR to the entire approval process, as was done recently to add {{copyvio}} tags to several thousand stubs when it was discovered the creator copied the text from a book. I did offer to approve the bot to do the same thing for future incidents of this type so IAR wouldn't be needed, but I've received no response on that so far. Anomie 19:39, 22 March 2009 (UTC)

Global Policy Page

The GRU page needs to be updated. It says we allow per the Meta policy, but the Meta policy has expanded to include double redirects as well as interwiki links. Do we want to readdress our stance with the Global Policy, or should we update the Global rights page to indicate that it's allowed only for interwiki links as that is all that was agreed upon? Q T C 03:06, 30 March 2009 (UTC)

I say update the global rights page, especially since T19888 will probably result in double-redirect-fixing being unwanted. Anomie 03:13, 30 March 2009 (UTC)

Proposal to change the interwiki bot policy

Interwiki bot approval practice does not seem to follow the current policy, for good reason. I propose several clarifications to the bot policy and one extra requirement. First, understanding of target languages is not required to add reciprocal links. Second, the operator must be able to prevent an incorrect reciprocal link from being added again. This would result in fewer cases like the one described in this thread, where an article maintainer is annoyed by multiple bots. Third, I copied the policy on global bots into this section, where it is most relevant. I also removed some redundant phrases from the existing language. This is the proposed policy:

Operators of bots which create non-reciprocal interwiki links must be familiar with the languages to which they are linking. Bot operators running standard tools such as the pywikipedia framework should update to the latest version daily. If a bot adds an incorrect reciprocal link, its operator must be able to prevent other bots from restoring the link. Globally-approved interwiki bots are permitted to operate on English Wikipedia, subject to local requirements.

This paragraph would replace the two currently in the "Interwiki links" section of Wikipedia:Bot policy#Restrictions_on_specific_tasks. I don't know if any non-reciprocal interwiki bots exist or will ever exist, so that sentence may not be necessary. I also recognize that the extra requirement on bot operators to disable certain interwiki links is a controversial change that affects many current operators and could also be removed. Wronkiew (talk) 00:53, 1 April 2009 (UTC)

I support this wholeheartedly, though I'd like to suggest a couple of slight tweaks, if I may: firstly, I wouldn't use the word "reciprocal" - I know what you're on about, you know what you're on about, but we have to remember that English may not be the first language of interwiki operators and we should make it as obvious as possible. "Original" perhaps? Also, I'm a bit confused regarding the phrase "to prevent other bots from restoring the link": isn't that other owners' jobs? Obviously, the same bot should never re-edit the page and/or edit war. Anyhow, yes, I support this sensible change. - Jarry1250 (t, c) 19:00, 1 April 2009 (UTC)
Thank you for your comments. I think reciprocal and non-reciprocal are the most specific terms that describe the type of links these bots add. Would an initial sentence defining these terms address your concerns? My intent for addressing the incorrect link/edit war problem was to make the bot owners responsible for correcting the bot behavior rather than the page maintainer. I don't think pywikipedia checks page history, so operators have no way to keep their bots from re-adding incorrect links. However, if they do add an incorrect link, the operator can ensure that the link is not re-added by other bots. I have heard about a link blacklist, but I don't know if it is operational. Adding and commenting out the link may also help. Another solution might be to delete the link at the remote wiki. None of these can be done proactively, but operators should know what to do if they are notified of an error. Wronkiew (talk) 06:28, 2 April 2009 (UTC)
Here's a possible initial definition: "Interwiki bots maintain links among articles in different languages. Most commonly, a bot will search for pages to which other wikis link and ensure that the page contains links back to them; this is called reciprocal linking. If the page is not already linked from other languages, added links are called non-reciprocal." Wronkiew (talk) 06:39, 2 April 2009 (UTC)
Call me crazy but that defeats the whole purpose of interwiki bots if they can't add the links that are missing... It's not all that hard for anyone to make sure the edit war on a given page doesn't keep happening: you go and remove the IW links from all the linked pages, and then a bot won't continue to re-add them once they are removed from all pages. -Djsasso (talk) 14:31, 3 April 2009 (UTC)
Exactly. My intent here was simply to make the operator responsible for cleaning up incorrect links that their bot added, rather than the page maintainer. I had not intended to place any restrictions on interwiki bot operations, and I apologize if my wording gave you that impression. Are you concerned only with the sentence "if a bot adds an incorrect reciprocal link, its operator must be able to prevent other bots from restoring the link"? Do you have any suggestions for improvements to more clearly state the intent? If it's unclear, and cannot be stated simply, I'd be fine with dropping that one sentence and moving on. Did you have any concerns with the rest of the proposed policy? Wronkiew (talk) 21:52, 4 April 2009 (UTC)
I found at least one interwiki bot that adds non-reciprocal links, BotKung. I've also seen a few like it at bot requests. If no one objects, I'll merge in the proposed policy, minus the contested anti-bot-warring requirement, in the next couple of days. Wronkiew (talk) 07:43, 7 April 2009 (UTC)
This is done. Wronkiew (talk) 21:59, 11 April 2009 (UTC)
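As a rough illustration of the reciprocal/non-reciprocal definition proposed above, a link can be classified by whether the target page already links back. This is only a sketch: the `classify_link` helper and the dictionary layout are hypothetical and are not taken from pywikipedia.

```python
def classify_link(local_lang, local_title, target_lang, target_title, langlinks):
    """Classify a proposed interwiki link from the local page to the target page.

    langlinks maps (lang, title) to the set of (lang, title) pairs that page links to.
    The link is 'reciprocal' when the target already links back to the local page.
    """
    links_from_target = langlinks.get((target_lang, target_title), set())
    if (local_lang, local_title) in links_from_target:
        return "reciprocal"
    return "non-reciprocal"


# Hypothetical data: the German page links to the English one, the French page does not.
langlinks = {
    ("de", "Haus"): {("en", "House")},
    ("fr", "Maison"): set(),
}
print(classify_link("en", "House", "de", "Haus", langlinks))
print(classify_link("en", "House", "fr", "Maison", langlinks))
```

Under the policy wording discussed here, only the second kind of link would require the operator to be familiar with the target language.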

Clarifying where some kinds of tool fit

There are certain kinds of tool that aren't clearly defined under the present policy wording. At present, bots are basically automated tools, and tools whose edits require some element of interaction come under "assisted editing".

There are some tools that are presumably assisted editing but not clearly defined as such, where the user doesn't "interact" for each edit, because the task is very well defined for each run, requires no element of machine judgement, and all interactional data is specified in advance of the run.

What these have in common is the use of an automated tool, to perform at high speed a repetitious and non-controversial "housekeeping" task on multiple "targets". They don't really come under "fully automated bots" since there is user interaction - it's just front loaded at the point the run is executed, by the user individually checking each action will be okay.

I'd like to see under assisted editing, something to the effect: "It also covers tools that perform well-defined non-controversial actions requiring no further judgement, on a previously checked list of targets."

The fact the user has checked the proposed targets individually beforehand, is the key. Can the wording be made more watertight to prevent its abuse for mass actions though?

Hopefully this aim is sensible and not controversial. FT2 (Talk | email) 02:50, 20 April 2009 (UTC)

My understanding of the current policy is that the user has to approve each edit before the script commits it. Tasks that process a large number of simple edits at high speed, even if they are working off a user-generated list, must go through BRFA. I think the distinction is very clear in the existing language. Also, I reverted your change to the assisted editing definition for now. Wronkiew (talk) 04:29, 20 April 2009 (UTC)
It seems to me that there are several reasons for the current approvals process in this case:
  1. To some people, "checked" means "I generated this list from known criteria", which fails if corner cases were not considered. For example, I've recently had requests to tag every page in Category:Energy and subcats and every page in Category:NATO and subcats with the corresponding WikiProject banners; in both cases, it was not hard to find a page a few levels down in the category tree that had nothing to do with the project.
  2. High-speed edits can mean high-speed mistakes. While we can't catch every mistake someone's bot might make, the approvals process does help detect them and the requirement to manually approve each script-assisted edit (hopefully) handles those scripts.
  3. Users often do not consider how their idea may be controversial, and (when we do our job right) BAG will point this out during the approvals process. Restricting unapproved actions to "human" speed gives reviewing humans a chance to object before too big of a mess is made.
Is there something in particular this addition is aimed at? Anomie 12:07, 20 April 2009 (UTC)
It's aimed at cases where a fixed well defined action is required on many pages. This arises quite regularly in checkuser or oversight work, where a vandal may post many dozens of "outing" data, all of which need the same treatment. Other examples are as a developer of the SPI process, having to change the format of the {{RFCU}} "request for checkuser" template on all SPI subpages that used that template, when it was modified to allow a second "case letter" to be added. A further case was a process update that would be of relevance to a number of users (~30 from memory), all of whom needed the same note added to their talk page so they were aware. In each case the matter may be classified as "routine", and the exact actions to be taken are listed and individually reviewed before any are processed, up front. These tend to be non-controversial types of actions that just need doing to a list of pages. What's being asked is to add that if each individual edit is checked (ie no "assumptions") and the matter is non-controversial, they may use an "assisted tool" (at "human" and not "bot" speed) to do the actual posting for them once they have reviewed all the proposed items. I think this fits into the criteria and spirit of "assisted tools", but it would be good to add it explicitly.
Quick comments on the above - 1/ automated generation of "targets" isn't contemplated. Individual checking of each item is intended, before they are added. The tool acts purely as a saving on the repetitious activity, not a substitute for checking each case. 2/ See #1, 3/ Agree, this would appear to be an argument for speed restriction (as with all assisted editing) though, which is already in the policy. FT2 (Talk | email) 14:07, 20 April 2009 (UTC)
The only example you give where having to get bot approval would be troublesome is your oversighting example, and even there I think it likely that a bot tasked to generically "revert (if necessary) and oversight a specific list of revisions determined by an authorized oversighter" has a fair chance of being approved and would be more likely to correctly handle any edge cases. As for the rest, we already have bots approved to manipulate templates without having to worry about any random user's poor regex skills, and we already have many bots approved to deliver notices to talk pages. And I doubt doing either with AWB in manual mode (or the equivalent using any other script falling under the current assisted-editing criteria) would be excessively onerous, if it came down to that.
In general, well-defined non-controversial tasks by proven bot ops are precisely those that can easily sail through BRFA with little delay anyway, while ill-defined tasks and those by unproven bot operators are precisely those that should have more attention paid. Anomie 16:43, 20 April 2009 (UTC)

I used to use a tool I wrote called TINA (admins can view the page User:Ameliorate!/TINA). It worked by displaying a list of each change it was going to make, for example:

  1. Somepage: {{sometemplate|someparam=foo}} --> {{sometemplate|someparam=bar}}
  2. Anotherpage: {{sometemplate|someparam=foo}} --> {{sometemplate|someparam=bar}}
  3. Someotherpage: {{sometemplate|someparam=foo}} --> {{sometemplate|someparam=bar}}

So in a list of 1000+ edits it was easy to see any corner cases and exclude them. As far as I was concerned this was acceptable under "... but do not alter Wikipedia's content without some human interaction" as there was human interaction; it was just done en masse at the beginning. I never received any complaints about it, so I would support any change to explicitly allow this. ~ Ameliorate! 04:52, 20 April 2009 (UTC)
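A preview list of the kind shown above could be produced along these lines. This is a sketch, not TINA's actual code (which is not shown in this discussion): the `preview_changes` function is hypothetical, and the pattern deliberately handles only a template with a single parameter, as in the example.

```python
import re


def preview_changes(pages, template, param, old, new):
    """Return (title, before, after) for each template call whose parameter would change.

    pages maps a page title to its wikitext. Nothing is edited; the result is
    only meant to be read by the operator before any run starts.
    """
    pattern = re.compile(
        r"(\{\{" + re.escape(template) + r"\|" + re.escape(param) + r"=)"
        + re.escape(old) + r"(\}\})"
    )
    previews = []
    for title, text in pages.items():
        for match in pattern.finditer(text):
            before = match.group(0)
            after = match.group(1) + new + match.group(2)
            previews.append((title, before, after))
    return previews


# Hypothetical wikitext; "Unrelated" uses a different template and is left out.
pages = {
    "Somepage": "Intro {{sometemplate|someparam=foo}} text",
    "Anotherpage": "{{sometemplate|someparam=foo}}",
    "Unrelated": "{{othertemplate|someparam=foo}}",
}
changes = preview_changes(pages, "sometemplate", "someparam", "foo", "bar")
for number, (title, before, after) in enumerate(changes, start=1):
    print(f"{number}. {title}: {before} --> {after}")
```

Showing the before and after text for every page is what lets the operator spot and exclude corner cases before anything is saved.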

Back when you brought it up, I thought it displayed a bit more of a diff than that... Anomie 12:07, 20 April 2009 (UTC)
I think that this should be in the bot tasks category. Even though the script is presenting all the diffs for confirmation before it makes any edits, no allowance is made for feedback by other editors. If someone disagreed with the change and it was running unattended, the only way they could stop it would be to have your account blocked. Wronkiew (talk) 17:11, 20 April 2009 (UTC)
It stopped if anyone edited my talkpage. ~ Ameliorate! 23:24, 20 April 2009 (UTC)
That makes you a very responsible bot developer. I see your point, but I would consider the process you described to be unattended editing rather than assisted editing. Wronkiew (talk) 06:03, 21 April 2009 (UTC)
This is exactly the issue when trying to clarify the difference between a bot and a script. How/why is it different to manually approve 1000 changes in advance than to approve them all in real time as the script runs? Is there a difference between making a list of pages in userspace and using Twinkle BatchDelete on them versus making a list in a file and using a Python script to delete them? Mr.Z-man 06:36, 21 April 2009 (UTC)
Yes, there is a difference. If you review a list of articles and then start the edits, the delay between the review and the edit may become long and intermediate edits become likely, which may change the edit from the form initially reviewed. When the edit is generated "live" and reviewed at the moment before the edit is made, the delay should be small. Deletions in particular can cause difficulties if done in large batches - I recall one case where an image was up for deletion for some deficiency in source or copyright info. An editor fixed it, then the image was deleted a few minutes later by a script run by an admin who had reviewed the pre-fix version. As I recall, the list of images was fairly lengthy, so the delay between review and actual deletion was quite long; the servers may have also been backlogged and delayed script work. Gimmetrow 01:51, 28 April 2009 (UTC)
I've occasionally run semiautomatic scripts where I reviewed all the edits ahead of time, then the script made the edits for the next hour or so while I did something else (I was still online and available from my talk page in case of problems). The difference between what I was doing and what you describe is that the script was set up in such a way that any edit to a page between my review of that page and the script editing it would have resulted in an "edit conflict" (which would have prevented the script from editing that page at all and logged it), even if the edit happened half an hour before the script got to it, as long as it happened after my review. This is why we do not need a hard and fast rule here, otherwise you will encounter border cases that do not fall into your narrow definition of a semiautomatic script, but are in fact semiautomatic. In my opinion when the review happened should be irrelevant, so long as the reviewed version of the page to be changed is the same as the version that is in fact changed.--Dycedarg ж 02:11, 28 April 2009 (UTC)
I could see it being semiauto if the edit is reviewed, then sometime later, immediately before the edit is actually attempted, the page is downloaded again to verify it's the same as the formerly reviewed version. Is that what you're doing, and could we count on other scripts doing that? Gimmetrow 00:22, 29 April 2009 (UTC)
What the script I was using did was as follows: The timestamp of the revision I'd reviewed was compared to the timestamp of the latest revision of the page at the time the script was about to edit it, and if they did not match then the script didn't edit. So it accomplished the same thing. The asynchronous editing function in Pywikipedia does this automatically from what I can tell by looking at the code, so any semiautomatic script built using that would do it. As for other scripts, you can only count on this kind of checking if the person writing it was responsible enough to implement it. I would not have any problem with the bot policy specifying that this must be the case for these kinds of scripts; I would have thought that it would be a basic rule of common sense for this sort of thing anyway.--Dycedarg ж 05:09, 29 April 2009 (UTC)
Unfortunately, common sense is far from common. Anomie 12:23, 29 April 2009 (UTC)
Indeed it is. So how about a minor change to the policy inserting something along the lines of: "Edits to be made by semiautomatic tools may be reviewed en masse before a script makes the edits autonomously, so long as the pages are checked to ensure that no edits were made to the page in the intervening time between the review and the actual edit, and either the editor is available to stop the script or the script can be stopped by someone else (such as by editing the editor's talkpage) in case something goes wrong."--Dycedarg ж 21:05, 29 April 2009 (UTC)
And then someone will come up with some new corner case and we're back to square one. I think it would be best to not try to define such things and just handle them on a case-by-case basis. Mr.Z-man 21:21, 29 April 2009 (UTC)
Seconded. Anomie 22:45, 29 April 2009 (UTC)
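For reference, the two safeguards discussed in this thread (skip any page edited since it was reviewed, and stop when asked) can be sketched as follows. This is an illustration under stated assumptions, not any participant's actual script: `ReviewedEdit`, `run_reviewed_edits`, and the three callables passed in are hypothetical names, and the demo uses in-memory stand-ins rather than a live wiki.

```python
from dataclasses import dataclass


@dataclass
class ReviewedEdit:
    title: str
    reviewed_timestamp: str  # timestamp of the revision the operator reviewed
    new_text: str


def run_reviewed_edits(edits, get_latest_timestamp, save_page, should_stop):
    """Apply pre-reviewed edits, skipping pages that changed since the review.

    Returns (saved, skipped) lists of page titles.
    """
    saved, skipped = [], []
    for edit in edits:
        if should_stop():
            break  # e.g. someone edited the operator's talk page
        if get_latest_timestamp(edit.title) != edit.reviewed_timestamp:
            skipped.append(edit.title)  # page changed after review: do not edit, just log
            continue
        save_page(edit.title, edit.new_text)
        saved.append(edit.title)
    return saved, skipped


# In-memory stand-ins: page "B" was edited after the operator reviewed it.
latest = {"A": "2009-04-28T02:00:00Z", "B": "2009-04-28T03:30:00Z"}
edits = [
    ReviewedEdit("A", "2009-04-28T02:00:00Z", "new text for A"),
    ReviewedEdit("B", "2009-04-28T02:05:00Z", "new text for B"),
]
written = {}
saved, skipped = run_reviewed_edits(edits, latest.get, written.__setitem__, lambda: False)
print("saved:", saved, "skipped:", skipped)
```

A check done on the client still leaves a small gap between the comparison and the save. The MediaWiki edit API accepts a `basetimestamp` parameter so that the server itself can report an edit conflict, which closes that gap.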

Change to bot account name policy

Please see the discussion here. Wronkiew (talk) 06:41, 27 April 2009 (UTC)

Wikipedia:Bot Approvals Group/nominations/Tinucherian

This is a 'mandatory' notification to all interested parties that I have accepted a nomination to join the Bot Approvals Group - the above link should take you to the discussion. Best wishes, -- Tinu Cherian - 10:58, 1 May 2009 (UTC)

BAG nomination for Nakon

Per the required "spamming" of venues, I would like to bring attention to my nomination for the Bot Approval Group, which may be found at Wikipedia:Bot Approvals Group/nominations/Nakon. Thanks, Nakon 01:23, 17 May 2009 (UTC)

"Right of first refusal"

I think it would be good to clarify long-standing practice that if source code has been released by an author and the author is able and willing to run it on a specific project, they have the right of first refusal. Any objections to clarifying this? --MZMcBride (talk) 22:39, 27 May 2009 (UTC)

Yeah, I don't think this would be useful. It's not really a right. In some controversial cases, a task may require a particularly cautious and polite bot-runner, who may not necessarily be the person who wrote the code. Consider that in the date delinking case, we may find ourselves under the following restriction: "The Bot Approvals Group will require that the operators selected to perform any date delinking have a history of being able to handle complaints well, and willing to pause their bot when problems have been identified." This may well not be the author of the code, and I don't think giving the author "right of first refusal" would help the project. It's a good tradition, but a bad explicit guarantee. – Quadell (talk) 03:59, 28 May 2009 (UTC)
I don't think etiquette should be encoded in policy, although IMO an essay might not be out of place. Anomie 11:09, 28 May 2009 (UTC)
BTW, it would probably be a good idea to bring up the "Can you remain WP:CIVIL and respond promptly even if your talkpage is flooded with many repetitive uncivil complaints?" question on any "enforcement" BRFA. I'll have to try to remember that. Anomie 11:09, 28 May 2009 (UTC)

WT:Policies_and_guidelines#Subcats

Discussion about policy subcategories for several pages, including this one. As far as I know, this doesn't make any difference, except as a help to people trying to browse policy. - Dank (push to talk) 03:14, 9 July 2009 (UTC)

BAG nomination

Hello there. Just to let you know that I (Kingpin13) have been nominated for BAG membership. Per the requirements, I'm "spamming" a number of noticeboards, to request input at my nomination, which can be found at Wikipedia:Bot Approvals Group/nominations/Kingpin13. Thanks - Kingpin13 (talk) 08:01, 20 July 2009 (UTC)

P.S.

WP:BOTPOL is and has been in the "behavioral" policy template. I don't think it makes any practical difference whether the page is in the conduct cat or the enforcement cat ... I had guessed "enforcement" since all of the conduct policy pages seem to be primarily about user conduct. If there's no response here, I guess I'll update the behavioral template in a week or so. - Dank (push to talk) 18:30, 23 July 2009 (UTC)

For info: AUSC October 2009 elections

The election, using SecurePoll, for Audit Subcommittee appointments has now started. You may:

The election closes at 23:59 (UTC) on 8 November 2009.

For the Arbitration Committee,  Roger Davies talk 07:39, 1 November 2009 (UTC)

Urgent! Last call for votes: AUSC October 2009 elections

There's only one day to go! The Audit Subcommittee election, using SecurePoll, closes at 23:59 (UTC) 8 November. Three community members will be appointed to supervise use of the CheckUser and OverSight tools. If you wish to vote you must do so urgently. Here's how:

MBisanzTznkai;

  • Or go straight to your personal voting page: here.

For the Arbitration Committee, RlevseTalk 17:02, 7 November 2009 (UTC)

PseudoBot deletes redlinks?

I see that User:PseudoBot deleted a redlink here - http://en.wiki.x.io/w/index.php?title=February_17&diff=322604317&oldid=322604294 - with the edit summary "page linked does not exist".
This would seem to be a straightforward contradiction with the editing guideline WP:REDLINK.
Is it desirable for Pseudobot (or other bots) to do this, or am I missing something here?
(I'm definitely not discussing the desirability or otherwise of the specific article Christopher Colella.)
Thanks. -- Writtenonsand (talk) 16:48, 10 November 2009 (UTC)

Yes and no. Blanket removal of red links is a BAD thing. But the bot is correct in removing these specific red links. It has a specific scope and only works on specific date-related articles where red links should not exist. If you take a look at its contribs you will see it removes cruft while leaving the good things per related policy on those pages. βcommand 16:53, 10 November 2009 (UTC)
As Beta says; the bot is restricted to a very small set of pages with a tightly-defined format. In general, appropriate redlinks are no bad thing; the issue here is that individuals without an article are not candidates for the Births and Deaths section of the year pages. Note that the bot is removing the entry, not just de-linking it. Pseudomonas(talk) 16:56, 10 November 2009 (UTC)
I agree with the above, the bot is limited to a small scope and, within its scope, it is taking correct action. And, yeah, a bot that de-linked all redlinks would be a terrible idea. Useight (talk) 17:18, 10 November 2009 (UTC)
Thanks to all for response on this question. -- Writtenonsand (talk) 14:28, 11 November 2009 (UTC)

ArbCom election reminder: voting closes 14 December

Dear colleagues

This is a reminder that voting is open until 23:59 UTC next Monday 14 December to elect new members of the Arbitration Committee. It is an opportunity for all editors with at least 150 mainspace edits on or before 1 November 2009 to shape the composition of the peak judicial body on the English Wikipedia.

On behalf of the election coordinators. Tony (talk) 09:24, 8 December 2009 (UTC)

Harej running for BAG

This is due notification that I have been nominated to become a member of the Bot Approvals Group. My nomination is here. @harej 05:46, 29 December 2009 (UTC)

Proposed change to bot template

I've proposed a small change to the {{bot}} template which I think makes it more consistent: see here. Olaf Davis (talk) 10:12, 9 January 2010 (UTC)

List of admin bots?

Is there a list of bots with administrative rights? --Apoc2400 (talk) 15:59, 5 January 2010 (UTC)

Well I can give you a list of flagged bots with administrative rights, which I compiled by ctrl-f for "administrator" on this page. –xenotalk 16:06, 5 January 2010 (UTC)
User:AntiAbuseBot ‎(bot, administrator) (Created on 8 January 2009 at 07:43)
User:Cydebot ‎(bot, administrator) (Created on 7 April 2006 at 01:24)
User:DYKadminBot ‎(bot, administrator) (Created on 26 October 2008 at 04:04)
User:MPUploadBot ‎(bot, administrator) (Created on 11 October 2008 at 21:06)
User:Orphaned image deletion bot ‎(bot, administrator) (Created on 26 September 2009 at 08:30)
User:Orphaned talkpage deletion bot ‎(bot, administrator) (Created on 15 May 2009 at 11:26)
User:ProcseeBot ‎(bot, administrator) (Created on 19 January 2009 at 01:32)
User:Yet Another Redirect Cleanup Bot ‎(bot, administrator) (Created on 3 May 2009 at 01:55)

Thanks. --Apoc2400 (talk) 23:45, 5 January 2010 (UTC)

There is also Category:Wikipedia adminbots. Tim1357 (talk) 03:14, 11 February 2010 (UTC)
Let's take this opportunity for a moment of silence for RedirectCleanupBot. Useight (talk) 03:55, 11 February 2010 (UTC)

Tim1357 on the Bot Approvals Group

Hey Wikipedians, I am here to advertise my nomination to be on the Bot Approvals Group. Take a look if you have some time. Tim1357 (talk) 02:24, 16 January 2010 (UTC)

RFBAG

I am currently standing for BAG membership. Your input is appreciated. ⇌ Jake Wartenberg 02:53, 26 January 2010 (UTC)

BAG membership request

I have been nominated for Bot Approvals Group membership by MBisanz, and I am posting a notification here as encouraged by the bot policy. If you have time, please comment at Wikipedia:Bot Approvals Group/nominations/The Earwig. Thanks, — The Earwig @ 03:39, 3 February 2010 (UTC)

Application for BAG membership

I have accepted MBisanz's nomination of myself for membership of the Bot Approvals Group, and invite interested parties to participate in the discussion and voting. Josh Parris 03:02, 11 February 2010 (UTC)

Restrictions on specific tasks: interwiki -auto -force

Looking at Wikipedia:Bot owners' noticeboard#Reporting a bot and Wikipedia:Bot owners' noticeboard#New interwiki bots behaviour: problems, there seems to be consensus on discouraging the simultaneous use of the -auto and -force parameters with interwiki bots. What do you think? Should we add it to "Restrictions on specific tasks"? -- Basilicofresco (msg) 11:03, 6 March 2010 (UTC)

It's already there, but it could be restated with different phrasing; those affected by said rule will often not have English as a first language. Josh Parris 11:19, 6 March 2010 (UTC)
Those operators who do that should have their bots blocked on sight, for disruption. βcommand 13:53, 6 March 2010 (UTC)

Editing logged out

Recently the AIV helper bots have been logging out and editing via IP accounts. This has been causing quite a bit of confusion (repeated posts to noticeboards etc.), and runs the risk of the IP addresses being blocked as unapproved bots. I'm not aware that any bots running on the toolserver have edited logged out, but I think it's against toolserver policy (and would be much more serious, since we don't want toolserver addresses blocked). I propose we make it explicitly clear in the bot policy that bots may not edit while logged out, and any that are known to do so should have an assert function added. Best, - Kingpin13 (talk) 14:55, 9 March 2010 (UTC)

Support that, seems common sense. — Martin (MSGJ · talk) 14:57, 9 March 2010 (UTC)
Support from me too. SGGH ping! 15:20, 9 March 2010 (UTC)
More support. Josh Parris 16:18, 9 March 2010 (UTC)

  Done. No opposition, and is in line with current policies. - Kingpin13 (talk) 13:24, 16 March 2010 (UTC)

The Toolserver rules require that bots edit logged in. However, sometimes flukes happen, and personally I'd rather see a page updated with an IP than have the edit or action rejected altogether. --MZMcBride (talk) 17:01, 20 March 2010 (UTC)
Personally I think the idea of assert is to get the bot to log in again (when it realises it's been logged out), and then retry the edit, that way we get the page updated, and it's not done from an IP address - Kingpin13 (talk) 19:05, 27 March 2010 (UTC)
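The "assert, log in again, retry" behaviour described above might look roughly like this. It is a sketch under assumptions: `edit_with_assert`, `post`, and `login` are hypothetical names, token handling is assumed to live inside `post`, and the `assert` parameter and the `assertbotfailed` / `assertuserfailed` error codes are those of the current MediaWiki action API, so check them against the version your wiki runs.

```python
def edit_with_assert(post, login, title, text, summary, max_retries=1):
    """Try an edit that the server should refuse if the session has been logged out.

    post(params) sends an API request and returns the decoded JSON response.
    login() re-establishes the bot's session. Both are supplied by the caller.
    """
    params = {
        "action": "edit",
        "title": title,
        "text": text,
        "summary": summary,
        "assert": "bot",  # server rejects the edit unless the account has the bot right
        "format": "json",
    }
    for attempt in range(max_retries + 1):
        response = post(params)
        error_code = response.get("error", {}).get("code")
        if error_code not in ("assertbotfailed", "assertuserfailed"):
            return response  # success, or an unrelated error for the caller to handle
        if attempt < max_retries:
            login()  # session was lost: log in again, then retry the same edit
    raise RuntimeError("still logged out after re-login; refusing to edit as an IP")


# Stand-ins simulating a bot whose session has expired.
state = {"logged_in": False, "logins": 0}


def fake_post(params):
    if not state["logged_in"]:
        return {"error": {"code": "assertbotfailed"}}
    return {"edit": {"result": "Success"}}


def fake_login():
    state["logged_in"] = True
    state["logins"] += 1


result = edit_with_assert(fake_post, fake_login, "Example", "text", "test edit")
print(result, state["logins"])
```

With this pattern the page still gets updated, as MZMcBride would prefer, but never from an IP address.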

Proposed minor change

For the sake of clarity, in the following sentence:

"Note that higher speed or semi-automated processes may effectively be considered bots in some cases. If in doubt, check."

I recommend adding something such as "even if performed by a human editor". CopaceticThought (talk) 19:05, 20 March 2010 (UTC)

I presume you mean "even if performed from an account used by a human editor". Personally I think this is pretty obvious already, but don't mind adding this to make it clear. - Kingpin13 (talk) 19:45, 20 March 2010 (UTC)

question

Hi, if a bot is not adhering to the requirements and the owner says they don't care (the owner is using it to add a request to lots of talk pages quickly, just as a time saver, not related to the bots usual activity), I'm unclear what the next step is? 018 (talk) 16:23, 14 April 2010 (UTC)

In general the next step would be to get some input at Wikipedia:Bot owners' noticeboard or WP:ANI if you think it requires administrative action. In this case I don't think Jarry is saying he doesn't care, as such, anyway I'll go have a chat with him about it. However, if you're going to put yourself in the category of WikiProject members you should be aware that you may be subject to bot notifications about said project, unless you take the time to opt-out. - Kingpin13 (talk) 16:37, 14 April 2010 (UTC)
Maybe this policy just needs updating. Something about once you have a bot approved, you have Carte blanche to automate edits if you think that they are generally acceptable. 018 (talk) 16:45, 14 April 2010 (UTC)
.... Anyway, I've messaged Jarry - Kingpin13 (talk) 16:51, 14 April 2010 (UTC)
Seems Jarry is happy to get approval or use a different bot if he needs to notify project members again in the future. - Kingpin13 (talk) 17:13, 14 April 2010 (UTC)

Application for BAG membership (Xeno)

I have accepted Kingpin13's nomination for membership in the Bot Approvals Group, and per the instructions invite interested parties to participate in the discussion and voting. Thank you, –xenotalk 19:24, 17 April 2010 (UTC)

For some reason, we keep having bots editing while logged out. See the current ANI discussions, for example. One of the toolserver IPs was blocked for a day because of this. Given the large number of admins, most of whom aren't digging bots much, it would be simpler to deal with this type of issue if every bot prefixed its edits with its Wikipedia account name. With this approach, if a bot is editing while logged out, at least it's trivial to find the misbehaving bot and notify its owner. I see above that the toolserver policy forbids bots from editing while logged out, but nevertheless that happens often enough, and the toolserver itself is not governed by en.wp policies, and enforcement of their own policies seems rather lax. Pcap ping 07:30, 17 May 2010 (UTC)

Well, only a small fraction of the bots edit logged out, and it's really only difficult to catch when they are clones. So I'm not sure I would support a policy affecting all the bots. Plus, it would be no more difficult to implement than for the operators to just fix the logging out problem. The problem is hardly any of them are active. --Kingpin13 (talk) 07:46, 17 May 2010 (UTC)

Proposed configuration tip

Bots which deal with replacing internal links (such as double redirects) should check to verify that their source information has been stable for at least 48 hours.

In light of this discussion, I propose that we implement the above text (or some variant thereof) into the configuration tips section.   — C M B J   23:13, 13 June 2010 (UTC)

Should it be more generic than just "replacing internal links"? Maybe something like this:
Bots performing operations that are difficult to find for reversion later, such as bypassing redirects, should verify that their source information has been stable for an appropriate period of time. The bot should also regularly publish logs of its actions to a subpage in their userspace with sufficient information for the relevant changes to be quickly found (e.g. "* [http://en.wiki.x.io/w/index.php?diff=prev&oldid=12345 Example]: bypassed redirects "foo" → "bar", "baz" → "baz (disambiguation)"").
Although, I note in this case it wouldn't have made much difference for DASHBot, as the problematic redirect had been in place for almost 2 months. Had Xqbot done so, it would have been safe. Anomie 23:29, 13 June 2010 (UTC)
As with most blatant vandalism, the source information that caused DASHBot's erroneous edit(s) was reverted within 24 hours—so it is clear that this policy will save us a lot of headaches in the future. I do like the idea of making the clause more generic. I also think that we should include some sort of conditional waiver, so as to not bureaucratically hinder future applicants from creating bots designed to combat vandalism in real time.   — C M B J   01:08, 14 June 2010 (UTC)
Feel free to figure out a good conditional waiver, I used the RFC 2119 definition of "should" without thinking about it. Anomie 01:28, 14 June 2010 (UTC)
While this suggestion has merit, Anomie is correct in saying it wouldn't have applied to DASHBot in this case. DASHBot acted based on Associative thinking which was stable for 2 months. The "culprit" was Xqbot which changed Associative thinking based on the vandalism to Magical thinking. Xqbot would have had to have this implemented to avoid the problem. As such, I'm notifying Xqt of this discussion. -- JLaTondre (talk) 02:44, 14 June 2010 (UTC)
Thanks for notifying me. In the past I have had some requests concerning that bot script because of vandalism problems. The first change was made in pyrev:7499, which displays the target page with the bot's comment. Then I found some vandalised pages that used a redirect link inserted with the toolbar. This was fixed in pyrev:7875. Now I am implementing a kind of vandalism check for the case where a redirect is placed on a regular page, which makes a mess when the bot tries to fix redirects pointing to these nonsensical redirects. The fix is coming soon, and I am considering leaving a message on the talk page (or I guess there is a project page for tagging potentially vandalised pages, isn't there?) for cases that are definitely wrong. For the remaining case, when a page becomes a redirect other than by a move and the editor is an IP, I would also presume vandalism and leave it alone (maybe for 48 hours, and tag it with a notice if you wish). -Xqt (talk) 16:55, 15 June 2010 (UTC)
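The stability check and the log format suggested in this thread can be illustrated with a minimal Python sketch. It is not taken from any bot mentioned here; the function names, the 48-hour default, and the idea of passing in the time of the last change are assumptions made for illustration, and the diff URL is copied from the example log entry above. As noted in the discussion, a check like this only helps the bot that reads the recently changed redirect; it would not have caught a target that had already been stable for two months.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical threshold, taken from the 48-hour figure proposed above.
MIN_STABLE_AGE = timedelta(hours=48)

# URL pattern copied from the example log entry in the discussion.
DIFF_URL = "http://en.wiki.x.io/w/index.php?diff=prev&oldid=%d"


def is_stable(last_changed, now, min_age=MIN_STABLE_AGE):
    """Return True if the source page has gone unchanged for at least min_age."""
    return now - last_changed >= min_age


def log_line(oldid, page, bypassed):
    """Format one wikitext log entry in the style suggested above.

    bypassed is a list of (redirect, target) title pairs.
    """
    pairs = ", ".join('"%s" → "%s"' % (src, dst) for src, dst in bypassed)
    return "* [%s %s]: bypassed redirects %s" % (DIFF_URL % oldid, page, pairs)
```

A bot would call is_stable with the timestamp of the redirect's latest revision before acting on it, and append the result of log_line to its log subpage after each edit.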

Proposal to softblock Toolserver IP addresses

The following discussion is closed. Please do not modify it. Subsequent comments should be made in a new section. A summary of the conclusions reached follows.
Consensus to softblock Toolserver IPs reached. Though Dispenser makes some legitimate points, it seems arbitrary to restrict some servers but not others when editing logged-out is so strongly discouraged. The consensus here would suggest any such editing is undesirable. Bot/tool code is not static and can be updated to fit within the new parameters; equally, the block can be lifted should major problems make themselves known.
Could someone please send the relevant details to the Toolserver admins? I am away from home at the moment, making such things overly difficult. - Jarry1250 [Humorous? Discuss.] 09:57, 18 June 2010 (UTC)

Note: this developed as a reply to the above proposal, but it seems to have much more traction. Pcap ping 08:25, 18 May 2010 (UTC)
It's not only Toolserver policy, our Bot Policy also states "Bots must only edit while logged into their account, bots which often attempt to edit while logged out should use AssertEdit, or a similar function" and "A block may also be issued if a bot process operates without being logged in to an account, or is logged in to an account other than its own." AFAIK, the only reason not to block these IPs is that some tools that do no editing have no account and will malfunction if they detect a block. See also prior discussion at Wikipedia:Village pump (technical)/Archive 67#Bot editing while logged out and Wikipedia:Village pump (technical)/Archive 68#Toolserver IP editing logged-out again.
IMO, the proper solution is:
  1. Change this "consensus" mentioned at ANI from not blocking toolserver IPs to permanently soft-blocking toolserver IPs, to be enforced two weeks after step #2 is completed.
  2. Ask the appropriate toolserver people to send an announcement to toolserver-announce stating that, due to bots continually violating both toolserver and enwiki policy by editing while logged out, enwiki will be soft-blocking the toolserver IPs in two weeks, and that people running bots or tools that will be adversely affected by this change should fix their code immediately. They could also add that bots/tools only not affected because it's enwiki should be fixed anyway, in case other wikis decide to follow suit.
  3. Two weeks after the announcement is sent (or the toolserver people refuse to send the announcement), issue the blocks.
Of course, the two week timeframe is arbitrary, and the whole deal is just to remove objections based on breaking those useful but poorly-coded tools. I personally would be in favor of blocking the IPs immediately and letting those tools malfunction until fixed. Anomie 12:37, 17 May 2010 (UTC)
Seconded. Josh Parris 01:40, 18 May 2010 (UTC)
I support Anomie's proposal CrimsonBlue (talk) 01:45, 18 May 2010 (UTC)
I don't really see why you need to warn people in advance, but there should be no issue with soft-blocking the Toolserver IPs. The one lingering issue is that pywikipedia throws a hissy fit, but that's just pywikipedia stupidity that can (and should) be ignored. They'll adapt. --MZMcBride (talk) 01:47, 18 May 2010 (UTC)
I believe it just throws a warning to not edit and forbids editing - read functions work fine. Josh Parris 01:59, 18 May 2010 (UTC)
I'd just hate for it to not go through because people object to blocking with no warning. I'd be happy if the consensus ends up to not bother with giving a warning. Anomie 04:22, 18 May 2010 (UTC)

I support this as well. The big issue that keeps recurring is with the AIV helperbots - but a fix has been available for those for some time now, and the logged-out edits keep happening. It obviously won't really get fixed unless we force the issue. I also support the two-week timeframe, even though this is existing policy - we haven't been enforcing it, so there would be complaints if we just sprung it on botops. Gavia immer (talk) 04:36, 18 May 2010 (UTC)

  • I'm also fine with blocking anon toolserver editing as a more drastic measure for enforcing our and toolserver's bot policies. At least the AIV and ANI helper bots need to be fixed; both have been observed misbehaving recently. Two weeks lead time for enforcing something that's been policy for quite a while is reasonable. After looking at the various technical discussion pages, fixing the bots to "assert" their edits, i.e. check if they are logged in, is nowadays not a lot more difficult than fixing them to say who they are in the edit summary. Since some change is going to be needed anyway, at least this is going to be "the right thing", as opposed to a leniency hack. Pcap ping 07:56, 18 May 2010 (UTC)

Note: As this proposal seems to be gaining steam, I went ahead and advertised this discussion at Wikipedia:Administrators' noticeboard/Incidents#User:91.198.174.202 (toolserver IP / AIV bot blocked), Wikipedia talk:Blocking IP addresses#Toolserver IPs, and MediaWiki talk:Blockiptext#Toolserver IPs. Anomie 04:42, 18 May 2010 (UTC)

I definitely support this; it's a very sensible way of preventing further mistakes and applying clue to bot operators who CBA to fix their tools. Happymelon 13:27, 18 May 2010 (UTC)

Probably a dumb question, but is there any reason why a non-bot user would be editing from these addresses? Jafeluv (talk) 13:41, 18 May 2010 (UTC)

I agree that bot owners who cba to fix the code for the new login should fix it or get soft-blocked. It's not like it's a huge code change. This, however, will affect tool assisted edits from non-logged in users. BUT, as per above question — why would anyone (bot or tool-assisted) edit from the toolserver while logged out? The good faith IP contributors don't use toolservers. So I see no reason why these shouldn't get blocked just to enforce logging in.  Hellknowz  ▎talk  13:50, 18 May 2010 (UTC)

I had to look up "cba" in the urban dictionary. Pcap ping 16:40, 18 May 2010 (UTC)
I'm confused as to how will it affect tool assisted edits? What cases are you thinking of. Q T C 06:17, 19 May 2010 (UTC)
I am assuming a tool run from toolserver would not have transparent IP and would be seen from WP servers as the toolserver IP. Even if the user is logged in the tool would need proper login and edit token cookies. I suppose the tools on toolserver are semi-automated assisted and used by their authors. Then again I don't know any assisted editing tool on toolserver that I would be able to use. I am assuming a lot because I do not know the exact details. In any case, I am guessing this would affect very few tools.  Hellknowz  ▎talk  11:16, 19 May 2010 (UTC)
If there were a way for a toolserver tool to legitimately proxy edits (AFAIK there isn't), it would have to supply XFF data (so the edit would be seen as coming from the user's IP rather than the toolserver's) to avoid being blocked as an anonymizing proxy, so the issue there is moot. Existing tools like Checklinks that allow you to make changes under your own account actually generate a "fake" edit form that is submitted in the manner of the standard Preview button by your local browser and therefore comes from your IP address rather than the toolserver's. Other tools just generate an [edit] link and let you perform the necessary wikitext changes manually, or perform the edits under a bot account dedicated to the tool. A toolserver tool that integrates with Wikipedia as a user script (that you put in your skin.js) would be fetching data from the toolserver via AJAX, but the actual edits would again be coming from your IP address (either by AJAX submissions to the API or by manipulating the text in the standard Wikipedia edit form) rather than the toolserver's. Anomie 17:03, 19 May 2010 (UTC)
The two tools I've used from there will not be affected. WP:REFLINKS just creates an edit link as you said, and User:Citation bot runs as itself from what I recall. Pcap ping 17:12, 19 May 2010 (UTC)
I had to patch pywikipedia for Checklinks (which includes a read-only bot component) to remove the block detection on the previous block. Those changes were gone when I switched over to the rewrite branch. — Dispenser 14:07, 22 May 2010 (UTC)

So far, this seems to be well-supported above. I've just advertised it at WP:VPR and WP:VPP, just in case people who don't watch ANI or IP-blocking-related pages want to comment. Anomie 15:14, 28 May 2010 (UTC)

Some probably dumb questions:

  • How big of a problem is this?
  • Which tools are doing it? Can we not tell which tools are doing it from their actions?
  • If so have we approached the owners directly? Or tried taking the tools down ourselves (presumably via the toolserver admins)?
  • Would a block of the offending owners, rather than the (theoretically innocent) toolserver, not be more appropriate?
  • How are tools editing anonymously from toolserver? I thought anonymous editing was not possible through the API. Are these screen-scraper type tools?
  • Is this a proposal for a permanent soft-block of toolserver? Or just until these bot owners no-longer "CBA"?
  • Is this a problem posed by tools operating from other servers (aside from toolserver)? If so, how are we addressing those owners? --RA (talk) 15:35, 28 May 2010 (UTC)
To answer your questions:
  • The problem is basically "some useful toolserver bot starts making edits while logged out, an admin sees an IP address making bot edits and blocks it, drama ensues".
  • We can (usually) tell which bot it is, AFAIK.
  • I'm sure the operators have been made aware of the problem, but it keeps happening anyway.
  • A block of the offending owners would do little good and would cause excessive drama, and would violate WP:BLOCK. Similarly, a block of the offending bot account would do little good because the problem is that the bot is editing without being logged into its account.
  • Anonymous editing is possible through the API.
  • Yes, the proposal is for a permanent soft-block of the toolserver. There is no reason any IP edit should be coming from the toolserver, so there is no reason to ever bother unblocking it. As it is a soft block, all toolserver tools will be able to continue to edit without issue as long as they are logged in, and read-only tools will also not be affected (as long as they're not being stupid by refusing to edit if their IP is blocked despite their not ever editing; the two-week notice in the proposal is to give these operators time to fix their broken read-only tools).
  • The toolserver is currently a special case, as it hosts a large number of tools and the current consensus (which we are proposing to change here) is to not block its IP. I suspect that a bot running from an IP address used only by itself (and possibly its operator) would be soft-blocked without the drama blocking the toolserver causes. If this situation occurs with some other server, the issue can be discussed in relation to that server at that time.
HTH. Anomie 17:44, 28 May 2010 (UTC)
Thanks.
"some useful toolserver bot starts making edits while logged out, an admin sees an IP address making bot edits and blocks it, drama ensues" - Eh ... don't you want to block the IP address? Won't drama ensue?
It seems like a technological solution to a human problem. Why can't a user that breaches policy be blocked? Or why can't a tool breaching toolserver's rules be taken down or its operator's account suspended/closed by toolserver?
I'm not necessarily anti the proposal (and like Tisane points out, it might be a good safety catch for operators) but (a) if someone's bot is running awry and they persist in running it anyway, that's not on; and (b) I just want to thrash out that there is no direct way of dealing with the problem bots/operators.
(BTW - just tried editing via the API without logging in. Didn't have an edit token so it failed. Out of curiosity, how are these guys getting around that?) --RA (talk) 19:30, 28 May 2010 (UTC)
Drama ensues, because (afaik) no guideline currently says "block toolserver tools/bots that edit logged out". Although bot guideline is specific about being logged in.
The author and programmer of the tool may not be its user. So blocking author will do nothing for tool and end-users. Suspending tool-server account is very slow because it needs WP to contact TS who then need to do it. A soft-block is so much faster and more efficient.
You first retrieve the edit token and then use it in your next edit. It's just that it doesn't change each time if you are logged in. If the tool gets the token every time it would not even know it's logged out. Then again, that's what assert edit is for. Hellknowz  ▎talk  19:43, 28 May 2010 (UTC)
Ah, I had assert edit on by default. IP edit went through just like that. Thanks.
Well, as it is we're looking at having a discussion and then sitting on our hands for two weeks. In the mean time, looking through the contrib, it looks like it's essentially only one process on one IP that is the trouble maker. I may be wrong but between now and this time last year it looks like it was essentially only whatever posts to Wikipedia:Administrator intervention against vandalism and to Wikipedia:Usernames for administrator attention (which looks to me like the same process). There were occasional slips by others but it looks like that one process is the one big offender. I may be wrong. Correct me if I am.
The big gun approach of soft blocking toolserver would certainly work - but if it's only one process that is the only genuine offender then blocking everyone, including those that make the occasional genuine mistake, seems out of proportion. This process has been posting as an IP for over a year now so there's hardly a rush on. And since we're planning on doing nothing about it for two weeks anyway surely getting in touch with toolserver or account holder would be the quickest approach.
Anyway, that's my 2¢ - and it being said, I still wouldn't oppose a soft-block on toolserver. I just see it as a needless act when there is a far more direct approach available. --RA (talk) 21:36, 28 May 2010 (UTC)
I think the reason for this is because it has gone on for so long without any more direct approaches fixing it, and this is an easy fix that will work without fail and anyone affected by "collateral damage" is doing something wrong anyway. To apply a metaphor, if you have a door that no one may ever use (which excludes fire exits, people may use them in case of fire) but people keep using it anyway, why not just lock the door? Anomie 22:01, 28 May 2010 (UTC)
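The token-then-edit flow and the assert check discussed in this exchange can be sketched in Python as follows. This is only an illustration under stated assumptions: the helper names are hypothetical, and the `assert` parameter values and the error codes are those of the present-day MediaWiki API, which may differ from the AssertEdit extension as deployed in 2010. The sketch only builds and inspects data; it performs no network requests.

```python
# Anonymous sessions receive this fixed placeholder instead of a real edit
# token, which is why a tool that re-fetches its token on every edit may not
# notice that it has been logged out.
ANON_TOKEN = "+\\"

# Error codes reported when an assertion fails (assumed from the current API).
ASSERT_FAILURES = {"assertuserfailed", "assertbotfailed"}


def is_anonymous_token(token):
    """Return True if the token is the one handed out to logged-out clients."""
    return token == ANON_TOKEN


def build_edit_params(title, text, summary, token, require="bot"):
    """Build POST parameters for an API edit that is refused when logged out.

    require is "user" (must be logged in) or "bot" (must have the bot flag).
    """
    if require not in ("user", "bot"):
        raise ValueError("require must be 'user' or 'bot'")
    return {
        "action": "edit",
        "format": "json",
        "title": title,
        "text": text,
        "summary": summary,
        "token": token,
        "assert": require,
    }


def logged_out(response):
    """Return True if a decoded JSON API response reports a failed assertion."""
    return response.get("error", {}).get("code") in ASSERT_FAILURES
```

With the assertion in place, the server refuses the edit instead of saving it under the IP address, so the bot can log in again and retry.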

I think it's a good idea; if I were a bot operator, I would want this in place just to avoid embarrassment if I made a coding mistake that caused the bot to fail to login. Tisane (talk) 17:09, 28 May 2010 (UTC)

Bots are trusted to make uncontroversial edits very fast, so having an IP suddenly do so will trigger watchlists, rchanges, a lot of attention and create a general mess. It's not even so much the embarrassment for operator as saving trouble for others. Hellknowz  ▎talk  19:45, 28 May 2010 (UTC)
FYI, I've sent toolserver-l and pywikipedia-l an email about this thread. It isn't in the archives yet...  — mikelifeguard@enwiki:~$  04:53, 31 May 2010 (UTC)

Could we simply limit it to the login servers? Bots aren't supposed to be run on any other servers, and blocking the web servers screws up some of my web tools. Requiring login for web tools requires an extra 2 requests per run and increases the attack area on the tool. — Dispenser 20:26, 7 June 2010 (UTC)

Which are the "login servers", and which are the "web servers"? In what way would blocking the web servers screw up your tools? Can your web tools not persist the login cookies between sessions? Anomie 01:25, 8 June 2010 (UTC)
The login servers are currently nightshade (91.198.174.201) and willow (91.198.174.202). Creating an edit form requires all the fields present on the edit page to work properly. As an example, if wpStarttime or wpEdittime are not present it will break deletion and edit conflict detection. Using a login session, persistent or not, increases the security attack area, not something desirable for a web facing tool. — Dispenser 14:08, 8 June 2010 (UTC)
Thanks; I now also see the list of web servers. I do see that wolfsbane (91.198.174.210) has more than a few edits, even though bots aren't supposed to be run there. Even ortelius (91.198.174.211) has an edit since it was brought online mid-May.[1]
As for creating an edit form, you do not need a login session or even the ability to edit to determine the proper values for wpStarttime and wpEdittime. wpStarttime is just the current timestamp when you load the revision data to start editing, and wpEdittime is the timestamp of the latest revision when you start editing (or the same as wpStarttime if the page does not exist). I suspect you can even get the starttimestamp from the API's prop=info&intoken=edit while blocked, as a quick glance through the code doesn't reveal anywhere in that code path that checks block status; it certainly doesn't check page protection status, as shown by this query (for us non-admins). The other fields are even easier, except of course for wpEditToken which you can never get "right" anyway.
I can only guess that you must be screen-scraping the edit form for a block to be a problem, but that has been deprecated in favor of the API for some time now and AFAIK has never been guaranteed to keep working. BTW, don't your tools already have those problems now when working on semi-protected pages? I see Checklinks gets wpEdittime wrong for A, for example. Anomie 16:28, 8 June 2010 (UTC)
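The description of wpStarttime and wpEdittime given above can be turned into a short Python sketch. The function names are hypothetical, and the 14-digit YYYYMMDDHHMMSS format is an assumption based on how MediaWiki usually renders timestamps in the edit form; the sketch is not taken from Checklinks or any other tool named here.

```python
from datetime import datetime, timezone


def mw_timestamp(dt):
    """Format a timezone-aware datetime as a 14-digit UTC timestamp."""
    return dt.astimezone(timezone.utc).strftime("%Y%m%d%H%M%S")


def edit_form_times(loaded_at, latest_revision_at=None):
    """Return the wpStarttime and wpEdittime values for a generated edit form.

    loaded_at: when the revision data was loaded to start editing.
    latest_revision_at: timestamp of the latest revision, or None if the
    page does not exist, in which case wpEdittime equals wpStarttime.
    """
    start = mw_timestamp(loaded_at)
    if latest_revision_at is None:
        edit = start
    else:
        edit = mw_timestamp(latest_revision_at)
    return {"wpStarttime": start, "wpEdittime": edit}
```

Neither value depends on the tool being logged in or unblocked, which is the point made above: only the revision timestamp and the current time are needed.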
The discussion above is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.

I spoke to Dab on #wikimedia-toolserver, who said he would send the announcement in about 2 hours. Anomie 16:15, 18 June 2010 (UTC)

Thanks Anomie (also for fixing the discussion archive, the timer on my internet café was running perilously close), looks like DaB kept his word and sent it out (thank you DaB). So I guess the 14 days starts now. - Jarry1250 [Humorous? Discuss.] 21:26, 18 June 2010 (UTC)

Just a little poke here - it appears to me that this block hasn't been carried out yet, and it's been sufficient time, so can someone with the proper bits implement the block? Gavia immer (talk) 17:55, 7 July 2010 (UTC)

Will the blocking admin please also edit MediaWiki:Blockiptext to remove the toolserver from the list of sensitive IPs? Anomie 18:08, 7 July 2010 (UTC)
Actually, we're both too slow. Jarry1250 did it yesterday. Anomie 19:06, 7 July 2010 (UTC)
That's what I get for checking the individual IPs and not looking for a rangeblock. Such is life. Thanks, Jarry. Gavia immer (talk) 19:10, 7 July 2010 (UTC)

"Should"?

The policy uses the word "should" quite a lot (47 times in the current version). I interpreted the majority of these (39; the others seem to be figures of speech using the word in other ways) along the lines of RFC 2119: basically a requirement, but if you have a really good reason not to (or your bot predates the requirement) then the rule can be bent. However, I've seen others wanting to treat these as suggestions to be ignored on a whim. Opinions? Anomie 03:40, 15 July 2010 (UTC)

Policy on gazetteer content

You are invited to join the discussion at Wikipedia Talk:What Wikipedia is not#How is Wikipedia a gazetteer? How is Wikipedia not a gazetteer?. patsw (talk) 12:18, 19 July 2010 (UTC) (Using {{Please see}})

User scripts / user agent

My understanding of the policy on this page, specifically this sentence:

The majority of user scripts are intended to merely improve, enhance, or personalize the existing MediaWiki interface, or to simplify access to commonly used functions for editors. Scripts of this kind do not normally require BAG approval.

is that I do not need any specific approval for what I want to do: A simple Ruby script to pull some information off certain pages and compile it for my own use. However, because I wanted wiki markup, I did try to access the "action=edit" page (but not post back to it) and got a warning about not having my User-Agent set. I'll fix the issue about the User-Agent, but I want to make sure I'm not violating any policy. - Regards, PhilipR (talk) 02:31, 2 August 2010 (UTC)

If your script doesn't make any edits, doesn't request so many pages that it puts an undue load on the servers, and doesn't do anything else disruptive, probably no one will even notice. For details on the user-agent thing, see meta:User-Agent policy. Also, you should use the API or action=raw rather than trying to screen-scrape the edit form. Anomie 03:11, 2 August 2010 (UTC)
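As a rough illustration of the advice above, here is a minimal sketch of a read-only request that uses action=raw and sets a descriptive User-Agent. It is written in Python rather than Ruby, and the user-agent string, the function name, and the base URL are placeholders assumed for the example rather than values taken from the policy. The sketch only constructs the request and sends nothing until it is passed to urlopen.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical identification string; a real script should name itself and
# give a way to contact its operator.
USER_AGENT = "ExampleInfoScript/0.1 (User:Example; example@example.org)"


def raw_wikitext_request(title, base="https://en.wikipedia.org/w/index.php"):
    """Build a read-only request for a page's wikitext via action=raw."""
    query = urlencode({"title": title, "action": "raw"})
    return Request(base + "?" + query, headers={"User-Agent": USER_AGENT})
```

The wikitext could then be fetched with urllib.request.urlopen(raw_wikitext_request("Some page")).read().decode("utf-8"), keeping the request rate low as described above.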

BAG nomination

Hello everyone! I have been nominated for the bot approval group and would like to invite you all to participate in the discussion at Wikipedia:Bot Approvals Group/nominations/EdoDodo. Thanks. - EdoDodo talk 02:46, 17 August 2010 (UTC)

Clarification regarding high-speed human editing

Original discussion

Proposal

Manual high-speed mass article creation/modification using a single manual technique (such as, pasting boilerplate templates) should be treated as semi-automated edits.

Given the heated issue, I suggest a simple and direct proposal. If we have not fully established what "semi-automated" stands for, then perhaps we can decide if high speed use of templates falls under semi-automation. Alternate solutions and proposed rewording are welcome.

  • Support. It requires scrutiny and care of operator/editor in making sure the new material is accurate. Judging by the above contributions, even manual human editing is not beyond making mistakes. I see no principal difference between using a script and filling fields of a template — the result is the same, if only editorially cumbersome to produce. In my opinion, any almost equal edits more frequently than 5-7 per minute for more than a couple of minutes is on par with semi-automation, though not "semi-automation" per se. —  HELLKNOWZ  ▎TALK 17:18, 22 September 2010 (UTC)
While it seems a bit like instruction creep, I suppose the clarification is needed to prevent wikilawyering. –xenotalk 17:19, 22 September 2010 (UTC)
Ditto. This is ridiculous, but apparently necessary. Anomie 19:09, 22 September 2010 (UTC)

Proposal 2

Human editors are expected to pay attention to the edits they make, and ensure that they don't sacrifice quality in the pursuit of speed or quantity. Bot-like edits (i.e. high-speed or large-scale edits that make errors an attentive human would not) are liable to be judged on the same terms as an automated or semi-automated bot, even if the methods actually used are ostensibly manual.
We could use the shortcut WP:Turing test ;)

The dilemma above is that it is not clear whether the restriction on semi-automated article creation applies to the "manual" bot-like template substitution this user was doing. The first proposal tries to cut the knot by expanding the restriction to manual large-scale article creation. This comes at it from the other direction, and basically says we can apply WP:DUCK instead of having to argue over the method used to make apparently automated edits.

I don't have specific examples, but I do recall seeing a few cases in the past at WP:ANI and elsewhere that effectively did this, with the resolution that the offender must be more careful to not make stupid mistakes. Anomie 19:46, 22 September 2010 (UTC)

Yep, I prefer this to the above which seems altogether too specific. –xenotalk 19:50, 22 September 2010 (UTC)
Sounds reasonable. The above proposal is also good, but a bit too specific. - EdoDodo talk 19:52, 22 September 2010 (UTC)
Well-phrased, support.—  HELLKNOWZ  ▎TALK 20:06, 22 September 2010 (UTC)
Support. d'oh! talk 16:45, 23 September 2010 (UTC)

This is nothing but Wikilawyering. Bad policy drives contributors away from the project by taking away freedoms to edit. Starzynka (talk) 17:49, 23 September 2010 (UTC)

Of course, there would be no need for this clarification to the policy if you exercised due diligence with your boilerplate articles. You remain free to edit, but you are not free to create countless articles that require cleanup or A10 candidates that give no more information than was already contained in the list of settlements. –xenotalk 17:54, 23 September 2010 (UTC)

I see. Starzynka (talk) 18:01, 23 September 2010 (UTC)

I have to agree that even with all good faith and intentions, there needs to be discretion exercised. I agree with you on excessive fanboy episode lists and fictional characters filling an undue part of the encyclopaedia. And I support your work on adding material of actual real locations. But as long as it is approved by the community, as biased as it may sometimes be. —  HELLKNOWZ  ▎TALK 18:30, 23 September 2010 (UTC)

This is a really bad proposal. Firstly, good-faith improvement of the encyclopedia is not and should not be prohibited, although disruption can be prohibited. Secondly, using the bot policy to prohibit normal human editing is in error regardless. Gavia immer (talk) 20:17, 23 September 2010 (UTC)

This isn't about "normal" human editing. This is about "being so careless or inattentive that people could easily believe you're actually a bot" human editing. Anomie 20:33, 23 September 2010 (UTC)
I have to concur that this is not about "normal human editing", as the method, speed, and accuracy are all differing. This is about when editors cannot tell apart human editing from (semi-)automated editing. Furthermore, this does not prohibit good-faith improvements; if anything, this ensures that no harm comes from changes en masse. —  HELLKNOWZ  ▎TALK 20:38, 23 September 2010 (UTC)
Thing is, I have actually been threatened with being blocked before, purely because I was editing at high speed and an editor assumed that there must surely be something wrong with that. The threat didn't go anywhere, because such edits don't violate policy - but you can understand how I might think that it is a bad idea to change policy in order to make some third party's ignorance a sufficient reason for blocking. Gavia immer (talk) 20:58, 23 September 2010 (UTC)
This is not about giving more reasons to block, this is about giving clear references to discussion about being allowed to do what you are doing. If anyone threatens to block you, you can always point to your task's discussion and show that no objections were raised by the community. This is both for editor's and other's benefit. Editor benefits by having his tasks approved, so no objections are raised. And others benefit by having no doubts about the validity of such tasks or actions. This is what (I hope) the proposal is about. —  HELLKNOWZ  ▎TALK 21:11, 23 September 2010 (UTC)
The editor would benefit a lot more by not needing permission to improve the encyclopedia. Gavia immer (talk) 23:27, 23 September 2010 (UTC)
How do "large-scale edits that make errors" improve the encyclopedia? Mr.Z-man 03:03, 24 September 2010 (UTC)
They don't, and remember that I have no problem with preventing actual disruption - such as large numbers of erroneous edits. I am worried that adding more policy encrustation will make editors that much more likely to report or block those making large numbers of constructive edits, simply because they don't see what the editor is doing and "there must be something wrong here". Also, bear in mind that I'm responding to both proposals here; the one above, which luckily has not gained much traction, would subject editors to the bot policy simply for being efficient. I am aware of how bad high-speed screwups can be; I just don't see these proposals as the answer. Gavia immer (talk) 03:41, 24 September 2010 (UTC)

"large-scale edits that make errors". False. My stubs don't have errors. I just need to remove the bracket for some dabbed article names from the intro and infoboxes after creation that is all and if possible add some initial data. This is not a problem. Starzynka (talk) 10:41, 24 September 2010 (UTC)

My concern here is not just the MOS errors, but a lack of discussion with the WikiProjects and the community before creating the stubs on a large scale. The community involvement will avoid a major cleanup and waste of effort at a later time. Ganeshk (talk) 12:25, 24 September 2010 (UTC)
It certainly is a problem if you're going around creating large numbers of articles that need further editing to remove brackets and add actual data. Anomie 15:41, 24 September 2010 (UTC)
Looking in some of your older edits, many of the stubs you created are unsourced, that's a little more important than the subst issue. I certainly hope they don't have any factual errors, considering most of them contain only 1 fact. Most of them are probably orphans as well. In any case, are you actually planning on fixing them? The point here is that these types of errors, while not severe in this case, would not have been made by an editor paying attention nor a bot that had been properly approved. Mr.Z-man 21:44, 24 September 2010 (UTC)

Proposal 3

For the purpose of dispute resolution when there is uncertainty whether a series of edits is done by an automated process, or a user editing manually at a bot-like speed, they may be treated as automated edits.

  • The above was cribbed liberally from Wikipedia:Requests_for_arbitration/Regarding_Ted_Kennedy#Sockpuppets which is the basis for WP:MEAT. The wording is concise, and captures the problem: It's not whether or not a bot or a human makes the edits that is the problem, it is the fact that very rapid, unchecked edits can cause a huge mess if they are not done properly. It really doesn't matter if a bot does the edits, or a person does them manually but so fast that it is impossible to actually check what they are doing, it's inattentive editing that is the problem. --Jayron32 05:16, 25 September 2010 (UTC)
    It's not only speed that can be at issue. The relentless performance of edits exhibiting careless errors an attentive human would not make is an issue whether it's at a high or low speed. Anomie 17:23, 25 September 2010 (UTC)
    I support adding this to the wording of Proposal 2. However, "at a bot-like speed" should be omitted. Also, it should be "treated as semi-automated edits" as this is not true automation (i.e. unsupervised bot). —  HELLKNOWZ  ▎TALK 17:30, 25 September 2010 (UTC)

Notification when account is blocked?

Should the policy include directions on notifying anyone besides the blocked user if an account is blocked for operating as an unapproved bot, or an approved bot operating in unapproved ways? BAG, perhaps? Are there such procedures? --Bsherr (talk) 17:41, 4 October 2010 (UTC)

I don't think it's necessary to always drop a line, but if there were some reason you thought one was needed, you could post to WP:BON. If you block an unapproved bot (or an approved bot doing unapproved things), then directing them to BAG for approval is enough and then BAG will be engaged when that occurs. If they choose not to, then the bot would stay blocked, of course. –xenotalk 17:45, 4 October 2010 (UTC)

Clarification regarding high-speed human editing, redux

Previous discussion

I was looking back at this discussion last night, and decided to sleep on it without knowing that the bot would archive it before I woke up. I propose the following compromise wording to be added to WP:BOTPOL#Dealing with issues:

====Bot-like editing====
Human editors are expected to pay attention to the edits they make, and ensure that they don't sacrifice quality in the pursuit of speed or quantity. For the purpose of dispute resolution, it is often irrelevant whether high-speed or large-scale edits that involve errors an attentive human would not make are actually being performed by a bot, by a human assisted by a script, or even by a human without any programmatic assistance. No matter the method, the disruptive bot-like editing must stop or the user may end up blocked.
Note that merely editing quickly, particularly for a short time, is not by itself disruptive.

Anomie 17:59, 17 October 2010 (UTC)

"it is often irrelevant" — remove "often" — it is irrelevant, period. Otherwise, support as before. —  HELLKNOWZ  ▎TALK 18:31, 17 October 2010 (UTC)

Adminbots with inactive operators

Do we have a policy on adminbots whose operators aren't around anymore? I ran across User:Orphaned image deletion bot tonight; it's operated by Chris G: he hasn't edited for a month, and he's had fewer than fifty edits in the last four months. I'm uncomfortable with adminbots that do anything (except for very specialised tasks, such as updating the fully-protected Template:Did you know) without the oversight of an active human editor; would it be reasonable to block this bot and leave Chris a message of "feel free to unblock whenever you want; we just didn't want it to run when you weren't around to look after it"? Nyttend (talk) 01:06, 23 October 2010 (UTC)

While it would be preferable to have active users behind the adminbots, it seems to me that the bot should be allowed to continue unless there's an immediate reason to block it, such as, it's actually malfunctioning, or something raised on the talk page requires attention before the bot should be permitted to continue. Since everyone has the ability to examine bot edits, there's no need to speculate. --Bsherr (talk) 02:25, 23 October 2010 (UTC)
*cough* not inactive, I just don't edit very often. I do respond to requests regarding my bots (the most common being undeleting images) on my talkpage, and generally I try to respond within 48 hours. --Chris 02:28, 23 October 2010 (UTC)

Gazetteer content discussion

You are invited to join the discussion at WT:N#What is the consensus on City articles?. patsw (talk) 01:21, 29 October 2010 (UTC) (Using {{Please see}})

Wikipedia is not prepared for a possible bot attack

I have a feeling that, because anybody can make an (unauthorised) bot, at some point there will be user(s) who will program a bot to spread true shit all over the project, messing things up all over and possibly spreading a virus. I think that the bots have too low a security level, and if a bot account ends up in the hands of a clever vandal, we may be faced with a major problem. I think we should take more care about the safety of our bots, to prevent possible abuse. MikeNicho231 (talk) 21:28, 19 November 2010 (UTC)

The last bot that started misbehaving was blocked 5 minutes into its edits. Unauthorised bots (which are just regular accounts/anon edits) show up on watchlists/recent changes so their edits will be highlighted very soon. Bots cannot spread malicious software directly as MediaWiki software prevents that. Currently, a bot operator is required to have sufficient edits, list enough function details, and run a trial before having their bot approved. What additional measures do you suggest we take? —  HELLKNOWZ  ▎TALK 21:40, 19 November 2010 (UTC)
A bot malfunctioning is not a major issue, as long as it does not behave in any malicious way. What IS concerning, is the risk of a human being programming a bot to behave in a way that may disrupt the whole Wikipedia. A person with some insight in computing will have little problem doing this. I want Wikipedia to improve bot safety in such a way that even the slightest deviation from the intended purpose of the bot causes the bot to be blocked from further work until it has been solved. MikeNicho231 (talk) 21:50, 19 November 2010 (UTC)
While the bot policy does not say "slightest deviation", a bot would be blocked if it performed a task it is not approved for. How do you propose to monitor the bot operation? —  HELLKNOWZ  ▎TALK 21:56, 19 November 2010 (UTC)
An automated script that should prevent any change to the bot tasks. If a bot is programmed into vandalising, spreading malicious software, or if an admin bot is programmed into deleting lots of pages, blocking users, and such, it could be very harmful to the whole community. MikeNicho231 (talk) 09:03, 20 November 2010 (UTC)
How would an automated script prevent changes exactly? To be precise, how is an automated process to recognise what edits are good and what are bad? —  HELLKNOWZ  ▎TALK 11:11, 20 November 2010 (UTC)
The "bot tasks" pages[I assume you mean BRFAs?] here are simply human readable descriptions of what the bots are supposed to do. Changing them will not change what the bot does. The bots are actually run from various servers, where only the bot's operators can change it. -- Cobi(t|c|b) 11:32, 20 November 2010 (UTC)

Mass article creation

Question: does "Mass article creation" also apply to mass category creation? Rd232 talk 14:20, 15 December 2010 (UTC)

The discussion and resulting consensus only considered article creation; mass content creation in other namespaces (reader-facing or non-reader-facing) was not discussed. Is there a particular case you have in mind? Anomie 15:39, 15 December 2010 (UTC)
yes, a particular case inspired the question. I would have thought the same principles applied. Rd232 talk 20:50, 15 December 2010 (UTC)
Whilst not following the actual case, I would too have thought that this logically applied to categories, if to a lesser extent. Perhaps another tweak is in order, that while namespace plays a role, one shouldn't think this is an article-exclusive guide and should exercise due discretion? —  HELLKNOWZ  ▎TALK 20:57, 15 December 2010 (UTC)
There is no particular reason the same principles shouldn't apply to reader-facing content in other namespaces (e.g. normal categories like those are reader-facing, but hidden maintenance categories or non-article categories probably aren't), but there isn't a consensus one way or the other yet that I am aware of. Feel free to start the discussion at WP:VPR or WP:VPP if you want. Anomie 21:25, 15 December 2010 (UTC)

RfC: BAG membership notification requirement

Proposal—Wikipedia:Bot policy#Bot Approvals Group should be amended so that when applying for BAG membership, editors are not required to post to WP:AN, WT:RFA and WP:VPM. ╟─TreasuryTagRegent─╢ 20:26, 4 December 2010 (UTC)

Comments from TreasuryTag

I was genuinely astonished to discover that editors nominating themselves for BAG membership are required to "spam" various major fora. What is the purpose of this? Surely anyone interested in the bot process will be watching this page, the nomination page, WP:BON, whatever. It cannot be necessary to clog up already congested noticeboards with this sort of very niche announcement? ╟─TreasuryTagRegent─╢ 20:26, 4 December 2010 (UTC)

Comment from X!

(short statement in response to TT, because I'm about to head off). It mainly was a result of anger with BAG members adding themselves, and people didn't like that. So they kind of went over the top to please everybody. I don't think that requirement is needed anymore. (X! · talk)  · @897  ·  20:31, 4 December 2010 (UTC)

Comment from Nakon

Support removing the requirement to post to WT:RFA and WP:VPM. I think it's fine to make an optional post to WP:AN as the nominations seem to be sparse enough to not be considered spam. Nakon 20:34, 4 December 2010 (UTC)

Comment from Anomie

Keep WP:VPM, as that's what a Village pump is for. And maybe WP:AN, as admins may know relevant history of a candidate but not be watching WT:BAG. No real need for WT:RFA, IMO. Anomie 20:41, 4 December 2010 (UTC)

@Dycedarg: Interesting bit of history there. At least with the "mass forum spam" people can't claim they didn't have the opportunity to comment. Anomie 16:31, 7 December 2010 (UTC)

Comment from Dycedarg

I remember when this was added. Although my memory on the details is fuzzy, essentially a bot got approved that shouldn't have, there was a kerfuffle which resulted in demands for the heads of all the BAG members, and what eventually resulted was a consensus that BAG had gotten to be too much of an insular group that promoted members into itself without community review or discussion, and was completely out of touch with the needs/wants/desires of the community at large. Of course, the reason for that happening in the first place (as it so often is when it comes to these things) is that the vast majority of people don't care about the bot approval group or bots in general until there's a rogue bot or some approved yet horribly ill-conceived bot, so the only people who watch bot-related pages are bot owners/members of BAG/random people who are interested, which hardly makes up a decent cross-section of the community. The easiest way to solve this was mass forum spam.

Of course, it doesn't work. The nomination which alerted you to this requirement has received all of 6 !votes, 3 of which are by current BAG members, and the other three were by editors who have run or currently run bots. All the !votes on all the BAG nominations that have taken place in the last year combined do not add to the number of !votes in a single average RFA. The community doesn't care, and all the spamming in the world won't make them care. The only thing the spam might accomplish is if someone who is utterly unacceptable runs someone might notice the spam posts and bring it to the usual voters' attention, assuming they didn't notice themselves. This might warrant keeping it at WP:VP and WP:AN, but I've personally never understood the part about posting at WT:RFA. It has nothing to do with that. This whole discussion might warrant greater community input, given that it was the community at large who implemented this, and such things generally shouldn't be undone by 5 editors on a talkpage no one reads. Of course, if you did bring it up somewhere else to gain more attention here, no one would care anyway...--Dycedarg ж 07:45, 7 December 2010 (UTC)

I think this was the bot that added protozoa or some such. A storm in a teacup, since the only real problem it created was deciding whether to delete all the articles it had created or merely most. Since deleting them all is low cost, there must have been some benefit from the articles saved. I also remember checking out the BRFA at the time, and it was quite reasonable. Since then we had a BRFA in a content creation vein which wanted to create a handful of pages (possibly only one long one) which was talked to death - by which I mean months, not weeks. Rich Farmbrough, 13:16, 17 December 2010 (UTC).

Summary?

Everyone supports removing WT:RFA. Opinions are split on WP:VPM (3 to 2 in favor of removing) and WP:AN (3 to 2 in favor of keeping), although no reasoning for removal was given while reasons for keeping both were presented. I've gone ahead and removed WT:RFA, but I've left the other two for now pending further discussion. Anomie 15:46, 15 December 2010 (UTC)

To be fair, I think I presented clear reasons for removal. ╟─TreasuryTagprorogation─╢ 23:33, 15 December 2010 (UTC)
"Surely anyone interested in the bot process will be watching this page, the nomination page, WP:BON, whatever."? Maybe, maybe not. Unlike RFA, hardly anyone seems to really pay attention to bot matters. An occasional post at WP:VPM and WP:AN is a minimal price to avoid claims that we do things without at least trying to involve the wider community. Anomie 02:03, 16 December 2010 (UTC)
So what you mean is, "I personally disagree with the reasons for removing WP:AN," not, "Nobody gave any reason for removing WP:AN." ╟─TreasuryTagbelonger─╢ 08:44, 16 December 2010 (UTC)
No. When I looked at the comments, I considered statements like "It's good" or "It's not necessary" to be personal opinion instead of reasoning. I perhaps incorrectly put your unsupported assertion that surely anyone interested will be watching bot-related pages into that category. But hey, now we have some of that further discussion I hoped for.
Back on topic, do you have any comment on the history presented by Dycedarg that led to the current wide advertising of BAG nominations? That seems to indicate that at one point people did generally make your assumption and it was determined to be flawed. Anomie 12:24, 16 December 2010 (UTC)

Comment. Better late than never, I suppose. WP:RFA is unnecessary, there is little relevance between the two. WP:AN should be kept, as BAG is an "administrative" section and if anything sysops should weigh in, especially if they are aware of some relevant history. I am borderline on WP:VPM. After all the other sections, those interested should have seen the nomination. Perhaps if we need to trim, then VPM is the next in line. Rest are bot-related pages, and are fine. —  HELLKNOWZ  ▎TALK 10:19, 16 December 2010 (UTC)

Applying botpol to human edits

This is crazy. If we are to do that then we need to unify editing requirements for humans and bots. Having a two-tier system is bad enough, non-human prejudice is to be expected. But when we then say, "Oh and by the way if we wish it, you will be classed as bot, and subject to non-human editing restrictions" the thing becomes Kafkaesque. (Or perhaps I should say more Kafkaesque.) Rich Farmbrough, 11:20, 17 December 2010 (UTC).

The text you removed didn't create any sort of unified editing requirements for humans and bots. It just lets us avoid having to argue with someone over whether their series of disruptive edits that looks like a poorly-done bot is really a poorly-done bot or is actually a human failing an impromptu reverse Turing test, by pointing out that disruptive edits are disruptive edits either way and need to stop. Anomie 12:16, 17 December 2010 (UTC)
Makes sense to me. Rd232 talk 13:32, 17 December 2010 (UTC)
Have to agree on this; bureaucracy aside, we seem to need this addition since some high-speed editing users seem to think they are exempt from being meticulous. —  HELLKNOWZ  ▎TALK 13:49, 17 December 2010 (UTC)
Those who quack like bots, edit at the rate of bots, make errors like bots, shall be treated like bots. Don't see a problem. –xenotalk 14:07, 17 December 2010 (UTC)
  • After reading Rich's comments and others in the old discussion, I have to agree and wonder why this is needed and why it is in a policy about bots. From what I read this proposal came about because of a user creating a few articles in a short period of time and not using any bots or AWB. I am not sure what the user did, but it is possible to create a few articles in a short period of time without using a bot, for example in the past I have redirected 13 articles in the space of three minutes without using a bot, AWB or any scripts. I did that by opening multiple edit windows, check each edit to make sure all are good, and finally save them all at about the same time. So for all we know these edits made by the user in question was checked and done in good faith, and since none of the articles was deleted I believe this is the case. In any case this issue is about reckless high-speed editing by humans not bots, so I oppose keeping it here in the bot policy, there may be another place for this issue to be clarified, but it should not be in a policy about bots. -- d'oh! [talk] 14:34, 17 December 2010 (UTC)
    "check[ed] each edit to make sure all are good" <-- In that case, you aren't the one being addressed with this policy. –xenotalk 14:36, 17 December 2010 (UTC)
    Yes, exactly my thoughts. This is about editors who create problems by edits almost like bots, therefore the addition to this particular policy. What better place? —  HELLKNOWZ  ▎TALK 14:38, 17 December 2010 (UTC)
    You need to go back and review the situation a little more carefully. The problem wasn't that the user created a number of articles or did it at any particular speed, it's that he created articles with errors that any reasonably attentive and meticulous editor would have corrected before saving. Your statement "So for all we know these edits made by the user in question was checked" is incorrect, because we do know quite well that either the edits were not checked or the checking was faulty. Your statement "since none of the articles was deleted I believe this is the case" is poor reasoning, as it is entirely possible that the articles were fixed rather than deleted.
    The reason this is in the policy about bots is because this is one of the places people would look for policy in a situation like this: User A makes a bunch of edits with errors that make him look like a poorly-run bot, User B accuses User A of running an unauthorized bot, User A responds "No, I did those edits manually, I swear!", then follows a long unproductive discussion trying to determine just what User A did and whether or not that really qualifies as a bot (with reference to this policy) and so on when the real issue is that User A needs to stop making these bad edits whether or not they were done by hand or with some degree of automated assistance. Anomie 15:27, 17 December 2010 (UTC)

For me it's less about what the user's doing, than how responsive they are to suggestions that they should desist. Anyone might make a lot of bad edits quickly; they need to be treated as a bot rather than a human if they've responded to criticism like a bot does (i.e. not) rather than like a human does (i.e. stop what they're doing and/or demonstrate that there's consensus for doing it).--Kotniski (talk) 15:37, 17 December 2010 (UTC)

Although there was discussion along those lines in the original proposal, nothing in the final version being discussed here says to treat anyone like a bot. It just points out that it is normally irrelevant whether disruptive editing is automated or manual, as the consequences are the same. Anomie 15:51, 17 December 2010 (UTC)
But not all bot (or bot-like) editing is disruptive. The case to be considered is where no-one actually objects to the edits themselves, but is concerned merely about the fact that they appear to be being done by an unapproved bot. For me the Turing test in this case is less to do with what the edits are, and more to do with what answer I get (if any) if I leave a query about it at the account's talk page.--Kotniski (talk) 16:45, 17 December 2010 (UTC)
Actually, do I mean that? Remind me - why do we have a bot approval policy anyway? Does anyone actually look at the source code to check it won't screw up? No, right? So all that's actually being approved is the task, not the method? So rather than having a bot policy, shouldn't we have a mass-edit policy, to help determine when any mass editing campaign is/is not appropriate? That would then be separate from the process of getting a bot flag, which would merely reduce to convincing the BAG that the task is automatable and showing them some test edits to show that the program does what it's supposed to do. If an account is making obviously silly edits then you're right, the consequence should be that it gets blocked, regardless of whether it's a malfunctioning bot, an extremely careless person or a deliberate vandal. But if it's not quite so obviously silly, then we should leave them a query first and see if they stop and discuss, thus avoiding an unnecessary block.--Kotniski (talk) 17:28, 17 December 2010 (UTC)
If there is nothing disruptive about the edits, then this new section doesn't really apply. That's not to say people shouldn't discuss whether the user is running an unauthorized bot or not.
Sometimes yes, the code is actually reviewed. It's up to BAG to decide whether a review of the code is required before approval, but anyone can ask for code (if it isn't already posted) and review code if it is. Anomie 17:51, 17 December 2010 (UTC)

Converting HTML name codes to Unicode

Is there a policy about whether bots may convert HTML name codes, such as "&Nu;", to Unicode characters such as "Ν"? Some of the Unicode characters are hard for a human editor to identify in editing software, while the HTML name code is easy to identify. Thus, such conversions make articles hard to edit. Jc3s5h (talk) 02:12, 7 March 2011 (UTC)

The bot would first have to be approved to do this, which may fall under the heading of "general cleanup" that some bots have been approved to do in conjunction with their more substantial edits. Also, no bot may do this as the only edit to a page or as part of an edit that contains only other "trivial" changes. Whatever conversions are done must also be in line with the relevant guidelines and policies that apply to all editors. Anomie 04:47, 7 March 2011 (UTC)
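For anyone wondering what such a conversion amounts to mechanically, here is a minimal sketch in Python. It is an illustration only, not the code of any approved bot: the function name `convert_named_entities` and the `KEEP` list are hypothetical, and which entities should be left alone is an assumption that the relevant guidelines would have to settle. It uses the standard library's `html.entities.html5` table to map named codes such as "&Nu;" to their Unicode characters, and leaves untouched anything it does not recognise.

```python
import re
from html.entities import html5  # standard table: "Nu;" -> "\u039d", etc.

# Matches named character references only (e.g. "&Nu;"); numeric ones are ignored.
NAMED_ENTITY = re.compile(r"&([A-Za-z][A-Za-z0-9]*;)")

# Hypothetical keep-list (an assumption for this sketch): entities whose literal
# form would alter markup or be invisible to a human editor.
KEEP = {"amp;", "lt;", "gt;", "quot;", "nbsp;"}


def convert_named_entities(text):
    """Replace known HTML named entities with their Unicode characters."""
    def replace(match):
        name = match.group(1)
        if name in KEEP or name not in html5:
            return match.group(0)  # leave kept or unknown entities as written
        return html5[name]
    return NAMED_ENTITY.sub(replace, text)


if __name__ == "__main__":
    print(convert_named_entities("&Nu; stays readable, &amp; is kept"))
```

A keep-list of this kind is one way to address the readability concern raised above, since characters that are hard to identify in an edit window could simply be added to it.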

Rewording of "Spell-checking" into "Context-sensitive changes"

{{rfctag|policy}}

Current version: Spell-checking

Bot processes may not fix spelling or grammar mistakes or apply templates such as {{ww}} in an unattended fashion, as accounting for all possible false positives is unfeasible. Assisted spell- and grammar-checking is acceptable when done with due diligence, and may or may not be considered a bot process depending on the editing rate. Such processes must not convert words from one regional variation of English to another.

Proposed version: Context-sensitive changes

Unsupervised bot processes should not make context-sensitive changes that would normally require human attention, as accounting for all possible false positives is generally unfeasible. Exceptionally, such tasks may be allowed if – in addition to having consensus – the operator can demonstrate that no false positives will arise (for example, a one-time run with a complete list of changes from a database dump) or there is community consensus to run the task without supervision (for example, vandalism reversion with a community-accepted false positive rate).
Examples of context-sensitive changes include, but are not limited to:
  • Correcting spelling, grammar, or punctuation mistakes.
  • Converting words from one regional variation of English to another.
  • Applying context-sensitive templates, such as {{weasel word}}.

The proposal is to clarify that bots cannot make "human-like" decisions in context-sensitive content. Currently, the passage mostly addresses spelling. However, the same principle applies to tasks that are not necessarily spell checking. Since there is no BOTPOL point that addresses this, BAG is left to come up with all kinds of possible false positives, whereas it should be the operator that demonstrates that the bot is "harmless", i.e. does not produce false positives.

It has been voiced and hinted that BOTPOL needs an overhaul. So let's start here and hopefully write a better page. I put some thought into the wording, so if there are specific issues, please do bring them up. I will leave this for comments first before advertising broader. —  HELLKNOWZ  ▎TALK 14:25, 6 March 2011 (UTC)

In general, I think it makes sense. Two comments though:
  1. I'm not fond of the use of "problem" in "Applying problem templates" as it reads as a value judgment on the template which is outside the scope of this policy. Though redundant, it might be better to change it to "context-sensitive problematic" templates to avoid that (or find other wording).
  2. I always read the "Such processes" in the original "Such processes must not convert words from one regional variation of English to another" as being the assisted editing processes in the prior sentence. That is not supposed to be done with a bot or non-bot. Including it in the bullet list is fine, but I think it should also be included at the end of the assisted editing paragraph (similar to the current version).
-- JLaTondre (talk) 15:08, 6 March 2011 (UTC)
I thought changing WP:ENGVAR per WP:TIES or for consistency is fine, even with tools of "assisted editing"? At least, I never read it as forbidden. The current version was inserted here, so I cannot say for sure what it meant. I placed a clarification sentence in italics above, but let's see if anyone clarifies how it is actually supposed to be interpreted. —  HELLKNOWZ  ▎TALK 15:39, 6 March 2011 (UTC)
Not sure of the specific rationale behind that addition, but we have had cases in the past of people making mass changes of regional variations. I was referring to such mass changes and I think you are correct if instead it was per WP:TIES, it would be okay. I like your italicized addition better than my suggestion. Actually, in reading it, it occurs to me that it addresses my concern much better and that if people wanted to get rid of the WP:ENGVAR example, I'd be fine with it. A specific example is less important than simply saying semi-automated edits still need to have consensus and changes likely to be contested should not be made. -- JLaTondre (talk) 17:32, 6 March 2011 (UTC)
I got rid of example to not get into specifics. I added "large-scale" so that this implies bots rather than being BOLD or IAR. —  HELLKNOWZ  ▎TALK 18:20, 6 March 2011 (UTC)
It works for me, as long as the italic formatting is not part of the actual proposal. ;) I don't know about the original intention of the flagged clause, but changing it to "follow WP:ENGVAR" works for me too. Anomie 15:52, 6 March 2011 (UTC)
Italic part got added after JLaTondre's comment. The "follow policies" bit is really just a reiteration of existing principles. "changes likely to be contested", though, is something BOTPOL didn't say before; so that would be new. —  HELLKNOWZ  ▎TALK 18:20, 6 March 2011 (UTC)
I think this is mostly good. The second sentence could use some tweaking. Right now it sort of sounds like if you can demonstrate no false positives, you don't need consensus, which obviously isn't the case. I'd either clarify it to note that all tasks require consensus, or just leave it out entirely, since the rule that all tasks require consensus is stated elsewhere in the policy. I'd get rid of the last 2 sentences entirely, they're not specific to context-sensitive changes and redundant to the section that it links to. Just a short note "All assisted editing tasks must follow the rules", if anything at all, should suffice. But there's no need to summarize other parts of the same policy. Mr.Z-man 16:46, 8 March 2011 (UTC)
I reworded the second sentence to make it clear consensus is needed. The thing is, the last two sentences were added exactly to re-iterate that consensus is still needed, no matter the specifics. But it is true, that the "Assisted editing" section already says "Contributors intending to make a large number of assisted edits are advised to first ensure that there is a clear consensus that such edits are desired." Per you, anomie, and redundancy to the latter section; I'll remove those two then. —  HELLKNOWZ  ▎TALK 17:14, 8 March 2011 (UTC)

I boldly added the wording to the policy. As the previous version was done by a single user without prior discussion (afaik), I believe this RfC + several responses + what BAG usually does anyway is at least more suitable. —  HELLKNOWZ  ▎TALK 10:47, 6 April 2011 (UTC)

Wikipedia:BOTPOL#Mass_article_creation (REVIVED)

The following discussion is closed. Please do not modify it. Subsequent comments should be made in a new section. A summary of the conclusions reached follows.
I was asked to take another look at this. Based on the further discussion it looks like Rd232's second set of proposals has decent support. Certainly, Expand the mass creation policy to cover categories which apply to mainspace has clear support. I'd suggest the "pre-approval" part suffered from being vague - so perhaps proposing a piece of text to put on the page would be appropriate as the next stage for that :) --Errant (chat!) 08:16, 7 April 2011 (UTC)

The Bot Policy currently contains an injunction against mass article creation (Wikipedia:BOTPOL#Mass_article_creation); it is proposed that this injunction be extended to cover all namespaces. Mass creation of pages in any namespace may be (unintentionally) disruptive, and should be discussed with the community first. It may be argued that certain namespaces (eg userspace) should be exempted, but the starting point for debate should be that it be applicable to all namespaces, and a case made for any exemptions. Rd232 talk 19:07, 19 January 2011 (UTC)

Before we get started on the merits, are there any circumstances under which mass creation has been regularly done in the recent past? Skomorokh 19:14, 19 January 2011 (UTC)
I'm not sure if that's relevant; if it's done regularly it's probably OK, since if it weren't it presumably wouldn't be repeated due to backlash after the first time. Rd232 talk 19:38, 19 January 2011 (UTC)
You're proposing to ban all instances of mass creation above; if you mean to include caveats best do so now while the discussion is young. See Kingpin13's comment for a related potential issue. Skomorokh 20:44, 19 January 2011 (UTC)
Wasn't a caveat. Generally, mass creation tasks will turn out fine: however the trouble is the ones that don't are so disruptive that pre-approval is a good idea. Repetition creates a sort of (weak) implicit pre-approval for future repetition, but it doesn't make any difference because the task still needs approving the first time, and presumably that approval holds for future instances as well. Rd232 talk 21:30, 19 January 2011 (UTC)


(edit conflict) Support including other namespaces into the wording by default. Article space may be the most disruptive if erroneous pages are created, but this should not mean leeway is given to other namespaces. Basically – the more potential for disruption, the less allowance is given before a prior discussion is expected. —  HELLKNOWZ  ▎TALK 19:20, 19 January 2011 (UTC)
The primary namespace I would exclude is category. Often large categories are broken down, most commonly by date, and a 100 categories might be called for. ΔT The only constant 19:41, 19 January 2011 (UTC)
An exception might be engineered for that (though it might be tricky to word). Rd232 talk 21:25, 19 January 2011 (UTC)
Eh, not convinced by this. Article and category would be the main ones for me, so I'd support adding category to this. But I'm thinking of a few situations where a user might want to use a semi-automated method for creating pages (e.g. tagging a long list of sockpuppets with a sockpuppet tag, although I'm not sure if they currently do this over at SPI or not, but you get the general idea) - Kingpin13 (talk) 19:44, 19 January 2011 (UTC)
I can think of another. Tagging articles in a category with a project banner, if, as likely, a significant number of talk pages do not yet exist, would potentially result in mass creation of talk pages. This editing is productive, not disruptive, and should be encouraged. --Bsherr (talk) 19:59, 20 January 2011 (UTC)
If there are only a handful of tasks like that, they can be enumerated as exceptions. I can't think of any others. Rd232 talk 18:25, 26 January 2011 (UTC)
Yes, we do this at SPI. I actually have a script (two, actually) for that... T. Canens (talk) 22:45, 20 January 2011 (UTC)
Is there an actual issue that this change is proposed to address or is this just a WP:CREEPy solution looking for a problem? Jim Miller See me | Touch me 20:48, 19 January 2011 (UTC)
Yes, it theoretically could be disruptive, but off the top of my head, I can't think of any instances where it actually was. The only sort-of-instances I can think of were from bots that were approved, albeit with insufficient review. Mr.Z-man 20:55, 19 January 2011 (UTC)
Wikipedia:Administrators'_noticeboard/Incidents#Automated_creation_of_incorrect_categories. Rd232 talk 21:25, 19 January 2011 (UTC)
That seems the only known case of a problem. It does not make sense to change established policy to deal with the possible recurrence of an isolated example. That's what we have WP:IAR for. DGG ( talk ) 00:36, 20 January 2011 (UTC)
There is already an edit restriction for that case. The proposed policy change is because the case (which surely isn't the first example) shows there is a need to extend the policy. Rd232 talk 13:51, 20 January 2011 (UTC)

I have several times suggested, and continue to support, a restriction on category creation. This applies to categories that are visible to the readers on articles. Ill-conceived categories are an old, ongoing bugbear to the few who try to maintain the category system. The creation of categories by people who don't appreciate the issues involved means a greater amount of work for those who fix it up afterwards. Large-scale creation of ill-conceived categories is disruptive.

I take User:Δelta's point. Some editors, like User:Δelta, know what they are doing, and we don't want to hinder good work. I therefore suggest that rapid creation of mainspace visible categories be restricted to specifically approved bots, OR to specifically approved individuals. --SmokeyJoe (talk) 21:37, 21 January 2011 (UTC)

That could work. The point is to disallow the current "anyone can do any mass creation, as long as it's not in mainspace" approach; if the community wants to declare "we trust User X generally on this issue", that's fine (not least because it allows that declaration to be potentially revisited if necessary). Basically, make the default "no to mass creation, unless explicitly allowed for a task/account/user", rather than the current "yes to mass creation [outside mainspace], unless disallowed by edit restriction". Rd232 talk 18:21, 26 January 2011 (UTC)

Overall, this sounds CREEPy to me, as a solution in search of a problem. Additionally, I don't think that enough examples have been considered. Should special, advance approval for "mass creation of talk pages" be required by bots that leave messages for users or at articles? What about uploading images? Is that "mass creation of file pages"? WhatamIdoing (talk) 20:34, 26 January 2011 (UTC)

As the RFC formulation indicated, whilst expanding the policy to cover all namespaces may be simpler, we can also just expand it to specific namespaces (eg categories) or to all except some (eg except talk pages and user talk pages). The problem exists, and whilst it may not occur frequently, risk is harm times probability, and it's in the nature of the issue that it involves a lot of work to deal with when it happens. Rd232 talk 21:48, 26 January 2011 (UTC)
After hanging out at CfD for a while, I think I see that a lot of work is done by a few in tidying up ill-considered category creations. A particular difficulty with categories is that they cannot be just moved. Speedy renames may be possible, or a CfD discussion is required, but in either case, listing, notification, waiting, and administrative tools are required. There is no doubt that a problem, the wasting of experienced category editors' time, exists. This point has been raised many times. A recent case where an editor, in a bot-like manner, created a massive list of categories littered with misspellings and ill-considered categories brought this here.
Another point to make about categories that appear in mainspace is that they assert a fact about the subject that is not explicitly cited or placed in context. These ill-advised categories not only confuse the article reader and the category system, they often violate WP:NPOV.
Approaching this as a small change (a small addition to the text of the policy) for a well-defined problem, I suggest only that this new restriction apply to categories that appear in mainspace. Not hidden categories, not talk page categories, not userspace, not file pages. It would ask that anyone mass creating these categories make a prior proposal. Explicitly exempt all administrators, because they are capable and therefore expected to clean up their own messes. Exempting the occasional non-admin, bot-wielding editor like Delta should not be an issue. --SmokeyJoe (talk) 21:06, 26 January 2011 (UTC)
I'm fine with extending the policy just to categories which apply to mainspace, but unfortunately I don't think we can give any blanket exemptions - any individual wanting to be exempt from clearing each specific mass creation task with the community would have to ask for "pre-approval" to do whatever they see fit, which the community may or may not grant (and may later withdraw). This RFC was prompted by an admin mass-creating categories. Since the RFC is running out of steam a bit, I'm going to make this a specific proposal. Rd232 talk 21:48, 26 January 2011 (UTC)

Proposal

As an alternative to the possibility of expanding the policy to all namespaces (which perhaps raises more complexity with the need for exceptions than I'd anticipated), here's an alternative proposal:

1.   Expand the mass creation policy to cover categories which apply to mainspace (that is, categories which appear at the bottom of articles, not including hidden maintenance categories).
2.   Add a "pre-approval" provision to the policy. Any individual wanting to be exempt from clearing each specific mass creation task with the community would have to ask for "pre-approval" to do whatever they see fit, which the community may or may not grant (and may later withdraw).

The two parts of the proposal are independent, and we can come up with a better name than "pre-approval".

Rd232 talk 21:48, 26 January 2011 (UTC)

I agree.
"pre-approval" suggests that there is a specific location/noticeboard to request approval, or at least to state an intention and wait for any objection. Where would this be?
A counter-point given above was this: "Often large categories are broken down, most commonly by date, and a 100 categories might be called for. ΔT The only constant 19:41, 19 January 2011". I find it unpersuasive. The breaking down of existing categories into hundreds of subcategories sounds exactly like the sort of thing that should be proposed before doing it. --SmokeyJoe (talk) 15:14, 2 February 2011 (UTC)
Support 1. Neutral on 2. --SmokeyJoe (talk) 11:11, 15 March 2011 (UTC)

This RFC was archived without closure. I was asked to have a look at it & perhaps close; as there seems to be something still worth discussing here I resurrected the thread for further comment :) --Errant (chat!) 10:41, 9 March 2011 (UTC)

  • Support 1 as before. Not sure what pre-approval intends though? Pre-approval before some other approval? Is it not just saying "get consensus for your change" in a very wordy manner? —  HELLKNOWZ  ▎TALK 10:53, 9 March 2011 (UTC)
    I think the "pre" prefix is redundant, but is used to emphasize that approval to mass create must precede the mass creation, if I may guess at interpreting Rd232's words. --SmokeyJoe (talk) 10:48, 15 March 2011 (UTC)
    Isn't it just "The community has decided that any large-scale automated or semi-automated article creation task must be approved at Wikipedia:Bots/Requests for approval." reworded to not include BRFA? —  HELLKNOWZ  ▎TALK 10:52, 15 March 2011 (UTC)
    Not sure about that summary. You've dropped the reference to categories.
    2nd bite: The "pre-approval" is about exempting specific individuals from the expanded restriction under #1. --SmokeyJoe (talk) 11:09, 15 March 2011 (UTC)
    That "summary" is from the current WP:BOTPOL wording on "Mass article creation". I did not write that. —  HELLKNOWZ  ▎TALK 11:11, 15 March 2011 (UTC)
  • Support both, with the immediate pre-approval of WikiProject tagging (and possibly others). עוד מישהו Od Mishehu 05:50, 15 March 2011 (UTC)
    Is that to allow creation of WikiProject talk pages in order to allow the tagging of WikiProjects on their talk pages where the talk pages don't already exist? I think this is unnecessary because the proposal only concerns things that appear on mainspace pages. --SmokeyJoe (talk) 10:55, 15 March 2011 (UTC)

Anti-archive timestamp. —  HELLKNOWZ  ▎TALK 10:55, 6 April 2011 (UTC)

The discussion above is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.

Further clarification to mass article creation

OK, that may be enough, but to follow up point 2, how about this:

Any individual wanting to be exempt from clearing each specific mass creation task with the community would have to ask for "pre-approval" to do whatever they see fit, which the community may or may not grant. Such requests should be made at BRFA, and must be advertised at WP:AN. A pre-approval may subsequently be challenged in the same way, and is considered immediately suspended unless or until there is consensus for it to be renewed.

I'm not sure it's worth the complication, but that's what I was thinking of. Rd232 talk 10:57, 7 April 2011 (UTC)

BAG nomination

I'm required by BAG policy to notify this noticeboard of my nomination for BAG member. Headbomb {talk / contribs / physics / books} 07:44, 20 April 2011 (UTC)

Wikipedia:BOTPOL#Mass_article_creation (REVIVED AGAIN)

Further clarification to mass article creation

OK, that may be enough, but to follow up point 2, how about this:

Any individual wanting to be exempt from clearing each specific mass creation task with the community would have to ask for "pre-approval" to do whatever they see fit, which the community may or may not grant. Such requests should be made at BRFA, and must be advertised at WP:AN. A pre-approval may subsequently be challenged in the same way, and is considered immediately suspended unless or until there is consensus for it to be renewed.

I'm not sure it's worth the complication, but that's what I was thinking of. Rd232 talk 10:57, 7 April 2011 (UTC)

I've moved this back from the archive and have notified the participants about this proposed clarification. Cunard (talk) 08:50, 8 May 2011 (UTC)
  • "..to do whatever they see fit" -> "..to run any mass article creation tasks." Let's not give leeway to wikilawyering. —  HELLKNOWZ  ▎TALK 09:06, 8 May 2011 (UTC)
  • "must be advertised at WP:AN"? Not sure that's a necessary step. Perhaps VP and related projects instead? After all, AN is really for admin issues and less content creation. —  HELLKNOWZ  ▎TALK 09:06, 8 May 2011 (UTC)
  • "may subsequently be challenged in the same way" In the same way as what? As BRFAs? As in, post on the corresponding BRFA's talk page? —  HELLKNOWZ  ▎TALK 09:06, 8 May 2011 (UTC)
  • "and is considered immediately suspended" So anyone objecting immediately suspends the task? I would think at least a third opinion/BAG should weigh in before that, especially if the objection is subjective. —  HELLKNOWZ  ▎TALK 09:06, 8 May 2011 (UTC)
  • Does this apply to bots, non-bots, or both? —  HELLKNOWZ  ▎TALK 09:06, 8 May 2011 (UTC)
I think "point 2" may be more complication than benefit.
I believe that the original concern was about individuals with insufficient expertise or experience embarking on mass creations (articles, and now categories visible in mainspace) that are hard to fix or undo, and that will have a negative impact on the readership in the short term.
I would think that an experienced expert in these things would be given pretty easy treatment after a few passes. I think that in practice, an experienced expert would only need to state his intentions to receive quick approval. I don't think there is a current need to codify this.
To HELLKNOWZ's point "Does this apply to bots, non-bots, or both?". I read it as applying to both, but with "semi-automated article creation task" specifically applying to non-bots. I read it as an instruction to an editor at the limit of their technical ability and judgment to not proceed unilaterally. --SmokeyJoe (talk) 12:39, 8 May 2011 (UTC)