RfC: Populating article descriptions magic word

The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.


In late March - early April 2017, Wikipedia:Village pump (proposals)/Archive 138#Rfc: Remove description taken from Wikidata from mobile view of en-WP ended with the WMF declaring[1] "we have decided to turn the wikidata descriptions feature off for enwiki for the time being."

In September 2017, it was found that, through misunderstanding or miscommunication, this feature was only turned off for one subset of cases, but remained on enwiki for other things (in some apps, search results, ...). The effect of this description feature is that, e.g., for 2 hours this week, everyone who searched for Henry VIII of England or saw it through those apps or in "related pages" or some such got the description "obey hitler"[2] (no idea how many people actually saw this; this Good Article is viewed some 13,000 times a day and is indefinitely semi-protected here to protect against such vandalism).

The discussion about this started in Wikipedia:Village pump (policy)/Archive 137#Wikidata descriptions still used on enwiki and continued mainly on Wikipedia talk:Wikidata/2017 State of affairs (you can find the discussions in Archive 5 up to Archive 12!). In the end, the WMF agreed to create a new magic word (name to be decided), to be implemented if all goes well near the end of February 2018, which will replace the use of the Wikidata descriptions on enwiki in all cases.

We now need to decide two things. Fram (talk) 09:58, 8 December 2017 (UTC)

How will we populate the magic word with local descriptions?

  1. Initially, copy the Wikidata descriptions by bot
  2. With a bot, use a stripped version of the first sentence of the article (the method described by User:David Eppstein and User:Alsee in Wikipedia talk:Wikidata/2017 State of affairs/Archive 5#Wikipedia descriptions vs Wikidata descriptions)
  3. With a bot, use information from the infobox (e.g. for people a country + occupation combination: "American singer", "Nepali politician", ...) (a rough illustration of how options 2 and 3 might be approached appears after this list)
  4. Start with blanks and fill in manually (for all articles, or just for BLPs)
  5. Start with blanks, allowing filling in manually and/or by bot (bot-filling after successful bot approval per usual procedures)
  6. Other
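Purely as an illustration of options 2 and 3, and not part of the proposal itself, here is a minimal sketch of how a bot might derive a short description from article wikitext. It assumes the mwparserfromhell library for parsing; the infobox name ("Infobox person") and parameter names ("nationality", "occupation") are hypothetical examples only, and any real bot would need per-infobox rules, consensus on length limits, and the usual bot approval.

```python
# Illustrative sketch only: deriving a short description from wikitext.
import re
import mwparserfromhell


def description_from_first_sentence(wikitext, max_len=90):
    """Option 2: strip markup from the lead and trim the first sentence."""
    plain = mwparserfromhell.parse(wikitext).strip_code()
    first_para = next((p for p in plain.splitlines() if p.strip()), "")
    sentence = re.split(r"(?<=[.!?])\s", first_para.strip(), maxsplit=1)[0]
    # Drop the leading "X is/was ..." so only the descriptive part remains.
    sentence = re.sub(r"^.*?\b(?:is|was|are|were)\b\s+", "", sentence, count=1)
    return sentence[:max_len].rstrip(" ,;.")


def description_from_infobox(wikitext):
    """Option 3: combine nationality and occupation from an infobox, if present."""
    for tpl in mwparserfromhell.parse(wikitext).filter_templates():
        if tpl.name.matches("Infobox person"):  # hypothetical example infobox
            parts = [
                tpl.get(p).value.strip_code().strip()
                for p in ("nationality", "occupation")  # example parameters
                if tpl.has(p)
            ]
            if parts:
                return " ".join(parts)  # e.g. "American singer"
    return None
```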

Discussion on initial population

  • #5 – allows bot operations for larger or smaller sets of articles per criteria that don't have to be decided all at once, and manual overrides at all times. --Francis Schonken (talk) 10:28, 8 December 2017 (UTC)
  • #5 is my preference following the reasoning below:
    Option 1, copying from Wikidata, will populate Wikipedia with a lot of really bad descriptions, which will remain until someone gets around to fixing them. My initial rough estimates are that there are more bad/nonexistent Wikidata descriptions than good ones. I strongly oppose this option unless and until someone comes up with solid data indicating that it will be a net gain.
    Option 2, extracting a useful description from the first sentence or paragraph, seems a nice idea at first glance, but how will it be done? Does anyone promoting this option have a good idea of how effective it would be, how long it would take, and whether it would on average produce better descriptions than option 1? This option should be considered unsuitable until some evidence is provided that it is reasonably practicable and will do more good than harm.
    Option 3, copying from the infobox, may work for some of the articles that actually have an infobox with a useful short description, or components that can be assembled into a useful short description. This may work for a useful subset of articles, but it is not yet known how many. I would guess way less than half, so not a good primary option.
    Option 4, start with blanks and fill in manually, is probably the only thing that can be done for a large proportion of articles, my guess being in the order of half. It will have to be done, and is probably the de facto default. It is easy, quick and will do no harm. It is totally compatible with option 5, for which it is the first step.
    Option 5 is starting with option 4 and applying ad hoc local solutions which can be shown to be useful. Any harm is localised, Wikidata descriptions can be used when they are appropriate, extracts from leads can be used when appropriate, mashups from infoboxes can be used when appropriate, and manual input from people who actually know what the article is about can be used when appropriate. I think there is no better, simpler, and more practical option than this, and suggest that projects should consider how to deal with their articles. WPSCUBA already has manually entered short descriptions ready for use for more than half of its articles, which I provided as an experiment. It is fairly time consuming, but gets easier with practice. Some editors may find that this is a fun project, others will not, and there will inevitably be conflicts, which I suggest should be managed by BRD as simple content disagreements, to be discussed on talk pages and finalised by consensus. In effect, option 5 is the wiki way. It is simple and flexible, and likely to produce the best results with the least amount of damage. · · · Peter (Southwood) (talk): 11:26, 8 December 2017 (UTC)
    (There was an edit conflict here and I chose to group all my comments together · · · Peter (Southwood) (talk): 11:26, 8 December 2017 (UTC))
  • #5. Whether a Wikidata description is suitable or not is very different across many groups of articles. It should be decided (and possibly bot populated) per group, sometimes per small group, and for that we need to start from blank descriptions.--Ymblanter (talk) 11:23, 8 December 2017 (UTC)
  • Start with not using it anywhere, only use it as override per situation. —TheDJ (talkcontribs) 12:09, 8 December 2017 (UTC)
    TheDJ, To clarify, is this an Option 6: Other that you are proposing here? i.e. Only add the magic word to articles where the Wikidata short description is unsuitable, and use Wikidata description as default in all cases until someone finds a problem and adds a magic word, after which the short description will be taken from the magic word? If this is the case, what is your opinion on reverting to Wikidata description for any reason at a later date? · · · Peter (Southwood) (talk): 14:53, 8 December 2017 (UTC)
    @Pbsouthwood: Correct. I have no opinions on reverting at a later moment. —TheDJ (talkcontribs) 13:42, 11 December 2017 (UTC)
  • #5. That doesn't deal with all the issues, but it comes closest to my views, given the choices. See also my comments at WT:Wikidata/2017 State of affairs/Archive 12 and WT:Wikidata/2017 State of affairs now archived to WT:Wikidata/2017 State of affairs/Archive 13. - Dank (push to talk) 14:06, 8 December 2017 (UTC)
  • #5 but don't wait too long to fill in where possible. Fram (talk) 14:14, 8 December 2017 (UTC)
    Fram, are you recommending a massive short-term drive to produce short descriptions to make the system useful? · · · Peter (Southwood) (talk): 15:09, 8 December 2017 (UTC)
    • Yes, although this isn't in my view necessary to proceed with this, only preferable. Having no descriptions is better than the current situation, but decent enwiki-based descriptions are in many cases better than no descriptions. No need to throw out the baby (descriptions to be shown in search and so on) with the bathwater. Fram (talk) 15:21, 8 December 2017 (UTC)
  • #6 - don't use it. There has been no consensus to have this magic word in the first place - that is the question that should have been asked in this RfC (see discussion here). I personally think it is a bad idea and a waste of developer time. It's better to focus on improving the descriptions on Wikidata instead. Mike Peel (talk) 15:27, 8 December 2017 (UTC)
  • #6 — Find a solution that monitors and updates Wikidata descriptions — If a description is good enough for Wikipedia(ns), it should be on Wikidata. If vandalism is blocked on Wikipedia, it should be simultaneously reverted on Wikidata. Wikidata is the hub for interwiki links and a storage site for both descriptions and structured data that then are harvested by external knowledge-based search engines (think Siri, Alexa, and Google's Knowledge Graph). For interwiki purposes, we should want to ensure that short descriptions at Wikidata are accurate, facilitating other language Wikipedias when they interlink to en.wiki. For external harvesting, we should want to prevent vandalism from being propagated. The problems regarding vandalism and sourcing on Wikidata are real, but the solution is for Wikipedians and our anti-vandalism bots to be able to easily monitor and edit the relevant Wikidata material. Possible solutions would include: (a) Implementing a pending changes-like functionality for changes to descriptions on high-traffic or contentious pages; (b) Make changes to short descriptions prominently visible on Wikipedia watchlists, inside the VisualEditor, and as a preference option for Wikipedia editors; (c) Develop and implement in-Wikipedia editing of Wikidata short descriptions using some kind of click-on-this-pencil tool.--Carwil (talk) 15:46, 8 December 2017 (UTC)
    • After those solutions are implemented, you are free to ask for an RfC to overturn the consensus of the previous RfC which decided not to have these descriptions. This RfC is a discussion to get solutions which give you what you want on enwiki (descriptions in VE, mobile, ...) without interfering in what Wikidata does (they are free to have their own descriptions or to import ours). Fram (talk) 16:01, 8 December 2017 (UTC)
    Carwil, How do you propose that Wikipedia controls access by vandals to Wikidata? Are you suggesting that Wikipedia admins should be able to protect Wikidata items and block Wikidata users?
    The easy options are… "undo" functionality for Wikidata descriptions in Wikipedia watchlists, and option (a) I proposed above, something like pending-changes that protects pages on Wikipedia from unreviewed changes from Wikidata. Transferring anti-vandalism bots from Wikipedia to Wikidata would also be helpful.--Carwil (talk) 16:47, 8 December 2017 (UTC)
    Cluebot already runs on Wikidata. ChristianKl (talk) 15:19, 17 December 2017 (UTC)
  • Strongly oppose 4 and strongly oppose 5—Let's reject any solution that mass-blanks short descriptions: these are a functional part of mobile browsing and of the VisualEditor. As an editor and a teacher who brings students into editing Wikipedia, the latter functionality is a crucial timesaver. Wikipedia is increasingly accessed by mobile devices and short descriptions prevent clicking through to a page only to find it's not the one you are looking for.--Carwil (talk) 15:46, 8 December 2017 (UTC)
    Have you analysed the overall usefulness of Wikidata descriptions and found that there are more good descriptions than bad, or found a way to find all the bad ones so they can be changed to good? If so please point to your methods and results, as they would be extremely valuable. What methods have you used to indicate the comparative harm done by bad descriptions versus the good done by good descriptions? · · · Peter (Southwood) (talk): 16:14, 8 December 2017 (UTC)
Yes, I have analyzed a sample here. I found that 13 of 30 had adequate descriptions (though 7 of them could be improved), 13 had no descriptions at all, 1 was incorrectly described (not vandalism), 2 were redundant with the article title (i.e., they should be overridden with a blank), and 1 represented a case where the Wikipedia article and the Wikidata entity were not identical and shouldn't share the same description. The redundant descriptions would cause no harm. Mislabelling "Administrative divisions of Bolivia" with the subheading "administrative territorial entity of Bolivia" would cause mild confusion. The legibility provided by descriptions easily outweighs the harms. (The only compelling harm is due to vandalism, which should be addressed by improving vandalism tools, not forking the descriptions between the projects.)--Carwil (talk) 16:47, 8 December 2017 (UTC)
Options 4 and 5 are not to blank anything; they are to put short descriptions, which are text content, into the article they describe, where they can be properly (or at least better) maintained by people who may actually know what the article is about. Wikidata can use them if their terms of use allow, and if they are actually better for Wikidata's purposes, which is by no means clear at present. · · · Peter (Southwood) (talk): 16:14, 8 December 2017 (UTC)
Options 4 and 5 involve starting with blanks everywhere. The whole proposal assumes that we should fork a dataset describing Wikipedia articles into two independently editable versions. Forking a dataset always creates inconsistencies and reduces the visibility of problems by splitting the number of eyes to watch for problems. Better to make Wikipedians' eyes more powerful at spotting problems (which are unusual) rather than to throw up a wall between the two projects. My sample suggests that 90% of the time, or more, the two projects are working towards the same goal here.--Carwil (talk) 16:47, 8 December 2017 (UTC)
They do, but it is unlikely the WMF will switch until the Wikipedia results are no worse than the Wikidata results, though I have no idea how they would measure that, since they don't seem to have much idea of the quality they will be comparing against, or if they do, are not keen on sharing it.
The dataset does not suit Wikipedia. We should not be forced to use it. A dataset that suits Wikipedia may not suit Wikidata. Should we force it on them? Two datasets means Wikipedia can look after their own, and Wikidata can use what they find useful from it, and Wikipedians are not coerced into editing a project they did not sign up for. Using shitty quality data on Wikipedia to exert pressure on Wikipedians to edit Wikidata may have a backlash that will harm either or both projects, not a risk I would be willing to take, if it could affect my employment, unless of course I was being paid to damage the WMF, but that would be conspiracy theory, and frankly I think it unlikely.
I also did a bit of a survey; my results do not agree with yours, and they are also from such a small sample as to be statistically unreliable. I also wrote short descriptions for about 600 articles in WPSCUBA, but did not keep records. Most (more than half) articles needed a new description as the Wikidata one either did not exist or was inappropriate. There were some which were perfectly adequate, but less than half of the ones that actually existed, from memory. It would be possible to go back and count, but I think it would be a better use of my time to do new ones, if anyone is willing to join such a project. Maybe Wikiproject Medicine, or Biography, where quality actually may have real life consequences, but I don't usually work much in those fields and hesitate to move into them without some project participation. I have already run into occasional unfriendly reactions where projects overlap, but fortunately very few. · · · Peter (Southwood) (talk): 18:18, 8 December 2017 (UTC)
  • @Pbsouthwood: I don't think there's much daylight between Wikipedia's purpose for these descriptions (which hasn't been written yet), the value of them for the mobile app, the value of them for the VisualEditor (as disambiguators for making links), and the value for Wikidata as discussed here. There, the requirements include: "a short phrase designed to disambiguate items with the same or similar labels"; avoiding POV, bias, promotion, and controversial claims; and avoiding "information that is likely to change." Only the last one seems likely to differ from the ideal Wikipedia description and only marginally: e.g., "current president of the United States" would have to be replaced with "45th president of the United States."--Carwil (talk) 22:11, 12 December 2017 (UTC)
  • There have been extensive discussions between community and WMF on the description issue. I wish this RFC had gone through a draft stage before posting. There may be other options or issues that may need to be sorted out, potentially affecting the outcome here. A followup RFC might be needed.
    The previous RFC[3] consensus was clearly to eliminate wikidata-descriptions, and that is definitely my position. An alternate option would be to skip creating a description-keyword at all, and just take the description from the lead sentence. That has the benefits of (1) ensuring all articles automatically have descriptions, (2) avoiding any work to create and maintain descriptions, and (3) avoiding the creation of a new independent issue of description-vandalism. The downside is that the lead sentence doesn't always make for a great short description.
    If we go with a new description keyword, #5 #2 and #1 are all reasonable. (#3 and #4 are basically redundant to bot approval in #5). However as I note in the question below, #5 can be implemented with a temporary wikidata-default. This gives us time to start filling in local-descriptions before the wikidata-descriptions are shut off. This would avoid abruptly blanking descriptions. Alsee (talk) 21:49, 8 December 2017 (UTC)
  • #2, with #5 as a second preference. The autogenerated descriptions look like they're good enough for most purposes. Sandstein 16:14, 10 December 2017 (UTC)
    Sandstein, How big was your test sample, and how were the examples chosen? · · · Peter (Southwood) (talk): 16:36, 10 December 2017 (UTC)
  • 5. Mass-importing WD content defeats the purpose of getting rid of WD descriptions. James (talk/contribs) 16:30, 11 December 2017 (UTC)
    Only true for a limited period, until someone gets round to changing them where necessary. If the problem is big enough, there will be bot runs to do fixes, so over the medium term it does not make much difference: once the descriptions are in Wikipedia we can fix them as fast as we can make arrangements to do so, and will no longer be handicapped by WP:ICANTHEARTHAT obfuscations from WMF. The important part is to get them where we have the control so we can start work on getting them right. · · · Peter (Southwood) (talk): 07:47, 13 December 2017 (UTC)
  • 5 Basically what Peter said. In some areas, the wikidata descriptions will be good. In others, the first-sentence stripping will be good. In some, data from infoboxes can be used. Etcetera. Galobtter (pingó mió) 16:20, 19 December 2017 (UTC)
  • 5 - Having just read the discussions on this, I'm absolutely astounded that so much vandalism has taken place. Anyway, back on point: Wikidata is beyond useless when it comes to dealing with vandalism, and as such 5 is the best way of dealing with it! –Davey2010 Merry Xmas / Happy New Year 23:24, 27 December 2017 (UTC)
  • Combination of 1 and 5. Import but keep hidden until reviewed on WP. Doc James (talk · contribs · email) 06:19, 30 December 2017 (UTC)
  • #6 Retain Wikidata descriptions and bypass only those not needed. Eventually all Wikipedias will have to use Wikidata. Moving back and forth does not make much sense. The only thing we could do is possibly add functions to update Wikidata directly and retain functionality to bypass it locally via the magic word. -- Magioladitis (talk) 15:56, 30 December 2017 (UTC)
  • 5 - For reasons stated by others above. Tony Tan · talk 04:17, 17 January 2018 (UTC)
  • 5 - Per above. (Edit just to clarify, at the point I edited this subsection, Fish had not closed the discussion - as they closed the entire thread, there was no conflict and my !vote does not affect outcome anyway) Only in death does duty end (talk) 13:02, 6 February 2018 (UTC)

What to do with blanks

What should we do when there is no magic word, or the magic word has no value?

  1. Show the Wikidata description instead
  2. Show no description
  3. Show no description for a predefined list of cases (lists, disambiguation pages, ...) and the Wikidata one otherwise (this is the solution advocated by User:DannyH (WMF) at the moment)
  4. Other
  5. A transition from #1 to #2. In the initial stage, any article that lacks a local description will continue to draw a description from Wikidata. We deploy the new description keyword and start filling in local descriptions which override Wikidata descriptions. Once we have built a sufficient base of local descriptions, we finalize the transition by switching-off Wikidata descriptions completely. (Note: Added 16:34, 6 January 2018 (UTC). Previous discussion participants have been pinged to discuss this new option in subsection Filling in blanks: option #5.)

Discussion on blanks

  • #2 – comes closest to having no description per initial aborted RfC; those who want them can write them, or fill in automatically (per usual bot approval procedures). --Francis Schonken (talk) 10:28, 8 December 2017 (UTC)
    • No reasonable variant/alternative/compromise has been proposed since I supported #2 a month ago. Please replace DannyH as an intermediary (not the first time I suggest this): they have been pretty clear about their inability to propose anything tangible. The well-being of the Wikipedia project should not be left in the hands of those who are paid to improve the project but can deliver next to nothing. --Francis Schonken (talk) 10:37, 8 January 2018 (UTC)
  • #5 as a reasonable compromise per various discussions below. · · · Peter (Southwood) (talk): 18:24, 6 January 2018 (UTC) #2 The Wikidata description should not be allowed as a default where there is no useful purpose to be served by a short description. An empty parameter to the magic word must be respected as a Wikipedia editorial decision that no short description is wanted. This decision can always be discussed on the talk page. Under no circumstances should WMF force an unwanted short description from Wikidata as a default. Nothing stops anyone from manually adding a description which is also used by Wikidata, but that is a personal decision of the editor and they take personal responsibility as for any other edit. Automatically providing no description for a predefined list of classes has problems, in that those classes may not be as easily defined as some people might like to think. For example, most list articles don't need a short description, but some do. The same may be true for disambiguation pages. Leaving them blank as the first stage and not displaying a short description until a (hopefully competent) editor has added one is easy to manage for the edge cases, and may be managed by other methods per option 5 of population. It is flexible and can deal with all possibilities. There is no need to make it more complicated and liable to break some time. Ideally the magic word could be given a comment in place of a parameter where an explanation of why there should not be a short description would be useful. In this case the comment should not be displayed and is there to inform editors who might wonder if it had been missed. · · · Peter (Southwood) (talk): 11:43, 8 December 2017 (UTC)
  • #1 - Show the wikidata description instead. —TheDJ (talkcontribs) 12:10, 8 December 2017 (UTC)
  • #2. No magic word (and magic word with no parameter) should result in no description, not some non-enwiki data being confusingly shown to readers (while being missed by most vandalism patrollers apparently). Today, for 8 hours, we had this blatant BLP violation on a page with 10,000 pageviews per day. Using these descriptions by default (or at all) is a bad idea, and was rejected at the previous RfC. Fram (talk) 14:21, 8 December 2017 (UTC)
 
  • #1 - From the WMF: We're proposing using Wikidata as the fallback default if there isn't a defined magic word on Wikipedia, because short descriptions are useful for readers (on the app in search results, in the top read module, at the top of article pages) and for editors (in the Visual Editor link dialog). For example: in the top read module from September pictured here, 3 of the 5 top articles benefit from having a short description -- I don't know who Gennady Golovkin and Canelo Álvarez are, and having them described as "Kazakhstani boxer" and "Mexican boxer" tells me whether I'm going to be interested in clicking on those. (The answer on that is no, I'm not really a boxing guy.) I know that Mother! is a 2017 film, but I'm sure there are lots of people who would find that article title completely baffling without the description. Clicking through to the full list of top read articles, there are a lot of names that people wouldn't know -- Amber Tamblyn, Arjan Singh, Goran Dragić. This is a really popular feature on the apps, and it would be next to useless without the descriptions.
We want to create the magic word, so that Wikipedia editors have editorial control over the descriptions, which they should. But if the magic word is left blank on Wikipedia -- especially in the cases where Wikipedia editors haven't written a description yet -- then for the vast majority of cases, showing the description from Wikidata is better than not showing anything at all. As a reader looking at that top read module, I want to know who Gennady Golovkin is, and the module should say "Kazakhstani boxer," whether that text comes from Wikipedia or Wikidata.
I know that a big reason why people are concerned about showing the Wikidata descriptions is that the Wikidata community may sometimes be slower than the Wikipedia community to pick up on specific examples of vandalism. The example that Fram cites of Henry VIII of England showing "obey hitler" for two hours is disappointing and frustrating. However, I think that the best solution there should be to improve the community's ability to monitor the short descriptions, so that vandalism or mistakes can be spotted and reverted more quickly. The Wikidata team has been working on providing more granular display in watchlists on Wikipedia, so that Wikipedia editors can see edits to the descriptions for the articles that they're watching, without getting buried by other irrelevant edits made to that Wikidata item. That work is being tracked in this Phabricator ticket -- phab:T90436 -- but I'm not sure what the current status is. Ping for User:Lydia Pintscher (WMDE) -- do you know how this is progressing?
Sorry for only getting back to this now. It slipped through. So we have continued working on improving which changes show up in the recent changes and watchlist here from Wikidata. Specifically we have put a lot of work into scaling the current system, which is a requirement for any further improvements. We have made the changes we are sending smaller and we have made it so that fewer changes are sent from Wikidata to Wikipedia. We have also rolled out fine-grained usage tracking on more wikis (cawiki, cewiki, kowiki, trwiki) to see how it scales. With fine-grained usage tracking you will no longer see changes in recent changes and watchlist that do not actually affect an article, as is happening now. The roll-outs on these wikis so far look promising. In January we will continue rolling it out to more wikis and see if it scales enough for enwiki. At the same time we will talk to various teams at the developer summit in January to brainstorm other ways to make the system scale better or overhaul it. --Lydia Pintscher (WMDE) (talk) 09:31, 19 December 2017 (UTC)
We've talked in the previous discussions about types of pages where the Wikidata descriptions aren't useful for article display, because they're describing the page itself, rather than the subject of the article. The examples that I know right now are category pages (currently "Wikimedia category page"), disambiguation pages ("Wikimedia disambiguation page"), list pages, and the main page. Those may be helpful in the case of the VE link dialog, especially "disambiguation page", but there's no reason to display those at the top of the article page, where they look redundant and kind of silly. We're proposing that we just filter those out of the article page display, and anywhere else where they're unnecessary. I'd like to know more examples of pages where short descriptions aren't useful, if people know any.
For article pages, I don't know of any examples so far where a blank description would be better for the people who need them (people reading, searching or adding links on VE). If we're going to build the "show a blank description" feature, then we need to talk about specific use cases where that would be the best outcome. That's how product development works -- you don't build a feature if you don't have any examples of where it would be useful. If people have specific examples, then that would help a lot. -- DannyH (WMF) (talk) 14:58, 8 December 2017 (UTC)
"For article pages, I don't know of any examples so far where a blank description would be better " Check the two examples of vandalism on pages with 10K+ pageviews per day I gave in this very discussion, including one very blatant BLP violation which lasted for 8 hours today. In these examples, a blank description would have been far preferable over the vandalized one, no? Both articles, by the way, are semi-protected here, so that vandalism couldn't have done by the IPs here (and would very likely have been caught much earlier). "specific use cases where that would be the best outcome." = all articles, and certainly BLPs. Fram (talk) 15:19, 8 December 2017 (UTC)
 
[Image to the right: screenshot of a vandalised Wikidata description, captioned "Better than no description?"]
If you want another example of where no description would be preferable over the Wikidata one, look to the right. This is what people who search for WWII (or have it in "related articles", the mobile app, ...) see right now and have seen for more than 5 hours (it will undoubtedly soon be reverted now that I have posted this here). This kind of thing happens every day, and way too often on some of our most-viewed pages. Fram (talk) 15:38, 8 December 2017 (UTC)
I agree that the vandalism response rate on Wikidata is sometimes too slow. I think the solution to that is to make that response rate better, by making it easier for Wikipedia editors to monitor and fix vandalism of the descriptions. I disagree that the best solution is to pre-emptively blank descriptions because we know that there's a possibility that they'll be vandalized. I'm asking for specific examples where editors would make the choice to not show a description on the article page, because a blank description is better than the majority of good-to-adequate descriptions already on Wikidata. -- DannyH (WMF) (talk) 16:10, 8 December 2017 (UTC)
And I am saying that this is a red herring. Firstly, you claim that there exists a majority of good-to-adequate descriptions on Wikidata, without any convincing evidence that this is the case. I am stating that out of several hundred short descriptions that I produced, there were a non-zero number of cases where a short description made no apparent improvement over the article title by itself. · · · Peter (Southwood) (talk): 16:21, 8 December 2017 (UTC)
DannyH (WMF), Filtering descriptions out of the article page view means that they will be invisible for maintenance, which is very bad, unless they are filtered out based on content, not on page type, which may be technically problematic - you tell me, I don't write filter code. Can you guarantee that no vandalism can sneak through by this route? As long as they are visible anywhere in association with the Wikipedia article they are a Wikipedia editorial issue. · · · Peter (Southwood) (talk): 15:39, 8 December 2017 (UTC)
We are not asking for a development feature to leave out descriptions that don't exist; it is the simplest possible default. Please try to accept that simply displaying whatever content is in the magic word parameter is the simplest and most versatile solution, and that if we leave it blank that is because we prefer it to be left blank. If anyone prefers to have a short description in any of these cases, they can edit Wikipedia to put in the one they think is right, and if anyone disagrees strongly enough to want to remove it, they can follow standard procedure for editorial disagreement, which is to get consensus on the talk page. It is not rocket science, it is the Wikipedia way of doing these things. If it is difficult for the magic word to handle a comment in the parameter we can simply put the comment outside. There may be a few more cases where people will fail to notice that it is there, but probably not a train smash. Is there any reason why a comment in the parameter space should not be parsed as equivalent to no description? I have asked this before, and am still waiting for an answer. · · · Peter (Southwood) (talk): 15:39, 8 December 2017 (UTC)
  • #1 - and focus on improving the descriptions on Wikidata. Mike Peel (talk) 15:27, 8 December 2017 (UTC)
    • See discussion at Wikipedia_talk:Wikidata/2017_State_of_affairs#Circular_"sourcing"_on_Wikidata - I've posted a random sample of 1,000 articles and descriptions, of which only 1 description had a typo and none seemed to be blatantly wrong - although 39% don't yet have a description. So let's add those extra descriptions / improve the existing ones, rather than forking the system. Thanks. Mike Peel (talk) 00:14, 12 December 2017 (UTC)
      • That sample includes the many typical descriptions which are right on Wikidata and useless (or at least very unclear) for the average enwiki reader: "Wikimedia disambiguation page" (what is Wikimedia, shouldn't that be Wikipedia, and even then, I know I'm on Wikipedia, and we don't use "Wikipedia article" as description for standard articles either...) There are also further typos ("British Slavation Army officer"), useless descriptions ("human settlement", can we be slightly more precise please), redundant ones (Shine On (Ralph Stanley album) - "album by Ralph Stanley")... And the basic issue, that language-based issues shouldn't be maintained at Wikidata but at the specific languages, is not "forking", it is taking back content which doesn't belong at Wikidata but at enwiki. Fram (talk) 05:45, 12 December 2017 (UTC)
    • You may add "Descriptions not in English" to the problems list from that sample: "Engels; schilder; 1919; Londen (Engeland); 1984". Fram (talk) 06:01, 12 December 2017 (UTC)
    • And "determined sex of an animal or plant. Use Q6581097 for a male human" is not really suitable for use on enwiki either (but presumably perfect for Wikidata). Neeraj Grover murder case - "TV Executive" seems like the wrong description as well. Stefan Terzić - "Team handball" could also use some improvement. Fram (talk) 07:56, 12 December 2017 (UTC)
      • OK, so maybe 4/1000 have typos/aren't in English/are wrong - that's still not bad. Most of the rest seems to be WP:IDONTLIKEIT (where I'd say WP:SOFIXIT on Wikidata, but you don't want to do that). Yes, it is forking - the descriptions currently only exist on Wikidata (we've never had them on Wikipedia), and they aren't going away because of this - so you want to fork them, and in a way that means the two systems can't later be unforked (due to licensing issues). That's not helpful, particularly in the long term. Mike Peel (talk) 19:58, 12 December 2017 (UTC)
        • I gave more than 4 examples, some 40% don't have a description (so can hardly be wrong, even if many of those need a description), and many have descriptions we can't or shouldn't use. Basically, you started with 0.1% problem in your view, when it is closer to 50% in reality. Please indicate which licensing issues you see which would make unforking impossible. It seems that these non-issues would then also make it impossible to import the Wikidata descriptions, no? Seems like a red herring to me. By the way, have you ever complained about forking when Wikidata was populated with millions of items from enwiki (and other languages), where from then on they might evolve separately? Or is forking only an issue when it is done from Wikidata to enwiki, and not the reverse? Fram (talk) 22:28, 12 December 2017 (UTC)
          • Only a few of your example problems seem to be actual problems, the rest are subjective. You're proposing that we switch to 100% without description, so I can't see how you can argue about the 40% blank descriptions (and they weren't a problem at the start of this discussion). I'm not saying 0.1%, but ~1% seems reasonable here. Enwp descriptions are CC-BY-SA licensed, which means they can't be simply copied to Wikidata as that has a CC-0 license (and yes, this isn't great, and copyrighting the simple descriptions doesn't make any sense, but it is what it is) - although that means that we can still copy from Wikidata to here if needed. I'm complaining that we're forking things here to do the same task (describing topics), and that we're trying to do so using the wrong tool (free text with hacks) rather than a better tool (a structured database). Mike Peel (talk) 23:01, 12 December 2017 (UTC)
            • Ah, the old "structured database" vs "free text with hacks" claim, I wondered why it wasn't mentioned yet. In Wikidata, you are putting free text in a database field, which then at runtime gets read and displayed. In enwiki, you are putting free text in a "magic word" template, which then at runtime gets read and displayed. Pretending that the descriptions in Wikidata aren't free text and in enwiki are free text is not really convincing. However, what is the wrong tool for the task is Wikidata, as that is not part of the enwiki page history and wikitext, and thus can't be adequately monitored, protected, ... The only "hack" is the current one, using Wikidata to do something enwiki can do better (and which philosophically also belongs on enwiki, as it is language-based text, not some universally accepted value). Fram (talk) 07:53, 13 December 2017 (UTC)
  • #2, or transition from #1 to #2. I have engaged in significant discussions with the WMF on the descriptions issue on the Wikidata/2017 State of affairs talk page. The WMF has valid concerns about abruptly blanking descriptions, and we should try to cooperate on those concerns. Temporarily letting a blank keyword default to wikidata (#1) will give us time to begin filling empty local descriptions before shutting off wikidata descriptions (#2). But in the long run my position is definitely #2. Alsee (talk) 21:02, 8 December 2017 (UTC) Adding explicit support for #5, which essentially matches my original !vote. Alsee (talk) 16:36, 6 January 2018 (UTC)
    This could work. While we are filling in short descriptions, whenever we find an article that should not have a short description, we could put in a non-breaking space to override an unnecessary Wikidata description. We will need to see, on desktop too, the actual display shown on mobile, so we can see what we are doing. As long as there is a display of the short description in actual use on desktop, it might be unnecessary to switch. That would reduce the pressure to rush the process, which may be a good thing, but also may not. · · · Peter (Southwood) (talk): 10:12, 9 December 2017 (UTC)
    Alsee, thanks. I've been staying out of conversations about if/when/how the magic word gets used/populated, because I think those are the content decisions that need to be made by the English WP community. I want to figure out how we can get to the place where Wikipedia editors have proper editorial control over the short descriptions, without hurting the experience of the readers and editors who are using those descriptions now. -- DannyH (WMF) (talk) 23:29, 11 December 2017 (UTC)
You can enable a view of the Q-code, short description and alias via this script: [[4]].--Carwil (talk) 13:01, 9 December 2017 (UTC)
Carwil, This is exactly the kind of display I had in mind. It is easily visible, but obviously not part of the article per se, as it is displayed with other metadata in a different text size. To be useful it would have to be visible to all editors who might make improvements to poor quality descriptions, so would have to be a default display on desktop. This may not be well received by all, but it would be useful, maybe as an opt-out for those who really do not want to know. It still does not deal with the inherent problems of having the description on Wikidata, in that it is not Wikipedia and we do not dictate Wikidata's content policies, control their page protection, block their vandals etc, but it does let us see what is there, and fixing is actually quite easy, though maybe I am biased as I have done a fair amount of work on Wikidata. I would be interested to hear the opinions of people who have not previously edited Wikidata on using this script. I can definitely recommend it to anyone who wants to monitor the Wikidata description. Kudos to Yair rand.· · · Peter (Southwood) (talk): 16:15, 9 December 2017 (UTC)
It also does not solve the problem of different needs for the description. When the Wikidata description is unsuitable for Wikipedia, we should not arbitrarily change it if it is well suited to Wikidata's purposes, but if it is going to be used for Wikipedia, we may have to do just that.· · · Peter (Southwood) (talk): 16:21, 9 December 2017 (UTC)
  • #2. Any Wikidata import should be avoided because that content is not subject to Wikipedia editorial control and consensus. Sandstein 16:16, 10 December 2017 (UTC)
    Sandstein, My personal preference is that eventually all short descriptions should be part of Wikipedia, and not imported in run time, however, as an interim measure, to get things moving more quickly, I see some value in initially displaying the Wikidata description as a default for a blank magic word parameter, as it is no worse than what WMF are already doing, and in my opinion are likely to continue doing until they think the Wikipedia local descriptions are better on average. If anyone finds a Wikidata description on display that is unsuitable, all they have to do is insert a better one in the magic word and it immediately becomes a part of Wikipedia. If you find a Wikidata description that is good, you can also insert it into the magic word and make it local, as they are necessarily CC0 licensed. The only limitation on getting 100% local content is how much effort we as Wikipedians are prepared to put into it. Supporters of Wikidata can improve descriptions on Wikidata instead if that is what they prefer to do, and as long as a good short description is displayed, it may happen that nobody feels strongly enough to stop allowing it to be used. I predict that whenever a vandalised description is spotted, most Wikipedians will provide a local short description, so anyone in favour of using Wikidata descriptions would be encouraged to work out how to reduce vandalism and get it fixed faster, which will greatly improve Wikidata. Everybody wins, maybe not as much as either side would prefer, but more than they might otherwise. As it would happen, WMF win the most, but annoying as that may be to some, we can live with it as long as we also have a net gain for Wikipedia and Wikidata. · · · Peter (Southwood) (talk): 16:58, 10 December 2017 (UTC)
  • 2. We have neither the responsibility nor the authority to enforce WP guidelines on a project with diametrically opposed policies. Content outside of WP's editorial control should not appear on our pages, period. James (talk/contribs) 16:34, 11 December 2017 (UTC)
  • 2 comes closest to my views, given the choices. See my comments at WT:Wikidata/2017 State of affairs/Archive 12 and WT:Wikidata/2017 State of affairs. Also see the RfC from March; most of what was said there is equally relevant to the current question. - Dank (push to talk) 21:01, 11 December 2017 (UTC)
  • Comment from WMF: I want to say a word about compromise and consensus. I've been involved in these discussions for almost three months now, and there are a few things that I've been consistent about.
First is that I recognize and agree that the existing feature doesn't allow Wikipedia editors to have editorial control over the descriptions, and it's too difficult for Wikipedia editors to see the existing descriptions, monitor changes, and fix problems when they arise. Those are problems that need to be fixed, by the WMF product team and/or the Wikidata team.
Second: the way that we fix this problem doesn't involve us making the editorial decisions about the format or the content. That's up to the English Wikipedia and Wikidata communities, and if there's disagreement between people in those communities, then ultimate control should be located on Wikipedia and not on Wikidata. In other words: when we build the magic word, we're not going to control how it's used, how often, or what the format should be. I think that both of these two points are in line with what most of the people here are saying.
The third thing is that we're not going to agree to a course of action that results in the mass blanking of existing descriptions, for any meaningful length of time. I recognize that that's something that most of the people here want us to build, but that would be harmful to the readers and editors that use those descriptions, and that matters. This solution needs to have consensus with us, too, because we're the ones who are going to build it. I'm not saying that we're going to ignore the consensus of this discussion; I'm saying that we need to be a part of that consensus. -- DannyH (WMF) (talk) 15:13, 12 December 2017 (UTC)
How many people have actually complained in the 8 months or so that descriptions have now been disabled in mobile view? "readers and editors that use those descriptions": which editors would that be? Anyway, basically you are not going to interfere in content decisions, unless you don't like the result. But at the same time you can't be bothered to provide the necessary tools to patrol and control your features (and your first point is rather moot when this magic word goes live and works as requested anyway). Which is the same thing you did (personally and as WMF) with Flow, Gather, ... which then didn't get changed, improved, gradually accepted, but simply shot down in flames, at the same time creating lots of unnecessary friction and bad blood. Have you actually learned anything from those debacles? Most people here actually want to have descriptions, and these will be filled quite rapidly (likely to a higher percentage than what is provided now at Wikidata). But we will fill them where necessary, and we will leave them blank where we want them to be blank. You could have suggested over the past few months a compromise, where either "no magic word" or "magic word with no description" would mean "take the wikidata description", and the other meant "no description". You could have suggested "after the magic word is installed, we'll take a transitional period of three months, to see if the descriptions get populated here on enwiki; afterwards we'll disable the "fetch desc from wikidata" completely". Instead you insisted that the WMF would have the final say and would not allow blanks unless it was for a WMF-preapproved list of articles (or article groups). Why? No idea. If the WMF is so bothered that readers should get descriptions no matter what (even if many, many articles don't have Wikidata descriptions anyway in the first place), then they should hire and pay some people to monitor these and make sure that e.g. blatant BLP violations don't remain for hours or days. But forcing us to display non-enwiki content against our will and without providing any serious help in patrolling it is just not acceptable. Fram (talk) 15:44, 12 December 2017 (UTC)
Fram, those compromises are what I'm asking for us to discuss. I'm glad you're bringing them up, that's a conversation that we can have. I'm going to be talking to the Wikidata team next week about the progress on building the patrolling and moderation tools. We don't have direct control over what the Wikidata team chooses to do, but I want to talk with them about how the continued lack of a way to effectively monitor the short descriptions is affecting this conversation, this community, and the feature as a whole. English Wikipedia editors need to have the tools to effectively populate and monitor the descriptions, and you need to have that on a timeline that makes sense. I need to talk to more people, and keep working on how to make that happen. I'm going to talk with people internally about the transitional period that you're suggesting. -- DannyH (WMF) (talk) 16:04, 12 December 2017 (UTC)
I think the major concern is the lack of control over enwp content. There are currently only two outside sources of enwp content over which the local community has no control: Commons and Wikidata; it has taken some years for Commons to build a level of trust over their content policies and failsafes to prevent abuse at enwp through Commons. The only reason the use of Commons materials here works today is two-fold: 1) they've proven they can handle their business, and 2) there exist local over-rides that are transparent and easy to enact. For Wikidata to be useful and to avoid the kind of acrimony we are seeing here, we would need the SAME thing from Wikidata. Point 1) can only occur over time, and Wikidata is far too new to be proven in that direction. Recent gaffes in allowing vandalism off-site at Wikidata to perpetuate at enwp do not help either. If the enwp community is going to feel good about allowing Wikidata to be useful going forward, until that trust reaches what Commons has achieved, we need point 2 more than anything. Defaulting to local control over off-site control is necessary, and any top-down policy that removes local control, either directly or as a fait accompli by subtly controlling the technology, is unlikely to be workable. If Wikidata can prove their ability to take care of their own business reliably over many years, the local community would feel better about handing some of that local control over to them, as works with Commons now. But that cannot happen today, and it cannot happen if local overrides are not simple, robust, and the default. --Jayron32 17:32, 12 December 2017 (UTC)
"English Wikipedia editors need to have the tools to effectively populate and monitor the descriptions, and you need to have that on a timeline that makes sense." You know, yiou have lost months doing this by continually stalling the discussions and "misinterpreting" comments (always in the same direction, which is strange for real misunderstandings and looks like wilful obstruction instead). You just give us the magic word, and then we have the tools to monitor the descriptions: recent changes, watchlists, page histories, ... plus tools like semi- or full protection and the like. We can even build filters to check for these changes specifically. We can build bots to populate them. From the very start, everyone or nearly everyone who was discussing these things with you has suggested or stated these things, you were the only one (or nearly the only one) creating obstacles and finding issues with these solutions where none existed. "I'm going to be talking to the Wikidata team next week about the progress on building the patrolling and moderation tools." is totally and utterly irrelevant for this discussion, even though it is something that is sorely needed in general. Patrolling and moderating Wikidata descriptions is something we are not going to do; we will patrol and moderate ENWIKI descriptions, and we have the tools to do so (a conversation may be needed whether the descriptions will be shown in the desktop version or not, this could best be a user preference, but that is not what you mean). Please stop fighting lost battles and get on with what is actually decided and needed instead. Fram (talk) 17:54, 12 December 2017 (UTC)
I'm talking to several different groups right now -- the community here, the WMF product team, and the Wikidata team -- and I'm trying to get all those groups to a compromise that gives Wikipedia editors the control over these descriptions that you need, and doesn't result in mass blanking of descriptions for a meaningful amount of time. That's a process that takes time, and I'm still working with each of those groups. I know that there isn't much of a reason for you to believe or trust me on this. I'm just saying that's what I'm doing. -- DannyH (WMF) (talk) 18:02, 12 December 2017 (UTC)
Indeed, I don't. I'm interested to hear why you would need to talk to the Wikidata team to find a compromise about something which won't affect the Wikidata team one bit, unless you still aren't planning on implementing the agreed upon solution and let enwiki decide how to deal with it. Fram (talk) 22:28, 12 December 2017 (UTC)
Fram, you're saying "us", "we", etc. here rather freely. Please do not speak for all editors here, particularly when putting your own views forward at the same time. There's a reason we have RfC's... Thanks. Mike Peel (talk) 21:11, 12 December 2017 (UTC)
Don't worry, I'm not speaking for you. But we (enwiki) had an RfC on this already, and it's the consensus from there (and what is currently the consensus at this RfC) I'm defending. There's indeed a reason we have RfC's, and some of us respect the results of those. Fram (talk) 22:28, 12 December 2017 (UTC)
I'm glad you're not speaking for me - but why are you trying to speak for everyone except for me? What consensus are you talking about, this RfC is still running (although I'm worried that potential participants are being scared off by these arguments in the !vote sections)? And what consensuses are you accusing me of disrespecting? Mike Peel (talk) 22:40, 12 December 2017 (UTC)
FWIW, Fram definitely speaks for me. James (talk/contribs) 23:24, 12 December 2017 (UTC)
You don't really seem to care about the results of the previous RfC on this, just like you didn't respect the result of the WHS RfC when your solution was not to revert to non-Wikidata versions, but to bot-move the template uses to a /Wikidata subpage which was identical to the rejected template. Basically, when you have to choose between defending Wikidata use on enwiki or respecting RfCs, you go with the former more than the latter. Fram (talk) 07:53, 13 December 2017 (UTC)
ok... I propose that from this point on, DannyH, User:Mike Peel and User:Fram, cease any further participation in this RfC. You three and your mutual disagreements are again completely dominating the discussion, the exact thing that the Arbcom case was warning against. This is NOT helping the result of this discussion. —TheDJ (talkcontribs) 14:24, 13 December 2017 (UTC)
  • #3 – This makes the most sense to me for reasons I stated above. I would amend #3 only by saying: immediately populate a local description for any pages being actively protected from vandalism, which could just mean protected pages, or could mean (where appropriate) pages subject to arbitration enforcement as well.--Carwil (talk) 18:02, 13 December 2017 (UTC)
  • #1 This whole idea is just adding complexity over a rather small problem. The less duplication of the data, the better. We should focus on ways to follow more projects at a glance and on better tools to follow changes on Wikipedia, rather than splitting the Wikimedian forces across all the different projects. Co-operation and sharing are the essence of these projects, not control, defiance and data duplication. TomT0m (talk) 16:34, 15 December 2017 (UTC)
  • #1 per DannyH. Additionally, we can configure protected articles to never display data from Wikidata. It's worth noting that this option allows you to run a bot that puts " " as description for a specific class of articles when you don't like the kind of descriptions that Wikidata shows for those articles. ChristianKl (talk) 15:29, 17 December 2017 (UTC)
    For clarification, is your claim that we can configure protected articles to never display data from Wikidata based on knowing how this could be done, and that it is a reasonably easy thing to do, or is it a conjecture? Bear in mind how WMF is using the data on the mobile display. I ask because I do not know how they do it, so cannot predict how easy or otherwise it would be to block from the Wikipedia side. Ordinary logic suggests that it may not be so easy, or it would already have been done. · · · Peter (Southwood) (talk): 04:44, 18 December 2017 (UTC)
    Without the magic keyword being active, it's not possible to easily prevent the import. However, once the feature is implemented you will be able to run a bot quite easily that creates "magic keyword = ' '" for every article that's protected, or for other classes of articles where there's the belief that the class of article shouldn't import from Wikidata and is better off showing the user a ' ' instead of the Wikidata description. (A rough sketch of such a bot appears at the end of this section.)
    Additionally, I think the WMF should hardcode a limitation that once a Wikipedia semiprotects an article the article stops displaying Wikidata derived information. That would take some work on the WMF, but if that's what they have to do to get a compromise I think they would be happy to provide that guarantee. ChristianKl (talk) 12:31, 18 December 2017 (UTC)
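    To make the bot run described above concrete, here is a rough Pywikibot sketch. It assumes the magic word ends up spelled something like {{SHORTDESC: }} (a placeholder; the real name is still to be decided) and that blanking edit-protected mainspace articles is what gets approved; a real run would of course need bot approval first.

import pywikibot

site = pywikibot.Site('en', 'wikipedia')
BLANK_DESC = '{{SHORTDESC: }}\n'  # placeholder syntax; the actual magic word name is undecided

# Walk through edit-protected mainspace articles and prepend a blank local
# description, so the display never falls back to the Wikidata description.
for page in site.protectedpages(namespace=0, type='edit'):
    if 'SHORTDESC' in page.text:
        continue  # a local description (blank or otherwise) already exists
    page.text = BLANK_DESC + page.text
    page.save(summary='Add blank local short description to protected article (sketch)')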
    I would be both encouraged and a bit surprised to see WMF provide a guarantee. So far they have been very careful to avoid making any commitments to anything we have requested. I will believe it when I see it. I have no personal knowledge of the complexity of coding a filter that checks whether the article is protected or semi-protected, and uses that to control whether a Wikidata description is shown, that is fast and efficient enough to run every time a short description may be displayed, but I would guess that this is an additional overhead that WMF would prefer to avoid. Requiring such additional software could also delay getting the magic word implemented, which would be a major step in the wrong direction. This needs to be simple and efficient, so the bugs will be minimised and speed maximised. Putting in a blank string parameter that displays as a blank string is easy and simple and requires no complicated extra coding. This can be done by any admin protecting an article where there is no local short description. · · · Peter (Southwood) (talk): 16:33, 18 December 2017 (UTC)
    ChristianKl, Wouldn't it make the most sense to produce a description for each protected article, rather than produce a blank? We're talking about a considerable shorter list than all articles here. Then we would have functionality (what WMF says they want) and protection from vandalism on Wikidata.--Carwil (talk) 15:23, 19 December 2017 (UTC)
    I agree with you that writing descriptions in those cases makes sense, but the people who voted #2 seem to have the opinion that this isn't enough protection from vandalism on Wikidata, and don't want Wikidata content to be shown even if the 'magic word' is empty. ChristianKl (talk) 17:55, 19 December 2017 (UTC)
  • #1 Most of the time, for most purposes, the Wikidata descriptions are fine. #1 gives a sensible over-ride mechanism for cases where there is particular sensitivity. Jheald (talk) 20:55, 17 December 2017 (UTC)
    Unless you have a reasonably robust analysis to base this prediction on, it is speculation. However it will make little difference in the long run, as it will be easy to override the Wikidata descriptions through the magic word. It is just a question of how tedious it would be, depending on what proportion of 5.5 million will have to be done.· · · Peter (Southwood) (talk): 04:44, 18 December 2017 (UTC)
  • #2 Agree with Alsee. Most sensible. Wikidata entries can be imported initially, if needed. It's easiest if things are kept in the same area, with the same community and policies. I really don't see the advantage of having things on Wikidata - the description is different for every language. Galobtter (pingó mió) 11:54, 19 December 2017 (UTC) Peter Southwood makes excellent practical points - as an interim measure, to get things moving more quickly, I see some value in initially displaying the Wikidata description as a default for a blank magic word parameter, as it is no worse than what the WMF is already doing. Galobtter (pingó mió) 11:58, 19 December 2017 (UTC) Addendum: #5 is basically it, yeah. What matters is that in the long run it's hosted here. Galobtter (pingó mió) 07:21, 10 January 2018 (UTC)
Galobtter—Wikidata has a data structure that maintains separate descriptions for each language (or at least each language with a Wikipedia).--Carwil (talk) 15:23, 19 December 2017 (UTC)
I know that; what I'm saying is, why have all that when it's mostly useful only to that language's Wikipedia? Other data on Wikidata is useful across Wikipedias - raw numbers, language links, etc. Galobtter (pingó mió) 15:27, 19 December 2017 (UTC)
If the description is defined in the article text, then it can only be used in that article, on that language Wikipedia. If the description is on Wikidata, then you can access it from other articles (e.g., list articles), and places like Commons (descriptions for categories on the same topic as the articles), or on Wikivoyage etc. If it's in a data structure like on Wikidata, then it's a lot easier to automatically reuse it than if it's embedded in free-form text. Thanks. Mike Peel (talk) 15:34, 19 December 2017 (UTC)
The description on Wikidata will remain available, if other projects or environments want to use it and think it suits their purpose (and can better access it than Enwiki magic words). This is rather out-of-scope for this discussion though, which won't change anything on Wikidata. Fram (talk) 15:47, 19 December 2017 (UTC)
  • 2 because Wikidata is shite and shouldn't ever be used, regardless of how good it is in some articles. WMF have had ample opportunity to patrol that site and do something about it; instead they've ignored requests time and time again, so it's about time we did something ourselves. Anyway, back on point: using blanks is better - if someone adds a silly description they'll be reverted, and no doubt one will be added. –Davey2010 Merry Xmas / Happy New Year 23:31, 27 December 2017 (UTC)
  • Combination of 2 and 1: Wikidata descriptions are off (except for maybe a brief run-in), but will possibly be shown ONLY once changes to them can appear in the watchlist. Doc James (talk · contribs · email) 06:24, 30 December 2017 (UTC)
  • 2, because we need complete editorial control over this type of content. Alternatively, #5 as a compromise. Tony Tan · talk 04:21, 17 January 2018 (UTC)
  • I am convinced that vandalism affects how brief descriptions in search results are displayed, making me reluctant to vote for #1. True, showing no descriptions would be less helpful for readers searching for lesser-known topics with less recognizable titles/names. Nevertheless, I see the consensus favoring local control over central control. #5 is nice, but switching Wikidata descriptions off is not helpful. Rather, there would be too much local work manually and/or by bot. Whether #2 or #5, local descriptions would be vandalized (unless protected in some sort of way?) in the same way Wikidata has been vandalized. Well, let's go for #2, per most others. Either also or alternatively, how about #4 - have options #1 and #2 via user preferences, but use #2 as the default choice? If anyone wants to use Wikidata descriptions, why not provide the option in "user preferences"? George Ho (talk) 12:23, 24 January 2018 (UTC)

Filling in blanks: option #5

Pinging everyone who !voted in the What to do with blanks discussion: Francis Schonken, Peter (Southwood), TheDJ, Fram, DannyH (WMF), Mike Peel, Sandstein, James, Dank, Carwil, TomT0m, ChristianKl, Jheald, Galobtter, Davey2010, Doc James.

If the debate is viewed as a simple option #1 vs option #2 outcome, the situation is rather unpleasant. By my count there is currently a majority for #2; however, it's not exactly overwhelming. On the other hand, #1 has substantial minority support, and the WMF is strongly averse to the possibility that descriptions get mass-blanked before we can repopulate with new descriptions.

There has been substantial discussion of an alternative that was not presented in the original RFC choices: a transition from #1 to #2. In the initial stage, any article that lacks a local description will continue to draw a description from Wikidata. We deploy the new description keyword and start filling in local descriptions which override Wikidata descriptions. Once we have built a sufficient base of local descriptions, we finalize the transition by switching-off Wikidata descriptions completely.

I believe multiple people above have expressed support for this kind of compromise. People on opposing sides may consider this less than ideal, for opposing reasons; however, I hope everyone will consider this plan in an effort to build a collaborative compromise consensus. Alsee (talk) 16:24, 6 January 2018 (UTC)

I don't really care how the transition is done (as long as it's done within a few months ish) - main goal is to eventually have everything on enwiki and rely little or not at all on wikidata. Galobtter (pingó mió) 16:58, 6 January 2018 (UTC)
I would grudgingly support a transition as long as WMF commits to a hard temporal deadline for switching off Wikidata descriptions. James (talk/contribs) 18:06, 6 January 2018 (UTC)
Yes, I would accept this compromise. It does not really matter in the long run, as long as WMF will switch off when Wikipedians decide that the local descriptions are adequate. It will be up to Wikipedians to get the descriptions populated. Anyone who wants to get Wikidata descriptions shut down sooner can make it happen by adding more short descriptions. · · · Peter (Southwood) (talk): 18:11, 6 January 2018 (UTC)
It still seems like a waste of time to me. Just use the Wikidata descriptions, and improve them there. Mike Peel (talk) 19:05, 6 January 2018 (UTC)
As we discussed below, the WMF plan is to switch from a Wikidata-fallback to full enwiki control when there are 2 million non-blank short descriptions on enwiki, which is roughly comparable to the number of existing descriptions on Wikidata. That will help to ensure that the readers and editors who use these descriptions won't notice a sudden degradation of the feature. -- DannyH (WMF) (talk) 19:16, 6 January 2018 (UTC)
I don't remember seeing any consensus regarding the 2 million quoted above.· · · Peter (Southwood) (talk): 15:22, 7 January 2018 (UTC)
As I think about this more, it would really just be easier if we could have that one item of Wikidata appear in our watchlists (for those who wish). I would then support simply continuing to use WD.
It would also be good to show the short descriptions within Wikipedia text for those who wish and are logged in. Doc James (talk · contribs · email) 06:51, 7 January 2018 (UTC)
That should be the default then - that's the only way to get the same number of people looking at it. It should also be more visible - appearing somewhere when editing on enwiki (maybe even "editable" there) but stored on Wikidata. It's currently pretty obscure where it is stored; most people editing should know where they can change the description. Then it's somewhat reasonable. The objection would be enwiki control over them. Galobtter (pingó mió) 07:00, 7 January 2018 (UTC)
@Doc James: We're already half-way there with Preferences -> Watchlist -> "Show Wikidata edits in your watchlist" to see Wikidata edits in enwp watchlists, and this javascript to show the wikidata descriptions at the top of articles. It would make much more sense to me if we spent the developer time that would be spent on this pointless magic word on instead improving that functionality. Thanks. Mike Peel (talk) 07:58, 7 January 2018 (UTC)
Ah very nice user script. Have turned it on. Doc James (talk · contribs · email) 08:05, 7 January 2018 (UTC)
That script is quite nice; however, it needs to be the default (or at least show somewhere when editing) so people can spot vandalism. Galobtter (pingó mió) 08:09, 7 January 2018 (UTC)
Keeping the description on Wikidata keeps it under the control and policies of a different project. It is unfair to Wikidata if English Wikipedia imposes its policies on what content can be in Wikidata, which will happen if the descriptions are stored there. Moving text that is specific to a Wikipedia article to Wikipedia avoids that can of worms. · · · Peter (Southwood) (talk): 15:22, 7 January 2018 (UTC)
Yeah, I did point out that additional objection above; we will definitely need enwp policies on it, though I'm unsure how much clash there will be. Probably will have standardization for it etc. Galobtter (pingó mió) 15:24, 7 January 2018 (UTC)
That is not a deal breaker for me. I am happy to edit Wikidata and help develop policies there if needed. Doc James (talk · contribs · email) 03:48, 5 February 2018 (UTC)
Only having it in our watchlists isn't enough. For starters, we need to monitor other Wikidata changes as well, for as long as using Wikidata directly in our articles is allowed (infoboxes, templates like "official website", authority control), so we wouldn't only have the descriptions in our watchlist, but these others as well (and at the moment, and for a long time already, all irrelevant changes as well, while some relevant ones are missing). It would need to be in the page history as well, and it would need to be immediate (now there often is a delay, sometimes of hours, before it appears in our watchlists). And then there are the protection and block issues, as Wikidata allows circumventing our blocks and protections. Finally, these descriptions are also meant for internal Wikidata use, but these too often clash with our purpose (e.g. the "Wikimedia list" descriptions, or descriptions including advice on which Q-number to use). Enwiki and Wikidata are two different worlds, with different policies, practices and purposes, and forced mixing of them is a bad idea (now and in the long run). Fram (talk) 10:11, 8 January 2018 (UTC)
  • Oppose any "compromise" forced by WMF bullying and selective reading. Enwiki should decide when to turn off the automatic use of Wikidata descriptions, not the WMF. Fram (talk) 10:15, 8 January 2018 (UTC)
  • I stand by my original choice. —TheDJ (talkcontribs) 11:58, 8 January 2018 (UTC)
  • I still agree with Mike Peel on this: the best use of WP editors' time would be to make good descriptions that then also appear on Wikidata. And the best use of WMF developers' time is not making special workarounds for en.wp but rather making tools that allow all Wikipedias to monitor and protect the short description fields stored on Wikidata. But given that we're likely to have such a system, let me suggest: (1) Either WMF should find a way to immediately freeze edits on short descriptions for protected pages and the most-viewed pages (which might be technically difficult) OR the en.wp community should do a drive to write such descriptions, inserting them BOTH on en.wp using the magic word and on Wikidata. (2) The magic word system should include an "intentional blank" that is different from a not-yet-written blank space. The WMF should count these intentional blanks as part of its 2 million goal. Community members should only use this system when the description genuinely adds nothing to the page because its title is already clear. (3) Lists and disambiguation pages should be handled in a context-specific way (e.g., it might make sense for "Wikipedia disambiguation page" to appear in VisualEditor but not on a mobile or desktop view of the page).--Carwil (talk) 19:15, 9 January 2018 (UTC)
  • For 12 months the description for Bo Scarbrough was "Clemson gave him that 'l'" - it has been viewed 140+ thousand times (not obscure) since - until I saw it using that script that shows the Wikidata description. See here. 100%, definitely need it to be visible - very visible - on enwiki so that vandalism can be detected, if it isn't stored in a magic word. It needs to be visible in the history of the article, and there are also problems with protections, blocks etc. as Fram points out. Currently the security seems to be by obscurity, which is pretty awful (and also means that descriptions are far less likely to be added). I think for reasons like enwiki control (and so having standardization, guidelines by enwiki and making it more suitable for enwiki) and detecting vandalism it should be in a magic word. Galobtter (pingó mió) 07:29, 10 January 2018 (UTC)

Other discussion

I don't understand any of this. Could someone please explain this to me? Hydra Tacoz (talk) 21:49, 18 December 2017 (UTC)
@Hydra Tacoz: Some uses of Wikipedia have short descriptions of articles (e.g. in the Wikipedia app or for search engines). Where and how we get or store this info is what is being discussed. A logical place to keep it is Wikidata, but there are some problems with that. ―Justin (koavf)TCM 21:55, 18 December 2017 (UTC)
@Koavf|Justin Thank you! So, how is Wikidata a safe place? What exactly is Wikidata? Hydra Tacoz (talk) 22:04, 18 December 2017 (UTC)
@Hydra Tacoz: Wikidata is a sister project of Wikipedia: it is a wiki, but it is not an encyclopedia like this one; it is a place to store structured data. If you aren't familiar with databases, it may seem confusing at first, but imagine a musician (e.g. John Coltrane): at Wikipedia, we would write a biography of him, Wikiquote would have quotes by and about him, Commons would have photos or recordings of or about him, etc. Wikidata would store individual facts such as his birth date, citizenship status, record labels signed to, etc. One function of Wikidata internally for projects like Wikipedia is to store short descriptions of the subject--in this case, something like "American jazz saxophonist and composer". ―Justin (koavf)TCM 22:07, 18 December 2017 (UTC)
To clarify: The function of the short description on Wikidata is an internal Wikidata function. It is not (or was not originally) intended for use as a description of a Wikipedia article for use when displaying a Wikipedia article name by a Wikimedia project other than Wikidata. WMF currently use it for this purpose because they consider it the best available alternative (pretty much the only existing non-zero option). The intention is reasonable, as it helps users to identify which of a selected group of articles is most likely to be useful, but has the problem that it is outside the direct control of the Wikipedia editors of the articles it is used to describe, and there are problems with persistent vandalism, inappropriate or sub-optimal descriptions, which can only be edited on Wikidata, and that some Wikipedians are not keen on being coerced into editing and maintaining Wikidata to prevent vandalism appearing in connection with Wikipedia articles. There is also a technical problem in that the short description is currently not visible from desktop view, and does not show up usefully on watchlists, so vandalism can go undetected by Wikipedians. The proposed solution is to provide a short description on Wikipedia for each article which can be drawn on for any purpose where it may be useful, and the WMF devs say that must be done by a new "magic word" which as far as I can make out is like a more efficient template function. Whether the magic word is initially populated with blanks or Wikidata short descriptions is relatively unimportant over the long term, as once it exists, Wikipedians can watch and change the short descriptions as and when editors feel that it is needed, from within the Wikipedia editing environment of the associated article - the short description will be part of the article itself, and changes will show up on watchlists. · · · Peter (Southwood) (talk): 08:22, 19 December 2017 (UTC)
Hydra Tacoz, if you click this search link and type Bar into the search box (do not press enter), you will see a list of articles. The first entry will probably be Bar: establishment serving alcoholic beverages for consumption on the premises. That text is the short-description being discussed here. To help small-screen mobile users, that description appears at the top of an article when it's read in the Wikipedia App. It used to appear on the mobile-browser view as well. It appears in the link-tool in Visual editor, and it might appear elsewhere. That text is not written anywhere at Wikipedia. It comes from the Wikidata entry for bar. You can edit it there. As others noted, Wikidata is a sister project in the Wikimedia family. Wikidata has been creating those descriptions for their own use, and the Foundation decided it would be convenient to re-use those descriptions here. The EnWiki community was rather surprised when we realized it was added and how it works. There are concerns that most EnWiki editors never see those descriptions, and even if they do see them, they often don't know how to fix them. There is concern/debate about how well the Wikidata community can catch and fix vandalism. There are concerns that the descriptions are not subject to EnWiki policies, page-protection, or user-blocks. Edits at Wikidata (including vandalism, biased edit-warring, or otherwise) will bypass any page-protection we put on the article, and we can potentially be blocked from editing a description if a Wikidata admin disagrees with EnWiki policies such as BLP. Descriptions intended for Wikidata-purposes also may not always be best suited for our purposes. The discussion here is for adding something like {{description|a retail business establishment that serves alcoholic beverages}} at the top of our articles, and using that instead of Wikidata's descriptions. Then the description can be seen, edited, and controlled just like any other article wikitext. The downside is that EnWiki and Wikidata would have parallel systems managing similar descriptions for similar purposes. Alsee (talk) 11:25, 6 January 2018 (UTC)

WMF two-stage proposal for Wikipedia-hosted descriptions

We've been talking for a long time about how to give Wikipedia contributors editorial control over the short descriptions. I've got a new approach to a solution here, and I'd like to know what you think.

First up, to establish where this approach is coming from:

  • English Wikipedia editors need to be able to see, edit and effectively moderate the short descriptions on desktop and mobile web, without becoming active Wikidata editors. That requires meaningful integration in Wikipedia watchlists and page history. Those are features that the Wikidata team is working on, but they don't currently exist, and they don't have a timeline for it.
  • The short descriptions are very useful for readers of the mobile apps and for editors using VisualEditor, and blanking a significant number of descriptions for a meaningful period of time would harm the experience for those readers and editors.
  • English Wikipedia editors should make the content decisions about how to actually populate the descriptions, including being able to specify pages where a description isn't helpful.

  That last point is the one we've been wrestling with for a while. I was asking for examples of article pages that shouldn't have a description, and several people brought up examples where the article had a disambiguation phrase, and the short description was only repeating information from that disambig phrase. Here are some examples:

I said above that I didn't think the redundancy was harmful, but some folks pointed out that I was moving the goalposts -- asking for examples, and then saying they didn't matter. I was trying to make content decisions about the format, and I'm not one of the people who are doing the actual work of writing and moderating them. Fair point.

So I wanted to estimate how many useful descriptions there currently are -- taking out "Wikimedia list page" and "Wikimedia disambiguation page", and also taking out pages where the description is just repeating a disambiguation phrase.

Last week, Mike Peel generated a list of 1,000 random articles and descriptions, which helped us to survey the quality of descriptions on a decent cross-section of pages. I wanted a bigger sample, so we generated a list of 10,000 articles and descriptions.

Here's the breakdown from that larger sample of 10,000 random articles. This is my current definition of "not useful" descriptions:

  • 39.82% have no short description on Wikidata
  • 7.76% have descriptions that include "Wikipedia" or "Wikimedia"
  • 1.16% have a disambiguation phrase in the article title, which is entirely duplicated in the description (116/10,000, marked on the page linked above)

Putting those together, that makes 48.75% blank/not useful descriptions, and 51.25% useful descriptions.

Extrapolating that out to 5.5 million articles, it suggests that about 2.82 million Wikipedia articles have useful descriptions. (I'm open to continuing to iterate on the definition of useful vs not useful, if people have thoughts about that.)

Now, from the WMF side, the thing that we want to avoid is switching suddenly from 2.8 million useful descriptions to a much lower number, which is what would happen if we built a magic word that's blank by default. That would hurt the experience of the readers and editors who rely on the descriptions.

So, the solution that I'm proposing is: WMF builds a magic word that Wikipedia editors can populate with descriptions.

  • Stage 1: Initially, the display of the descriptions pulls from the Wikipedia-hosted magic words -- but if there isn't one on Wikipedia, or the Wikipedia one is some version of blank, then it falls back to showing the Wikidata-hosted description.
  • Stage 2: When there are non-blank Wikipedia-hosted magic words on a number of articles that's roughly comparable to 2.8 million, then WMF switches to only pulling from the Wikipedia-hosted magic words, and we don't fall back to Wikidata-hosted descriptions. At that point, descriptions that Wikipedia editors leave blank will actually be blank on the site.
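Purely to make the two stages concrete, the selection logic amounts to something like the following sketch (illustrative Python only; this is not the actual MediaWiki implementation, and the names are made up):

def displayed_description(local_desc, wikidata_desc, stage):
    # Which description a reader would see under each stage (illustrative only).
    has_local = bool(local_desc and local_desc.strip())
    if stage == 1:
        # Stage 1: prefer the Wikipedia-hosted description, fall back to Wikidata
        return local_desc if has_local else wikidata_desc
    # Stage 2: Wikipedia-hosted only; an intentionally blank description stays blank
    return local_desc if has_local else ''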

The decisions about how to generate ~2.8 million descriptions will be made by Wikipedia editors, and the timeline is up to you. I'll be interested to see how that process develops, but it's not our process. Our role is to switch to full Wikipedia-hosted descriptions when there are enough descriptions that the folks who use them won't notice a meaningful degradation. So that's the idea.

One final thing that I want to say is that there are a lot of people in the movement, including the WMF, who believe that more integration and interdependence between Wikidata and Wikipedia is going to be key to the movement's growth and success in the future. We're going to keep working on helping Wikidata to build features that make that integration realistic and practical.

Right now, Wikidata doesn't have the working features that would make short descriptions easy to see and moderate from Wikipedia. In the future, as the Wikidata team builds those kinds of features, we'll want to keep talking to folks about how to encourage productive interdependence between the two projects. I'm hoping that a positive resolution on this short descriptions question helps to keep the door open for those future conversations. What do you think? -- DannyH (WMF) (talk) 00:32, 22 December 2017 (UTC)

I'm having trouble following this. Admittedly I'm not the sharpest crayon in the box, so could you help by giving the Cliff's Notes version of this proposal? Shock Brigade Harvester Boris (talk) 01:54, 22 December 2017 (UTC)
The WMF claims, based on a sample of 10,000 articles and some dubious mathematics (116/10,000 is not 0.01% obviously, but 1.16%), that about 2.8 million enwiki articles have useful Wikidata descriptions; based on this, they will continue to use the Wikidata description when there is no enwiki magic word description, until enwiki has populated 2.8 million magic words. At that time, they will switch off the "use the Wikidata description when there is no enwiki description" behaviour and switch to "show a blank description when there is no enwiki magic word description". 08:09, 22 December 2017 (UTC)
Your count is a bit optimistic. Apart from the maths error explained above, I see in your sample "12. en:German International School New York - school". I also see that in your count of the disambiguations, you don't count ones that aren't identical, even if the description is less specific and useful than the disambiguation: "3. en:William McAdoo (New Jersey politician) - American politician", or ones which add something that is hardly useful: "15. en:Matt Jones (golfer) - professional golfer". That's 3 out of the first 15 you don't count as useless descriptions, or some 20% more. Which would drop your 2.88 million to 1.8 million or so... Fram (talk) 08:09, 22 December 2017 (UTC)
And Fram's math is also wrong here. Just to be precise, DannyH (WMF) found 116 uninformative (because duplicating) parenthetical disambiguations. There are 1197 of those, 116 of which are already counted as faulty, so adding 20% of the remainder gets us to 116+216=332. So the faulty descriptions for parenthetical disambiguations are only 3.32% of the total. We're still at about half valid descriptions. (From my perspective, "professional golfer" is more informative than "golfer" in the same way that "American politician" is less informative than "New Jersey politician.")--Carwil (talk) 12:46, 23 December 2017 (UTC)
I found 20% of the first 15 overall, not 20% of the remainder. Of course, nothing guarantees that this percentage will remain the same across all 10K, but it is not the calculation you are making. My 20% was not only about disambiguations, e.g. my first example (number 12) is not a disambiguation. Fram (talk) 15:03, 23 December 2017 (UTC)
I agree with Fram that the above-mentioned statistical analysis does not bear close inspection and claims a significantly inflated number of useful descriptions. Instead of pointlessly arguing the red herring of how many are good and how many are bad, I think we would actually be better off with populating the magic word with Wikidata descriptions at the start, and immediately switching to only showing descriptions from Wikipedia, as then we could get rid of the garbage more easily, and would be free of interference and externally imposed vandalism at the soonest possible date. This path should also reduce the coding to the minimum achievable required complexity, as all it would have to do is produce a description string from Wikipedia, without any conditionals. Blank description => Blank display, Non-blank description => Non-blank display. This should be the lowest possible overhead with the lowest probability of bugs, and should allow us to get started with fixing earlier. If the Wikidata description is good enough that no-one bothers to improve it, then it can stay. Empty descriptions will be empty at Wikidata too, so no disadvantage. Vandalism copied over from Wikidata will be absent from mobile and VE display after being deleted once. Crap copied over from Wikidata can be seen on Wikipedia and eliminated, either by providing a better short description, or simply deleting it. Way better than continuing to pull dubious material from Wikidata until a somewhat arbitrary number of non-blank descriptions have been produced on Wikipedia. Once the magic word syntax has been defined, a single bot run authorised by Wikipedians can populate the articles, even before the display code has been finalised. · · · Peter (Southwood) (talk): 15:01, 22 December 2017 (UTC)
DannyH (WMF), I don't think your proposal is an improvement on what I have described here; it prolongs the lack of internal control over Wikipedia content unnecessarily and to no advantage. · · · Peter (Southwood) (talk): 15:19, 22 December 2017 (UTC)
Pbsouthwood and Fram: We can get into the weeds on duplicates, but ultimately I don't think it's going to provide a lot of clarity for the amount of time and work it would take. This is the fairest option that we can provide, and I can't keep coming up with new solutions just because you say it's not an improvement. We need to bring this to a conclusion. -- DannyH (WMF) (talk) 17:56, 22 December 2017 (UTC)
DannyH (WMF), Firstly I have no idea what "get into the weeds on duplicates" is supposed to mean, so cannot comment on it further. Secondly, I don't think it is a fairer option than the one I have described, depending on how you define "fair" in this case. Thirdly, I don't see that your options are actually "solutions". Partial solutions, perhaps. Compromises, yes, but so are most of the other options discussed. I think you are missing the point again. Please read my suggestion above, and instead of a blanket dismissal, explain where it has technical problems and what they are. · · · Peter (Southwood) (talk): 03:40, 23 December 2017 (UTC)
DannyH, where did I say "it's not an improvement"? I have only commented on your poor calculations, that's all. If you can't accept any comments, then just shut down this RfC and declare what the WMF will do. Fram (talk) 11:11, 23 December 2017 (UTC)
Fram and Pbsouthwood: We've been talking about this for months, and I have agreed with many points that both of you have made. This compromise that I'm proposing will result in fully Wikipedia-hosted descriptions, with control over individual pages where WP editors think the description should be blank. This is the thing that you said that you wanted. The only compromise that I'm asking for on your side is for Wikipedia editors to actually write the short descriptions that you said that Wikipedia editors want to write. Do you think that this is an acceptable compromise? -- DannyH (WMF) (talk) 22:44, 23 December 2017 (UTC)
DannyH (WMF), The compromise you are suggesting requires Wikipedians to produce 2.8 million descriptions before you will agree to turn off Wikidata as the empty magic word default, meaning that the problems of vandalism remain for that period, which may be a long time, as Wikipedia is edited by volunteers who will edit as and where they choose. Fram disputes the validity of this number, and I consider a simpler option preferable which could result in a much earlier shutoff of Wikidata, without any loss of useful functionality for WMF. You have not made any reply to my query regarding technical objections, so should I assume there are none? Have you read and understood my suggestion above which I have now set in italics so you can find it more easily? These are not rhetorical questions; I ask them in order to get answers which may be relevant to getting closer to a solution. I cannot force you to answer them, but you might find that discussions go more smoothly and productively when you answer questions instead of changing the subject.
In answer to your question, No, until you have answered our questions I cannot accept your proposal as an acceptable compromise because I am lacking what I consider to be important data for making that decision. However, I speak only for myself. Others may agree or disagree with my opinions, and I will go with the consensus. What I am trying to do is get us there by getting the options as clear as possible. I also think that most, if not all of the regular Wikipedians here are doing the same. · · · Peter (Southwood) (talk): 16:57, 24 December 2017 (UTC)
Pbsouthwood, what you're describing above is absolutely within the bounds of the proposal that I made. We can turn off the Wikidata fallback when there's a comparable number of good descriptions hosted on Wikipedia. We're not going to decide how to populate the magic words; those are content decisions that Wikipedia editors can make. If people want to copy all of the existing Wikidata descriptions, then that would hit the threshold of descriptions, and we could turn off the Wikidata fallback at that point. Does that answer your question? -- DannyH (WMF) (talk) 18:30, 26 December 2017 (UTC)
DannyH (WMF), That answers my question partly. If WMF is willing to commit to switching off the Wikidata fallback when Wikipedia decides that all the acceptable descriptions from Wikidata have been transferred, or have provided better ones, then I would accept the proposal, but not if WMF plans to hold out for an arbitrary and poorly defined number based on a small sample and a dubious analysis. · · · Peter (Southwood) (talk): 19:01, 26 December 2017 (UTC)
  • Support for either the 'two phase' solution (filling descriptions first and switching off wikidata later), or copying the wikidata descriptions and shutting off wikidata more quickly.
    After reviewing the 10k random Wikidata descriptions I see DannyH's calculation of 2.8 million 'useful' descriptions at Wikidata is somewhat of an overestimate. It includes a few percent that have zero value, and another few percent with negligible value. However I expect we all have reasonable flexibility on the vague threshold for local descriptions to credibly substitute for Wikidata descriptions. Alsee (talk) 23:02, 27 December 2017 (UTC)
Pbsouthwood and Alsee: Okay, good. Yeah, we can work together on what the threshold for switching would be. I was hoping at first that we could pull the description for every page (5.5m), but the query would take days to run, and I wanted to go through a sample by hand anyway. That's why I used the random 10,000. If somebody wants to get a better estimate somehow, that's cool, or we just say a round number like 2 million. Or maybe someone has a better idea for how to judge that threshold. What do you think? -- DannyH (WMF) (talk) 21:21, 29 December 2017 (UTC)
DannyH (WMF), I think that if we can agree on a reasonably objective way to establish that the usefulness of Wikidata as a source for short descriptions has been effectively exhausted, we don't have to agree on any specific number. At some stage Wikipedians will suggest that we have taken as many of the short descriptions as is reasonably practicable and are actually useful, and request a shutdown of the Wikidata fallback. We would establish this point by internal discussion and consensus, as is traditional. At this point I suggest that WMF be allowed a two week period to check, and show statistically convincing evidence that it is worth the effort of finding and extracting more, and a way to find them, or do the shutdown without further delay. There is nothing to stop anyone who thinks that scavenging the last few useful descriptions is worth further effort from doing so at any time after the shutdown. Nobody gains by haggling over a specific number in the absence of reliable evidence. A few weeks or months of actually creating and transferring short descriptions is likely to inspire a whole range of more useful ideas on how to estimate the cutoff point. · · · Peter (Southwood) (talk): 05:08, 30 December 2017 (UTC)
Pbsouthwood, I think we need some kind of goal to shoot for. If we just say that the community will decide when they feel like it's done, then we're going to find ourselves in exactly the same place, however many months away it is. I'd like to have a clear line that we can agree on, so we can go through the rest of the process in amicable peace. -- DannyH (WMF) (talk) 19:08, 1 January 2018 (UTC)
DannyH (WMF). There are two problems with this latest proposal:
  1. What makes you think it will be easier to come up with a reasonable, generally acceptable fixed number now than later?
  2. Who is going to produce a credible number without generally acceptable evidence of statistical validity, or an actual count? Your proposals so far have appeared ingenuous. I doubt that Wikipedians would accept your suggestions without fairly convincing evidence, and it is you who is asking for a fixed number, therefore your burden to find one that we can accept. If you are trying to delay things as much as possible, this looks like a very effective method of stonewalling any progress. · · · Peter (Southwood) (talk): 05:16, 2 January 2018 (UTC)
Please confirm that development of the magic word is not being delayed until a final decision on numbers is reached. · · · Peter (Southwood) (talk): 05:16, 2 January 2018 (UTC)
Pbsouthwood, we are both operating in good faith here. I posted a list of 10k random Wikidata descriptions, with an explanation of the methodology I used to arrive at an estimate of 2.8 million useful descriptions. I've marked all of the descriptions in that list of 10,000 which are only repeating information from the disambiguation phrase, and not counting them. If somebody else wants to go through it and mark ones that they think I missed, that's fine, and I'll adjust the estimate.
What we're measuring is partially a judgment call -- what counts as a repeat of the disambiguation phrase? -- so it's not a task that a script can do accurately. It needs a person who can go through descriptions and make that judgment call. I've done that for 10,000 descriptions, and it took me a couple of hours. I can't spend more time looking at a bigger sample.
Development of the magic word is not being delayed until a final decision on numbers is reached. We will build the magic word that overrides the Wikidata description. Making the switch to shut off Wikidata descriptions and only pull from the Wikipedia descriptions will depend on the number that we're talking about. I am suggesting 2 million descriptions, which is significantly lower than my estimate of existing descriptions. -- DannyH (WMF) (talk) 18:58, 2 January 2018 (UTC)
DannyH (WMF), in my personal life I'm not a fan of carving long term specifics in stone, and the wiki-community also generally deals with things on the fly. If the descriptions get filled in quickly then any target number isn't going to matter much. If things proceed too slowly then some sort of examination of what's happening would probably be warranted anyway. However I can understand some people may be more comfortable if there is a clear target in place. If it's important, I guess I could sign onto a 2-million-or-earlier target. If necessary we could always meet that target by copying wikidata descriptions. Alsee (talk) 18:35, 5 January 2018 (UTC)
Alsee, I think it's helpful to have an estimate to shoot for, so that the community can make the kind of decision that you're referring to -- do we need to copy Wikidata descriptions, or should we try to write them all ourselves? "We need to write 2 million descriptions" is very different from "we need to write a lot of descriptions." -- DannyH (WMF) (talk) 22:40, 5 January 2018 (UTC)
Alsee, I agree with your most recent analysis and comment. · · · Peter (Southwood) (talk): 05:08, 30 December 2017 (UTC)

This is the key bit:

English Wikipedia editors need to be able to see, edit and effectively moderate the short descriptions on desktop and mobile web, without becoming active Wikidata editors. That requires meaningful integration in Wikipedia watchlists and page history. Those are features that the Wikidata team is working on, but they don't currently exist, and they don't have a timeline for it.

Integration is not possible until this capability is created. The current situation exposes us to prolonged vandalism and thus harms our reputation. Doc James (talk · contribs · email) 06:12, 30 December 2017 (UTC)

I agree with Doc James. We need local editorial control over what is displayed on this Wikipedia. Tony Tan · talk 04:24, 17 January 2018 (UTC)
The discussion above is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.

place "nobots" on inactive user's talkpage

It would be a good idea if {{nobots}} is placed on the talkpages of users who haven't edited in more than 13 months, or maybe 12.

Currently there are many, many users who fit this criterion. They have subscribed to many newsletters and whatnot, which get delivered by bots. And if the user has set up a bot for archiving, then a lot of resources are being wasted which could be saved.

If we are going to place {{nobots}}, then I think we should include {{not around}} as well. —usernamekiran(talk) 05:25, 7 February 2018 (UTC)
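To make the proposal concrete, a rough Pywikibot sketch of such a run follows (the 13-month cutoff, tag placement and template names are just the proposal's assumptions, and the timestamp handling may need adjusting to the Pywikibot version in use):

from datetime import datetime, timedelta
import pywikibot

site = pywikibot.Site('en', 'wikipedia')
CUTOFF = datetime.utcnow() - timedelta(days=13 * 30)  # roughly 13 months

def tag_if_inactive(username):
    user = pywikibot.User(site, username)
    last = next(user.contributions(total=1), None)  # newest edit: (page, revid, timestamp, comment)
    if last is not None and last[2] >= CUTOFF:
        return  # the user has edited recently enough
    talk = pywikibot.Page(site, 'User talk:' + username)
    if '{{nobots}}' in talk.text:
        return  # already tagged
    talk.text = '{{nobots}}\n{{not around}}\n' + talk.text
    talk.save(summary='Tag talk page of long-inactive user (proposal sketch)')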

Most newsletters get delivered by WP:MMS, not really "bots" - and MMS doesn't respect that. — xaosflux Talk 05:41, 7 February 2018 (UTC)
@Xaosflux: Most of the talk pages of inactive users that I have come across get flooded with messages from the feedback service and newsletters. I was not aware MMS doesn't respect nobots. Does Legobot respect nobots? —usernamekiran(talk) 09:00, 7 February 2018 (UTC)
If users want to use such a feature themselves, they are free to. Otherwise, however, we should not be deciding for users what counts as inactive, or whether or not they want to receive messages. --Jayron32 11:50, 7 February 2018 (UTC)
Basically this. There are many benefits of an account besides editing, and receiving messages is just one of them. ~ Amory (utc) 12:04, 7 February 2018 (UTC)
No, that's not what {{nobots}} is for. Anomie 14:25, 7 February 2018 (UTC)
What Anomie said, basically. There's nothing wrong with flooding of talk pages as far as I can see; the user has opted in to such messages (until they don't), and that includes times of inactivity. Also, if you're going to argue about wastage of resources, there are areas which need far more efficiency than saving the few kilobytes that the bot will append to a talk page. --QEDK () 14:43, 7 February 2018 (UTC)

Add a "remove all redirect pages from watchlist button"

Over half of the pages in my watchlist are redirects, and I'm tired of them crowding up the top whenever I create a new batch, and then 15% or so getting changed by a bot because they're double redirects. I've stopped adding redirects to my watchlist when I create them, but that doesn't do anything to fix the redirects that are already on my watchlist. Care to differ or discuss with me? The Nth User 03:01, 30 January 2018 (UTC)

Not a bad idea, for that matter, what happened to the request (that I saw somewhere) for a button for one click removal of items from your watchlist? Is there a script that someone has written that puts a 'remove' button next to listings in the watchlist (or similar)? — Insertcleverphrasehere (or here) 03:37, 30 January 2018 (UTC)
@The Nth User: Try adding .watchlistredir { font-style: italic; } to your CSS. Then go to Special:EditWatchlist. You will be able to pick out the redirects by their italics. --Izno (talk) 04:05, 30 January 2018 (UTC)
@Izno: That worked, but what about category redirects? Care to differ or discuss with me? The Nth User 04:19, 30 January 2018 (UTC)
For soft category redirects, there's nothing you can do but guess at the ones that need removing (or remove them as they show up on your watchlist).
Hard category redirects should be caught by the above CSS, but I haven't checked. --Izno (talk) 04:26, 30 January 2018 (UTC)

Categories

Please create categories for births by day, for example "November 15 births". — Preceding unsigned comment added by 5.219.149.1 (talkcontribs)

In a sense, we already have this tool. If you type a date in the year into the box in the top right-hand corner, you will get to a list of people born on that date in the year. Vorbee (talk) 19:37, 7 February 2018 (UTC)

Bot to change referencing system used

Hi, I'm proposing a bot that moves from the standard inline references to List Defined References. This is a change that has no effect on the rendered article; for those unfamiliar, it simply moves all reference definitions to their own section in the wikitext. Clearly, the usage (or not) thereof is purely stylistic. Hence, the bot would only perform this conversion on request by a user. I believe this to be a valuable process, due to the ease with which large articles otherwise become unmanageable. For example, I had to make this change to fix a case where the removal of a reference in another part of the article left a second usage of it orphaned. It is on large articles like this, the very articles that pose the most problems, that making the change manually is tedious. I therefore propose my bot. Bellezzasolo Discuss 22:49, 6 February 2018 (UTC)
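For concreteness, the core transformation such a bot would perform could look roughly like the sketch below. It is only an illustration: it assumes a simple regex and a bare {{reflist}}, whereas a real bot would need proper wikitext parsing, handling of groups and unnamed refs, CITEVAR safeguards and BAG approval.

import re
import pywikibot

# Full inline refs with a name; self-closing <ref name="x" /> uses are left alone.
REF_RE = re.compile(
    r'<ref\s+name\s*=\s*"?(?P<name>[^">/]+)"?\s*>(?P<body>.*?)</ref>',
    re.DOTALL | re.IGNORECASE,
)

def to_list_defined(text):
    # Move named inline reference definitions into a list-defined block (rough sketch).
    definitions = {}

    def strip_inline(match):
        name = match.group('name').strip()
        definitions.setdefault(name, match.group(0))   # keep the first full definition
        return '<ref name="%s" />' % name               # leave a short pointer in the body

    new_text = REF_RE.sub(strip_inline, text)
    if not definitions:
        return text
    refs_block = '\n'.join(definitions.values())
    # Assumes the article uses a bare {{reflist}}; real articles vary widely.
    return new_text.replace('{{reflist}}', '{{reflist|refs=\n%s\n}}' % refs_block, 1)

site = pywikibot.Site('en', 'wikipedia')
page = pywikibot.Page(site, 'Example article')          # hypothetical requested article
page.text = to_list_defined(page.text)
page.save(summary='Convert inline references to list-defined references (on request)')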

Dangerous as proposed On paper this might be OK, but this could be a potentially very controversial bot. A user might request "Let's do all articles in Category:Physics", and that would wreak havoc on the whole category. I'm not entirely against a bot like this if stronger safeguards were in place. Something like "following a discussion on the article's talk page [or Wikiproject]" / "after a 7-day WP:SILENCE period", the bot may facilitate implementing such a consensus. Headbomb {t · c · p · b} 23:13, 6 February 2018 (UTC)
Undo? Would the proposed bot be able to revert its action after the fact (and after new article edits) if there were issues, or would revision reversion be needed? — xaosflux Talk 23:30, 6 February 2018 (UTC)
Comment If you hadn't made that change yourself, User:AnomieBOT would have come along and done it once people stopped editing it so frequently. As for the proposal itself, I'd like to see a strong consensus before approving a bot, even an on-demand bot. Anomie 00:10, 7 February 2018 (UTC)
In reply to Headbomb, if a category were requested from the bot, it would only fix references on the category page itself, as I'm envisioning it would work.
In reply to Xaosflux, I'm certain the bot could revert itself on detection of errors - I'm guessing there is an API for that, although I'm not too familiar. I'd personally expect it to be operating as more of a semi-automated tool with a Wikipedia interface, and don't see any reason why users shouldn't be held responsible for ensuring correctness, as with WP:HG. It would be quite easy to prevent abuse by limiting the number of pages that can be requested by a user, based on experience.
In reply to Anomie, I didn't realize AnomieBot performed that function.
Bellezzasolo Discuss 00:22, 7 February 2018 (UTC)
@Bellezzasolo: I don't mean just "reverting" to the version before the bot's edit, I mean if the styling needed to be reversed somewhere after it was done, would the bot be able to un-convert the styling that it applied programmatically? — xaosflux Talk 00:56, 7 February 2018 (UTC)
@Xaosflux: I'm sure I could program that in - it's certainly possible, although it may mean a complete move from one style to the other if the previous article was some kind of Frankenstein mix. Bellezzasolo Discuss 01:05, 7 February 2018 (UTC)
Oh I agree, and this is not a "requirement" right now - just thinking if people begin arguing about ref styles. I understand from a bot perspective it would mostly need to be all one or the other. — xaosflux Talk 02:16, 7 February 2018 (UTC)

The proposal is inconsistent with the following passage from WP:CITE [emphasis added]:

Avoiding clutter
Inline references can significantly bloat the wikitext in the edit window and can become difficult and confusing. There are two main methods to avoid clutter in the edit window:
As with other citation formats, articles should not undergo large-scale conversion between formats without consensus to do so.

So you would have to gain consensus to change WP:CITE. I think it would be a bad idea because there are several possible approaches to avoiding clutter in any particular article, so the alternatives should be discussed on the article's talk page rather than employing a bot. Jc3s5h (talk) 02:43, 7 February 2018 (UTC)

  • Oppose - What Jc3s5h said; i.e., there is a community consensus that WP:CITEVAR applies to the choice between inline and LDR. Furthermore LDR is not clearly superior to inline anyway. It has its downside, including the following.
    When editors remove content that leaves a ref unused, the software puts a BIG red cite error notice in the references section. This looks ugly to readers until an editor notices it and removes or comments out the unused ref. Considering that correct referencing is already too complicated for a majority of editors, it is not reasonable to ask editors to check for other uses of the refname and remove/comment out the LDR if their removal renders it unused.
    Then, sometimes, the original removal edit is reverted, and then we have a BIG red cite error notice for undefined refname—and a broken citation. Rinse, repeat. Years ago I raised the idea of preventing this effect by removing the BIG red cite error for unused references, but it gained no traction.
    In summary, there are two main obstacles: CITEVAR, and the lack of community consensus that LDRs are a net positive. ―Mandruss  03:20, 7 February 2018 (UTC)

OK, perhaps a less contentious area that AnomieBOT doesn't cover then:

Different, but related, proposal

A bot to fix unused LDRs that result in big red errors in the references section. This would work by detecting the errors, then commenting out the offending reference. By commenting out the reference, it is still available for review (and, if the article is being copy-edited, it doesn't disappear on somebody). It would then post to their talk page, notifying them of its action. Bellezzasolo Discuss 04:09, 7 February 2018 (UTC)
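A rough sketch of that behaviour (regex-based and simplified; it assumes the article already uses list-defined references, i.e. full <ref name="...">...</ref> definitions live in the refs= block while body uses are self-closing, and a real bot would restrict its edits to that block):

import re

DEF_RE = re.compile(r'<ref\s+name\s*=\s*"(?P<name>[^"]+)"\s*>.*?</ref>', re.DOTALL)
USE_RE = re.compile(r'<ref\s+name\s*=\s*"(?P<name>[^"]+)"\s*/>')

def comment_out_unused(text):
    # Comment out list-defined references that are no longer used in the body (sketch).
    used = {m.group('name') for m in USE_RE.finditer(text)}

    def wrap(match):
        if match.group('name') in used:
            return match.group(0)
        # keep the citation available for later review instead of deleting it outright
        return '<!-- unused reference: ' + match.group(0) + ' -->'

    return DEF_RE.sub(wrap, text)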

Better alternative: Don't generate the big red message in the first place. This will result in some unused LDRs being retained indefinitely, wasting a little space, but so what? The message could be replaced with a different, smaller red one that is seen only if the registered user's CSS contains a certain statement, similar to the first line of my CSS. Then, editors interested in doing this kind of cleanup could watch for those messages and comment out the unused LDRs. (We would need to find an appropriate place to document the new CSS control.) Unused LDRs already place the article in a tracking category for cleanup, the only difference would be a less-visible message for a minor error. Either way, I don't see a compelling need for a bot. ―Mandruss  05:40, 7 February 2018 (UTC)
The reason for the big error messages is that this is something that needs to be fixed, not something that should be hidden or bypassed. Headbomb {t · c · p · b} 12:02, 7 February 2018 (UTC)
Exactly, just fix the reference. Only showing it to autoconfirmed users might be an argument, but disabling the reference goes completely counter to the point of the errors (I mean, we could simply NOT generate any content if there is any error, and that would have the same effect as commenting it out, but there's a reason we don't do that). —TheDJ (talkcontribs) 14:45, 7 February 2018 (UTC)
@TheDJ: Sorry, I don't follow. What does just fix the reference mean in the context of an unused reference? In the majority of cases the LDR became unused when the content that used it was removed. How is this an "error" as to verifiability? You do understand that we are not talking about undefined refnames in citations, right? ―Mandruss  20:09, 7 February 2018 (UTC)

Template:Non-free reduce bot

(copied from Wikipedia talk:Non-free content/Archive 68#Suggestion for new bot, as there are not many contributors there)
I would like to propose a new bot to tag all new oversized non-free images with {{non-free reduce}}. I would like to explain a little history to this idea, and how many images it's likely to affect.

  • Back in October 2016, there were roughly 550,000 non-free images of which approx 300,000 were greater than the NFCC guideline, although it was not easily possible to find them! At that same time wiki made changes to the search engine (see mw:Help:CirrusSearch#File_properties_search), allowing one to search against various file parameters (width, length, resolution - where the programmers have decided that "resolution" equals the square root of the pixel count - strange but true). Following that change it has been possible to slowly work through this outstandingly large number, and now there are about 600,000 non-free images, of which fewer than 1000 are oversized (for various reasons) - see Category:Non-free images tagged for no reduction, although I think some of these are slightly "gaming the system" and could be reduced - but it's a very small minority.
  • During this time it has become evident that every day there are, typically, around 70-80 new, oversized, non-free images uploaded. The majority of these (about 80%) are completely new, with the rest being updated images. About 10% already have {{non-free reduce}} added by the uploader, but the majority are not flagged in any way. A list of the last 7 days' uploads (excluding those tagged immediately by the uploader for reduction) can be found at User:Ronhjones/7days
  • The proposal is therefore to target these new and untagged uploads within a day of uploading. Thus we know that the uploader is active, and can therefore discuss the size, if appropriate. A new bot (I have roughly written it, but it is untested) could:
    1. Using the wiki search engine, find all images with a fileres: >=325 (105,625 pixels - reducing bots will not reduce below 105,000 pixels anyway)
    2. Calculate the correct pixel count of the last upload (unlike the wiki search, which also finds older versions if not yet RevDel.), and skip if size is OK.
    3. Check for the presence of {{non-free reduce}}; {{non-free no reduce}}; {{Permission OTRS}} - if any are present then skip.
    4. Add the {{non-free reduce}} to the top of the page
    5. Add a template to the last uploader's talk page to say that the image will be reduced and why, with instructions what to do and what not to do if a bigger image is needed. Possibly a choice of template to add depending of file type (e.g. bitmap vs. vector).

Pinging a few editors I know are interested in this area @BU Rob13, Stefan2, and Diannaa:
Time for some discussion...

  • Support. Seems a good idea. The one thing I would suggest, per the recent discussion on this page, would be to slightly up the threshold, say fileres: >=380 -- ie only reducing when the reduction is more than about a linear 20%. The 0.1 Mpx figure was only ever intended to be a loose indication; reducing by less than 20% seems to me overly fussy and scratchy to the uploader, for a change in size that's pretty much imperceptible, certainly makes no legal difference, and may produce artefacts in the image quality. It's the images much larger than this that we really want to control. Moving on such images straight after upload, while the uploaders' eyes are still on them, IMO makes a lot of sense. Jheald (talk) 22:16, 30 January 2018 (UTC)
    • The threshold is an obvious item on which to gain a consensus. 316 would be at the guideline; I said 325 above as that is the current point at which the reducing bots stop working; 380 is 144,400 pixels, and would eliminate 79 images from my 7-day list Ronhjones  (Talk) 22:53, 30 January 2018 (UTC)
      • @Ronhjones: A useful number to compare might be the total number of non-free images added in the period. Can't tell from User:Ronhjones/7days because that only appears to list images over 105,625 px.
      But I do think this is about right -- the images that are notably larger than norm would be reduced, and straightaway, while they were newly uploaded; without being unnecessarily scratchy to uploaders who were pretty much on the norm with changes that would be barely noticeable. Jheald (talk) 23:44, 31 January 2018 (UTC)
  • I can't get the exact number - but the count below "325" is very small. An API call gives me 1160 non-free images in those 7 days; take off my 498 leaves 662, take off the 502 files that DatBot reduced gives 160, then remove 91 (SVGs I manually reduced) leaving 69 non-free images uploaded below "325". So new uploads I estimate are 498 oversize + 69 undersize = 567 total, so 88% were oversize. Ronhjones  (Talk) 01:40, 1 February 2018 (UTC)
(End of copied content)Ronhjones  (Talk) 14:24, 3 February 2018 (UTC)
The threshold does need to be agreed. I will note that whatever threshold is used, I'm sure there will be users who will game the system - with the current reducing bot set to 105,000 pixels, I have seen far too many images uploaded at just a fraction below that to be a coincidence (and often after a proper reduction to <100,000 pixels, then uploading a new image with a comment of something like "correct colours" or "clean up"). For the (as yet not created) advisory templates, I've a few ideas at User:Ronhjones/Sandbox4. Ronhjones  (Talk) 14:28, 4 February 2018 (UTC)
Agree - which is why I proposed to send the uploader a message as well. Ronhjones  (Talk) 22:34, 6 February 2018 (UTC)
  • Oppose Are you assuming all images are square? Because otherwise your bot would incorrectly flag an image that was, say, 400x10, and would miss an image that is 200x2000. In addition, I would be opposed to any tagging of images that small. 0.1 megapixels is a guideline, not a policy, and the language used is "most common pictorial needs can be met with an image containing no more than about 100,000 pixels" (and Masem, who added that to the guideline, has said that it was never intended to be a hard limit). However, the determination of whether the needs can be met by an image of that size needs to be made by a human, not a bot, and unleashing the bot as you've described it is going to result in the needless lossy resizing of images that arguably meet the "minimal use" criterion. If you're going to do this, I would say that Fbot 9's threshold of 164,025 should be an absolute minimum, and frankly I'd be more comfortable with a number closer to 307,200. Anything less than that could get tagged with a "needs human review" tag, but not with something that's going to cause DatBot to come along and automatically resize the image without human intervention. --Ahecht (TALK PAGE) 23:18, 5 February 2018 (UTC)
Fbot9 refers to DashBot - which used to leave quite a bit bigger image; that task was taken over by Theo's bot in 2011 (and makes images <100,000 pixels). I don't understand your reasoning - 400x10 is 4,000 pixels and would not be flagged, 200x2000 is 400,000 pixels and would be flagged - high-ratio images are not the current norm. The main point of the idea is to alert the user (who we know is active) and give them advice on how to act if the image needs special treatment. Ronhjones  (Talk) 22:31, 6 February 2018 (UTC)
If the idea is to alert the user, have the bot post a message on their talk page. {{non-free reduce}} goes far beyond just alerting the user, since it will also alert a bot to automatically resize the image within 24 hours. --Ahecht (TALK PAGE) 16:43, 7 February 2018 (UTC)
Even if the reduction happens before the uploader checks, they still have seven days to revert, as always. Ronhjones  (Talk) 22:46, 7 February 2018 (UTC)
  • If there is such a bot it should also check whether the file was already reduced, or whether a previous reduction has been reverted. There are various reasons for not reducing the size, and a bot cannot be expected to understand legibility or previous debates on the topic. For new oversized images it would be a useful activity. Quite a few images would also qualify for pd-simple and, even though labelled fair use, do not need that. Yet others are actually released under licenses that permit us to use a big size, e.g. CC-BY-ND (or -NC), but still do not qualify as compatible licenses. Graeme Bartlett (talk) 06:27, 7 February 2018 (UTC)
This can only work on new oversized images as there are no old ones that are oversized (except those already tagged with {{non-free no reduce}}). I'm currently doing the batch of new files manually each night. I'm sure any fine details can be sorted out at a WP:BRFA. Ronhjones  (Talk) 22:46, 7 February 2018 (UTC)
  • Unrelated metacomment Could the title of this discussion be changed to something more descriptive and less vague? There are several bot-related discussions going on at the same time, and the rest of them at least hint on their purpose. {{Anchor|Suggestion for new bot}} can be placed for continuity.   ~ Tom.Reding (talkdgaf)  20:51, 9 February 2018 (UTC)

Providing some way to easily spot when controversies or criticisms are removed or reduced

Hi

I have found several articles over the years where the Controversies or Criticisms sections of articles (mainly companies and politicians) have been deleted by IP editors (most probably with a very direct COI) with innocuous edit summaries like 'copyediting' or 'grammar fixes', and it hasn't been caught by people watching the pages (or no one is watching the pages).

I wonder if there is some way to flag this? Perhaps there can be a bot that checks for where section headings or large chunks of text related to controversies and criticisms have been wiped and flags them somewhere for review?

Thanks

John Cummings (talk) 13:10, 11 February 2018 (UTC)
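A minimal illustration of the kind of check described above (a sketch only, not an actual edit filter; a real tool would fetch the old and new wikitext of each revision from the API and log flagged diffs somewhere for human review):

<syntaxhighlight lang="python">
import re

# Headings such as "== Criticism ==", "=== Controversies ===" or "== Expenses scandal ==".
HEADING = re.compile(r"^=+[^=\n]*(?:controvers|criticism|scandal)[^=\n]*=+\s*$",
                     re.IGNORECASE | re.MULTILINE)

def criticism_headings(wikitext):
    """Return the set of criticism/controversy-style headings in a page's wikitext."""
    return {m.group(0).strip().lower() for m in HEADING.finditer(wikitext)}

def flags_removal(old_text, new_text):
    """True if a heading present before the edit is gone afterwards."""
    return bool(criticism_headings(old_text) - criticism_headings(new_text))

old = "Intro.\n\n== Criticism ==\nSourced criticism here.\n"
new = "Intro.\n"
print(flags_removal(old, new))  # True - this edit would be flagged for review
</syntaxhighlight>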

  • Oppose per Wikipedia:Criticism: in many cases separate Controversies or Criticisms sections are symptomatic of bad article writing (with "troll magnet" as a further symptom) – it is not up to a bot to impose such bad writing. At least, the bot would not be able to distinguish between cases where the controversies and/or criticisms should better be integrated in the main narrative of the article, and where they deserve a separate section. The separate section option should be a very rare exception (while indeed, "troll magnet"): the fact that it is not a too rare exception speaks to the lack of quality of many of these articles: the overall quality of the article is what needs addressing in most cases, and in a way that can hardly ever be operated by a bot or otherwise bot-assisted. I'd support a bot removing "Controversies" and "Criticisms" section titles for those articles where there never was a talk page discussion on whether such separate section should be present in the first place. That would be a bit unrealistic as a bot task, but still I'd rather support that than what is proposed in the OP. --Francis Schonken (talk) 13:43, 11 February 2018 (UTC)
This seems to misunderstand the proposal, which talks about flagging such edits for human review, not about automatically reverting them. As for the essay Wikipedia:Criticism, it makes valid points about formatting and style, but the reality is that the section layout that it criticizes is still widespread for the time being, and I understand the proposal is about detecting content removals, not layout changes. Regards, HaeB (talk) 14:56, 11 February 2018 (UTC)
Still, oppose the whole idea: an edit *starting* a separate Controversies or Criticisms section should be flagged, not the edit that deals with the more often than not undesirable separate section. Imho "... still widespread ..." is part of the problem (as I explained above), not something for which we should seek a status quo: the flagging would not occur when a troll expands a Controversies or Criticisms section with WP:UNDUE details, but when such trolling would be removed it would get flagged. A no-no for being too supportive of more often than not deplorable article content & layout. --Francis Schonken (talk) 15:37, 11 February 2018 (UTC)
@Francis Schonken:, I'm sorry I'm not being clear, I'm looking for ways to flag for a human to look at instances where controversies and criticisms have been removed, which is often done by IP editors with probable COI (see the UK parliament IP edits as a good example of continued PR removal of information over many years). I'm suggesting these common sections are a good place to start looking for this kind of activity, rather than being the only place to look for it (however you feel about their existence). — Preceding unsigned comment added by John Cummings (talkcontribs)
No, you were perfectly clear and I oppose it, for rather protecting WP:UNDUE criticism than countering it; and not protecting criticism that is in the article conforming to the Wikipedia:Criticism guidance, and protecting (often trollish) content in a separate section. The approach is unbalanced towards bad habits ("troll magnet" is a far more widespread problem, and established as a problem for a much longer time than IPs removing criticism that would pass WP:DUE). --Francis Schonken (talk) 17:03, 11 February 2018 (UTC)
I agree that such edits (with deceptive edit summaries) are a problem. FWIW, there are already the "section blanking" and "references removed" edit tags. Regards, HaeB (talk) 14:56, 11 February 2018 (UTC)
Still: rather protective of the often undesirable separate Controversies or Criticisms section instead of addressing the layout issue. --Francis Schonken (talk) 15:37, 11 February 2018 (UTC)
@Francis Schonken:, I appreciate your point that criticism and controversy sections may be overused and sometimes should be moved to different sections, but this isn't a discussion about that. It's a discussion about finding a way to better find and flag for review COI edits which remove information that is accurate and within the scope of Wikipedia. I am suggesting that one way of surfacing these edits is by paying attention to what happens in these sections. Thanks, John Cummings (talk) 16:47, 11 February 2018 (UTC)
Re "this isn't a discussion about that" – that's why the point needs to be mentioned, while your proposal would mess (i.e. in a less than desirable way) with something that is already a long-time problem. --Francis Schonken (talk) 17:03, 11 February 2018 (UTC)
Thanks @Kb.au:, does the existing section blanking filter account for if the whole section is removed including the heading? John Cummings (talk) 16:33, 11 February 2018 (UTC)
Specifically it detects the removal of a section heading. Personally I don't see any advantage to highlighting the removal of Criticism sections over any other. I also agree with those who think these sections are usually problematic anyway. -- zzuuzz (talk) 17:48, 11 February 2018 (UTC)
Thanks @Zzuuzz:, so the abuse filter already displays this kind of activity, it's just that people aren't catching it. Is it that there are just too many edits happening for someone to see them all? I can see that some of the ones I've seen could be seen as OK edits, but not many; thinking about UK PMs for instance, many edits happen where people try to delete the expenses scandal section and sometimes it doesn't get reverted. John Cummings (talk) 18:24, 11 February 2018 (UTC)
I can't say if people are catching it - knowing people they probably are - but it's correct they're already flagged along with other section blanking. Who's to say that removal of a criticism section is worse than removal of a section by any other name - criticism, praise, achievements or otherwise? (I note "Expenses scandal" is a common section header for UK politicians). Such a filter may be useful to detect problematic sections, but I would not like to think of it as a 'revert list', as it would become, which whether intentional or not is precisely how you just described it. -- zzuuzz (talk) 19:09, 11 February 2018 (UTC)

Not a good idea Such sections are generally a bad idea anyway. North8000 (talk) 18:54, 11 February 2018 (UTC)

  • Interested could we get a trial edit filter to see what the scope of the issue is? If editors are restructuring controversies, i.e. incorporating elsewhere in the article as suggested, that's one thing. But I've seen advocacy editing more than once that simply deletes such material. Examples here and here. It's fairly common for this kind of wikiwashing job to be advertised on one of the job boards. ☆ Bri (talk) 19:40, 11 February 2018 (UTC)
  • Problematic at best "Criticism" sections tend to be pretty much entirely negative and of UNDUE weight, pretty much in every case. In many cases, they use sentences giving what appears to be a Wikipedia imprimatur to the "criticism" which is contrary to a bunch of policies. Typically we see "George Gnarph claims A, but everyone knows that A is false." type edits. Collect (talk)

  Comment: Perhaps an alternative proposal to try and address the same problem would be to look at edits with a large number of characters removed where certain words appear, e.g. criticism, scandal, illegal, conviction etc. John Cummings (talk) 09:42, 12 February 2018 (UTC)

Re. "... removed ...": "... removed or added ..." I'd say. I'd oppose any approach that is more sympathetic to futile criticism being added than such criticism being removed. Again, adding exaggerated & WP:UNDUE criticism is more often than not the elephant in the room, and across many articles the bigger problem. The "section blanking" tag, which is over-all a good idea, already has a bit of an undesirable side-effect of rather protecting against deletion of unwanted criticism sections, than against creation of such sections based on dodgy verifiability (often introducing WP:BLP problems, etc, into an article). I can live with that side-effect because of the over-all advantages of that tagging. But oppose further imbalances in maintenance systems favouring *the more problematic* additions over often *less problematic* deletions: currently, if someone merges criticism from a separate section to the main narrative of the article, and deletes the undesirable section header, that gets tagged as "section blanking": don't push it, I'd say. --Francis Schonken (talk) 10:09, 12 February 2018 (UTC)

A proposal to permanently semi-protect the Template space

The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.



Following a series of vandal edits to the template space (here and here, and the ANI about it), MusikAnimal template-protected (without opposition) templates with 5k+ transclusions and semi-protected all those with 1000+ transclusions.

Earlier today a template with 956 transclusions was vandalised.

Due to the fact that the template space isn't widely patrolled, and the potential for great harm to be done in a very short amount of time, I am proposing that all templates be semi-protected. This would likely involve a software change, but I think it's the only way to ensure that major vandalism on a relatively-unused template doesn't sit around for ages, to be seen by the unsuspecting public who might not know how to let us know to fix it.

I briefly discussed this off-wiki with Cyberpower678, who said they would support if there was a minimum transclusion count (10+ was discussed) so I am amenable to that option, but from an administrative standpoint I think it's easier to just batch-protect everything (unless we can get a protect-bot that can autoprotect templates with 10+ transclusions). Primefac (talk) 18:18, 21 December 2017 (UTC)

NOTE: This would not require a "software" change (to apply to an entire namespace), but would require a configuration change in InitialiseSettings.php. See the example of autoconfirmed being used for the Module: namespace on eswiki (scroll to the bottom of the page). — xaosflux Talk 19:30, 21 December 2017 (UTC)
I mean, that is a vague problem. But rare. Galobtter (pingó mió) 03:58, 22 December 2017 (UTC)
I don't find zzuuzz's argument convincing. "Autoconfirmed socks are easy to come by" - only if you bother getting your mouse over to the create an account button. Drive-by vandals aren't usually socks. And I didn't think this was being proposed as a cure for socking, just vandalism, which can be pretty hard to detect. I'm fine for the base limit to be raised to a thousand transclusions or something like that, but SEMI is a very effective anti-vandal lockdown. Also, "encyclopedia anyone can edit" doesn't apply here. The template space is the maintenance and framework of the encyclopedia, not the content. L3X1 (distænt write) 20:19, 21 December 2017 (UTC)
Oh the number of autoconfirmed socks I've blocked - these are dedicated regular vandals who do this, not drive-bys. But that's a minor point - I disagree with you strongly about the use of templates. Some are indeed maintenance and framework templates (particularly those targeted), but the vast majority of templates (particularly those edited by unregistered users) contain lists, and names, and numbers, and very much form part of the content. -- zzuuzz (talk) 20:36, 21 December 2017 (UTC)
  • Comment Primefac, wouldn't it be easier just to roll out an ACTRIAL like technical restriction rather than semi-protect every new template? TonyBallioni (talk) 19:08, 21 December 2017 (UTC)
    TonyBallioni, I'm not overly concerned about the creation of templates, I'm concerned about vandalism to templates. The vandalism to Template:Redirect-multi was done almost two years after the page was created. Primefac (talk) 19:24, 21 December 2017 (UTC)
    Right, but what I'm saying is that there may be a way simply to prohibit any editing to templates to non-AC users on en.wiki without having to manually protect them, which would be ideal. TonyBallioni (talk) 19:26, 21 December 2017 (UTC)
  • Oppose basically because of this. This proposal is fine in theory for long-standing, prominent, widely-used and heavily transcluded templates - typically maintenance templates. However, there's notable maintenance of less common templates by new and unregistered users, including those with dozens of transclusions. I find it's especially notable in sports templates, but also in several other topics - politics, media, software, all sorts actually. Many of these templates are almost exclusively maintained by unregistered users. Several hundred transclusions would be my floor. I would prefer instead that the current system of caching and purging is improved to reduce any vandalism's longevity. Also, autoconfirmed socks are incredibly easy to come by. -- zzuuzz (talk) 19:12, 21 December 2017 (UTC)
I do see that problem. Don't see why it has to be several hundred transclusions though. Most of those have less than 10 transclusions. Galobtter (pingó mió) 03:58, 22 December 2017 (UTC)
But plenty have more than ten transclusions. It's a figure based on experience. Take some random obscure examples recently edited by unregistered users: Template:Philip K. Dick - 184 transclusions; Template:Syracuse TV - 37 transclusions; Template:Miss International - 64 transclusions; Template:Cryptocurrencies - 109 transclusions; Template:Pixar - 110 transclusions; Template:Flash 99 transclusions. There's a lot like this. -- zzuuzz (talk) 07:22, 22 December 2017 (UTC)
  • Oppose Wikipedia is the encyclopedia that anyone can edit but there are only 159 editors with template editor right. I don't have this right myself and would resent being considered a suspicious vandal by default. We have far too much creeping lockdown of Wikipedia which is tending to kill the project. If such paranoia had been allowed to prevail in the early days, Wikipedia would never have been successful. Andrew D. (talk) 19:51, 21 December 2017 (UTC)
    @Andrew Davidson: This proposal is to semi-protect the templates, not template protect them. Just thought I would clarify that.—CYBERPOWER (Merry Christmas) 20:09, 21 December 2017 (UTC)
  • Oppose zzuuzz's argument is convincing. Jo-Jo Eumerus (talk, contributions) 20:01, 21 December 2017 (UTC)
  • Sort of oppose, sort of support. 10 transclusions is way too low a limit and the devil is in the details. Maybe 100+, or 1000+ and I'd be OK with this. But I feel a better approach might be to simply be identifying templates that do not need to be edited regularly, and semi-protect those (e.g. semi-protect maintenance templates, infoboxes, etc...). But I do not see a need to semi-protect templates like navboxes, for instance. Headbomb {t · c · p · b} 21:13, 21 December 2017 (UTC)
  • comment would it be better, perhaps, to use PC1? Seems to me that it's a useful way to allow IP users and new registered users to make productive edits, without them going live immediately. (This may, however, require technical changes.) -- Aunva6talk - contribs 22:32, 21 December 2017 (UTC)
  • Support semi-protecting, but ONLY for templates with 250+ transclusions. The vandals are going to be drawn to those templates that are widely used in order to cause chaos, so the simplest thing to do is to semi-protect only those templates that are used in many articles at once. As per usual I oppose CRASHlock on ideological grounds, not to mention that only an idiot would want that many pages CRASHlocked all at once. —Jeremy v^_^v Bori! 22:57, 21 December 2017 (UTC)
  • There must be some vague way to figure out if something is a maintenance or content template. Maybe make templates that take parameters semi-protected? Some sort of solution of 10+ transclusions and that. But it'd also have to be designed to prevent abuse. Galobtter (pingó mió) 03:58, 22 December 2017 (UTC)
  • Oppose. Aside from zzuuzz's points, it would vastly increase the workload on WP:Template editors (and admins who respond to requests that TEs can also do). Yes, non-TEs could respond to template-editing requests on templates that are only semi-protected, but most of them don't know that, and it would likely not be obvious what the protection level was when it came to any particular request. I think a more reasonable proposal would be to semi-protect templates with X number of transclusions. 100? 250? 1000? I'm rather agnostic on the matter. A good thing to do would be to find the sports-specific template that is used on the largest number of pages that is not an infobox (we do not want anons adding or deleting parameters from an infobox, because those templates are a massive WP:DRAMA factory as it is) and not a navbox (because we have rules about them, and anons mostly will not have read them). There are likely football (soccer) templates relating to team roster, uniforms ("kit"), league standings, etc., used on hundreds of articles at least, possibly 1000+, to which anons with some experience could meaningfully contribute. Might give us an idea what number we should be thinking about. Anyway, I would actually like to see automatic semi-protection on somewhat-high-use templates as an anti-vandalism method, and also as a means for reducing the number of templates that have template-editor protection for which semi-protection would actually be sufficient. That will not only waste less TE time, it will get us more template development by anons and non-anons.  — SMcCandlish ¢ >ʌⱷ҅ʌ<  05:07, 22 December 2017 (UTC)
  • Oppose. Excellent arguments made by zzuuzz. I read through the recent changes link, and surprisingly few of the diffs listed were vandalism. We might want to protect highly-used maintenance templates. Perhaps 5000 transclusions would be a good floor for this (from what I've seen at the most-transcluded templates report). Enterprisey (talk!) 05:16, 22 December 2017 (UTC)
  • Support Vandalism on a template affects several pages at the same time. Any IP user can propose any meaningful change via an edit request, which normally doesn't go 24 hours without getting a response. –Ammarpad (talk) 09:15, 22 December 2017 (UTC)
    Based on recent data this would affect around 1,500 edits every week. I think most wouldn't even bother making requests, but if even a proportion did I would expect that to increase. -- zzuuzz (talk) 10:53, 22 December 2017 (UTC)
  • I would support a proposal for using a bot to give all templates with more than 10 transclusions semi-protection, or pending changes protection if possible; and to remove semi-protection if templates, having been protected by the bot, have their number of transclusions reduced below 10. Automatic semi-protection of all templates is definitely overkill (having a bot find the number of transclusions is definitely possible), and 5000 as a minimum for semi is definitely too high (I'd say even 500 would work for template protection). Note that none of these would increase the workload of template editors, since they are only needed for editing template-protected templates and modules. The bot could also avoid protecting templates consisting solely of navboxes or sidebars, since they are supposed to be transcluded on many pages by design and are relatively easy to edit. Jc86035 (talk) 11:38, 22 December 2017 (UTC)
  • OP Comment Okay, a few things I've seen that keep popping up:
    First, all templates with 1000+ transclusions are already semi'd (and 5k+ TE'd). This proposal is talking about those with 0-999 transclusions being potentially vandalised.
    Second, the proposal is for semi protection, which would not increase Template Editors' jobs in any way, shape or form. They would of course be welcome to patrol the edit requests for Template-space issues, but not required.
    Third (going off the previous note) the TEs aren't exactly hammered under the weight of responsibility. We get maybe three TPER requests a week.
Just felt I should clarify those things going forward. I will admit that IPs aren't all bad with their changes, especially to Sports templates, but those types of frequently updated templates are (for the most part) single-season templates that are rarely used on more than 10 pages (which would mean the 10+ option of semiprot wouldn't affect them). If Cyberpower says he can make a bot to do it, then it wouldn't involve any software changes. Primefac (talk) 13:22, 22 December 2017 (UTC)
And for what it's worth, I don't particularly care if we decide on 10+ or 250+ or 500+, but I think there should be some threshold for semi-protecting templates. Primefac (talk) 13:24, 22 December 2017 (UTC)
So... if we (assuming we can) run the numbers, looking at frequency of edits and number of transclusions, what does that graph look like? Is there some evidence based threshold beneath which most IP editors can happily plod along, while the rest of us can avoid having a cock and balls transcluded on a few hundred pages every few weeks? GMGtalk 13:44, 22 December 2017 (UTC)
  • Oppose, current protection level works generally well, and vandalism level on templates isn't very high. Openness ("anyone can edit") is more important than restricting vandalism to sleeper accounts. —Kusma (t·c) 18:17, 22 December 2017 (UTC)
  • Oppose blanket protection. zzuuzz's argument is compelling and this is the free encyclopedia that anybody can edit. At a certain number of transclusions, the tradeoff points in the direction of protecting. So I would support an x+ transclusions rule. Malinaccier (talk) 18:28, 22 December 2017 (UTC)
  • Oppose, there are templates such as sports competitions, sport team lineups, lists of stations etc., where IP contributions are mainly constructive and helpful. I could support the x+ protection, though 10 transclusions looks like a low bar to me (a football team lineup is at least 23 + the team + the coach); I would think more about fifty or so.--Ymblanter (talk) 00:00, 25 December 2017 (UTC)
  • Oppose This is the encyclopedia anyone can edit. Functionally, semi-protecting everything would be great, but it's not what we do and it's fundamentally against our ideas. Don't do this kneejerk action. Be smart, and judge it on a case-by-case basis. !dave 08:17, 28 December 2017 (UTC) moved to support
  • Oppose. There are many templates that would be perfectly legitimate for newer editors to edit, especially navboxes and the like. I would support semi-protecting everything with 100+ transclusions. We probably should create an edit filter logging newer editor's edits to template namespace, as an aside. ~ Rob13Talk 08:24, 28 December 2017 (UTC)
    @BU Rob13: If you use the new filters form on RC/WL/RCL (see beta preferences), this is all non-EC edits to template space. --Izno (talk) 13:39, 28 December 2017 (UTC)
  • Support on templates used more than 100/200 times but Oppose generally. Doc James (talk · contribs · email) 05:59, 30 December 2017 (UTC)
  • Oppose entirely per my response to BU Rob13. The majority of changes made in the template space by non-EC users are good changes; routine updates to navboxes seemingly the majority of such changes. --Izno (talk) 06:32, 1 January 2018 (UTC)
  • Oppose Counting transclusions isn't that useful in template vandalism; you really care about page views. If a template is transcluded 30 times, but one of those pages is Obama, it's a highly viewed template and should be protected. But if a stub template is used on a thousand pages that barely get read, protecting it is of little use. Besides that, protecting an entire namespace is an extremely dangerous road to go down IMO. We should always default to open. Legoktm (talk) 07:11, 2 January 2018 (UTC)
  • Support Semiprotection is not an insurmountable hurdle. The benefit of this proposal outweighs concerns, imo. James (talk/contribs) 14:43, 2 January 2018 (UTC)
  • Support with a reasonable threshold of transclusions (say 200?). Peter coxhead (talk) 16:01, 2 January 2018 (UTC)
  • Support Templates that are used in more than 200 pages should be permanently semi-protected. However, it doesn't seem to be reasonable to semi-protect all templates. I have seen constructive edits made by IP editors to templates. Extended confirmed protection for all the templates that are used to warn users about their edits. Pkbwcgs (talk) 16:49, 2 January 2018 (UTC)
  • Support as long as a) there is a transclusion threshold, b) the bot removes protection when a page drops below the transclusion threshold, and c) there is a mechanism to request that a template be unprotected (and a flag that the bot will obey). --Ahecht (TALK PAGE) 22:28, 2 January 2018 (UTC)
  • Oppose per zzuuzz, Kusma et al. Better to apply whatever protection is required manually on an individual basis. At the end of the day, newby vandals are unlikely to be aware of template space anyway, and those out to deliberately disrupt the project would have no problems about waiting till they're auto-confirmed. Optimist on the run (talk) 11:22, 4 January 2018 (UTC)
  • Oppose per zzuuzz et al, Not all IPs are vandals and the other point is "We're an Encyclopedia that anyone can edit" .... we'd be defeating the purpose of the object if we locked all templates, Whilst in theory I agree with this proposal unless we ban all IP editing (which sounds great!) then I think it's best to just stick with semi protecting here and there and blocking here and there. –Davey2010Talk 16:47, 5 January 2018 (UTC)
  • Support, no particular opinion on best number for cutoff. --SarekOfVulcan (talk) 19:02, 8 January 2018 (UTC)
  • Oppose on ideological grounds. This is the free encyclopedia that anyone can edit. Protection of so many templates runs contrary with Wikipedia's principles. AdA&D 19:28, 8 January 2018 (UTC)
  • Support. Templates are a gaping hole in our antivandal protection; one edit can splash an offensive image, a WP:BLP-violating message, or even a link to malware across dozens or even hundreds of pages. I feel for our IP editors, but that horse left the barn years ago (and wouldn't even be an issue at all if the WMF listened to its editors with regard to SITE, but that's an entirely different kettle of surstromming). This is something that must be done, because the alternative is to wait until we're forced to - under terms we probably won't get to dictate. - The Bushranger One ping only 07:10, 13 January 2018 (UTC)
  • Support. WP:IP addresses are not people. IPs are free to do basic editing, but this is a situation where the risk of damage outweighs the benefit of the edit to a template. My second choice would be limiting the protection to templates that are used in a dozen or more articles. Dennis Brown - 18:56, 13 January 2018 (UTC)
  • Support semi-protecting templates with ~250+ transclusions (or some other reasonable number determined by consensus). Tony Tan · talk 03:59, 17 January 2018 (UTC)
  • Support This is becoming an increasingly easy way to make a small edit and do a lot of damage. There are parts of Wikipedia we just don't trust with editors who haven't shown an ability to edit. These are important to the inner workings of Wikipedia. Templates, I feel, fall under that distinction. This won't stop until they are protected from vandalism. --Tarage (talk) 22:22, 22 January 2018 (UTC)
  • Support Template vandalism is an increasing problem, and as the more commonly transcluded templates get protected, it is the more obscure templates - often on pages with less traffic and/or watchers looking after them - which will be targeted more frequently. Template vandalism is harder to fix for most editors than regular article vandalism, and does more damage to Wikipedia as a result. IMHO, the damage done by not protecting the Template namespace like this greatly outweighs the inconvenience caused by preventing IP users from editing what is, in all fairness, a fairly niche and specialist area of the project. Yunshui  23:18, 22 January 2018 (UTC)
  • Support Template vandalism is a real problem, and it's easy for a vandal to stumble across a template - they don't need an understanding of the software. We already restrict access to many templates to template editors. Hawkeye7 (discuss) 00:11, 23 January 2018 (UTC)
  • Weak oppose We really need something to help with this, the amount of damage that can be done is bewildering, and usually we're told about it from a random person on IRC, after the spammer has garnered 1000s of views. I don't think we should limit template editing because of our own technical shortcomings. I know there's a plan in the works to do away with the need to purge, but a "recursive purge" is a necessity at this point. Drewmutt (^ᴥ^) talk 00:53, 23 January 2018 (UTC)
  • Support : Same suggestions as Ahecht. Template talk pages should be excluded of course.   —  Hei Liebrecht 16:02, 25 January 2018 (UTC)
  • Weak support I don't like this proposal, because it restricts one of our principles, but recent template vandalism has made me change my mind. !dave 17:44, 27 January 2018 (UTC)
  • Oppose: WP:IPs are human too. Besides the ideological grounds upon which WikiMedia was founded, this proposal essentially adds significant impediments to the development of templates by anonymous IP users who are already censured en masse far too much. Though a significant amount of vandalism is done by this class of users, so too is a significant amount of useful content generated by such users. The natural progression is to extend this to apply similar mass protections of Module space. Please consider that this proposal penalizes a large class of editors for the actions of a few (albeit persistent) in this class. I understand there is a balance between the work needed to police vandalism vs. the value generated by the class of users penalized but I believe this proposal goes too far. We hear from registered users about the increasing issues of vandalism but remember so too is there the increasing issues of censure to a group of users that already have a difficult time making their issues heard. Remember this sort of proposal not only blocks the vandals but also blocks anonymous IP users (which are a large number of eyeballs) from reverting vandalism so this could in fact increase the damage done by more determined vandals. 50.53.21.2 (talk) 05:17, 28 January 2018 (UTC)
  • Oppose - templates should be semiprotected individually. --NaBUru38 (talk) 19:39, 31 January 2018 (UTC)
  • Support At whatever transclusion number comes by consensus. Ronhjones  (Talk) 17:59, 3 February 2018 (UTC)

Discussion (permanently semi-protect the Template space)

  • If the proposal is adopted, then the transclusion count cut-off point should be somewhere above 200. The vast majority of templates are navboxes: they don't get vandalised often and they do require quite a bit of maintenance that anons are generally able and willing to help out with. – Uanfala (talk) 15:46, 24 December 2017 (UTC)
  • I haven't observed the editing history of templates much, but Uanfala's comment above makes sense. Also, by doing this we will take away one more thing from the "anybody can edit" thingy. Yes, anybody can edit, but you need an account. Also, you need to wait for 4-5 days, and make 11 edits before you can edit.
    Also, if a vandal is going to vandalise through templates, I think he is already familiar with the concepts of socks and sleepers. So I don't see much point in implementing these proposals. courtesy ping to Uanfala to let them know that their comment was moved. usernamekiran(talk) 23:44, 24 December 2017 (UTC)
  • The template namespace not only contains templates but also their talk pages. New users should be able to ask for help on the talk pages of templates, so the talk pages shouldn't be semi-protected. ChristianKl 22:57, 31 December 2017 (UTC)
    • So far two ideas seem to be floating: one, a bot that protects any template with more than 10 transclusions (uses); two, a sitewide configuration change for the template namespace. In neither case would talk pages be affected. :) Killiondude (talk) 03:31, 1 January 2018 (UTC)
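For the first idea, a rough sketch of what such a bot could look like (illustrative only; the threshold and edit summary are placeholders, the account running it would need admin rights, and a real task would need BRFA approval plus exclusions for navboxes and the like):

<syntaxhighlight lang="python">
# Rough sketch of the transclusion-threshold protection bot floated above.
import pywikibot

site = pywikibot.Site("en", "wikipedia")
THRESHOLD = 10  # 100, 250 and 500 have also been suggested in this thread

def transclusion_count(page, stop_at=THRESHOLD + 1):
    """Count transcluding pages, stopping once the threshold is clearly passed."""
    refs = page.getReferences(only_template_inclusion=True, total=stop_at)
    return sum(1 for _ in refs)

for template in site.allpages(namespace=10, filterredir=False):
    if template.protection():               # skip anything already protected
        continue
    if transclusion_count(template) > THRESHOLD:
        template.protect(reason="High-use template (bot trial)",
                         protections={"edit": "autoconfirmed", "move": "autoconfirmed"})
</syntaxhighlight>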

At least can we have semi-protection for 100+ transclusions? Another incident occurred today that shows why it really is needed. {{Sidebar person}} is transcluded on 564 pages, including Donald Trump. Yet it was completely unprotected, and was vandalized by someone so that a huge "fuck donald trump" showed for at least 2 hours (based on a reddit post), and was still visible 2 hours after the vandalism was reverted (only for logged-out users, because apparently they get a cached version of the page - so regular editors did not see it). I'm thinking that after that automatic semi-protection, at admin discretion, maintenance templates that shouldn't be changed much can be preemptively template/semi protected. Galobtter (pingó mió) 11:25, 1 January 2018 (UTC) I'm wondering if there's a smarter way to do it. Templates that are transcluded onto other templates (like that one was) should be protected at a lower count. I'm sure there are reasonable ways so that sports stat templates etc. are not protected while ones like these are. Galobtter (pingó mió) 11:52, 1 January 2018 (UTC)

I did not catch this particular vandalism, however, I want to underscore the fact that now that this template is protected, we have effectively restricted the number of editors that can respond to (i.e., revert) vandalism of this template (by more determined vandals). This can in fact cause more damage due to vandalism as the vandalism can potentially exist longer despite anonymous IP users seeing the issue but being unable to directly do anything about it. 50.53.21.2 (talk) 05:51, 28 January 2018 (UTC)
This sub-transclusion problem is of course a major one not addressed above. What today's vandalism shows, yet again, is that it's not merely templates with tens or a hundred transclusions, but those with several hundred transclusions. The templates which generated this thread had around 1,000 transclusions. But the real problem is not the vandalism itself but the caching and the lack of any effective means to bust the caches. -- zzuuzz (talk) 12:03, 1 January 2018 (UTC)
Another tool might be an admin ability to mass purge the cache, perhaps caches generated in a certain period of time (when the template was vandalized) linked to a certain template. Galobtter (pingó mió) 12:15, 1 January 2018 (UTC)
Has anyone confirmed or heard if that was seen on any other page or did it just happen to Donald Trump? Emir of Wikipedia (talk) 16:49, 1 January 2018 (UTC)
Yes others[5][6]. -- zzuuzz (talk) 16:54, 1 January 2018 (UTC)
Wikipedia:Purge#forcerecursivelinkupdate is interesting Galobtter (pingó mió) 16:56, 1 January 2018 (UTC)
Apparently it can force a cache update of all transcluding pages... Galobtter (pingó mió) 16:58, 1 January 2018 (UTC)
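A hedged example of that purge call (the title is a placeholder; the parameter queues link updates for pages that transclude the purged template, which can create a large job queue, so it should be used sparingly):

<syntaxhighlight lang="python">
# Example of action=purge with forcerecursivelinkupdate (the purge API requires POST
# and may be rate-limited depending on configuration).
import requests

API = "https://en.wikipedia.org/w/api.php"
resp = requests.post(API, data={
    "action": "purge",
    "titles": "Template:Example",
    "forcerecursivelinkupdate": 1,
    "format": "json",
})
print(resp.json())
</syntaxhighlight>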

If we want to lower the impact of vandalism, we would be far better off enabling as many users as possible to respond to vandalism rather than restricting both vandalism and any response to vandalism to ever smaller groups of users. Vandals by definition are a small set of users and as such can always be thwarted by a larger populace. Adding restrictions just means vandals will have to become more determined to bypass the restrictions. As such, restrictions are not a sustainable practice. Currently anti-vandalism tools are difficult if not impossible to access and use by anonymous IP users which are in fact the largest set of users and editors within WikiMedia projects. 50.53.21.2 (talk) 06:08, 28 January 2018 (UTC)

  • Comment: for some mystical reason, nobody has mentioned the Wikipedia:High-risk templates guideline. --NaBUru38 (talk) 20:02, 31 January 2018 (UTC)
  • I'm still of the opinion that everything in this namespace should be under Pending Changes protection/Deferred changes/whatever we want to call it, for edits not made by sysops. Having 2 people look over each other's code changes is no more than normal in ANY space where code is deployed to so many people at the same time, and seems perfectly reasonable to me. I'm not sure why we are bothering with semi protection; it just feels like a moving goalpost for the vandals, who will just go on to the next level of least resistance. —TheDJ (talkcontribs) 14:58, 7 February 2018 (UTC)
    @TheDJ: I think we'd want to exclude bots and template editors as well. The former as they are already tightly controlled by task approvals, and the latter as they are specifically selected for this function and many are more skilled in this maintenance than some sysops. — xaosflux Talk 15:32, 7 February 2018 (UTC)

Hashing out a number

It looks like about half of the "oppose" votes are opposing the "blanket" semiprot, which I sorta get. That half also mentioned that an alternate option was an "X+ transclusions" option. Seeing as how the % of !votes who would be amenable to that is more than the "hard oppose", I think it's time to flesh out a number. The numbers that were thrown around the most were 10+, 100+, and 250+. So, despite my personal concern that we'll never agree on anything, I'd like to see if we can try. Primefac (talk) 02:21, 4 January 2018 (UTC)

  • 100+ - it's a high enough bar that the sports-type templates that frequently get updated by the helpful IPs won't be affected, but keep "bigger" templates from causing more harm than necessary (and <100 pages is a piece of cake for someone with AWB to null edit in a hurry). Primefac (talk) 02:21, 4 January 2018 (UTC)
    250+ per the comments below. Still a low bar for AWB/null edits. Primefac (talk) 13:56, 8 January 2018 (UTC)
  • 250+; I've made my reasons why clear above. —Jeremy v^_^v Bori! 02:27, 4 January 2018 (UTC)
  • 250+ or 10+ semi-protected pages. (I'm not sure this suggestion is feasible) Templates like Template:Duke Blue Devils men's basketball navbox should be able to stay unprotected. Templates transcluded on high-profile pages should have a lower threshold. power~enwiki (π, ν) 11:56, 4 January 2018 (UTC)
    Wait, are we talking semi-protection, or pending-changes protection? I could support pending-changes for the entire namespace. power~enwiki (π, ν) 12:17, 4 January 2018 (UTC)
    @Power~enwiki: I don't think the software supports pending changes in templates, as the software will always transclude the latest version. I can't find where I read that, though. -- John of Reading (talk) 07:23, 5 January 2018 (UTC)
    Not to mention that that would be too much of a strain on the hive of idiots that is CRASH. Like I said above, only utter fools would want so many pages CRASHlocked. —Jeremy v^_^v Bori! 20:36, 5 January 2018 (UTC)
  • No numbers please. If we came up with a rule that references a particular number I'm afraid this will have the effect of discouraging the exercise of common sense. If there's any take-home message from the above discussion, it is that the circumstances vary between templates and that some basic judgement should be exercised. If a template is unlikely to ever be edited – say, if it's simply a wrapper for a module, or it produces some very simple code that is unlikely to be changed – then it may be protected even if it has a low number of transclusions (say, 30). On the other hand, if it's a large template that is likely to need some sort of regular maintenance (like a navbox) then it usually doesn't make sense to protect it, even if it's got thousands of transclusions. – Uanfala (talk) 15:23, 11 January 2018 (UTC)
    The entire reason for this discussion is because of the increasing frequency of vandalism regarding templates with hundreds of transclusions, since it grossly disrupts articles the template is then transcluded on; hence the numbers. Blanket semi-protection of the Template: space isn't workable or viable, so the goal should be to eliminate the most tempting low-hanging fruit to prevent this sort of vandalism. —Jeremy v^_^v Bori! 21:43, 11 January 2018 (UTC)
  • My druthers would be to protect all of them, as only protecting "the most tempting low-hanging fruit" will simply make the fruit on the next branch up increase in temptation value. But if a number must be set, 10+. - The Bushranger One ping only 07:10, 13 January 2018 (UTC)
Unfortunately, the same is true with semi-protection; it isn't difficult to reach autoconfirmed level with trivial edits, even in a sandbox... —PaleoNeonate – 10:39, 13 January 2018 (UTC)
Template vandalism is virtually always drive-by, however. Setting up an autocon-buster takes time, and since the goal of these accounts is to cause disruption for a quick laugh then move on, that takes more time and dedication than they are willing to spend. —Jeremy v^_^v Bori! 17:48, 13 January 2018 (UTC)
Unfortunately it also takes the same determination for someone who sees vandalism to respond to it. Blanket protections at ever lower transclusion counts mean fewer people can respond to more determined vandalism. 50.53.21.2 (talk) 06:19, 28 January 2018 (UTC)
Most of the unregistered users/readers who see template vandalism generally don't recognise it as such and look in the article only to come up empty-handed - which is why template vandalism is disruptive to the point where blanket protection of all templates that are transcluded on X number of pages is a far better option than seeing dozens or hundreds of articles all vandalised at once. —Jeremy v^_^v Bori! 06:55, 28 January 2018 (UTC)
I am curious how you arrive at such a determination. Do you have some convincing metric? I also believe we need more than just a generic translusion count as a measure of impact. As an example Template:Sandbox other is currently semi-protected with over 3000 transclusions, but not a single one of them is in article space. 50.53.21.2 (talk) 08:07, 28 January 2018 (UTC)
The discussion above is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.

A Bot for Creating Arthropod Stubs

I am interested in adding about 17,000 new stub articles for arthropod species. These can be added over a period of several weeks or a few months with a bot. The article content is significantly better than the minimal stubs frequently seen on these and similar Wikipedia topics.

In my mind, these stubs will serve three primary purposes:

  • They will give users some idea of the organism, including references and, if available, a photo or two.
  • They will make online and print references available to users.
  • They will contain enough material to make it convenient for editors to expand the article. A casual editor can spend significantly less time expanding an existing article than creating a new one, because it takes time to learn and work with the various templates and other "standards". An article with online references makes it even more likely for the article to be expanded into a start-class or better article.

I have requested and received positive input from the Arthropods and Tree of Life projects. I have been manually posting new stub articles generated for a variety of arthropods, primarily insects and spiders, over the past few weeks at the recommendation and encouragement of project members and interested editors. I have received positive feedback, and the stub quality has evolved and improved significantly. Several of the stubs have already been expanded by various editors.

Today I applied for approval for bot operation, and was directed to the Village Pump for more discussion.


Operation Details

VB source code is available at User:Qbugbot/source.

Species selection

ITIS has most long-established arthropod species in its database, although it does not have many of the newer species and may not reflect recent reorganization of genera and higher taxa. The species in BugGuide generally reflect the latest research, and are limited primarily to species photographed or collected in North America by its users. By selecting the species that appear both in ITIS and BugGuide, we end up with a set of non-controversial species (from a taxonomic standpoint) that are not overly rare or obscure (a sketch of this selection step follows the list below).

  • 35,000 arthropod species are in BugGuide.
  • 23,000 of these are in ITIS (out of 250,000 total arthropod species in ITIS).
  • 17,000 of these have no Wikipedia article.
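A minimal sketch of that selection step, assuming the three name lists have already been exported to plain-text files (one name per line; the file names and matching rules are placeholders, not the bot's actual code):

<syntaxhighlight lang="python">
def load_names(path):
    """Read one scientific name per line into a set."""
    with open(path, encoding="utf-8") as f:
        return {line.strip() for line in f if line.strip()}

bugguide = load_names("bugguide_species.txt")         # ~35,000 names
itis = load_names("itis_arthropods.txt")              # ~250,000 names
existing = load_names("enwiki_arthropod_titles.txt")  # articles already on Wikipedia

candidates = (bugguide & itis) - existing             # ~17,000 expected
print(len(candidates), "species stubs to create")
</syntaxhighlight>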

Article creation

Articles are created in the following steps:

  • A taxobox template is created using the (taxonomic) ancestry of the "bug", selecting appropriate ranks for the taxobox. For example, subfamily should be included except in Lepidoptera with no subfamily common name. An image is included if available. Synonyms are added if they appear in the ITIS database. (Other catalogs may have ridiculous numbers of synonyms.)
  • A text introduction is generated, such as "Andrena perarmata, the well-armed andrena, is a species of mining bee in the family Andrenidae," giving the scientific name, common names (if any), taxonomic rank, an ancestor's common name, and the scientific name of the family or order (a sketch of this step follows the list).
  • This is followed, as available, by the distribution range, the IUCN conservation status, Hodges number, ITIS taxonomic notes, additional images, and a list of taxonomic children (if any). If there are too many children, a link to a separate list page (created afterward) is included. The distribution data comes from ITIS, World Spider Catalog, or Odonata Central.
  • References may include inline citations, general references, further reading, and external links.
  • A Wikimedia Commons template is added if there are photos, a Taxonbar is added if Wikidata has this bug listed, the appropriate Wikipedia category is selected, and the proper stub template is selected for the talk page.
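A hedged sketch of the introduction-sentence step; the record fields below are hypothetical stand-ins, not the bot's actual (VB) data structures:

<syntaxhighlight lang="python">
def intro_sentence(rec):
    """Build e.g. 'Andrena perarmata, the well-armed andrena, is a species of
    mining bee in the family Andrenidae.'"""
    name = f"''{rec['scientific_name']}''"
    if rec.get("common_names"):
        name += f", the {rec['common_names'][0]},"
    return (f"{name} is a {rec['rank']} of {rec['group_common_name']} "
            f"in the {rec['parent_rank']} {rec['parent_name']}.")

example = {
    "scientific_name": "Andrena perarmata",
    "common_names": ["well-armed andrena"],
    "rank": "species",
    "group_common_name": "mining bee",
    "parent_rank": "family",
    "parent_name": "Andrenidae",
}
print(intro_sentence(example))
</syntaxhighlight>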

Article upload

Articles are created on demand for upload. An article will be uploaded only if no article exists for that title. If one does exist, the article will be skipped. No existing articles will be altered. If the taxonomic parent of an uploaded article does not exist, it will be generated and uploaded. If a list of more than 100 "children" is included in the article, it will be split off as a separate list article. A talk page with the proper stub template is created for each article.
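An illustrative guard for the upload step, assuming the wikitext has already been generated (Pywikibot is shown here for brevity; the actual bot is written in VB, per the source linked above):

<syntaxhighlight lang="python">
# Mirrors the description above: skip existing titles, never alter existing articles.
import pywikibot

site = pywikibot.Site("en", "wikipedia")

def upload_article(title, wikitext, summary="Create arthropod species stub (bot)"):
    page = pywikibot.Page(site, title)
    if page.exists():
        return False            # an article already exists - skip it
    page.text = wikitext
    page.save(summary=summary)
    return True
</syntaxhighlight>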

Manual verification

During the test period, every article created will be viewed on Wikipedia to verify that it exists, it is the correct article, and the information is proper. Later on, the text of all the day's articles will be downloaded and verified manually or automatically. At least one article daily will be manually viewed on Wikipedia.

Sample articles

Here is a list of some random test articles generated and manually posted on February 1.

Bob Webster (talk) 05:27, 3 February 2018 (UTC)

Qbugbot discussion

In general, we strongly discourage the creation of one-line stubs, whether by human or by bot; experience has shown that they're rarely if ever subsequently expanded, and just end up being unwatched (and consequently vandal-magnet) clutter that clogs categories, makes Special:Random unusable, and gives Wikipedia a bad reputation. Is there a specific advantage in this case (a) to having an individual microstub article for each of these species rather than a few long lists from which individual articles can be created and linked if there's enough to say about a particular species to justify a stand-alone article, and (b) to doing it on Wikipedia (which really isn't designed to handle this kind of thing) instead of on Wikispecies (which is)? If you haven't already, I'd recommend reading Wikipedia:Village pump (policy)/Archive 66#Automated creation of stubs, which is the thread that created the current policy when it comes to mass article creation, and ensure that you have an answer for every objection that was raised back then. ‑ Iridescent 09:02, 3 February 2018 (UTC)
Many of these concerns/questions are addressed at m:Wikispecies FAQ (which I found at this BRFA by Ganeshk that occurred relatively shortly (7 months) after the archived VPP discussion linked above). I'm sure we can find many examples of good and bad bots. The key is rigorous community vetting, which I think has been, and is, going on.   ~ Tom.Reding (talkdgaf)  14:44, 4 February 2018 (UTC)
I strongly support use of this bot, after a proper careful test phase (as planned above). So far, what I've seen seems reasonable. The linked discussion is from 2009 and we're 9 years past that; the objections there seem vague. Articles are articles; it doesn't matter if a bot or a human or a dog or a devil makes them: they have to be judged on their merits. The stubs generated by this bot seem still better than having nothing. cyclopiaspeak! 11:44, 3 February 2018 (UTC)
Make sure that the bot is posting accurate information; we have had bad luck in the past with mass created species articles being full of errors. I am also minded to oppose any mass creation of one line stubs; please try to add some more information. Jo-Jo Eumerus (talk, contributions) 12:02, 3 February 2018 (UTC)
Support as long as you use Template:Speciesbox (edit | talk | history | links | watch | logs). These seem better than stubs, almost start-class. Nessie (talk) 13:25, 3 February 2018 (UTC)
Today I changed over from Taxobox to Speciesbox (and Automatic taxobox for higher ranks). It involves adding to the Taxonomy templates, which should help reduce "unrecognized" messages editors might encounter using the Speciesbox. Bob Webster (talk) 05:22, 4 February 2018 (UTC)
Super! thanks @Edibobb:! Nessie (talk) 23:11, 4 February 2018 (UTC)
As noted in the previous discussions, I also support this bot concept. Those are high quality species stubs, rich in refs; you can't tell me something like Leptinus orientamericanus isn't useful for the reader, even if there's only one line of text proper. The genus and higher level articles with their taxa lists should be particularly welcome. - But we really should make sure these do not have to pass through the NPP queue; otherwise I think this would singlehandedly murder the backlog situation. --Elmidae (talk · contribs) 18:00, 3 February 2018 (UTC)
  • Tentative support: those seem like pretty high quality species stubs, but I am concerned what impact this would have on New Page Patrol. We have just fought down from a massive backlog (22,000) to something more reasonable. We still don't have a handle on it yet, with the backlog having stalled around 3.5k. These stubs will need to be reviewed by a human New Page Patroller for errors. At the very least the articles should be spread out over a period of several months so as not to overload NPP (for 90 days that is still 188 added per day, for comparison we currently have to review around 350 each day to keep up). Not banging the concept, but don't bury us in new articles (even if they turn out to be almost entirely good ones). I generally Oppose granting the bot autopatrolled. These articles should be reviewed by a human at least once IMO. Dragging it out over a few months will be fine (although I think it would be ideal if all the bot-added articles for the day were added at the same time on that day, so that they are stacked up and can be reviewed as a single block easily). Insertcleverphrasehere (or here) 19:43, 3 February 2018 (UTC)
Indeed. Unless you all trust the bot enough to grant it the Autopatrolled user right, please don't do it. You'll kill WP:NPP. Mduvekot (talk) 20:03, 3 February 2018 (UTC)
If the bot is granted the bot flag it will have autopatrol, so the articles would not show up in the NPP queue. — JJMC89(T·C) 20:12, 3 February 2018 (UTC)
Support, and support granting the bot autopatrolled status, on the basis of the example articles shown above. One minor point: in Arhyssus scutatus, the markup included {{taxonbar|from=Q10417852}}; {{taxonbar}} will suffice, as shown here. It's a pity that {{Taxobox}} doesn't yet pull its data from Wikidata. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 20:28, 3 February 2018 (UTC)
Concerning adding the Q codes to {{taxonbar}}, @Pigsonthewing:, perhaps @Tom.Reding: can answer this better than I, but I believe the movement in this direction is to better track and maintain taxa with upcoming maintenance categories. Nessie (talk) 23:25, 3 February 2018 (UTC)
The change I suggest should make no difference to that. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 01:30, 4 February 2018 (UTC)
Thanks for the information on the Q codes (and the support). The problem I had was that unless the Q Code was specified, the taxonbar did not appear on about half the pages. I don't understand why, so I've just been adding the Q code to everything I can. Bob Webster (talk) 05:22, 4 February 2018 (UTC)
Could you specify a page this happened on (or will happen on the next time you notice it) so that we may try to recreate this behavior? Perhaps it's just lag between Wikidata, Wikibase, and Wikipedia for newly created pages?
It looks like it could just be a lag. I removed the Q code from some pages created a few days ago, and taxonbar showed up fine (except for a couple without Wikidata records, which had no Q code to begin with). I did the same on some pages created yesterday (Acinia picturata, Schizura badia, and Eucapnopsis), and the taxonbar was not visible without the Q code. Bob Webster (talk) 15:40, 4 February 2018 (UTC)
While the |from=/|from1= parameter is not required for {{Taxonbar}}, it is desired from a tracking perspective. |from=/|from1= should only be removed if the Wikidata entity (Q code) has no relation to the page (i.e. it was erroneously added to {{Taxonbar}}).   ~ Tom.Reding (talkdgaf)  06:14, 4 February 2018 (UTC)
The parameter - which affects the data displayed - should not be relied upon for tracking; to do so may have unintended consequences. Consider, for example, the case where a duplicate is detected and so Wikidata items are merged, or the Wikipedia link is moved to a more appropriate item in Wikidata. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 15:16, 4 February 2018 (UTC)
Yes, that is an intended consequence, and precisely what we want to track & check.
The behavior of {{Taxonbar}} has changed in the last month or 2, largely due to the efforts of Mellis, Ahecht, Peter coxhead, and community discussions. One of the improvements is that the primary Wikidata entity is always displayed as the top row, regardless of |from1=. Also, no duplicate entities are ever displayed. Discrepancies are shunted to one or more tracking categories (though not yet live). There is no tracking category for duplicate |from#= values, and it should probably be added in the future (though it's not a priority now since this is a very fringe case). For more info I encourage you to read through the discussions at Template talk:Taxonbar.   ~ Tom.Reding (talkdgaf)  15:56, 4 February 2018 (UTC)
Did you try purging the affected pages? And items? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 15:16, 4 February 2018 (UTC)
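For readers following the exchange above, the two forms under discussion differ only in whether the Wikidata item is stated explicitly. A minimal wikitext sketch (Q10417852 is the entity quoted above for Arhyssus scutatus):

{{Taxonbar}} <!-- relies on the article's Wikidata sitelink to find the item -->
{{Taxonbar|from=Q10417852}} <!-- also records the expected item, supporting the tracking described above -->

Both forms should render the same identifier bar when the article is correctly linked on Wikidata; the |from= parameter only matters for maintenance tracking.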
Thanks! I've converted the bot to use {{Cite journal}}, {{Cite web}}, and {{Cite book}} for all references. Bob Webster (talk) 04:10, 7 February 2018 (UTC)
Bob Webster, well done! I was just going to post to WT:TREE for more attention, but now there's no need. Could you link to 10+ pages your bot would have created—perhaps you can overwrite existing pages you created, that have no other intervening edits, with the new code incorporating CS1 templates? Preferably ones that are representative of the possible spectrum of the different sources drawn upon (i.e. all 10 using identical sources wouldn't be helpful). The templates can get very exacting sometimes. Creating a separate subsection below or within #Sample articles for these would be helpful too for others just coming to the discussion.   ~ Tom.Reding (talkdgaf)  18:28, 7 February 2018 (UTC)
I've added some sample articles below, and would welcome any comments or suggestions. Bob Webster (talk) 02:07, 8 February 2018 (UTC)
Also, in case you'd like to scan a lot of the references, I left a few hundred of them (about half I'm using) here. I was using Wikipedia to point out some invalid fields. I'll try to leave them for a couple of days. Bob Webster (talk) 06:55, 8 February 2018 (UTC)
Thanks! The bot now uses {{speciesbox}} for species and {{Automatic taxobox}} for higher level taxons. The higher level templates are added as necessary. Bob Webster (talk) 04:10, 7 February 2018 (UTC)
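For readers who have not clicked through to the samples, the overall shape of a generated species stub is roughly as follows. This is only an illustrative sketch with an invented species and an invented reference, not the bot's actual output; the real articles are defined by the source at User:Qbugbot/source and the samples linked in this discussion.

{{Speciesbox
| taxon = Examplella exemplaris
| authority = (Smith, 1905)
}}
'''''Examplella exemplaris''''' is a species of beetle in the family Carabidae, first described by Smith in 1905.<ref>{{Cite journal |last=Smith |first=J. |year=1905 |title=New ground beetles of North America |journal=Hypothetical Entomology Journal |volume=12 |pages=34–36}}</ref>

==References==
{{reflist}}

{{Taxonbar}}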
  • weak support The samples above do look to be adequate stubs and are better than some of the work people come up with. I would prefer not to put in pages that are too small. However someone should look at the page at some point to see if it is OK. Graeme Bartlett (talk) 06:33, 7 February 2018 (UTC)

I've added a new list of sample articles below. The latest source code is available at User:Qbugbot/source. Thanks for the suggestions! Please let me know if you run across anything else. Bob Webster (talk) 19:32, 12 February 2018 (UTC)

Qbugbot sample articles

Here is a list of some random test articles generated with qbugbot and manually posted on February 12. They now include changes recommended below by Trappist the monk, Tom.Reding, and Elmidae. These are the latest of about 2,000 arthropod stub articles that have been generated by qbugbot and posted manually since the first of the year. Bob Webster (talk) 19:29, 12 February 2018 (UTC)


  • Grabbing one at random - Tricholita chipeta - I do wonder if whatever algorithm selects the "Further reading" is really on the level. There's a number of "Further reading" entries here that seem irrelevant, e.g. this, which deals with two species in the same family but has no further connections. How are these determined? This kind of thing should be avoided. - Otherwise looks good, I think. --Elmidae (talk · contribs) 07:48, 8 February 2018 (UTC)
You're right. This reference and a lot of others have been assigned too broadly. The way it has been done is that the parent of the important taxon (the genus, in this case) is selected and the reference is assigned to all its descendants (in this case, the subfamily Noctuinae). But it's a huge subfamily, and there are over a thousand descendants! Needless to say, a reference about those two species is not applicable to most of them. I'm reassigning a few hundred references to take care of situations like this. Bob Webster (talk) 15:27, 9 February 2018 (UTC)
  • I couldn't find anything automatically amiss with these; only 5 not-worth-the-edit minor WP:GenFixes were found on as many pages.
Manually,
  • Anthaxia viridifrons, Melanophilini, Phaenops californica: had CS1 maint tags due to |date=2008-2009, which should simply use an en dash instead of a hyphen. Use of volume & number params too (tho I'm not sure whether or not an en dash was needed for |number= since "No." was used previously).
  • Smicronyx spretus, Lacinipolia triplehorni, and others: if the url is a doi, it'd be useful to include it as a doi too (I think there are bots that do this)
  • Hermetia hunteri, Lobophora nivigerata, and possibly others: are |pages=475/|pages=1016 the # of pages in the books? (they shouldn't be! should be the location of the supporting material only; otherwise it doesn't belong; I left it alone for now)
  • Lacinipolia triplehorni, Metalectra bigallis, and others: periods are used after initials in some templates, but not others. Can you standardize this usage on a page-by-page basis? (no explicit error here, just a suggestion)
  • Sphaeroderus bicarinatus: |pages=323 + 22 plates looks odd, but I have no suggestions for anything more appropriate. Can anyone else comment on this? (|id= exists, but doesn't look appropriate)
  • Also, trailing periods (i.e. |publisher=Houghton Mifflin Company.) aren't needed in CS1 templates (other than for abbreviations), as the template takes care of all otherwise-superfluous punctuation. No errors found regarding this, just fyi to help avoid potential future errors.
I should also say - well done! And so quickly too.   ~ Tom.Reding (talkdgaf)  17:59, 9 February 2018 (UTC)
Bellamy is not a journal so {{cite journal}} for that is inappropriate. Because it comes in five volumes, mixing the book's volume designation with the publisher's series issue number seems nonsensical: 1–5 (76-80) implies that we mean issues 76–80 from each of the five volumes. cs1|2 does not support the notion of series numbers. Pensoft is the publisher; series, if it is necessary, is Faunistica. Similarly, Nelson, et al. is also a book, not a journal; The Coleopterists' Society is the publisher.
urls to dois should generally not be placed in |url= when there is |doi= because that constitutes overlinking and because most dois are behind paywalls (|url= is generally considered free-to-read). When the doi is free to read, the citation should be marked with |doi-access=free
|pages=323 + 22 plates looks like the total-number-of-pages-issue that was mentioned so should not be done
publisher names do not include corporate designations: 'Company', 'LLC', 'Ltd', 'Inc', etc
Trappist the monk (talk) 13:21, 10 February 2018 (UTC)
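Pulling the points above together, a citation incorporating this advice would look something like the following. This is a hedged sketch with invented bibliographic details (author, title, DOI), not a real reference: the date range uses an en dash, the publisher drops the corporate designation and trailing period, |pages= gives only the pages supporting the text, and a free-to-read DOI goes in |doi= with |doi-access=free rather than in |url=.

{{Cite book |last=Doe |first=J. |date=2008–2009 |title=Catalogue of Example Beetles |volume=2 |publisher=Houghton Mifflin |pages=123–125 |doi=10.1000/example |doi-access=free}}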

Please mention classification system in taxobox

I have a request for improving the taxobox. A taxobox for a taxon should explicitly state which system of classification has been used, or whether a mixture of systems has been used (although that is highly discouraged). This is important because classification systems change: not only do taxa get merged and split, but the ranks of taxa sometimes change, and, although quite rarely, rank names change too. So whenever a taxobox is published, please state which classification system is followed. Ideally a taxobox would contain two or three columns for the hierarchies according to separate classification systems. This would not only improve the correctness of the articles, but would also work as a better reference and would help literature searches.

RIT RAJARSHI (talk) 16:55, 13 February 2018 (UTC)

Please clarify what you mean by classification systems. It's just ranked vs. unranked, somewhat like a common system vs. scientific system scenario. --QEDK ( 🌸 ) 19:29, 13 February 2018 (UTC)
If one uses Template:Speciesbox (edit | talk | history | links | watch | logs) and Template:Automatic taxobox (edit | talk | history | links | watch | logs) then the citations should be in the associated taxonomy template. Nessie (talk) 20:01, 13 February 2018 (UTC)


Audio samples for language articles

I noticed that the article about the German language has a spoken audio sample, but many other language-related articles (like Spanish) don't. Would it be good to add audio samples to other articles describing languages (including English dialects)? I find them useful since they help to form a picture of how the target language sounds. --Sir Beluga 16:26, 9 February 2018 (UTC)

You could tag the talk pages with {{Audio requested}}. Nessie (talk) 21:26, 10 February 2018 (UTC)
Hello, Sir Beluga! You can find audio samples here, and more specifically here. Good luck! --NaBUru38 (talk) 00:44, 15 February 2018 (UTC)

Disable messages left by InternetArchiveBot

The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.


This topic has seen a lot of debate, but I think we might want to consider turning off the function that leaves messages on talk pages.

See user:InternetArchiveBot for context. This talk page shows a typical instance of the bot's messaging process, which it performs after pointing a dead link to an archived copy of the page hosted at the Internet Archive. The bot has posted hundreds of thousands of these messages on various Wikipedia article talk pages since 2016.

Why?

  • IABot's error rate has dropped to near negligible numbers.
  • Users really interested in checking the edit will do so without being asked to.
  • If the bot did make an error, with IABot's newest upgrade in v1.6, simply reverting the bot is enough to report a bad edit as a false positive.

Question for you guys, should I turn off the bot's messaging feature? — Preceding unsigned comment added by Cyberpower678 (talkcontribs) 19:51, 11 February 2018 (UTC)

Yes

  1. Support The costs of the bot posting messages to the talk page are significant and outweigh the benefits. Every time a user engages with a message, either by reading it, scrolling past to ignore it, or following its instructions, that user is donating time and labor to Wikimedia projects. If a message on average consumes 1 second and there are 500,000 messages, that is roughly 140 hours of labor. Since the messages remain in place for years I think that they actually consume perhaps 30 seconds each on average over the course of their lives, meaning that the crowdsourced checks from this messaging system are consuming thousands of user volunteer hours. If anyone advocates for keeping this messaging system in place then I call on those people to make their best estimates of the labor costs to keep the messaging system. I previously raised this issue in March 2017. In October 2017 other people discussed this. Now the situation is even more clear, as we have evidence that the bot almost never makes mistakes. Also, now this bot can get alerts from history log revisions, which means that communication need not happen on the talk page and can instead happen by undoing or reverting its edits. Whenever any bot is going to do anything on wiki 100s of 1000s of times, we have to be very careful to estimate and measure how much wiki crowdsourced labor those bot actions will consume. In general, bot actions cannot be a time sink for human attention, and whatever a bot does needs to relieve human time, not accumulate it. The costs of keeping talk page messages outweigh any benefits which I have seen anyone describe. I feel that in the past, people who have advocated keeping messages have had in mind that the value of wiki volunteer time is 0, or that they refuse to acknowledge that it is possible for wiki editor human attention to have a value worth quantifying. In fact, our human labor is extremely valuable and we have to protect it. IABot is awesome, thanks everyone for developing this project both technically and socially. Blue Rasberry (talk) 20:01, 11 February 2018 (UTC)
  2. Support per nom and Blue Rasberry, and let's not forget the costs to the operator and the WMF of making, storing, and displaying those article talk page edits.   — Jeff G. ツ 20:32, 11 February 2018 (UTC)
  3. Support the standard revision history is sufficient. power~enwiki (π, ν) 20:35, 11 February 2018 (UTC)
  4. Support. I never quite saw the reason to do these as hardly anyone reviews them. The argument was that they would be... but I don't think there's any evidence for it. There's millions of dead links and thousands every month. I doubt these messages are being reviewed. —  HELLKNOWZ   ▎TALK 21:21, 11 February 2018 (UTC)
  5. Support It's essentially a duplication of effort for the bot and the readers of the talk page messages. Nihlus 21:23, 11 February 2018 (UTC)
  6. Support Yes please. They're a victim of the bot's own success, and have been rendered unnecessary. ~ Amory (utc) 21:57, 11 February 2018 (UTC)
  7. Support Every edit made by IABot, every archive link it adds, is checked and verified by WP:WAYBACKMEDIC. I happen to know exactly what the error rate is :) It's small enough that it would drive a human a little bonkers trying to look through the good ones for a bad one. Also, IABot has an online tool, API and database where much of this information is available. If someone wanted to know "Show me all archive links in an article modified by IABot", that information is available; a simple tool could be made to display it without needing to clutter the talk pages. -- GreenC 22:30, 11 February 2018 (UTC)
  8. Support Assuming a very low error rate. @GreenC, Cyberpower678: what is it?   ~ Tom.Reding (talkdgaf)  23:15, 11 February 2018 (UTC)
    My error rate is based on the number of incoming false positive reports in relation to the overall edit rate. Ever since IABot v1.6 was deployed, it would automatically detect reverts made by a user, identify the reverting user, and report every link change reverted as a possible false positive for me to review. Of the reports that came in, only 3 or 4 were actually false positives. Since GreenC says he has the exact number, I'll let him answer that question.—CYBERPOWER (Be my Valentine) 00:07, 12 February 2018 (UTC)
    Well.. now someone had to ask :) It's complicated. One problem is that I can't say whether a given bad link was added by IABot or by someone else. It doesn't go through revision history. Roughly speaking though, WaybackMedic is able to "do something" with about 2% of the archive links, to fix problems. "Do something" might mean removing the archive because it doesn't work, adding an archive to a dead link, changing to a different archive service, changing the timestamp, modifying the URL encoding, removing garbage characters in the URL (eg. trailing ":" at the end of a URL), etc. However, for many of these the root cause has nothing to do with IABot. Anyway, I can verify about 98% of the archive links in the system are working, and WaybackMedic can then fix the rest to bring it over 99% (except for soft404s or verifying content matches the cite, which require human checks). -- GreenC 03:36, 12 February 2018 (UTC)
  9. Support but last one I checked Talk:List of Mountain Bothies Association bothies#External links modified (January 2018) seemed entirely wrong to me. I expect the referenced website was down at the time of checking. There seemed to be no feasible way to report the problem so the message on talk didn't help. Thincat (talk) 23:37, 11 February 2018 (UTC)
    There is a feasible way. Revert the edit. IABot will catch it and auto-report on your behalf.—CYBERPOWER (Be my Valentine) 00:07, 12 February 2018 (UTC)
    That's good then because that is what I did. I didn't realise that would happen. Thank you. Thincat (talk) 08:14, 12 February 2018 (UTC)
  10. Weak support, or perhaps keep the notification process but put the text in an envelope so those that wish can open and review. GenQuest "Talk to Me" 04:23, 12 February 2018 (UTC)
  11. Support, thanks for the ping Bluerasberry. Chances are, it'll get checked sooner or later by someone invested in the topic. There may be times when it would be useful, but by and large this seems a superfluous feature. I reiterate my comment from the last time I weighed in on this by saying there's usually, if not always, plenty of space for the bot to write in a link for users to ping for false positives, à la ClueBot NG. With OP's comments that a simple reversion works exactly the same way, however, I find my comments have even more force to them. Zeke, the Mad Horrorist (Speak quickly) (Follow my trail) 05:28, 12 February 2018 (UTC)
  12. Support I see these messages all over the place, with scarcely any feedback. I have checked a couple, but I am unlikely to check any more. All the best: Rich Farmbrough, 13:14, 12 February 2018 (UTC).
  13. Support turning it off except for actually useful messages. We do not need pointless botspam about having done trivial things like providing an archive URL for a cite that didn't have one, and other non-controversial actions. If it is not highly likely to need human review, then don't pester the talk page about it. It triggers an endless stream of watchlist entries for no reason, and is very annoying.  — SMcCandlish ¢ 😼  17:47, 12 February 2018 (UTC)
  14. Support, this is mostly clutter on zillions of talk page. Use an edit summary link instead, pointing to an info page if needed. Headbomb {t · c · p · b} 18:36, 12 February 2018 (UTC)
  15. Support (slightly reluctantly). They're noisy, both on talk and in everyone's watchlist, eat precious editor attention, hang around forever, and serve very little purpose given a presumably very low error rate. I would very much like an alternate "gnoming facilitator", but on the balance I think the downsides far outweigh that one positive. Especially since, in my experience, the vast majority of the messages are simply ignored: a potential, but unrealised, benefit does not outweigh actual downsides. --Xover (talk) 19:13, 12 February 2018 (UTC)
  16. Support for the same reasons others have voiced. It clutters talk pages and is not particularly controversial. Killiondude (talk) 05:38, 13 February 2018 (UTC)
  17. Summoned by ping Turn it off. While i understand Xaosflux's argument that the monetary/computing costs are low, i believe the human/time costs are too high. These messages quickly become mere noise, so are skipped over, i believe, leading to no great benefit in leaving the message; if a person is going to check what the bot has done, he is probably going to do so in the presence or absence of a message, likely having been drawn to the page by watchlist or by checking the history of the article. Happy days, LindsayHello 09:05, 13 February 2018 (UTC)
  18. Support It is very annoying and unnecessary, with little benefit (if any). I almost asked for this, but on searching the archives and seeing many people had asked without success, I gave up. Please stop it ASAP. –Ammarpad (talk) 10:26, 13 February 2018 (UTC)
  19. Support per SMC. Ealdgyth - Talk 14:33, 13 February 2018 (UTC)
  20. Support as this is too trivial to bring up on an article's talk page. The overwhelming majority of these talk page posts I've seen are ignored, and to most editors they appear to serve no other purpose than to create clutter. The situation is similar to other cases where it's desirable for minor edits to be reviewed for errors, like Cluebot's reverting of vandalism, or human editors' fixing of links to disambiguation pages (for the dabfixes that end up on my watchlist, the average error rate is an order of magnitude higher than the stated upper limit to this bot's error rate, and yet no-one would want to see a talk page post for every dablink fixed). The only useful feature of these talk page posts is that they allow an editor to mark up the archive link as checked, so that other editors don't have to waste time checking again. But realistically, such a surplus of eager editors is likely to be available only on the most popular articles; and at any rate, any way of marking an archive link as "reviewed by human" had better happen in the link template itself (rather than on the talk page). – Uanfala (talk) 19:30, 17 February 2018 (UTC)

No

  1. leave it on I often check the archiving, and in contrast to what WaybackMedic may think, a significant proportion, perhaps 10%, of archiving has failed to produce a useful result. Sure there is often a page archived, but it may be a database query page with no database behind it, an error message saying the page does not exist, a domain squatter message, or a shell of a page with all the dynamic content missing. In these sorts of cases it is much harder for a bot to figure out something went wrong, and a human can easily see. Then they need to track down where the page or database went. The bot message gives a suitable summary that enables observing the results easily. Just looking at the page edited is much harder, as you have to find the needle in the haystack of the modified link. Perhaps the message could be less verbose, taking up less real estate on the talk page though! Graeme Bartlett (talk) 22:52, 11 February 2018 (UTC)
    This is a trivial problem. The solution to it is to have a page where the bot logs these things, and the gnomes who want to clean up after them can watch that page. On an article-by-article basis, it makes no difference. An article with a valid source is 100% as validly sourced if the bot adds an unhelpful archiveurl; this is not something that particular article's watchlisters need to be notified about, and they'll already see it in the edits to the article anyway; being verbally notified about it is just spammy and pointless.  — SMcCandlish ¢ 😼  17:50, 12 February 2018 (UTC)
  2. leave it on. The bot’s activities are unique among bots, in that they often require checking. The bot is dumb. It can check for a dead link but it cannot check if there is a better link: maybe the site has reorganised its archives, or is at a new domain address, or even a search turns up a better one. For references this is not as important – a reference can be verified from an archive link. But external links are generally meant to be the best – so the current and most up-to-date – links for that subject. So having the bot report what it’s done is essential, to ensure editors have the chance to check it. If the messages are not posted the danger is the bot's edits will not be noticed, and opportunities to fix old links will be overlooked. I updated a couple of articles’ links only two days ago - My Love Is a Bulldozer and My So-Called Life (Venetian Snares album) – and would almost certainly not have done so if there was no talk page report, just an entry in the page history.--JohnBlackburnewordsdeeds 23:04, 11 February 2018 (UTC)
    I'm not sure how I feel about you calling my bot dumb. If you knew how much code and intelligence goes into this, just to do what it does now, you'd change your mind. Unfortunately, the bot can never determine what would be a better link as it would need google class infrastructure to pull that off.—CYBERPOWER (Be my Valentine) 23:15, 11 February 2018 (UTC)
    Dumb in the way I described it. It can check a link is dead but does not have the smarts to work out why and find where it’s moved to, or if there is a better replacement. Yes, that level of AI is beyond any bot. But that’s why it’s valuable to check its changes, something that would happen far less often if the bot did not leave a note. It would look like any other bot, which editors are used to ignoring.--JohnBlackburnewordsdeeds 23:59, 11 February 2018 (UTC)
    True. If the URL doesn't automatically redirect to the new location, then the bot will never know that it isn't dead.—CYBERPOWER (Be my Valentine) 00:09, 12 February 2018 (UTC)
  3. leave it on: Like other editors asking the same, I use it as an opportunity to expand citations (particularly when they are in a language other than English). Not only does the bot often link to an archived redirect to a top level page, it picks up 404 error pages, etc. It can't confirm that the citation is RS (there are plenty of instances where they're not), nor can it tell whether it's picked up the correct save on a dynamic page. I have a backlog of notifications I'm working through slowly (it's grunt work, but I've found plenty of replacements for dead links and other problem archives for articles). When I add another article to my watchlist, I go through these notifications where they have not been marked as checked. You'd probably be surprised at how many times I've found that content has been tampered with subsequently, or the citations are intentionally falsified, or - as happens uncomfortably often in articles dependent on languages other than English - AGF errors are made due to a poor knowledge of English leading to mistranslation. I'm sure I'm not the only editor who uses the talk page prompts as a method of catching up on the integrity of an article. --Iryna Harpy (talk) 00:41, 12 February 2018 (UTC)
  4. Leave it on: The talk page messages provide a handy way to get to the affected links and check them efficiently. I'm not a fan of the extra watchlist entries, however, and would be open to another system for noting the affected links. Graham87 02:30, 12 February 2018 (UTC)
  5. Leave it on The "costs" are entirely notional and vastly overstated based on zero research. The benefit is that the bot's actions are transparent. Posing a zero cost-real benefit CBA answers itself. Eggishorn (talk) (contrib) 15:50, 12 February 2018 (UTC)
  6. Leave it Bytes are cheap, and users are not required to read the messages if they don't want to, and I don't see the harm in leaving notice on talk pages. --Jayron32 16:00, 12 February 2018 (UTC)
  7. Leave it The transparency of the bot action, with the notice and what response may be needed by humans, is worth even the claimed downside above. Response is also more likely to lead to improvement of the bot. Alanscottwalker (talk) 16:07, 12 February 2018 (UTC)
  8. Leave it on The bot's talk page messages have provided a useful synopsis for checking links. If such a synopsis exists elsewhere, then link to it in the edit summary and let's see if that's a good replacement, before turning off the current messaging. I still find questionable edits by IABot, even though I'm not checking regularly. I hardly ever correct its edits by reverting, so a system of registering reversions isn't enough. In any sort of cost-benefit analysis, you would have to include the time it would take live editors to maintain the sort of link viability that IABot does automatically. Even at its most imperfect, it was more efficient than manual checking and updating of links. Dhtwiki (talk) 23:08, 12 February 2018 (UTC)

Discussion

I am pinging people who have engaged in previous conversation about the bot messages.

Blue Rasberry (talk) 20:07, 11 February 2018 (UTC)

@Xaosflux: I wish to avoid encouraging you to enter a conversation which you do not find interesting, but if you are open to conversation, then I would like to hear more about how you think of the costs. You are a wiki bureaucrat whose role includes reviewing bots, and I am interested in your opinions on the extent to which bot consumption of human labor factors into your decision to approve bots.
I am imagining an equation where
(amount of human time which bot consumes) * (value of human time) = cost of bot
Your personal view and default opinion might be
(bots always consume 0 human time) * (human time has a value of 0)= bots always have 0 cost
My view in this is
(10 seconds on average per IABot post) * 500,000 posts * US$15/hr = 5,000,000 seconds * 15/hr = bot consumes human labor with value of ~$20,000
All of this is a guess, but it is challenging for me to imagine how lower estimates could be reasonable. Any time estimate or labor value estimate multiplied by 500,000 or 1,000,000 leads to a large cost. I am seeing an unsustainable call for human labor engagement in highly tedious bot activity and I would not want typical wiki contributors feeling a draw to engage in social interactions with bots in preference to more human activity which the wiki environment should promote instead. No human will make a loud request for help a million times, but it is possible for bots to do this, and as a general rule I advocate that we should not allow such loud automated requests to compete with human-to-human interaction. If you have another view of the value of bot/human interactions or other numbers to plug into the equation then I would like to hear what you think, because it is challenging for me to understand what it means to socialize with bots that seem vocal about making demands 10,000 at a time. What insight do you have to detect any misconception in my narrative here? Blue Rasberry (talk) 23:17, 11 February 2018 (UTC)
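Spelling out the arithmetic in the estimate above, using the same assumed figures (10 seconds per post, 500,000 posts, US$15 per hour): 10 s × 500,000 = 5,000,000 s ≈ 1,390 hours, and 1,390 hours × US$15/hr ≈ US$20,800, which is the "~$20,000" quoted.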
@Bluerasberry: I do review bots as part of BAG and I would not have introduced a requirement for this bot to do this. As for the "costs" of storage space on the WMF servers, edit rates/bandwidth etc - in general if this is causing a problem the server ops team will let us know (it rarely is). As far as talk page notices go - we fill those up with all sorts of things like project banners, continually re-evaluated class information, etc. I've never been bothered by any of these as an editor but I can see where others might not find it useful. Like I said above, if it's not useful, stopping it is the way to go. And this discussion is a good way to help determine that it is or isn't useful. I'm generally opposed to forcing a bot operator to make an edit they don't want to (or quit operations) unless there is a compelling community reason why it is necessary. In this instance the operator doesn't want to make these edits, so I'm generally supportive of their request. Hope that helps. — xaosflux Talk 23:25, 11 February 2018 (UTC)
@Xaosflux: I feel like I have failed to communicate clearly. I am talking about contributor labor and time, whereas you seem to have understood that I am talking about the cost of electricity. Please cease considering servers, processing, or anything unrelated to the time of individual humans.
The talk page posting which IABot currently does is in the family of Chatbot or Spambot behavior, and the bot review process should identify and restrict spambot-style social interaction between bots and humans. Right now we are having a human-to-human conversation and you are giving your labor to this. The wiki environment should encourage these kinds of interactions. The wiki environment should not encourage human interactions with spam bots. Bots have the potential to draw humans into giving their attention to tasks which no human should do. From the start this IABot had a near 0% failure rate, and now it has a failure rate much closer to 0, yet very loudly, to many thousands of users, the bot requested their time, attention, and labor to check the bot's work. I understand that the wiki community was cautious about the editing this bot does and that it originally made sense that the bot would be cautious and give disclosure when entering the wiki environment. I do not fault anyone for assuming that talk page posts would be useful, but now that we know how bots can work, I never want to see any bot have in its design an objective to solicit a massive amount of human attention directed to social interactions with bots ever again. The wiki community should be cautious about letting anyone automate thousands of requests that human attention go to bots. If any bot has inherent in its design the intent to request 100s of Wikipedia contributor hours then I think that should be made clear in the review process and that there should be an evaluation about whether 1000s of Wikipedia contributors should all give their attention to the social experience which the bot is putting in front of everyone.
This bot has made not fewer than 500,000 talk page posts when a typical mass posting is 100 posts. This is crazy off the scale of what we normally address. If you feel that this bot is consuming an insignificant amount of time, or if you feel that human interaction with this bot's posts were typically beneficial for the wiki environment, then I would like to hear that from you.
To what extent am I making my concern easy to understand? Blue Rasberry (talk) 00:20, 12 February 2018 (UTC)
I already said I'm in support of letting this operator change this task barring any community consensus that it is needed; should the community decide these messages are a net positive, then it's not my place to override them. Personally, I don't mind if they stay or go. Community input to proposed bot tasks at WP:BRFA is always welcome, and BAG strives to ensure that community support is in place for large tasks, which are always available to be re-reviewed. It does appear a bit verbose, and I think the best improvement would be for the operators to change to using a link in their edit summary that goes to a more expanded FAQ subpage of the bot's userpage. This should be able to summarize the general usage and give instructions for people to better give feedback to the operators. — xaosflux Talk 02:41, 12 February 2018 (UTC)
  • Thinking out loud: I'm undecided on this question as yet. I'm generally inclined to accept the bot operator's position on the assumption that they have superior insight into the issue (and Cyberpower doesn't think it serves any purpose). And similarly, I subscribe to Xaosflux's principle above: if a bot owner doesn't want to make a type of edit, there should be a very compelling community interest present to justify forcing them. I also find Bluerasberry's and related cost arguments salient: both the raw technical cost of extra edits etc. (which is small but non-zero, and which may grow to be non-negligible over time), and the human cost of the wasted effort to deal with the notices (but not the effort when humans actually intervene for whatever reason).
    All that said, however, I am still ambivalent. As a foundational premise, I do not see every edit on every article in my watchlist. Or even most edits on most articles on my watchlist. There are simply too many edits and too many articles. Further, there are very few editors working in my area, so the odds are very low that any given edit gets sufficient, or even just any, review. This is a major problem in general (and it applies to many areas, not just my little sub-specialty), but it also applies to IABot's edits (and absent meaningful review, you cannot tell what IABot's actual approximate error rate is).
    On that background, I think I want some way to keep track of IABot's activities, in my area, at some granularity coarser than individual edits, but not so coarse as to be binary. The current talk page notices are that: they persist past the individual edits scrolling off my watchlist. But they still deal with just one edit to one article. So maybe the current setup is just too fine-grained?
    Perhaps a sweet spot, for me, would be a periodical (weekly, say) report of IABot's activity on articles in my WikiProject? This would be analogous to Article Alert bot and Cleanup lists. That gives me some place to go check periodically, and the edits to the report page will show up in my watchlist if I'm actively working on Wikipedia; or if it gets updated at a predictable interval (as the Cleanup list is) I'll know to check it when time allows.
    And not every article on my watchlist is within the scope of my WikiProject, but it is the articles in the WikiProject that I care enough about to want to track this closely. For others there may be similar aggregations such as a category; but a tracking/interest aggregation greater than the individual article in any case. And a change aggregation greater than the individual edit (i.e. typically time based, even if the optimal interval is something other than my preference for one week).
    There are also issues of "discoverability" (how do you find the IABot tools interface? A link in the edit summary vs. a descriptive text permanently on the talk page are very different propositions!), and quality (IABot is very good, but not perfect; it does still need humans in the loop, for "false negatives" and "other" issues, in addition to what above gets called "false positives" for some reason). But these are of lesser import, I think.
    So where does this leave me? Well, it leaves me undecided, obviously. Perhaps cyberpower678 could chime in on the technical feasibility and their interest (or lack of, obviously) in implementing some kind of report system? And it would be useful to know if others have any interest in something like this, or whether the vast majority are simply on one side or the other of the binary Yes/No to per-edit talk page messages? — Preceding unsigned comment added by Xover (talkcontribs) 06:18, 12 February 2018 (UTC)
Hmmm...I'm not inclined to create one right now. A lot of other plans for IABot right now. Though honestly, as GreenC said IABot's API allows tools to actually access similar information by going to https://tools.wmflabs.org/iabot/api.phpCYBERPOWER (Be my Valentine) 13:39, 12 February 2018 (UTC)
API documentation: https://meta.wikimedia.org/wiki/InternetArchiveBot/API -- GreenC 16:06, 12 February 2018 (UTC)
@Cyberpower678: Yeah, I kinda figured as much. Keep it in mind as a possible long-term wishlist kinda thing? In any case, I can't see anything in the API docs that would cover the kind of stuff I describe above. I'd be looking for something like "Which of 'my' articles had a link go dead, or had an archive added (that might be a bad archive), in the last 30 days?" The API, afaict, is entirely oriented around the URLs and their current state, not the articles and their history, if that distinction makes sense? --Xover (talk) 19:04, 12 February 2018 (UTC)
The discussion above is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.

Discussion: Use of AR-15 Style Rifles in Mass Shootings

A discussion is taking place at:

Interested editors are invited to participate. --K.e.coffman (talk) 20:29, 18 February 2018 (UTC)

Aliases for arbitration cases

Following up on the discussion at Wikipedia talk:Arbitration Committee/Noticeboard/Archive 36#Community feedback: Proposal on case naming, I created a proof-of-concept template to create aliases for the cases opened in 2017 and 2018. Please see Wikipedia talk:Arbitration/Requests#Aliases for arbitration cases for more details. isaacl (talk) 02:28, 21 February 2018 (UTC)

Budget for games and changes in the representation for film budgets

Hi everyone, I have two suggestions. Should we show the budget in articles about computer games? This is already done in film articles.

Also, should we have a special entry for marketing costs when talking about a film budget? For example, an article might say that a film's budget is 25 mil. The box office might be 30 mil. A reader might then believe that the film made a profit of 5 mil. However, with film the marketing can very often double the overall cost. So, if a film's budget is 25 mil plus 25 mil of marketing, then you would have a loss of 20 mil., not a 5 mil. profit. The same could be done with computer games.

Video game developers rarely have those figures published in reliable media. It is something we would like to include were it possible under the requirements of WP:RS. --Izno (talk) 12:10, 21 February 2018 (UTC)
Couldn't these just be added to the respective infobox templates? Then, if sourced information is found, it can be included. Nessie (talk) 14:41, 21 February 2018 (UTC)

Hi all, thanks for the replies and suggestions. Is there an easy way to give, for example, all computer games a budget field? And yes, I of course agree that the sources for these numbers should be reliable. Also, would eventualism be OK? I'm pretty confident that I can get reliable sources for the newer and bigger games; for the older or smaller ones that might be a bit more difficult.

Template:Infobox video game (edit | talk | history | links | watch | logs) looks to be the one to edit. If you aren't sure how, ask on the talk page. Nessie (talk) 14:06, 24 February 2018 (UTC)

RFC: Autopopulate Category:Infobox templates when we can

Editors support this proposal to change Category:Infobox templates to a {{all included}}/{{tracking category}}.

Cunard (talk) 01:39, 12 March 2018 (UTC)

The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.

tl;dr version: I propose that we change Category:Infobox templates to a {{all included}}/{{tracking category}}, and get all the benefits this would entail.

Current situation/Proposed solution

Currently Category:Infobox templates is set up as a {{container category}} [with a handful of infoboxes which are mistakenly in the category], which is pretty much completely useless. Infoboxes are sub-categorized in an extremely incomplete way, so finding anything is almost impossible, simply because this scheme is pretty much ignored, on top of being just badly designed in general. Try to find something like {{Infobox journal}} from the main category, and you'll be spending a long time browsing things before realizing it's located in

or

And that's when it's categorized in the first place. You can't get to {{Infobox ABL team}}, because it's not categorized.

So to fix/improve this situation, I propose that we change Category:Infobox templates to a {{all included}}/{{tracking category}}. This would have several benefits, such as (non-exhaustive list):

The category could be automatically populated and correctly sorted by the {{Infobox}} template itself (and other similar templates) in the vast majority of cases (at the very least anything that starts with Template:Infobox foobar can be), with the rest being manually categorized. This wouldn't change the subcategory system, so editors could still find {{Infobox element}} via

if they prefer browsing that way. Headbomb {t · c · p · b} 12:43, 21 February 2018 (UTC)
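As a rough illustration only, the kind of wikitext that could drive the automatic population is sketched below. This is not the implementation that would actually be deployed ({{Infobox}} is Lua-based, and the proposal sorts under the template name minus the word "Infobox", which needs a string-handling step not shown here); it only shows the idea of the wrapper template categorizing the pages that transclude it:

<includeonly>{{#ifeq:{{NAMESPACE}}|Template|[[Category:Infobox templates|{{PAGENAME}}]]}}</includeonly>

Placed in the {{Infobox}} wrapper, something of this shape would put every template built on it (e.g. {{Infobox journal}}) into Category:Infobox templates without edits to the individual infoboxes, while articles using those infoboxes would be unaffected because of the namespace check; sandboxes and the cornercases discussed below would still need manual handling.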

!Vote (Infobox categorization)

As the preliminary talkpage discussions show, this example originated with "automation", not with "redefinition". A pity that has not been corrected now, though I must say some early incidental issues are addressed.
* 'Cornercase' examples, each with a different reason/origin (i.e. each example here represents a single issue): {{Taxobox}}, {{Chembox}}, {{Infobox3cols}}, Wikipedia:Manual of Style/Infoboxes, {{PBB/2239}}, {{Outline header}}, {{Infobox Wikipedia user/sandbox}}, {{Infobox drug/licence}}. - DePiep (talk) 12:30, 25 February 2018 (UTC)
As explained above, not everything with the infobox class is an infobox, so that's why the proposal says all infoboxes, rather than all templates with a class=infobox in them somewhere. As for your cornercases, this has also been explained above. Those can be handled by adding [[Category:Infobox templates]] manually to the templates (or subpages) when appropriate. Headbomb {t · c · p · b} 23:09, 25 February 2018 (UTC)
Then what defines something as being an infobox? Also, not all 'cornercases' are solved above (and let's keep in mind that 'cornercases' are not exceptions; they may constitute 50% of all infoboxes). -DePiep (talk) 11:40, 26 February 2018 (UTC)
See WP:INFOBOX. Headbomb {t · c · p · b} 12:14, 26 February 2018 (UTC)
Self-contradicting: WP:INFOBOX is about articles, so off-mainspace boxes are not infoboxes by definition (etcetera: as I predicted, now all the 'cornercases' must be handled with a one-by-one argument). This knot-tying could be prevented by first defining which infoboxes should be categorised. Also, still no answer on how the subcategory contents will be treated. (BTW the pattern of reasoning in this topic is this: A. Some statement is made, B. I point to a problematic issue, C. Reply: no that is not an issue and it would be solved in such and such way.) - DePiep (talk) 20:51, 26 February 2018 (UTC)
At this point, you're clearly trolling. WP:INFOBOX is about infoboxes, not articles. To quote "An infobox is a panel, usually in the top right of an article, next to the lead section (in the desktop version of Wikipedia), or at the end of the lead section of an article (in the mobile version), that summarizes key features of the page's subject." Concerning "no answer on how the subcategory contents will be treated", that's covered by "This wouldn't change the subcategory system" above, and "All other categories would remain unchanged" below. Start making sense or go away. Headbomb {t · c · p · b} 22:10, 26 February 2018 (UTC)
Another example. The OP says: "[with a handful of infoboxes which are mistakenly in the category]". Clearly, that is changing the definition of this category, which you still don't want to acknowledge. All in all I listed eight different situations, each one stumbling into an "explanation" that does not hold. - DePiep (talk) 11:00, 27 February 2018 (UTC)
Refuse to acknowledge? That's what the entire proposal is about. Changing Category:Infobox templates from a container category to a category that contains all infoboxes. Headbomb {t · c · p · b} 12:19, 27 February 2018 (UTC)
Make this a Permanent objection. The nom has shown no intention of improving the proposal after having been pointed to major design flaws. Also, I listed some eight exception situations, and have had to pull for each and every reply, which then turns out to be "people know", assuming obviousness, contradiction, et cetera — instead of creating a sound base. - DePiep (talk) 11:00, 27 February 2018 (UTC)
  • Support DePiep seems to be knowledgeable about cases where the general rule will not apply and those cases do merit extra and unusual attention. However, to the extent of my understanding, the general proposal made here does apply to many cases, and its implementation would usually bring improvement. It seems reasonable to enact this on a scale of 100s of infoboxes. Blue Rasberry (talk) 00:42, 26 February 2018 (UTC)
One of the arguments is that single-category listing is useful in automated tracking. Still no effort is made or proposed to include all infoboxes systematically. There is only one single automation proposal; the rest is left alone (both pages to include, and pages to exclude). The cornercases I mentioned do represent an uncovered issue each. Instead of implicitly assuming a dozen solutions ("what do we do with /sandboxes?"), the principled solution is to define 'what is an infobox to be categorised'. Also, omitting a sound category definition increases the chaos between the top category and its current subcategories. That is hardly an improvement. - DePiep (talk) 11:40, 26 February 2018 (UTC)
That's mostly because people know what infoboxes are, and are well aware that sandboxes aren't to be categorized in the main category, much like drafts are to be excluded in mainspace categories. Headbomb {t · c · p · b} 12:17, 26 February 2018 (UTC)
Replied under previous bullet. - DePiep (talk) 20:51, 26 February 2018 (UTC)
re people know what infoboxes are - apparently that knowledge differs. For example, it differs between you and me, and between template forms. In general, "people know" is not a strong base for a definition (category redefinition in this case). Also, I mentioned eight non-standard situations, none of which have been answered wrt the category definition. - DePiep (talk) 11:12, 27 February 2018 (UTC)
It only seems to differ between you and the rest of Wikipedians. If you don't know what an infobox is, then you should probably avoid commenting on them until you learn what they are. None of your 'non-standard' situations cause any issues with this proposal. {{Taxobox}}, {{Chembox}}, {{PBB/2239}} are infoboxes and would be categorized and sorted accordingly. {{Outline header}} seems to be a navbox and would probably not be categorized (it's unused anyway, so it could probably simply be deleted). But if the community deems it an infobox, then it would be categorized. {{Infobox Wikipedia user/sandbox}} is a sandbox and would not be categorized (although {{Infobox Wikipedia user}} would be). {{Infobox3cols}} is an infobox metatemplate like {{infobox}} is and would get the same sort of update as {{infobox}} and would remain categorized as it currently is (at the top of the category). {{Infobox drug/licence}} is a sub-template used by an infobox and would not be categorized. WP:Manual of Style/Infoboxes is a style guideline about infoboxes, and while not an infobox itself, it is part of a small set of core guidelines related to infoboxes, and would remain categorized as it currently is (at the top of the category, along with other similar pages). This is obvious to everyone but you. Headbomb {t · c · p · b} 12:02, 27 February 2018 (UTC)

Discussion (Infobox categorization)

Why would that be better? We have an existing category. There's no downside to using that one and making it useful. Headbomb {t · c · p · b} 22:39, 23 February 2018 (UTC)
@Bluerasberry: Not quite sure what's asked here exactly. The only thing different would be that {{Infobox journal}} would be categorized in Category:Infobox templates, sorted under 'Journal'. All other categories would remain unchanged. Headbomb {t · c · p · b}
@Headbomb: I am asking for you to talk through changes to {{Infobox journal}} as an example. You seem to be proposing to add [[Category:Infobox templates|Journal]] to {{Infobox journal}}, right? Blue Rasberry (talk) 23:13, 23 February 2018 (UTC)
@Bluerasberry: No, that would be done by {{Infobox}} automatically. No edits would be needed to {{Infobox journal}}. Headbomb {t · c · p · b} 23:19, 23 February 2018 (UTC)
@Headbomb:, Okay, so you will not add that category directly, but the change you are proposing will indirectly put that article into that category, correct? Blue Rasberry (talk) 23:21, 23 February 2018 (UTC)
Correct. There will be some corner cases that will need to be handled manually, but those are a minority. Headbomb {t · c · p · b} 23:24, 23 February 2018 (UTC)
@Headbomb: Okay, and in general, a template named "Infobox Whatever" will be sorted under "Whatever", i.e. the name of the template minus the word "Infobox", right? And you are saying that the minority of manual cases will be the ones which for whatever reason do not have that sort of typical name? Blue Rasberry (talk) 13:42, 24 February 2018 (UTC)
Pretty much, yup. Headbomb {t · c · p · b} 14:52, 24 February 2018 (UTC)
Thanks. Blue Rasberry (talk) 00:42, 26 February 2018 (UTC)

The discussion above is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.

RFC here, please join -> MediaWiki_talk:Wdsearch.js#Add_link_to_search. --Superchilum(talk to me!) 10:50, 28 February 2018 (UTC)

April 1st... Wikipedia as Viewdata (Page 100 for Main Page?)

The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.


As many people here know, there are occasional April 1st items posted on English Wikipedia. However, this year I wanted to ask if there would be any interest in doing something a little unusual.

During the 1980s in the United Kingdom there was a viewdata system called Prestel, which used a presentation format not dissimilar to the Teletext system used by the BBC's "over the air" CEEFAX. I'd be interested, for April 1st, to see what English Wikipedia might have looked like as a viewdata-based system c. 1988.

I've posted a similar proposal on French Wikipedia (although for obvious reasons the base system would be the French TeleTel/Minitel). It would need a moderate amount of technical effort, as currently Commons doesn't necessarily support a view for the Teletext frame formats that the GPL frame editor I found supports.

ShakespeareFan00 (talk) 10:59, 27 February 2018 (UTC)

  • I like it. The old "Let's post funny but true stuff" on April Fools is getting a bit old. Something fresh and different is welcome here. --Jayron32 13:00, 27 February 2018 (UTC)
  • If I am reading this right, the proposal is essentially a new skin that is then enabled by default on April 1st? To even be considered it would need a level of polish that might not be possible in just a month and a half, and would definitely need a big 'off' button clearly visible to users as well. There is also mobile to think about, etc. — Insertcleverphrasehere (or here) 13:30, 27 February 2018 (UTC)
Not necessarily a new skin, but development of derived content, given the 40x24 layout and numeric linking approach (Ceefax used 100-999, whereas Prestel used a 32-bit number range IIRC). Not sure how those would translate on wiki (article titles or id-hashes maybe?) 13:53, 27 February 2018 (UTC)
There's a font called Bedstead which emulates the old-style teletext font - http://bjh21.me.uk/bedstead/ ; there are also some online frame editors, but I'm not sure of the license details. ShakespeareFan00 (talk) 13:53, 27 February 2018 (UTC)
A monospace font? --NaBUru38 (talk) 18:28, 27 February 2018 (UTC)
The font is monospaced, but see below. 19:19, 27 February 2018
  • Strongly oppose. It took years to get rid of the idea that Wikipedia should celebrate April Fools Day and to see off the "strange but true" folks (except on DYK, where they're still grimly hanging on); we don't need yet another pack of self-appointed comedians. Wikipedia exists as a service to its readers, not for its editors to slap each other on the back about who can be the biggest smartass. Aside from anything else, if you're planning to impose a 40X24 character layout, you'll need to persuade the people who run all the existing elements of the main page to withdraw; I can't imagine someone who's spent years bringing an article up to FA standard, and is hoping to run it on 1 April because that's a significant anniversary (and there is currently a backlog of 6 articles waiting to run on this date at Wikipedia:Featured articles that haven't been on the Main Page/Date connection) is likely to be very impressed. ‑ Iridescent 18:38, 27 February 2018 (UTC)
This was not intended to replace the current Main Page on April 1st. It would be parallel content.ShakespeareFan00 (talk) 19:13, 27 February 2018 (UTC)
It seems there's little interest in this. Abandoned due to the reasoning stated above. ShakespeareFan00 (talk) 19:19, 27 February 2018 (UTC)
The discussion above is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.