
Plagiarism discussion

Since the issue of plagiarism has come up here recently, I thought those following this page might be interested in joining or reading the discussions at Wikipedia talk:Plagiarism. Carcharoth (talk) 09:55, 27 June 2008 (UTC)

DYK poor fact checking issues -- does it matter when articles on main page are wrong?

Does it matter that the articles in DYK are often plagiarized and wrong? Today's list has a fact from Deux Balés National Park which maps the Black Volta River in far eastern Burkina Faso--it's not. The river on the map, in far eastern Burkina Faso is the Oti. The Black Volta is just west of center. I don't think that complaints about problems on the main page are welcomed. But DYK appears to be out of control. Do editors earn rewards for DYK contributions? There is not much time spent fact checking. Even Wikipedia could have been used to fact check this article, and see that it's wrong.

I think the rules could stand to be changed to include some fact checking and plagiarism checking time. --Blechnic (talk) 02:21, 25 June 2008 (UTC)

Great idea! We could really use some extra hands, does this mean you're volunteering? Gatoclass (talk) 02:44, 25 June 2008 (UTC)
We need some kind of special programming for that.--Bedford Pray 02:49, 25 June 2008 (UTC)
No, I'm not volunteering. I had my head bitten off, got attacked by half a dozen editors the first time I pointed out that an article in DYK was plagiarized, and was blocked for a week. Now it appears that articles by Carol Spears that appear in DYK have been heavily plagiarized, but the ones I've reviewed have also been wrong. I've seen quite a few wrong and plagiarized DYK articles. Again, it's been made abundantly clear by how I was treated that knowing the articles on the main page are factually inaccurate or plagiarized is unwanted shoot-and-attack-the-messenger information. I suggest, though, that it disgraces Wikipedia. Thanks for moving this Bedford. It doesn't help that it's almost impossible to find where to point out a wrong fact in DYK. You don't even have to be an African geographer to have seen that west of the Black Volta is not in eastern Burkina Faso. --Blechnic (talk) 02:56, 25 June 2008 (UTC)

I've volunteered with this for a very long time. I've pointed out or fixed errors in very many articles and never had any problems like you describe. I'm sorry you had such a bad experience. Of course it matters if articles on Wikipedia are plagiarized, and it's very important that we get them fixed. The way DYK works is that articles are submitted at T:TDYK and sit there typically for five days. During this time editors review the articles and try to catch these errors. It's a huge amount of work and more help is always appreciated. If you notice a problem on an article at T:TDYK then please leave a comment below the article stating the problems. Articles that get flagged as problematic will not be added to the T:DYK template until the issues have been fixed (and if the issues are not fixed, they will not be placed on the main page at all). Of course it's very difficult to fact check everything in every article that comes through the system, which is why more help is always needed; the editors running DYK are human and sometimes miss things or make mistakes. --JayHenry (talk) 03:04, 25 June 2008 (UTC)
Well it's not merely difficult to fact check everything, it's just plain impossible. We are struggling mightily just to check the facts in the hooks alone, which probably amounts to between 50-100 facts per day, to check every fact in every article would mean literally thousands of facts and we would need a fulltime staff of dozens to do that, along with considerable resources. Ain't gonna happen.
However, as I indicated above, if Blechnic has found so many mistakes he is more than welcome to report them, either at main page errors or better still on the suggestions page prior to posting, where we are chronically in need of more people to process submissions. Gatoclass (talk) 03:11, 25 June 2008 (UTC)
I write articles about tropical African agricultural pests, a rarified area on Wikipedia. I don't find, after reading so many plagiarized and inaccurate articles, that DYK is compelling. I usually go to the article after finding the DYK congrats tag on someone's talk page. However, do you think these unrelated articles are more important than some better coverage on African geology, soils, plants and their pathogens, rivers, and geography? Do you think it would be better for me to spend time fact-checking in an area I know nothing about, when I can run through 10 African plant stubs and at least add the information that it's a tree, a bush, a vine, in upper or lower tropical rainforest, a medicinal plant, used to make xylophones? And that I can quickly verify those facts with a glance at a book on my shelf, verify information not found on the web, or access private data bases to verify? Or would it be better for me to do a second rate job fact checking an unrelated pile of articles of varying importance outside of my area? This seems like a call against expertise: no matter what you know, have been trained and educated in, don't spend your time there, but randomly make sure that brand new articles on the front page are accurate?
Why not ask the projects to edit the articles before they go on the main page, give them 3 days after it has been selected to appear, to edit and correct and fact check the article? I'll embarrass them after that. You have almost no one writing about tropical African agricultural pests, don't ask the few editors who can do that to edit a church in New York, the Budweiser Clydesdales, and an industrial metal band and make Wikipedia far more American and limited in world view than it already is. --Blechnic (talk) 03:39, 25 June 2008 (UTC)
I'm not sure if I understand your proposal. How exactly would the system work? [XXX] reviews the articles for DYK eligibility and selects them to appear on the main page. Then [XXX] would be responsible for finding active WikiProjects (bear in mind that the vast majority of projects are not highly active), notifying them of the article and asking them to vet the article over the course of the next three days. After that, [XXX] would promote the hooks which have been checked? -JayHenry (talk) 04:06, 25 June 2008 (UTC)
I'd like to think that involving wikiprojects more closely might lead to better articles, but I'm thinking it might just make things a lot more unmanageable. For one thing, someone is going to have to inform a relevant wikiproject or two whenever an article in its area of interest is submitted. Then we have to hope that someone at that project actually cares enough to vet the article (they aren't doing it on the front page now, will it make any difference if the article is posted on their own wikiproject?). And then at some stage the time for improvement is up and that article enters the DYK pool for promotion. I mean, I can see this becoming a bit of a management nightmare.
Apart from which - let's face it - quite a few wikiprojects are just wp:battlegrounds with groups of wilfully ignorant partisans slugging it out over the fence. Is involving such groups likely to lead to better articles? I doubt it. So I'm not at all persuaded of the benefits of a proposal like this. Gatoclass (talk) 04:48, 25 June 2008 (UTC)
And come to think of it, User:AlexNewArtBot already sorts new articles for WikiProjects, and the active WikiProjects actually monitor the lists the bot creates and check and improve these articles. So there's already a five-day fact checking period and a system to notify editors with relevant expertise who are active. I think I agree that the issue is a lack of manpower and perhaps a lack of understanding from the community regarding what DYK is all about: giving publicity to newly created or expanded Wikipedia articles, as a way of thanking the editors who create new content and to encourage other editors to contribute to and improve that article and the encyclopedia. The articles are allowed to be a work in progress. (Of course it goes without saying that absolutely no edit should be plagiarized or a copyright violation.) --JayHenry (talk) 05:11, 25 June 2008 (UTC)
But that's exactly what is happening, your lead sentences, entire articles, your hooks, the references: wrong, plagiarized, plagiarized, wrong.
On Wikipedia the first thing that always crops up in response to a suggestion, usually faster than anyone can think about it, is why it can't be done, why changing a really crappy method--namely, plagiarizing other people's work and highlighting it on the front page--is wrong, why nothing in the known universe can ever change, ever be invented, ever be imagined to occur in a different way.
And, look, that's about the only constant in the human race through history: the nay sayers. Yet here we are on the internet, and I drove to work today, in a car that gets 40mpg.
Make it the responsibility of the nominator to get the article to a WP or other place for fact checking.
Or, as someone suggested, reward good articles, instead of new articles.
Or block articles from editors who've submitted prior articles with poor fact checking, because DYK is a contest. Or make a tag for their page, or correct the damn articles when someone points out they're wrong instead of attacking and blocking editors who point out they're plagiarized.
I haven't figured out what the prize is yet, but there is one for getting your article in DYK, because that is how a small subgroup of editors act: precisely as if there is a prize, and people are treating it just like that: rack up the points no matter what.
I can't for the life of me figure out how the editor thought that the Black Volta was in eastern Burkina Faso. The article he used as a source didn't say that. No Wikipedia maps say that. None of the articles he references said that. But it didn't matter, because this encyclopedia isn't accurate and doesn't strive for accuracy, just encouraging bad editors. Yet, there it was, a quick and dirty and hastily written article with just that little factoid brightly opening it, a major drainage basin in Africa changed, because crap, apparently "encourages editors to contribute to and improve that article and the encyclopedia."
Sorry, JayHenry, no encyclopedia was ever improved by getting facts in its articles wrong, and if these are the editors you want to encourage, I think DYK is worse off than I could have imagined. But that explains why I've yet to examine a single DYK article that was factually accurate or not plagiarized. --Blechnic (talk) 05:50, 25 June 2008 (UTC)

First article I checked on DYK right now, Beth Wambui Mugo, is plagiarized from this web page. It's rampant. --Blechnic (talk) 06:29, 25 June 2008 (UTC)

This is a stub ... way too short to go on the Main Page in any event. Daniel Case (talk) 13:35, 27 June 2008 (UTC)
Um... you DID see that that source was already listed as a reference and cited multiple times in that article, right? A re-reading of WP:Plagiarism may be in order, especially the "without attributing that material to the original author" bit. Jclemens (talk) 06:44, 25 June 2008 (UTC)
Um, did you see the quote marks in the article? They're not there, that's because it's plagiarized, which it is. If you use material verbatim from another source, you must use quote marks. --Blechnic (talk) 06:52, 25 June 2008 (UTC)
I DEFY YOU to show one of my articles was plagiarized from the web. That's 135 and, except for maybe part of a paragraph in one or two that came from a public domain source like the NPS, none of mine are plagiarized.--Bedford Pray 06:36, 25 June 2008 (UTC)
Oh, shoot me dead, bitch I am, for pointing out the plagiarisms on DYK. Come on, get me blocked, harass me, challenge me, mock me, assault me. Oh, my god! I'm going to go look at your articles now! No, wait, I don't care to read any of yours. I've read a couple of dozen DYK articles. Every single one has contained rampant plagiarism. If that's the company you want to keep, who am I to defy your desires to hang out with your buddies? I did just read an article, the first one, that appears not to have been plagiarized, not only that, the author appears to have used his sources well, a little heavy on the hiking guides, but these are supported by other sources I found. The article needs more references, but even the unreferenced lines don't appear plagiarized, and, wow, imagine that, he didn't copy lines of text verbatim without quotation marks. Oh, wait, I already said he didn't plagiarize. He or she. --Blechnic (talk) 06:52, 25 June 2008 (UTC)
You really aren't going to help persuade anyone with wild exaggerations. I very much doubt you have found "rampant plagiarism" in "every single one" of the dozen DYK articles you have supposedly read. That sort of comment is only likely to damage your own credibility.
As I've said, anytime you find plagiarism problems on DYK, find an active DYK admin and they will deal with it. If you find a lot of it, you will quickly persuade us that we have a significant problem that needs addressing. But merely making unsubstantiated and dubious claims is not going to help your cause. Gatoclass (talk) 08:02, 25 June 2008 (UTC)
Oh, I think if I read a couple of dozen articles on DYK and they all contain plagiarism, that's rampant plagiarism on DYK. And, look, when I was discussing it last night, I didn't have to even go to the archives or my watch list to find examples, I simply went to the main page, picked an article at random, and it was plagiarized. It should not have been that easy to support my accusations--but it was. You ought to worry more about the quality of stuff on the main page.
And, you know what, when I was attacked and blocked for a week for complaining about a DYK plagiarism, that kinda put a damper on reporting it. It also, I suspect, is an indicator of how complaints on DYK are handled: shoot the messenger dead, get rid of them from Wikipedia. --Blechnic (talk) 14:20, 25 June 2008 (UTC)

(←) In response to: "Sorry, JayHenry, no encyclopedia was ever improved by getting facts in its articles wrong, and if these are the editors you want to encourage, I think DYK is worse off than I could have imagined." What on earth are you talking about? Where did I say anything where any possible interpretation of what I said by any remotely intelligent person could lead to the conclusion that you drew? I'm a volunteer here. I'm not in charge. Why would I want to volunteer with somebody like you who's going to distort the things I said? Yep, I'm going to go ahead and take a break from that. Honestly tell me Blechnic how I'm supposed to respond to that ridiculous insult? I've got more important things to do... so good luck. Your work will be easier now that I've stopped my advocacy for incorrect facts and stopped my encouragement of plagiarists. Geez, insult the volunteers at the soup kitchen because there's hunger in the world, huh? --JayHenry (talk) 13:07, 25 June 2008 (UTC)

"Remotely intelligent?" Are you resorting to personal attacks. Remember, first time I complained I was attacked by 6 editors or so, 3 of them administrators, and blocked for a week. Your comment does not deserve to be read, any more than the plagiarized crap on DYK. However, again, it just proves my point: the encyclopedia's quality is second to something else, and any challenges to that something else must be met with personal attacks. Yawn. --Blechnic (talk) 14:20, 25 June 2008 (UTC)
JayHenry, there's a reason that I didn't comment on this thread. It is not productive and it is a waste of time. Have copyright violations occasionally gotten on the main page, and will they again? Yes, probably less than 1% of the time. Does that mean that DYK is broken? No. Blechnic, you can't expect people to know everything. I don't expect you to know things that are common knowledge to racing fans. You shouldn't expect non-Africans to know things that are common knowledge to Africans. Wikipedia can't be proactive about the main page because there isn't enough volunteer time (or interest) to fact check the dozens of articles that appear in DYK each day. Even fact checking the main hook has been time consuming and difficult. There had been copyright violation bots that used to crawl the Wikipedia database looking for copyright violations. Apparently that is no longer happening. I suggest that Blechnic spend time trying to get them to resume searching. Wikipedia will be better. Royalbroil 13:39, 25 June 2008 (UTC)
I'm not an African, and I only criticized the glaring and easily found inaccuracies. --Blechnic (talk) 14:20, 25 June 2008 (UTC)

CorenSearchBot is running today - it aims to spot copyvios of web pages in new articles. I share everyone's frustration that neither it nor a human spotted Beth Wambui Mugo. As for factual errors in new articles, mea culpa. By the way, Blechnic, there is indeed (an illusion of) a prize: DYK is mentioned on 1 or 2 admin coaching pages. --Hroðulf (or Hrothulf) (Talk) 15:52, 25 June 2008 (UTC)

Blechnic, could you link to the incident where you were attacked for pointing out inaccuracies in DYK hooks? I can't find it in your contribs. Olaf Davis | Talk 16:35, 25 June 2008 (UTC)

Check my block log for dates, then run through those links and dates on my talk page. --Blechnic (talk) 23:32, 25 June 2008 (UTC)
I wanted to agree that just reading all the suggestions, checking the facts cited in the hooks (as well as checking for new article length or five-fold expansion), and doing the updates and notifications four times a day is a huge task. In an ideal world every aspect of each article would be checked before it appeared on the Main Page in DYK, but this is not an ideal world. I also note DYK has almost certainly been updated twice in just the time this conversation has been going on.
I wanted to thank Blechnic for pointing out this problem. Any errors on the Main Page are probably best reported to WP:ERRORS for quickest response. I also wondered if you (Blechnic) would be willing to look at the suggested hooks at Template talk:Did you know every two days or so and check just the ones you have some expertise on (Africa related and anything else). Anything not up to snuff could be noted under the suggested hook, as is already done. My guess is part of the problem is that not many people work on or know as much about African topics, so errors in an article on the Budweiser Clydesdales are more likely to be noticed because more people are familiar with them (and I am not asking you to check such articles anyway). Thanks to everyone who keeps DYK running smoothly and to Blechnic for catching some problems. Ruhrfisch ><>°° 16:40, 25 June 2008 (UTC)
There is not "rampant" plagiarism in DYK--Mugo, which I may take the blame for, I thought (and still think, though I will not re-add until I receive clarification) was in the public domain for being part of a Kenyan government source. As many know, DYK is frequently backlogged, and the real problem is the fact that not many admins work on the project. And BTW, there are rules against plagiarism--we just need some admins to do the fact checking. I'm an Editorofthewiki[citation needed] 21:31, 25 June 2008 (UTC)
Also, Beth_Wambui_Mugo was only viewed 21 times in June 2006. Bebestbe (talk) 22:27, 25 June 2008 (UTC)
I'm sorry but it was damn easy to find a case where an article had been copied verbatim just going to DYK--I didn't have to spend 20 seconds searching.
I would be glad to check for African articles on DYK. I can't promise I will find everything, but I realize how foreign much of Africa is to the West, particularly when getting down to details and fact checking. My knowledge of Africa is very poor and geographically confined to western tropical Africa, though. I would like to see more articles on Africa in Wikipedia and more accurate articles. Somewhere besides South Africa and Egypt, that is. --Blechnic (talk) 23:32, 25 June 2008 (UTC)

In my experience, the best way to get more fact checking into Did You Know is to set an example. We already have way more than enough volunteers for the task of ordering others to do fact checking, but not enough volunteers for actually doing it. If you were blocked only for pointing out errors or plagiarism, then of course that is an outrage. But since the block was apparently made for a private email it's your word against his, and in any case you haven't made it easy to research it. Art LaPella (talk) 01:12, 26 June 2008 (UTC)

No, and since you didn't, apparently, read the entire block record, this may be why these problems keep cropping up in DYK. Try rereading it, the entire thing. Then come back and personally insult and attack me for it. --Blechnic (talk) 04:07, 26 June 2008 (UTC)

OK, I reread it. A WP:ANI consensus considers your block to have been an overreaction, but it doesn't say you were blocked simply for reporting errors or plagiarism. Is there something else you wanted me to read? I'm puzzled by "insult and attack me for it", apparently referring to the block. I thought I was offering no opinion about the block, and recommending only that you help us fact check. Art LaPella (talk) 05:21, 26 June 2008 (UTC)

Oh, someone else posted this with your name under it, "But since the block was apparently made for a private email it's your word against his, and in any case you haven't made it easy to research it." That's your evaluation and opinion about my being blocked, you know, "your word against his" blah blah blah. You commented without bothering to read the entire thing, that seems to be a jump to give your opinion, which was essentially a negative comment about me. Whatever. If you don't want to give your opinion about something, don't. --Blechnic (talk) 05:42, 26 June 2008 (UTC)
Translation: I give up! No one cares. They'd rather spend time looking for something, anything to attack me for, attack the messenger. If that wasn't the case you might actually have read the block log. --Blechnic (talk) 05:43, 26 June 2008 (UTC)

We need some sort of WikiProject:Did You Know? thing

I would agree that we need to ban DYKs from editors who have put up plagiarized articles more than once (It has happened). We need to keep a list somewhere, though, and maybe have a bot patrol the nominees for suggestions from such users.

A lot of the issues stem from the lack of any formal structure to the DYK page. There are currently maybe three or four users (myself included; anybody, even an anon, can verify the cited facts) who do about 90% of the fact checking. (As such I vehemently disagree with the assertions that "DYKs are often wrong or plagiarized" without specific examples cited. I can look at almost any update and find two or three, at least, where I personally verified the hooks.)

I really think we need some sort of WikiProject:Did You Know? to formalize and coordinate things. (Another problem: There is no way, once a hook has gone on the Main Page, to find out who approved it short of paging through the history of T:TDYK. If we logged this information, we could also root out people who do shoddy verifying.)

We also need to coordinate policy on what DYKs must contain (yes, we do reject shoddy articles that otherwise meet the criteria ... if it has a cleanup tag, forget it). I would like to see a requirement that a quote be included from a source that cannot be easily verified online, such as a book not (or not the pages quoted) covered by Google Books, or something behind a pay firewall (we have lots of British biographies that use the online Oxford Dictionary of National Biography as a source, something easily accessible to visitors at most British libraries but not to those of us logging in from outside Britain).

And other things. We are getting more than enough new hooks now; the days when the page was only updated once a day and we had to scrounge for hooks are gone. We can afford to be choosy, and decide how we're going to do it. Daniel Case (talk) 12:49, 27 June 2008 (UTC)

The discussion seems to have split. See below. Carcharoth (talk) 14:07, 27 June 2008 (UTC)
  Strong Support For the most part, creating new WikiProjects is a bad idea in my opinion. This idea is different. We should set up a formal WikiProject so that all of the discussion could be centralized. We can discuss things that affect all DYKs. DYK has grown significantly over the last few years. Maybe we could find more interested contributors if we had a WikiProject. Royalbroil 17:51, 27 June 2008 (UTC)
Some quick comments from me before I go on holiday:
  1. "Yes" to a more formalised setup; a WikiProject sounds good. DYK has grown substantially and has become rather unstructured.
  2. Source-checking in offline sources (whether books, journals or non-archived newspapers) is an inherent problem which needs some discussion and policy decisions. For my part, when providing a hook from one of my articles—many of which are sourced mostly from books—I try to double- or triple-cite it using online sources which corroborate it. Perhaps editors submitting offline-sourced hooks could be encouraged to try that? It would concern me if offline-sourced articles were discouraged, as many articles (and indeed types of article) are difficult to source by "online means".
  3. Some editors have the ability to check facts cited from particular types of source; for example, I can check stuff from the ODNB using my library card, and potentially other British-based things. I used to be able to verify things from the National Register of Historic Places as well, but my Javascript thing seems to have scr*wed up (?). Perhaps regular fact-checkers (and I am willing to become one) could identify (maybe on a project page) any sources they have regular access to, so they can be a first point of contact when a hook sourced to one comes up?
  4. We do indeed need to be more aggressive in spotting plagiarism, and decide what to do in grey areas such as copy-pasting from public domain text sources.
  5. As alluded to by Daniel, it may need to be made clearer that the presence of certain tags on an article (such as cleanup, expansion etc.) precludes its use by DYK.
Just a few thoughts there, anyway. Hassocks5489 (tickets please!) 20:08, 27 June 2008 (UTC)
I don't understand what a WikiProject would discuss that we don't already discuss at this page? Although this page isn't called "WikiProject DYK" that's de facto what it is. No need to create a duplicative page. --JayHenry (talk) 17:44, 28 June 2008 (UTC)

Maria Kotarba

The image illustrating her, per this page, is not public domain. WilliamH (talk) 16:20, 28 June 2008 (UTC)

I replaced the image. --BorgQueen (talk) 16:25, 28 June 2008 (UTC)
  • After having trouted myself, I also moved this to the correct section, but thanks for fixing it. As just a little heads up should similar images be suggested again, I would suggest that editors take what the Auschwitz Museum, Yad Vashem and the USHMM say on images with a pinch of salt, because in my experience, they are notoriously poor at observing image rights. For example, Yad Vashem labels the Auschwitz Album public domain, and the Auschwitz-Birkenau Museum labels the construction album public domain, even when both remain copyrighted in the European Union and United States(!). A good rule of thumb to go by is that, in a nutshell, all German images from this era aren't public domain. Cheers. WilliamH (talk) 16:32, 28 June 2008 (UTC)

Plagiarism in DYKs - one major problem

In many of the cases being discussed as plagiarism in DYKs, it's actually rather hard to see how someone could check that the source confirmed the fact without seeing that it was plagiarised. I think that reviewers being aware that plagiarism is a problem, and that they must point it out if there are substantial similarities, would go a long way to solving the problem. For instance, while investigating CarolSpears' plagiarism, I came across a few of the DYK contributions, and many of them were actually taken from the middle of copy-paste paragraphs.

Why didn't the reviewer point this out? There's no reason to think that the reviewers were stupid or acted inappropriately - maybe they didn't really look at how it was described in the article, just the source, and thereby missed that the text was substantially similar. Maybe they were just too trusting. Maybe they noticed it, but presumed it was the only incident of it in the article (this last, by the way, is almost never true). If we can figure out why the plagiarism was ignored or missed when checking the facts, maybe we can correct for these oversights and catch more problems in future. Shoemaker's Holiday (talk) 12:57, 27 June 2008 (UTC)

We didn't "point it out", obviously, because we didn't see the plagiarism.
Perhaps I should explain a little about how DYK works. Users submit hooks for their nominations, and reviewers then check the hook statements to ensure they are factual and correct. We don't generally go looking for plagiarism because that would add exponentially to the amount of time spent, and even just checking the hooks is very time-consuming. To give you an example, the last three updates I prepared (eight to ten hooks apiece) took 60 minutes, 90 minutes and three hours to prepare respectively. I am not being paid for this work, I am a volunteer here like everyone else, and there are obvious limits to the amount of time I am willing/able to invest in this project.
In addition to the hook statement checking, I do of course do a basic scan of the article, to ensure there are no obvious problems of POV, bad writing, suspected plagiarism and so on. But if I am now going to be expected to thoroughly compare every article submitted to every available reference in each article, the time will increase exponentially and the job will become not only a great deal more onerous, but virtually impossible. In short, we essentially have to rely on the honesty and good faith of our contributors. Either that or we will have to radically reduce the number of hooks featured, to perhaps one update per day. That will mean, for one thing, a lot fewer people getting encouragement for writing new articles, which is what DYK is supposed to be about. It will also mean, I think, that some of the current managers of DYK will have to reconsider their commitment, because I for one am not sure I want to spend hours every day combing through articles trying to find possible examples of plagiarism.
Apart from which, I'm not sure if all the extra effort would be worth the result anyway. The fact is that over 1,000 new articles are added to wikipedia every day, only a tiny handful of which are submitted to DYK. Even if we managed to catch half the plagiarism in DYK (which is doubtful, given that half or more of the submissions rely on offline sources which cannot be checked) it won't make the slightest dent in the amount of plagiarism being submitted to wikipedia as a whole. So the first question I think we need to ask ourselves is, will finding a somewhat greater percentage of plagiarism on DYK be worth all the extra effort? Will it be worth the loss of rewards that we are currently able to distribute? These are some of the issues we would need to consider I think. Gatoclass (talk) 13:48, 27 June 2008 (UTC)
Possibly some people think that appearing in DYK is also some kind of article validation. I agree that DYK is not and never has been a form of article validation, but that needs to be made clear. The concern comes because it seems like there is a review process here, when in fact there isn't. And as far as encouraging new contributions goes, we need to encourage not just more, but the right sort of contributions. And yes, more editors are needed to check for plagiarism. User:Blechnic was good at doing that. Carcharoth (talk) 14:01, 27 June 2008 (UTC)
New editors need to check anything, factual verification or expansion etc. Yesterday, I noticed that Yared was not expanded fivefold--and that one appeared on the Main Page. This is a serious problem, how dysfunctional the DYK area is. I'm an Editorofthewiki[citation needed] 14:11, 27 June 2008 (UTC)
The DYK process is designed to check the one or two facts submitted for the hook, the article length and whether it is new enough and is otherwise something we would link to from the Main Page. Nothing less, nothing more. If the process of verifying sources uncovers plagiarism (and it has, more times than Blechnic ever gave it credit for) then it does. Newpage patrol is the place to catch plagiarism; not DYK (although it's a useful backup, but not by design). We also have CorenSearchBot to look for cut-and-paste articles. If Blechnic had been more reasonable, he would have volunteered to be a plagiarism reviewer on everything, not just the stuff he felt he had expertise in and could sufficiently lower himself to do. We could use more reviewers, for sure. Daniel Case (talk) 14:38, 27 June 2008 (UTC)
Surely Blechnic's whole point is that we need to expect the process to do more before we put links to articles on the main page. Saying "This isn't DYK's problem" is not useful or correct. It is DYK's problem. 86.44.16.82 (talk) 16:13, 27 June 2008 (UTC)
Different people participate in DYK in different ways and for different reasons. In the case of Blechnic's concerns, the solution is rather simple, at least in principle. We just need some participants in DYK who check nominated articles for their cited hook and for plagiarism well before the hook is selected to appear on the main page. We already have an alert and commenting system in place, we just need people who are dedicated and skilled at performing this specialized function. If we can gather a handful of participants who appreciate the need for this checking and are willing to do it, then the problem is solved. As usual, we just need more editors willing to help out. --EncycloPetey (talk) 16:20, 27 June 2008 (UTC)
And we need tools to support such editors. In particular, there's a need for a plagiarism checking support tool. What i have in mind, and suggested in passing somewhere else before, is a program / bot / whatever which would run a diff between a given wikipedia article and any other specified internet webpage (or list of webpages). It is often the case, both in student papers submitted for class assignments, and for wikipedia articles, that plagiarism is from sources cited in a paper or article. For professors to check student papers, there is a commercial website Plagiarism.org / Turnitin.com, which some schools subscribe to which performs this check (against everything on the internet plus some literature databases content plus everything previously submitted to Plagiarism.org / Turnitin.com). However, the basics are just to run a Unix diff utility and/or other program to match up word sequences in two texts, and then to highlight the overlaps and/or report them in some way. There is freeware available to do this that can be adapted. If such a tool were available for wikipedia articles (or if there were a general website that would allow comparisons between any two internet websites), then a plagiarism checker or two could process the DYK nominations routinely. Such a tool would also be useful for regular editors who are trying to clean up articles and ensure that material adapted from a source is either reworded or put in quotes where it should be credited. There is no such tool available to us now however, and I am not even sure where to suggest it, so i am mentioning it here just to put it out again. doncram (talk) 18:43, 28 June 2008 (UTC)
I think this is basically what User:CorenSearchBot does. It's designed to catch copyright violations, and as such catches the most obvious examples of plagiarism. We need to be careful that copyright bots don't lull us into complacency toward plagiarism, as it doesn't catch plagiarism of the use-a-thesaurus-on-a-couple-words variety. --JayHenry (talk) 18:54, 28 June 2008 (UTC)
Until someone writes a bot that can use a thesaurus as well, anyway. Daniel Case (talk) 14:42, 29 June 2008 (UTC)
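The word-sequence matching doncram describes above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how CorenSearchBot or Turnitin actually work: the function names and the five-word window are hypothetical choices, and both texts are assumed to have been fetched and reduced to plain text already.

```python
import re

def word_ngrams(text, n=5):
    """Return the set of n-word sequences in text (lowercased, punctuation dropped)."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def shared_passages(article, source, n=5):
    """Return the n-word sequences that occur in both texts, as sorted strings."""
    common = word_ngrams(article, n) & word_ngrams(source, n)
    return sorted(" ".join(gram) for gram in common)

def overlap_ratio(article, source, n=5):
    """Fraction of the article's n-word sequences that also occur in the source."""
    article_grams = word_ngrams(article, n)
    if not article_grams:
        return 0.0  # article shorter than n words: nothing to compare
    return len(article_grams & word_ngrams(source, n)) / len(article_grams)
```

A reviewer could run `shared_passages(article_text, source_text)` against each cited source and read the matches by eye. As JayHenry notes, exact matching of this kind will miss text where a few words have been swapped for synonyms, so a low ratio is not proof of original wording.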

How extensive must plagiarism be?

I think, upon further reflection, that what Blechnic was trying to point out but lacked the subtlety to do, was that the hook line itself is often taken, at least during the nomination phase, from the source. I've found this many times ... usually you see a hook with wording that clearly isn't encyclopedic, and it turns out in the review process that while the article itself certainly wouldn't be considered plagiarism, the hook is. He was pointing, in one of the examples he gave, to an article about some African official whose two sentences were taken off the Kenyan government website (my point in response was that, as a substub, it was ineligible for DYK anyway so I wouldn't even have reviewed the source).

What should we do in those circumstances? Rewrite the hook ourselves after verification (which I do in any event if it sounds too promotional)? Flag it as a possible vote and tell the submitter to rewrite it? Rewrite the article as well? Or just say upfront in the submission guidelines which (apparently) no one reads that such hooks will not be accepted until rewritten. I mean, when they use the same language as the source it is much easier to confirm and then rewrite afterwards. Daniel Case (talk) 14:42, 29 June 2008 (UTC)

Flag with possible vote and tell the submitter to rewrite the hook and/or article. The onus for having no plagiarism is on the writer. If they spent the time to write a non-plagiarized article, then they can spend a little more time to do the hook right. I have been reflecting on the plagiarism problems too. I think we should change the rules to require articles to have inline citations throughout the article at a somewhat lighter level than GA. DYK exists to highlight the best new articles, right? This higher standard would require articles to not be plagiarism. Most of the DYK regulars are already doing this. It's too bad that Blechnic wasn't more diplomatic. It was hard to catch the message in the tone. Royalbroil 15:13, 29 June 2008 (UTC)
That might arguably be of some assistance, but it wouldn't solve the underlying problem, and it's already hard enough trying to get users to conform to the existing DYK rules. If we are going to accept that plagiarism is a problem that DYK managers themselves need to tackle (a notion that I for one am still reluctant to accept) then it seems to me that the only viable solution, and the most sensible one, would be to get hold of some software that can automate the job, because labour-intensive string comparisons like this are exactly the sort of task that computers were invented for. Either that or I think we would probably have to radically reduce the number of updates, to perhaps one a day, because there simply isn't sufficient manpower to thoroughly check all the articles that are currently featured. And that is something else I wouldn't like to see happen. Gatoclass (talk) 11:25, 30 June 2008 (UTC)

Plagiarism award

Blechnic has suggested we try making an award for finding plagiarism. I must admit that such an approach could be at least a partial solution. Right now, we hand out little awards for every new article someone writes or nominates to the project. Why don't we just add a third award for finding plagiarism? Every time you find some, you get a notification, just like for DYK. And we have a DYK plagiarism hall of fame right alongside the existing HoF. Anyone have a comment? Gatoclass (talk) 00:20, 1 July 2008 (UTC)

I like this idea. Even editors who struggle with writing articles themselves can be taught how to identify plagiarism, and this would be a really productive way for such editors to help out. We can mention it at Wikipedia:Plagiarism as one prong in a Wikiwide struggle to raise standards and educate editors about the issue. I think inexperience rather than malice is usually what leads to breaches. The Kenyan politician, for example, was a simple misunderstanding about whether government works are in the public domain and how to document that if they are. As such we'll want to make sure that our approach is worded so that first-time offenders, who made an honest mistake, learn from the error but don't feel persecuted. --JayHenry (talk) 00:44, 1 July 2008 (UTC)
I'm not so sure. Examples of claimed plagiarism I see divide roughly equally between: 1) actual cases, apparently always from online sources, 2) highly debatable claims, 3) claims that we are ripping off WP mirror sites (crediting WP or not). I've seen many over-zealous deletions of 3), which this might encourage. Meanwhile much stuff that seems clearly from somewhere else stays, because it does not show on an internet search. Johnbod (talk) 00:52, 1 July 2008 (UTC)
(edit conflict) Johnbod beat me to the punch as I was composing this, but here it is anyhow: I'm not actually too keen on harsh punishment for plagiarizers anyway, except in the case of egregious breaches. This is because it's not always easy to define what plagiarism is to begin with. I mean, there wouldn't be court cases about it if the dividing line between plagiarism and original work was always absolutely clear. Indeed, I must concede that by Blechnic's standards, I'm wondering if one or two of the articles I've submitted might not be considered somewhat plagiarized. I mean, let's face it, there are only so many straightforward ways to state that A was born to parents B and C, went to schools D and E, attended university F and got job G.
It's easy to avoid plagiarism when your source is an entire book or two, but when your only sources are bios that are themselves only a few paragraphs, it can be quite difficult to make your version sound original or different. Gatoclass (talk) 00:58, 1 July 2008 (UTC)
That's the other reason that we need to make sure any sort of warning is very carefully worded. And obviously there's no award for flagging mirror sites. But actually Johnbod has hit on a reason why I actually think this might help. Much in the same way that experienced article writers know the rules, and we grow to trust their articles, so experienced plagiarism checkers would better learn how to spot plagiarism, and we could trust their claims. In other words, I think this system could increase the times we catch 1), and decrease the times we falsely flag 3) and perhaps improve our judgment with 2). --JayHenry (talk) 01:12, 1 July 2008 (UTC)
Well I think basically what needs to happen is that someone who finds plagiarism has to post the exact nature of the plagiarism he or she found somewhere. In most cases, I think it will be pretty obvious whether it's a valid example of plagiarism that merits an award (or, on the other hand, a possible sanction against the offender). There will inevitably be some relatively minor instances that are borderline, but I think in most cases we could probably quickly make up our minds about their validity in terms of handing out an award or a sanction. Gatoclass (talk) 01:24, 1 July 2008 (UTC)
Hmm. How about Master of the Virgo inter Virgines vs the ext link text? There are many more in Category:Anonymous artists. Maybe we should do one of those test thingies. Johnbod (talk) 01:30, 1 July 2008 (UTC)
Well I'm not sure what the status of "Answers.com" is, but assuming for a moment it was copyrighted content, I personally would be reluctant to describe that as plagiarism, because the author of the wiki piece has clearly tried to "put the information into his own words". But when you only have a paragraph of info as your source, how much different can your version be? Which is just the point I made previously - the less information you have to work with, the harder it is to create content that looks original. Gatoclass (talk) 01:56, 1 July 2008 (UTC)
I think a test would be useful, and clearer guidance and more thought as well. John, I think this is an interesting example of 2) but my primary concern is that 1)s sometimes get to the front page. This article isn't the sort that's eligible for DYK in the first place and so it's not my principal concern. As Gato notes, it's more muddled with extremely brief content. (Though the source is purportedly the copyrighted Art Encyclopedia, not Answers.com itself. This could stand to be rewritten because removing it from the source also robbed it of its context.) Back to the matter at hand: a 1,600 character article that was copied from a 1,600 character source--what we're talking about at DYK--is quite a bit different. It's also true that plagiarism is a bit trickier to define on Wikipedia, especially because of the muddled meaning of authorship, especially with editors who don't track their edits. My opuses page and the fact that Jay Henry is my name probably makes authorship implicit. But could you say that somebody named User:Anonymous H. Random who copy-pastes public domain material into an article, does not use edit summaries, and makes few other edits represented this as his own work? --JayHenry (talk) 02:09, 1 July 2008 (UTC)
I don't see much point in an award only where the removed material meets some DYK criteria. Johnbod (talk) 02:31, 1 July 2008 (UTC)
John, this is a proposal specifically trying to address the problem of featuring plagiarized material on the Main Page through the DYK project. --JayHenry (talk) 02:44, 1 July 2008 (UTC)

More on DYK view stats

A few days ago, I suggested giving recognition to DYK hooks that draw the most views. After all, the goal of DYK is to draw attention to newly written or expanded articles. User:Hassocks5489 and I have taken a shot at creating a template that could be used to track the top hooks on a monthly basis, giving recognition and encouragement to those whose hooks are most successful. It currently is updated through the hooks of June 23. If you want to take a look or help updating it, here's the link: http://en.wiki.x.io/wiki/User:Cbl62/sandbox3 And if people think this is useful, maybe we could start doing it each month. Cbl62 (talk) 17:06, 27 June 2008 (UTC)

How does the tool determine the number of views while the article is on DYK? The search tools available to a user display only daily views, without indicating times. --EncycloPetey (talk) 17:16, 27 June 2008 (UTC)
Just a comment. Look at this link. I think it's pretty obvious what views were a result of being on the Main Page. Thingg 17:59, 27 June 2008 (UTC)
Like User:Thingg said, the spikes when an article goes onto DYK are dramatic. I've used the page view count for the day it goes on DYK and also the next day; that's because DYKs sometimes span two days. Cbl62 (talk) 19:40, 27 June 2008 (UTC)
We can't narrow it down to less than a day, so the figures recorded in the table are the total for the day(s) on which the article was featured on the main page; so in practice every article will have its total overstated by a slight amount. Rounding to the nearest hundred, as we do, should help to reduce the impact of this overstatement. Hassocks5489 (tickets please!) 20:22, 27 June 2008 (UTC)
Your table does give some interesting insights. I am not surprised that the Pakistani actress hook is on the top of the list, but I didn't expect hooks about Esperanto profanity or Subpersonalities would have drawn so many views. As you said somewhere, we clearly don't always need "sex and guns" to get readers' attention, it seems. --BorgQueen (talk) 06:26, 28 June 2008 (UTC)
It was nice to see that even a simple, but interesting, hook about the cushion plant drew 14,000 views. Cbl62 (talk) 23:09, 28 June 2008 (UTC)
An article on profanity getting a lot of views really isn't that surprising to me.--Bedford Pray 05:15, 30 June 2008 (UTC)
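The counting method described above (total the daily views for the day or days an article was featured, then round to the nearest hundred) could be expressed roughly as follows. The function name, the date keys and the sample figures are hypothetical; the real table was compiled by hand from the daily page-view tool.

```python
def featured_views(daily_views, featured_days):
    """Total the daily views for the featured day(s), rounded to the nearest hundred.

    daily_views   -- mapping of date string to that day's view count
    featured_days -- the date(s) on which the hook was on the Main Page
    """
    total = sum(daily_views[day] for day in featured_days)
    # Integer arithmetic rounds halves upward and avoids Python's
    # round-half-to-even behaviour in round(total, -2).
    return (total + 50) // 100 * 100

# Invented sample figures, for illustration only.
sample = {"2008-06-22": 310, "2008-06-23": 9120, "2008-06-24": 4930, "2008-06-25": 280}
hook_total = featured_views(sample, ["2008-06-23", "2008-06-24"])
```

Because only whole-day figures are available, the total includes the article's ordinary background traffic for those days, which is the slight overstatement Hassocks5489 mentions.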

Carangoides

Shouldn't the phrase "for which the fish... was originally created" be "from which the fish... was originally created"? Mouse is back 14:51, 1 July 2008 (UTC)

It's talking about the creation of the genus as a human classification of fish, not the actual creation of those species of fish. In future, reports on mistakes in the current DYK would be better off at WP:ERRORS where they'll get swifter attention. Olaf Davis | Talk 15:25, 1 July 2008 (UTC)

wiki

Hello, Gioan Dang is trying out the wiki. -- Preceding unsigned comment added by Hoabinh (talk • contribs) 02:52, 2 July 2008 (UTC)

Solution for delays in DYK updates

I've been thinking about this issue as I have to watch helplessly as the hours tick by before an update is made - many a time the list of articles to go up on the main page is left incomplete while the delay builds. It may seem an issue that is irritating but not deserving any particular action, but I have come up with 2 proposals:

  1. My first proposal - to have a system like WP:RFR, where an editor with a history of good contributions of writing DYKs, reviewing nominations and selecting nominations for the update template, be granted the tool of updating the template when the time comes. If a person works diligently on DYK jobs, you can trust him/her with this button.
  2. My second idea, and obviously the most prudent and swiftest for implementation, is to design a template akin to {{Vandalism information}} or a traffic signal, which has (1) a Green light for normal status, for the first 5 hours before the next update; (2) a Yellow light for 1 hour, the 6th hour before the slated update, indicating that adding nominations for the next update must be completed soon; and (3) a Red light, to begin once the current time passes the slated time for updating. This template should be displayed on all pages and sub-pages of T:DYK and even places like WP:ANI/WP:AN, where it may be guaranteed to come to the attention of several administrators.

I'd like to have opinions of everyone regarding the good/bad qualities and possible implementation of these ideas. I think these are straight-forward solutions to a simple but irritating problem. Vishnava talk 14:36, 27 June 2008 (UTC)

I don't think any sort of page-specific admin powers are needed. The point of having the next update ready to go when the clock template turns yellow and then red is to allow any admin to do it, not just the DYK admins, who may be offline or busy.
But we do need some sort of notice on T:TDYK as to how full T:DYK/N is. Yes, you can check yourself. But not everyone makes the effort. Daniel Case (talk) 14:41, 27 June 2008 (UTC)
Just to be clear, are you saying you're in favor of the DYK warning template idea? Vishnava talk 15:04, 27 June 2008 (UTC)
I agree that repeating the 'hours since last update' information elsewhere on the DYK pages would be a good idea, along with a count of how full T:DYK/N is (if there's some simple way to do so that won't get out of synch with the actual template, as manually updating it might well). I don't think WP:AN or WP:ANI are the best places for it though - I anticipate people worrying about the precedent of starting to fill up the AN with all sorts of stuff that only a small fraction of admins are interested in. Vishnava, did you have a design in mind for a warning? I think the more similar it is to the current colour-coding on Template:DYK-Refresh the better. Perhaps we could add it to the top of Template:DYKbox? Olaf Davis | Talk 15:58, 27 June 2008 (UTC)
As you suggest, I like the idea of color-coding the Template:DYK-Refresh, but I was thinking of reduced-sized version Template:Vandalism Information that we could attach to any of the existing templates or post independently. The key need for such a solution is that DYK is part of the main page and thus necessary to maintain properly and seriously - so to that effect, it would be appropriate to post it on WP:AN/WP:ANI. Vishnava talk 16:31, 27 June 2008 (UTC)
The 2nd idea for warning signals seems to be a good idea that doesn't require any policy change or so - why don't we go ahead and create the template and give it a trial run? Vishnava talk 17:03, 27 June 2008 (UTC)
Another thing to keep in mind while designing this template is that some admins and interested parties would like to have this new status template on their user page. How about having a 7 color scheme on the template that goes 1) green (>5 hr left) 2) green (>4 hr) 3) green (>3 hr) 4) green (>2 hr) 5) yellow (>1 hr left) 6) orange (in the final hour), 7) red (overdue). Royalbroil 17:34, 27 June 2008 (UTC)
Check it out - I've created 3 test model templates (albeit rudimentary; feel free to improve the design): {{DYKUpdateRED}}, {{DYKUpdateGreen}}, {{DYKUpdateYellow}}

Vishnava talk 04:03, 28 June 2008 (UTC)
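For anyone trying to picture the timing logic, Royalbroil's seven-level scheme above can be written out as a small function. This is a sketch in Python rather than template syntax, assuming the six-hour update interval discussed in this thread; the function name and the (level, colour) return format are hypothetical and are not taken from {{DYK-Refresh}} or {{Vandalism information}}.

```python
from datetime import datetime, timedelta

UPDATE_INTERVAL = timedelta(hours=6)  # assumed: the six-hour cycle discussed here

def update_status(last_update, now, interval=UPDATE_INTERVAL):
    """Return (level, colour) for the seven-level scheme suggested above."""
    remaining = last_update + interval - now
    if remaining <= timedelta(0):
        return 7, "red"        # slated update time has passed
    hours_left = remaining / timedelta(hours=1)
    if hours_left > 5:
        return 1, "green"
    if hours_left > 4:
        return 2, "green"
    if hours_left > 3:
        return 3, "green"
    if hours_left > 2:
        return 4, "green"
    if hours_left > 1:
        return 5, "yellow"
    return 6, "orange"         # final hour before the update is due
```

For example, `update_status(datetime(2008, 6, 28, 0, 0), datetime(2008, 6, 28, 5, 30))` gives level 6, orange, since half an hour remains.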

I thought that you would be using the number and color systems from {{Vandalism information}}, plus the timing scheme from {{DYK-Refresh}} to come up with a nice hybrid. This isn't what I was thinking about. I doubt I'd be able to work on what I'm thinking about for several days. Royalbroil 12:22, 28 June 2008 (UTC)
Why does {{DYKUpdate}} include links to the three different coloured templates regardless of the current level and colour? Is that meant to happen?
Also, it sounds like we have some disagreement over the colour scheme to use (traffic light; based on {{Vandalism information}}; or based on {{DYK-Refresh}}). I favour the third for consistency within DYK - what does everyone else think? Olaf Davis | Talk 14:31, 28 June 2008 (UTC)
I just noticed that Vishnava's template had been put on Template talk:Did you know. I've taken the liberty of removing it, for two reasons. First we should probably sort out the colour and the question I raised about the template linking to all possible versions of itself before putting it live. But besides that TT:DYK already includes the current {{DYK-Refresh}} which indicates whether the update is late. Maybe moving it to the top would be useful for alerting admins to a delay as soon as they come to the page, but I don't see why we'd need two templates giving the exact same information on the same page. Olaf Davis | Talk 14:41, 28 June 2008 (UTC)
(actually, Royalbroil and BorgQueen beat me to it and removed the template while I was writing the above) Olaf Davis | Talk 14:45, 28 June 2008 (UTC)
Answers:
  1. To Royalbroil and Olaf Davis - the design of the current template, etc. is just a rudimentary test to get the ball rollin'. I am a novice at creating templates from scratch so it's not like it's permanent or good enough. I like RB's hybrid idea. And yes - as RB says, I'd like something modeled on {{Vandalism Information}}.
  2. I think there is a difference between {{DYK-Refresh}} and {{DYKUpdate}} in the sense that like {{Vandalism Information}}, it is meant to be proliferated across user pages of concerned users and admins and all relevant project pages - the color bands are also more eye-catching and informative; take RB's suggestion of a 7-color template, which gives an hour-by-hour status; the RED alert warning is to be triggered 5-10 minutes before the next update, so the color-coding can be clearly interpreted. An AMBER Alert can be used if the template is half-empty with less than 5 minutes to go.
  3. We can also add new features like actually reporting how much of the next update template is filled with entries in the final hour (yellow), how many noms in the section of the last expired day and the current day are   or   so interested editors can fix the noms and update them swiftly. It is also a kind of warning that if there aren't enough confirmed noms, we need to hurry up and check the rest.
  4. Make it more interactive - an important objective is that it should get the attention of those admins who aren't regulars at DYK, since those regulars are the ones absent and causing delays. It should tell them exactly what is needed, especially as you can't make the next update if the template is half-filled. Ordinary users helping out at DYK should also get to know that the next update needs to be filled with confirmed noms and thus fill up shortages ahead of time. {{Vandalism Information}} fulfills a similar purpose in that a level-3 or level-2 rise will alert all available RC patrollers that they need to switch their Huggle on and do their rounds. A level-4 or level-5 status can allow people to relax a bit and work at convenience.

There is an issue with the current {{DYK-Refresh}} as it has proven ineffective in getting admins to update DYK on time. If we can upgrade the template with extra features and customize it for widespread use, well that's the objective of this discussion. I have no issue with deleting or completely revamping the templates I created in order to upgrade the existing ones or coming up with different solutions. My objective is to find a way to get rid of the confounded and irritating delays of 2-3 hours - it wouldn't be serious if it were just 10-15 minute delays. With 1,000+ admins and 3-5 being added each week, these 2-3 hour delays are simply unacceptable. Vishnava talk 15:33, 28 June 2008 (UTC)

Well put. I wish I knew more about programming templates. I looked at it and I don't know enough to do what I can see in my mind. I hope that I was clear that I am thinking that the template would go off of the reset time from the last update time. How could the template figure out the status of how full the next update is? Royalbroil 16:07, 28 June 2008 (UTC)
Well we need minimum 6 noms, so I think it is possible to calculate from how many vacant {{*mp}} spaces there are. I think this is possible in an automated template. Vishnava talk 16:21, 28 June 2008 (UTC)
  • I'd like to again make the reminder that hitting the update exactly every six hours is not the purpose. Six hours is just an arbitrary goal to keep things moving. If it goes to seven hours or eight hours it does not matter unless the page is getting badly backlogged. If the page is badly backlogged it does not matter if we move the update to every five hours to help clear things out. I really think we need to spend more time focusing on the quality of the hooks, than worrying about hitting a completely arbitrary target. --JayHenry (talk) 17:49, 28 June 2008 (UTC)
So why is there a template that counts the minutes? Most importantly, we have 20 odd nominations from 2 days to get through - including those of the current day and the day before - so we need to hit the updates on time - why else would one pick a number like 6 instead of 8, 12 or a whole day? A 15-30 minute delay is fine, a 2-3 hour delay is not. Vishnava talk 23:40, 28 June 2008 (UTC)
In addition, it's hardly a deviation from our mission to have a nice tool to improve our working. Making a good template doesn't take all that much time/effort. Vishnava talk 23:43, 28 June 2008 (UTC)
No, that's my point. Six is arbitrary. Eight would also be arbitrary. What actually matters is updating it often enough to move the hooks through. If you want to make a useful template, make one that's based off how many overdue hooks are sitting at T:TDYK, on how many of those hooks need to be checked for eligibility, NPOV, appropriate image tags, plagiarism, copyvio, and everything else. --JayHenry (talk) 23:52, 28 June 2008 (UTC)
Well, that's not really possible, JayHenry. If it were to happen, it would require template networking, and that would get very confusing on the already cluttered Suggestions page. -- Anonymous DissidentTalk 00:15, 29 June 2008 (UTC)
That's not really my point. I'm saying the factor that matters is how many overdue hooks are sitting at T:TDYK and how many hooks need to be reviewed. If we're taking care of those things, six or eight hours doesn't matter. If we're updating every six hours but missing those things we still have a huge problem. Whether it's a template, or a bot, or a methodology, or whatever doesn't matter. --JayHenry (talk) 00:27, 29 June 2008 (UTC)
While I don't agree with what you say on the non-importance of timely updates, your idea of a template on overdue hooks is very good. Vishnava talk 00:56, 29 June 2008 (UTC)
I would support a concept where, if the hooks are getting significantly delayed, an admin could decide to use the next set of hooks in less time than 6 hours. As JayHenry said, the length of time was set arbitrarily at 6 hours. I do think that 4 hours should be the minimum. What does everyone else think of this? Royalbroil 01:09, 29 June 2008 (UTC)
I support it - my question is, who can we approach, somebody with the skills to design such a template? We can keep bouncing ideas about, but we need some work and results. Vishnava talk 01:38, 29 June 2008 (UTC)
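As a rough illustration of Vishnava's suggestion of counting vacant {{*mp}} spaces, the sketch below works on the wikitext of the next-update page. It assumes that each hook slot is a line beginning with {{*mp}} and that a vacant slot has nothing after the template; that layout, like the function names, is an assumption made for illustration. The minimum of six comes from the comment above.

```python
import re

MIN_HOOKS = 6  # "we need minimum 6 noms", per the discussion above

# Assumed layout: one slot per line, each starting with the {{*mp}} bullet template.
SLOT = re.compile(r"^[ \t]*\{\{\*mp\}\}(.*)$", re.MULTILINE)

def slot_counts(wikitext):
    """Return (filled, vacant) counts of {{*mp}} slots in the wikitext."""
    slots = [match.group(1).strip() for match in SLOT.finditer(wikitext)]
    filled = sum(1 for text in slots if text)
    return filled, len(slots) - filled

def ready_for_update(wikitext, minimum=MIN_HOOKS):
    """True if at least `minimum` slots contain a hook."""
    filled, _vacant = slot_counts(wikitext)
    return filled >= minimum
```

A bot or script could fetch the page text, call `slot_counts`, and report the result alongside the timer. JayHenry's alternative, counting overdue and unreviewed hooks at T:TDYK, would need a different parser and is not attempted here.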

Look, I'm sorry if I missed something, but DYK lateness is not caused by an insufficient number of template warning colours or a lack of other information on it. The lateness is caused by lack of manpower. We already have three colour levels on the existing template that most folks ignore. Making a six-level one with information on how "full" the next update page is will not suddenly attract more users to the project. In fact, it will just mean an additional burden on the existing updaters. Gatoclass (talk) 11:49, 30 June 2008 (UTC)

Furthermore, one would think that any software development that can be directed toward DYK updating would be directed toward automating the update itself, along with subtasks like thanking everyone—not toward developing ever flashier ways to advertise the opinion that an insufficient number of administrators are willing to do this grunt work manually. Art LaPella (talk) 20:57, 30 June 2008 (UTC)
Look, we do actually have some software developers on this project, don't we? I don't know what they spend their time doing, but surely the community should be permitted to set priorities on things that they should be working on? I mean, it seems to me that plagiarism is a potentially serious issue and that detection software of some sort ought to be a priority. And I may have a personal prejudice, but I also agree it's long past time something was done to automate the notifications, because that's been talked about for yonks and nothing ever seems to come of it. Somebody somewhere is presumably setting priorities for the developers, so maybe it's time we found out who those people are and had a word to them? Gatoclass (talk) 01:45, 1 July 2008 (UTC)
Sounds like a good plan, Gatoclass, and I agree that detecting plagiarism and automating nominations seem like areas that'll provide more concrete return than anything else we could ask for. Anyone know who we're supposed to talk to about that? Olaf Davis | Talk 15:19, 1 July 2008 (UTC)
Are there other aspects that can be automated? For instance, are checking the length of the hook, the date of the article, and the date of expansion all automated? If it doesn't require thought, can it be done by a machine? It seems a good place, Gatoclass, to start looking for ways to find more time for DYK editors to do actual editing, that is by placing non-thinking tasks in the hands of bots. --Blechnic (talk) 06:56, 2 July 2008 (UTC)
None of those are automated. Here's my proposal from last year. Art LaPella (talk) 22:18, 2 July 2008 (UTC)
I'm surprised to read how much more is not automated. Please move forward with your proposal, but could you start with DYK? At least they seem to know how valuable your automations could be. --Blechnic (talk) 03:47, 3 July 2008 (UTC)
I don't understand. Please move forward with last year's battle? I don't have any ideas now that I didn't have then. I remind people every once in a while as in the above, and maybe someday I'll study the Internet programming that would allow me to do it myself. But it's less urgent now that I have Flock because I can do searches inside the edit window, thus speeding the task of doing the edits manually. Art LaPella (talk) 04:13, 3 July 2008 (UTC)

Sunderland Echo

I've expanded this article from 16,128 bytes to over 32,000 in the past few days, as well as adding seven new pictures and about 35 new references. Does this count as 'five-fold' ??--seahamlass 14:41, 2 July 2008 (UTC)

For the five-fold expansion, we count not bytes but characters of prose (excluding image captions, lists, infoboxes and so on). Since the article's prose hasn't expanded by a factor of five, I'm afraid it's not eligible. Olaf Davis | Talk 14:46, 2 July 2008 (UTC)
Oh well, that's a shame. It is so much easier to expand a stub than a lengthier article. It would be nice to have some kind of DYK incentive for doing that! (OK, I know we shouldn't need incentives, but...)--seahamlass 15:16, 2 July 2008 (UTC)
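
For anyone wanting to check eligibility themselves, the five-fold test described above comes down to a ratio of prose character counts. The sketch below assumes the two prose counts have already been measured by hand or with a tool; the function names are hypothetical.

```python
def expansion_factor(prose_before: int, prose_after: int) -> float:
    """Ratio of current prose characters to original prose characters."""
    if prose_before <= 0:
        raise ValueError("original prose count must be positive")
    return prose_after / prose_before


def is_fivefold(prose_before: int, prose_after: int, required: float = 5.0) -> bool:
    """True if the prose has grown by at least the required factor."""
    return expansion_factor(prose_before, prose_after) >= required
```

Even if the byte figures quoted in the question (16,128 to about 32,000) were prose counts, the factor would be a little under 2, well short of 5.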

Clarification on policy regarding sources

What is the policy regarding an article that is based on only one source, or primarily on one source, when that source is of high quality? Is it acceptable for a DYK? I would assume not, since the notability requirements on Wikipedia require multiple sources. Nrswanson (talk) 19:32, 30 June 2008 (UTC)

It's not really a problem by itself. Sometimes it's difficult finding sources on subjects even when they are clearly notable (eg. Moroni, Comoros, a capital of a country, but it's a stub, since there's a dearth of sources) but you could write a reasonably good article if there was a source that gave good coverage on the subject. Sometimes you'd have to be careful to separate facts from the author's opinions, but that's not always an issue, depending on the topic and the source. - Bobet 20:19, 30 June 2008 (UTC)
So basically use your discretion as to whether or not to make it an issue in the DYK process? Nrswanson (talk) 20:33, 30 June 2008 (UTC)
Yeah, pretty much, although I can't really think of a case where relying on one source would alone be a reason not to feature something on DYK. Maybe if the one source is a book, and the refs don't contain page numbers and it's hard to verify, you could tag an article with {{refimprove}} (since articles with cleanup templates aren't generally put on dyk). And I guess any article that relies on one source would be more likely to contain plagiarism, pov, self-promotion or something similar, so it wouldn't hurt to read it through with a critical eye. - Bobet 21:17, 30 June 2008 (UTC)
Thanks for that clarification. Nrswanson (talk) 21:20, 30 June 2008 (UTC)
It also depends on what the source is being used for. If it's being used for facts, it's probably okay. But if it's being used mainly to source opinions, it probably isn't because not everyone has the same opinion and relying on just one source is probably a violation of WP:UNDUE. That's one rule of thumb I use, anyhow. Gatoclass (talk) 22:37, 3 July 2008 (UTC)

DYK as an indication of quality and accuracy

Hello, at Talk:Conscript Fathers I'm having an argument with the primary author of the article. Conscript Fathers was on DYK on June 13, and the author feels that this is a guarantee of the article's quality and accuracy. Is this a correct understanding of DYK? --Akhilleus (talk) 23:13, 2 July 2008 (UTC)

In an ideal world, perhaps. As it is though we have just about enough time to check the one fact in the DYK hook and give an extremely cursory glance over the rest of the article to check it's not packed with redlinks, hideous formatting errors, or other dead-obvious problems. Especially given the length of the article in question and the massive number of footnotes, it's very likely that the reviewer didn't even read the whole thing, let alone check the accuracy or quality of every statement and the reliability of every source. Hope that helps. Olaf Davis | Talk 09:13, 3 July 2008 (UTC)
Thanks, that's helpful. --Akhilleus (talk) 13:50, 3 July 2008 (UTC)

DYK stats

Reading over the debate about Scientology, I was motivated to check stats for recent DYK hooks that have received the prime top spot with picture. I do think that the "interesting" element of the hook is key and find it disappointing when one of my proposed hooks proves to be a "bomb." A great hook will draw 2,500 or more views, and a mediocre hook may draw fewer than 1,000 views. As it may be informative to see what types of hooks are working, here are the stats: Cbl62 (talk) 02:36, 18 June 2008 (UTC)

Article | Image | DYK views | DYK hook
Pale-yellow Robin | | 1,500 (article), 5,500 (pic) | that the Pale-yellow Robin (pictured) uses the prickly Lawyer Vine as a nesting site and for nesting material?
McCormick Tribune Plaza & Ice Rink | | 1,600 (article), 4,000 (pic) | that McCormick Tribune Plaza & Ice Rink (pictured) is both an ice skating rink and the largest alfresco dining venue in Chicago?
Uri-On | | 6,800 (article), 4,500 (pic) | ... that Uri-On (pictured), created by Michael Netzer in 1987, was the first Israeli superhero to be published in color?
Culver Randel House and Mill | | 3,800 (article), 7,600 (pic) | ... that Culver Randel manufactured pianos at his mill in Florida, New York?
Eberswalde Hoard | | 7,200 (article), 8,300 (pic) | that the Eberswalde Hoard (pictured), a collection of 81 gold objects weighing 2.59 kilograms (5.7 lb), is an important find from the European Bronze Age?
Harris Theater (Chicago) | | 1,500 (article), 140 (pic) | that the Harris Theater (pictured) is the first new performing arts venue built in downtown Chicago, Illinois since 1929?
HNoMS Kjell | | 7,900 (article), 3,800 (pic) | that the Norwegian torpedo boat HNoMS Kjell (pictured) was known as "Terror of the smugglers" when she intercepted rum runners during Norway's prohibition?
Tourism in Egypt | | 5,200 (article), 2,400 (pic) | that the worst terrorist attack against tourists in Egypt was in November 1997, when gunmen killed 57 tourists and 4 Egyptians (location pictured)?
Neil Hamilton Fairley | | 8,000 (article), 5,000 (pic) | that the British Army changed its plans for operations in Greece during World War II on medical advice from Australian Brigadier Sir Neil Fairley (pictured)?
Polyphemos Painter | | 743 (article), 2,300 (pic) | that the Analatos Painter, Mesogeia Painter and Polyphemos Painter (work pictured) were early Greek vase painters of the Proto-Attic period, active between 700 and 650 BC?
Cozy Dog Drive In | | 5,900 (article), 3,800 (pic) | that the original hot dog on a stick to be served at Cozy Dog Drive-in was called a Crusty Cur?
Yazdegerd I | | 3,300 (article), 5,500 (pic) | that the 5th-century Sassanian Emperor of Iran Yazdegerd I (coin pictured) was given the epithets of Ramashtras ("the most quiet") as well as Al Khasha ("the harsh")?
John Sowden House | | 5,700 (article), 9,900 (pic) | that the Lloyd Wright-designed John Sowden House (pictured) is known as the "Jaws House" because its facade resembles the open mouth of a shark?
Moika Palace | File:Rasputin-Big-photos-1.jpg | 4,500 (article), 3,000 (pic) | that the Moika Palace, a museum about the murder of Grigori Rasputin (pictured) by Prince Felix Yusupov, was also the scene of the homicide?
Delaware (chicken) | | 6,800 (article), 5,400 (pic) | that the Delaware breed of chicken (chick pictured) was once the favorite broiler on U.S. East Coast farms, but is now critically endangered?
Brunei pitis | | 2,500 (article), 7,200 (pic) | that the first coinage used in Brunei were Chinese coins (example pictured), which were referred to as the pitis?
Medieval Bulgarian Army | | 7,100 (article), 5,400 (pic) | that the core of the Medieval Bulgarian Army (pictured) was the heavy cavalry, which consisted of 12,000–30,000 heavily armed riders?
Catholic Church of St. Catherine | | 2,700 (article), 3,300 (pic) | that the Church of St. Catherine (pictured) in St. Petersburg was taken over by the Soviets, closed, ransacked and twice burned out, before being returned to the Catholic Church in 1992?
Christopher Smart | | 5,800 (article), 4,200 (pic) | that Christopher Smart (pictured) spent five years in a mental asylum and wrote his most important works, Jubilate Agno and A Song to David, during this time?
Crescent Honeyeater | | 1,000 (article), 5,300 (pic) | that the diet of the Crescent Honeyeater (pictured) changes from nectar and invertebrates to wholly insects during the breeding season?
Andreas Frederik Krieger | | 960 (article), 2,500 (pic) | that Andreas Frederik Krieger (pictured) was one of the most vocal critics of the morganatic marriage between Frederick VII of Denmark and Louise Rasmussen?
List of Registered Historic Places in Chicago | | 3,300 (article), 6,200 (pic) | that there are at least 296 historic places listed on the U.S. National Register in Chicago, including a German U-boat (pictured)?
Attack Squadron 46 (United States Navy) | File:Attack Squadron 46 Insignia (US Navy).jpg | 9,200 (article), 6,500 (pic) | that John McCain was a member of the VA-46 Clansmen (insignia pictured) when he was wounded during the 1967 USS Forrestal fire off the coast of Vietnam?
Thanks for that very nicely put together presentation Cb, but you forgot to sign it :)
I must say though that I disagree with your conclusions. If your table indicates anything to me, it's that certain topics are of interest to readers, rather than certain hooks. For example, articles about war machines and war related topics always seem to score quite well - because, I guess, most computer users are youngish males with an interest in that sort of thing. Articles about US subjects tend to do better than other articles on the same subject, because lots of people with computers are Yanks. Articles on popular culture (like the comic book cover above) do well because popular culture is just that - popular.
At the other extreme, articles on less popular subjects can really bomb. My four hooks on Australian composers got an average of only about 250 hits, in spite of the fact that the hooks were in my opinion quite good - I mean, stuff like best Australian composer of the early 20th century is a pretty outstanding achievement. But only a couple of hundred people cared to know more. Whereas if I write a hook about a warship, it's guaranteed to get a minimum of about 4,000 hits, no matter how ho-hum the hook is. So I don't really think hooks are all that important, it's mainly the subject matter. I bet that recent article on the Pakistani model-actress got plenty of hits! Gatoclass (talk) 02:30, 18 June 2008 (UTC)
Sex and weaponry may sell even with mediocre hooks (I confess I checked out the article about the Pakistani model-actress when it was on the main page), but this very small sampling seems to show that other subjects can also sell with clever hooks. Hooks about an 18th Century poet, hot dog on a stick, an Israeli superhero, a breed of chicken, objects from the Bronze Age and a Lloyd Wright house all scored more than 5,000 hits. Cbl62 (talk) 02:53, 18 June 2008 (UTC)
I certainly think an interesting hook helps :) But as you say, some subjects just seem to be more interesting in general. It's a combination of the two. Gatoclass (talk) 03:03, 18 June 2008 (UTC)
Funnily enough, I've just started to keep a record of viewing stats for my hooks, motivated by a similar curiosity over which topics are popular with readers and whether having a pic has a significant effect. My data sample suggests these conclusions...
  • Sex and murders are popular (getting the words "most bizarre sex scene" into a hook generated nearly 17,000 views...)
  • Politics isn't
  • Wacky, off-the-wall hooks can be successful (William Edge, a long-dead British MP, got more views than might otherwise be expected because of his exploits with pigeons)
  • My series of Brighton & Hove places of worship articles bounce along nicely but unspectacularly; lead pictures definitely helped the figures in two cases
I love seeing surprising, bizarre or daft hook facts and memorable "pub quiz"-style pieces of knowledge (the current hook about the cultivable area of the Seychelles is a good example). Hassocks5489 (tickets please!) 08:00, 18 June 2008 (UTC)
For more accuracy, you could also keep track of the time when the hooks were on the main page, since there are generally fewer views for hooks that are featured during night time in America (I don't have stats on that, but it sounds believable enough that I'll present it as fact and hope no one will notice). Also, the length of time between updates could be a factor, but that doesn't seem to have been an issue recently (since the updates get done so promptly, good job everyone involved). - Bobet 08:57, 18 June 2008 (UTC)
Since the goal of DYK is to draw viewers to new articles, would it make sense to recognize hooks that have extraordinary success, e.g., Hassocks' Jacqueline Voltaire hook that drew 16,000 views? Would it also make sense to create a sub-page where we keep track of hooks that have drawn the highest number of views? While DYK is not a competition, an ongoing recognition for extraordinary hooks would help motivate people to come up with eye-catching, interesting hooks. Cbl62 (talk) 14:37, 18 June 2008 (UTC)
I've thought of starting such a page myself, but it's a matter of finding the time, and I just don't think I can make time for any extra commitments around here ATM. Gatoclass (talk) 10:35, 19 June 2008 (UTC)
Here's a first draft of a possible monthly "best of DYK" template. http://en.wiki.x.io/wiki/User:Cbl62/sandbox3 If others volunteer to contribute to such an effort, I think it would be a good way to continue to promote the best new articles. Cbl62 (talk) 02:09, 20 June 2008 (UTC)
Good idea. I don't mind helping to update and maintain it when I get a spare few minutes. It could possibly be linked in with either the DYK contributors list or the page with DYK records and statistics, which for the life of me I can't find at the moment. I have a feeling that if accepted, the current candidate hook for Human-goat sexual intercourse may feature prominently in the template... Interesting observation from JayHenry below, as well; I agree that thought should always be given to providing interesting and relevant wikilinks elsewhere in the hook sentence. Hassocks5489 (tickets please!) 23:03, 20 June 2008 (UTC)
I have updated hooks up to and including 16 June 2008—at the current time, the last day on which stats are available at stats.grok.se. Hassocks5489 (tickets please!) 22:06, 21 June 2008 (UTC)
In looking through the stats I've noticed that the bolded article isn't always what gets the most clicks. For example, on the day of Moika Palace, people who read the hook were more interested in Grigori Rasputin. While on a typical day a few thousand people look at Rasputin, on his day in the spotlight, 14,000 did which suggests a DYK bump of 12,000. If our goal is to draw readers into the encyclopedia then we should consider whether the other items in the hook are of interest as well. --JayHenry (talk) 03:08, 20 June 2008 (UTC)
I've never had an article have 5,000 hits during its DYK stay. Maybe this will help me get an idea of how to improve though. Wizardman 00:36, 21 June 2008 (UTC)
I know I'm probably dragging up an old discussion here, but since DYKs are only on the main page for six hours (?), shouldn't they be put on at an appropriate time? I.e., an American subject should be on during the day or evening in America, an English subject during the day or evening in England, etc. Some subjects at DYK may only have limited interest and be on at a time when they are likely to receive less traffic. Peanut4 (talk) 20:33, 4 July 2008 (UTC)
I am not sure that, in this age of globalization, it will work well. A lot of Americans live all over the world, and many non-Americans stay or live in the U.S., etc. Besides, even for people living in the same time zone, it is not that they all get up and sleep at the same time. Also, all-American updates, no matter what time of the day they get featured on main page, will draw complaints. --BorgQueen (talk) 20:56, 4 July 2008 (UTC)
It's never going to be a perfect system, and I feel most DYKs will have some global interest - at the moment, most if not all do. But on a few occasions, I think it might be worth bearing in mind what the tag is mainly about and perhaps saving it for one or two updates' time. Peanut4 (talk) 21:05, 4 July 2008 (UTC)

Main Page redesign

Interested parties might want to pay attention to Wikipedia:2008 main page redesign proposal, as proposals could significantly affect any of the main page projects. --JayHenry (talk) 17:54, 6 July 2008 (UTC)

Who uses the archives?

At User talk:The Duke of Waltham#DYK bot we are discussing restructuring the Did you know archives. Restructuring doesn't matter unless someone uses the archives. So do you? Do you use the archives for any purpose other than to look up your own hook, which benefits only those who have contributed hooks? How often have you looked? Do the archives exist only to provide the illusion of permanence for a six-hour phenomenon? If so then maybe we need archives to motivate new hooks, but the form of the archives wouldn't matter. Art LaPella (talk) 19:26, 8 June 2008 (UTC)

In the past, I've used the archives to find hooks for the DYK section of the Organized Labor Portal. But that really wasn't a big deal, so you don't need to take it into consideration, if you want to redesign the archives.--Carabinieri (talk) 19:38, 8 June 2008 (UTC)
(edit conflict, sorry for some redundancy, although I disagree with carabinieri's note about being a big deal or not) I've used them to look what the hook on some article was when I happened to see the dyk template on the talk page, since the hook itself is not included on the template. For this, it would probably have been better if the hook itself was included in the talk page template (but I don't know if changing that now would be smart, since I like consistency). More importantly, I redid the film portal a few years ago (it's been redone since), and used the archives on a few occasions to look up new (or old, depending on the viewpoint) hooks for the dyk section there, the archives made it very simple. And I know you're not planning to delete the old archives, so this point wouldn't really matter, but the archives existed before the talk page template, so the only way to know whether an article had been on dyk was through seeing the archive on 'what links here'.- Bobet 19:46, 8 June 2008 (UTC)
I have used the archives several times. Once I was compiling archives for the April Fool's Day Mainpage, so I looked for them in the DYK archives. I have looked through the archives to find hooks used by myself and others for portals. I used the archives to calculate the number of DYK articles generated per month for an article in The Signpost.
This archive should be arranged by some type of date order, with the date clearly defined in the link to the archive. Having a series of numbers is not very helpful because it's quite difficult to determine which number corresponds with which date. I gave some thought about how much information should be placed in an archive (if everything remains the same as today), and I felt that dates 1-15 in a month should be in one archive and dates 16+ should go in another. Another reasonable alternative would be a structure where the archive links to a year page, then a month page, then date page. Royalbroil 04:58, 9 June 2008 (UTC)
That seems like a good idea, and I personally would also like to see updates datestamped in the archive itself so that anyone browsing through can see exactly when an update was displayed.
I've used them to look what the hook on some article was when I happened to see the dyk template on the talk page, since the hook itself is not included on the template - Bobet.
Adding the hook to the article template was suggested not long ago. I think that would just make too much work for updaters, but on reflection I guess we could add the function to the template and just make it entirely optional, so that, for example, article authors could add the hook themselves if they so chose. Gatoclass (talk) 09:44, 9 June 2008 (UTC)
(to add on to Gato's last sentence)...or so a bot could add the hook to the template if a bot does the crediting. Royalbroil 12:22, 9 June 2008 (UTC)
Gato's suggestion sounds good to me, since it doesn't make any extra work for anyone who doesn't choose to volunteer it. Olaf Davis | Talk 11:24, 10 June 2008 (UTC)
Some wikiprojects go back through to find relevant DYK's on their topic. Totnesmartin (talk) 21:11, 9 July 2008 (UTC)

DYKs from lists

Can someone remind me of the policy on getting DYK hooks from lists? Do we just allow them iff the list also contains at least 1,500 characters of prose, or are the rules different? Olaf Davis | Talk 14:23, 2 July 2008 (UTC)

DYK 1,500 character count focuses on the written prose of an article. List items generally are not included in the DYK 1,500 character count. On the other hand, if the list items each included three sentences of prose, those three sentences may count. In the end, it's up to the DYK posting admin to decide whether they want the suggested DYK hook to appear on the main page. GregManninLB (talk) 15:56, 9 July 2008 (UTC)
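
As a rough illustration of the prose-only count described above, the sketch below skips list items when counting characters. It assumes the article is available as wikitext, treats lines starting with the usual wikitext list markers (*, #, ;, :) as list items, and also skips headings; whether headings count is an assumption here, and the names are hypothetical. It does not handle the case mentioned above where list items themselves contain sentences of prose, which would still need a human judgment.

```python
# Wikitext lines beginning with these characters are treated as list items.
LIST_MARKERS = ("*", "#", ";", ":")
MIN_PROSE_CHARS = 1500  # the 1,500-character figure mentioned in this thread


def prose_char_count(wikitext: str) -> int:
    """Count characters on lines that are neither list items nor headings."""
    total = 0
    for line in wikitext.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # blank line
        if stripped.startswith(LIST_MARKERS):
            continue  # list item, not counted as prose
        if stripped.startswith("=") and stripped.endswith("="):
            continue  # section heading (assumed not to count)
        total += len(stripped)
    return total


def meets_prose_minimum(wikitext: str) -> bool:
    """True if the non-list prose reaches the minimum length."""
    return prose_char_count(wikitext) >= MIN_PROSE_CHARS
```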

IMDb

I have noticed a tendency to disqualify articles for using IMDb as a source, with the argument that it is not a "reliable source". I'm not sure what policy this is based on, the only thing I could find was the following in Wikipedia:Reliable source examples#Use of electronic or online sources:

Trivia on sites such as IMDb or FunTrivia should not be used as sources. These media do not have adequate levels of editorial oversight or author credibility and lack assured persistence.

This very clearly concerns only trivia, not credits or other basic facts. The real problem here is that even though IMDb - like Wikipedia - has issues with credibility, it is nevertheless the most comprehensive source on matters to do with film and television. By doggedly pursuing a "no IMDb"-policy, we are seriously limiting the articles on these subjects that can be included in DYK. Lampman Talk to me! 16:04, 2 July 2008 (UTC)

I believe IMDb is a wiki. I have friends who are producers/directors. When there is a problem with their IMDb entries they simply get on line and edit the entries. --Blechnic (talk) 17:09, 2 July 2008 (UTC)
Is the issue that the trivia sections of IMDB often provide facts that would be fun to include in hooks? I would say don't give in to temptation here. IMDB, like Wikipedia, does a generally solid job, but a lot of those trivia sections originate in things like fan forums. The vast majority of the reliable information in IMDB comes from four sources: the first being diligent transcription of credits from the primary source (and the featurette-type information on the DVDs); the two major entertainment papers Variety and The Hollywood Reporter; and Box Office Mojo, which though a Web site, is considered a credible source by most major newspapers, who use mojo's data. I would suggest using IMDB to get ideas, but then checking those ideas out in reliable sources. If it's credible information you're likely to find it elsewhere. --JayHenry (talk) 05:23, 3 July 2008 (UTC)
Speaking by no means from a position of authority on IMDB I'm inclined to agree with JayHenry. Surely most reliable info on the site will also appear in respectable media publications or on the studio's or film's page. Olaf Davis | Talk 09:15, 3 July 2008 (UTC)
As you can see from this page, IMDb is a bit different from a Wiki, in the sense that there's a certain element of editorial control by the staff (it's a commercial site, so they can afford it). Yes, IMDb has its problems, and if there were other, more reliable sources available then that would be great. But with over a million titles and 2.3 million names, there is really nothing that can compete, online or offline. So by excluding IMDb we're really also excluding a wide range of more peripheral subjects from inclusion. Lampman Talk to me! 21:32, 3 July 2008 (UTC)
Information that only exists in IMDb seems highly likely not to satisfy inclusion standards for the project. For example, IMDb has pages on probably hundreds of thousands of actors and crew that never had significant roles. Databases and encyclopedias often have very different roles to fill. But I'm not so sure this is an issue that DYK can do anything about anyways. DYK just follows the rules of the rest of Wikipedia. DYK cannot decide for the rest of the project that extremely peripheral subjects can be included. --JayHenry (talk) 22:21, 3 July 2008 (UTC)
I think you misunderstand; I'm not suggesting changing the WP:N guidelines, I'm simply talking about the availability of references for article information. There is certainly plenty of relevant, encyclopaedic information about films and television shows that you'd have a hard time finding anywhere else than IMDb. Apart from that you're making the exact same point I'm trying to make: the same rules have to apply here as anywhere else. WP:RS has no general ban on IMDb, so there is no basis for introducing one specifically for DYK. Even FAs use IMDb; I see no reason why it can't be used here. Lampman Talk to me! 02:17, 4 July 2008 (UTC)
Oh, I see... there are some exceptions, but they're pretty narrow. Screenwriting credits in IMDb are provided directly by the Writers Guild of America and thus would be considered reliable. Movie ratings come directly from the MPAA. But in general, with information from IMDb, the job is to establish that it is reliable, rather than for others to try to establish that it's not. I see you have a hook on the suggestions page that uses screenwriting credits, and so that's okay. If an editor raises a concern about a screenwriting credit in the future, show them this link. --JayHenry (talk) 14:43, 5 July 2008 (UTC)
Thanks! Lampman (talk) 03:04, 9 July 2008 (UTC)
Not quite. They, like anyone, get online and can submit changes which are vetted by editorial staff at IMDb before inclusion, and their policies include, for obvious reasons, seeking confirmation for notable additions, modifications and deletions. Achromatic (talk) 17:55, 9 July 2008 (UTC)

Ex-copyvio

I just discovered that the whole of the important Romanesque art article was a copyvio from what is now the 1st Ext link - The Metropolitan NY. The article now is a lot shorter than before, but all new - & still expanding. Does this qualify for DYK? I think it should but the rules are a tad ambiguous. And does it qualify for the Jayhenry anti-plagiarism awards ..... Johnbod (talk) 23:15, 5 July 2008 (UTC)

As I understand the precedents, replacing a copyvio hasn't been an exception to the need for a new article or a fivefold expansion for DYK. If I'm wrong, I'll change my Unwritten Rules accordingly. Art LaPella (talk) 04:09, 6 July 2008 (UTC)
That's correct. We count only the (current number of characters) / (original number of characters), regardless of how bad those original characters were. That makes it much easier to judge individual cases and keep the backlog down. Olaf Davis | Talk 09:45, 6 July 2008 (UTC)
It looks like Johnbod took you at your word! Now we have a diff that satisfies the five-fold increase as far as I can see. --Hroðulf (or Hrothulf) (Talk) 19:46, 7 July 2008 (UTC)
Actually much of it derives from Romanesque architecture. Does that count? Johnbod (talk) 20:19, 7 July 2008 (UTC)
Not according to the unwritten rule (no forks) :( --Hroðulf (or Hrothulf) (Talk) 21:32, 7 July 2008 (UTC)
It's not a fork if one is a summary article and the other a more detailed covering of one section from that summary, which this case may be. --EncycloPetey (talk) 07:29, 8 July 2008 (UTC)
What, should text that is illegally copied and pasted from another site count towards the "original number of characters"? Then this is what you do: before you start rewriting the article you remove the copyvio text, which is a perfectly legitimate edit. Then that amount of text will be the "original number of characters". Surely that must be ok? Lampman (talk) 03:18, 9 July 2008 (UTC)
Good point! I will put it up anyway, with a reference to this discussion. On Encyclopetey's point, I have expanded some of the text brought in but not other bits. At the moment the article is about 9,000 chrs, of which perhaps 4,000 are borrowed; they cover carving on buildings, mural paintings & stained glass (now much expanded). Johnbod (talk) 21:53, 9 July 2008 (UTC)

The obsession with unusual hook facts

This is something that's been bugging me for a while, it seems to me we are losing the wood for the trees in this obsession with trying to find unusual facts for hooks. Not every article has a "Ripley's believe it or not" type hook fact that can be utilized, and in cases where there is no such fact I feel we should just use the article's main subject. Just to give an example from the current batch, there's a hook which goes:

  • that when Henry D. Edelman became the first president and CEO of the Federal Agricultural Mortgage Corporation in June 1989, no staff had been hired to work with him?

- This fact is not at all unusual, so why try and pretend it is? Seems to me it would be better to just go with something much simpler which addresses the article's subject more directly, such as:

  • that Henry D. Edelman became the first president and CEO of the Federal Agricultural Mortgage Corporation in June 1989?

- Trying to spin quirky hooks out of prosaic facts just detracts from the overall impact of the update in my opinion. And the basic subject matter is usually informative and of some interest in itself; if it weren't, there wouldn't be an article on it in the first place. Gatoclass (talk) 06:00, 8 July 2008 (UTC)

This is certainly my recent experience, when simple and concise hooks fulfilling the DYK Rules are made to rot for not being exceptional.--IslesCapeTalk 14:26, 8 July 2008 (UTC)
I see your Jammu and Kashmir example as another instance of trying too hard to make a prosaic fact look unusual. The final objection was to the words "only existed", which suggested that the state had an unusually brief existence, when 101 years is not unusually brief. (Objections to earlier hooks were based on a need for reliable sources, which are critically important.) IMO, if the only facts available for DYK are totally prosaic (and inherently uninteresting, IMO) factoids like "that Henry D. Edelman became the first president and CEO of the Federal Agricultural Mortgage Corporation in June 1989" or "that the princely state of Kashmir and Jammu existed from 1846 to 1947", there's no point in having a DYK feature. --Orlady (talk) 15:12, 8 July 2008 (UTC)
First of all, the title is Kashmir and Jammu, which is exactly the reason why this needed to appear on DYK: to illustrate the difference between the two! Secondly, the main idea of the nom was to imply that the state existed only as long as British colonial rule did. And this fact was already cited with a reliable source in the LEAD. However, apparently some people missed the point. As a result, what happens is that even if the hook contains a simple and referenced fact, the editors are forced to bring in catchy phrases, only to have their arms twisted for not being 'exceptional'. As for 'interesting', it is a relative term. The DYK intro on the Main Page says From Wikipedia's newest articles, and not From Wikipedia's newest and most interesting articles. Enter Gatoclass's point: if it weren't "informative and of some interest in itself, [...] there wouldn't be an article on it in the first place". And my point is that this applies equally to commentators. Otherwise, admins can always make minor modifications to hooks for updates, which I never mind. --IslesCapeTalk 19:05, 8 July 2008 (UTC)

You need quirky hooks in every update, otherwise people will ignore the DYK section. However, less quirky ones can be used sparingly as filler, especially when the "On this day" section forces DYK to use more hooks.--Bedford Pray 21:59, 8 July 2008 (UTC)

I agree with Bedford - if we stop demanding a reasonable proportion of exciting hooks I'd expect the number of people who bother looking down to DYK will drop significantly, and we lose the point in having it on the Main Page. And the fact that it doesn't say From Wikipedia's newest and most interesting articles seems a bit of a red herring to me: if the hooks are interesting, we don't need a label to tell people they are - better for them to read a few and think "wow, Wikipedia has some really cool articles!" on their own. Besides, surely we should be basing the label we give to the section on the service we want to provide the reader, not the other way around. Olaf Davis | Talk 10:12, 9 July 2008 (UTC)
Yes, but the point I am making is that most of the hooks aren't very exciting, and that trying to turn every hook into an amazing fact just detracts from the hooks that are quirky and unexpected.
I'm not arguing that surprising facts should not be used when they are available, just that striving to create one that isn't there often results in a hook that just looks silly, or that is so peripheral to the subject matter of the article that it's practically irrelevant. I'm just saying, let's not forget that every article is about something that someone found interesting enough to write about, and that others will presumably find interesting to read, so that when there is no obvious hook for an article, there is always a self-evident fallback, which is the subject of the article itself. Gatoclass (talk) 10:32, 9 July 2008 (UTC)
Ah, sorry I'd sort of misinterpreted. Yes, I certainly agree with that: exciting hook >> ordinary this-is-the-subject-of-the-article hook > artificially hyped hook pretending to be exciting when it isn't. Something to bear in mind. Olaf Davis | Talk 11:29, 9 July 2008 (UTC)

Self-reference

In regards to yesterday's kerfuffle (and ooh, Firefox's spellchecker allows "kerfuffle"!) about WR, I hereby propose that articles which are self-referential - id est, the topic would not have existed if not for Wikipedia - be deemed ineligible for inclusion on DYK. It feels too narcissistic, for one thing. DS (talk) 14:18, 9 July 2008 (UTC)

Seems sensible to me. Also, while the WR DYK was only on the main page DYK list for two hours, as it was moved before the archiving, it never got into Wikipedia:Recent additions, so I added it. Interestingly, it was one of very few "Did you know?"s that didn't start with the customary unnecessary "that". Neıl 14:42, 9 July 2008 (UTC)
It was in Recent Additions; someone deleted it when they placed the hook from Indianapolis Fire Department on Recent Additions.--Bedford Pray 16:19, 9 July 2008 (UTC)
Please provide a link to the WR kerfuffle discussion. The article DYK date was July 7th and posted on the Main Page on July 8th. When I gave WR a green light, the hook did begin with ... that[1]. As for the green light, the WR article had just been moved from user space after a detailed review and consensus at DRV. While some of the footnotes were self referencing or not Wikipedia reliable sources, there seemed to be enough material from Wikipedia reliable sources and general material to meet DYK's 1,500 character requirement. Also, I was persuaded by the very recent DRV consensus. GregManninLB (talk) 16:32, 9 July 2008 (UTC)
I believe the kerfuffle being referenced is here: Wikipedia:Administrators' noticeboard/Incidents#WR on DYK, and you can see that the discussion spilled onto some admins' talk pages. If DYK frequently featured hooks about Wikipedia I can see it being a cause for concern, but in practice this is quite rare, is it not? --JayHenry (talk) 03:05, 10 July 2008 (UTC)

The main page permanently contains no less than 60 self-referential links, all "topic[s] would not have existed if not for Wikipedia" like Wiktionary etc. You might start there if self-referential links on the main page are truly so bothersome. --Rividian (talk) 02:33, 10 July 2008 (UTC)

The Administrator's Noticeboard discussion above didn't consider any possible article about Wikipedia Review to be a self-reference problem. It's clearly an example of the exception described at WP:WAWI. Art LaPella (talk) 03:39, 10 July 2008 (UTC)

Order of suggestion consideration?

I've been noticing a trend on DYK in that there is no particular order of review - it seems that an admin can arbitrarily comment and confirm on suggestions, regardless of the date/time of the article. I don't particularly care about the order of inclusions, but I think it would be more logical for the working admins to at least comment on older articles first.--Jiuguang (talk) 19:08, 9 July 2008 (UTC)

Most comments are on the oldest articles. My comments, or more typically my proofreading changes, are on the newest articles. So perhaps you mean me. The oldest articles already have my comments, because at one time the oldest articles were the newest articles. I find the most recently submitted articles, which may appear anywhere on the page, by using the history page to compare the current version to my last edit (or the last time/date stamp I recorded if I made no edit). I think it would be better if everyone worked on articles as they are submitted, which gives us 5 days to debate or correct any problems that may emerge. We occasionally get angry comments from authors who are asked to perform a major correction with only a few hours left before the deadline (although the "deadline" actually has a few days of leeway). Art LaPella (talk) 00:04, 10 July 2008 (UTC)
That is exactly my concern - an article can sit on the page for 5 days without a comment, and if something goes wrong, there's so little time to identify and correct the problem. Thank you very much for the response! --Jiuguang (talk) 00:27, 10 July 2008 (UTC)
I know how you feel; three for July 3rd are perfectly good, two of which are mine, but are being ignored.--Bedford Pray 03:38, 10 July 2008 (UTC)
Of course it's bad when a nomination sits for days and only gets reviewed as the 'deadline' is approaching, so changes have to be completed quickly. Ideally, we'd have no backlog to work at and we'd comment on articles as soon as they're nominated. However, we usually do have a backlog and it hardly makes sense to ignore old articles altogether to comment on the new ones. As with many aspects of DYK, the only prospect I can see for improving on the current system is to obtain more man-hours of reviewing and keep the backlog down - but that's out of our control, really. Olaf Davis | Talk 16:02, 10 July 2008 (UTC)
Reviewing articles when they are first submitted takes the same amount of time as reviewing them when they are about to expire, because every oldest article was a newest article when it was submitted, no matter how overworked we are. Only during the transition would there be the extra work of reviewing everything between the newest and the oldest. Art LaPella (talk) 01:33, 11 July 2008 (UTC)

Generally speaking one does try to review the oldest hooks first. However, there are a couple of reasons this doesn't happen. The first is that updaters need to get a balanced selection of hooks for an update which usually means they have to find suitable hooks from several days' worth of submissions. The second reason is that because we are chronically understaffed, hooks which are likely to be less problematic tend to get reviewed first. After doing this for a while, one has an intuition about which hooks are going to cause problems, and these tend to get reviewed last because of the time constraints.

I'm afraid there is no solution to the problem except more reviewers which we don't have, but if it's any consolation I personally work on the basis that hooks only truly expire five days after they were first reviewed. But now that I think of it, I'm wondering if perhaps we shouldn't make that a more formal process. Gatoclass (talk) 03:53, 11 July 2008 (UTC)

In regards to the above, I have now added a hidden message below the "Expiring noms" section header, which states the following: NOTE that hooks should only be deleted if more than five days have elapsed from the date that the hook was first reviewed. This is to ensure that article submitters get sufficient time to respond to the problems raised.
This should hopefully prevent hooks being deleted prematurely (which has happened on several occasions recently) and also give DYK managers a rule of thumb for when to delete hooks. Gatoclass (talk) 14:06, 11 July 2008 (UTC)
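The rule of thumb in the hidden note can be written down as a small check. This is only a minimal sketch, not an existing tool: the function name `may_delete_hook` and the `first_reviewed` parameter are hypothetical, and the handling of a hook that has never been reviewed (treated here as not yet deletable) is an assumption, since the note does not cover that case.

```python
from datetime import datetime, timedelta
from typing import Optional

# Grace period taken from the hidden note: five days from the first review.
REVIEW_GRACE = timedelta(days=5)


def may_delete_hook(first_reviewed: Optional[datetime], now: datetime) -> bool:
    """Return True only if more than five days have elapsed since the hook
    was first reviewed (hypothetical helper, for illustration only)."""
    if first_reviewed is None:
        # Assumption: a hook nobody has reviewed yet has not started its
        # five-day clock, so it is not deletable under this rule.
        return False
    return now - first_reviewed > REVIEW_GRACE
```

The clock deliberately starts at the first review rather than at the nomination date, which is the point of the note: submitters get the full five days to respond to whatever problems were raised.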

Quick question - split-off articles

Sorry if this has been asked fifty times before, but I can't find out an answer to this: are split articles eligible for DYK? I've just split a chunk off (the far-too-long) Zimbabwean presidential election, 2008 and was wondering if the new article qualifies, seeing as I only wrote the introductory paragraph. Totnesmartin (talk) 21:18, 9 July 2008 (UTC)

According to the previously Unwritten Rules, which are now linked from Template talk:Did you know#Instructions: "No forks, that is, an article isn't really new if you copied it from a larger article." Art LaPella (talk) 01:17, 10 July 2008 (UTC)
Ah, thanks. I had a feeling it would be out but I thought I'd confirm it. Totnesmartin (talk) 07:41, 10 July 2008 (UTC)

List of DYKs?

Is there a way to get a comprehensive list of all DYK articles? category:Wikipedia Did you know articles has 10,000+ articles in it, but those are not all. Some of the {{dyktalk}} templates were substituted on talk pages and thus do not transclude this category (I estimated around 6,000 such articles). Maybe a bot should be asked to un-subst them? Renata (talk) 07:33, 10 July 2008 (UTC)

See the archives. All 200-something volumes, that is. Daniel Case (talk) 17:32, 10 July 2008 (UTC)

So without combing through a 200-page archive there is no way to get a clean list of DYKs? Renata (talk) 21:11, 10 July 2008 (UTC)
Probably not, since the earliest dyk entries are older than the dyktalk template. Actually, you probably won't get a really comprehensive list without going through revisions of the template itself, since I'm sure there have been times when people forgot about the archiving. Do you need a comprehensive list of dyks for something? - Bobet 21:28, 10 July 2008 (UTC)
I've mentioned this before but a while ago I was working with Jreferee on a tool that would produce index tables like Wikipedia:Recent additions 146/History. This is a project that is far (perhaps very far, and seemingly getting farther every day since I haven't worked on it in months) from being finished. Generating a comprehensive list of all DYK articles would be considerably easier than generating these index tables, but still a fairly large task. -- Rick Block (talk) 01:00, 11 July 2008 (UTC)
Ok, I don't really need the list - it was more for curiosity. The index tables seem like a good idea - a bot should be able to do them, no? Renata (talk) 18:12, 11 July 2008 (UTC)
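As a rough illustration of what such a bot's core step might look like, here is a minimal sketch that pulls article titles out of archive wikitext. It rests on assumptions not stated in this thread: that each hook is a bulleted line beginning with "...", and that the featured article is the bolded wikilink in that line. The function name `featured_titles` is hypothetical, and fetching the archive pages is left out entirely.

```python
import re

# A bolded wikilink such as '''[[Title]]''' or '''[[Title|label]]'''.
# Only the target title (before any "|" or "#") is captured.
BOLD_LINK = re.compile(r"'''\[\[([^\]|#]+)(?:[#|][^\]]*)?\]\]'''")


def featured_titles(wikitext: str) -> list[str]:
    """Collect bolded link targets from hook lines, in order, without repeats.

    Assumed format: hook lines are bullets whose text starts with "...".
    """
    titles: list[str] = []
    for line in wikitext.splitlines():
        stripped = line.lstrip("* ").strip()
        if not stripped.startswith("..."):
            continue  # not a hook line under the assumed format
        for match in BOLD_LINK.finditer(stripped):
            title = match.group(1).strip()
            if title not in titles:
                titles.append(title)
    return titles
```

The filter checks only for the leading "..." rather than "... that", since not every hook starts with "that". A list built this way would still miss any hooks that were never archived, as Bobet notes above.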

Proposal: DYK quotation requirement for inaccessible Internet sources

As Daniel Case noted above,[2] there needs to be a DYK requirement that a quote be included from a source that cannot be easily verified online, whether that source is off-line, online behind a subscription service, or in a foreign language. In lieu of a DYK member actually checking the source to verify the DYK hook fact, DYK should allow the article editor to provide a quotation from the source from which the DYK hook fact may be verified. In addition, DYK should accept quote translations into English from foreign language references. The hook fact can and should generally be easily verifiable by anyone viewing the Main Page. The wording I propose to add to DYK Rules Selection criteria item #3 is below. Please comment below the proposed wording. Thanks. GregManninLB (talk) 15:47, 9 July 2008 (UTC)

If the hook fact cannot be easily verified online through the article inline citation, then a quote from the cited source must be included to allow DYK and those reading the Main Page to verify the hook fact.
  • Oppose. This proposal violates the spirit of Wikipedia:Assume good faith while providing no benefit in resolving the problem of unsourced or incorrectly sourced material. For the majority of contributors, this proposal would do nothing but force each contributor to perform extra work to demonstrate that they were already doing the right thing. For those who wish to game the system however, this proposal does nothing but require the creation of a quote to accompany a created source. While I support the purpose of improving verifiability of material on Wikipedia, a system that forces people doing the right thing to prove their innocence while at the same time doing nothing to prevent individuals who desire to insert made up information is not the way to proceed. --Allen3 talk 16:36, 9 July 2008 (UTC)
    • The issue is not unsourced or incorrectly sourced article material, the issue is verification of the < 200 character DYK hook by DYK. A bulk of DYK's problems come from statements presented on the Main Page that are not supported by the cited source. Mistakes happen and an editor may be interpreting the source material incorrectly when proposing the DYK hook. DYKs hook verification procedure does not violate assume good faith. This proposal addresses DYKs responsibility to Wikipedia; it does not address the responsibility of individual editors to Wikipedia. This proposal improves DYK's ability to take all reasonable effort to better ensure that statements appearing on the main page are more likely supported by the cited source. GregManninLB (talk) 17:25, 9 July 2008 (UTC)
      • The issue of incorrectly sourced articles and verification of a DYK hook are functionally equivalent. It all boils down to a question of being able to trust either a citation or a quotation. If we can not trust article creators to provide us with valid citations that support a specific claim, what makes you believe we can trust the same person to provide us with valid quotations that accomplish the same goal? Without access to the cited source it is impossible to know if the quotation was accurately transcribed, as opposed to "edited" to prevent misinterpretation by DYK fact checkers, or if it was ever part of the cited source. --Allen3 talk 18:00, 9 July 2008 (UTC)
        • Howcheng is trustworthy. But once he provided me with access to the quote supporting the proposed hook, the DYK suggestion discussion led to his realizing that he had made a factual mistake and the hook was revised.[3] Ricky81682 proposed a DYK hook crediting Charles Thomas Bolton as the first astronomer to prove the existence of a black hole (a big deal). However, review of the sources revealed that Webster and Murdin independently discovered the wobble with Bolton and the hook was revised.[4] House of Scandal proposed a hook where Bartholomew Gilbert was blamed for a failure. However, a review of the original sourced material shows that Gilbert may have been responsible for the failure, but there was no evidence that he was blamed.[5] Without a review of the original source material, these mistakes would not have been caught. Also, until I know otherwise, I will assume good faith in that people will provide a valid quotation or translation to allow DYK to review the DYK hook appearing on the Main Page. --GregManninLB (talk) 20:21, 9 July 2008 (UTC)
  • Support. It's not a matter of trust. I just submitted a DYK nom where I completely misread the source, and if it hadn't been accessible online, it wouldn't have been caught by GregManninLB. AGF only asks you believe people are working towards the best interest of the encyclopedia, not that they are infallible. I wouldn't mind having the citation on T:TDYK instead of the article footnote, however, to avoid cluttering up the article page. howcheng {chat} 19:03, 9 July 2008 (UTC)
  • As a result of this and my own efforts to place quotes in some footnotes (where they really didn't fit), the proposed language leaves open the location where the quote may be placed. Sometimes it works well in the article footnote, but having the citation on T:TDYK works just as well. GregManninLB (talk) 20:29, 9 July 2008 (UTC)
  • Oppose. In addition to the WP:AGF problems noted by Allen3, this smacks of instruction creep. The new requirement even goes beyond what is required for an article to attain GA status, which seems like a huge burden to place on a 200 character hook. --EncycloPetey (talk) 20:51, 9 July 2008 (UTC)
  • Oppose per allen & petey, and you should certainly not already be demanding this of current nominations, as you are. This is going in the wrong direction, and will discourage the use of higher-quality print sources, exactly the opposite of the way we should be going. Johnbod (talk) 21:58, 9 July 2008 (UTC)
  • Strong oppose - I understand that not being able to verify might be an issue... but I am writing a ton of articles about super obscure subjects that are lucky if they get a half-sentence mention in online English-language sources. I rely on dead-tree books published sometimes decades ago in foreign languages and not available even on Gbooks as "snippet preview". I think that's where the value of Wikipedia kicks in - dragging such subjects out of dusty and dark library corners onto the Internet superhighway. I object to such articles being treated as somehow inferior to those that are already covered in other Internet sources. Renata (talk) 07:25, 10 July 2008 (UTC)
  • Oppose - because even though I like the idea, we already get a huge number of hooks that fail the DYK criteria, people seem to have a lot of trouble getting it right. Adding more rules will result in an even higher failure rate, and given the fact that we aren't exactly getting a vast number of submissions these days, it means we would end up with less variety to choose from to create a balanced update. Gatoclass (talk) 12:40, 10 July 2008 (UTC)
  • I support something more mild: "If you are using an offline source, as a courtesy please try to include information to help reviewers find the source, such as the URL to its Google Books entry (if possible) or an ISBN". I think it's good practice to link to the Google Books entry even if it's just snippet view... you can sometimes verify that way. But actually requiring quoting is a step in the wrong direction... I think we forget how daunting DYK is already for people who aren't familiar with it yet want to give it a go. If you can't verify a hook, there's no need for you to approve it or put it on the template. Some of us do work in university libraries *cough cough* if there's something you truly can't verify, consider dropping me a line on my talk page... I might just be able to find the book if given a day or two. Ultimately I think we should resist forcing verification down to something we can all do in 60 seconds... the result compromises the integrity of articles. Some stuff just is obscure and takes work to verify... that doesn't mean it's bad. --Rividian (talk) 13:28, 10 July 2008 (UTC)
  • Oppose 1. Per Allen3 above, this violates WP:AGF: though there may be exceptions, we have to assume that editors are able to read their own sources 2. More importantly, along the lines of Renata3: by introducing extra criteria for editors using off-line sources, we're effectively discouraging the use of books. This is the opposite of what Wikipedia should be doing, otherwise we become little more than a repository of on-line links - a fancy Google. Most importantly though: there should either be such a requirement or not. We cannot have the situation we've had recently, where editors make up their own requirements (IMDb is not a reliable source (not true), off-line sources must provide quotations, foreign-language sources must provide translations) and then hold the entry hostage until the demands are met. Anyone who comes across such demands I would just encourage to ignore them - Wikipedia is a community process, we're not allowed to make up our own rules. Lampman (talk) 13:46, 10 July 2008 (UTC)
  • Regretful oppose from me too, mainly for the reason Gatoclass mentions: lots of editors already find the DYK rules daunting enough. Olaf Davis | Talk 14:43, 10 July 2008 (UTC)
  • Comment I'm not following the logic. The issue raised seems to be about off-line sources. Most of the posts above agree that the editors are using the off-line sources for their new article but then state that these same editors somehow are not able to provide DYK the one or two sentences from those off-line sources to allow verification of the DYK hook? If the editors are using the off-line sources, why can't they provide DYK the relevant one or two sentences from those off-line sources? How are editors using off-line sources discouraged from doing something they already did? If we have to assume that editors are able to read their own sources, then why require citations in the article? Since the DYK hook fact appears on the main page, there should be additional requirements for that fact beyond those required for facts placed in an article. GregManninLB (talk) 16:02, 10 July 2008 (UTC)
  • Why? There is no such requirement for FAs. All that is asked of an FA is that facts have proper inline citations to reliable sources, but the sources don't have to be quoted in the footnotes, not even those that appear on the main page. Are you saying we should have stricter requirements for DYKs than for FAs? That certainly wouldn't encourage participation. Lampman (talk) 17:12, 10 July 2008 (UTC)
Well, frankly we should start requiring it in FAs IMO, for the reasons I've discussed below. Daniel Case (talk) 17:21, 10 July 2008 (UTC)
  • Strong support because I will keep doing it no matter how this poll turns out. AGF does not trump WP:V here. The point of Wikipedia is to create a quality, free, online encyclopedia, not make the people who work on it feel good about themselves. If we really didn't want to discourage people from submitting hooks, if that came first, we would never have required sourced hooks at all. If that makes people upset, believe me that's nothing compared to what the article would get in a serious FAC, where every single fact and citation gets stress-tested. I consider learning about this on one single fact you want to appear on the Main Page to be the easy way to learn it.

    We do not keep policies because they are easy. We do the right thing even if it's difficult. We adopted the more stringent fair use policy, by which noncompliant images are summarily deleted and the innocent uploader made to feel like they're some sort of scum, over the objections of myself and others because it was ultimately deemed more in line with promoting free content. Despite the misgivings I still have about that, I consider this situation crucial because it goes directly to the issue of Wikipedia's credibility and reliability rather than some sort of idealistic vision of the future of intellectual property.

    There are, of course, other reasons for this. I do not mark hooks as verified if someone just gives a book and page number because I'd be lying if I did. I used a quote myself when I submitted Radovich v. National Football League, where a fair amount of info, including the hook fact (that the lawsuit began with a brief sketched out on a cocktail napkin) was sourced to a book now out of print. Without the quote there is less guarantee that someone isn't making something up. Or ripping something off, which as you may know we've had some problems with lately. I well remember from my early days on Usenet how Serdar Argic, to back up his theories that Armenians committed genocide against Turks, would include citations to real magazine articles, complete with the usual citation information, from the 1920s. It took someone who actually had access to bound volumes to go look them up and find that while the articles were really there, the quotes were complete fabrications. Yes, including a quote won't stop someone that determined, but it will raise the bar.

    Without this, I think, we're just asking for another embarrassment to Wikipedia based on assuming too much good faith. To restate the Russian aphorism I've used before, and indeed this is reflected in two of our most fundamental policies, доверяй, но проверяй: trust but verify.

    Good faith is a two-way street ... editors at first resentful or puzzled about being asked for quotes in notes should, as the editor doing the asking is assuming that they are on the intellectual up-and-up and can comply in the greater interest of keeping Wikipedia credible and reliable, reciprocally assume good faith on the reviewer's part that he or she, too, is only doing this for the same reason and isn't creating arbitrary hoops for the nominator to jump through. Look, this creates more time and work for me, too. If it were all about making my life as an editor easier I wouldn't be doing this. I have articles I want to work on, too many pictures to upload, too many sockpuppets and vandals to block, to really want to watch other people jump through hoops just for the sheer fun of it.

    All the same, I would support, in fact highly recommend, a softening qualifier that a quote is not necessary in the footnote where such is already in the text, properly attributed of course.

    And maybe, to be fair, this isn't really the right place to be having this discussion. How about WT:V instead? Daniel Case (talk) 17:19, 10 July 2008 (UTC)

    Two comments: One, stating that "I will keep doing it no matter how this poll turns out" shows a complete unwillingness to achieve community consensus. Two, you've misunderstood WP:V, which does not require that all sources be quoted. What it requires is a clearly identified citation of the publication information for the source, so that someone is then able to go find that source to verify the information. --EncycloPetey (talk) 17:33, 10 July 2008 (UTC)
Two responses: first, consensus doesn't prevail over policy if it's in contradiction (OK, if it's shot down, I will just not even review any offline or firewalled hook source without it in the name of intellectual honesty ... if you want to be the one who has to explain why he marked a plagiarized or fabricated hook as "verified" just because the footnote was properly formatted, that's your choice), and second, as I said I really think we ought to move this discussion to WT:V. Daniel Case (talk) 17:39, 10 July 2008 (UTC)
1. Policy is itself the result of consensus. 2. WP:V says (direct quote): "Editors should cite sources fully, providing as much publication information as possible, including page numbers when citing books". It then goes on to say (emphasis mine): "When there is dispute about whether the article text is fully supported by the given source, direct quotes from the source and any other details requested should be provided as a courtesy to substantiate the reference". And also: "Where editors use a non-English source to support material that others are likely to challenge, or translate any direct quote, they need to quote the relevant portion of the original text in a footnote or in the article, so readers can check that it agrees with the article content". So: yes, WP:V does foresee the need for quotes, but only in the rather narrowly-tailored circumstance of potentially controversial material. Now, just out of curiosity, I took a look at the 26 or so proposed hooks here; 0 or perhaps one are even remotely controversial or likely to provoke dispute/challenge. Per WP:V (at least by my interpretation), the rest only require page numbers, not direct quotes. Biruitorul Talk 18:22, 10 July 2008 (UTC)
To sum up, I am not going to lie to the reader and say "I checked this out" when all I checked out was the footnote formatting. I am not going to be an accomplice to that. I am disturbed by the intellectual laziness the opposing attitude represents. Daniel Case (talk) 17:15, 11 July 2008 (UTC)
Both Biruitorul and Daniel Case are right. As WP:V in its current state doesn't support this practice, that is where the discussion should be. But until WP:V policy is changed to require quotations in citations, we shouldn't demand it here. Lampman (talk) 18:51, 10 July 2008 (UTC)
  • Oppose for three reasons. 1: WP:AGF, and the rules are hard enough. 2: Why limit this only to DYK facts? Why not have every sentence on Wikipedia backed up by a direct quote from the source? Surely they too are as important, right? You see, the principle we operate under is that reliable editors take material from elsewhere and shape it into articles. Demanding to know the precise text of their sources is not especially conducive to productive editing. 3: This could get very annoying. If I'm writing an article from scratch, I don't really mind giving a quote. But hypothetically, let's say I decided to translate es:Monetario clásico de la República Oriental del Uruguay. All the sources are books from Uruguay to which I don't have access, so I'd be forced to ask es:Usuario:Ncespedes to help me out. But what if Ncespedes didn't know English? Or if he'd quit Wikipedia? Or if he wasn't in Uruguay anymore to access the books? Or if he just didn't want to be bothered - which is quite understandable, since it is rather rude to assume someone might be fiddling with sources? So while I agree we shouldn't let just any hook pass, this proposal strikes me as a step too far. Biruitorul Talk 17:22, 10 July 2008 (UTC)
Other editors are already doing this, without DYK being involved, because they see the problem in allowing trust-me sourcing too. See Larry Davis (criminal). And on the suggestions page, see Focal and diffuse brain injury, which I just approved, an approval made much easier by an editor (Delldot) who gets this, too. He took, from a technical journal and an article with a daunting title, the one sentence that proved his point, and put it in the footnote for anyone reading to come across. Without me even asking. Without anyone here even asking. It can't really be that difficult. If we challenge editors to work a little harder for the ultimate benefit of Wikipedia, they will rise to it. Daniel Case (talk) 17:31, 10 July 2008 (UTC)
Like I said, I don't really mind putting in a quote myself, although it strikes me as both unnecessary and still open to fraud - I could just manufacture the quote, couldn't I? But what about the Uruguay example? Can't we trust anyone to give only a page number - even, say, an administrator on another Wikipedia? Biruitorul Talk 17:51, 10 July 2008 (UTC)
If someone is going to go to those lengths for fraud purposes, well we can't really prevent that, can we. The whole point of this proposal is to prevent honest mistakes. I'd be happy if this were a preferred course of action but not required, with the caveat that those that do not provide quotes may be passed over in favor of those with when we have a plethora of nominations, for example (kind of like we do for length). howcheng {chat} 18:37, 10 July 2008 (UTC)
As Biruitorul indicated above, I don't think it's good practice to demand quotes because it essentially adds to article clutter. Quotes from source should generally only be made if they are actually adding useful information that isn't in the main body text (like a normal footnote), or if the accuracy of the cite has been contested. We don't want to burden readers with unnecessary repetition of information that is already in the main body text. Gatoclass (talk) 04:01, 11 July 2008 (UTC)
And I also stated above that I don't see it being added to the article, but at T:TDYK with the nomination. howcheng {chat} 16:27, 11 July 2008 (UTC)

Daniel's really right that WT:V is the proper place for this: if it's proposed there and accepted, of course DYK will follow. If it's not and we take it up as a DYK policy anyway, then I anticipate a thousand "but not even FAC requires that - DYK is so unreasonable" arguments filling this page up and sapping the project's popularity. Either way, any further discussion here will be repeated at WT:V so perhaps Daniel or Greg would like to propose it there? Olaf Davis | Talk 09:41, 11 July 2008 (UTC)

That looks very like forum shopping, given the clear lack of support here. What would be proposed? That all FAs adopt this policy? Good luck with that! If only DYK is affected, then the correct place to discuss it is here. If a proposal is made, be sure to link it here. Johnbod (talk) 13:41, 11 July 2008 (UTC)
I agree that getting acceptance for this across Wikipedia in general is very unlikely, but since people have objected to the idea of requiring more for DYK than anywhere else proposing the change for the whole project seems like a reasonable step. I don't really see that it's forum shopping since this is the wrong place to discuss a general change of policy. Olaf Davis | Talk 15:51, 11 July 2008 (UTC)
I'm still not clear what is being proposed, but if it is to apply to everything, then ok. I see WP:SNOW looming, but have a go. Johnbod (talk) 16:06, 11 July 2008 (UTC)
I agree that such a DYK policy would be frustrating, but if it were to pass - then so be it. What is ten times more frustrating is editors imposing requirements that don't even exist, by using the {{subst:DYK?no}} symbol which literally means "Article is currently ineligible", when it is indeed eligible. This is simply moving the goalpost after the race is over. I've noticed frustration with that already, and that kind of arbitrary disregard for guidelines and procedure is bound to drive contributors away in droves. Lampman (talk) 13:21, 11 July 2008 (UTC)
It's basically because of the recent plagiarism scare. Some reviewers appear to have taken it upon themselves to try and do something about this potential problem by demanding quotes from source, but the problem is that there is no consensus for this approach. So I think whatever we do, we need to keep things co-ordinated to ensure that everyone is on the same page and working from the same set of rules, or it is just going to cause frustration and confusion for would-be contributors. Gatoclass (talk) 13:49, 11 July 2008 (UTC)
It's also because I find footnotes that don't provide the sourced information, that use unreliable sources, and so forth (this accounts for 30-40% of sources used, believe me). I was asking for quotes even beforehand because I am not one of those reviewers who says, "Well, this looks good, uses the right template and is properly formatted; I'll take it on faith" unless we need them for the next update and it's got to go out. I am not going to rubber stamp stuff. If that means, after this discussion, that I mark only those hooks as verified that don't have quotes for inaccessible (to me) sources and we don't have a reviewer available who can look at those sources (like the Oxford Dictionary of National Biography, where someone always is available), then that's what it means. I am perfectly happy to spend less of my time doing this. Daniel Case (talk) 17:07, 11 July 2008 (UTC)
In principle I sympathize with your position, and would like to find some way of imposing stricter standards myself. My concern is in regards to the practicality of this. DYK submitters are already struggling to meet the existing requirements, and adding an additional criterion that is not expected anywhere else on Wikipedia is bound to cause confusion. All these queried hooks are additionally making it more difficult to put together balanced updates because, let's face it, a lot of people don't seem to realize their hooks are being queried and don't return to the suggestions page to try and resolve the problem. And as I've also said above, I am generally not in favour of adding redundant material to articles that can only cause irritation to the reader. So for a variety of reasons I'm not at all keen on this proposal of yours. Gatoclass (talk) 01:07, 12 July 2008 (UTC)
Oppose. I find this to create a recentism bias, making it so that historical figures who are expanded might not be DYKable just because they are historical figures. Wizardman 17:10, 12 July 2008 (UTC)
  • Strong oppose. As an editor who works with a lot of offline sources, I find this extra work unnecessary and agree with Allen that it feels like a lack of AGF or an undue bias favoring online sources. Some hooks are not word-for-word copies of a one- or two-line quote but rather a summary of maybe paragraphs of text, which it would be absurd to ask a nominator using an offline source to transcribe. If a claim is "too strong" or seems odd, it is fair to ask the nominator in good faith to re-review their source just to make sure. If there is a "bad egg" consistently nominating articles with bogus offline sources, then the DYK community is savvy enough to quickly learn not to accept nominations from that individual. Till then it is rather unfair to put an unnecessary burden on those of us who prefer to whip out our library card for some good old-fashioned research. I, for one, actually take more comfort in the reliability and permanence of an offline book source versus a website that could disappear or change at a whim. But that's just me. AgneCheese/Wine 16:34, 15 July 2008 (UTC)

OK, I have a better solution

With the usual benefit of some time offline, a solution to this issue formed itself in my mind: another template, preferably named "DYKtickAGF" or something like that, with text that would indicate the reviewer is accepting the editor's inaccessible source on good faith. I think this would satisfy everyone's concerns. Mine included. Daniel Case (talk) 02:21, 12 July 2008 (UTC)

Strong support on the grounds that I too thought of that last night and was just now coming to suggest it! It means people who do have access to a big library can check those entries and upgrade some of them to DYKtick, while other people won't keep repeatedly verifying the date and length before discovering the reference isn't online and not ticking it. It also avoids having a DYK team working to two different standards, which I don't think would be a good state for the project to maintain at all in the long run. The only real disadvantage I can see is the increment in complexity, but I really think that'll be worth it for the benefits in terms of saving repetition, finding plagiarism, and maintaining cohesion. Olaf Davis | Talk 11:34, 12 July 2008 (UTC)
No objections by the sound of it... anyone feel like designing an icon for DYKtickAGF? Olaf Davis | Talk 21:42, 14 July 2008 (UTC)
I've created the template and added it to the list on the suggestions page. I'm a little worried that the blue tick might not stand out and may get missed by editors working on updates, but I don't have a better image to use. Everyone should try and watch out for them when searching for approved hooks. Olaf Davis | Talk 11:10, 15 July 2008 (UTC)
Support, I think it's a good solution; this way an editor accepting a nomination on good faith won't be liable to accusations of poor reviewing. You're right that the icon isn't ideal though - since it's effectively a "yes", it should ideally be green with a different symbol. That way editors will intuitively see them when looking for nominations that are ready to go. Now it's easy to confuse with "subst:DYK?" and "subst:DYK?no". Lampman (talk) 15:16, 15 July 2008 (UTC)
Yes, green is better. How about   - it's a similar colour to   and we can claim that we cunningly chose the plus to mean "this is ready but a reviewer with access to the reference could provide an additional check." :) Olaf Davis | Talk 18:02, 15 July 2008 (UTC)
Since we are likely to have many new and inexperienced people visiting our discussions, I think using the GA icon would cause confusion. Some people would associate DYK and GA, and we'd constantly be fielding questions and concerns over this. If we have to, we could always commission one of the talented icon-makers for a new "check-plus" and "check-minus" for our use. --EncycloPetey (talk) 18:25, 15 July 2008 (UTC)
I'm really liking the current AGF symbol, which is  . I think the relationship between a green check and a grey-blue check (green stronger than grey) is more intuitive than the relationship between a green check and a green plus sign. I suppose it's also a matter of personal preference, namely "do you believe the color green or the check mark symbol communicates approval better?" For me the check works better, and I think there's an inherent hierarchy between the vivid check and pale check that there isn't between the green plus sign and the green check sign. That said, if others really prefer the green plus sign I don't have strong objections. Either way, the addition of the AGF symbol is a great thing. Vickser (talk) 18:34, 15 July 2008 (UTC)
I was going to say much the same as Vickser. The grey-blue tick works very well for me—the tick seems to stand out nicely from the background. Using the "Good Article" sign would be confusing; it's too deeply ingrained with and associated with that area of Wikipedia. Anyway, I think we have found an elegant and useful solution. Hassocks5489 (tickets please!) 19:01, 15 July 2008 (UTC)
I guess you may be right about the GA symbol confusing people, although it's also used as an 'in favour' vote in some places. At the end of the day what matters is: are the reviewers going to miss the current AGF tick when scrolling down looking for ticks? Olaf Davis | Talk 22:23, 16 July 2008 (UTC)
  • Revisited: The now-resolved discussion over "first millionaire" Pierre Lorillard II - July 19 submissions, pretty much repeated at Talk:Pierre Lorillard II - is a relevant case here. The nominator has found over 15 reliable-looking online sources that turn out to be flat wrong, and moreover the correct fact was published in the hardly obscure OED in 1908. This too is in fact online to subscribers, and the original reference of 1826 is online at Project Gutenberg, but evidently these did not show up in Google. Good faith offline beats ok-looking online again! Johnbod (talk) 21:07, 20 July 2008 (UTC)