Module talk:Citation/CS1/Archive 6
This is an archive of past discussions about Module:Citation. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.
CS1 comparison testcases
We can have several pages of testcases for wp:CS1 cites, including:
- wp:CS1/test_parameters - list of cites to show each parameter
- wp:CS1/test_basics - list of cites to show basic, typical examples
- wp:CS1/test_problems - list of cites which test known problems/fixes
The massive complexity of the 430+ parameters in the wp:CS1 cite templates requires a large set of testcases, to provide some assurance of handling the astronomically huge set of endless combinations of rampant variations of parameter names. The testcases will provide a basic "sanity test" of the overall functionality, because the testing of all possible parameter groups would exceed the age of the universe, several times over. This is a typical case of combinatorial explosion: "the cite templates can be rewritten within 1 year with Lua script, but would require 90 billion years to completely test". The possible count of testcases starts with 430 factorial (430! ~= 2.2946e+947), or zillions of parameter combinations, where setting "first=" to blank might erase "author=x".
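As a rough check on the figure quoted above, the magnitude of 430! can be computed directly. This is only an illustrative Python sketch: the count of 430 comes from the text, while the comparison with 2**430 (counting only which parameters are present or absent) is an added assumption for context and is not a claim made in the discussion.

```python
import math

PARAMETER_COUNT = 430  # figure quoted in the discussion


def sci_notation(n: int, digits: int = 4) -> str:
    """Format a large positive integer as mantissa and exponent.

    Works on the decimal string so that huge values never pass through
    a float. The mantissa is truncated, not rounded.
    """
    s = str(n)
    exponent = len(s) - 1
    mantissa = s[0] + "." + s[1:1 + digits]
    return f"{mantissa}e+{exponent}"


orderings = math.factorial(PARAMETER_COUNT)  # 430!
subsets = 2 ** PARAMETER_COUNT               # assumption: present/absent choices only

print("430! is about", sci_notation(orderings))
print("2**430 is about", sci_notation(subsets))
```

Either count is far beyond what could be enumerated, which is the point being made: only a representative sample of combinations can ever be tested.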
Comparing the related templates: For each new Lua-based template named with "/lua", the original markup-based template will keep a permanent copy named "/old" so that results can be compared side by side, even after the Lua versions are installed under the current template names. For example:
- Template:Cite_web - {cite_web} live, whether markup or switched to Lua-based
- Template:Cite_web/lua - the Lua-based version of {cite_web}
- Template:Cite_web/old - the markup-based version of {cite_web} as the old copy
- Template:Cite_book/lua - the Lua-based version of {cite_book}
- Template:Cite_book/old - the markup-based version of {cite_book} as the old copy
- Template:Cite_journal/lua - the Lua-based version of {cite_journal}
- Template:Cite_journal/old - the markup-based version of {cite_journal} as copied
Again, the focus must be on confirming just the general parameters, with occasional variant spellings; otherwise, there would quickly be hundreds of thousands of parameter combinations. However, without some form of sanity check, the complexity of the CS1 cites would become impossible to handle. -Wikid77 (talk) 01:03/06:21, 23 February 2013 (UTC)
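The side-by-side comparison of "/old" and "/lua" output described above can be sketched as a small harness. This is a minimal Python illustration, not the actual test pages: `render_old` and `render_new` are hypothetical stand-ins for the two template versions, and the rule the new renderer uses to pick "p." versus "pp." is an assumption made only so that the two outputs differ.

```python
def compare_renderers(old_render, new_render, testcases):
    """Return (params, old_output, new_output) for every case that differs."""
    mismatches = []
    for params in testcases:
        old_out = old_render(params)
        new_out = new_render(params)
        if old_out != new_out:
            mismatches.append((params, old_out, new_out))
    return mismatches


# Hypothetical stand-in for the markup-based "/old" template.
def render_old(params):
    return f'"{params["title"]}". pp. {params["pages"]}.'


# Hypothetical stand-in for the Lua-based "/lua" template.
def render_new(params):
    pages = params["pages"]
    # Assumption: a value with no range or list separator is a single page.
    label = "pp." if any(sep in pages for sep in "-–,") else "p."
    return f'"{params["title"]}". {label} {pages}.'


TESTCASES = [
    {"title": "Single page", "pages": "12"},
    {"title": "Page range", "pages": "12–14"},
]

for params, old_out, new_out in compare_renderers(render_old, render_new, TESTCASES):
    print("DIFF for", params)
    print("  old:", old_out)
    print("  new:", new_out)
```

A harness like this only reports differences; deciding whether a difference is a regression or an intended fix still needs a human reading the two outputs.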
Volume bolding
Wikitext | {{cite encyclopedia |
---|---|
Live | LAST1, FIRST1; LAST2, FIRST2 (YEAR). "TITLE". In EDITOR (ed.). ENCYCLOPEDIA. Vol. VOLUME (EDITION ed.). LOCATION: PUBLISHER. pp. PAGES. ID. Retrieved 2006-07-02. {{cite encyclopedia}} : |volume= has extra text (help); Check date values in: |year= (help)CS1 maint: numeric names: authors list (link) CS1 maint: year (link) |
Sandbox | LAST1, FIRST1; LAST2, FIRST2 (YEAR). "TITLE". In EDITOR (ed.). ENCYCLOPEDIA. Vol. VOLUME (EDITION ed.). LOCATION: PUBLISHER. pp. PAGES. ID. Retrieved 2006-07-02. {{cite encyclopedia}} : |volume= has extra text (help); Check date values in: |year= (help)CS1 maint: numeric names: authors list (link) CS1 maint: year (link) |
No bolding on the volume? |
Is it intentional to remove the bolding on the volume of an encyclopedia? Dragons flight (talk) 03:41, 12 March 2013 (UTC)
- Looks like only volume numbers four characters or less are bolded. --— Gadget850 (Ed) talk 15:26, 12 March 2013 (UTC)
Wikitext | {{cite encyclopedia |
---|---|
Live | LAST1, FIRST1; LAST2, FIRST2 (YEAR). "TITLE". In EDITOR (ed.). ENCYCLOPEDIA. Vol. 1234 (EDITION ed.). LOCATION: PUBLISHER. pp. PAGES. ID. Retrieved 2006-07-02. {{cite encyclopedia}} : Check date values in: |year= (help)CS1 maint: numeric names: authors list (link) CS1 maint: year (link) |
Sandbox | LAST1, FIRST1; LAST2, FIRST2 (YEAR). "TITLE". In EDITOR (ed.). ENCYCLOPEDIA. Vol. 1234 (EDITION ed.). LOCATION: PUBLISHER. pp. PAGES. ID. Retrieved 2006-07-02. {{cite encyclopedia}} : Check date values in: |year= (help)CS1 maint: numeric names: authors list (link) CS1 maint: year (link) |
- Intentional non-bolded longer volume names: For years, there had been suggestions to unbold the volume name when using a volume-name title, and so beyond 4-character length, it inserts a dot and omits the prior bolding, as in "Volume III: Garrish to Nominal", because the bolded name had appeared too garish, too excessive, in many current articles. In fact, the unbolded volume was requested, again, on 21 February 2013, in the above thread "#series/volume/publisher order". For the markup-based templates, a rapid {padleft} can be used to detect and unbold beyond 5-character volume names. -Wikid77 11:19, 13 March 2013 (UTC)
- How was the 4-character limit derived? I see your objective, but I don't think this gives the right answer for |volume=XXVIII or |volume=55–56 for journal cites. Rjwilmsi 15:26, 13 March 2013 (UTC)
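The length rule discussed in this thread can be illustrated with a short Python sketch. It is not the module's Lua code: the function name and the use of wikitext triple apostrophes for bold are assumptions, the threshold of 4 is the one reported above, and the extra dot mentioned for longer names is deliberately left out.

```python
BOLD_MAX_LENGTH = 4  # threshold reported in the thread


def format_volume(volume: str) -> str:
    """Bold short volume values; leave longer volume names plain."""
    volume = volume.strip()
    if len(volume) <= BOLD_MAX_LENGTH:
        return f"'''{volume}'''"  # wikitext bold markup
    return volume


for example in ["12", "1234", "XXVIII", "55–56", "Volume III: Garrish to Nominal"]:
    print(repr(example), "->", format_volume(example))
```

Under this rule both of the journal-style values raised in the question, "XXVIII" (six characters) and "55–56" (five characters), lose their bolding, which is the behaviour being queried.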
Editor problem
Wikitext | {{cite book |
---|---|
Live | Playfair, Major-General I.S.O.; Stitt, Commander G.M.S; Molony, Brigadier C.J.C.; Toomer, Air Vice-Marshal S.E. (2004) [1st. pub. HMSO:1954]. Butler, J.R.M (ed.). Mediterranean and Middle East Volume I: The Early Successes Against Italy (to May 1941). History of the Second World War, United Kingdom Military Series. Uckfield, UK: Naval & Military Press. ISBN 1-845740-65-3. {{cite book}} : Unknown parameter |lastauthoramp= ignored (|name-list-style= suggested) (help) |
Sandbox | Playfair, Major-General I.S.O.; Stitt, Commander G.M.S; Molony, Brigadier C.J.C.; Toomer, Air Vice-Marshal S.E. (2004) [1st. pub. HMSO:1954]. Butler, J.R.M (ed.). Mediterranean and Middle East Volume I: The Early Successes Against Italy (to May 1941). History of the Second World War, United Kingdom Military Series. Uckfield, UK: Naval & Military Press. ISBN 1-845740-65-3. {{cite book}} : Unknown parameter |lastauthoramp= ignored (|name-list-style= suggested) (help) |
Incorrect labeling on the editor |
The Lua version replaces the "X ed." editor marker with a nonsensical "In X" expression. Dragons flight (talk) 03:49, 12 March 2013 (UTC)
- Document collections use "In Editor" format: Some users have preferred the format as "In Editor" rather than "Editor, ed." and so that is why it has been displayed. Because wp:CS1 style is a hodge-podge of cite styles, the Lua module was originally written to use a few styles for all citations, rather than mimic each of the prior 23 {cite_*} fork templates. -Wikid77 11:19, 13 March 2013 (UTC)
- PS. There is also a change to the author list, where the old version had an ampersand. Dragons flight (talk) 03:54, 12 March 2013 (UTC)
Changes in page / date handling for cite news
Wikitext | {{cite news |
---|---|
Live | "Auction Record for an Original 'Alice'". The New York Times. 11 December 1998. p. B30. |
Sandbox | "Auction Record for an Original 'Alice'". The New York Times. 11 December 1998. p. B30. |
This is a case where the new version is different, but not necessarily wrong (i.e. both approaches seem basically reasonable). The label on the page number and the placement of the publication date appear to have changed in the handling of cite news. I assume this was probably intentional, since it seems like too large a change to be accidental. However, I tried skimming this page and didn't find any discussion of this, so I thought I would highlight it. Dragons flight (talk) 15:47, 12 March 2013 (UTC)
- That's a change to cite news, but it doesn't seem to discuss the page and date rearranging. The example given doesn't use the agency or location fields. Dragons flight (talk) 15:55, 12 March 2013 (UTC)
- My bad. Not sure how I connected this. --— Gadget850 (Ed) talk 17:02, 12 March 2013 (UTC)
- Reset page format as "p." for Cite_news: There had been an overuse of the colon ":" page format, and so I changed when config.CitationClass is "news" to use the p./pp. page-number format. -Wikid77 (talk) 10:13, 13 March 2013 (UTC)
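The change described above, choosing the page-number style from the citation class, can be sketched as follows. This is a hedged Python illustration rather than the module's Lua: the function name and arguments are hypothetical, "news" comes from the comment above, "book" from the later note that {cite_book} uses the p./pp. format, and the colon style for every other class is an assumption.

```python
# Classes that use the "p."/"pp." prefix; everything else is assumed
# to keep the colon style for this illustration.
PREFIX_CLASSES = {"news", "book"}


def format_pages(citation_class, page=None, pages=None):
    """Return the page fragment for a citation, or "" if none was given."""
    value = page or pages
    if not value:
        return ""
    if citation_class in PREFIX_CLASSES:
        label = "p." if page else "pp."
        return f"{label} {value}"
    return f": {value}"


print(format_pages("news", page="B30"))
print(format_pages("book", pages="32–4"))
print(format_pages("journal", pages="1–9"))
```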
Test cases
Are we, through the current process, developing a (near-) comprehensive suite of test cases? Should they be captured and documented for future use? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 16:35, 12 March 2013 (UTC)
- Expanding representative sets of testcases: The goal is to expand the various pages, of numerous testcases, as issues are noted in importance, such as testcase essay "wp:CS1/test_problems". See above: "#CS1 comparison testcases". The tactic has been to view the pages during a run-preview when editing the Lua Module:Citation/CS1, so the testcases need to be kept limited, at first, so the pages are not too large to view during a run-preview. Because there are potentially unlimited billions of billions of parameter combinations, I expect the testcases to be expanded for years. The complete testing of parameters would exceed the age of the universe, many times over as a combinatorial explosion of parameter choices. -Wikid77 10:13, 13 March 2013 (UTC)
Thanks to everyone in 3-year upgrade of CS1 cites
As we can finally see the light at the end of the fast-cite tunnel, with {cite_news} being converted to Lua, I want to take a minute to thank everyone for observing, and debating, and analyzing, and rewriting or fixing the wp:CS1 templates to run much faster and smarter, in both markup and Lua versions. Although the major slowness had been extensive use of {cite_web}, {cite_book}, {cite_journal}, and {cite_news}, the remainder of the 23 {cite_*} forks can still benefit from enhancements to the cite formats, such as fixing some double-dot ".." problems, even in the markup-based cite templates. The core markup Template:Citation/core will continue to be used to support the other {cite_*} forks, which have not been converted to Lua, as well as supporting the long-term comparisons with {cite_web/old}, {cite_book/old}, {cite_journal/old}, and {cite_news/old}, etc. In fact, as the major cites are converted to use Lua, then {Citation/core} can afford to run a little slower, to provide better formatting, for the remaining few cases of other {cite_*} forks which do not use Lua yet. Anyway, the overall improvements have involved many people, in debates and suggestions as well as template/module changes and testing, so let's take a minute to thank everyone for helping, in this 3-year (or longer) transition to better CS1 cite templates. -Wikid77 (talk) 06:24, 19 March 2013 (UTC)
Testing Cite_book and date/page
The template {cite_book} should put parameter "others=" (illustrator) before "edition=" as in {cite_book/old}. Like {cite_news}, the place/publisher should not use parentheses "(__)". Although previously ignored, {cite_book} should put quotation marks around the parameter "title=" when having "journal=" or "periodical=" or "magazine=" or "work=" (yet rarely used). When there is an author/editor, "date=" should follow it in "(...)"; when there is no author/editor and only "title=", the date should precede the page number. Also, {cite_book} uses the p./pp. page format.
Wikitext | {{cite book |
---|---|
Live | Test Cite_book title only. Illustrated by J. Doe (2nd ed.). 1 May 1998. pp. 32–4.{{cite book}} : CS1 maint: others (link) |
Sandbox | Test Cite_book title only. Illustrated by J. Doe (2nd ed.). 1 May 1998. pp. 32–4.{{cite book}} : CS1 maint: others (link) |
Wikitext | {{cite book |
---|---|
Live | Test Cite_book title+work. 7 June 2012. p. 163. {{cite book}} : |work= ignored (help) |
Sandbox | Test Cite_book title+work. 7 June 2012. p. 163. {{cite book}} : |work= ignored (help) |
Placement of other parameters has been shifted, slightly different from {cite_book/old}. All parameters for {cite_book}:
Wikitext | {{cite book |
---|---|
Live | Author (Date) [Origyear]. "Chapter Name". Test Cite_book Parameters. Department (Type). Series (in Language). Vol. Volume. Others (Evening ed.). Place: Publisher. p. page. arXiv:ArXiv. ASIN ASIN. Bibcode:Bibcode. doi:10.DOI. ISBN Isbn. ISSN Issn. JFM JFM. JSTOR Jstor. LCCN LCCN. MR MR. OCLC OCLC. OSTI OSTI. PMC PMC. PMID PMID. RFC RFC. SSRN SSRN. Zbl ZBL. Id. Archived from the original (Format) on Archivedate. Retrieved Accessdate – via Via. Quote {{cite book}} : |author= has generic name (help); |page= has extra text (help); |volume= has extra text (help); Check |arxiv= value (help); Check |asin-tld= value (help); Check |asin= value (help); Check |bibcode= length (help); Check |doi= value (help); Check |isbn= value: invalid character (help); Check |issn= value (help); Check |jfm= value (help); Check |jstor= value (help); Check |lccn= value (help); Check |mr= value (help); Check |oclc= value (help); Check |osti= value (help); Check |pmc= value (help); Check |pmid= value (help); Check |rfc= value (help); Check |ssrn= value (help); Check |zbl= value (help); Check date values in: |accessdate= , |date= , and |archivedate= (help); Invalid |ref=harv (help); More than one of |pages= , |at= , and |page= specified (help); Unknown parameter |agency= ignored (help); Unknown parameter |chapterlink= ignored (help); Unknown parameter |coauthor= ignored (|author= suggested) (help); Unknown parameter |deadurl= ignored (|url-status= suggested) (help); Unknown parameter |doi_inactivedate= ignored (help); Unknown parameter |laydate= ignored (help); Unknown parameter |laysource= ignored (help); Unknown parameter |laysummary= ignored (help); Unknown parameter |subscription= ignored (|url-access= suggested) (help); Unknown parameter |titlelink= ignored (|title-link= suggested) (help); Unknown parameter |trans_title= ignored (|trans-title= suggested) (help); Unknown parameter |transcript= ignored (help); Unknown parameter |transcripturl= ignored (help)CS1 maint: unrecognized language (link) |
Sandbox | Author (Date) [Origyear]. "Chapter Name". Test Cite_book Parameters. Department (Type). Series (in Language). Vol. Volume. Others (Evening ed.). Place: Publisher. p. page. arXiv:ArXiv. ASIN ASIN. Bibcode:Bibcode. doi:10.DOI. ISBN Isbn. ISSN Issn. JFM JFM. JSTOR Jstor. LCCN LCCN. MR MR. OCLC OCLC. OSTI OSTI. PMC PMC. PMID PMID. RFC RFC. SSRN SSRN. Zbl ZBL. Id. Archived from the original (Format) on Archivedate. Retrieved Accessdate – via Via. Quote {{cite book}} : |author= has generic name (help); |page= has extra text (help); |volume= has extra text (help); Check |arxiv= value (help); Check |asin-tld= value (help); Check |asin= value (help); Check |bibcode= length (help); Check |doi= value (help); Check |isbn= value: invalid character (help); Check |issn= value (help); Check |jfm= value (help); Check |jstor= value (help); Check |lccn= value (help); Check |mr= value (help); Check |oclc= value (help); Check |osti= value (help); Check |pmc= value (help); Check |pmid= value (help); Check |rfc= value (help); Check |ssrn= value (help); Check |zbl= value (help); Check date values in: |accessdate= , |date= , and |archivedate= (help); Invalid |ref=harv (help); More than one of |pages= , |at= , and |page= specified (help); Unknown parameter |agency= ignored (help); Unknown parameter |chapterlink= ignored (help); Unknown parameter |coauthor= ignored (|author= suggested) (help); Unknown parameter |deadurl= ignored (|url-status= suggested) (help); Unknown parameter |doi_inactivedate= ignored (help); Unknown parameter |laydate= ignored (help); Unknown parameter |laysource= ignored (help); Unknown parameter |laysummary= ignored (help); Unknown parameter |subscription= ignored (|url-access= suggested) (help); Unknown parameter |titlelink= ignored (|title-link= suggested) (help); Unknown parameter |trans_title= ignored (|trans-title= suggested) (help); Unknown parameter |transcript= ignored (help); Unknown parameter |transcripturl= ignored (help)CS1 maint: unrecognized language (link) |
The "volume=x" will unbold when 5 or more characters. The correct placement for date in {cite_book} has only 2 styles: for date to follow author/editor, or when only "title=" to precede the page-number. The placement of parameter "others=" is a major issue, such as for name of illustrator. So, I have changed Module:Citation/CS1/sandbox for CitationClass "book" to show "others=" before the edition data as in {cite_book/old}, but for CitationClass "journal" to display "others=" after authors/editors and before the title as in {cite_journal/old}. I think that was the only major problem, and then {cite_book} should be ready to transition to Lua. -Wikid77 19:30, 20 March 2013 (UTC)
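The ordering rules stated above (date placement for books, and "others=" placement by citation class) can be sketched in Python. This is only an illustration under assumptions: the function and argument names are hypothetical, italics and quotation marks are omitted, and the real logic lives in the Lua Module:Citation/CS1/sandbox, which is not reproduced here.

```python
def format_cite(citation_class, title, author=None, date=None,
                others=None, edition=None, pages=None):
    """Join citation parts in the order described in the comment above."""
    parts = []
    if author:
        # With an author/editor, the date follows the name in parentheses.
        parts.append(f"{author} ({date})" if date else author)
    edition_text = f"({edition} ed.)" if edition else ""
    if citation_class == "journal":
        # Journal: "others" after authors/editors and before the title.
        if others:
            parts.append(others)
        parts.append(title)
        if edition_text:
            parts.append(edition_text)
    else:
        # Book: "others" after the title, directly before the edition.
        parts.append(title)
        others_and_edition = " ".join(p for p in (others, edition_text) if p)
        if others_and_edition:
            parts.append(others_and_edition)
    if date and not author:
        # Title-only case: the date precedes the page number.
        parts.append(date)
    if pages:
        parts.append(pages)
    return ". ".join(parts) + "."


print(format_cite("book", "Test Cite_book title only",
                  others="Illustrated by J. Doe", edition="2nd",
                  date="1 May 1998", pages="pp. 32–4"))
```

For the title-only call shown, the sketch yields the same text as the "title only" testcase earlier in this section: title, illustrator and edition, then date, then pages.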
Cite book does not support the periodical parameters, and I can't see the need. It does support chapter, which interacts badly with periodical:
Wikitext | {{cite book |
---|---|
Live | "Chapter". Title. 7 June 2012. p. 163. {{cite book}} : |work= ignored (help) |
Sandbox | "Chapter". Title. 7 June 2012. p. 163. {{cite book}} : |work= ignored (help) |
--— Gadget850 (Ed) talk 19:49, 20 March 2013 (UTC)
- Suggest to ignore illogical combinations: At this stage in transitioning to use Lua, I think we need to ignore the billions of illogical combinations of parameters, where it is unusual for a user to specify a periodical name, with article "title=" and then insist "chapter=" as well. We are currently past the point where changing Module:Citation/CS1, to fix one problem, is very likely to break another feature, among billions of parameter combinations. I did not design the overall Lua module, and I would have strongly suggested the Lua version should have closely mirrored Citation/core with 23 fork driver functions, rather than try to force the one Lua module to, internally, mimic the actions of 23 forks all combined into a mass of multiple if-conditions to block interactions among all the forks combined into the same logic flow. However, the original Lua module was even more complex and tried to combine those 23 forks, plus the Vancouver Vcite format, plus some {smallcaps} options of other cite formats, and the nightmare of instant "creeping featurism" has been the result. Also, note that Lua only allows 200 variable names within a single function, so there is a limit to having more parameters, and currently the multiple alias spellings are folded into a single variable name each. I am still worried that too many tangent issues will delay the release of the Lua-based templates. -Wikid77 (talk) 20:50, 20 March 2013 (UTC)
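The idea mentioned above of folding several alias spellings into a single variable can be sketched as below. This is an illustrative Python fragment, not the module's Lua tables: the alias pairs are spellings that appear in the testcase output on this page ("titlelink"/"title-link", "trans_title"/"trans-title", "accessdate"), while the choice of canonical name, the "access-date" spelling, and the first-non-empty precedence are assumptions.

```python
# Canonical name -> accepted spellings, in order of precedence (assumed).
ALIASES = {
    "title-link": ["title-link", "titlelink"],
    "trans-title": ["trans-title", "trans_title"],
    "access-date": ["access-date", "accessdate"],
}


def fold_aliases(args):
    """Collapse alias spellings so each value is held under one name."""
    folded = {}
    for canonical, spellings in ALIASES.items():
        for spelling in spellings:
            value = args.get(spelling, "").strip()
            if value:
                folded[canonical] = value
                break  # first non-empty spelling wins
    return folded


print(fold_aliases({"titlelink": "Alice", "accessdate": "2006-07-02"}))
```

Holding one value per canonical name keeps the number of variables tied to the number of distinct parameters rather than the number of spellings, which matters given the per-function limit on variable names noted above.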
Bibcode Colon, and the separators following various IDs
Omit the colon in "Bibcode:Bibcode" data: For option "bibcode=" there is a spurious colon added in the Lua version (during last year?). Other than that, I think the {cite_journal} users will be happy to have 6x faster cites in the medical/science articles. See: Pneumonia, Cancer, Cystic fibrosis:
- Run: {{#invoke:CiteConversionTest|test|Cancer}}
Most journal cites look almost identical in format with the Lua version. Like an echo. Like an echo. Everything else seems good to go. I think many scientists will not even notice the Lua version is formatting the {cite_journal} data. -Wikid77 (talk) 07:46, 21 March 2013 (UTC)
- I'm not actually sure it is accidental. If you look at the separators following the Lua identifiers, we see that:
- arXiv, Bibcode, and doi use colon (":")
- ASIN, ISBN, ISSN, JFM, JSTOR, LCCN, MR, OCLC, OSTI, PMC, PMID, RFC, SSRN, and Zbl use space (" ")
- I could easily believe that someone wanted to use space after all of the uppercase ones and ":" after all the mixed case ones, but somehow missed Zbl.
- The only difference between Lua and the older templates right now is that Bibcode was moved from using a space to using a colon.
- So, we have several possible options.
- We could leave the configuration as is.
- We could revert Bibcode to use space, matching the current templates, but leaving doi and arXiv as the odd ducks.
- We could convert Zbl to use ":", so all mixed case IDs use colon and all uppercase IDs use space.
- We could convert all of them to use ":" as a separator.
- We could convert all of them to use space as a separator.
- Personally, I think the choice of separators here is pretty unimportant, and none of these options would bother me, but if people have strong feelings one way or the other, it might be good to share. Dragons flight (talk) 19:04, 21 March 2013 (UTC)
- I am also ambivalent. The stand-alone {{Bibcode}} uses a colon. --— Gadget850 (Ed) talk 23:15, 21 March 2013 (UTC)
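The separator options listed above can be compared with a small Python sketch. The identifier names and their current separators are taken from the list in this thread; the function names are hypothetical, and the "case rule" encodes only the option in which mixed-case identifiers use a colon and all-uppercase identifiers use a space.

```python
COLON_IDS = ["arXiv", "Bibcode", "doi"]
SPACE_IDS = ["ASIN", "ISBN", "ISSN", "JFM", "JSTOR", "LCCN", "MR",
             "OCLC", "OSTI", "PMC", "PMID", "RFC", "SSRN", "Zbl"]


def current_separator(name):
    """Separator as described for the Lua version in this thread."""
    return ":" if name in COLON_IDS else " "


def case_rule_separator(name):
    """Option: all-uppercase labels use a space, all others a colon."""
    return " " if name.isupper() else ":"


def render_id(name, value, separator_for):
    return f"{name}{separator_for(name)}{value}"


# Which identifiers would change if the case rule were adopted?
changed = [name for name in COLON_IDS + SPACE_IDS
           if current_separator(name) != case_rule_separator(name)]
print("Changed under the case rule:", changed)
print(render_id("Zbl", "ZBL", current_separator))
print(render_id("Zbl", "ZBL", case_rule_separator))
```

Only Zbl differs between the two functions, which matches the observation that it is the one mixed-case label currently using a space.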
Transition Phase-6: Cite_web to Lua
As the 6th phase (of 9 major steps), we are ready for {cite_web} next. I think some editors are concerned that their "favorite" cite templates have not been transitioned yet, so we need to upgrade faster, to quicken most edit-previews. I have created typical testcases:
- Module_talk:Citation/CS1/test/web - testcases similar to {cite_news}
Again, based on the corrections and success of prior phases, each next phase becomes less risky because the one Lua Module:Citation/CS1 runs 99% the same for all cases. The transition of {cite_news} to Lua, on 19 March 2013, quickened many major pop-culture articles to edit-preview, or reformat, within 19 seconds (over 20% faster). Next, the transition of {cite_journal}, on 23 March, ran even faster, because many medical/science articles use mostly {cite_journal}, with numerous slow PMID/PMC, doi or Bibcode parameters, and the speed improvement was 2.5x faster for article edit-previews of many science articles. The more parameters used, the slower the cite, for Lua or especially markup. Also, there was a significant unlinking of Template:Citation/core afterward in another 114,000 pages, from the prior 1.74 million pages to 1.65 million, where the only cite templates in use had been {cite_journal} or also {cite_news}, as unlinked now. In this next, massive, transition phase, {cite_web} affects almost every remaining wp:CS1-style article, to rapidly quicken 1.6 million articles to speed most articles for edit-preview within 3-8 seconds. Expect almost 1 million articles to delink {Citation/core}. -Wikid77 (talk) 16:36, 24 March 2013 (UTC)
- Has the pdf export bug been sorted yet? We were asked to hold back until it was.--Salix (talk): 20:20, 24 March 2013 (UTC)
Completed -- Gadget850 (Ed) talk 14:00, 2 April 2013 (UTC)
Transition for Cite Book
Rather than following Wikid77's schedule, I'd prefer to convert {{cite book}} to Lua before {{cite web}}. Firstly, cite book has 400k uses rather than 1.3M for cite web, and I think it is better to work our way up to the really large one so we catch any additional issues. Secondly, I've already spent the time working up a set of test cases for cite book, Module talk:Citation/CS1/test/book, including fixing a couple of new bugs that hadn't been caught during the prior iterations. I think we are probably ready to convert {{cite book}} today, assuming that no one can point to any additional unresolved problems. By contrast, I haven't yet studied any cite web testcases, so I'm not personally confident on whether or not there are still additional bugs for that case. Dragons flight (talk) 20:32, 24 March 2013 (UTC)
- This sounds fine to me. --MZMcBride (talk) 05:51, 25 March 2013 (UTC)
- Avoid delays for superstitious fears: Sometimes, it can be difficult to make progress when dwelling on superstitions about user concerns, and the fear of making "mistakes" while delaying the deployment of improved templates. Always try to prioritize the cost/benefit analysis, to balance the delayed benefits which outweigh the cost of potential incompatible changes. Because {cite_web} is very similar to {cite_news} (but without "newspaper=" or "periodical="), and 99% of the Lua-based cites share the same Module:Citation/CS1, then in effect, the testing for {cite_news} already tested the majority of features for {cite_web}. Meanwhile, almost 4,900 articles will be improved when {cite_web} is transitioned to Lua, to no longer blank the "separator=" option, to cause run-together cite parameters in the References section of those 4,900 pages. More than 25,000 articles will then display the singular page "p. nn" to fix the common typo "pp. nn". Also, the Lua cites are restoring the COinS metadata, into 1.8 million articles, for processing by User:DASHBot to update dead-link URL address links. After {cite_web} is updated, then there is the need to restore the COinS metadata into Template:Citation/core, which can afford the extra 20% slower COinS processing because {Citation/core} will be only rarely used, after delinking from almost 1 million pages once {cite_web} is upgraded. Also, long-awaited enhancements to {Citation/core} can be added, even though slower, because the Lua-based cites will reduce the overall reformat time. Until {cite_web} is transitioned to Lua, then the required changes to {Citation/core} would trigger reformatting of all 1.6 million pages, which is likely to cause users to ask why Wikipedia is running so slow again. 
Hence, we need to avoid changing Module:Citation/CS1 continually, and instead, try to batch a set of changes in the /sandbox version and wait until several issues are collected, and then update Citation/CS1 for minor changes, combined, in a single update. In cases of emergency, then {cite_book/old} or {cite_web/old} could be used to provide the prior formatting, until all related changes are collected within the Lua /sandbox version for installation in the live Citation/CS1. Prioritize whether a problem can be handled by using {cite_web/old} versus a Lua update which will reformat over 800,000 pages for those small changes, at this point. -Wikid77 (talk) 08:19, 25 March 2013 (UTC)
- "Superstitious", really? Every round of this I've personally found a few more bugs not caught in the previous round. In addition to that, I woke up today to find another half-dozen new bug reports from other users. Lua is an unambiguous improvement for performance and also includes many formatting bug fixes, but that doesn't mean we should rush it into deployment without studying each major version for possible regressions. You've complained several times now that this deployment is taking too long. Frankly, I've started to find those complaints a bit annoying. For a project like this where the number of deployed uses is very large and the set of possible configurations is too enormous to comprehensively validate, I think we have actually been moving quite quickly over the last couple weeks. You are perfectly free to try and convince some other admin to move more quickly, but as long as I am the one actually doing the installation, then I plan to move at a pace that I am comfortable with. In the mean time, your help reporting, analyzing, and fixing bugs would be appreciated. Dragons flight (talk) 16:23, 25 March 2013 (UTC)
- No way to thank people enough: I am sorry when being so busy, I don't even have time for proper explanations. Even trying to thank you, and Gadget850, for sacrificing the prior weeks to test and update the markup and Lua versions to match, cannot adequately emphasize the impact of the cite-transition efforts, as one of the greatest performance improvements in the history of Wikipedia. The profound impact is not just the instant fixes of over 100,000 clerical errors (such as "Inc.." or awkward "pp." for singular page), nor the 6x faster edit-preview of citation sections, nor even the re-addition of the COinS metadata to reconnect dead-link sources to archive URLs, but the overall impact is the system-wide reduction in resource usage, to quicken most major articles as reformatting 2x-3x faster. If the Top 1000 articles formerly needed 5 hours to reformat, then those can be re-displayed now, in perhaps just over 1 hour. Also, the Lua-based cites will be the "crowning achievement" in the initial transition to Lua (for both the speed impact as well as supporting over 430 parameter options). However, there is always the potential for complaints, not just for overlooked discrepancies, but also for people asking why the miraculous transition did not happen sooner. The old adage warns, "The squeaky wheel gets the oil", even if 1.6 million wheels must wait before being allowed to move faster. Hence, it is important to keep the overall transition effort moving forward. After {cite_web} has been transitioned to Lua, then {Citation/core} will be delinked from nearly 1 million pages, and any further upgrades can reformat those million pages 2x-3x faster with Lua. Most likely, with the upcoming 2x-faster Scribunto upgrade, then Lua cites will be considered as speeding most major articles to edit-preview beyond 3x faster.
At this point, any new features introduced into the wp:CS1 cites can be deployed, and reformatted, into those 1.8 million pages as 3x faster than ever before. Only the minor {cite_*} forks, among 23 variations, will continue to use {Citation/core}, and their slow reformatting (14/second) will become negligible. Anyway, I am hoping more editors will come to help to discuss concerns, and offer better solutions, but we do not want to "stop the presses" to focus on only the "squeaky wheels" among the rest of 1.8 million, waiting to move faster and cleaner. -Wikid77 (talk) 22:55, 25 March 2013 (UTC)
- NO. DON'T force unfinished software on users. this isn't a playground or test bed for your favorite software projects. clean out the bugs before you present software as production-ready. this nonsense about "superstitions" (?) has to stop. this is insulting to editors who try to workaround your fumbles and to readers who get inconsistent presentation. you can either do it right or you can't. i'm sure editors won't appreciate spending valuable time to beta-test your code. typical is the nonsense about "changing" the script "continually". are you suggesting that bugs be left in the software just so you can move on to something else? or the nonsense about going back to the old system as a "patch" on an as-needed basis? where did you get these ideas about software development? this is getting ridiculous. but i can see that you are another one who thinks that buggy software is better if it presents the bugs faster. 70.19.122.39 (talk) 12:48, 25 March 2013 (UTC)
- Lua version is greatly improved and tested, not buggy playground: There have been extensive tests of the Lua-based cites, for over 6 months, which also show the numerous improvements compared to the prior markup-based cites. The word "superstitious" refers to the idea that the Lua-based cites are worse, rather than many times better than the markup cites. Many problems with the prior cite templates have been fixed with the Lua version, such as removing double-dots ".." after "Inc.." or author initials. There are almost 5,000 articles which incorrectly omit the dots (or commas) between parameters, and the Lua version will fix those articles as well. Over 25,000 articles will be fixed to show "p." (rather than "pp.") for a singular page number. Beyond all those improvements, the COinS metadata will be restored, in 1.8 million pages, for DASHbot to automatically insert the archive URL where a dead-link URL has been used. The fact that the Lua-based cites run almost 9x faster than the markup-based cites, from last year, was not even mentioned in the above paragraph, but that is another improvement, where users will be able to insert and preview new citations 9x faster than before. So the Lua-based cites will help our editors improve the appearance and addition of citation footnotes. I am sorry that you imagined the Lua version was "buggy software" and I hope I have explained how the opposite is true. -Wikid77 (talk) 14:38, 25 March 2013 (UTC)
- and on what do you base your statement that lua "is many times better"? do you have any real-world, objective, properly benchmarked PROOF? because if not, this is just your particular "superstition". but this is not the point, lua is a given. i'm just illustrating how fantastic your "logic" is.
- you misrepresent behavioral (human) errors, like the mistaken use of "Inc.", as coding errors of the markup version. the first line of defence against behavioral errors is proper, unambiguous documentation written in simple, non-technical language, that anticipates such problems.
- similarly for the use of "pages". but maybe you think that documentation is not part of a software project. now it seems that lua can better handle the string manipulations involved, but this should still be a last-ditch solution.
- you misrepresent the re-entry of COInS functionality, as if it was an invention of the lua system or as if its previous implementation was breaking something. the first is clearly not the case, the second was never proven.
- as for your assertion that the referred-to implementation was not buggy, let's just say that you are being funny. but only in order to avoid much stronger language.
- afaic all these statements of yours raise more unflattering questions regarding the implementation of this important project. thankfully others involved seem to have a better grasp of things and are more responsive and reality-based. 70.19.122.39 (talk) 01:08, 26 March 2013 (UTC)
- The Lua templates are working quite well. We have done a lot of regression testing, but with our small team some of the lesser used parameters and a lot of odd uses were missed. As we deploy the Lua templates, these issues are being reported and are quickly resolved. We welcome the reports of any specific problems. --— Gadget850 (Ed) talk 14:48, 25 March 2013 (UTC)
- you shouldn't welcome reports of problems. production code shouldn't have any. imo, you clearly had not done enough testing. if you insist on calling editors' prerogative (in their efforts to make content understandable) "odd uses" then imo you should not be involved in software that is there to assist humans. it is looking at it from the wrong perspective. 70.19.122.39 (talk) 01:08, 26 March 2013 (UTC)
- Aiming for Zarro boogs, are we?
Completed -- Gadget850 (Ed) talk 14:02, 2 April 2013 (UTC)
Transition for Cite Web
I think we are about ready for the big one. I would like to transition {{cite web}}, used on 1.3 million pages, later today. The current test cases page for cite web is Module talk:Citation/CS1/test/web. After this one, the vast majority of citations on Wikipedia will be using Lua. Dragons flight (talk) 20:33, 28 March 2013 (UTC)
- {{cite web}} has now been deployed using Lua. Dragons flight (talk) 00:53, 29 March 2013 (UTC)
- Deployment of {cite_web} for mega-scale improvements: Thank you for transitioning Template:Cite_web to Lua, which is currently fixing over 4,900 articles which had omitted the dot "." separator between thousands of parameters, and correcting the singular page "pp.n" to show "p." in over 25,000 articles, plus fixing "inc.." etc. I have verified the instant 6x-faster cite speed improvement, when editing pop-culture articles which edit-preview, now, within 7 seconds, as 2x-3x faster. This focus on mega-scale improvements is needed to avoid tangent delays to debate rare parameters used in less than a 1-in-10,000 fraction of all cites. The articles currently re-edited, by the "101,000" daily editors, will reduce the reformat backlog of the 1.3 million {cite_web} pages, among the 1.8 million {cite_*} pages. Overall reformatting has been somewhat slow, where {Citation/core} has not been further delinked much yet in the past 15 hours, despite many thousands of pages using only {cite_web}. The category for lone accessdate (no URL) has increased by over 12% to exceed 45,590 pages. Also, perhaps 10% of articles might not delink for over 4 days, when the reformatting is purposely delayed to balance the server's wp:Job_queue. However, we should focus on transitioning Template:Citation (style wp:CS2), as the next mega-scale effort, to revise a million cite parameters as used in "95,261" pages, many without {cite_*} due to {citation} showing the comma separator, as exclusive citation style wp:CS2, but still showing typos as singular page "pp.n" and running 6x slower w/o Lua. -Wikid77 (talk) 15:27, 29 March 2013 (UTC)
Completed -- Gadget850 (Ed) talk 14:06, 2 April 2013 (UTC)
Cite web, url, and archiveurl
It has come to my attention that for {{cite web}}, specifying archiveurl= without a url= is historically allowed:
Wikitext | {{cite web
|
---|---|
Live | House of Lords (21 November 2000). "Science and Technology - Sixth Report". UK Parliment. {{cite web}} : |archive-url= requires |url= (help); Missing or empty |url= (help)
|
Sandbox | House of Lords (21 November 2000). "Science and Technology - Sixth Report". UK Parliment. {{cite web}} : |archive-url= requires |url= (help); Missing or empty |url= (help)
|
For all other citation templates, however, it appears to be an error:
Wikitext | {{cite journal
|
---|---|
Live | House of Lords (21 November 2000). "Science and Technology - Sixth Report". UK Parliment. {{cite journal}} : |archive-url= requires |url= (help)
|
Sandbox | House of Lords (21 November 2000). "Science and Technology - Sixth Report". UK Parliment. {{cite journal}} : |archive-url= requires |url= (help)
|
So, is this behavior we need to replicate, and if so, why does it work this way? Dragons flight (talk) 03:07, 29 March 2013 (UTC)
- That uses the same markup as the other templates, and the error check is in core. That is a bug in the old version, but I don't see the problem right off. --— Gadget850 (Ed) talk 10:15, 29 March 2013 (UTC)
- When 'url' is fed into 'IncludedWorkURL' then {{citation/core}} is not throwing the error as intended. This affects other templates such as {{cite conference}}. This is a bug in core, but I will have to dig into it later to see what is going on. --— Gadget850 (Ed) talk 10:25, 29 March 2013 (UTC)
deadurl = no
Wikitext | {{cite web
|
---|---|
Live | House of Lords (21 November 2000). "Science and Technology - Sixth Report". UK Parliment. {{cite web}} : |archive-url= requires |url= (help); Missing or empty |url= (help); Unknown parameter |deadurl= ignored (|url-status= suggested) (help)
|
Sandbox | House of Lords (21 November 2000). "Science and Technology - Sixth Report". UK Parliment. {{cite web}} : |archive-url= requires |url= (help); Missing or empty |url= (help); Unknown parameter |deadurl= ignored (|url-status= suggested) (help)
|
Wikitext | {{cite journal
|
---|---|
Live | House of Lords (21 November 2000). "Science and Technology - Sixth Report". UK Parliment. {{cite journal}} : |archive-url= requires |url= (help); Unknown parameter |deadurl= ignored (|url-status= suggested) (help)
|
Sandbox | House of Lords (21 November 2000). "Science and Technology - Sixth Report". UK Parliment. {{cite journal}} : |archive-url= requires |url= (help); Unknown parameter |deadurl= ignored (|url-status= suggested) (help)
|
deadurl = yes
Wikitext | {{cite web
|
---|---|
Live | House of Lords (21 November 2000). "Science and Technology - Sixth Report". UK Parliment. {{cite web}} : |archive-url= requires |url= (help); Missing or empty |url= (help); Unknown parameter |deadurl= ignored (|url-status= suggested) (help)
|
Sandbox | House of Lords (21 November 2000). "Science and Technology - Sixth Report". UK Parliment. {{cite web}} : |archive-url= requires |url= (help); Missing or empty |url= (help); Unknown parameter |deadurl= ignored (|url-status= suggested) (help)
|
Wikitext | {{cite journal
|
---|---|
Live | House of Lords (21 November 2000). "Science and Technology - Sixth Report". UK Parliment. {{cite journal}} : |archive-url= requires |url= (help); Unknown parameter |deadurl= ignored (|url-status= suggested) (help)
|
Sandbox | House of Lords (21 November 2000). "Science and Technology - Sixth Report". UK Parliment. {{cite journal}} : |archive-url= requires |url= (help); Unknown parameter |deadurl= ignored (|url-status= suggested) (help)
|
I've updated all of these cases to give behavior consistent with the preexisting templates. However, I'm not really sure that setting up the error handling in this way makes sense. Dragons flight (talk) 05:02, 29 March 2013 (UTC)
- The old core checking has a bug as noted above. All templates should use consistent checking. --— Gadget850 (Ed) talk 10:29, 29 March 2013 (UTC)
- If a URL is dead, but an archive has already been provided, I'm not actually sure why it is ever an error to omit the original URL. All of the widely used archive services record the original URL, so it would still be available that way, but having a link there when we are confident that it won't work seems rather pointless. Dragons flight (talk) 18:38, 30 March 2013 (UTC)
- (edit conflict): It is better I think to be consistent across all of the CS1 cites. Treating {{cite web}} differently from all of the others just makes for ugly code and editors who will complain about this cite format being different from that format. Make {{cite web}} the same as all of the others.
- Beware triggering new error messages in thousands of articles: At this point, each new error message should be considered as a "retroactive law" to upset prior use of parameters. I think most readers would consider a page "ugly" which contains new error messages formerly not there before the Lua cites. Instead, a hidden maintenance category can be used to determine the extent of the supposed error conditions. For the use of "archiveurl=" without "url=" there has been the concern of promoting one URL, versus the other URL, in nations where some web-archive sites are considered to be a clear copy-vio of the original webpage, and linking the archive without the original URL could be considered favoritism towards the "copyvio" website. -Wikid77 (talk) 15:27, 29 March 2013 (UTC)
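The two options debated in this thread (a visible error message versus a hidden maintenance category) can be sketched as below. This is a minimal Python illustration, not the module's Lua code; the function name, the `visible_errors` switch, and the category name are all hypothetical.

```python
# Hypothetical category name, used only for this illustration.
TRACKING_CATEGORY = "Citations with archiveurl but no url (hypothetical)"


def check_archive_url(params, visible_errors=True):
    """Check one citation's parameters for an archive URL lacking an original URL.

    Returns a (messages, categories) pair: messages would be shown to readers,
    categories would only be attached to the page for maintenance.
    """
    url = (params.get("url") or "").strip()
    archive_url = (params.get("archiveurl") or "").strip()
    messages, categories = [], []
    if archive_url and not url:
        if visible_errors:
            # Same check for every template, as argued for consistency.
            messages.append("|archiveurl= requires |url=")
        else:
            # Quieter alternative: measure the extent of the problem first.
            categories.append(TRACKING_CATEGORY)
    return messages, categories
```

Applying the same function to every template gives the consistent checking asked for above, while the switch leaves open whether readers ever see the message.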
Trans title with no title
Wikitext | {{cite book
|
---|---|
Live | Doe, John (1965). Neverland: Foreign Books. {{cite book}} : Missing or empty |title= (help); Unknown parameter |trans_title= ignored (|trans-title= suggested) (help)
|
Sandbox | Doe, John (1965). Neverland: Foreign Books. {{cite book}} : Missing or empty |title= (help); Unknown parameter |trans_title= ignored (|trans-title= suggested) (help)
|
Noting a bug. Dragons flight (talk)
- Agree- the old behavior is a bug. --— Gadget850 (Ed) talk 15:06, 29 March 2013 (UTC)
- Actually, I would say that both sides are a bug. We shouldn't be ignoring translated titles when given without a title as that causes information to vanish from preexisting citations. However, it also isn't good to have only translated titles specified without specifying the original title. My intention is to fix the display format to include the translated title and add a tracking category for translations lacking original text. Dragons flight (talk) 15:42, 29 March 2013 (UTC)
Okay, I've added the translated title back into the display, but tagged it with Category:Pages with citations using translated terms without the original when there is a translation but no original. Dragons flight (talk) 16:24, 29 March 2013 (UTC)
- That seems better, to keep what they show. -Wikid77 (talk) 16:29, 29 March 2013 (UTC)
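The fix described here (keep showing the translated title, but tag the page when the original title is missing) could look roughly like the following Python sketch. The function name and the square-bracket display are assumptions for illustration; only the category name is taken from the discussion above.

```python
CATEGORY = "Pages with citations using translated terms without the original"


def format_title(title, trans_title):
    """Return (display_text, categories) for a title and its translation."""
    title = (title or "").strip()
    trans = (trans_title or "").strip()
    categories = []
    if title and trans:
        text = f"{title} [{trans}]"
    elif title:
        text = title
    elif trans:
        # Keep the information visible, but track the missing original.
        text = f"[{trans}]"
        categories.append(CATEGORY)
    else:
        text = ""
    return text, categories
```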
Transition Phase-7: Template:Citation
The next major cite is Template:Citation, used repeatedly in over 95,000 pages. Although technically called "wp:CS2", I created Template:Citation/lua to use CS1's Module:Citation/CS1, on the reasoning that the module functions as {Citation/core} for both CS1/CS2 styles, so it could be claimed that CS2 is a variation under CS1, and hence covered by the same Lua module. I just wanted to avoid Vancouver style cites, already handled quickly by Template:Vcite. Anyway, the first minor problems are:
- The {cite compare} needs to handle "mode=citation".
- Parameter "accessdate=" is hidden when "doi=" but no "url=" parameter.
The general format is:
The results of {cite_compare} show:
Wikitext | {{citation
|
---|---|
Live | Doe, John (29 March 2013), Try {citation}, vol. II (2nd ed.), London: Acme, doi:10.555 {{citation}} : |access-date= requires |url= (help); Check |doi= value (help)
|
Sandbox | Doe, John (29 March 2013), Try {citation}, vol. II (2nd ed.), London: Acme, doi:10.555 {{citation}} : |access-date= requires |url= (help); Check |doi= value (help)
|
These are the first concerns. -Wikid77 (talk) 16:29, 29 March 2013 (UTC)
- I added some redirects to allow {{cite compare}} to work as expected. Dragons flight (talk) 17:30, 29 March 2013 (UTC)
- Vancouver is not well used. The editor who was the main proponent is inactive and I suspect it will not grow past the current articles. Bottom line: don't worry about updating it. --— Gadget850 (Ed) talk 18:05, 29 March 2013 (UTC)
- Creating /test/citation testcases: Since {citation} looks like {cite_book} with commas, I have created Module_talk:Citation/CS1/test/citation from the /test/book page, but adding "coauthors=" data. Again, if anyone asks "Why named /CS1?" then reply that CS2 is a citation sub-style of CS1, from the days when they were both based on {Citation/core}. -Wikid77 00:39, 30 March 2013 (UTC)
Citation and page / pages
ditto All of the new CS1 templates are set to the reverse (page= overrides pages=). Is it acceptable to do the same with {{citation}}? Dragons flight (talk) 17:27, 1 April 2013 (UTC)
- It's a mistake to have both set (and articles that make that mistake should be tossed into a maintenance category). My guess is that most instances of this happen with a book where pages= is (incorrectly) the total number of pages of the book, and page= is the actual citation, so for this case having page= take priority is correct. —David Eppstein (talk) 17:37, 1 April 2013 (UTC)
- Ditto. And the old CS1 templates have the order of hierarchy of 'page', 'pages', 'at'. -- Gadget850 (Ed) talk 17:56, 1 April 2013 (UTC)
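The precedence mentioned in this thread ('page', then 'pages', then 'at'), together with the suggestion to track articles that set more than one, can be sketched in Python as follows. The function name and return shape are hypothetical and are not taken from the module.

```python
def pick_page_locator(params):
    """Pick the in-source locator using the order page > pages > at.

    Returns (name, value, conflict); conflict is True when more than one
    of the three parameters is set, so the page could be placed in a
    maintenance category.
    """
    order = ("page", "pages", "at")
    given = [name for name in order if (params.get(name) or "").strip()]
    if not given:
        return None, None, False
    chosen = given[0]
    return chosen, params[chosen].strip(), len(given) > 1
```

For a book where pages= wrongly holds the total page count, `pick_page_locator({"page": "12", "pages": "350"})` picks page 12 and reports the conflict.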
The Mysterious Place
Wikitext | {{cite book
|
---|---|
Live | Jones, John (1956). Written at Seattle, Washington. My Book. New York: Books 'R' US. |
Sandbox | Jones, John (1956). Written at Seattle, Washington. My Book. New York: Books 'R' US. |
Specify both place= and publication-place= in cite book |
Wikitext | {{cite book
|
---|---|
Live | Jones, John (1956). My Book. Seattle, Washington: Books 'R' US. |
Sandbox | Jones, John (1956). My Book. Seattle, Washington: Books 'R' US. |
Specify only place= in cite book |
Wikitext | {{citation
|
---|---|
Live | Jones, John (1956), written at Seattle, Washington, My Book, New York: Books 'R' US |
Sandbox | Jones, John (1956), written at Seattle, Washington, My Book, New York: Books 'R' US |
Specify both place= and publication-place= in citation |
Wikitext | {{citation
|
---|---|
Live | Jones, John (1956), My Book, Seattle, Washington: Books 'R' US |
Sandbox | Jones, John (1956), My Book, Seattle, Washington: Books 'R' US |
Specify only place= in citation |
So {{citation}} has an extra field place= that means the same as publication-place= if publication-place= is not specified, but has a different meaning if both are specified. To make things worse, the old templates also allow both parameters, but historically place= overrides publication-place= while we are presently doing the reverse.
Given this situation, I'm tempted to make all of the templates match the behavior of these parameters in {{citation}}. Any thoughts / comments? Dragons flight (talk) 17:57, 1 April 2013 (UTC)
- Give in to the temptation. -- Gadget850 (Ed) talk 18:27, 1 April 2013 (UTC)
- agreed that "place"/"publication-place" should follow the "date"/"publication-date" convention. but reserve the use of colons for "publication-place" only. otherwise it (a) may cause people to confuse author location with publisher location (b) may undermine confidence in the citation system, especially if nit-pickers fail to verify the presumed publisher location detail. 70.19.122.39 (talk) 00:51, 2 April 2013 (UTC)
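The {{citation}} behaviour described above, which the thread proposes applying to all templates, can be summarised in a short Python sketch. The function and key names are hypothetical; the logic follows the four comparison examples only.

```python
def resolve_places(place, publication_place):
    """Decide how place= and publication-place= are displayed.

    With both set, place= is where the work was written and
    publication-place= goes before the publisher. With only one set,
    that value is treated as the publication place.
    """
    place = (place or "").strip()
    pub = (publication_place or "").strip()
    if place and pub:
        return {"written_at": place, "publication_place": pub}
    return {"written_at": "", "publication_place": pub or place}
```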
Migration for Citation template
I have gone ahead and migrated {{citation}} to use Lua. Test cases can be seen as Module talk:Citation/CS1/test/citation. This is the last of the major citation templates (with 95k page uses). Dragons flight (talk) 23:48, 3 April 2013 (UTC)
span class="reference-accessdate" exposed
span class="reference-accessdate" is exposed if 'publisher' is linked and ends with a period:
Wikitext | {{cite web
|
---|---|
Live | Lavrinc, Damon (2010-03-29). "Hennessey Venom GT: A $600k mid-engine Cobra for the 21st Century". Autoblog. Weblogs, Inc. Retrieved 2010-03-29. {{cite web}} : Unknown parameter |sandbox= ignored (help)
|
Sandbox | Lavrinc, Damon (2010-03-29). "Hennessey Venom GT: A $600k mid-engine Cobra for the 21st Century". Autoblog. Weblogs, Inc. Retrieved 2010-03-29. {{cite web}} : Unknown parameter |sandbox= ignored (help)
|
Wikitext | {{cite web
|
---|---|
Live | Lavrinc, Damon (2010-03-29). "Hennessey Venom GT: A $600k mid-engine Cobra for the 21st Century". Autoblog. Weblogs, Inc. Retrieved 2010-03-29. |
Sandbox | Lavrinc, Damon (2010-03-29). "Hennessey Venom GT: A $600k mid-engine Cobra for the 21st Century". Autoblog. Weblogs, Inc. Retrieved 2010-03-29. |
Wikitext | {{cite web
|
---|---|
Live | Lavrinc, Damon (2010-03-29). "Hennessey Venom GT: A $600k mid-engine Cobra for the 21st Century". Autoblog. Weblogs, Inc. Retrieved 2010-03-29. |
Sandbox | Lavrinc, Damon (2010-03-29). "Hennessey Venom GT: A $600k mid-engine Cobra for the 21st Century". Autoblog. Weblogs, Inc. Retrieved 2010-03-29. |
-- Gadget850 (Ed) talk 10:39, 4 April 2013 (UTC)
- Need to move dot '.' before span-tag: I think the easiest fix would be to move the 'sepc' variable to lead the access-date to avoid any similar chopped span tags "span xx>". The Lua function safejoin() looks past the span-tag in "<span...>. Retrieved" and treated the dot '.' as being adjacent to the prior end-dot data, to chop the lead '<' off the span-tag. I have triggered "sandbox=yes" in the first example above, to check the fix. -Wikid77 (talk) 17:40, 4 April 2013 (UTC)
- Moving the dot is not really an acceptable option. The dot needs to be included in the span in order for "reference-accessdate" to properly perform the purpose described under "accessdate" at {{cite web}}. Otherwise, everyone who follows the directions given at that page would start seeing double dots all the time. I've fixed the code to actually remove the dot from inside the span under these circumstances. Dragons flight (talk) 18:34, 4 April 2013 (UTC)
Fixed -- Gadget850 (Ed) talk 20:09, 4 April 2013 (UTC)
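The fix described in this thread drops the duplicate dot from inside the span instead of cutting into the tag. The Python sketch below shows that idea on toy input. It is not the module's safejoin(); the helper names are hypothetical, and it only looks past HTML tags and bare [[...]] brackets (piped links are not handled).

```python
import re

# Any run of opening/closing tags at the start of a segment, then the rest.
_LEADING_TAGS = re.compile(r"^((?:<[^>]*>)*)(.*)$", re.DOTALL)


def visible_text(markup):
    """Strip HTML tags and wiki-link brackets to approximate what is displayed."""
    return re.sub(r"<[^>]*>|\[\[|\]\]", "", markup)


def safe_join(segments, sep="."):
    """Concatenate segments without doubling the separator across tags."""
    out = ""
    for seg in segments:
        if not seg:
            continue
        if visible_text(out).rstrip().endswith(sep):
            tags, rest = _LEADING_TAGS.match(seg).groups()
            if rest.startswith(sep):
                # Remove the separator *after* the tags, leaving the tags whole.
                seg = tags + rest[len(sep):]
        out += seg
    return out
```

The span stays intact and keeps wrapping the "Retrieved" text; only the redundant dot inside it is removed.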
Multi-phase transition to Lua cites
Bumping thread for 30 days. Allen3 talk 10:43, 24 March 2013 (UTC)
With all the corrections people have already submitted, some Lua cites are very close to being released. I suggest a multi-phase transition in different weeks for the 23 cite templates, to focus on 5 major cite templates (for web, book, news, journal & {citation} ) with one minor cite template, {cite_encyclopedia}, to start as a small pre-release:
- test general parameters as "text-book" cases
- test several example articles (India, United States, Canada, Germany, Japan, etc.)
- transition Template:Cite_encyclopedia to Lua, as a small start (only 62,000 articles)
- transition Template:Citation to Lua (wide use, but only 93,000 pages)
- transition Template:Cite_news to Lua (mostly pop-culture, 385,000 articles)
- transition Template:Cite_journal to Lua (complex science/document parameters, 275,000 pages)
- transition Template:Cite_web to Lua (majority of cites, 1.3 million pages)
- transition Template:Cite_book to Lua (2nd largest, 460,000 pages)
- transition Template:Cite_video to Lua (minor, in 9,000 pages)
- transition other cite templates to use Lua
At any point, the transition phases can be reverted, or delayed, to handle whatever issues are found. The multi-phase plan is a balance between conservative delays and wide-scale impact. The most-used template, {cite_web} in 1.3 million pages, will be released in the middle phases, after tests which have smaller impacts. Meanwhile, because {cite_news} is used mostly for pop-culture articles, the parameters are often simple, while many users will be editing articles which use {cite_news} and report any unusual cite formats. Also, because {cite_news} is a major cite template, it will be a good "stress test" to having Lua used in several hundred thousand articles (385,000), before transitioning {cite_web} as used in 1.3 million pages. -Wikid77 (talk) 16:37, 22 February 2013 (UTC)
- Looks like a good list FYI: {{cite video}} is now {{cite AV media}}, and supports all the features of {{cite sign}}. --— Gadget850 (Ed) talk 16:58, 22 February 2013 (UTC)
- Yes, a good list. It might be an idea to try and inform the wider community of what's happening at the village pump or signpost.--Salix (talk): 17:17, 22 February 2013 (UTC)
- Transition plan announced at PUMPTECH: I have worded an announcement to emphasize the benefits of using Lua-based CS1 cites, even if not "perfect" yet:
- The Lua-cite advantages will outweigh the risks of slight format differences, which can be fixed later. -Wikid77 (talk) 19:31, 22 February 2013 (UTC)
- All seems to be progressing steadily, some very good work being done here. One question, as each problem is found and corrected, are we updating a page of testcases to show that the Lua output is equivalent (or better where agreed) than the current citation core? I would be interested to add a selection of sample citations to it. Thanks Rjwilmsi 22:19, 22 February 2013 (UTC)
- Preparing long-term testcases: We did not have a naming structure to compare side-by-side testcases, but creating "/old" versions of each cite template will allow long-term comparisons. These testcases could quickly become a nightmare, as a "cottage industry" of thousands of parameter combinations, so I have waited until now. See more below: #CS1 comparison testcases. -Wikid77 (talk) 01:03, 23 February 2013 (UTC)
- Release delayed 3 weeks until 17 March 2013: Due to several trivial problems, the release of {cite_encyclopedia/lua} was delayed for over 3 weeks. Perhaps most debilitating, the excessive limitations with the Lua timeout, as a mere 10 seconds, compared to 60-second allowance for markup-based parameters, made Lua unusable for cite templates in major articles, due to the risk of entire cites stored as "Script error" when the file servers were extremely slow. In rare cases, some Lua functions can run over 65% slower, where a 7-second Lua run could stretch beyond 11 seconds. To patch the severe Lua time limitation (with a "band-aid"), the Lua timing was changed to omit time elapsed when parsing the parameter templates, to enable formatting of hundreds of citations; however, the 10-second timeout still limits Lua to only partial analysis of large article pages. There was also a complete inability to use Lua templates when generating PDF output. Other trivial problems involved the shifted position of multiple parameters, again providing evidence for the need to just hand-write citation footnotes, where the Lua-based cites have become yet the next level of "much ado about nothing" in excessive formatting of footnotes. However, in related tests, the 200-variable-name limit in Lua functions was confirmed, so there is another limit to rambling additions of parameter names, where they cannot be given separate variable names inside a single Lua function, unless limited to within 200 possible names. -Wikid77 (talk) 04:32, 19 March 2013 (UTC)
- Release of {cite_journal} as Lua on 23 March 2013: After numerous discussions about the position of the "editor=" parameter, which was left as "In Editor" for now, adjusting some minor options, and creation of the related testcases page, {cite_journal} was transitioned to use Lua on 23 March 2013 at 1am. After several hours, about 114,000 more articles were auto-delinked from the markup-based helper Template:Citation/core. A specific article, "Lyme disease" was timed to edit-preview within 9 seconds (formerly 22+ seconds) using 189 {cite_journal} and 6 {cite_news}, with similar reformat times for other major medical articles. -Wikid77 (talk) 16:36, 24 March 2013 (UTC)
HTML classes
Now that Lua has deployed, we should add (as well as, or instead of, COinS) HTML classes to our citation templates, to describe the various parameters. For example, instead of emitting, say,
Much ado about Nothing
we could emit:
<span class="title">Much ado about Nothing</span>
The visual rendering would not change.
By agreeing (and sharing with the wider web community) a standard set of such class names, others can write tools to parse our citations, and allow them to be inserted into other documents or web services (or, indeed, into other Wikipedia articles). The makers of Zotero, for example, have already expressed an interest in parsing citations that use such classes.
As some of you may have realised, what I am talking about, a standard, shared, set of class names, is a microformat. I have written more about how we could use a citation microformat, at Wikipedia talk:WikiProject Microformats#Proposal: citation microformat. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 23:39, 11 March 2013 (UTC)
- Get a specification together. Once this module is debugged and implemented, then we can look at adding this feature. Do we want to include HTML5 elements as well? --— Gadget850 (Ed) talk 15:55, 12 March 2013 (UTC)
- OK, I've started a brainstorming page at Wikipedia:WikiProject Microformats/citation, with a draft proposal for discussion and an example. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 17:12, 12 March 2013 (UTC)
- So, can we now apply these classes? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 14:37, 25 March 2013 (UTC)
- I could see supporting the use of an existing third party specification (COinS is already an example of this), but I don't really like the idea of inventing such a specification ourselves. Dragons flight (talk) 17:23, 12 March 2013 (UTC)
- Why not? (In fact the draft is based on work done by the microformat community, which has moved on since my initial suggestion).Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 17:49, 12 March 2013 (UTC)
Example
It would be helpful if you could map out an example or two of what you propose to add.
For example, given a citation like:
- {{ cite book | isbn=978-1-4020-4520-2 | last1=Luhmann|first1= J. G.|last2=Russell|first2=C. T. | editor1-fisrt=J. H.|editor1-last=Shirley|editor2-first=R. W.|editor2-last=Fainbridge | publisher=Chapman and Hall | location=New York | year=1997 | url=http://www-spc.igpp.ucla.edu/personnel/russell/papers/venus_mag/ | work=Encyclopedia of Planetary Sciences | title=Venus: Magnetic Field and Magnetosphere | accessdate=2009-06-28 }}
- Luhmann, J. G.; Russell, C. T. (1997). Shirley, J. H.; Fainbridge, R. W. (eds.). Venus: Magnetic Field and Magnetosphere. New York: Chapman and Hall. ISBN 978-1-4020-4520-2. Retrieved 2009-06-28. {{cite book}} : |work= ignored (help)
Which parameters do you want to add classes to, how do you propose to deal with multiple authors/editors, how do you want to distinguish article titles from main work titles, etc.? Dragons flight (talk) 02:13, 29 March 2013 (UTC)
I'm open to debate, but the HTML from your example is (whitespace added; irrelevant attributes omitted, for clarity):
<code> <span class="citation book"> Luhmann, J. G.; Russell, C. T. (1997). <a class="external text vt-p" href="http:.../">"Venus: Magnetic Field and Magnetosphere"</a>. In Shirley, J. H.; Fainbridge, R. W. <i>Encyclopedia of Planetary Sciences</i> (New York: Chapman and Hall). <a href="..." title="..." class="vt-p">ISBN</a> <a href="..." title="..." class="vt-p">978-1-4020-4520-2</a> <span class="reference-accessdate">. Retrieved 2009-06-28</span>.</span> </code>
(incidentally, I don't think the first full stop should be inside the "reference-accessdate" span). I would make that:
<code> <span class="citation book h-cite"> <span class="p-author">Luhmann, J. G.</span>; <span class="p-author">Russell, C. T.</span> (<span class="dt-published">1997</span>). <a class="external text vt-p u-url p-name" href="http:.../">"Venus: Magnetic Field and Magnetosphere"</a>. In <span class="p-editor">Shirley, J. H.</span>; <span class="p-editor">Fainbridge, R. W.</span> <i class="p-publication">Encyclopedia of Planetary Sciences</i> (New York: <span class="p-publisher">Chapman and Hall</span>). <a href="..." title="..." class="vt-p">ISBN</a> <a href="..." title="..." class="vt-p u-uid">978-1-4020-4520-2</a> <span class="reference-accessdate">. Retrieved <span class="dt-accessed">2009-06-28</span></span>.</span> </code>
by adding classes "h-cite", "u-url", "u-uid", "p-author" (twice), "p-editor" (twice), "dt-published", "p-publication", "p-name", "p-publisher" and "dt-accessed". What do others think? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 13:34, 29 March 2013 (UTC)
- We don't have the ability to add classes to links generated via [[...]] or [...], so that part of your proposed markup doesn't work. Those bits would have to be in separate spans (or similar), probably outside of the <a>. Dragons flight (talk) 01:52, 3 April 2013 (UTC)
- Doh! Of course - I have an open ticket for that. Otherwise, what do you think? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 22:46, 9 April 2013 (UTC)
- Did you see this? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 18:36, 3 May 2013 (UTC)
Reverted move
Since this is a new feature, I am moving it to the requests page. --— Gadget850 (Ed) talk 14:48, 30 March 2013 (UTC)
- Was there any discussion and consensus reached, before you decided to create a sub-page with few watchers and little traffic? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 15:45, 30 March 2013 (UTC)
- Move it back if you want. -- Gadget850 (Ed) talk 15:56, 30 March 2013 (UTC)
- You didn't answer my question; but I've moved it back anyway. BTW, Module_talk:Citation/CS1/Feature_requests has been viewed just 21 times since it was created. It has fewer than 30 watchers. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:38, 1 April 2013 (UTC)
Cite comparison tool
I have created Module:CiteConversionTest. It is a simple tool that allows one to select an article, pull out its citations, and see the before and after conversion results side-by-side.
Try (in your personal sandbox):
{{#invoke:CiteConversionTest | test | France }}
To avoid time outs it will only show the first 90 citations, but even so, one can easily test a large number of citations very quickly. Dragons flight (talk) 15:09, 13 March 2013 (UTC)
Missing options, strange dots
Wikitext | {{cite news
|
---|---|
Live | . BurgerBusiness. 2012-01-25 http://www.burgerbusiness.com/?p=9168. {{cite news}} : Missing or empty |title= (help)
|
Sandbox | . BurgerBusiness. 2012-01-25 http://www.burgerbusiness.com/?p=9168. {{cite news}} : Missing or empty |title= (help)
|
Some dots to do something about. Or, you know, require that the title is not blank. Dragons flight (talk) 04:27, 14 March 2013 (UTC)
Citation class change
- I pulled this query into its own section. Dragons flight (talk) 17:47, 18 March 2013 (UTC)
Is there any reason that the HTML class has changed from Journal to journal? Some browsers are case-sensitive on class identifiers. I noticed a few days ago that {{cite encyclopedia}}, which used to generate HTML with class="book", now generates HTML with class="encyclopaedia". Will all the CS1 templates change their class names as they are converted? --Redrose64 (talk) 16:22, 18 March 2013 (UTC)
- Apparently CSS class names are case sensitive according to the standard (which surprises me). In the present Lua, the class names are taken from the citation mode, so cite book gives class=book, cite encyclopedia gives class=encyclopedia, etc. I assume that was done for the sake of simplicity and consistency, though it was there before I got here. So, the real question is: Do we want to preserve the preexisting class behavior, or do we want to make them systematic in this way (or some other way)? Personally I'm not sure which behavior should be preferred, nor do I know how widely used these class names are. Dragons flight (talk) 17:47, 18 March 2013 (UTC)
- {{cite encyclopedia}} actually gives class=encyclopaedia (note the British spelling). --Redrose64 (talk) 18:11, 18 March 2013 (UTC)
- I have never been convinced that the HTML class in the cite templates has any purpose. --— Gadget850 (Ed) talk 18:17, 18 March 2013 (UTC)
- The User:DASHBot has been using the COinS metadata, to add (hundreds) "archiveurl=" into {cite_*} which have deadlink URLs. It does not look at the visible span-tag class="citation book" (now class="citation encyclopaedia") but inside the COinS tags which have <span title="ctx_ver=Z39.88-2004...">. I think DASHBot will continue to run the same, regardless of the citation "class=" name. -Wikid77 (talk) 01:18, 19 March 2013 (UTC)
- Let's get {cite_news} released soon, and avoid overthinking of options, at this point. See below: "#Avoid analysis paralysis". -Wikid77 (talk) 17:44, 18 March 2013 (UTC)
Automated regression testing
I set up Module talk:Citation/CS1/testcases to automatically compare {{cite encyclopedia}} and {{cite news}} values generated from both the live module and its sandbox. As people make improvements to the sandbox, please check the testcases page to make sure nothing is breaking. We should extend this as we go. Dragons flight (talk) 23:30, 20 March 2013 (UTC)
Error trapping and checks
I've made several updates to mimic the current error checking on citation templates as well as extend a few additional checks. The values being examined, and the actions taken, are described at Module talk:Citation/CS1/test/errors. Dragons flight (talk) 23:35, 20 March 2013 (UTC)
- I've recently spent way too much time fixing citations listed at Category:Articles with incorrect citation syntax. Right now that category has seven subcategories and collects mostly {{cite web}} missing title errors, although I have seen the |archivedate= and |archiveurl= errors there as well.
- The {{cite web}} missing title error, it seems to me, is pretty much the same error as those found in Category:Pages with citations lacking titles and Category:Pages with citations having bare URLs except that in the latter two cases there aren't any error messages to help editors find the offending citation. So, I'd like to see some changes arise from this:
- Annotate every trapped error so that editors can see the malformed citation in the rendered page
- Change the {{cite web}} error category to Category:Pages with citations lacking titles (Category:Articles with incorrect citation syntax becomes a holder of subcategories but doesn't list any individual articles)
- Change CS1 error handling to recognize that citations with bare urls are the same as citations lacking titles and assign these errors to the single category Category:Pages with citations lacking titles
- I think that some of the category names are awkwardly worded and should be renamed:
- I guess I don't see much value in hiding error messages in the HTML because it's unclear how that helps editors find and fix those errors. After all, you've trapped, categorized, created and then hidden an error message, isn't it relatively easy to display the error message as you've done in other cases?
- Adding:
.citation-comment { display: inline !important; color: red; }
- to your personal CSS page, will cause the hidden comments to be visible as red text when you read the page. I haven't really done anything to promote that feature, but it can be used for cleanup while we discuss whether to make the error messages fully visible. You can learn more about such features at Wikipedia:Customisation.
- As far as naming, I'm happy to use whatever names people agree on. I've pretty much just been making them up as I go.
- Regarding bare URL and lacking titles, these aren't actually the same error, though they often occur together. For example:
- is an example of a citation lacking a title that doesn't have a bare URL. However, it is true that most (all?) of the occurrences of a bare URL happen due to the lack of a title, so they probably could be merged on that basis, if that is what people prefer. Dragons flight (talk) 19:45, 27 March 2013 (UTC)
- Thanks for that bit of css - it makes a world of difference, so much so that I can't imagine why we wouldn't want every editor to be able to see the errors. I think that displayed error messages work. When I came to this topic (I'd just discovered Category:Articles with incorrect citation syntax) there were 400ish pages listed there. By comparison, as I write this, accessdate without url: 40,000+ pages, citations with bare urls: 11,000+ pages, citations lacking titles: 8,700+ pages, conflicting page specifications: 5,000+ pages, and format without url is currently at 2,700+ pages. Those numbers alone seem to indicate that when editors are alerted to malformed citations, they fix them.
- People won't agree on an alternate name unless an alternate name is offered. So, I've offered names that I think are meaningful and not too awkward.
- In your bare URL and lacking title example (which doesn't actually have a bare url), it seems to me that a citation without a title is meaningless (especially since it refers to page 45; page 45 of what?). We have {{harvnb}} and {{sfn}} to do that kind of shortened citation. So I think that your example should be flagged as a citation lacking a title, as {{cite news}} does. And when we do add a bare url:
- Doe, John (1956). London. p. 45 http://en.wiki.x.io/wiki/Module_talk:Citation/CS1#Error_trapping_and_checks. {{cite news}}: Missing or empty |title= (help)
- {{cite news}} emits "Bare URL needs a title. Citation has no title", which seems a bit redundant, right? If there is a distinction here, I'm not seeing it.
- I do want to say that you and Editor Gadget850 are doing good work here that seems to me much underappreciated.
Making all parameters case insensitive
There are currently about 25 examples where we check multiple case representations of arguments, i.e. doi= and DOI=, Author= and author=.
Would there be any downside to making all the parameters case insensitive? It can be done easily at the point we locally copy the argument table, and I expect the net effect is pretty performance neutral (a few calls to string.lower offset by removing various checks for alternate capitalization), so basically I'm asking is there any reason it would be bad if editors had the option of using whatever parameter capitalization they wanted? It seems like allowing title=, Title=, and TITLE=, etc. to function the same is probably okay (and might help newbies) even if that's not how templates generally work. It would certainly help clean up some of the code by allowing us to remove the various capitalization checks. Dragons flight (talk) 02:57, 22 March 2013 (UTC)
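The proposal above (lowercasing parameter names at the point the argument table is copied) might look roughly like the following. This is only a sketch: the helper name `normalize_args` and the tie-breaking rule for a citation that supplies both "Title=" and "title=" are assumptions, not what the module does.

```lua
-- Hypothetical helper: copy a template argument table, lowercasing string keys.
-- Numeric (positional) keys are copied unchanged.
local function normalize_args(args)
    local out = {}
    for key, value in pairs(args) do
        if type(key) == "string" then
            local lower = key:lower()
            -- Assumption: if both "Title" and "title" are supplied,
            -- the exact lowercase spelling wins, whatever the iteration order.
            if out[lower] == nil or key == lower then
                out[lower] = value
            end
        else
            out[key] = value
        end
    end
    return out
end

local args = normalize_args({ Title = "A", DOI = "10.1000/x", author = "Doe", [1] = "pos" })
print(args.title, args.doi, args.author, args[1])
```

After such a copy, the separate checks for doi=/DOI= or Author=/author= would reduce to a single lookup of the lowercase name.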
- This seems reasonable to me. BibTeX is not case sensitive in its corresponding parameter names, and that doesn't seem to cause any problems. Some external software might need to be updated (e.g. I have code I use to convert back and forth between Wikipedia citations and BibTeX that is currently case-sensitive on the Wikipedia side) but the update would be very easy. —David Eppstein (talk) 04:14, 22 March 2013 (UTC)
- Avoid vast divergence from 23 {cite_*} forks and markup templates: We need to beware any changes which radically differ from the 23 older {cite_*} forks. For example, allowing capital-letter "Title=xx" in {cite_journal} would encourage use of a parameter spelling which would be insidiously ignored in the older fork templates, to cause confusion in new users unaware of the transition status of the various {cite_*} fork templates. Also, we would complicate the comparisons of parameters between the Lua versions and the old markup-based templates which would ignore many uppercase parameter names. Beyond those problems, there would be endless confusion with alternate citation templates, such as Template:Vcite or any other templates which currently expect lowercase "title=" and ignore the capital "Title=" form. Let's just try to focus on getting the other major cite templates, {cite_journal} and {cite_web} and {cite_book}, transitioned to use Lua, and discuss numerous tangent issues next month. Too much speculation about other features leads to the paralysis of analysis which causes a 9-day transition to Lua-based cites to drag into 6 weeks/months of numerous delays. -Wikid77 (talk) 05:29, 22 March 2013 (UTC)
- Changing the capitalisation rules would widely affect bots. We'd diverge further from other cite templates and those related to them. As Wikid77 says mid-transition is not a good point. I think overall it would lead to more problems. Rjwilmsi 09:08, 22 March 2013 (UTC)
- The documentation explicitly states to use lower case, and the upper case aliases and the one misspelling alias have never been documented. The de facto site standard for parameters is lower case. --— Gadget850 (Ed) talk 12:47, 22 March 2013 (UTC)
- Okay, we can table this, if people think it will be too much of a problem. For the record, I did add a check for URL=, which seems to be the most frequent variant in actual use among cases we haven't been checking (probably because it is an acronym and we allow most other acronyms to be all uppercase). Dragons flight (talk) 15:12, 22 March 2013 (UTC)
Pages with DOIs inactive since
This page is in this category. Shouldn't it just be for main space? Christian75 (talk) 19:20, 22 March 2013 (UTC)
- Right. {{Citation/identifier}} does a namespace check and uses the category only for articles. --— Gadget850 (Ed) talk 19:31, 22 March 2013 (UTC)
- Okay, but why only main space? References can also appear occasionally on file descriptions, Wikipedia pages, and other places. It's not obvious why only using the category for main space is the right idea. If the only problem one is worried about is documentation pages and ones like this one where the error is being displayed intentionally, then those can be removed from the category by adding nocat=true to the citation that generates the error. Dragons flight (talk) 20:00, 22 March 2013 (UTC)
- PS. I managed to block the track cats in enough places to disable that category for this page. Dragons flight (talk) 20:22, 22 March 2013 (UTC)
- Point. {{Broken ref}} controls the cite error messages and categorizes only main (article), template, category, help and file pages. --— Gadget850 (Ed) talk 20:25, 22 March 2013 (UTC)
Automated archiving
I've just set up automated archiving of this talk page, with a delay of seven days. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 20:09, 22 March 2013 (UTC)
URL in page number
When a page number is linked, a hyphen is converted to an ndash. --— Gadget850 (Ed) talk 16:07, 25 March 2013 (UTC)
Wikipedia:Village pump (technical)/Archive 131#Linking problem to Google books
Wikitext | {{cite book
|
---|---|
Live | Sedgwick, John (2000). Popular Filmgoing In 1930s Britain: A Choice of Pleasures. University of Exeter Press. pp. 146–148. ISBN 9780859896603. |
Sandbox | Sedgwick, John (2000). Popular Filmgoing In 1930s Britain: A Choice of Pleasures. University of Exeter Press. pp. 146–148. ISBN 9780859896603. |
- However, putting a URL in the page number is going to blow up the COinS metadata. --— Gadget850 (Ed) talk 16:22, 25 March 2013 (UTC)
- I agree that such links are a mess for COinS and should be actively discouraged for that reason. At some point we should investigate whether parameter validation is reasonable / performance affordable, but that's not something I have any interest in getting into now. In the meantime it might be worth making hyphen to dash more selective about what it converts. Dragons flight (talk) 18:07, 25 March 2013 (UTC)
- COinS functionality is not policy, but WP:V is. exactly linking to the source preserves verifiability and source-text integrity, that is why page-linking is actively encouraged in the related pages. wikipedia policy should be the overarching standard to be followed, then work to make software compatible with policy. you are doing it the other way around. 70.19.122.39 (talk) 14:13, 27 March 2013 (UTC)
- There are many bizarre things people can put in parameters: For now, we need to focus on transition to the Lua versions, and then next month, we can discuss adding numerous tests to reject improper data during parameter validation. For example, a user might put an external link in the parameter "last=http://www.google.com" (rather than "last=Google.com") and then the anchor text of a ref-id would contain the improper URL address after the "CITEREF" prefix. Perhaps we should not auto-replace a page-number hyphen with dash. -Wikid77 (talk) 17:18, 25 March 2013 (UTC)
- Looks like people are not aware that our content guideline tells editors to link the page parameter if they like; see WP:BOOKLINKS. Moxy (talk) 19:03, 25 March 2013 (UTC)
- Fix for common URL or wikilink in page number: Granted that it is common to put a URL address in parameter "page=" (or "at="), such as linking one page in Google Books, then we could change the Lua formatted COinS metadata to not contain a URL or wikilink, as follows:
OCinSdata["rft.pages"] = Page or Pages or At
-- guard against nil when none of Page, Pages, At is set
if OCinSdata["rft.pages"] and OCinSdata["rft.pages"]:sub(1,1) == '[' then
  OCinSdata["rft.pages"] = "link"  -- the literal string "link", not an undefined variable
end
- The replacement of the linked page data by "link" will allow the COinS data to be used, now, and perhaps a better replacement, for the linked page number, could be determined next month. Currently, the COinS data stores a URL in page number as the following:
- pages=[http://books.google.com/books?id=YsUfc8Ijb-wC&pg=PA146 146]–148
- &rft.pages=%5Bhttp%3A%2F%2Fbooks.google.com%2Fbooks%3Fid%3DYsUfc8Ijb-wC%26pg%3DPA146+146%5D%E2%80%93148
- So that is the status so far. -Wikid77 (talk) 19:36, 26 March 2013 (UTC)
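The suggestion earlier in this thread, to make the hyphen-to-dash conversion more selective, could be approached along these lines. This is a sketch under assumptions: the function name `format_pages` is hypothetical, and it only converts a value that is a plain numeric range, leaving anything containing a link or other text untouched.

```lua
local EN_DASH = "\226\128\147"  -- UTF-8 bytes for U+2013 (en dash)

-- Hypothetical helper: convert "146-148" to an en dash range,
-- but return any other value (linked pages, "A-12", etc.) unchanged.
local function format_pages(pages)
    local first, last = pages:match("^(%d+)%s*%-%s*(%d+)$")
    if first then
        return first .. EN_DASH .. last
    end
    return pages
end

print(format_pages("146-148"))
print(format_pages("[http://books.google.com/books?id=YsUfc8Ijb-wC&pg=PA146 146]-148"))
```

With a rule this strict, the hyphen inside the Google Books URL (`YsUfc8Ijb-wC`) is never touched; the cost is that a linked page range keeps its hyphen unless the editor types the dash.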
Help talk:Citation Style 1
Would it make sense to merge Help talk:Citation Style 1 with this page? (I'll ask there also). Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 14:01, 27 March 2013 (UTC)
- I have thought on that. Let's not merge now, as this page is very busy. --— Gadget850 (Ed) talk 17:54, 27 March 2013 (UTC)
ORCID
Please see Help talk:Citation Style 1#ORCID, where some technical help is needed. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 19:35, 27 March 2013 (UTC)
cite tag
lastauthoramp parameter (again)
The presence of a value for |coauthors= should suppress the "&" produced by |lastauthoramp=yes to avoid the following:
- {{cite book |last=Last1 |last2=Last2 |last3=Last3 |last4=Last4 |last5=Last5 |last6=Last6 |last7=Last7 |last8=Last8 |coauthors=Last9; Last 10 & Last11 |year=2000 |contribution=Chapter | editor-last=EdLast1 |editor2-last=EdLast2 | title=Title |lastauthoramp=yes }} → Last1; Last2; Last3; Last4; Last5; Last6; Last7; Last8 (2000). "Chapter". In EdLast1; EdLast2 (eds.). Title. {{cite book}}: Unknown parameter |coauthors= ignored (|author= suggested) (help); Unknown parameter |lastauthoramp= ignored (|name-list-style= suggested) (help) CS1 maint: numeric names: authors list (link)
Note that |lastauthoramp=yes is needed to put the "&" between the editors. (The same issue would arise with co-editors.) Peter coxhead (talk) 11:00, 28 March 2013 (UTC)
Just to note, this has always worked this way:
Wikitext | {{cite book
|
---|---|
Live | Last1; Last2; Last3; Last4; Last5; Last6; Last7; Last8 (2000). "Chapter". In EdLast1; EdLast2 (eds.). Title. {{cite book}} : Unknown parameter |coauthors= ignored (|author= suggested) (help); Unknown parameter |lastauthoramp= ignored (|name-list-style= suggested) (help)CS1 maint: numeric names: authors list (link)
|
Sandbox | Last1; Last2; Last3; Last4; Last5; Last6; Last7; Last8 (2000). "Chapter". In EdLast1; EdLast2 (eds.). Title. {{cite book}} : Unknown parameter |coauthors= ignored (|author= suggested) (help); Unknown parameter |lastauthoramp= ignored (|name-list-style= suggested) (help)CS1 maint: numeric names: authors list (link)
|
You are using coauthors as a workaround for the limit on the number of authors and you are asking to fix a problem that shows because of this workaround. Wouldn't it be better to fix the basic issue by increasing the number of authors? --— Gadget850 (Ed) talk 11:32, 28 March 2013 (UTC)
- Yes, I know that it's "always" been a bug. But other bugs relating to |lastauthoramp= have been fixed recently, so it should now be possible to fix this one. Lua makes it practical to increase the number of conditions that can be tested for and dealt with.
- Unless an indefinite number of authors is allowed, increasing the number isn't really a solution. Scientific papers seem to have more and more authors these days; 30+ isn't unusual in some areas (big physics, molecular phylogenetic studies of many taxa, etc.). Peter coxhead (talk) 12:42, 28 March 2013 (UTC)
- Lua does allow for an indefinite number of authors. The two requirements are that if author N is specified then authors 1 to N-1 must also be specified, and you must also adjust displayauthors= to a higher value if you don't want the list truncated with "et al.". Subject to those restrictions, you can carry the author list to as many as you like, e.g. 45, 100, 1000. It doesn't matter (aside from the fact that it could be a pain to type), the Lua templates will handle it. Dragons flight (talk) 17:36, 28 March 2013 (UTC)
- Yes, I saw one where the list of authors was longer than the abstract. If you really want to include more than nine authors, then either stuff all the authors into the one 'authors' field or get the number of authors increased. --— Gadget850 (Ed) talk 12:50, 28 March 2013 (UTC)
- The first doesn't work because it prevents the automated Harvard style reference links being created – they need some individual authors. (And yes, I know you can set such links up manually, but it's tedious and clutters up the citation). Peter coxhead (talk) 13:11, 28 March 2013 (UTC)
- I've fixed the code to handle the example case, but as I noted above, Coauthor is now unnecessary since Lua can handle arbitrarily many author and editor names. Dragons flight (talk) 20:00, 28 March 2013 (UTC)
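The two rules described above (author N only counts if authors 1 to N-1 are present, and displayauthors controls where the list is cut with "et al.") can be illustrated with a short sketch. The function names, the use of "last"/"lastN" keys alone, and the "; " separator are assumptions for illustration; this is not the module's actual code.

```lua
-- Hypothetical sketch: collect last (or last1), last2, ... and stop at the first gap.
local function collect_authors(args)
    local authors = {}
    local n = 1
    while true do
        -- "last" is accepted as an alias for "last1" only.
        local last = args["last" .. n] or (n == 1 and args.last)
        if not last or last == "" then
            break
        end
        authors[#authors + 1] = last
        n = n + 1
    end
    return authors
end

-- Hypothetical sketch: show at most display_max names, then "et al.".
local function format_authors(authors, display_max)
    local shown = {}
    for i = 1, math.min(#authors, display_max) do
        shown[i] = authors[i]
    end
    local text = table.concat(shown, "; ")
    if #authors > display_max then
        text = text .. " et al."
    end
    return text
end

local found = collect_authors({ last = "Last1", last2 = "Last2", last4 = "Last4" })
print(format_authors(found, 8))
```

In this example last4 is dropped because last3 is missing, which is the first restriction mentioned above.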
I didn't realize that we now had a multiplicity of authors:
Wikitext | {{cite book
|
---|---|
Live | Last1; Last2; Last3; Last4; Last5; Last6; Last7; Last8; Last9; Last 10; Last11; Last12; Last13; Last14 (2000). "Chapter". In EdLast1; EdLast2 (eds.). Title. {{cite book}} : Unknown parameter |lastauthoramp= ignored (|name-list-style= suggested) (help)CS1 maint: numeric names: authors list (link)
|
Sandbox | Last1; Last2; Last3; Last4; Last5; Last6; Last7; Last8; Last9; Last 10; Last11; Last12; Last13; Last14 (2000). "Chapter". In EdLast1; EdLast2 (eds.). Title. {{cite book}} : Unknown parameter |lastauthoramp= ignored (|name-list-style= suggested) (help)CS1 maint: numeric names: authors list (link)
|
But the old templates went to 'et al.' after eight names. --— Gadget850 (Ed) talk 02:06, 29 March 2013 (UTC)
- Does User:Citation bot have a rule limiting the number of authors to the old 9? It reverted an attempt on my part to use an arbitrarily large number of authors. Choess (talk) 16:18, 29 March 2013 (UTC)
- Thanks- we will have to keep an eye on that as things change. DF has reported the issue. --— Gadget850 (Ed) talk 18:09, 29 March 2013 (UTC)
HTML classes redux
Please note the outstanding issue at #HTML classes, above. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 13:16, 28 March 2013 (UTC)
Reply moved to the aforesaid section, so as not to fragment discussion. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 13:17, 29 March 2013 (UTC)
New features
As we work through this, we are getting a few requests for new features. I would like to see us update the current templates and replicate their current function, while fixing bugs and inconsistencies.
I want to hold off on new features until we have the new templates in place and fully debugged. I started Module talk:Citation/CS1/Feature requests to track new features. --— Gadget850 (Ed) talk 14:44, 29 March 2013 (UTC)
- For my part, I'm not planning to think much about new features such as these (or microformats / ORCID) until after the bulk of the transition is complete. That said, if other people want to work out the details of what is needed, then that could be helpful when we do get there. Having already converted the largest four templates, I think we've already made huge progress on the transition though. The most significant missing one is {{citation}} (95k pages), while most of what is left after that are low priority (in my opinion) specialty templates. Dragons flight (talk) 16:51, 29 March 2013 (UTC)
- Exactly my point. The priority should be completing the update, so I would rather push new features to the other talk page so we don't get a lot of clutter. I am going back through archives to pick out worthwhile proposals that were never implemented. One often requested was fixing double periods, which has been implemented. --— Gadget850 (Ed) talk 17:56, 29 March 2013 (UTC)
Catching user misspelling bugs (pilot error)
I think an important interim advancement will be the reporting of misspelled parameters (so-called "pilot error"), even if considered a "new feature" but also improving the handling of old parameters. These are some to consider:
- newspaper - misspelled as "newpaper" (omits "s" in "news")
- title - misspelled as "titel" (German spelling is "Titel")
- publisher - misspelled as "pubilsher" (transposed "li")
- accessdate - misspelled as "acessdate" (only one "c")
- accessdate - misspelled as "accesdate" (only one "s")
Although the misspellings are rare, I think many are discovered by users puzzling over the missing data in the results, then obsessing about spelling and finally realizing "newpaper" is a misspelling. Because the patrollers often mass-fix such one-word misspellings, it is difficult to know how many misspelled parameters were originally in use. For reporting those as "errors" we could edit the few articles which contain them, now, and fix those articles, before deploying the spell-savvy Lua version. -Wikid77 (talk) 16:29, 29 March 2013 (UTC)
- The best way to handle this is to have a complete list of supported parameters and check each submitted parameter for membership on that list so that any unknown parameter can be flagged. That will require someone making such a list, as well as some special handling for the numbered parameters. I'm also concerned that wide scale parameter validation may involve too large a performance penalty, though we won't really know that until we measure it. Dragons flight (talk) 17:29, 29 March 2013 (UTC)
- A whitelist would be the better way. I started Module talk:Citation/CS1/Error checking and will start populating it later. --— Gadget850 (Ed) talk 18:58, 29 March 2013 (UTC)
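A whitelist check of the kind discussed above, including special handling for numbered parameters, could be sketched as follows. The table contents are a tiny illustrative subset, and the names `KNOWN`, `NUMBERED` and `unknown_parameters` are hypothetical; a real list would have to cover every supported parameter, and the performance cost would still need measuring.

```lua
-- Hypothetical whitelist (illustrative subset only).
local KNOWN = { title = true, url = true, publisher = true, newspaper = true, accessdate = true }
-- Base names of numbered families such as last1, first2, author3.
local NUMBERED = { last = true, first = true, author = true, editor = true }

local function is_known(name)
    if KNOWN[name] or NUMBERED[name] then
        return true
    end
    -- Strip a trailing number and check the base name, e.g. "last12" -> "last".
    local base = name:match("^(%a+)%d+$")
    return base ~= nil and NUMBERED[base] == true
end

-- Return a sorted list of named parameters that are not on the whitelist.
local function unknown_parameters(args)
    local unknown = {}
    for name in pairs(args) do
        if type(name) == "string" and not is_known(name) then
            unknown[#unknown + 1] = name
        end
    end
    table.sort(unknown)
    return unknown
end

local bad = unknown_parameters({ title = "T", newpaper = "N", last3 = "L", titel = "x" })
print(table.concat(bad, ", "))
```

Misspellings such as "newpaper" or "titel" are then reported without needing a separate list of known typos.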
COinS genre
There's an entry in the changelog for October 20, 2012 stating that the default rft.genre for {{cite web}} was changed to "book". Looking at the COinS implementation guide, wouldn't "document" be a more appropriate value? Choess (talk) 19:40, 29 March 2013 (UTC)
- Acceptable rft.pages values: Also, there was prior discussion about changing the COinS rft.pages value to be "link" when "page=[...]" is either a wikilink or external IP link. Such as:
OCinSdata["rft.pages"] = Page or Pages or At
if OCinSdata["rft.pages"] and OCinSdata["rft.pages"]:sub(1,1) == '[' then
OCinSdata["rft.pages"] = "link"
end
- Does it matter to User:DASHBot if rft.pages contains a URL or wikilink square brackets "[[...]]" rather than mild "link" as the contents? -Wikid77 (talk) 20:56, 29 March 2013 (UTC)
- Seems like a bad idea. If you have a page range like "58–87" where the "58" is linked, say to a Google Books URL, passing on the "58–87" is much more useful metadata than just "link". Where is "link" documented as a value for rft.pages, anyway? Choess (talk) 22:32, 29 March 2013 (UTC)
- Extracting text from links: Okay, we can "easily" just extract the text, from either a wikilink or external link. Perhaps like this:
OCinSdata["rft.pages"] = Page or Pages or At
local text = OCinSdata["rft.pages"]
if text and text:sub(1,1) == '[' then
  if text:sub(1,2) == "[[" then
    -- wikilink: keep the label after '|' if present, else the target
    -- (still assumes the value ends with "]]")
    local index = text:find('|', 1, true)
    if index ~= nil
      then OCinSdata["rft.pages"] = text:sub(index+1,-3)
      else OCinSdata["rft.pages"] = text:sub(3,-3)
    end
  else
    -- external link: keep the label after the first space, plus any text after ']'
    local index = text:find(' ', 1, true)
    local index2 = text:find(']', 1, true)
    if index2 == nil then index2 = #text + 1 end
    if index ~= nil and index < index2
      then OCinSdata["rft.pages"] = text:sub(index+1,index2-1) .. text:sub(index2+1,-1)
      else OCinSdata["rft.pages"] = text:sub(index2+1,-1)
    end
  end
end
- Hence, extracting the text is not really easy, due to risks of people using incomplete links ("[[x]"), so perhaps just use:
OCinSdata["rft.pages"] = "link"
rather than risk a fatal "Script error" due to an invalid substring of the text. Otherwise, we need to test for "[x]]" or "[[xx|XX]" and absolutely ALL fatal conditions, or that horrific "Script error" will ruin the COinS metadata. I think the extraction of link text would be the most complex part of the entire cite-parameter processing. However, if we carefully check index and index2, then it could work. -Wikid77 (talk) 00:39/00:57, 30 March 2013 (UTC)
- Sorry, "bad idea" was unduly harsh on my part. I see why we'd have to be very careful with link extraction. But since the use of "link" for this purpose isn't defined anywhere in the standard, I think we should make as little use of it as reasonably possible, because we can't expect it to be usefully parsed by downstream consumers. (Note that "journal=" should probably also have link extraction performed on it before it gets turned into COinS, as obscure journals with Wikipedia articles are often linked at least once in a given bibliography.) Choess (talk) 04:36, 30 March 2013 (UTC)
- Getting a delink function for external and interwiki links: Okay, to simplify the COinS metadata, we need to delink text in titles, or authors, as well as page numbers. That means removing multiple links, as any combination of either simple wikilinks (of each part), or interwiki links (such as "wikt:" to Wiktionary) or some text as external links with "http*". Although the delinking will be, relatively, very complex, it could be pre-scanned by a rapid string search for simple '[' before invoking a tedious, thorough delink() function. I have submitted a request, to see if anyone is already writing such an important complex function; see: "wp:Lua_requests#Need a delink function". Because this is such a vital feature, I think it could be written soon. For each related cite parameter, the data item could be pre-scanned by "index=string.find(text,'[',1,true)" (a plain-text search, because a bare '[' is not a valid Lua pattern) to avoid the overhead of delinking. -Wikid77 (talk) 08:20, 30 March 2013 (UTC)
- Generally speaking, we shouldn't need to delink authors, because of the existence of the "authorlink" parameter. Agree that a generalized function of this sort would be useful for text processing. Choess (talk) 12:57, 30 March 2013 (UTC)
- You're betting that no one has used an external link in the author field? I have seen some convoluted hacks in the last few weeks. --— Gadget850 (Ed) talk 13:40, 30 March 2013 (UTC)
- No, but if we bend over backwards to try to extract useful data from it, we'll wind up having to support it in perpetuity. Dump it into a maintenance category, but don't try to save the metadata from someone doing that. Choess (talk) 15:12, 30 March 2013 (UTC)
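For reference, a generalized delink step of the kind requested above could be sketched with pattern substitution instead of index arithmetic, so that an incomplete link such as "[[x]" simply fails to match and is left alone. The function name `delink` comes from the discussion; the patterns and the decision to leave malformed links untouched are assumptions, and nested or more exotic link forms are not handled.

```lua
-- Hypothetical delink sketch; malformed links are left unchanged.
local function delink(text)
    -- Cheap pre-scan: a plain find (fourth argument true) treats "[" literally.
    if not text:find("[", 1, true) then
        return text
    end
    -- [[target|label]] -> label
    text = text:gsub("%[%[[^%[%]|]*|([^%[%]]*)%]%]", "%1")
    -- [[target]] -> target
    text = text:gsub("%[%[([^%[%]|]*)%]%]", "%1")
    -- [scheme:address label] -> label
    text = text:gsub("%[%a[%w+.%-]*:[^%s%]]+%s+([^%]]*)%]", "%1")
    -- [scheme:address] with no label -> removed
    text = text:gsub("%[%a[%w+.%-]*:[^%s%]]+%]", "")
    return text
end

print(delink("[http://books.google.com/books?id=YsUfc8Ijb-wC&pg=PA146 146]-148"))
print(delink("[[Nature (journal)|Nature]]"))
```

A value that still contains "[" after this step could be sent to a maintenance category rather than rescued, in line with the last comment above.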
Next?
The five most-used templates have been updated. The next ones should be:
- {{Cite press release}}, supported by RefToolbar; main differences are a default 'type' and 'docket' as an alias to 'id'
- {{Cite AV media}}, supported by ProveIt
- {{Cite conference}}, supported by ProveIt
- {{Cite episode}}, supported by ProveIt
-- Gadget850 (Ed) talk 15:23, 30 March 2013 (UTC)
- You are forgetting about {{citation}} (95k page uses). It isn't actually CS1, but I think we can accommodate it. It has many more uses than all of those combined. Dragons flight (talk) 18:05, 30 March 2013 (UTC)
- Well, {citation} was tested weeks ago, and still seems reasonable to transition soon, while we still have much work for the {cite_*} forks, but do we need to have a /sandbox2 to experiment with the forks, while leaving /sandbox for top-priority updates? -Wikid77 (talk) 18:32, 30 March 2013 (UTC)
- Looking at those, press release and conference are pretty trivial. The others less so. Dragons flight (talk) 20:22, 30 March 2013 (UTC)
- {{Cite press release}} [1]
  - 'title' is the included work title
  - 'type' defaults to "Press release"
  - 'docket' alias for 'id'
- {{Cite AV media}} [2]
  - 'people' alias for 'last'
  - 'medium' alias for 'type'
  - 'minutes', 'time', 'timecaption' (these are used in other templates)
  - 'distributor' alias for 'publisher'
- {{Cite conference}} [3]
  - 'conference', 'conferenceurl'
  - 'booktitle' for main title
  - 'title' for included work title (I would love to get these two parameter names straightened out, but with 5000 uses...)
- {{Cite episode}} [4]
  - 'transcript', 'transcripturl'
  - 'airdate' alias to 'date'
  - 'began', 'ended' (used in other templates)
  - 'series', 'serieslink' for main work title
  - 'title', 'episodelink' for included work title
  - 'minutes', 'time', 'timecaption'
  - 'season', 'series', 'seriesno'
  - 'network', 'station'
Transition for Cite_press_release
I have created the 3 typical "Template:Cite_press_release/old" and "/lua" and "/new" versions, as used by {cite_compare}. For the rare parameter "docket" in the /lua template, I assigned the extra parameters as:
- |type={{{type|Press release}}}
- |id={{#if:{{{docket|}}}|{{{docket}}}|{{{id|{{{ID|}}}}}}}}
That ordering will give precedence to "docket=" as overriding "id=" in the same order as the {cite_press_release/old}. Remember: The original writers of the Lua Module:Citation/CS1 did not fully know about the "23" {cite_*} forks, nor even the basics about the typical {cite_web} format, and hence, hundreds of format changes had to be made, months ago, just to match the simple 20 common parameters of {cite_book} or similar. I was "hoping" to get lucky, but, "There is no substitute for knowledge" which the writers did not have. So now, we start rewriting in the sandbox (or /sandbox2? to not conflict with other changes?), to support {cite_press_release} and other {cite_*} forks.
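For illustration only, the precedence described above can be sketched in Python (not the module's Lua). The helper name `press_release_args` is hypothetical; it mirrors the two wrapper lines under the assumption that a parameter which is present but blank behaves as in wikitext, where `{{{type|Press release}}}` only falls back to the default when "type=" is absent altogether.

```python
def press_release_args(params):
    """Map {cite_press_release} parameters onto 'type' and 'id' (hypothetical helper).

    Mirrors:
      |type={{{type|Press release}}}
      |id={{#if:{{{docket|}}}|{{{docket}}}|{{{id|{{{ID|}}}}}}}}
    """
    docket = params.get("docket", "")
    if docket.strip():
        # A non-blank docket= overrides id= and ID=
        ident = docket
    elif "id" in params:
        # id= wins over ID= whenever it is present, even if blank
        ident = params["id"]
    else:
        ident = params.get("ID", "")
    # The default applies only when type= is absent, not when it is blank
    return {"type": params.get("type", "Press release"), "id": ident}
```

With both "docket=D-1" and "id=X" supplied, the sketch returns "D-1" as the id, which is the same ordering as the /old template.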
Wikitext | {{cite press_release
|
---|---|
Live | spokesperson (30 March 2013). "Press document" (Press release). Associated Press. Retrieved 1 May 2012. {{cite press release}} : Unknown parameter |sandbox= ignored (help)
|
Sandbox | spokesperson (30 March 2013). "Press document" (Press release). Associated Press. Retrieved 1 May 2012. {{cite press release}} : Unknown parameter |sandbox= ignored (help)
|
With 17,000 transclusions, then {cite_press_release} is next in popularity, after {citation} with 95,200 transclusions. -Wikid77 (talk) 18:32, 30 March 2013 (UTC)
- All parameters: The following are all parameters, as a sanity check, but not a substitute for a full testcases page:
Wikitext | {{cite press release
|
---|---|
Live | Lastname, Firstname; Author2 (Year) [Origyear]. "Test Cite_press_release Parameters". Department. Journal (Press release). Series (in Language). Others (Edition ed.). Place: Publisher (published Publicationdate). Agency. p. page. arXiv:ArXiv. ASIN ASIN. Bibcode:Bibcode. doi:10.DOI. ISBN Isbn. ISSN Issn. JFM JFM. JSTOR Jstor. LCCN LCCN. MR MR. OCLC OCLC. OL OL-22A. OSTI OSTI. PMC PMC. PMID PMID. RFC RFC. SSRN SSRN. Zbl ZBL. Docket. Archived from the original (Format) on Archivedate. Retrieved Accessdate – via Via. Quote {{cite press release}} : |author2= has generic name (help); |edition= has extra text (help); |page= has extra text (help); Check |arxiv= value (help); Check |asin-tld= value (help); Check |asin= value (help); Check |bibcode= length (help); Check |doi= value (help); Check |isbn= value: invalid character (help); Check |issn= value (help); Check |jfm= value (help); Check |jstor= value (help); Check |lccn= value (help); Check |mr= value (help); Check |oclc= value (help); Check |ol= value (help); Check |osti= value (help); Check |pmc= value (help); Check |pmid= value (help); Check |rfc= value (help); Check |ssrn= value (help); Check |zbl= value (help); Check date values in: |accessdate= , |year= , |publicationdate= , and |archivedate= (help); Invalid |ref=harv (help); More than one of |pages= , |at= , and |page= specified (help); Unknown parameter |coauthor= ignored (|author= suggested) (help); Unknown parameter |conference= ignored (help); Unknown parameter |conferenceurl= ignored (help); Unknown parameter |deadurl= ignored (|url-status= suggested) (help); Unknown parameter |doi_inactivedate= ignored (help); Unknown parameter |editorformat= ignored (help); Unknown parameter |editors= ignored (|editor= suggested) (help); Unknown parameter |laydate= ignored (help); Unknown parameter |laysource= ignored (help); Unknown parameter |laysummary= ignored (help); Unknown parameter |publicationdate= ignored (|publication-date= suggested) (help); Unknown 
parameter |sandbox= ignored (help); Unknown parameter |subscription= ignored (|url-access= suggested) (help); Unknown parameter |titlelink= ignored (|title-link= suggested) (help); Unknown parameter |trans_title= ignored (|trans-title= suggested) (help); Unknown parameter |transcript= ignored (help); Unknown parameter |transcripturl= ignored (help)CS1 maint: numeric names: authors list (link) CS1 maint: unrecognized language (link) CS1 maint: year (link)
|
Sandbox | Lastname, Firstname; Author2 (Year) [Origyear]. "Test Cite_press_release Parameters". Department. Journal (Press release). Series (in Language). Others (Edition ed.). Place: Publisher (published Publicationdate). Agency. p. page. arXiv:ArXiv. ASIN ASIN. Bibcode:Bibcode. doi:10.DOI. ISBN Isbn. ISSN Issn. JFM JFM. JSTOR Jstor. LCCN LCCN. MR MR. OCLC OCLC. OL OL-22A. OSTI OSTI. PMC PMC. PMID PMID. RFC RFC. SSRN SSRN. Zbl ZBL. Docket. Archived from the original (Format) on Archivedate. Retrieved Accessdate – via Via. Quote {{cite press release}} : |author2= has generic name (help); |edition= has extra text (help); |page= has extra text (help); Check |arxiv= value (help); Check |asin-tld= value (help); Check |asin= value (help); Check |bibcode= length (help); Check |doi= value (help); Check |isbn= value: invalid character (help); Check |issn= value (help); Check |jfm= value (help); Check |jstor= value (help); Check |lccn= value (help); Check |mr= value (help); Check |oclc= value (help); Check |ol= value (help); Check |osti= value (help); Check |pmc= value (help); Check |pmid= value (help); Check |rfc= value (help); Check |ssrn= value (help); Check |zbl= value (help); Check date values in: |accessdate= , |year= , |publicationdate= , and |archivedate= (help); Invalid |ref=harv (help); More than one of |pages= , |at= , and |page= specified (help); Unknown parameter |coauthor= ignored (|author= suggested) (help); Unknown parameter |conference= ignored (help); Unknown parameter |conferenceurl= ignored (help); Unknown parameter |deadurl= ignored (|url-status= suggested) (help); Unknown parameter |doi_inactivedate= ignored (help); Unknown parameter |editorformat= ignored (help); Unknown parameter |editors= ignored (|editor= suggested) (help); Unknown parameter |laydate= ignored (help); Unknown parameter |laysource= ignored (help); Unknown parameter |laysummary= ignored (help); Unknown parameter |publicationdate= ignored (|publication-date= suggested) (help); Unknown 
parameter |sandbox= ignored (help); Unknown parameter |subscription= ignored (|url-access= suggested) (help); Unknown parameter |titlelink= ignored (|title-link= suggested) (help); Unknown parameter |trans_title= ignored (|trans-title= suggested) (help); Unknown parameter |transcript= ignored (help); Unknown parameter |transcripturl= ignored (help)CS1 maint: numeric names: authors list (link) CS1 maint: unrecognized language (link) CS1 maint: year (link)
|
The option "type=" has been omitted, to show default "Press release". The id has been set as "id=Docket" for emphasis. Page should show "p." and the new parameters include: "issue=" and "publicationdate=" and "agency=" in case used. This test omits "chapter=" and "chapterlink=" and "chapterurl=" because the format is not yet resolved for them, and hence nothing to test against. The original for "ol=" does not show a space between "OL OL-22A" which is a prior, unresolved issue. Those parameters give a general overview of format alignment. -Wikid77 (talk) 19:57/20:19, 30 March 2013 (UTC)
Cite_web unlinked Citation/core 16 hours
FYI: I think I caught the start of the unlinking of those 1.3 million pages from {Citation/core}, which began 23 hours after {cite_web} transitioned, to start unlinking just before 01:00 (UTC) which ran until around 16:00, at the rate of 1380-1600 pages unlinked per minute. That left 158,000 pages still using {Citation/core}, as 95,000 pages with {citation} and 63,000 other pages using {cite_*} forks, or perhaps the "late bloomer" pages of {cite_web} still trying to unlink, for a few more days. -Wikid77 (talk) 18:47, 30 March 2013 (UTC)
- You might be interested in WP:REPLAG. The toolserver tools, including the transclusion count tool, often lag behind real-time due to delays in mirroring database changes. That's why the unlinking appeared to begin 20+ hours after the actual transition. Dragons flight (talk)
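As a quick sanity check on the figures quoted in this thread, the arithmetic can be worked through in a few lines of Python. The only assumption is that the run lasted about 15 hours (roughly 01:00 to 16:00 UTC); the rates and page counts are the ones stated above.

```python
# Figures quoted in the thread; the 15-hour duration is an approximation
hours = 16 - 1                      # about 01:00 to about 16:00 (UTC)
minutes = hours * 60
rate_low, rate_high = 1380, 1600    # pages unlinked per minute

unlinked_low = minutes * rate_low
unlinked_high = minutes * rate_high

# Pages reported as still using {Citation/core}
remaining = 95_000 + 63_000

print(f"{unlinked_low:,} to {unlinked_high:,} pages unlinked in {minutes} minutes")
print(f"{remaining:,} pages remaining")
```

The quoted rates bracket the 1.3 million pages mentioned, and the two remaining groups add up to the 158,000 total.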
Language icon template in the title field exposes the language category
Language icon template in the title field exposing the language category. See Earthquakes_in_2013#cite_note-8:
Carolina Canales (January 30, 2013). "Una persona falleció por paro cardíaco en Copiapó tras fuerte sismo (in Spanish)". El Mercurio. Retrieved February 3, 2013.
The error does not show here as the template uses namespace detection to place the category only on article pages.
My recommendation is to show the icon (and why it is called an icon template is beyond me) and place the page into a maintenance category. The cite template documentation clearly notes the use of the 'language' field and to not use templates or icons. -- Gadget850 (Ed) talk 12:22, 1 April 2013 (UTC)
- My current inclination is to not change the display, but to add a maintenance category if the URL display text has any double bracketed expression in it (i.e. "[[...]]"). Does that seem reasonable? I don't see any particularly reasonable way to make the embedded category actually work the old way without breaking other things. Dragons flight (talk) 15:59, 1 April 2013 (UTC)
- OK by me. We can't fix every misuse. -- Gadget850 (Ed) talk 17:11, 1 April 2013 (UTC)
- I went ahead and trapped this to Category:Pages with citations having wikilinks embedded in URL titles. It is populating faster than I might have expected, which worries me somewhat. Dragons flight (talk) 20:13, 1 April 2013 (UTC)
- If needed, we can task a bot to start fixing this. Then we have templates like {{Ru-pop-ref}} and {{ru-census2010}} with the icon template embedded. I will fix those later and it should reduce this a bit. -- Gadget850 (Ed) talk 20:57, 1 April 2013 (UTC)
- I just removed {{ru icon}} from {{Ru-pop-ref}} which has 3779 uses. It also had 'work' and 'title' wrapped in {{lang|ru}}. -- Gadget850 (Ed) talk 00:54, 2 April 2013 (UTC)
- And the number just went from 794 to 526. -- Gadget850 (Ed) talk 01:26, 2 April 2013 (UTC)
- Just sampling- found one with a typo- a bracket instead of a pipe; another had the title wikilinked and linked with a URL. -- Gadget850 (Ed) talk 02:23, 2 April 2013 (UTC)
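The trap discussed in this thread (leave the display alone, but flag any URL display text containing a double-bracketed expression) could look roughly like the following Python sketch. The function names are hypothetical, and the real check lives in the Lua module; only the category name is taken from the discussion.

```python
import re

# Any "[[...]]" expression: a wikilink, or a category emitted by an icon template
DOUBLE_BRACKET = re.compile(r"\[\[.*?\]\]")

TRAP_CATEGORY = "Category:Pages with citations having wikilinks embedded in URL titles"


def has_double_bracket(title):
    """True if the title text contains a double-bracketed expression."""
    return bool(DOUBLE_BRACKET.search(title))


def maintenance_categories(title, url):
    """Return maintenance categories for a title that will be linked to a URL."""
    if url and has_double_bracket(title):
        return [TRAP_CATEGORY]
    return []
```

A title with no URL is not flagged, since a wikilinked title is only a problem when it also has to serve as the external link text.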
Others
Wikitext | {{cite book
|
---|---|
Live | Jones, John (1945). "My Chapter". My Book. Vol. III. OTHERS. McGraw Hill. |
Sandbox | Jones, John (1945). "My Chapter". My Book. Vol. III. OTHERS. McGraw Hill. |
Wikitext | {{cite web
|
---|---|
Live | Jones, John (1945). "My website". OTHERS. McGraw Hill. {{cite web}} : Missing or empty |url= (help)
|
Sandbox | Jones, John (1945). "My website". OTHERS. McGraw Hill. {{cite web}} : Missing or empty |url= (help)
|
Wikitext | {{cite journal
|
---|---|
Live | Jones, John (1945). My Journal. 86. OTHERS. {{cite journal}} : |article= ignored (help); Missing or empty |title= (help)
|
Sandbox | Jones, John (1945). My Journal. 86. OTHERS. {{cite journal}} : |article= ignored (help); Missing or empty |title= (help)
|
Is there a good reason that OTHERS is sometimes in front and sometimes in the back? It just looks odd to me. Dragons flight (talk) 18:20, 1 April 2013 (UTC)
- I have never understood why there are so many changes when one of the periodical parameters is set. I thought I had documented the conditionals here, but missed this one. I don't see a good reason for this one. -- Gadget850 (Ed) talk 18:24, 1 April 2013 (UTC)
publication-date with date
Wikitext | {{cite journal
|
---|---|
Live | Jones, Bob (1975). "My Article". My Magazine. New York: Magazines.com (published May 1980): 125–129. Archived from the original on 2008-05-24. Retrieved 2007-05-34. {{cite journal}} : Check date values in: |accessdate= (help); Unknown parameter |sandbox= ignored (help)
|
Sandbox | Jones, Bob (1975). "My Article". My Magazine. New York: Magazines.com (published May 1980): 125–129. Archived from the original on 2008-05-24. Retrieved 2007-05-34. {{cite journal}} : Check date values in: |accessdate= (help); Unknown parameter |sandbox= ignored (help)
|
Different handling of publication date |
When given both a date= and a different publication-date=, and when working in journal / news mode, the position of the publication date is different in Lua than in the original. Does this matter? I only just noticed the discrepancy. If we want to keep the publication-date inside the parentheses, should there be a separator after the publisher name? Dragons flight (talk) 21:29, 1 April 2013 (UTC)
- Contrast with Cite_web and Citation: Definitely, any parameters must have a separator, and in this case, I suggest to use a comma (if location/publisher non-null) to avoid the common double-dot "(London: Acme Inc.. 2011)". For {cite_web}, the publication-date is surrounded by "(published___)" but {citation} format is similar to {cite_journal}, with the operation as follows:
Wikitext | {{cite web
|
---|---|
Live | Jones, Bob (1975). "My Article". My Magazine. New York: Magazines.com (published May 1980). pp. 125–129. Archived from the original on 2008-05-24. Retrieved 2007-05-34. {{cite web}} : Check date values in: |accessdate= (help); Unknown parameter |sandbox= ignored (help)
|
Sandbox | Jones, Bob (1975). "My Article". My Magazine. New York: Magazines.com (published May 1980). pp. 125–129. Archived from the original on 2008-05-24. Retrieved 2007-05-34. {{cite web}} : Check date values in: |accessdate= (help); Unknown parameter |sandbox= ignored (help)
|
Different handling of publication date |
Wikitext | {{citation
|
---|---|
Live | Doe, Maryann (1975), "My Article", My Magazine, London: Magazines.com (published May 1980), pp. 125–129, archived from the original on 2008-05-24, retrieved 2013-04-01 {{citation}} : Unknown parameter |sandbox= ignored (help)
|
Sandbox | Doe, Maryann (1975), "My Article", My Magazine, London: Magazines.com (published May 1980), pp. 125–129, archived from the original on 2008-05-24, retrieved 2013-04-01 {{citation}} : Unknown parameter |sandbox= ignored (help)
|
Different handling of publication date |
Fortunately, the parameter "publication-date=" is rare enough that it is not seen much. Hence, there is ample time to discuss possible complications. -Wikid77 (talk) 00:38, 2 April 2013 (UTC)
- Counts of publication-date usage: Using the wiki-search Special:Search, I counted only 6,183 pages which use "publication-date" as the rarity is 3-per-1000, or 99.7% of CS1-cite pages do not use it. In a "random sample" of 2,000 of those 6,183 pages, the year/date percentages were: 56% year-only, 9% full date (4% with comma), 4% month-year, 31% blank (as boilerplate template with many blank parameters). The wikisearch results show only the first use of "publication-date" on each page, but on some pages, the use of full dates is >60% while year-only dates are fewer within those pages. The overall usage seems to be rare-obsessive, because while only 3-per-1,000 pages have "publication-date" yet those pages are packed with those dates. Hence, once editors trend towards using "publication-date", then it becomes a 90% parameter in those rare pages. -Wikid77 (talk) 10:37, 2 April 2013 (UTC)
- I expect a lot of those are blank, being copied from the full parameter set. -- Gadget850 (Ed) talk 12:02, 2 April 2013 (UTC)
Wikitext | {{citation
|
---|---|
Live | Doe, Maryann (1975), "My Article", My Magazine, London (published May 1980), pp. 125–129, archived from the original on 2008-05-24, retrieved 2013-04-01 {{citation}} : Unknown parameter |sandbox= ignored (help)
|
Sandbox | Doe, Maryann (1975), "My Article", My Magazine, London (published May 1980), pp. 125–129, archived from the original on 2008-05-24, retrieved 2013-04-01 {{citation}} : Unknown parameter |sandbox= ignored (help)
|
Wikitext | {{citation
|
---|---|
Live | Doe, Maryann (1975), "My Article", My Magazine, Magazines.com (published May 1980), pp. 125–129, archived from the original on 2008-05-24, retrieved 2013-04-01 {{citation}} : Unknown parameter |sandbox= ignored (help)
|
Sandbox | Doe, Maryann (1975), "My Article", My Magazine, Magazines.com (published May 1980), pp. 125–129, archived from the original on 2008-05-24, retrieved 2013-04-01 {{citation}} : Unknown parameter |sandbox= ignored (help)
|
After thinking about several options here, and noticing that there were cases where we were omitting the publication date entirely, I decided to try a version that explicitly prefixed the publication date with "published". This behavior is similar to the labeling in cite web / book, though they still differ on where the parentheses are placed. I basically felt that having multiple year terms was potentially confusing if neither was identified. This is especially true if the publication date is a simple year (e.g. "1980"), since it might otherwise be mistaken as part of the publisher's name. Dragons flight (talk) 15:35, 3 April 2013 (UTC)
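A minimal Python sketch of the labeling described above, checked against the rendered {citation} examples in this thread. The function name and the rule of dropping a publication date that merely repeats the main date are assumptions for illustration, not the module's actual logic.

```python
def publisher_segment(place="", publisher="", date="", publication_date=""):
    """Build the 'Place: Publisher (published X)' segment (hypothetical helper)."""
    # Join whichever of place/publisher is present
    text = ": ".join(part for part in (place, publisher) if part)
    # Label the second date explicitly so it cannot be mistaken for
    # part of the publisher's name
    if publication_date and publication_date != date:
        label = f"(published {publication_date})"
        text = f"{text} {label}" if text else label
    return text


print(publisher_segment("London", "Magazines.com", "1975", "May 1980"))
print(publisher_segment("London", "", "1975", "May 1980"))
print(publisher_segment("", "Magazines.com", "1975", "May 1980"))
```

The three calls correspond to the place-and-publisher, place-only, and publisher-only comparisons shown above.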
propose series:volume
please consider:
- if series, volume display series:volume
70.19.122.39 (talk) 12:18, 2 April 2013 (UTC)
- Why? Current display:
- -- Gadget850 (Ed) talk 12:38, 2 April 2013 (UTC)
- linkage for context? it may not be obvious to readers that "volume title" is actually a volume title of the defined series, especially if "volume title" is in plain text. it also follows the convention used for other couplings: location:publisher and [journal] issue:page(s). 70.19.122.39 (talk) 13:40, 2 April 2013 (UTC)
i notice that trailing punctuation for series has been removed if string(volume data)<5. although this is a good idea, it compounds the confusion, because the punctuation is retained otherwise. also noted at the talk page for Citation Style 1 help. 70.19.122.39 (talk) 13:13, 3 April 2013 (UTC)
Error messages
Please prefix error messages, both shown and hidden with " Citation error: ". This will allow us to search for that text and more quickly pinpoint the citation in error. -- Gadget850 (Ed) talk 14:31, 2 April 2013 (UTC)
- If something is already styled class="error", isn't it redundant to also say "Error:". The bright red text should already make the issue easy to find and pretty obvious that something is amiss. {{Error}}, {{FormattingError}}, {{Error:must be substituted}}, etc. don't include any preface saying "Error:". I'm not sure why searching for "Citation error" in a page is any easier than scrolling the page looking for the red text. Dragons flight (talk) 15:50, 2 April 2013 (UTC)
- It would match the cite error message styles. And some of us have scripts to show other error conditions (Harv error:, Ref error:) so there can sometimes be a sea of red. -- Gadget850 (Ed) talk 16:04, 2 April 2013 (UTC)
- Well if they are your scripts, presumably you can style them to be easier for you to work with? That doesn't seem like much of an argument. Anyway, count me as opposed to adding "Error:" or "Citation error:" to the messages. In my mind it's redundant and just makes them even more garish than they need to be. However, ultimately, I'll accept whatever community consensus favors, so what do other people want? Dragons flight (talk) 01:24, 3 April 2013 (UTC)
- Prefix "Cite error" would separate from others: For example, if a cite contained a "quote=" with mileage, then a user might include a bracketed conversion, such as {{convert|25,000|mi|km}} = 25,000 miles (40,000 km) but type the letter 'o' three times, as {{convert|25,ooo|mi|km}}, giving the error message: [convert: invalid number]. Because "all" citation errors would have the prefix "Cite error:", the Help:Desk editors could easily see that this is not a citation error as in Help:Cite_error, and check the quoted text instead. -Wikid77 (talk) 03:47, 3 April 2013 (UTC)
I concur with Editor Dragons flight. No need to mark the error message with something that says that the message is an error message. I, too, have a script that bleeds red text all over an article page when it encounters harv errors. Distinguishing between CS1 errors and the harv and cite errors hasn't been a problem for me. Even though not bolded, the red message text, a different shade of red from that of a redlink, is sufficient to identify a malformed {{cite whatever}}.
If it is determined that the error messages need prefix text, it should not be the same as the cite error message style. Those messages are specifically for <ref></ref> and <references /> errors. If an error message prefix is required it should be something like CS1 error:<message>. Then, appropriate Help space pages should be created to explain the errors. Of course, the individual error messages could link directly to the relevant help page ...
—Trappist the monk (talk) 19:57, 3 April 2013 (UTC)
- One more point: the visually impaired with screen readers are going to run into these messages and not understand they are errors. -- Gadget850 (Ed) talk 18:43, 5 April 2013 (UTC)
- Point. Still, prefix with something different from any other error message – something specific to CS1 citations.
- —Trappist the monk (talk) 18:58, 5 April 2013 (UTC)
- "CS1 error:" anyone? -- Gadget850 (Ed) talk 16:47, 10 April 2013 (UTC)
Signpost
We should announce progress on the Signpost. I have started a draft at Module talk:Citation/CS1/Updates. -- Gadget850 (Ed) talk 16:06, 2 April 2013 (UTC)
- The WMF has suggested it might be good to have a guest post on the official WMF blog about the transition of citations to Lua. Dragons flight (talk) 01:14, 3 April 2013 (UTC)
- Perhaps fixed ALL double-dot problems: With the latest /sandbox update to function safejoin(), to avoid double-dot after wikilinks or external links, the Lua cites enter the realm of "smart software" where a user could wikilink "Washington, D.C." in cites and never see a double-dot problem. While some people complained about the ampersand "&" separator for 1-in-20,000 articles, I must note the double-dot problem, in many thousands of citations, was not a "user input error" but rather what professional copyeditors might call a "typesetter's embarrassment". I think it is "fixed" for almost every imaginable case. Not that we need to emphasize this, but for users who think the 6x faster reformat could have waited, or that re-adding quick COinS metadata was not a priority for DASHBot to fix dead URLs, fixing many thousands of double-dots will offset any claims of "broken formatting" about a handful of rare parameters. When counting clerical errors, be sure to count them by the thousands. -Wikid77 (talk) 03:13, 3 April 2013 (UTC)
Template:Citation with patent data
The final testing of {citation} involves the patent-data parameters. Currently:
Wikitext | {{citation
|
---|---|
Live | Title of Invention {{citation}} : Unknown parameter |country-code= ignored (help); Unknown parameter |inventor1-first= ignored (help); Unknown parameter |inventor1-last= ignored (help); Unknown parameter |issue-date= ignored (help); Unknown parameter |patent-number= ignored (help)
|
Sandbox | Title of Invention {{citation}} : Unknown parameter |country-code= ignored (help); Unknown parameter |inventor1-first= ignored (help); Unknown parameter |inventor1-last= ignored (help); Unknown parameter |issue-date= ignored (help); Unknown parameter |patent-number= ignored (help)
|
I have added 3 examples to Module_talk:Citation/CS1/test/citation, but those 122 total testcases run over 9 Lua seconds, and during edit-preview might hit timeout error on the miserly 10-second timeout limit. -Wikid77 (talk) 16:24, 2 April 2013 (UTC)
- Why oh why do we bundle the patents with the rest of {{citation}}? They are basically two separate templates under the same thin wrapper, and the patents only have 1876 transclusions.[5] Surely it is better just to split the patents off to their own template.--Salix (talk): 18:05, 2 April 2013 (UTC)
- Are you (or anyone else) volunteering to edit the 1876 articles in order to split it off? I agree that it is a somewhat weird merger, but I'm not personally interested in trying to split it off. Dragons flight (talk) 01:12, 3 April 2013 (UTC)
- Bundling of a few forks together was not slow: The logic to detect the invention/patent parameters has always been quick, and there is no advantage to creating yet another, "24th" {cite_*} fork. In fact, the structure of Template:Citation/old was the way to have made {cite_web} or {cite_book} run 4x times faster in markup, without Lua. -Wikid77 (talk) 00:01, 3 April 2013 (UTC)
Perhaps putsep function to avoid double-dot
- RESOLVED: See end-note Resolved.... -Wikid77 02:42, 3 April 2013 (UTC)
When a trailing-dot item has the dot embedded inside a wikilink, then perhaps there could be a status switch, "hasdot=true" where a special function, putsep(hasdot, sepc), could check the status of the prior data item's dot. The embedded trailing dot (detected as ".]]" in x:sub(-3,-1) substring) would suppress the double-dot cases. The following are some special cases:
Wikitext | {{cite book
|
---|---|
Live | Doe, John Q. The Rise of Dotcom, Inc. Acme Inc. {{cite book}} : Unknown parameter |Series= ignored (|series= suggested) (help); Unknown parameter |sandbox= ignored (help)
|
Sandbox | Doe, John Q. The Rise of Dotcom, Inc. Acme Inc. {{cite book}} : Unknown parameter |Series= ignored (|series= suggested) (help); Unknown parameter |sandbox= ignored (help)
|
The embedded trailing dots are rare, but the use of putsep(hasdot, sepc) could provide a simple method to track the pending double-dot error, omit the sepc separator when also a dot (not sepc=','), then reset as hasdot=false so the next item would continue to put the typical sepc separator character. Any data item which has a trailing dot would again set hasdot=true, while parameters shown in "(___)" would set hasdot=false, and the remainder of the data items would use the function putsep() to avoid double-dot cases, no matter where the extra dot was embedded. This simple, almost error-proof tactic was not practical in markup due to the lack of local status variables to remember whether the prior data item had a trailing dot.
However, for now, we can just check the title for a wikilink ending in ".]]" as a common case of an embedded trailing dot. -Wikid77 (talk) 00:01, 3 April 2013 (UTC)
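The putsep(hasdot, sepc) idea can be sketched in Python as below (the module itself is Lua, and this is not its code). The helper `join_items` and the exact trailing-dot test are assumptions made for illustration; the status flag and the ".]]" check follow the description above.

```python
def ends_with_dot(item):
    """True if the item ends in a dot, either bare or embedded in a wikilink."""
    return item.endswith(".") or item.endswith(".]]")


def putsep(hasdot, sepc):
    """Return the separator to emit before the next item.

    The separator character is omitted only when it is itself a dot and the
    prior item already ended in one; a comma separator is always kept.
    """
    if hasdot and sepc == ".":
        return " "
    return sepc + " "


def join_items(items, sepc="."):
    """Join data items, tracking whether the prior item had a trailing dot."""
    out = []
    hasdot = False
    for index, item in enumerate(items):
        if index > 0:
            out.append(putsep(hasdot, sepc))
        out.append(item)
        # Items shown as "(___)" end in ")" and so reset the flag to False
        hasdot = ends_with_dot(item)
    return "".join(out)


print(join_items(["Doe, John Q.", "[[The Rise of Dotcom, Inc.]]", "Acme Inc."]))
```

The flag is recomputed after every item, so an embedded dot anywhere in the sequence suppresses only the one separator that follows it.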
- Resolved - updated function safejoin() in Lua script: I see the whole logic had been changed to use function safejoin() to check for duplicate characters everywhere, and so, I added a quick extra comparison to detect a trailing close-bracket ']' with dot in either a wikilink "[[xx.]]" or in an external link "[http__xx.]" or in italic links. The added checks, with wikilinks or external links, and italicized links, will fix the double-dot problem for almost every case.
Wikitext | {{cite book
|
---|---|
Live | Doe, Dotty D. Dots... Dot Books Etc. Dot Corp. pp. 23 ff. Std. {{cite book}} : Unknown parameter |sandbox= ignored (help)
|
Sandbox | Doe, Dotty D. Dots... Dot Books Etc. Dot Corp. pp. 23 ff. Std. {{cite book}} : Unknown parameter |sandbox= ignored (help)
|
- The extra overhead compares end_chr to ']' or when italic, then compares 2 substrings to ".]]''" or ".]''" as only for an italic parameter. Hence, for most parameters, the extra overhead is one simple character comparison to ']' for each extra parameter in a cite. -Wikid77 (talk) 02:42, 3 April 2013 (UTC)
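For readers who want to see the shape of that check, here is a rough Python sketch of a safejoin-style join with the extra bracket comparison. It is not the Lua function: the name `effective_last_char`, the list-of-pieces interface, and the convention that each piece carries its own leading separator are all assumptions for illustration.

```python
# Suffixes where a dot sits just inside closing link or italic markup
DOT_INSIDE_MARKUP = (".]]''", ".]''", ".]]", ".]")


def effective_last_char(text):
    """Return the last visible character, looking inside link/italic markup."""
    if text.endswith(DOT_INSIDE_MARKUP):
        return "."
    return text[-1:]


def safejoin(pieces, sepc="."):
    """Concatenate pieces, dropping a leading separator that would duplicate
    the visible character already ending the text built so far."""
    result = ""
    for piece in pieces:
        if not piece:
            continue
        if result and piece[0] == sepc and effective_last_char(result) == sepc:
            piece = piece[1:]
        result += piece
    return result


print(safejoin(["[[Washington, D.C.]]", ". Acme Inc."]))
```

As in the description above, the common path costs one character comparison per piece; the suffix comparisons only matter when the text ends in a bracket or italic quotes.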
COinS
I don't think the COinS output is correct. See Module talk:Citation/CS1/COinS. {{cite web/sandbox}} has the old COinS restored. Here is the resolved COinS for the Lua version, and the resolved COinS for the old version.
The Lua version includes only the first and last authors while the old version includes the first through ninth authors. -- Gadget850 (Ed) talk 12:55, 4 April 2013 (UTC)
- It would be helpful if someone could dig out the COinS spec and clarify precisely what data should be mapped to each field. I suspect there are probably multiple areas where we aren't doing that quite right, and I haven't really made any attempt to validate COinS output beyond taking a handful of COinS links and seeing that they worked correctly with a randomly chosen COinS reader plug-in I found for Chrome. Dragons flight (talk) 16:48, 4 April 2013 (UTC)
- http://ocoins.info/#id3205609421 will help. The two links there have the specs for a journal and a book. -- Gadget850 (Ed) talk 17:16, 4 April 2013 (UTC)
- The authors are properly populated now in rft.au.
- The work/etc. title is in rft.stitle (journal short title) where it should be in rft.jtitle. I don't see a need for rft.stitle here.
- 'page' is in rft.spage, which is for a start page; it should more properly be in rft.pages.
- rft.eissn is being populated; it could be different from the ISSN and we don't support the eISSN; this should be removed unless we add eISSN.
- For {{cite journal}}, rft.genre is 'book' where it should probably be 'article'; need to look at the values per cite type.
I need to look at the book values. -- Gadget850 (Ed) talk 00:45, 6 April 2013 (UTC)
Made a consolidated list of keys at User:Gadget850/COinS. -- Gadget850 (Ed) talk 02:29, 6 April 2013 (UTC)
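To make the key mapping suggested in the list above concrete, here is a small Python sketch that builds a COinS title string for a journal article. The function name is hypothetical and the field selection simply follows the suggestions in this thread (one rft.au per author, rft.jtitle for the work, rft.pages for the page range, genre 'article'); it is not the module's output routine.

```python
from urllib.parse import urlencode


def journal_coins(article, journal, authors, pages):
    """Build a key-encoded-value string for a journal article (illustrative)."""
    fields = [
        ("ctx_ver", "Z39.88-2004"),
        ("rft_val_fmt", "info:ofi/fmt:kev:mtx:journal"),
        ("rft.genre", "article"),
        ("rft.atitle", article),
        ("rft.jtitle", journal),   # work title, rather than rft.stitle
        ("rft.pages", pages),      # full range, rather than rft.spage
    ]
    # Repeat rft.au once per author instead of keeping only first and last
    fields += [("rft.au", name) for name in authors]
    # Skip empty values; urlencode percent-encodes and uses "+" for spaces
    return urlencode([(key, value) for key, value in fields if value])


print(journal_coins("My Article", "My Magazine", ["Jones, Bob", "Doe, Maryann"], "125-129"))
```

The resulting string would go in the title attribute of a span with class "Z3988", as in the output quoted elsewhere on this page.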
Because I got to wondering what happened when a citation included |isbn= with a malformed value, I speculated that the error propagated into the COinS metadata. I think that is true:
{{cite book |title=Book Title |isbn=hardback 978-1-59714-033-1}}
- →Book Title. ISBN hardback 978-1-59714-033-1.
{{cite book}}
: Check|isbn=
value: invalid character (help) '"`UNIQ--templatestyles-0000029C-QINU`"'<cite class="citation book cs1">''Book Title''. [[ISBN (identifier)|ISBN]] [[Special:BookSources/hardback 978-1-59714-033-1|<bdi>hardback 978-1-59714-033-1</bdi>]].</cite><span title="ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=book&rft.btitle=Book+Title&rfr_id=info%3Asid%2Fen.wiki.x.io%3AModule+talk%3ACitation%2FCS1%2FArchive+6" class="Z3988"></span> <span class="cs1-visible-error citation-comment"><code class="cs1-code">{{[[Template:cite book|cite book]]}}</code>: </span><span class="cs1-visible-error citation-comment">Check <code class="cs1-code">|isbn=</code> value: invalid character ([[Help:CS1 errors#bad_isbn|help]])</span>
In amongst all of that is this: rft.isbn=hardback+978-1-59714-033-1. Clearly incorrect.
That leads me to the question: Should CS1 be creating COinS metadata from malformed parameters? I think not. I think that when there are citations that have errors related to parameters that contribute to the COinS metadata, CS1 should not produce metadata. So, in this case, because |isbn= is malformed, CS1 should only produce:
<span class="citation book">''Book Title''. [[International Standard Book Number|ISBN]] [[Special:BookSources/hardback 978-1-59714-033-1|hardback 978-1-59714-033-1]]<span style="font-size:100%" class="error"> Check <code>|isbn=</code> value ([[Help:CS1 errors#bad_isbn|help]])</span>.</span><span style="display:none;"> </span>[[Category:Pages with ISBN errors]]
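The proposal above (no metadata when a contributing parameter is malformed) could be gated on a check like the following Python sketch. The function names are hypothetical; the checksum rules are the standard ISBN-10 and ISBN-13 ones, and the example values are the ISBN quoted in this thread with and without the stray "hardback" text.

```python
import re


def is_valid_isbn(value):
    """Check an ISBN-10 or ISBN-13 after removing hyphens and spaces."""
    digits = value.replace("-", "").replace(" ", "")
    if re.fullmatch(r"[0-9]{13}", digits):
        # ISBN-13: weights alternate 1, 3 and the total must divide by 10
        total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(digits))
        return total % 10 == 0
    if re.fullmatch(r"[0-9]{9}[0-9Xx]", digits):
        # ISBN-10: weights 10 down to 1, with X standing for 10
        total = sum((10 - i) * (10 if d in "Xx" else int(d)) for i, d in enumerate(digits))
        return total % 11 == 0
    return False


def book_coins_fields(title, isbn=""):
    """Return COinS fields, or None when the ISBN is malformed (illustrative)."""
    if isbn and not is_valid_isbn(isbn):
        return None  # the visible error message is still shown elsewhere
    fields = {"rft.genre": "book", "rft.btitle": title}
    if isbn:
        fields["rft.isbn"] = isbn
    return fields


print(book_coins_fields("Book Title", "978-1-59714-033-1"))
print(book_coins_fields("Book Title", "hardback 978-1-59714-033-1"))
```

A milder variant would drop only rft.isbn and keep the remaining fields; which of the two is preferable is the open question raised above.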
Math exposes strip markers in COinS
This citation looks fine, but the resolved COinS shows exposed strip markers:
Benjamin, Arthur T.; Orrison, M. E. (2002), "Two quick combinatorial proofs of " (PDF), College Mathematics Journal, 33 (5): 406–408.
-- Gadget850 (Ed) talk 15:03, 4 April 2013 (UTC)
- COinS has no specification as to how mathematical equations should be handled and I doubt if any method would be understood by third party software. The raw human readable LaTeX seems as good as any other method to me. I also note the rft.title is wrong, this should be the journal title[6] not the article title.--Salix (talk): 16:50, 4 April 2013 (UTC)
- I could have worded that better. The rendered COinS is polluted because something is causing the included <math> tag to expose its strip markers. I have been poking at it but don't see the problem.
- I saw the title issue and am going to add it to the above discussion. -- Gadget850 (Ed) talk 17:08, 4 April 2013 (UTC)
- For math, that's not actually an option. Mediawiki transforms <math> into strip markers before this module ever sees them. (It's a generic issue for all Lua code actually.) The only things we can really do are output the marker as is (the current option) or do some form of generic search and replace, e.g. replace all strip markers with an empty string, or replace them all with a nonspecific but vaguely informative phrase like "--math expression--". There is no way to get back the original TeX code, and I'm not sure if any of these options will actually help the COinS reader. Dragons flight (talk) 17:12, 4 April 2013 (UTC)
- Yes. Upon reflection, even if the strip markers weren't exposed, we would be stuck with pushing math markup into the title or something equally inelegant. If it can't be rendered in plain text, then it can't be properly resolved by COinS. We could strip the markers and content, but I don't see a real fix. -- Gadget850 (Ed) talk 18:05, 4 April 2013 (UTC)
- I did once propose adding another parameter coins-title which could be used to provide a human readable title for cases like this where the normal title cannot be parsed.--Salix (talk): 07:41, 5 April 2013 (UTC)
- Lua puts journal/work title in rft.jtitle not rft.atitle: The Lua COinS tag puts the article title into rft.atitle and rft.btitle, while the journal/work title is stored into rft.jtitle, such as:
- rft.jtitle=%5B%5BCollege+Mathematics+Journal%5D%5D
- Months ago, I had checked the COinS data with one author, carefully, but the order of the data fields differs from the markup-based COinS tags, removed last year. -Wikid77 18:41, 4 April 2013 (UTC)
- DASHBot has worked for years with various bad data: The long-term operation of User:DASHBot, to insert "archiveurl=" and "deadurl=" data, into the CS1/CS2 cite parameters, has been working well, despite all the complex wikilinks and math-tag data. However, there were complaints about math-tags in titles with the markup-based {cite_*} COinS using Template:Citation/core, and so hopefully, the markup-based COinS tags can be restored, now that math titles are processed by Lua in all of the "big 5" cites (web/book, news, journal & {citation} ). -Wikid77 18:41, 4 April 2013 (UTC)
- The titles display properly now, it is just the COinS data that is munged. Best I can see is to tell editors to avoid <math> as best as possible. -- Gadget850 (Ed) talk 18:50, 4 April 2013 (UTC)
- Better to tell the authors of the papers to avoid putting math formulas in their titles. (Which I think is good advice, but in some cases it's a little late...) Once it's in the title, you have to cite it properly. Many formulas including this one just can't be done without math markup, and once MathJax becomes more standard here it will be desirable to use it even in cases where wikimarkup sort of works, for better consistency of formatting. —David Eppstein (talk) 19:00, 4 April 2013 (UTC)
- Some abstracts have text that can be copied, such as [7] but this isn't one of them. And this particular example illustrates that we aren't the only ones with problems handling this:[8]. I suppose we could put these in a maintenance category and fix what we can. -- Gadget850 (Ed) talk 19:07, 4 April 2013 (UTC)
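The "generic search and replace" option mentioned by Dragons flight above could look roughly like the following Python sketch. It is only an illustration, not the module's Lua code: the marker pattern is an assumption (a run delimited by the DEL character, 0x7F, containing `UNIQ` and `QINU`), and `clean_for_coins` is a hypothetical name.

```python
import re

# Assumed shape of a strip marker: DEL, some text containing UNIQ ... QINU, DEL.
# The exact format is a MediaWiki internal and may differ between versions.
STRIP_MARKER = re.compile("\x7f[^\x7f]*?UNIQ[^\x7f]*?QINU[^\x7f]*?\x7f")


def clean_for_coins(title: str, placeholder: str = "--math expression--") -> str:
    """Replace every strip marker in a title before it goes into COinS metadata."""
    return STRIP_MARKER.sub(placeholder, title)
```

Passing `placeholder=""` gives the "replace with an empty string" variant. Neither variant recovers the original TeX, so the COinS title stays incomplete either way.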
Whitelist
If anyone is curious, counting variant spellings and capitalization, we currently support 178 basic arguments and 34 numbered arguments. See: Module:Citation/CS1/Whitelist. Dragons flight (talk) 19:33, 4 April 2013 (UTC)
- Wow. So we could detect unlisted arguments and kick the page into a maintenance category? -- Gadget850 (Ed) talk 20:05, 4 April 2013 (UTC)
- That's what I intend to try, though I also want to look closely at the performance overhead associated with a scheme like this before deciding whether to deploy it. Dragons flight (talk) 20:41, 4 April 2013 (UTC)
- Would save me having to do a database scan. Rjwilmsi 21:01, 4 April 2013 (UTC)
- If it is a performance hit, then perhaps we could enable it once a month and allow the category to populate. -- Gadget850 (Ed) talk 21:39, 4 April 2013 (UTC)
- Using mw.loadData('Module:Citation/CS1/Whitelist') (as the sandbox does) should mean there is very little overhead for a page with many citations, but if wanted, there is an ugly hack that could be used to reduce whitelist lookup time. It turns out that packing the table data into a sorted string, then doing a binary search of the string is surprisingly fast (faster than loadData), and would have less overhead when used on a page with just a few citations. Let me know if you want to try the hack (I have some code). Johnuniq (talk) 23:19, 4 April 2013 (UTC)
- Okay, I've gone ahead and deployed a version of the whitelist. The impact is not completely trivial, but is also not as bad as I feared it might be. With this parameter validation, performance seems to move from about 100 citations per second to about 90 citations per second on the benchmarks I've been using. That's probably an okay hit for something that ultimately should help clean up and prevent many errors. We'll essentially get that back next week when Mediawiki is updated to 1.22wmf1, which has reduced Lua overhead and should give Lua citations about a 20-25% boost in rendering speed. Unexpected parameter values are being shunted to Category:Pages with citations using unsupported parameters. Dragons flight (talk) 03:30, 5 April 2013 (UTC)
- P.S. If anyone notices misnamed parameters that are frequently recurring, it may be worth adding an alias to the citation handling code rather than trying to correct frequent mistakes, especially if users are likely to continue making the same mistakes. Dragons flight (talk) 03:35, 5 April 2013 (UTC)
- Consider warning with alternate spellings: I can appreciate creating a "white list" to check all parameters and prevent adding new parameter names, but I would prefer to focus on common misspellings or none-such names, to issue a warning to suggest a likely alternative name:
- "Found "acessdate=5 May 2012" - expected "accessdate=" with 2 letters 'cc'.
- "Found "accesdate=5 May 2012" - expected "accessdate=" with 2 letters 'ss'.
- "Found "accessed=5 May 2012" - expected "accessdate=" parameter.
- "Found "retrieved=5 May 2012" - expected "accessdate=" parameter.
- "Found "name=J. Doe" - expected "author=" or "last=" or "editor=" etc.
- "Found "Editor=J. Doe" - expected "editor=" with lowercase 'e'.
- "Found "Title=A Tale of Two..." - expected "title=" with lowercase 't'.
- "Found "Series=Foundation Trilogy" - expected "series=" with lowercase 's'.
- "Found "Volume=6" - expected "volume=" with lowercase 'v'.
- "Found "vol=6" - expected "volume=" as full word.
- "Found "No=23" - expected "number=" or "issue=" parameter.
- "Found "pulbisher=Acme Inc" - expected "publisher=" with 'pub'.
- "Found "pulbication-place=Berlin" - expected "publication-place=" with 'pub'.
- "Found "locatoin=London" - expected "location=" with 'ion'.
- "Found "loaction=London" - expected "location=" with 'loca'.
- Perhaps beyond the most-obvious misspellings, then a white-list scan could log into a maintenance category, but the common errors are a reason to use "smart software" to pinpoint the invalid parameters, such as "expected lowercase 'v'..." when catching capital "Volume=" as a common mistake, where merely rejecting the parameter might still leave users wondering when the "Volume" parameter was removed from being valid in cites. In many cases, just rejecting a parameter, without specific pinpointed text, will continue to frustrate some users. It might seem too hard to bother, but it can make all the difference to users. Obviously, having 40 hard-coded parameter checks, for common spelling errors, will not slow the cite formatting, but could make the interface better than "sliced bread" for new/sleepy users. Meanwhile, we can fix those misspelled parameters in live articles, so only new cites will be rejected, such as fixing a dozen articles with "accesdate" (search: title...accesdate). -Wikid77 (talk) 00:33, 5 April 2013 (UTC)
- I added a check for capitalization error, such that if lowercase( parameter ) would have been valid then it adds the lowercase version as a suggestion in the error message. If you want to work on a more general table of messages to give when finding common bad inputs, then I think that would be fine. As long as we are talking about code that only runs for the bad parameters, it probably doesn't matter how large or complex it is. Dragons flight (talk) 03:52, 5 April 2013 (UTC)
- I say force all parameters to lower case and place pages with unrecognized parameters in a maintenance category. We can't guess every misspelling or other oddball use. -- Gadget850 (Ed) talk 02:13, 5 April 2013 (UTC)
- I'm not sure if you mean:
- We should force all parameter names to require lowercase, or
- We should treat all parameter names the same as if they were written in lowercase, i.e. case insensitive.
- The former approach is friendly to existing bots and generally conforms to the history of template usage, though as we already have 25 aliases based on capitalization trying to eliminate those would require plenty of cleanup and a bit of user retraining. The latter approach is friendly to users and would eliminate a common source of user error, though it may require some bots to be modified. Personally, when it comes to a choice between making Wikipedia easier for users to edit and making it easier for machines to process, I will almost always side with the users. Hence, as mentioned in a previous section, I would support making all parameter names case insensitive. Dragons flight (talk) 04:07, 5 April 2013 (UTC)
- The second: parse parameters as case insensitive. -- Gadget850 (Ed) talk 10:04, 5 April 2013 (UTC)
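The ideas in this thread (whitelist lookup, a lowercase suggestion for capitalization errors, and a small table of common misspellings) can be combined in a short Python sketch. It is illustrative only: the real check lives in the Lua module, the whitelist here is a tiny subset, and the message wording and the `check_parameter` name are assumptions.

```python
# Small illustrative subset of the whitelist; the real list is much longer.
WHITELIST = {
    "accessdate", "access-date", "author", "editor", "last", "title",
    "series", "volume", "issue", "number", "publisher", "location",
}

# Hand-maintained table of frequent mistakes, taken from the examples above.
COMMON_MISTAKES = {
    "acessdate": "accessdate",
    "accesdate": "accessdate",
    "accessed": "accessdate",
    "retrieved": "accessdate",
    "vol": "volume",
    "pulbisher": "publisher",
    "locatoin": "location",
    "loaction": "location",
}


def check_parameter(name: str):
    """Return None for a supported name, otherwise a (hypothetical) error message."""
    if name in WHITELIST:
        return None
    lowered = name.lower()
    # Capitalization error: the lowercase form would have been valid.
    if lowered in WHITELIST:
        suggestion = lowered
    else:
        suggestion = COMMON_MISTAKES.get(lowered)
    if suggestion:
        return f"Unknown parameter |{name}= ignored (|{suggestion}= suggested)"
    return f"Unknown parameter |{name}= ignored"
```

The suggestion code only runs for names that have already failed the lookup, so it adds nothing to the cost of a valid citation. Making names fully case insensitive would instead mean lowercasing every name before the whitelist lookup.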
Here is an oddity. For me, the article Andover workhouse scandal is listed at Category:Pages with citations using unsupported parameters. None of the citations in §References has a "hidden" annotation showing where the error is. But, when I click through Edit and then Show preview (without any changes) the 9th citation shows the error's hidden annotation. This doesn't seem to be an isolated case. The first article I looked at was Cooper Union which also wasn't displaying the annotation but does when in Show preview mode.
—Trappist the monk (talk) 10:37, 5 April 2013 (UTC)
- Purging those pages causes the error to show. -- Gadget850 (Ed) talk 10:55, 5 April 2013 (UTC)
- Interesting that that has not been required of all of the other errors – at least I haven't noticed it.
- —Trappist the monk (talk) 11:48, 5 April 2013 (UTC)
- It's not common, but I've seen it with other error messages too. It's an occasional issue with how Mediawiki handles page versioning and caching, though it seems to me like it has been more common in the last few days than I am used to. Dragons flight (talk) 15:50, 5 April 2013 (UTC)
- Beats me where a parameter like 'x-print-url' came from. But, I see the same in infoboxes where novice editors figure they can use any old parameter. -- Gadget850 (Ed) talk 11:59, 5 April 2013 (UTC)
thank you for this. a couple of unrelated ?? that popped up by looking at the list:
- what is "authorformat"?
- i thought "serieslink" was deprecated. if not, please make it a universal param.
thanx. 70.19.122.39 (talk) 12:13, 5 April 2013 (UTC)
- Don't know what |authorformat= is. |serieslink= is listed as deprecated at {{cite episode}}. Which leads me to wonder if there should be a "graylist" and associated category for deprecated parameters – yeah, I know, feature creep.
- i can see the need for |serieslink= in some situations. eg |series=Harry Potter and |serieslink=Harry Potter or |serieslink=Harry Potter (film series).
- in any case, if no performance hit, imo it should be made available anywhere |series= would be used. (pretty much everywhere) 70.19.122.39 (talk) 13:59, 5 April 2013 (UTC)
- What is the point of separating out the series and serieslink, rather than just including the link in the value of the series parameter? —David Eppstein (talk) 15:00, 5 April 2013 (UTC)
- apart from the minor annoyance of avoiding entry of a piped wikilink? the only other thing i can think of is when |serieslink=seriesurl. 70.19.122.39 (talk) 15:40, 5 April 2013 (UTC)
- Many of these templates were developed independently. I updated {{cite episode}} to use {{citation/core}} a year ago, along with a number of other templates. I tried very hard to make each template standard, but several of these templates had parameters like 'serieslink' that I retained for backwards compatibility. -- Gadget850 (Ed) talk 16:08, 5 April 2013 (UTC)
Here is a snippet out of Module:Citation/CS1/Whitelist.
whitelist = {
basic_arguments = {
['accessdate'] = true,
['access-date'] = true,
['agency'] = true,
['airdate'] = true,
['archivedate'] = true,
...
It runs on for a long time like that. But you know what never changes? The assignment. Always true. An unchanging signal conveys no new information. So, could we not do something with this? Like, instead of a boolean, make the assignment an integer, where an assignment of 1 might mean that the parameter is active and in use; where the number 2 might mean that the parameter is deprecated but still usable; where the number 3 might mean that the deprecated parameter is no longer usable? Type 2 parameters would simply cause the page to be dropped into a category of pages using deprecated parameters. Use of type 3 parameters would emit an error message and be dropped into a category of no-longer-valid parameters. I'm not sure why we'd want to hold on to invalid parameters unless it is to keep them around as a way of remembering that we once used them for whatever we once used them for.
Or maybe we just stay with a boolean and if the parameter is assigned false then it's a deprecated parameter and treated like number 2 above. No longer valid parameters are simply removed or commented out.
—Trappist the monk (talk) 22:48, 6 April 2013 (UTC)
- Yes, the whitelist could be used to convey more meaning than just present / not-present. Explicitly deprecated values is one such application. I actually looked into this. Based on the documentation I could only find one example of a parameter officially listed as deprecated that we still support. In part, that's probably as much an issue with the documentation, as anything. However, it didn't seem worthwhile to create a special set of "supported yet deprecated" parameters when it would be nearly empty. Perhaps in the future we will explore such uses. Dragons flight (talk) 16:58, 8 April 2013 (UTC)
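The integer-status idea (1 = active, 2 = deprecated but usable, 3 = no longer usable) can be sketched in Python as below. This is not the deployed whitelist: the entries are a tiny illustrative subset, `retired-example` is a made-up parameter name, and the two deprecated/invalid category names are hypothetical.

```python
from enum import IntEnum


class Status(IntEnum):
    ACTIVE = 1      # in normal use
    DEPRECATED = 2  # still works, but the page is categorized
    INVALID = 3     # no longer works, an error is reported


WHITELIST = {
    "accessdate": Status.ACTIVE,
    "series": Status.ACTIVE,
    "serieslink": Status.DEPRECATED,    # listed as deprecated at {{cite episode}}
    "retired-example": Status.INVALID,  # hypothetical name, for illustration only
}

UNSUPPORTED_CATEGORY = "Pages with citations using unsupported parameters"
DEPRECATED_CATEGORY = "Pages using deprecated citation parameters"      # hypothetical
INVALID_CATEGORY = "Pages using no-longer-valid citation parameters"    # hypothetical


def classify(name: str):
    """Return (usable, category, report_error) for a parameter name."""
    status = WHITELIST.get(name)
    if status is None:
        return (False, UNSUPPORTED_CATEGORY, True)
    if status == Status.ACTIVE:
        return (True, None, False)
    if status == Status.DEPRECATED:
        return (True, DEPRECATED_CATEGORY, False)
    return (False, INVALID_CATEGORY, True)
```

The boolean variant described above is the same table with `True` for active and `False` for deprecated, with invalid names simply removed.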
Mysterious parameters
{{cite web}} lists "doibroken" in its prototype. I think this is supposed to be "doi_brokendate", as "doibroken" does not seem to have ever done anything.
{{cite news}} lists "pmd" in its prototype. I think this is supposed to be "pmc", as "pmd" does not seem to have ever done anything.
{{citation}} lists "doi-inactive" in its prototype. I think this is supposed to be "doi_inactivedate", as "doi-inactive" does not seem to have ever done anything.
Are any of these important alternative forms that need to be supported? Or are they just errors that need to be removed from the template documentation pages? Dragons flight (talk) 23:27, 4 April 2013 (UTC)
- I fixed the documentation. -- Gadget850 (Ed) talk 00:37, 5 April 2013 (UTC)
- Removing invalid blank parameters in spare time: For anyone who wants to help copy-edit the hundreds of articles which copied the invalid "boilerplate template" parameters, here are the wiki-searches for misspelled parameters:
- search: "doibroken" (original count: 158 pages)
- search: "pmid | pmd" (original count: 197 pages)
- search: "doi-inactive" (original count: 6)
- As usual, try to fix other clerical errors, in those articles, when editing the pages to remove the invalid parameter names. There's no hurry, as there is little risk of users demanding those parameters, and this is just another general clean-up activity. -Wikid77 (talk) 00:49, 5 April 2013 (UTC)
Author list extraction
Not sure if this is a template problem or a Zotero one - {{cite journal|last1=Fisher|first1=C.|last2=Kear|first2=J. |year=2002| title= The taxonomic importance of two early paintings of the Pink-headed Duck ''Rhodonessa caryophyllacea'' (Latham 1790)|journal= Bull. Brit. Orn. Club. |volume=122|issue=4|pages=244–248}} does not extract two authors into Zotero. Shyamal (talk) 12:43, 5 April 2013 (UTC)
- I think I've fixed the author component of COinS output. Dragons flight (talk) 19:53, 5 April 2013 (UTC)
- Authors are fixed. More issues reported above. -- Gadget850 (Ed) talk 17:47, 6 April 2013 (UTC)
- I think I've fixed the author component of COinS output. Dragons flight (talk) 19:53, 5 April 2013 (UTC)
Thousands of pages reject blank "| |" in Whitelist scan
This thread is related to earlier thread "#Whitelist" but it also involves a blank cite parameter in the second-level cite Template:Catholic (aka {Catholic_Encyclopedia} ). Apparently, the common practice of ending a template name with vertical-bar pipe "{{xxx|" is generating an "unsupported" blank parameter, as in protected template {Catholic} which uses "{{cite encyclopedia|" with the trailing bar "|" and so 4,725 pages which transclude {Catholic} are tending to log into:
I think {Catholic} should be edited to fix "{{cite encyclopedia|" and remove the trailing bar "|". Meanwhile, a final bar "|}}" does not log as an error, which seems good if people want to append an extra bar to be sure a future added parameter is separated from the prior parameter. Anyway, if {Catholic} is changed to fix the trailing bar, then the maintenance Category should drop by about 4,700 pages, because invalid parameters are so rare it is unlikely there are other unsupported cite parameters being used where {Catholic Encyclopedia} is used. -Wikid77 (talk) 16:54, 5 April 2013 (UTC)
- I had intended for empty unnamed parameters to be completely ignored, so things like "| |" don't trigger an error or categorization. However, I didn't quite implement it right, so they were getting categorized (though no error was noted). This has now been corrected, such that blank fields will not be categorized. Dragons flight (talk) 17:09, 5 April 2013 (UTC)
- Okay, that seems fine. -Wikid77 (talk) 17:32, 5 April 2013 (UTC)
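The corrected behaviour described above (blank unnamed parameters are ignored rather than categorized) might be sketched in Python as follows. This is an illustration only: `unsupported_parameters` is a hypothetical name, unnamed parameters are modelled as integer keys, and flagging non-blank unnamed parameters is an assumption rather than something stated in the thread.

```python
def unsupported_parameters(args: dict, whitelist: set) -> list:
    """Return the keys of template arguments that should be categorized."""
    problems = []
    for key, value in args.items():
        if isinstance(key, int):
            # Unnamed (positional) parameter, e.g. from "{{cite encyclopedia| |...".
            if value.strip() == "":
                continue  # blank field: ignore completely
            problems.append(key)  # assumption: stray text is still flagged
        elif key not in whitelist:
            problems.append(key)
    return problems
```

For example, `{1: " ", "title": "X"}` produces no problems, so a template written with a stray bar no longer lands in the maintenance category.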
Need more visible red-error messages
We obviously cannot show every error condition as a red-error message, due to thousands of pages containing misspelled or "future" parameter names. However, some errors are added unseen by people, every week, and other editors must fix those hidden errors due to lack of red-error messages. These I think need visible messages:
- "Found "acessdate=5 May 2012" - expected "accessdate=" with 2 letters 'cc'.
- "Found "accesdate=5 May 2012" - expected "accessdate=" with 2 letters 'ss'.
- "Found "accessadate=5 May 2012" - expected "accessdate=" without "...adate".
- "Found "accessed=5 May 2012" - expected "accessdate=" parameter.
- "Found "accessed 5 May 2012" - expected "accessdate=" parameter.
- "Found "Editor=J. Doe" - expected "editor=" with lowercase 'e'.
- "Found "Title=A Tale of Two..." - expected "title=" with lowercase 't'.
- "Found "Series=Foundation Trilogy" - expected "series=" with lowercase 's'.
- "Found "Volume=6" - expected "volume=" with lowercase 'v'.
In the past 3 days, more one-ess "accesdate=" parameters have been added into a dozen articles (search: accesdate), and the editor(s) should be warned, with red-error messages. We might need a separate Category for "accessed=" or parameter 1 as "accessed " with no equals-sign, to cleanup many existing "accessed" parameters, before flagging with a red-error when "accessed" is found. -Wikid77 (talk) 17:32, 5 April 2013 (UTC)
- This topic ties in with §Error trapping and checks above. There, at least, is a pseudo-fix that allows interested editors to see "hidden" error messages. As they are, these error messages are quite adequate for locating and correcting malformed citations. However, they are hidden, which, in this editor's opinion, they should not be. Articles on the search result page that you linked do contain the accesdate error and the pages that I checked at random are annotated (when viewed through show preview).
- So, I take one more opportunity to rise in favor of visible error messages without the need for special css – like those errors noted on Category:Pages with archiveurl citation errors.
- —Trappist the monk (talk) 17:51, 5 April 2013 (UTC)
- I agree that the error messages should be shown. And we should do it in the same style: the old messages use the error class that applies <strong> while the new ones are normal font but red. I don't see the need for the strong text.
- I suspect that there will be editors glad to see and fix issues and readers who will totally hate the red stuff. When the cite error system was updated to add more error messages, there were a lot of complaints and we ended up adding namespace detection to suppress the errors on user and talk pages. Which is then an issue when a draft is developed in userspace and then moved.
- Regardless, these errors aren't new, just the error detection. We really need to have help pages in place before we start barraging the red ink. -- Gadget850 (Ed) talk 18:39, 5 April 2013 (UTC)
- Ok, so where should those help pages be? I've already made some effort in that way by adding help-like text to the error-capture category pages. That text might be a start and the "hidden" message text can be added to the help pages for additional explanation. By then we should also have resolved the use-or-don't-use-error-prefix issue.
- Acknowledging that I might sound like a stuck record (yeah, dating myself), even though editors may hate the red error text that malformed citations bleed on their nice pretty pages, it works because those errors that editors can see, get fixed. Squeaky wheel syndrome, ne?
- —Trappist the monk (talk) 19:19, 5 April 2013 (UTC)
- Presuming we prefix these with CS1 error, then we would create Help:CS1 errors. We can probably get by with one help page with anchors for these, as most issues are fairly obvious.— Preceding unsigned comment added by Gadget850 (talk • contribs)
- Perhaps a better location is Help:Citation Style 1/CS1 errors?
- I think there is broad agreement that showing the errors is ultimately in the best interest of the project. The main issue is deciding when to introduce them. Ideally, it would be good to have helpful documentation on each error and a stable set of error messages / categories that aren't still being tweaked. Because we are also talking about splashing red messages across many existing pages, it might also be nice to try and recruit some friendly wiki gnomes to help clear some of the backlog in order to avoid being too disruptive to existing articles. As an intermediate step, I added instructions on how to see the hidden messages to the relevant error category pages. Dragons flight (talk) 23:55, 5 April 2013 (UTC)
- I'm working on collecting the error messages and creating some sort of helpful documentation. From that, I hope that we can hone in on terse, descriptive error messages so that the help documentation will be more or less supplemental and that the messages themselves will convey just enough information to allow editors to correct what has been broken.
- Right now there are about 125k pages with CS1 errors out of some number of millions of Wikipedia pages. On all of the pages that I've looked at, I have yet to see more than a handful of errors per page. I don't think that the sky will fall if all of a sudden there are a few more articles that have red error message text appearing where there was none before.
Perhaps echo parameter 1 at cite end
As a compromise solution in thousands of pages, to avoid requiring massive prior cleanup, we could "quietly" display the contents of any unnamed parameter 1 (or 2), at the end of each cite, as plain text after the postscript dot. In many cases, the appended data might pass as readable (because "=" omitted), such as "page 45" (not "page=45"), or the common "accessed 5 May 2009" or such. In a sense, this is a "smart software" solution, at the low-tech end, to simply echo what the user entered, and in many cases, it would be "close 'nuff" to pass as readable, such as "date-7 April 2013" where a hyphen "-" was used as creating an unnamed parameter, but displayed at the end (after final dot "."), to quietly indicate an "anomaly" rather than a glaring red error. By that method, within a matter of hours, then thousands of citations would become "half-fixed" to show either obvious data, or decipherable data, or in rare cases just quietly append an "outtake" at the end, such as perhaps parameter 1 being "first9J." which needed to be corrected as "first9=J." for many readers to understand. Also, those echoed parameters could be logged into a separate Category, as not needing the same kind of attention as the hidden data items. Once that "triage" step is made, to echo unnamed parameters, then we would have a better count of categories to judge severity of problem pages. Otherwise, there are likely to be many people horrified of showing red-error messages, until 99.9999% of prior cites are corrected (aka, 6 months from now?). -Wikid77 06:44, 6 April 2013 (UTC)
- I guess I would oppose this simply because the functionality that "quietly" display[ed] the contents of any unnamed parameter ... at the end of each cite would need to be removed later. I would also oppose this because editors often see what they expect to see regardless of what is actually there to be seen. This is why yuo cna rdea thsi wthiotu too mcuh trobule – it passes as readable data. If editors see more-or-less what they expect to see then they see nothing to fix. Better to give them something that they don't expect to see so that the error gets fixed.
- Of the 6,914,884 articles, there are about 125k (1.81%) with CS1 errors. A lot, but not an overwhelming amount. Most of these pages have only a few broken citations. If editors can easily see that there is an error, it will get fixed. Masking errors by quiet display just keeps them around longer. Just look at Category:Pages using web citations with no title and Category:Pages with archiveurl citation errors; these two categories have relatively few pages in them because, for a long time, the CS1 templates emitted red error messages.
- Editors will not be horrified; if they are then perhaps they ought not be editing Wikipedia articles and instead should be seeking professional help.
- —Trappist the monk (talk) 12:03, 6 April 2013 (UTC)
- I agree. I did a lot of work on the Cite error help and still tweak it based on feedback. I think that error messages and categories help to cleanup problems more quickly. I expect there will be some initial outcry and confusion, but a lot of these citations have had these problems for a long time. -- Gadget850 (Ed) talk 17:45, 6 April 2013 (UTC)
- The percentage of errors shown above is probably not quite right. CS1 categorizes errors regardless of the page's namespace. The calculation above only takes articles into account. This morning when I tallied the total number of pages with errors I got 170,863 pages with errors in a total of 29,968,922 pages which is 0.57% of pages. For the number of pages I used {{NUMBEROFPAGES}}. This is also probably misleading because it is likely that most of those pages don't have citations of any kind. Regardless, the number of pages with citations reporting errors is relatively small.
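The two percentages quoted in this thread can be rechecked with a few lines of Python, using only the counts given above (the `error_rate` name is just for illustration).

```python
def error_rate(pages_with_errors: int, total_pages: int) -> float:
    """Share of pages with CS1 errors, as a percentage."""
    return 100.0 * pages_with_errors / total_pages


# Articles only: about 125k error pages out of 6,914,884 articles.
print(f"{error_rate(125_000, 6_914_884):.2f}%")     # 1.81%

# All namespaces: 170,863 error pages out of 29,968,922 pages.
print(f"{error_rate(170_863, 29_968_922):.2f}%")    # 0.57%
```

Both figures use all articles or all pages as the denominator, not just pages that contain citations, so they understate the error rate among pages that actually cite sources.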
Examples
As we come across good examples of bad uses, please add them to Module talk:Citation/CS1/Rogues gallery. We don't need a lot of duplicates. I think this will be useful in help documentation down the line. -- Gadget850 (Ed) talk 19:27, 5 April 2013 (UTC)
- Some invalid parameters already: There are many missing "=" in parameters, such as "date-May 1994" using hyphen '-' or "first9J." with the text run-together to become unnamed parameter 1. Some people think there should be notation parameters:
- "note=" was used in one page
- "notes=" has been used in some pages
- "unused_data=" has appeared in several pages, as if a formal option.
- Perhaps we should add parameter "note=" or "notes=" to avoid people stuffing extraneous text into "format=" or such. -Wikid77 (talk) 23:46, 5 April 2013 (UTC)
- The purpose of the citation is to identify the source. There are better ways to add notes, and I will look at documenting them. -- Gadget850 (Ed) talk 00:12, 6 April 2013 (UTC)
- I'm working on the bad parameters. "unused_data=" comes from the citation bot. Rjwilmsi 09:23, 6 April 2013 (UTC)
- Any news on this? I am deleting |unused_data= to get rid of the error message. (Sometimes it contains an associated institute, like the author's university). -DePiep (talk) 11:52, 22 April 2013 (UTC)
- When Citation bot comes across a value with no parameter, it moves it to |unused_data=. When we come across this, we need to evaluate it and try to determine the intent of the adding editor. If it is garbage, delete it; if it is valid, then add the appropriate parameter. -- Gadget850 (Ed) talk 12:03, 22 April 2013 (UTC)
- Consider moving information that is identified as unused to a section in the article's talk page.
List of current CS1 error messages
Here is the list of current error messages. I'd like to change some of these (ok most of them) and without objection I will. I like short terse error messages that get to the point without taking too many words to convey the message. I think that the proposed messages do that.
"But wait!" you say, "Aren't some of those messages all the same? How will users know what they mean?"
I propose that each error message get a link to an anchor in the help page. For example, if the help page is at Help:Citation Style 1/CS1 errors and there is a |trans_title= but not a |title= in a citation, then the error message would look something like this:
- Missing or empty |title= (help)
where the help link is something like this: [[Module_talk:Citation/CS1/Help#trans_missing_title|(help)]]
Current | Proposed | Why | Anchor name |
---|---|---|---|
Accessdate used without URL | |accessdate= requires |url= | Identifies the missing parameter | accessdate_missing_url |
Bad DOI (expected "10." prefix) in code number | Check |doi= value | More concise, softer tone | bad_doi |
Check |isbn= value | | | bad_isbn |
Bad OL specified | Check |ol= value | Softer tone | bad_ol |
Check |url= scheme | | | bad_isbn |
Bad page specification here | Extra |pages= or |at= | More concise^a | extra_pages |
Bare URL needs a title | Missing or empty |title= | Identifies the missing parameter | bare_url_missing_title |
Citation has no title | Missing or empty |title= | Identifies the missing parameter | citation_missing_title |
Citation is empty | Empty citation | More concise | empty_citation |
Citation uses old-style implicit et al. for authors | |displayauthors= suggested | Identifies the missing parameter^b | displayauthors |
Citation uses old-style implicit et al. for editors | |displayeditors= suggested | Identifies the missing parameter | displayeditors |
File format specified without giving a URL | |format= requires |url= | Identifies the missing parameter | format_missing_url |
If you specify archiveurl=, you must also specify archivedate= | |archiveurl= requires |archivedate= | More concise | archive_missing_date |
If you specify archiveurl= and deadurl=no, then you must also specify url= | |archiveurl= and |deadurl=no requires |url= | More concise^c | archive_missing_url_not_dead |
If you specify archiveurl=, you must also specify url= | |archiveurl= requires |url= | More concise | archive_missing_url |
No title= specified | Missing or empty |title= | Identifies the missing parameter | cite_web_title |
No URL on cite web | Missing or empty |url= | Identifies the missing parameter | cite_web_url |
Translated chapter included without the original | Missing or empty |chapter= | Identifies the missing parameter | trans_missing_chapter |
Translated title included without the original | Missing or empty |title= | Identifies the missing parameter | trans_missing_title |
Unknown parameter "????=" ignored | | | parameter_ignored |
Unknown parameter "XXXX=" ignored (suggest "xxxx=") | Unknown parameter |XXXX= ignored (|xxxx= suggested) | Consistency | parameter_case |
Unnamed parameter containing "????" ignored | Text "????" ignored | More concise^d | text_ignored |
Wikilink embedded in URL title | | | wikilink_in_url |
^a As I was writing this I noticed that duplicate parameters, for example |page=27 |page=54, is a condition that isn't caught. The last of any parameter is the one that gets used.
^b Not sure if these are truly errors so I've only suggested a fix.
^c Not really sure about this one. |archiveurl= requires |url= when |deadurl= is missing or blank, so we assume that |url= is dead. |archiveurl= also requires |url= when |deadurl= is explicitly set to no, indicating that |url= is not dead. So it doesn't matter what state |deadurl= is in; whenever |archiveurl= is defined and not empty, |url= is required. Right? So isn't it the case that we should be looking for |deadurl=yes without a matching |archiveurl=? Which by the way, is a condition that we don't catch.
^d Without '=', whatever is between pipes is merely text – CS1 doesn't support unnamed parameters.
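The rule worked out in footnote c, together with the proposed help-link format, can be sketched roughly as follows. This is only an illustration in Python, not the module's Lua code: the function names `help_link`, `is_set` and `check_archive_params` are made up here, and the message and anchor for |deadurl=yes without |archiveurl= are hypothetical, since that condition is not caught at present.

```python
# Help page location taken from the example link above.
HELP_PAGE = "Module_talk:Citation/CS1/Help"


def help_link(anchor):
    """Build the wikilink that follows an error message."""
    return f"[[{HELP_PAGE}#{anchor}|(help)]]"


def is_set(params, name):
    """A parameter counts as set only if it is present and not blank."""
    return bool(params.get(name, "").strip())


def check_archive_params(params):
    """Return a list of error strings for the archive-related parameters."""
    errors = []
    if is_set(params, "archiveurl"):
        # Whatever state |deadurl= is in, |archiveurl= needs |url=.
        if not is_set(params, "url"):
            errors.append("|archiveurl= requires |url= "
                          + help_link("archive_missing_url"))
        if not is_set(params, "archivedate"):
            errors.append("|archiveurl= requires |archivedate= "
                          + help_link("archive_missing_date"))
    elif params.get("deadurl", "").strip().lower() == "yes":
        # Hypothetical message and anchor for the uncaught condition.
        errors.append("|deadurl=yes requires |archiveurl= "
                      + help_link("dead_missing_archive"))
    return errors
```

With this shape, the separate "deadurl=no" message becomes unnecessary, because the |url= check no longer depends on |deadurl=.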
Comments?
—Trappist the monk (talk) 18:03, 6 April 2013 (UTC)
- You beat me to this. For the second, I suggest "Bad doi specified". -- Gadget850 (Ed) talk 18:33, 6 April 2013 (UTC)
- Were we having a race? I concur with Bad DOI specified. Added it to the table.
- At present, Bare URL can also occur if titlelink= is specified in addition to url=. The effect is that title= is wikilinked using titlelink= and url= ends up on its own even though a title was specified. We should probably differentiate that case better, and it isn't that rare. A similar thing happens if both chapterurl= and url= are specified but there is only one item that works as a title=. I suspect there may be some other rare cases that also trigger some of the error conditions when given implausible parameter combinations.
- I am actually inclined to make "|archiveurl= requires |url=" with deadurl=yes into not an error. I'm not sure I understand why the original url should be required if it is long dead. The link certainly isn't likely to help anyone, and all the major archiving services will show you the original URL if you really want to know it. At present this condition is only an error for templates other than {{cite web}}, which is also a weird quirk. Dragons flight (talk) 20:41, 6 April 2013 (UTC)
- We keep the original URL because: sometimes it goes live again and you usually need it to find an archive, and some archives go dead. -- Gadget850 (Ed) talk 21:31, 6 April 2013 (UTC)
- Then we should require it for cite web too, but personally I don't see the point of indefinitely preserving links that are dead. That's doubly true with cite book / journal, etc. where the URL is only really supposed to be a convenience and shouldn't be essential information for locating the source anyway. The only case where the URL is often essential would seem to be cite web, and that's the only case where missing the original url is not presently considered to be an error. Dragons flight (talk) 21:43, 6 April 2013 (UTC)
Umm, |titlelink=, really? This:
{{cite episode |title=a title |titlelink=Module talk:Citation/CS1 |url=http://www.example.net}}
→ "a title". {{cite episode}}: Missing or empty |series= (help); Unknown parameter |titlelink= ignored (|title-link= suggested) (help)
Example please?
- {{cite episode}}? That's not presently Lua, and apparently doesn't support titlelink= at all. Try:
{{cite news|title=a title |titlelink=Module talk:Citation/CS1 |url=http://www.example.net}}
→ "a title". {{cite news}}: Unknown parameter |titlelink= ignored (|title-link= suggested) (help)
- Dragons flight (talk) 23:13, 6 April 2013 (UTC)
- Yeah, so I'm not always an idiot – at least I like to tell myself that.
- If one is to believe the {{cite news}} documentation, |titlelink= isn't supported. And, this comparison would seem to bear that out.
Wikitext | {{cite news |
---|---|
Live | "a title". {{cite news}}: Unknown parameter |titlelink= ignored (|title-link= suggested) (help) |
Sandbox | "a title". {{cite news}}: Unknown parameter |titlelink= ignored (|title-link= suggested) (help) |
- If you check the notice at the top of each Lua-based template (LBT), the documentation is not up to date. In the old series of templates, 'titlelink' is supported only by {{Cite DVD-notes}}. Apparently it has been implemented for the Lua versions.
- Again, many of these templates were developed without using {{citation/core}} and later converted (mostly by me). This means that some parameters are different. -- Gadget850 (Ed) talk 01:06, 7 April 2013 (UTC)
- In several of the old-style templates, parameters |albumlink=, |episodelink=, |serieslink=, and |titlelink= were deprecated. Clearly the collective thinking then was that that type of parameter was unnecessary – thinking with which I concur. When we dredge these parameters out of the scrapheap, we end up with cases like this where two urls compete for a single title. Deprecating |titlelink= puts the onus on the editor implementing the citation to choose how the title will be linked which is better than leaving it up to the CS1 citation module to figure out.
- I am aware of your significant and certainly under-appreciated work in wrestling the disparate CS1 templates into a nicely coherent suite.
- —Trappist the monk (talk) 03:35, 7 April 2013 (UTC)
- It wasn't collective thinking, it was just me deprecating those parameters in order to harmonize the series after gaining consensus to update. I see you cleaned up the examples sections on the doc pages: that is one area where I had not done much work, so thanks. -- Gadget850 (Ed) talk 10:49, 7 April 2013 (UTC)
- I started Module talk:Citation/CS1/Help. Just going to populate the headings and anchors for now. We can move it whenever.
- And yes, cite web should do the same- all templates should.
- 'titlelink' is used by {{Cite DVD-notes}}. -- Gadget850 (Ed) talk 22:24, 6 April 2013 (UTC)
Wikitext | {{cite web |
---|---|
Live | "Title". {{cite web}}: |archive-url= requires |url= (help); Check date values in: |archivedate= (help); Missing or empty |url= (help) |
Sandbox | "Title". {{cite web}}: |archive-url= requires |url= (help); Check date values in: |archivedate= (help); Missing or empty |url= (help) |
We should keep the help sections and the error cats synchronized. I think I found my first use for the new <section>. -- Gadget850 (Ed) talk 13:07, 7 April 2013 (UTC)
- That works pretty much. I added <section> tags to |archiveurl= requires |archivedate= and the matching call to Category:Pages with archiveurl citation errors and, sure enough the help text magically appeared on the category page.
- Need to learn about positioning of the tags; the end tag, where it is at the moment, seems to have added an extra newline. Not surprising. I also, without prior intent, chose a section that contains named reference tags so they present a problem to be dealt with. Also the sentence about automatically adding pages to the category ... bold text looks bad.
Note: In the sandbox, I've moved most of the error text and categorization into a new configuration page, so it is easier to adjust and maintain. In doing this migration, I've largely adopted the proposed error messages suggested above. Feel free to adjust it further as necessary. Dragons flight (talk) 21:39, 7 April 2013 (UTC)
- Yep, I saw what you did there. I've tweaked the redundant parameter stuff a bit. You should look to make sure I didn't break anything. Also tell me if there are any differences between your error text and the proposed messages above. Our help page should use exactly the same wording as CS1 emits.
I have worked my way through Module talk:Citation/CS1/Help and made the text more or less read correctly. Each section has <section> tags so that this document can be the master Help document for at least all of the category pages and perhaps elsewhere. I solved the bold text category issue with a little bit of #ifeq:.
Have a look. Fix what I didn't get right.
—Trappist the monk (talk) 01:43, 8 April 2013 (UTC)
- I think you did very good work with the error page. I went through and edited it a bit, but it seems to be in a fairly good state. Dragons flight (talk) 05:27, 9 April 2013 (UTC)
cite journal: Embargo
In {{cite journal}}, the title link is formed from 'pmc' if 'url' is not defined:
Wikitext | {{cite journal |
---|---|
Live | Bannen RM, Suresh V, Phillips GN Jr, Wright SJ, Mitchell JC (2008). "Optimal design of thermally stable proteins". Bioinformatics. 24 (20): 2339–43. doi:10.1093/bioinformatics/btn450. PMC 2562006. PMID 18723523. {{cite journal}}: CS1 maint: multiple names: authors list (link) |
Sandbox | Bannen RM, Suresh V, Phillips GN Jr, Wright SJ, Mitchell JC (2008). "Optimal design of thermally stable proteins". Bioinformatics. 24 (20): 2339–43. doi:10.1093/bioinformatics/btn450. PMC 2562006. PMID 18723523. {{cite journal}}: CS1 maint: multiple names: authors list (link) |
Unless 'Embargo' is set to a future date:
Wikitext | {{cite journal |
---|---|
Live | Bannen RM, Suresh V, Phillips GN Jr, Wright SJ, Mitchell JC (2008). "Optimal design of thermally stable proteins". Bioinformatics. 24 (20): 2339–43. doi:10.1093/bioinformatics/btn450. PMC 2562006. PMID 18723523. {{cite journal}}: Unknown parameter |Embargo= ignored (|pmc-embargo-date= suggested) (help); Unknown parameter |sandbox= ignored (help) CS1 maint: multiple names: authors list (link) |
Sandbox | Bannen RM, Suresh V, Phillips GN Jr, Wright SJ, Mitchell JC (2008). "Optimal design of thermally stable proteins". Bioinformatics. 24 (20): 2339–43. doi:10.1093/bioinformatics/btn450. PMC 2562006. PMID 18723523. {{cite journal}}: Unknown parameter |Embargo= ignored (|pmc-embargo-date= suggested) (help); Unknown parameter |sandbox= ignored (help) CS1 maint: multiple names: authors list (link) |
So, 'Embargo' is not supported. It really should be supported as both upper and lower case, as I have no idea why it was implemented this way. -- Gadget850 (Ed) talk 12:03, 7 April 2013 (UTC)
- I just linked the PMC webpage, unaware of "embargo=" as an option, and when editing pages, I just deleted "embargo=" as thinking it was some extra deprecated complexity because 230 other parameters already seems like a few too plenty. Is that parameter really needed? -Wikid77 (talk) 14:48, 7 April 2013 (UTC)
- I accidentally deleted it from {{cite journal}} once and got immediately reverted, so somebody cares. -- Gadget850 (Ed) talk 15:27, 7 April 2013 (UTC)
- Embargo parameter is useful, though the majority of citations will be to papers that are > 1 year old, so pubmedcentral embargoes (believe they're normally around a year when applicable) aren't common on Wikipedia. Rjwilmsi 21:14, 7 April 2013 (UTC)
- Fixed in sandbox. Dragons flight (talk) 15:36, 8 April 2013 (UTC)
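The behaviour discussed in this section (title linked from 'pmc' when 'url' is not defined, unless the embargo date is still in the future) might be sketched like this. It is a Python illustration under stated assumptions, not the module's Lua code: the function name `title_url` is hypothetical, the PMC URL prefix is assumed, the embargo value is assumed to be already parsed into a date, and accepting both 'embargo' and 'Embargo' reflects the request made above rather than confirmed behaviour.

```python
from datetime import date

# Assumed URL prefix for PubMed Central articles.
PMC_BASE = "https://www.ncbi.nlm.nih.gov/pmc/articles/PMC"


def title_url(params, today):
    """Pick the URL used to link the title, or None if there is none."""
    if params.get("url"):
        # An explicit |url= always wins.
        return params["url"]
    pmc = params.get("pmc")
    if not pmc:
        return None
    # Accept either capitalisation of the embargo parameter (assumption).
    embargo = params.get("embargo") or params.get("Embargo")
    if embargo is not None and embargo > today:
        # Still embargoed: do not link the title to PMC yet.
        return None
    return PMC_BASE + str(pmc)
```

For example, `title_url({"pmc": "2562006"}, date(2013, 4, 7))` yields a PMC link, while adding `"Embargo": date(2014, 1, 1)` to the same parameters yields `None`.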
Auto-correction in cite triage
The categories already show more than 75,000 pages (more than 3 per 100) have CS1 cite errors, to be fixed. However, now with Lua, we have the power to auto-correct many errors, but leave internal warning messages. The basic triage concept is: minor fix, major fix, or undecipherable. For many cases, the easiest auto-correction is to just echo the contents of parameter 1 (or 2). For other parameters, they can be processed, but with hidden warnings:
- "accesdate=" - can be treated as "accessdate="
- "accessadate=" - can be treated as "accessdate="
- "pub=" or "published=" - can be treated as "publisher="
- "news=" or "newpaper=" - can be treated as "newspaper="
- "PDF" - could be "format=PDF".
- "date ..." - can be "date=..." if not already present, else just logged.
- "notes=" or "note=" can be inserted before the postscript text.
- "Title of work" can be echoed, until fixed as "title=" or "chapter=" or such.
Another issue, of the cite-triage processing, is to have 3 triage categories, where the 3rd, "undecipherable" category is what's left after auto-correcting the first 2 cases of minor/major fixes. In that 3rd category, I would put unusual parameter names, such as "size=" or "unused_data=" or "dfuhji=" with no attempt to auto-correct. It seems some users put unknown parameters as internal notes to editors, not displayed to readers. After deployment, the page counts within the cite-triage categories would indicate how many thousands of cite errors were auto-corrected within the few hours to reformat. However, all problems would still be logged, and not considered as error-free. -Wikid77 (talk) 14:48, 7 April 2013 (UTC)
- Not convinced that this suggestion has any significant benefit. What you've described as minor and major fixes aren't really fixes, just masking of cite errors. If Lua could actually repair malformed citations then I might think differently. But it can't, so this seems like a mechanism that will perpetuate poor practice – maybe not actively, but certainly passively by keeping malformed citations out of sight.
- Better, perhaps would be the creation of a bot that looks for all of those things that you've enumerated, fixes them, and adds a |<bot name>=<what it fixed message> parameter to the citation (<bot name> needs to be added to the whitelist to avoid Unknown parameter "|<bot name>=" ignored error messages). When evaluated by Module:Citation/CS1, the |<bot name>=<what it fixed message> parameter will cause the repaired citation's page to be added to a category of pages auto-repaired by that bot so that the repairs may be checked by a human editor. There are already bots out there that do citation repair so this rather elaborate repair mechanism that I've described may be overkill.
- Or, just let CS1 categorize and emit errors so that the army of editors can deal with the malformed citations. We should not passively perpetuate poor practice. Yeah, had to write that - I like alliteration.
- —Trappist the monk (talk) 16:30, 7 April 2013 (UTC)
- I agree: Engineering, education, enforcement. Many editors simply copy/paste from other articles and seem very surprised that templates are documented. -- Gadget850 (Ed) talk 20:42, 7 April 2013 (UTC)
- The Lua module should identify problems, provide error messages and populate tracking categories, as we've already started to do. I don't think the Lua module needs to get too deep into trying to fix these errors, I'd rather have cleaner & leaner module code and bot tasks/manual fixes for the cleanup. I've got an existing bot task and some scripts & AWB logic that work in the area of fixing the errors under discussion, so I'll report back on what I find. Rjwilmsi 21:21, 7 April 2013 (UTC)
- I don't think we should be in the business of trying to work around every misspelling and typo (though I am in favor of making things case insensitive). That said, I agree with the notion of trying to be helpful to users. When an unknown named parameter is encountered, I've added a hook into Module:Citation/CS1/Suggestions, which we can use to provide the editor with a likely alternative. This will in turn be shown to the editor as part of the error message. So, for example, when someone accidentally enters "accesdate" the software will now suggest "accessdate". This list can be expanded to cover a large number of common typos and errors and provide editors with feedback on what they are probably looking for (and Wikid77, you might be a good person to start filling out the list). In the future, we might extend this suggestion mechanism with additional logic to handle other types of cases where a one-to-one map isn't sensible, but it's a start. Dragons flight (talk) 21:24, 7 April 2013 (UTC)
- This I like. We will have to be somewhat cautious and perhaps even limit editing access because you know someone is going to figure out that he can edit the list and have CS1 suggest in technicolor what editors who misspell accessdate can do or go or whatever. I know, that never happens on Wikipedia.
- An obvious source of suggestions is the parameters used in non-English versions of the CS1 templates. Copy/pasta citations don't always get fixed.
- Cites could be auto-corrected plus message: During the validation phase, when the invalid parameters are being scanned, then auto-corrected values would be repaired and placed into the internal parameters, for both display and assignment into the COinS metadata. For example, a misspelled "newpaper=" would be logged as invalid, but treated as "newspaper=" unless already present, or "date-4 May 2012" would set the missing "date=" value to show on the page and put in the COinS tag. The word "PDF" would set format=PDF (unless already set), and similar. There would be "no hurry" to run a Bot, simply to adjust a few cites because they would function as if already fixed, instantly. Then during the coming months, the helpful people could progressively correct all the unusual 10,000 non-auto-corrected names, such as "sentence=It is there." or such. We already auto-correct for text such as "pages=56 and 57." when the "57." would have formerly shown double-dot "57..". However, for auto-corrected parameter names, there would be extra messages logged internally, such as:
- CS1: Parameter text "date-4 May 2012" treated as date.
- CS1: Parameter text "accessed 2013-04-07" treated as accessdate.
- Yet, those auto-corrected cites would also log into a separate Category, as a separate level of concern. Then, based on page-counts in each category, the extent of various types of errors could be better assessed. -Wikid77 (talk) 22:25, 7 April 2013 (UTC)
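The one-to-one suggestion map described in this thread could look roughly like the following. This is a Python sketch, not the contents of Module:Citation/CS1/Suggestions: the names `KNOWN`, `SUGGESTIONS` and `unknown_parameter_message` are hypothetical, the lists are a tiny illustrative subset, and the case-insensitive fallback is an assumption based on the comments above.

```python
# Tiny illustrative subset of supported parameter names (assumption).
KNOWN = {"accessdate", "author", "date", "first", "newspaper", "title", "url"}

# Misspellings mentioned in the discussion, mapped to the likely intent.
SUGGESTIONS = {
    "accesdate": "accessdate",
    "accessadate": "accessdate",
    "newpaper": "newspaper",
    "frist": "first",
}


def unknown_parameter_message(name):
    """Return an error message for an unsupported name, or None if it is known."""
    if name in KNOWN:
        return None
    lowered = name.lower()
    # A name that differs only in case suggests the lower-case form.
    suggestion = lowered if lowered in KNOWN else SUGGESTIONS.get(lowered)
    if suggestion:
        return f"Unknown parameter |{name}= ignored (|{suggestion}= suggested)"
    return f"Unknown parameter |{name}= ignored"
```

Note that this only reports the likely intent; it does not silently treat the misspelled parameter as the correct one, which is the distinction argued over above.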
Cites still run 125/second with bad parameter
I have timed the current speed of the Lua cites, with {cite_web}, to process a simple cite of 6 parameters, with a bad "x=3" or unnamed parameter 1, at 125/second, to confirm no extra validation slowdown of simple cites, even if every {cite_*} had to detect a bad parameter, log a message, and link a tracking category. Taking advantage of Sundays being a "light day" for busy servers, I ran over 50 repetitions of timing the typical 500 cites, and averaged the lowest 4 runtimes as 4.0 seconds, or 125/second. The 2 lowest runtimes were ~3.8 seconds, or ~132/second. In general, the runtime trended around 4.5 seconds per 500, or ~110 cites/second. Each of the 500 cites had 6 parameters:
- # {{cite web |x=3 | author=John Doe |title=Study 17 of life | date=May 1999 | url=http://www.google.com | accessdate=1 June 2009}}
For the unnamed-parameter trials, the parameter 1 was just "x". Before looking at how the Lua whitelist table validates the parameters, I found no significant runtime difference between detecting/logging an invalid parameter name ("x=3") versus an invalid unnamed parameter ("x"). After reviewing the Lua script, which processes a whitelist table (array indexed by parameter name), I saw that no runtime difference should be expected, because all named parameters are checked before the (one or two) unnamed (numbered) parameters. Specific results, for 500 testcases in 50 trials (7 April 2013):
- Runtime: lowest average 4.0, lowest case 3.817, highest 8.2 seconds
- Lua time usage: lowest 0.578s, highest 1.024s
- Lua memory usage: lowest 1.17 MB, highest 1.19 MB (also saw 1.18 MB)
In general, the logged "Lua time usage" was nearly independent of total runtime, where the servers might slow more during some Lua processing, or more outside Lua, in various runs, and a faster runtime of 5.3 seconds might show Lua high as 0.864s, while a longer runtime of 6.4 seconds might show Lua lower as 0.761s. I concluded that busy servers can slow an edit-preview somewhat, at any point, whether during Lua processing or outside in markup parsing. -Wikid77 (talk) 22:25, 7 April 2013 (UTC)
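The cites-per-second figures quoted above follow directly from dividing the 500 cites by each runtime. A small Python check of that arithmetic, using only the runtimes reported in this section (the function name `cites_per_second` is just for illustration):

```python
def cites_per_second(cites, seconds):
    """Throughput for a batch of cites rendered in the given wall-clock time."""
    return cites / seconds


# Runtimes in seconds, as reported for batches of 500 cites.
reported = {
    "average of 4 lowest": 4.0,
    "lowest case": 3.817,
    "typical": 4.5,
    "highest": 8.2,
}

for label, seconds in reported.items():
    print(f"{label}: {cites_per_second(500, seconds):.1f} cites/second")
```

The 4.0-second average gives exactly 125 cites/second; the other runtimes give rates consistent with the rounded values in the text.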
Wannabe parameters
Although half of all "invalid parameters" seem to be parameter 1 data (such as "PDF" or "hockeydb.com" or "2009-05-02" or "05-02" or "Agent France-Presse") which could be echoed after the postscript dot, there are several logical wannabe parameters which could be auto-corrected (even if not officially endorsed parameters). The wannabes include:
- "name=" seems to want to be "author="
- "subtitle=" to follow "title=" (on many pages, perhaps 900, with 500 blank)
- "paragraph=" to follow "page=" such as "paragraph=5th paragraph"
- "note=" or "notes=" to insert a notice.
- "accessed=" to be accessdate.
- "archive_url=" to be archiveurl (rare, few pages)
- "origdate=" to be origyear? (2,604 pages; search: origdate)
It will be interesting to see how many of the wannabe parameters, compared to misspelled words (such as common "accessedate"), keep occurring in the various cites. Parameter "origdate" was also used with "origmonth" in 1,353 pages, but both are almost always blank. -Wikid77 01:38, 8 April 2013 (UTC)
Names for cite-triage categories
To help focus effort on corrections of the more than 100,000 wp:CS1 cite errors, we can use separate categories to diagnose the level of effort needed to fix various errors. Although the basic concept of "triage" is typically a 3-way split into groups (here, minor corrections, major corrections, or complex issues), instead the cite-triage processing splits errors into several groups, where the category for unknown parameters (with over 26,000 pages) has been:
- Category:Pages with citations using unsupported parameters (now: 0 pages)
Obviously, with over 25,000 pages in a heap, the editors cannot quickly fix most errors within 6 months, nor even easily sort out priorities, of which pages to fix first. Instead, the cite-triage categories could separate errors by structural types:
- Category:Pages with citations using misspelled parameters ("agenyc", "frist=", etc.)
- Category:Pages with citations using proposed parameters (wannabes)
- Category:Pages with citations using unknown parameters
- Category:Pages with citations using unnamed parameters (parameter 1/2)
Because of the vast ocean of CS1 cite errors, they cannot be fixed, by editors, quickly, and so any numerous red-error messages could be left in pages for weeks, months, or years, to mar the appearance of pages, as if cite errors were the major concern of Wikipedia quality, compared to no red-error messages for grammatical errors in text, spelling errors, outdated facts, or unsourced statements (etc.). Hence, many cite-error messages must remain hidden, but simple errors could be auto-corrected yet log hidden warning messages, such as "Parameter 'frist=' treated as first name" or "Parameter 'pg=' treated as 'page=' number" and similar fixes. For the pages with auto-corrected cites, then there would be less urgency to fix the auto-repaired data, while providing human editors the ability to focus, primarily, on cite errors which could not be machine-corrected and might be left for years unless editors can be notified to help redo the cite parameters, in manageable groups, not as an immense heap of "25,000" unsorted articles which contain various unsupported parameters. As the numerous prior cite errors are corrected, in coming months, then some red-error messages could be made visible, as unlikely to remain uncorrected for years. -Wikid77 16:02, 8 April 2013 (UTC)
- Unknown vs. unnamed categories could easily be separated by editing the configuration file (as well the help page and creating the associated category pages). I have no objection to separating those two. Beyond that, coming up with more sophisticated logic for handling and sorting errors is not something I'm likely to work on in the near future. Dragons flight (talk) 16:48, 8 April 2013 (UTC)
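The four-way split by structural type proposed above might be sketched as below. This is a Python illustration only: the function name `triage_category` and the three name sets are hypothetical and hold just a few examples from the discussion, and unnamed parameters are assumed to arrive as integer positions.

```python
# Small illustrative sets; real lists would be far longer (assumption).
KNOWN = {"agency", "author", "first", "last", "page", "title"}
MISSPELLED = {"agenyc", "frist"}
PROPOSED = {"note", "notes", "subtitle", "paragraph"}

PREFIX = "Category:Pages with citations using "


def triage_category(name):
    """Return the tracking category for a parameter name, or None if supported."""
    if isinstance(name, int):
        # Positional parameters such as 1 or 2 have no name at all.
        return PREFIX + "unnamed parameters"
    if name in KNOWN:
        return None
    if name in MISSPELLED:
        return PREFIX + "misspelled parameters"
    if name in PROPOSED:
        return PREFIX + "proposed parameters"
    return PREFIX + "unknown parameters"
```

Separating only the unknown and unnamed cases, as offered in the reply above, would amount to dropping the two middle branches.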
Error message: More than one of |param1=, |param2=, and |param3= specified
On Module talk:Citation/CS1/testcases I have seen two versions of this error message:
- More than one of |param1= and |param2= specified
- More than one of |param1=, |param2=, and |param3= specified
So I wondered, how long can the error message get? To find out I conceived of this rather extreme case:
{{cite book |editor=editor |editor1=editor1 |Editor=Editor |Editor1=Editor1 |EditorSurname=EditorSurname |EditorSurname1=EditorSurname1 |editor-last=editor-last |editor1-last |editor-last1 |title=Title |last=Blow |first=Jo}}
That might produce an error message that looks like this:
- More than one of |editor=, |editor1=, |Editor=, |Editor1=, |EditorSurname=, |EditorSurname1=, |editor-last=, |editor1-last=, and |editor-last1= specified
Too long methinks. What if the error message becomes:
- n parameters synonymous with |param= specified – where n = total number of synonymous parameters−1
or keep the two and three parameter versions but when CS1 detects more than three synonyms, emit this (probably pretty rare):
- More than one of |param1=, |param2=, and n others specified – where n = total number of synonymous parameters−2
—Trappist the monk (talk) 15:31, 8 April 2013 (UTC)
- If you do intentionally pathological things, then you'll get pathological results. However, I think it would be very rare for an editor to accidentally specify more than three redundant non-empty fields. Even three ought to be much rarer than two. For that reason, I'm disinclined to worry about this. Dragons flight (talk) 15:45, 8 April 2013 (UTC)
- I come from a world where it's important to design for extremes. So, yeah, I often look at something and wonder what would happen if ... Call me pathological if you wish, but I won't stop doing it.
- It wasn't meant personally. I'm just saying you can often find bad behavior by looking for pathological inputs. In this case it leads to long output, but I'd consider that to be an acceptable response to what ought to be a very unusual use case. (Even more acceptable if we agree to use normal sized font.) Dragons flight (talk) 16:32, 8 April 2013 (UTC)
- If I had taken offence, you'd know it.
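The second alternative above (keep the two- and three-parameter wordings, and abbreviate beyond three) could be sketched as follows. This is a hypothetical Python illustration of the proposal, not what the module emits; the function name `redundant_message` is made up here.

```python
def redundant_message(names):
    """Format the 'more than one of' error for a list of synonymous parameters."""
    if len(names) < 2:
        raise ValueError("need at least two redundant parameters")
    shown = [f"|{name}=" for name in names]
    if len(names) == 2:
        listing = f"{shown[0]} and {shown[1]}"
    elif len(names) == 3:
        listing = f"{shown[0]}, {shown[1]}, and {shown[2]}"
    else:
        # More than three synonyms: name two, count the rest (n = total - 2).
        listing = f"{shown[0]}, {shown[1]}, and {len(names) - 2} others"
    return f"More than one of {listing} specified"
```

For the nine-synonym extreme case above this gives "More than one of |editor=, |editor1=, and 7 others specified".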
class = error and text sizing
In the recent updates, I added "class = error" to all the error messages. This has the effect of increasing the font size to 120% and making it red:
- Some text with an error message...
Personally, I don't actually like the resizing of the error text. I think it makes it more prominent than is really necessary and at times can interfere with citation flow. Compare to:
- Some text with an error message...
What would people think of keeping the error messages the same font size as the rest of the citation? Dragons flight (talk) 15:56, 8 April 2013 (UTC)
- Yes, same size as citation text.
- —Trappist the monk (talk) 16:04, 8 April 2013 (UTC)
- Concur: normal size for all errors, and the same style for all errors. -- Gadget850 (Ed) talk 16:49, 8 April 2013 (UTC)
- In the sandbox, I've made all the errors normal sized. Dragons flight (talk) 22:17, 8 April 2013 (UTC)
- See thread: "#Perhaps consider fix-cite superscripts". -Wikid77 16:51, 8 April 2013 (UTC)
Help page location
It seems odd to me to use Module talk:Citation/CS1/Help for the help page. For starters, keeping it in Module talkspace means it can't easily have a talk page of its own. If this is intended to just be a temporary location before moving it to somewhere else, such as Help:Citation errors (or some such thing), then perhaps it would be good to just go ahead and move it. Dragons flight (talk) 16:24, 8 April 2013 (UTC)
- As noted above someplace, I intended for this to be moved based on whatever prefix we came up with. -- Gadget850 (Ed) talk 16:47, 8 April 2013 (UTC)
- The next time I resync the sandbox into the live version, the (help) link will be added to all the error messages, including the presently visible ones. I'd prefer that the help page have a slightly more permanent home before that. Obviously, we can still continue to move it after that. I just think it would be good to get it out of Module_talk. I don't really have a strong opinion about the name beyond that. Dragons flight (talk) 17:17, 8 April 2013 (UTC)
- Help:CS1 errors was first suggested by Editor Gadget850. I suggested Help:Citation Style 1/CS1 errors. Editor Dragons flight has suggested Help:Citation errors. Three editors, three locations.
- Because CS1 seems to be a common term that has come to be associated with this project and because two of us have used it in our help page location suggestions, I think that we should use Editor Gadget850's Help:CS1 errors suggestion.
- Without objection, I will make it so, shall I?
- I'm fine with whatever. At some point we may need to take up the potential confusion that the WP:CS2 template {{citation}} is also based on this "CS1" module, but we don't have to sort that out right now. Dragons flight (talk) 18:34, 8 April 2013 (UTC)
- Right, that looks like three in favor of Help:CS1 errors assuming that nominator Gadget850 concurs.
- —Trappist the monk (talk) 18:57, 8 April 2013 (UTC)
- Help:CS1 errors works for me. When codified and named CS1 and CS2, I really did not anticipate that CS2 would really catch on. CS2 uses some different default punctuation and it doesn't have a bunch of specific purpose templates. -- Gadget850 (Ed) talk 19:14, 8 April 2013 (UTC)
- Done
- Excellent. Dragons flight (talk) 22:17, 8 April 2013 (UTC)
Perhaps consider fix-cite superscripts
Because of the logistical nightmare of fixing over 27,000 pages of unsupported parameters, which will take many wiki-manmonths (years), then I think we also need to consider "non-red" error messages (superscript notes), which would be acceptable when displayed in pages. I am now thinking to hide a blue-wikilinked "#note" (explaining each specific error) inside a fix-cite superscript, for various complex cases. Such as:
- John Doe (7 May 1987). "Name of article". Our Journal.[fix cite]
In that hypothetical example, the linked text is "Help:Cite errors#Fix misspelled journal parameter" where the cite contained misspelled "jounral=Our Journal" and the Lua would have auto-corrected the spelling error but left a "quiet" [fix-cite] superscript, which would not be as garish as a red-error message, but breaks the "glass ceiling" of Lua being able to talk to numerous editors, in the human world, with messages to send details in a major way. The red-error-message interface, as a "programming paradigm" to contact users, is severely limiting our ability to notify the 9,200 monthly article editors to help fix cite errors. Instead, let's consider a lot more use of the tiny "[fix cite]" superscripts (with error-message text inside wikilinks), to break the communication barrier with thousands of human editors. -Wikid77 (talk) 16:51, 8 April 2013 (UTC)
- My initial reaction to this is that a small blue superscript will be lost in the sea of blue citation links, wikilinks, and {{dead link}}s (who ever sees them?). I also wondered about the need to link to another page to find out what "fix cite" meant.
- But then I thought about {{abbr}} which might obviate the need to link to another page:
- John Doe (7 May 1987). "Name of article". Our Journal.[fix cite]
- Not really kosher with web content guidelines, though.
- —Trappist the monk (talk) 18:26, 8 April 2013 (UTC)
- The de facto standard for error messages of this type is strong red. I am for ditching the strong. We could add a few classes so that editors can do some styling as they desire; I can take a stab at that. -- Gadget850 (Ed) talk 23:19, 8 April 2013 (UTC)
- I'm already impressed that we can pull out the errors into a number of tracking categories. At this point in time I wouldn't worry about further large changes to the error display as I have confidence that the majority of existing errors can be corrected in a handful of weeks, and parameter renaming bot tasks etc. can be extended to correct future errors. Rjwilmsi 22:33, 9 April 2013 (UTC)
- Regarding superscripted error messages: These use {{fix}} and include messages such as [citation needed]. These types of messages are added after the fact by other editors or bots. The de facto standard for immediately recognized errors is the red message. -- Gadget850 (Ed) talk 13:04, 18 April 2013 (UTC)
- I guess I'm not as optimistic as Rjwilmsi that these errors will be fixed quickly, based on our other numerous backlogs. I wish we could remove the red messages and keep the tracking categories, work hard to clear up the errors, and then only readd the red or superscripted errors to clean up issues that bots can't/don't do. GoingBatty (talk) 22:48, 18 April 2013 (UTC)
- Then how will editors learn to properly use the templates? If we don't immediately inform them that there is a problem, then they will simply propagate the issues. It would then be a matter of you broke it and someone else will fix it. We have had red error messages for other systems like <math> and Cite. When the footnotes system introduced the error messages, there were complaints, but we got a help system in place and resolved the issues. There are editors and bots actively patrolling the footnotes error categories and fixing stuff. Those messages were imposed by a developer and appeared by surprise (but they were better than the previous numeric error messages). Here, we had the help system in place before the messages started to appear. If we eliminate the messages, then we shouldn't bother to do the error checking. -- Gadget850 (Ed) talk 00:04, 19 April 2013 (UTC)
Error categories
Right now, all of the eighteen current and sandbox categories specified by CS1 are subcategories of Category:Articles with incorrect citation syntax. I'm wondering if they should be moved into another category specifically for CS1 messaging. The category should be general in nature so that should, for example, a decision be made to reduce the number of parameters by deprecating some of them, CS1 might categorize pages with deprecated parameters into categories that would be added to the general CS1 messaging category.
Category:Articles with incorrect citation syntax holds pages that are categorized by the old-style citation templates and by {{citation error}}. There are eight individual pages in the category that will likely never go away.
So, I propose a new category for CS1 messaging: Category:CS1 errors and messages or, maybe, Category:CS1 messaging. Within that there could be separate subcategories for messages and errors or, since there are relatively few error-message categories, perhaps it would be better to simply give all of the subcategories names that clearly identify them as error categories or as message categories. Individual pages do not get categorized into the CS1 messaging category.
Here is a list of the current and sandbox error categories and proposed replacement names that clearly identify them as error categories. If a decision is made to have separate error and message subcategories, then the proposed error categories might all drop the "errors" word that ends the category names – or not.
Neat and tidy and the new not mixed with the old.
—Trappist the monk (talk) 17:29, 8 April 2013 (UTC)
- I have no strong objection to this naming scheme (or frankly most any other naming scheme). However, in the past I recall doing something similar and having people complain that I needed "Pages with" or the equivalent in front of each one. Apparently some people feel it is important to distinguish that "X" contains an error but "X" is not itself an error, and hence use "Pages with error of type" rather than "Errors of type". Or something like that anyway. Frankly, it all seemed very silly to me, but if it is likely to be an issue you might want to ask for recommendations at WT:CFD or somewhere.
- Since you are talking about the categories, I will also mention that I do like Wikid77's suggestion of splitting the "unsupported" parameters into "unknown" and "unnamed" (or some words like that) to distinguish and separate the case of parameters we don't understand from the case of random text with no field indicator. Dragons flight (talk) 20:15, 8 April 2013 (UTC)
- I have managed to ask at Wikipedia talk:Category names#Seeking input regarding the naming / renaming of categories used by the new Lua-based citation templates. Not where you suggested. I followed several links from WT:CFD and ended up at category names. I'll see what comes from that. I did peruse WP:NCCAT and didn't find anything that said "Pages with blah blah blah" are required. In fact, the string "Pages with" does not appear there.
- No problem splitting unsupported into unknown and unnamed. Can we wait for some response from those who lurk Wikipedia talk:Category names?
- —Trappist the monk (talk) 22:34, 8 April 2013 (UTC)
- We went through a few different names before we got Category:Pages with citation errors straight. -- Gadget850 (Ed) talk 23:10, 8 April 2013 (UTC)
- I went ahead and created Category:Pages with citations using unnamed parameters to split off the unnamed parameters, and I also put some text in Category:Pages with citations having redundant parameters. Of course we can go fix all the names later, whenever we figure out what we should be using. I think we are doing pretty good now, and hope to sync the sandbox tomorrow morning, which will expose the first set of updated error messages (those corresponding to already visible errors). Dragons flight (talk) 05:25, 9 April 2013 (UTC)
- The separate category, for unnamed parameters, helps to confirm if half of all "unsupported parameters" omit the "=" as evidenced when editing many of the current cite errors. -Wikid77 06:30, 9 April 2013 (UTC)
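As a rough illustration of the split discussed above, the following Python sketch separates "unnamed" parameters (text with no "=", which a template receives under a numeric key) from "unknown" named parameters. It is only a sketch under stated assumptions: the `SUPPORTED` set is a tiny made-up whitelist, `classify_parameters` is a hypothetical name, and the real module's Lua logic may differ.

```python
# Tiny illustrative whitelist; the real templates accept hundreds of names.
SUPPORTED = {"title", "url", "author", "year", "isbn", "accessdate"}


def classify_parameters(args):
    """Split unsupported citation parameters into unnamed and unknown.

    `args` mimics template arguments: named parameters have string keys,
    while text given without '=' arrives under an integer key.
    """
    unnamed, unknown = [], []
    for key in args:
        if isinstance(key, int):
            unnamed.append(key)      # e.g. stray text with no field name
        elif key not in SUPPORTED:
            unknown.append(key)      # e.g. a misspelled or made-up name
    return unnamed, unknown


print(classify_parameters({"title": "My Book", 1: "stray text", "x": "3"}))
```

Each list could then feed its own tracking category, which is what makes it possible to count how many "unsupported parameter" cases are really just a missing "=".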
Returning to this subject. Conversation at WT:CfD#Seeking input ... has been somewhat ... well, there hasn't been much of it. What conversation there was, did show that my originally proposed names were somewhat ambiguous. I think that I've fixed that and new proposed names are in the table.
If we are to adopt new names for the CS1 tracking categories, I see no reason not to proceed.
Further input on the archiveurl inconsistency
I've asked the broader community, via VPP, for input on the inconsistent errors associated with archiveurl=: Wikipedia:Village pump (policy)#Citations: Should the original url= be required when using archiveurl=. Dragons flight (talk) 18:43, 8 April 2013 (UTC)
Scribunto upgrade runs 190/second as 50% faster
- Previously: "#Cites still run 125/second with bad parameter"
Newsflash - this just in (changes are happening so fast now). The MediaWiki software has been upgraded now to 1.22wmf1 (7bb4399), and this includes the quicker Scribunto interface to make Lua script functions start much faster. During the next few days, there might be some minor performance problems in various gadgets, or such, as they are adjusted for the upgraded software. However, the preliminary tests have revealed that the Lua-based wp:CS1 cites are reformatting, today, at over 190 per second, versus only 14 per second for the markup-based cites of last year (190/14 = 13.6x faster). Compared to the March Lua performance, typical wp:CS1-style citation footnotes now reformat ~50% faster.
I have, again, timed the current speed of the Lua cites, with {cite_web}, to process a simple cite of 6 parameters, with a bad "x=3" or unnamed parameter 1, at 190/second, to confirm no extra validation slowdown of simple cites, even if every {cite_*} had to detect a bad parameter, log a message, and link a tracking category. Specific results, for 500 testcases in 20 trials (9 April 2013):
- Runtime: lowest average 2.6, lowest case 2.579, highest 4.1 seconds
- Lua time usage: lowest 0.488s, highest 0.731s
- Lua memory usage: lowest 1.26 MB, highest 1.27 MB (no others)
An interesting side note is that Lua memory increased to 1.26 MB, from ~1.18 MB, for the exact same 500 testcases. Also, the Lua time usage seems to be trending lower, but perhaps that is a result of the Scribunto interface running very quickly (over 60% faster?), so that busy servers had less of a chance to extend the Lua time usage than when the runtime was almost 4 seconds (rather than 2.6). In general, the longer the runtime, the greater the chance of severely busy servers extending various aspects of page reformatting. That is why page runtimes over 21 seconds would sometimes hit the 60-second timeout for wp:Wikimedia Foundation error. -Wikid77 (talk) 06:30, 9 April 2013 (UTC)
- For the record, my test bed of 310 citations taken from Barack Obama moved from about 100 citations / second to about 130 citations / second, i.e. 30% faster. It's a more complex sample, so it is not entirely surprising that the reduced Lua overhead doesn't make quite as large a difference. That said, it is still excellent progress. Dragons flight (talk) 06:41, 9 April 2013 (UTC)
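The figures quoted in this section can be cross-checked with a few lines of arithmetic. The Python sketch below only restates numbers given above (500 testcases in 2.6 seconds, 190 versus 14 cites per second, 190 versus the earlier 125 per second, and 130 versus 100 for the Barack Obama test bed); the helper name `speedup` is made up for illustration.

```python
def speedup(new_rate, old_rate):
    """Ratio of two throughput figures (cites per second)."""
    return new_rate / old_rate


rate_from_timing = 500 / 2.6          # 500 testcases in the lowest average runtime
lua_vs_markup = speedup(190, 14)      # Lua cites vs. last year's markup cites
april_vs_march = speedup(190, 125)    # after vs. before the Scribunto upgrade
obama_test_bed = speedup(130, 100)    # the more complex 310-citation sample

print(f"{rate_from_timing:.0f} cites/second from the timing run")
print(f"{lua_vs_markup:.1f}x faster than markup-based cites")
print(f"{(april_vs_march - 1) * 100:.0f}% faster than the March Lua figure")
print(f"{(obama_test_bed - 1) * 100:.0f}% faster on the Obama test bed")
```

The timing run works out to roughly 192 per second, consistent with the "over 190" claim, and 190/125 is about a 52% gain, in line with the "~50% faster" summary.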
Taxon template tweak for Category:Pages with citations using conflicting page specifications cleanup
Would a template expert please help to see if we can tweak the MSW3 family of taxonomy templates to get transclusions out of Category:Pages with citations using conflicting page specifications? See User_talk:WolfmanSF#Template:MSW3_Didelphimorphia_and_page_numbers. Thanks Rjwilmsi 08:27, 9 April 2013 (UTC)
- Was this not fixed a few days ago [9]? Dragons flight (talk) 09:36, 9 April 2013 (UTC)
- Ah, seems it was. I'll thank that editor. Rjwilmsi 22:22, 9 April 2013 (UTC)
ISBN checking
Just a note for watchers here who might not be following developments in the code. Last night, I added a function to the sandbox to check for invalid ISBNs. In this context, invalid means wrong length, extraneous characters, or wrong check digit. This complements existing checks on OL and DOI numbers. Dragons flight (talk) 16:45, 9 April 2013 (UTC)
Wikitext | {{cite book |
---|---|
Live | Doe, John (1943). My Book. ISBN 1234567890. {{cite book}}: Check |isbn= value: checksum (help); Unknown parameter |sandbox= ignored (help) |
Sandbox | Doe, John (1943). My Book. ISBN 1234567890. {{cite book}}: Check |isbn= value: checksum (help); Unknown parameter |sandbox= ignored (help) |
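To make the three failure modes concrete (extraneous characters, wrong length, wrong check digit), here is a small Python sketch of an ISBN check. It is not the sandbox's Lua function: the name `check_isbn`, the returned labels, and the order of the checks are assumptions chosen for illustration. The check-digit rules themselves are the standard ones: ISBN-10 uses weights 10 down to 1 modulo 11 (with X standing for 10 in the last position), and ISBN-13 uses alternating weights 1 and 3 modulo 10.

```python
DIGITS = "0123456789"


def check_isbn(raw):
    """Return None for a valid ISBN, else a short label for the problem.

    Hyphens and spaces are ignored. Labels are illustrative only:
    'invalid character', 'length', or 'checksum'.
    """
    s = raw.replace("-", "").replace(" ", "").upper()
    if any(c not in DIGITS + "X" for c in s):
        return "invalid character"
    if len(s) not in (10, 13):
        return "length"
    if "X" in s[:-1] or (len(s) == 13 and s[-1] == "X"):
        return "invalid character"   # X is only legal as an ISBN-10 check digit
    if len(s) == 10:
        total = sum((10 - i) * (10 if c == "X" else int(c)) for i, c in enumerate(s))
        return None if total % 11 == 0 else "checksum"
    total = sum(int(c) * (1 if i % 2 == 0 else 3) for i, c in enumerate(s))
    return None if total % 10 == 0 else "checksum"


print(check_isbn("1234567890"))   # the test value used in the table above
```

The table's test value 1234567890 fails only the check-digit step, which agrees with the "checksum" message shown for both the live and sandbox renderings.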
- As we may have dozens of ISBNs on a page, I feel some sort of visual indicator is called for, in addition to the category. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 22:43, 9 April 2013 (UTC)
- At this phase in development / testing, most of the new error messages are hidden by default. You can follow the directions at Category:Pages with ISBN errors to make the error message visible for you. In the future, most or all of these messages will eventually be made visible for everyone. Dragons flight (talk) 22:49, 9 April 2013 (UTC)
A bug? Shouldn't |ignore-isbn-error=true quash this error?
{{cite book | title=My Book | author = Doe, John | year = 1943 | ISBN = 1234567890 | no-tracking = true |ignore-isbn-error=true}}
- →Doe, John (1943). My Book. ISBN 1234567890. {{cite book}}: Check |isbn= value: checksum (help); Unknown parameter |ignore-isbn-error= ignored (|isbn= suggested) (help)
And while it occurs to me, a space between the ISBN and the error message?
—Trappist the monk (talk) 00:58, 13 April 2013 (UTC)
- All better. Dragons flight (talk) 01:14, 13 April 2013 (UTC)
- Error when both ISBN 10 and 13 together: I quickly found an example where the user put both valid ISBN 10+13 into the same "isbn=" and that combination also gave an erroneous error message "Check |isbn= value (help)" in the following cite:
- Mumford, Stan Royal (1989). Himalayan Dialogue: Tibetan Lamas and Gurung Shamans in Nepal (illustrated ed.). Madison, WI: University of Wisconsin Press. p. 264. ISBN 029911984X, 9780299119843. Retrieved 23 January 2013. {{cite book}}: Check |isbn= value: invalid character (help)
- Google Search confirms they are the correct ISBNs, and hence both are valid ISBN 10 and 13. Creating retro-active rules, to outlaw prior actions or parameters, can lead to numerous artificial problems, where none actually existed. I suggest to drop the ISBN validation, to free "12,582" pages as no longer rejected as erroneous. -Wikid77 (talk) 06:51, 13 April 2013 (UTC)
- No, it is still an error since including both versions of the ISBN breaks Special:BookSources and makes the link worthless. Secondly it is unnecessary since every 10-digit ISBN XXXXXXXXX-Y can be mapped directly into a 13-digit ISBN 978-XXXXXXXXX-Z, where 9 digits are copied over and the final check digit is recalculated. There is no real information content gained from reporting both values when all you are doing is including the universal 978 prefix and a new check digit. (13-digit ISBNs can be issued with other prefixes, but all 10-digit ISBNs map to the 978 prefix.) The redundant ISBN should be removed to allow Special:BookSources to function. Dragons flight (talk) 07:07, 13 April 2013 (UTC)
- Agree. I have seen some odd stuff in the 'isbn' field, mostly where someone wants to include ISBNs for different versions of the book. -- Gadget850 (Ed) talk 12:12, 13 April 2013 (UTC)
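The mapping described in this thread (copy the first nine digits, prepend 978, recalculate the check digit) can be sketched in a few lines of Python. The function name `isbn10_to_isbn13` is hypothetical and the input is assumed to be an already valid ISBN-10; the check digit follows the standard ISBN-13 rule of alternating weights 1 and 3, modulo 10.

```python
def isbn10_to_isbn13(isbn10):
    """Map a 10-digit ISBN onto its 978-prefixed 13-digit form.

    Assumes the input is already a valid ISBN-10; its old check digit
    (possibly 'X') is dropped and a new one is computed.
    """
    s = isbn10.replace("-", "").replace(" ", "")
    if len(s) != 10:
        raise ValueError("expected a 10-character ISBN")
    body = "978" + s[:9]                       # nine digits are copied over
    total = sum(int(c) * (1 if i % 2 == 0 else 3) for i, c in enumerate(body))
    check = (10 - total % 10) % 10             # recalculated final digit
    return body + str(check)


print(isbn10_to_isbn13("029911984X"))
```

Applied to the Mumford example, the 10-digit value 029911984X yields 9780299119843, the same 13-digit ISBN that was entered alongside it, which is why listing both adds no information.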
Returning to |ignore-isbn-error=true, should we consider categorizing citations that use this parameter? I haven't talked myself into the need for a visible error message yet (though there should be a hidden message), but it would seem that citations with less than pristine ISBNs should be categorized perhaps as a subcategory of Category:Pages with ISBN errors.
—Trappist the monk (talk) 17:43, 13 April 2013 (UTC)
ISBN cat
Editor Dragons flight left this message on my talk page. I've moved it here because I think that discussions about my work should take place where all can participate should they so choose.
Regarding the ISBN cat, I chose Category:Articles with invalid ISBNs because it already existed and I figured it wasn't really necessary to create a new Category:Pages with invalid ISBNs for this. Do you disagree? Obviously it doesn't fit in with the current naming structure, but we're going to be reworking that anyway. Dragons flight (talk) 16:15, 9 April 2013 (UTC)
Just as it occurs to me that you chose Category:Articles with invalid ISBNs because it already exists, here comes your message. How did you do that? In the time since I changed the category to Category:Pages with invalid ISBNs I have changed it yet again to Category:Pages with ISBN errors so that it is consistent with Category:Pages with DOI errors and Category:Pages with OL errors.
I have this notion in my head that all errors that CS1 detects should be categorized into its own categories so that you and Editor Gadget850 and anyone else who is interested can see what errors are being caught by CS1 without having to sort out those errors caught by other methods. Using preëxisting categories like Category:Articles with invalid ISBNs doesn't accomplish that goal. Certainly, Category:Pages with ISBN errors or whatever it becomes could be made a subcategory of Category:Articles with invalid ISBNs but I think that CS1 should use its own categories. This means that the DOI and OL category assignments in data.error_conditions as they are now should be changed.
Does any of this make sense or am I up in the night?
—Trappist the monk (talk) 16:57, 9 April 2013 (UTC)
- How did I do it? Module:MindReading of course, you don't use that? I figured it was simpler to use an existing cat, and that matched the process with DOI and OL errors. However, if we do use a separate cat, then it wouldn't matter if it is enabled as a hidden error, which has the benefit of letting us triple check the validation before showing it to people. Either way is probably okay. Do you want to document the separate cat? Dragons flight (talk) 17:21, 9 April 2013 (UTC)
- Is that the Module:MindReading that requires Ctrl+M+R+Insert to install? That's probably why all I get is a blank page. Fingers aren't long enough.
- Documenting a separate category is pretty much a case of copy/pasta from another category and changing the #lst: magic word to {{#lst:Help:CS1 errors|bad_isbn_help_text}}, isn't it? I can do that when I return in a couple of hours but if you'd like ...
- Okay, I created Category:Pages with ISBN errors. I also got to discover bugzilla:47049 in the process. Now don't you feel like you missed out. :-) Dragons flight (talk) 18:51, 9 April 2013 (UTC)
- Yes I do, though I'm not sure I completely understand the issue. I've added a comment to Help:CS1 errors so that we are reminded of this issue when and if we change all of the category names.
Too many excessive error messages
Among all the thousands of excessive cite error messages, there is still "PDFFile format specified without giving a URL" where the URL is generated from the "doi=" option which apparently links to a PDF-format file "dx.doi.org/10.1098%2Frstl.1823.0020" at the dx.doi.org website. Note:
- Davy, Humphry (1823). "On the Application of Liquids Formed by the Condensation of Gases as Mechanical Agents". Philosophical Transactions. 113 (0): 199–205. doi:10.1098/rstl.1823.0020. {{cite journal}}: |format= requires |url= (help)
I guess we should list all the excessive error messages here, to remove them from the /sandbox version. Did we decide that "accessdate=" can be used with "doi=" or "bibcode=" or really with no restriction, to let editors focus instead on real cite errors, not 45,000 pages where people did not set "url=" with the accessdate. -Wikid77 (talk) 19:32, 9 April 2013 (UTC)
- Perhaps these should then be warnings rather than errors? Rjwilmsi 20:31, 9 April 2013 (UTC)
- Less than 3% of articles have errors. If the messages are shown by default and fixed, then that number will go down a lot.
- What is the difference between an error and a warning?
- 'format' applies to 'url' and shows after the linked title; it has never been intended to apply to any other parameter; the link in question is a web page with a framed PDF.
- 'accessdate' has never been applied to 'doi' or 'bibcode', as these should be static documents that do not change.
- -- Gadget850 (Ed) talk 21:08, 9 April 2013 (UTC)
- Personally, I find it a little sad that less than half of the 6,914,884 "articles" have any citation template at all. Dragons flight (talk) 23:18, 9 April 2013 (UTC)
- Many articles use other systems and many are just stubs. Some editors just hate citation templates. -- Gadget850 (Ed) talk 00:32, 10 April 2013 (UTC)
- Performance was one of the main complaints, so the Lua conversion should help. Rjwilmsi 05:05, 10 April 2013 (UTC)
Large update
I've resynced the main code to the sandbox. This is a large diff [10] due to a number of internal changes in the code, so the chance of overlooking something is larger than average.
The new version brings a configuration file for the handling of errors, IDs, and translations. It also introduces new conditions for ISBN errors and redundant parameters. In addition, the unnamed parameters were shifted to a separate cat from the unsupported parameters. This version also adds a suggestion list function for unsupported parameters.
This update also introduces the (help) link into error messages. At present, the visibility of errors has not yet been changed, so the visible message and help link will be evident to users only on errors that were already visible (e.g. Bad DOI, Bad OL, Empty Citation, Archive URL errors, etc.), but not on any of the new error conditions that we have been adding. Dragons flight (talk) 21:09, 9 April 2013 (UTC)
- Good. Glad to see that you've protected Module:Citation/CS1/Configuration from us lesser mortals. I think you ought to do that with Module:Citation/CS1/Suggestions as well. We can play in the sandbox without hurting anything but shouldn't have the opportunity to suggest rude words and phrases when an editor misspells accessdate.
- I'm inclined to leave it open for now in the hopes that doing so will encourage it to be expanded. If it becomes a problem it can be locked down of course. However, it is only consulted on the 10-20 thousand pages with named problem parameters, and then the output value is only displayed when there is a match on the specific input value, so the potential for vandalism isn't actually that great. And that potential for vandalism impact will actually go down as the unsupported parameters are fixed. Dragons flight (talk) 22:36, 9 April 2013 (UTC)
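The point about limited vandalism exposure follows from how an exact-match suggestion list behaves, which the Python sketch below tries to illustrate. It is not the contents of Module:Citation/CS1/Suggestions: the two table entries and the function name are hypothetical, and only the message wording is modelled on the "Unknown parameter ... ignored (... suggested)" output seen elsewhere on this page.

```python
# Hypothetical entries; the real suggestion list lives in its own module page.
SUGGESTIONS = {
    "jounral": "journal",
    "acessdate": "accessdate",
}


def unknown_parameter_message(name):
    """Build the message for an unsupported parameter name.

    A suggestion is appended only when `name` exactly matches a key in
    SUGGESTIONS, so an entry is never shown unless that specific
    misspelling occurs in a citation.
    """
    message = f"Unknown parameter |{name}= ignored"
    suggestion = SUGGESTIONS.get(name)
    if suggestion:
        message += f" (|{suggestion}= suggested)"
    return message


print(unknown_parameter_message("jounral"))
print(unknown_parameter_message("sandbox"))
```

Because the lookup is keyed on the specific bad input, a rude entry would surface only on pages that happen to contain that exact misspelling, and it stops surfacing once the citation is fixed.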
Migration of cite press release
I have now converted {{cite press release}} to use Lua. Testcases: Module talk:Citation/CS1/test/press. This was the most prominent of the remaining cite_* templates, but still has only 16,800 uses. After this update propagates through, {{citation/core}} should be reduced to around 50,000 pages for the combined effect of all the remaining templates. Dragons flight (talk) 22:12, 9 April 2013 (UTC)
A bug I think
This citation from Elena Ivanova:
{{cite web | url = http://samara-sport.ru/?page_id=37 | title = | language = Russian | trans_title = Figure Skating | publisher = GUOR official website | accessdate = 2009-03-07 }}
- (in Russian). GUOR official website http://samara-sport.ru/?page_id=37. Retrieved 2009-03-07. {{cite web}}: Missing or empty |title= (help); Unknown parameter |trans_title= ignored (|trans-title= suggested) (help)
This citation produces two hidden error messages. The first is: Missing or empty |title= ([[Help:CS1 errors#trans_missing_title|help]]) The second is: Wikilink embedded in URL title (help)
I have modified the above {{cite web}} to use an empty |chapter= and |trans_chapter=:
{{cite web | url = http://samara-sport.ru/?page_id=37 | chapter = | language = Russian | trans_chapter = Figure Skating | publisher = GUOR official website | accessdate = 2009-03-07 }}
- (in Russian). GUOR official website http://samara-sport.ru/?page_id=37. Retrieved 2009-03-07. {{cite web}}: Missing or empty |title= (help); Unknown parameter |trans_chapter= ignored (|trans-chapter= suggested) (help)
This citation also produces two hidden error messages. The first is: Missing or empty |chapter= ([[Help:CS1 errors#trans_missing_chapter|help]]) The second is: Wikilink embedded in URL title (help)
—Trappist the monk (talk) 00:58, 10 April 2013 (UTC)
- Fixed. Dragons flight (talk) 01:23, 10 April 2013 (UTC)
- Not really related, but I also created and then tweaked Module:Citation/CS1/Configuration/sandbox to fix an anchor error. That needs to be checked and then synced.
- Also done. Dragons flight (talk) 01:51, 10 April 2013 (UTC)
- Good. Thanks. I think that all of the error message links go to the correct places in Help:CS1 errors.
Proposed release schedule for revealing errors
Grouping | Error Type |
---|---|
1 | Citations using unsupported parameters |
1 | Citations using unnamed parameters |
1 | Citations having redundant parameters |
1 | Citations with ISBN errors |
1 | Citations having wikilinks embedded in URL titles |
1 | Citations using conflicting page specifications |
2 | Web citations with archiveurl missing url |
2 | Citations having bare URLs |
2 | Citations lacking titles |
2 | Citations using translated terms without the original |
2 | Citations with URL errors |
3 | Web citations with no URL |
3 | Citations with format and no URL |
3 | Citations with accessdate and no URL |
3 | Citations with old-style implicit et al. |
3 | Additional error conditions not yet defined... |
In the interest of not scaring everyone with red ink, I would like to propose introducing the hidden error messages to the general public in three batches. The first batch is grouped around errors where the user input is known to be malformed or can't be understood. The second batch is grouped around title and url errors, and generally something has to be added to correct these errors. The last batch are errors that generally produce reasonable output as is and usually require only relatively minor fixes (as well as other error conditions that we might define later on). I'm imagining making group one visible this week, group two next week, and group three some time later. Part of it may be dealing with community reaction. If there is a lot of push back, it may make sense to roll things out more slowly. Anyway, those are my thoughts. Other opinions? It would be nice, though not essential, to get the category names fixed before we dive into this. Dragons flight (talk) 05:28, 10 April 2013 (UTC)
- Yes, definitely categories first. There has been no response to my category naming questions at WT:Category names §Seeking input regarding ... Lua-based citation templates so I've posted at WT:CFD to see if anyone there has opinions.
- At #Error messages there was some discussion of prefixing the error messages. That discussion has apparently stalled without conclusion.
- At #List of current CS1 error messages footnote c is this:
|archiveurl= requires |url= when |deadurl= is missing or blank, so we assume that |url= is dead. |archiveurl= also requires |url= when |deadurl= is explicitly set to no, indicating that |url= is not dead. So it doesn't matter what state |deadurl= is in; whenever |archiveurl= is defined and not empty, |url= is required. Right? So isn't it the case that we should be looking for |deadurl=yes without a matching |archiveurl=? Which by the way, is a condition that we don't catch.
- At #Further input on the archiveurl inconsistency, Editor Dragons flight reports a post he made to VPP. My read of the comments there seems to indicate that |archiveurl= requires |url= for all CS1 citations.
- I got these last three by scanning all of the talk above. I have likely missed something important.
- I think the "(help)" provides much the same function as something like "CS1 Error:", i.e. it provides a convenient link handle and could help identify an error to people with screen readers, etc. It's not quite as obvious as saying "CS1 Error:", though as before I'd generally prefer not to be too aggressive with our error messages. Though if there is a consensus to use such labels, then it isn't hard to add them.
- I intend to fix all the archiveurl= uses to require url= before we get to group 2, as well as generally making those errors the same across all of the templates. (I do have an expectation that archiveurl= without url= in cite web is going to run up large numbers, but people seem to want that to be considered an error, so we can do that.) I'm not actually sure about adding the deadurl=yes error condition you mention. It is not illogical to want to note that a url is dead even if you haven't yet found an archive. Of course, if a user does try to note that, then at present we don't do anything with that information, but maybe we should. Perhaps deadurl=yes with no archiveurl should be marked with something similar to {{dead-link}} rather than an error message.
- It would be nice to sort out the category names, and I'm willing to wait a bit if some discussion seems likely to be productive, but based on the response so far I suspect that most people just don't care. Dragons flight (talk) 17:23, 10 April 2013 (UTC)
- The #Further input on the archiveurl inconsistency discussion at VPP has been peremptorily closed.
- I have moved my post from WT:Category names to WT:CFD at the prompting of an editor there.
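The archiveurl rule discussed in this thread can be sketched in a few lines of Lua. This is only an illustration under stated assumptions: `check_archive` and `is_set` are hypothetical names, not functions from Module:Citation/CS1, and treating `deadurl=yes` without an archive as a soft note rather than an error is just one of the options raised above.

```lua
-- Hypothetical helper, not the module's actual code.
-- A parameter counts as "set" when it is present and not blank.
local function is_set(value)
    return value ~= nil and value ~= ""
end

-- args: table of citation parameters (name -> string value).
-- Returns a list of problem descriptions; empty when nothing is wrong.
local function check_archive(args)
    local problems = {}
    -- Rule from the discussion: archiveurl always needs url,
    -- whatever state deadurl is in.
    if is_set(args.archiveurl) and not is_set(args.url) then
        table.insert(problems, "|archiveurl= requires |url=")
    end
    -- Condition not caught at the time of the discussion:
    -- deadurl=yes with no archive. Reported here as a note only (assumption).
    if args.deadurl == "yes" and not is_set(args.archiveurl) then
        table.insert(problems, "note: |deadurl=yes without |archiveurl=")
    end
    return problems
end
```

Because the first check never looks at `deadurl`, the two footnote cases (deadurl blank, deadurl=no) collapse into a single rule, which is the simplification argued for above.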
I'd like some feedback on the specific issue of whether we should wait for the category names to be settled before revealing new error messages. There has been some feedback at the section of WT:CFD, but it doesn't seem to have generated a ton of consensus yet. At the same time, we know that categories can often take a long time to populate / depopulate (e.g. #Beware 3-day delay in new categories). Module:Citation/CS1/Configuration was installed three days ago, and presently has only about half the number of transclusions of Module:Citation/CS1, meaning we might have to expect updates to take a week or more to reach all of the associated pages. With those factors in mind, I'm inclined to go ahead with revealing the first batch of new error messages, and let the category updates happen whenever they happen. This is not to say that resolving the category names isn't important, it is, but it looks like it could take quite a while to decide on new names and repopulate the categories, and I'd rather move ahead more quickly than that and let the categories catch up when they are ready. Other thoughts? Dragons flight (talk) 17:49, 12 April 2013 (UTC)
- My guess is that category names are only important to me. I don't foresee consensus arising at WT:CFD in the next couple of hours. While it would be easier for me if you waited, if you're ready to go, then go. I am not playing the don't worry about me, I'll be fine card. If you're ready, go.
- Okay, I went ahead and turned on the first block of errors. Dragons flight (talk) 00:05, 13 April 2013 (UTC)
- Just because I am curious to see what happens, I collected these data points:
Error messages made visible at 2013-04-13T00:04 UTC
Category | Number of pages (2013-04-13T01:50 UTC) |
---|---|
Category:Pages with citations using conflicting page specifications | 2305 |
Category:Pages with citations having wikilinks embedded in URL titles | 7419 |
Category:Pages with citations using unsupported parameters | 17227 |
Category:Pages with ISBN errors | 12582 |
Category:Pages with citations having redundant parameters | 7233 |
Category:Pages with citations using unnamed parameters | 14371 |
- :-) I noticed someone fixing one of these errors when glancing at recent changes. Dragons flight (talk) 03:18, 13 April 2013 (UTC)
I've updated group 2 to reflect the release that actually occurred. Dragons flight (talk) 23:09, 19 April 2013 (UTC)
The vertical bars in error messages
- Error messages became more confusing with bars inserted: The prior style of error messages, last week, seemed clearer, without the bar-parameters which look like unprocessed parameters in red. Perhaps reword some messages for better clarity:
- Now: |archiveurl= requires |url= when |deadurl= is missing or blank
- New: Cite 'archiveurl=' requires 'url=' when 'deadurl' missing or blank.
- When reading for error-message text, I am trying to think in the manner of a new user who thinks the literal text of the cite contains the bars "|" which have turned red, but actually, it is a strange form of quoting "url=" with "deadurl=" by changing font+bars, as showing: "|url= when |deadurl= is missing". My first reaction is to see "|url=when" (as the URL set to word "when") while not reading in font-speak mode, where the font face changes the meaning of the text. The bar "|" is typically a parameter delimiter, and it complicates matters when trying to copy/paste to show other users the error-message text. However, I have learned to read in font-speak mode, with embedded bars ("|"), but it is still a foreign language when plain English uses simple quotation marks to note parameter "url=" when "deadurl=" is blank. -Wikid77 (talk) 13:14, 10 April 2013 (UTC)
- I'm not wedded to the use of vertical bars. In fact, I think they look rather superfluous, but I copied them as suggested at #List of current CS1 error messages and because they seem to be commonplace in error messages and descriptions due to the {{para}} template (e.g. {{para|something}} = |something=). If people want to drop the vertical bars, I'd be fine with that, but what do other people think? Dragons flight (talk) 16:45, 10 April 2013 (UTC)
- The doc pages use bold for the parameter names. Perhaps wrapping in <code> would be simplest, as it offsets the text in monospace while keeping the semantics. -- Gadget850 (Ed) talk 18:56, 10 April 2013 (UTC)
- To clarify: I'm for dropping the = as well. Instead of |url=, just url. -- Gadget850 (Ed) talk 19:44, 10 April 2013 (UTC)
- For what it is worth, I actually like the equals sign. I think url= does a good job of highlighting that something is a parameter, while url is perhaps a little too subtle and |url= might be a bit too much. Altogether though, such issues are pretty trivial. We are probably doing pretty well if we have now been reduced to debating about such little style issues. :-) Dragons flight (talk) 00:45, 11 April 2013 (UTC)
I suppose that I chose |parameter= style for the error messages because the style is commonly used. |parameter= style exemplifies parameter use in real templates. Yeah, a first-time reader encountering one of these error messages might be perplexed but there is the (help) link to follow for assistance. Anyone who has spent even a few hours editing Wikipedia articles will see the |parameter= styling for what it is. And there is the help link provided with every error message:
|archiveurl= requires |url= when |deadurl= is missing or blank (help)
|parameter= styling is used throughout the Help:CS1 errors help page so that there is a consistent look to parameter labels from the error report in the article to the repeated error message to the examples in the text. In looking at them, Bad DOI specified, Bad ISBN specified, and Bad OL specified aren't consistent with the other error messages that refer to parameters. They should probably be changed to Bad |doi= value, Bad |isbn= value, and Bad |ol= value.
Though not explicitly related to this topic, Help:Citation Style 1, which deals with other aspects of the CS1 citations, makes extensive use of |parameter= styling. I don't think I've ever touched that page.
I am curious to know what editors outside our little cloister think. Perhaps you could arrange an appropriate RfC?
—Trappist the monk (talk) 21:53, 10 April 2013 (UTC)
I have changed the DOI, ISBN, OL, and new URL error messages to read:
- Check |doi= value (help)
- Check |isbn= value (help)
- Check |ol= value (help)
- Check |url= scheme (help)
A little softer than Bad.
Parameters that don't cause redundant parameter errors
This pathological citation should have a few errors because of redundant parameters. I suspect that this occurs because these parameters aren't selected through selectone(). Should they be?
{{cite web |url=http://www.example.com |URL=http://www.foobar.com |title=Title page |dictionary=Dictionary page |encyclopedia=Examples for All |accessdate=2013-04-10 |access-date=yestermorrow |archiveurl=archiveurl |archive-url=archive-url |archivedate=today}}
- → [archive-url "Title page"]. Archived from the original on today. Retrieved yestermorrow. {{cite web}}: Check |archive-url= value (help); Check date values in: |access-date= and |archivedate= (help); More than one of |URL= and |url= specified (help); More than one of |accessdate= and |access-date= specified (help); More than one of |archiveurl= and |archive-url= specified (help); More than one of |dictionary= and |encyclopedia= specified (help); Unknown parameter |encyclopedia= ignored (help)
There are several parameters that might bear review. I don't know if this is all of them.
local Coauthors = args.coauthors or args.coauthor
local PublicationDate = args.publicationdate or args["publication-date"]
local Title = args.title or args.encyclopaedia or args.encyclopedia or args.dictionary
local TitleLink = args.titlelink or args.episodelink
local TransChapter = args["trans-chapter"] or args.trans_chapter
local ArchiveURL = args["archive-url"] or args.archiveurl
local URL = args.url or args.URL
local ChapterURL = args["chapter-url"] or args.chapterurl or args["contribution-url"]
local ConferenceURL = args["conference-url"] or args.conferenceurl
local PublicationPlace = args["publication-place"] or args.publicationplace
local AccessDate = args["access-date"] or args.accessdate
local ArchiveDate = args["archive-date"] or args.archivedate
local DoiBroken = args.doi_inactivedate or args.doi_brokendate or args.DoiBroken
local ASINTLD = args["ASIN-TLD"] or args["asin-tld"]
local TranscriptURL = args["transcript-url"] or args.transcripturl
local no_tracking_cats = args["template doc demo"] or args.nocat or args.notracking or args["no-tracking"] or "";
—Trappist the monk (talk) 13:06, 10 April 2013 (UTC)
- You're right, I haven't wrapped every possible case in selectone, rather I focused on the cases where redundancy was more likely to be surprising. This was a combination of wanting to see the performance impact and laziness. Having gotten here, I think we probably could wrap the rest in selectone without the sky falling down. Dragons flight (talk) 16:07, 10 April 2013 (UTC)
- Ok, being bold, in the sandbox I've tweaked all of the assignments listed above to use selectone() except for:
local Title = args.title or args.encyclopaedia or args.encyclopedia or args.dictionary
local no_tracking_cats = args["template doc demo"] or args.nocat or args.notracking or args["no-tracking"] or "";
- I didn't do those because Title uses some of the same parameters as Periodical. I guessed that select() will report an error when the citation legitimately uses parameters that can be used for both Title and Periodical. I didn't do no_tracking_cats because of the empty string assignment at the end of the or list.
- Tell me if I've buggered everything up.
- Looks good to me. At some point I need to figure out a better way of handling the messed up encyclopedia case. Dragons flight (talk) 20:25, 10 April 2013 (UTC)
- Redundant parameters are very rare, typically author names: After editing over a thousand of the wp:CS1 cite articles, the most common problem I have seen is many cases of last+author, as "last=Doe |first=John" with "author=John K. Doe". I am thinking the middle initial is what drives many people to restate the author's name with middle initial "author=John K. Doe". Only once have I seen 3 "author=" together, as in the example:
- Parameters: {{... |author=John Doe |author=Mary Dough |author=Mark Z. Smith |title=Title|date=1 June 2005|pages=45}}
- Cite book/old: {{cite book/old |author=John Doe |author=Mary Dough |author=Mark Z. Smith |title=Title|date=1 June 2005|pages=45}}
- Cite book/lua: {{cite book/lua |author=John Doe |author=Mary Dough |author=Mark Z. Smith |title=Title|date=1 June 2005|pages=45}}
- We know markup could not see multiple "author=" and kept only the final value. In other cases (such as "pages=" with "at="), most have one parameter blank. In general, about 30%-40% of the typical lesser parameters are included as blank (such as "|publisher= |") with no actual value, else omitted entirely. The greatest danger is 2 "last=" or else both "last=|last1=", where Lua will reject the combined last/last1 case, but then every such problem becomes work for the user to fix. -Wikid77 (talk) 13:52, 10 April 2013 (UTC)
- As far as I can tell, it is not possible in Lua to detect that a parameter was called repeatedly. So "| author = John Don | author = Mary Smith | author = James Ford" is just something we are stuck with. The redundant parameters check will allow us to pick off the alias conflicts (e.g. |last1= and |last=), but not the strict repetitions. Dragons flight (talk) 16:31, 10 April 2013 (UTC)
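The alias check described in this section can be sketched as follows. The real selectone() is not shown in this discussion, so `select_one`, its signature, and the choice to treat blank values as unset are all assumptions made for illustration.

```lua
-- Hypothetical stand-in for selectone(); not the module's actual code.
-- args:    table of citation parameters (name -> string value)
-- aliases: ordered list of parameter names that mean the same thing
-- Returns the first non-blank value and, if more than one alias was
-- supplied, an error message naming all of them.
local function select_one(args, aliases)
    local value
    local found = {}
    for _, name in ipairs(aliases) do
        local v = args[name]
        if v ~= nil and v ~= "" then
            table.insert(found, name)
            if value == nil then
                value = v
            end
        end
    end
    local err
    if #found > 1 then
        err = "More than one of |" .. table.concat(found, "=, |") .. "= specified"
    end
    return value, err
end

-- A Lua table holds one value per key, so a parameter repeated under the
-- same name reaches this code as a single entry and cannot be flagged.
local args = { url = "http://www.example.com", URL = "http://www.foobar.com" }
local value, err = select_one(args, { "url", "URL" })
```

The same table limitation is why only alias conflicts (for example |last1= with |last=) can be picked off, while three literal |author= entries cannot.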
Separate error for redundant names
The most common case of redundant parameters seems to be redundant naming parameters, such as using both |last= and |author=. These conflicts can be a little surprising, since it might not be immediately obvious to users why something like |last= is functionally the same as something like |author= even though their names might suggest different intentions. I'm wondering if it might be helpful to break the redundant author parameters case off as a separate error so that it might be given a more specific error message on the help page. Thoughts? Dragons flight (talk) 16:50, 10 April 2013 (UTC)
|author=, |last=, and |last1= are already used in the example at More than one of |param1=, |param2=, and |param3= specified. Not sure how to make that better. We could perhaps create a parameters thesaurus/dictionary that might be linked to from the error message for in-detail discussion ... I don't think that this would have any impact on Module:Citation/CS1.
Bad URL check
Please note that I've added a check for malformed URLs to the sandbox. If someone could edit Module:Citation/CS1/Configuration/sandbox to whatever error message and cat you feel is appropriate as well as add appropriate text in the associated help page / category, then that would be much appreciated. Thanks. Dragons flight (talk) 23:36, 10 April 2013 (UTC)
- Error message tweaked, category page created, help text written, done.
Showing content of unsupported parameters in error message
What do people think of adding the content of unsupported parameters to the error message? For example, change:
Unknown parameter NewPaper= ignored
to
Unknown parameter NewPaper=Boston Herald ignored
That would have the effect of making the broken content visible to readers, who often would be able to figure out what was intended even if the citation hadn't been corrected yet. Dragons flight (talk) 00:54, 11 April 2013 (UTC)
- It would seem that we would also need to do this with
- Unknown parameter |NewPaper=Boston Herald ignored (|newspaper=Boston Herald suggested)
- if the unknown parameter could be matched in suggestions which makes the error message rather long. You could do this:
- Unknown parameter |NewPaper=Boston Herald ignored (|newspaper= suggested)
- A bit shorter but then it looks like we're suggesting that editors replace |NewPaper=Boston Herald with |newspaper= which isn't correct. We could do this:
- |NewPaper=Boston Herald (|newspaper=Boston Herald suggested)
- because the unknown parameter hasn't really been ignored – we just painted it red. Or even more minimalist:
- |newspaper=Boston Herald (suggested)
- But after all of this, I have to wonder: is display of the |unknown=value pair more likely to invite editors to fix the error? I don't know.
- Just say "Found" and not "ignored" and omit "|" bars: The mixed display of the internal bar "|" separators gives the impression of literal text being echoed in the error message, which would be a lie if showing the first parameter of a cite (has no leading "|"). Instead use quoted text:
- Then link each "expected" to the help-page section. Otherwise, many articles will be awash in a sea of red ink. -Wikid77 (talk) 08:31, 11 April 2013 (UTC)
- I sort of like the first example of using found and expected as a replacement for
- Unknown parameter |xxxx= ignored (|yyyy= suggested) (help)
- The second example is the case of
- Text "????" ignored (help)
- CS1 can't know what to expect from raw text between pipes or between a pipe and the closing }}. So perhaps this:
- Found text: "????" (help)
- We all know that I favor |parameter= style and you do not. We aren't likely to change our positions so we shouldn't keep restating them. The deadlock will have to be broken by others.
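For comparing the message variants proposed in this section, here is a small Lua sketch. The `suggestions` table and `unknown_param_message` are hypothetical names invented for the example; how the module actually matches misspelled names is not described here, so the lookup by lowercased name is an assumption.

```lua
-- Hypothetical suggestion list keyed by lowercased misspelling (assumption).
local suggestions = { newpaper = "newspaper" }

-- name:       the unknown parameter name as typed by the editor
-- value:      the value supplied for it
-- show_value: when true, echo the value in the message
local function unknown_param_message(name, value, show_value)
    local shown = "|" .. name .. "="
    if show_value then
        shown = shown .. value
    end
    local msg = "Unknown parameter " .. shown .. " ignored"
    local hint = suggestions[string.lower(name)]
    if hint then
        -- Only the suggested name is shown, which keeps the message short.
        msg = msg .. " (|" .. hint .. "= suggested)"
    end
    return msg
end
```

Calling `unknown_param_message("NewPaper", "Boston Herald", true)` gives the second variant above, and passing `false` gives the current form without the value, so the two can be compared side by side before settling on one.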
Beware 3-day delay in new categories
I am seeing evidence that many thousands of pages have been delayed in moving the "12,000" pages to the unnamed-parameter category, for over 2 days. Concurrently, Template:Convert has also been changed to link 531,000 pages into several tracking categories, and that might be slowing the relinking of the CS1-cite categories. However, a null edit of any page will force the re-categorization (without logging a history change), to prove that the Lua cites are correctly reassigning the new categories. The whole process is just experiencing a 3-day delay. In general, when releasing ("unleashing") the red-error messages upon the readers, we need to have the categories resettled and ready, with most thousands of pages in each proper category, not Jekyll/Hyde switching categories after the first minor edit to a page. -Wikid77 (talk) 08:31, 11 April 2013 (UTC)
- You can also use the API sandbox to force an update. Regardless, this is just an annoyance, not a major issue. -- Gadget850 (Ed) talk 12:37, 11 April 2013 (UTC)
- The problem is that the job queue sucks. Updating this module requests that nearly 2 million other pages be updated. The job queue tries to space them out so that the burden of doing all those updates is distributed over a long period of time. In practice, I tend to find that the server farm will update roughly a million pages per day. More significantly, when an update for a particular page is requested, the job queue also checks if the database servers are overly busy, and such updates can simply be abandoned if servers are too busy. So in practice, even after the updates are "completed" it is not uncommon to find that many pages have been skipped. Those skipped pages won't get updated until they are either edited directly or some other process adds them to the update queue again. This general issue with slow and skipped updates can be particularly noticeable with populating and depopulating categories. Dragons flight (talk) 17:36, 11 April 2013 (UTC)
- Repopulating categories quickened with {Convert} relinked: Within a few hours, the unnamed-parameter category has jumped from the misleading "6,200" pages to nearly 9,000 pages, which better reflects the proportion of articles which merely have a few cites with extra text after a "|" bar (and no other cite errors). Apparently, {Convert} has finished relinking the 532,000 pages where it is used. Unlike many of the misspelled cite parameter names or wannabe names (such as "paragraph=" or newspaper "column="), the extra cite text could be simply displayed, after the postscript dot, with a simple "[fix cite] " superscript note to indicate the problem has been auto-corrected, but could benefit from a specific fix at a later time. That auto-correction process would keep from marring thousands of articles with garish red-error messages, to allow editors to focus on editing thousands of pages with more-severe format problems, instead. Anyway, we are beginning to see the true counts of pages with very trivial cite errors. Wikid77 (talk) 21:25, 11 April 2013 (UTC)
Handling different author and editor parameter types
Is there a case to be made for handling author and editor parameters |authors= and |editors= differently from |author=, |authorn=, |editor=, and |editorn=, and those in turn differently from |last=, |lastn=, |editor-last=, and |editor-lastn=? At least semantically, these three groups of parameters are different.
- |authors= and |editors= – multiple authors / editors in free-form lists
- |author=, |authorn=, |editor=, and |editorn= – separated whole names
- |last=, |lastn=, |editor-last=, and |editor-lastn= – separated last names with the expectation that there are matching first name parameters
I don't know exactly how this could or should be implemented but it seems to me that mixing of the different parameter styles should be discouraged and that when encountered, CS1 should report a Mixed author style error.
—Trappist the monk (talk) 16:08, 11 April 2013 (UTC)
- 'author' has typically been used to include the full name of an author and 'authors' to include a list of authors; ditto for editor. This is now discouraged, as it does not create a proper anchor for Shortened footnotes and other systems. Down the road we might want to throw these into a category and see how they are used. 'author' is also used for Asian and other names where last, first is not appropriate. Right now, it really isn't an issue. -- Gadget850 (Ed) talk 17:31, 11 April 2013 (UTC)
URL missing when archiveurl used on cite web
As a purely transitional measure, I've added a hidden error on cite web to identify the case of |archiveurl= being specified without |url=. These are now being fed to Category:Pages with archiveurl cite web errors. I'm a little worried about how common this error might be, but I guess we'll see. My intention is to unify all of the archiveurl errors, probably sometime next week. Dragons flight (talk) 17:13, 11 April 2013 (UTC)
- I was going to suggest that as well, since they are the same issue. If there are errors, then they just need to be fixed. -- Gadget850 (Ed) talk 17:18, 11 April 2013 (UTC)
Quotes
It would be great if the quote symbols were moved to Module:Citation/CS1/Configuration too. Some languages like Ukrainian use «text» instead of "text", so it would simplify migration (and further maintenance) of the template to other wikis. --DixonD (talk) 20:43, 11 April 2013 (UTC)
- This should be in CSS; see Module talk:Citation/CS1/Feature requests#Presentation and content. -- Gadget850 (Ed) talk 22:49, 11 April 2013 (UTC)
- How are you going to distinguish an opening quotation mark from a closing one by CSS? --DixonD (talk) 05:01, 12 April 2013 (UTC)
- The English Wikipedia uses straight quotes to open and close, so the example uses \22 for both. For the guillemet, you would use \AB and \BB respectively. You could include CSS styling for bold, underline or whatever your local style desires. -- Gadget850 (Ed) talk 15:22, 12 April 2013 (UTC)
- Configuration data should combine live parameter values: The canned messages should be modified, live, to also show the value of the actual parameters. For example, to quote a raw date:
- message = 'Found «' .. value .. '» (expected «=»)'
- message = 'Found «2 April 1997» (expected «=»)'
- The static canned text for messages is stored in Module:Citation/CS1/Configuration; however, to use the same pattern of static-message text, then a canned message could be split into the 2 static prefix/suffix portions.
- pre_msg = 'Found «'
- post_msg = '» (expected «=»)'
- say = pre_msg .. value .. post_msg
- For other languages, the alternate quotation-mark characters (« ») could be inserted into each pair of pre_msg and post_msg text strings, along with other changes in punctuation typical of the other language. By changing each static 'message' variable, to become the two static variables, 'pre_msg' and 'post_msg', then the specific quotation-mark characters would be inserted into those 2 variables, along with any extra punctuation needed for words in the other language. However, this whole situation has become overly complex, rather than simply auto-correcting the obvious typos in citation templates, and quietly showing results, which editors will either accept or want to edit the page to alter values. There is little need for wp:Grandstanding of the cite errors as glaring red-error messages. -Wikid77 (talk) 05:15, 13 April 2013 (UTC)
- I don't know how error messages crept into this, but the feature request discusses the difference between content and presentation and how to handle the presentation. This is a discussion for down the road a bit. -- Gadget850 (Ed) talk 11:59, 13 April 2013 (UTC)
- And by using CSS, editors can create their own style. For example, the straight quotes could be replaced by curly quotes. -- Gadget850 (Ed) talk 13:32, 14 April 2013 (UTC)
- In the Configuration sandbox, I've defined controls for the quotes and several other message styles. This could be used as is for translation. The same interface could also be used to add CSS spans for styling, if we intend to go that way. Dragons flight (talk) 18:18, 16 April 2013 (UTC)
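As a rough illustration of keeping the quote characters and message text in configuration, the Lua sketch below uses a single format string with placeholders instead of separate prefix/suffix strings. The `config` table and its field names are hypothetical and are not taken from Module:Citation/CS1/Configuration.

```lua
-- Hypothetical configuration; field names are illustrative only.
local config = {
    quote_open = "«",
    quote_close = "»",
    -- %s placeholders are filled with the already-quoted values.
    found_expected = "Found %s (expected %s)",
}

-- Wrap text in the locally configured quotation marks.
local function quote(text)
    return config.quote_open .. text .. config.quote_close
end

-- Build the message from the canned template plus the live values.
local function found_expected(found, expected)
    return string.format(config.found_expected, quote(found), quote(expected))
end
```

With one template per message, a translator changes three strings in one place and can also adjust the punctuation around the two placeholders as the target language requires, which the pre_msg/post_msg split makes harder.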
Watch growth/reduction of cite-error pages
Because there are a massive number of cite-error messages being shown to users, the impact of those messages can be gauged by the counts of affected pages. Perhaps compare category-index counts every week, now that some red-error messages have been made visible. Also, beware that the natural tendency is for more errors to be created, as editors add more cite templates without fully checking (or bothering to fix) the output. So, the hope is that the red-error messages will catch the attention of many concerned editors, who will work, each week, to fix all cite-errors on each page, faster than the new cite-error pages are logged into each category. The first day was 13 April 2013:
Table updated 2013-04-15T00:25 UTC
Table updated 2013-04-17T00:28 UTC
Table updated 2013-04-20T00:37 UTC
Table updated 2013-04-22T09:35 UTC Note:Category suppression of user pages was enabled on the 2013-04-20
Added additional categories 20:07, 22 April 2013 (UTC)
Table updated 2013-04-28T11:48 UTC +live column
Table updated 2013-04-30T01:00 UTC +3 categories
Table updated 2013-05-10T01:00
Again, the red-error messages should obviously capture the attention of the 9,200 active article editors (>25 edits per month), and the question is, "How fast will they correct those cite errors?" (with all the other issues to consider, or other articles of interest). A page can be listed into multiple categories, so if all cite-errors are fixed, then the counts of all those categories will lower by one (within a few minutes). If only a few cite-errors are corrected, on a page with many, then it might seem as if little progress has been made for that page. Obviously, where many cite-errors are displayed (6-30), then fixes might be scared away as "too big a can of worms" and also, the accuracy of the page is likely to be questioned, if so many other "errors" exist in the page then how can it be trusted. Previously, each category was growing every day, so a slower growth could indicate how new errors are being corrected to slow the typical addition of new cite-error pages. -Wikid77 06:11, 13 April 2013 (UTC)
- Early visibility despite 2-day delay: Because the cite-error messages are not yet auto-correcting trivial cases, they are appearing in numerous major articles, either during edit-previews or after other edits, and so people can see many cite-error messages even before the full reformatting of the Lua cites during the next few days. Also, again Template:Convert has been updated within 4 hours after the Lua cites, to re-reformat 534,000 pages to remove (depopulate) hidden tracking categories which killed some conversions (as non-numeric data), and so the overall period might be another 3-day delay to reformat all 2 million CS1-cite pages to show error messages. -Wikid77 15:04, 13 April 2013 (UTC)
- I suggest adding Category:Pages_using_citations_with_accessdate_and_no_URL. It is on top with ~45,000 pages today. Or simply all 18+1 Category:Articles_with_incorrect_citation_syntax. -DePiep (talk) 09:23, 29 April 2013 (UTC)
- A note about the decrease/increase trend we may see. Not only are cite templates with errors added by editors as you state, but also when another CS1-using template is converted to Lua. Then hundreds or even thousands are added to a category. These jumps will die down when more CS1 templates are converted. Today the old {{citation/core}} only has 67.000 transclusions left! The statistics become more valuable over some more weeks. I am most interested in the "edit errors out" rate. -DePiep (talk) 09:23, 29 April 2013 (UTC)
Long-term methods to auto-correct cites
Already, thousands of pages have been tagged with numerous red-error messages for very trivial cite-data issues. We can later upgrade the Lua module to auto-correct for several trivial issues:
- Allow "author=" with prefix "first=" as valid: A very common habit, in many prior cites, has been to set author name to include a middle initial, while parameter "first=" has only the first name. To auto-correct, allow "author=" to contain "first=" as the prefix text for the given name, and do not log into a category.
- Stop rejecting ISBN variations: There are numerous "close-enough" variations in ISBN numbers, to stop validation efforts. Some 4 common forms are:
- ISBN 9 as ISBN-10 form dropping the final check digit (identical 9 digits)
- ISBN 9 as ISBN-10 form dropping lead zero "0" digit (identical final 9 digits)
- ISBN 10+13 as ISBN-10 form, then ISBN-13 form (both in "isbn=").
- ISBN as "isbn=ISBN..." with prefix "ISBN" (link will work).
- It is just not worth the effort to validate the ISBN number.
Again, these auto-correction actions are a long-term concern, to greatly reduce the number of errors when using the wp:CS1 cite templates. -Wikid77 (talk) 07:30, 13 April 2013 (UTC)
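The ISBN variations listed above can be told apart with a few length checks. The following is only a rough sketch in Python, not the module's Lua code; the function name `classify_isbn_variant` and the labels it returns are hypothetical, and any letter "x" in extraneous text would be miscounted as an ISBN character.

```python
import re

def classify_isbn_variant(raw: str) -> str:
    """Label a raw |isbn= value with one of the variant forms discussed above.

    Hypothetical helper for illustration; the labels are not CS1 terms.
    """
    text = raw.strip()
    # Form 4: the value repeats the "ISBN" prefix, as in isbn=ISBN...
    if text.upper().startswith("ISBN"):
        return "isbn-prefix"
    # Keep only characters that can appear in an ISBN body.
    body = re.sub(r"[^0-9Xx]", "", text)
    if len(body) == 9:
        # Forms 1 and 2: a length check alone cannot tell a dropped
        # check digit from a dropped leading zero.
        return "nine-digit"
    if len(body) == 23:
        # Form 3: an ISBN-10 followed by an ISBN-13 in the same field.
        return "isbn10-plus-isbn13"
    if len(body) in (10, 13):
        return "plain"
    return "other"
```

A classifier like this only sorts values into buckets; whether each bucket should be auto-corrected, logged, or flagged is the policy question raised above.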
- Invalid ISBNs generate invalid links. They are broken and need to be fixed. -- Gadget850 (Ed) talk 11:49, 13 April 2013 (UTC)
- Early Lua validation of ISBN can be reduced: Since the validation of ISBN numbers was recently added to the Lua cites, then the results will show better ways to validate in the next iteration. Many various forms of ISBN numbers will generate valid links, such as "isbn=hardback 978-1-59714-033-1" (link: hardback 978-1-59714-033-1) even though that was considered a cite-error problem in the early validation. Avoid excessive validation where many results are actually acceptable links. -Wikid77 15:04, 13 April 2013 (UTC)
- No, ISBN is included in COinS data so |isbn=hardback 978-1-59714-033-1 likely buggers that up. How a link like Special:BookSources/hardback_978-1-59714-033-1 works within Wikipedia is outside of CS1's bailiwick. —Trappist the monk (talk) 15:25, 13 April 2013 (UTC)
- Not likely, it does munge the metadata. -- Gadget850 (Ed) talk 20:19, 13 April 2013 (UTC)
- Yeah, so I discovered, and which has caused me to post at #COinS.
Newline in title
I thought we filtered newlines from the title fields. Problem is that it breaks the italics:
{{Cite book |title=Title copied from somewhere that included a newline}}
Title copied from somewhere that included a newline. {{cite book}}: line feed character in |title= at position 18 (help)
-- Gadget850 (Ed) talk 20:21, 13 April 2013 (UTC)
- Fixed in sandbox. Dragons flight (talk) 13:34, 14 April 2013 (UTC)
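For illustration, here is a small Python sketch of how a line feed in a title could be located and filtered. It is not the sandbox fix itself; the function names are hypothetical, and the 1-based position is an assumption chosen to match the "position 18" wording of the error message above.

```python
from typing import Optional

def find_line_break(title: str) -> Optional[int]:
    """Return the 1-based position of the first line feed or carriage return, or None."""
    for position, ch in enumerate(title, start=1):
        if ch in "\n\r":
            return position
    return None

def filter_line_breaks(title: str) -> str:
    """Replace line breaks with single spaces so the italic markup is not split."""
    # str.split() with no argument splits on any run of whitespace,
    # so this also collapses doubled spaces left around the break.
    return " ".join(title.split())

title = "Title copied from\nsomewhere that included a newline"
print(find_line_break(title))     # 18
print(filter_line_breaks(title))  # Title copied from somewhere that included a newline
```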
Template in isbn
At Help talk:Citation Style 1#My_Funny_Valentine is some discussion of an interesting ISBN error where |isbn= includes a number and a {{Please check ISBN}} template. The template only substitutes text when it is used in article space. In the citation at My_Funny_Valentine#References the ISBN field contains:
- ISBN [[Special:BookSources/1-4144-0140-9|1-4144-0140-9 [[Category:Articles with invalid ISBNs]]]]
That's clearly incorrect but may be how we intend it to be. I remember sometime someplace reading what the order is when the wikimedia software processes a page. Can I find it again? No. Is this a case where we are ignoring the wiki markup (like we ignore wiki markup in a title) or is this a case where wikimedia is inserting the expanded {{Please check ISBN}} after CS1 has done its thing? That just seems backwards.
I'd like to know 'cause I'd like to know and because surely, this issue will come up again. I've started adding notes to {{Please check ISBN}} and other templates to try to forestall this kind of usage in future.
ISBN tweaks
I've modified the ISBN logic to make it more forgiving. Specifically, I used similar logic to what is done in Special:BookSources to strip non-ISBN characters. This means that things like |isbn=0123456789 (hardcover) and |isbn=01–234–5678–9 (dashes instead of hyphens) will no longer report an error. To compensate for this, I added a similar cleaning routine in front of the COinS output, to avoid transmitting garbage via COinS. I think it makes sense, at least initially, to focus on ISBN problems that break the special page. Dragons flight (talk) 14:35, 14 April 2013 (UTC)
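As a rough illustration of the "strip, then check" approach described above, here is a Python sketch. It is not the module's Lua code and is not copied from Special:BookSources; the function names are hypothetical, and the assumption is that everything other than digits and "X" is discarded before the standard ISBN-10 and ISBN-13 check-digit tests.

```python
import re

def clean_isbn(raw: str) -> str:
    """Drop every character that is not a digit or X (assumed cleaning rule)."""
    return re.sub(r"[^0-9X]", "", raw.upper())

def is_valid_isbn(cleaned: str) -> bool:
    """Apply the standard check-digit test to a cleaned 10- or 13-character ISBN."""
    if len(cleaned) == 10:
        # X is only allowed as the final check character of an ISBN-10.
        if "X" in cleaned[:9]:
            return False
        # Weights run 10, 9, ..., 1; a final X counts as 10.
        total = sum((10 - i) * (10 if c == "X" else int(c))
                    for i, c in enumerate(cleaned))
        return total % 11 == 0
    if len(cleaned) == 13:
        if not cleaned.isdigit():
            return False
        # Weights alternate 1, 3, 1, 3, ...
        total = sum(int(c) * (1 if i % 2 == 0 else 3)
                    for i, c in enumerate(cleaned))
        return total % 10 == 0
    return False

for raw in ("0123456789 (hardcover)", "01–234–5678–9", "hardback 978-1-59714-033-1"):
    print(raw, "->", clean_isbn(raw), is_valid_isbn(clean_isbn(raw)))
```

All three sample values clean to an ISBN with a correct check digit, which is why this kind of forgiving logic no longer reports them as errors.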
- CS1 should still report an error when there is extraneous text in |isbn=. We should not be passively perpetuating poor practices.
{{cite book |title=Title |isbn=hardback 978-1—234–56789—7}}
- →Title. ISBN hardback 978-1—234–56789—7. {{cite book}}: Check |isbn= value: invalid character (help)
- Further, this
{{cite book |title=Title |isbn=978-1—234–56789—7 978-1—234–56789—7}}
- →Title. ISBN 978-1—234–56789—7 978-1—234–56789—7. {{cite book}}: Check |isbn= value: invalid character (help)
- produces this:
'"`UNIQ--templatestyles-0000031D-QINU`"'<cite class="citation book cs1">''Title''. [[ISBN (identifier)|ISBN]] [[Special:BookSources/978-1—234–56789—7 978-1—234–56789—7|<bdi>978-1—234–56789—7 978-1—234–56789—7</bdi>]].</cite><span title="ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=book&rft.btitle=Title&rfr_id=info%3Asid%2Fen.wiki.x.io%3AModule+talk%3ACitation%2FCS1%2FArchive+6" class="Z3988"></span> <span class="cs1-visible-error citation-comment"><code class="cs1-code">{{[[Template:cite book|cite book]]}}</code>: </span><span class="cs1-visible-error citation-comment">Check <code class="cs1-code">|isbn=</code> value: invalid character ([[Help:CS1 errors#bad_isbn|help]])</span>
- To clarify, it is not that I think |isbn=hardback 978-1—234–56789—7 is right. It is rather that I don't want to force the issue right now. Things that break Special:BookSources are clearly errors and it is easy to explain that to people. I would argue that many things that add extraneous text or strange formatting, but don't break the special page, are also errors. However, those cases are really an issue to be addressed by the manual of style. As far as I can tell there is no document that says ISBNs always ought to be formatted a certain way, and I worry that we start to get ahead of ourselves if we force people to adopt a certain style. I think we ought to revisit the issue later, but for now I suggest focusing on the ~12000 pages with ISBNs that are blatantly erroneous and not fight over the ~1000 that are styled strangely. Dragons flight (talk) 15:14, 14 April 2013 (UTC)
- Seems like a step backwards and then a step forward at a 45° diagonal. I have no complaint about ignoring hyphens, dashes, or spaces within the ISBN. Carry on.
- As far as I can tell, WP:MOS is silent on extraneous text in an ISBN. WP:ISBN does make some statements about how the magic link in regular text should not contain extraneous characters. The {{cite book}} documentation has quite a few examples of citations with |isbn=, all of which contain only the ISBN. Yeah, pretty much unstated convention, somewhat akin to consensus by silence, but still it is an apparently widely held convention. Unless there is some really good reason to flout that convention, we ought not passively permit editors to do so.
- For |isbn=hardback 978-1—234–56789—7, the "hardback" text properly belongs in |type=.
- Lastly on this topic, I fear that the camel's nose is entering the tent. Once we (and by "we" I mean you because it's your code) begin allowing exceptions to widely held conventions, where does it end? Where there are standards and conventions, hew to them; where there are not, feel free to do as you will. A gentle tap on the nose is usually sufficient to get the camel to withdraw it.
- In my last example above (which I've tweaked to remove "hardback"), is this:
- rft.isbn=978-1234567897978-1234567897
- Clearly erroneous content, and I note incomplete even though erroneous, should not be passed into COinS metadata. Ever.
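The concern above, that cleaning alone can fuse two ISBNs into one bogus value, could be handled by gating the metadata on the cleaned length. The Python below is only a sketch under that assumption; `coins_isbn_field` is a hypothetical name and this is not how the module builds its COinS output.

```python
import re
from typing import Optional

def coins_isbn_field(raw: str) -> Optional[str]:
    """Return an rft.isbn key-value pair, or None when the value should be withheld."""
    # Assumed cleaning rule: keep digits and X only.
    cleaned = re.sub(r"[^0-9X]", "", raw.upper())
    # Two ISBNs in one field clean to 20, 23 or 26 characters, so a
    # simple length gate keeps the fused value out of the metadata.
    if len(cleaned) not in (10, 13):
        return None
    return "rft.isbn=" + cleaned

print(coins_isbn_field("978-1—234–56789—7 978-1—234–56789—7"))  # None
print(coins_isbn_field("978-1—234–56789—7"))                    # rft.isbn=9781234567897
```

A length gate does not replace a check-digit test; it only ensures that a value which cannot be a single ISBN is never passed along.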
Long-term trend toward forgiving software
All these issues of reduced error messages, of accepting a variety of data values in parameters, are long-term issues, to be considered in the coming months, and should not be considered a reason for instant frustration. There is ample time to discuss what to do with the many thousands of cite-data errors already present, and we know from past "maintenance tag-boxes" that perhaps only 1-in-27,000 readers will help fix errors in pages. Most errors are here to stay awhile, and running some Bots is one option. Meanwhile, we have the opportunity to expand Wikipedia's use of "fault-tolerant software" or "forgiving software" along the way (from book Create Forgiving Software [flylib.com-36]: "At a deeper level than the need to prevent and detect errors is the need to create software that is forgiving to users as errors occur."). Also know, there are some users insisting that we write software that logs data into "self-learning databases" which it would analyze and teach itself how to better interact with the growing population of editors who write articles. Hence, there are many ideas to consider in future years, and all that can be debated much later. However, any techniques which we develop, along the way, might be used in the 100,000(?) varieties of infobox templates, in future years. -Wikid77 (talk) 00:39, 15 April 2013 (UTC)
Errors with pipes in titles
Please see WP:VPT#Errors with pipes in titles. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 20:18, 15 April 2013 (UTC)