Wikipedia talk:Naming conventions (use English)/Archive 2

Archive 1Archive 2Archive 3Archive 4Archive 5

Suggestion for increasing granularity

  • For proper names in languages that are spelled in the Latin alphabet natively,
    • If the native spelling is different from the most common English one only by the presence of diacritics
      • use the native spelling for article titles, use either spelling in article texts, depending on the context (above proposal, it goes without saying that there will be redirects). Example: El Nino, Vladislav Vancura
        • it may be a special case how Canadian placenames (Montreal) should be transcribed, since the spelling without diacritics is arguably 'native' (French-English bilingual situation).
    • otherwise
  • For proper names in languages not written in the Latin alphabet natively,
    • if there is a common spelling in English,
    • if the name is not current in English, or if diacritics serve to disambiguate, the technical transliteration can be used:

This suggestion would restrict the above policy suggestion to proper names spelled in the Latin alphabet natively, which may be enough for a first step. We may need special guidelines for titles transliterated from Chinese, Japanese, Vietnamese, Cyrillic, Greek, Cuneiform, etc.; I would like to ask the people who voted 'oppose' above, if they would consider changing to 'support' if the proposition was so restricted. dab () 08:23, 12 Apr 2005 (UTC)

I don't understand the reasoning behind using diacritics if the English spelling omits them. At first sight (and at second and third) that seems absolutely perverse. Surely it should be the other way round--only use diacritics if the English spelling normally uses them (there are not many such names). --Tony Sidaway|Talk 11:30, 12 Apr 2005 (UTC)
the reasoning is that for some names there is no English spelling. "English" spelling is in accordance with English phonological rules. If there are anglicized forms (such as Lucerne), use those. If not, in cases like Zurich (not pronounced /zuh-rich/) and El Nino (not pronounced /el-nine-o/), we are looking at non-English spellings anyway, and we are here addressing the question how to handle them. dab () 11:39, 12 Apr 2005 (UTC)
I really like the way Curps put it: "...because the absence of diacritics bothers the people who care about them a lot more than their presence bothers the people who don't care about them." Dpbsmith (talk) 15:59, 12 Apr 2005 (UTC)
Your example makes me more opposed. The ṇ in Pāṇini is shown as a square on the browser I am using right now. There is little point using letters in titles or repeated through articles if many readers cannot read them. --Audiovideo 12:23, 12 Apr 2005 (UTC)
well, this policy should be assuming Unicode. Unless you're into retro systems, or will stay on Windows 98 until 2008, I assume in one or two years virtually all systems will support Unicode. Already, you'll have a hard time on WP if your browser has no Unicode support. These are technical issues that should not affect our decisions on how we ideally want to title our articles. Note that Pāṇini is not yet supported by the software, but will be (and already is on most wikis except for en:) dab () 12:36, 12 Apr 2005 (UTC)
Audiovideo, can you see the n with Template:Unicode applied: Pāṇini? Michael Z. 2005-04-12 17:15 Z
I can read Spin̈al Tap but not Spin̈al Tap. Other browsers may have different patterns. Assuming the whole world is bleeding-edge and using the same browser is a bit much. I cannot type any of these dubious characters into a search engine without a lookup table. --Audiovideo 12:41, 12 Apr 2005 (UTC)
this isn't about search engines or keyboard layouts. Redirects will always take care of that. We are discussing a policy on typography, i.e. how would we want to spell things, ideally. Feel free to make a suggestion that will improve consensus. dab () 13:56, 12 Apr 2005 (UTC)
Most people come to Wikipedia pages via external search engines which do not use redirects. How will most users of most keyboards be able to enter funny foreign squiggles without a lookup table? How is a native English speaker expected to know of every funny foreign squiggle in every foreign language? For example, how many British people (Spanish is not a common second language in Britain) know about the "Ú" in Úbeda, or how many English speakers know about the Polish "ń" in Gdańsk? What about those who are learning English as a second language? How many of them are also fluent in both Spanish and Polish, etc.? This is also true for the technology. The British Google http://www.google.co.uk differentiates on all characters with and without diacritics. The Canadian Google seems to know about German, French and Spanish diacritics but does not know about Polish ones http://www.google.ca ["about 1,230,000 for Gdansk", "about 819,000 for Gdańsk"]. Experience suggests that there are enough editors of Wikipedia around who insist that the word is only spelt with diacritics, either through ignorance or because they are on a mission (look at the history of Ubeda, Goering and Gdansk), which makes those pages invisible to many people using external searches. Putting the name under article titles with diacritics encourages them to do this. We do not live in an idyllic world, we live in the real world, and using diacritics makes things difficult for people. Why not stick with common English usage and make life easier for most people? Philip Baird Shearer 09:04, 19 Apr 2005 (UTC)
  • Most people come to Wikipedia pages via external search engines which do not use redirects.
I disagree with the notion that the behaviour of search engines trumps encyclopaedic naming criteria. It shouldn't be the deciding consideration. Effort would be better spent prompting search engine developers to make better software.
  • "about 1,230,000 for Gdansk", "about 819,000 for Gdańsk"
This figure looks bogus to me. I just looked at Google and Gdansk seems to be a superset of Gdansk and Gdańsk. So Gdansk actually has only 411,000 matches if you remove the duplicates.
Basically, I don't understand your line of reasoning: if you search for either "Gdańsk" or "Gdansk" in Google, both get you the Wikipedia article in the first 10 matches. Most relevantly a search for Gdansk matches pages called Gdańsk - so how exactly is using a diacritic "hiding" a page from searches?
--kjd 15:41, 19 Apr 2005 (UTC)
Not sure about the google.ca search for all pages, but if restricted to just English pages the results are "about 36,600 English pages for Gdańsk", "about 26,900 English pages for Gdańsk -Gdansk" and "about 533,000 English pages for Gdansk", "about 530,000 English pages for Gdansk -Gdańsk". Seems that their indexing is not as precise as it might be. However, it is clear that Gdansk and Gdańsk return a different set of pages.
You may think "Effort would be better spent prompting search engine developers to make better software" to find words which are not standard English; personally I disagree. But that is beyond the scope of this project, so let's not discuss it further. Within the scope of the project, if the word is included without diacritics (which is perfectly correct in English and the way most monolingual English speakers would type almost all words) then the problem is solved. For example Gdansk shows up both ways because at the moment the article includes the word "Gdansk" as well as "Gdańsk". However Ubeda does not include "Ubeda" and will not show up using many search engines. As I said above, "Experience suggests that there are enough editors of Wikipedia around who insist that the word is only spelt with diacritics, either through ignorance or because they are on a mission", and putting the name under article titles with diacritics encourages them to do this. Philip Baird Shearer 17:00, 20 Apr 2005 (UTC)

Wiggle room needed

We should be careful not to adopt a policy that ends up strait-jacketing us. In the real world spelling is inconsistent and awkward and no general rule is going to work in all cases.

In particular, names can change the way that they are translated or transliterated into English over time. Since Wikipedia has articles covering the whole of history, we need to be prepared to preserve inconsistencies in spelling, for example:

It would cause no end of headaches if an over-broad policy were used to force an artificial consistency of spelling. Gdr 11:01, 2005 Apr 12 (UTC)

of course! "Peking Man" has become a proper name in its own right and is unaffected by our policy on "Beijing". The same goes for your other examples. There are cases where versions of a name propagate in parallel, such as Charlemagne vs. Charles the Great. I am happy to have the article at Charlemagne, but I suppose both would be arguable. We definitely need enough wiggle-room to allow informed judgement in such cases. dab () 11:25, 12 Apr 2005 (UTC)
For the record: I, too, think wiggle room is needed. And the rough proposal I made should not be adopted without tweaking, refining, and debugging. And, anyway, saying that something is policy does not cause its instant and rigid adoption. All that statements of policy do is a) help to bring initiates up to speed on what existing consensus positions are and why they were adopted, b) sometimes have a beneficial effect in shortening debate and influencing opinion when specific cases are debated. Dpbsmith (talk) 15:56, 12 Apr 2005 (UTC)
But then we also have Kiev, transliterated from Russian, instead of Kyiv, from Ukrainian. It also seems to me that a usage like 3rd Belorussian Front should be changed to 3rd Belarusian Front. Michael Z. 2005-04-12 17:39 Z
I do not think Kiev vs. Kyiv even enters this particular discussion (no diacritics involved). The Kiev case is exactly like the Lucerne one: In English, the Russian/French form is preferred over the native Ukrainian/German spelling. We do not suggest a change to the unfamiliar native form, but will stay with Kiev/Lucerne. dab () 06:46, 13 Apr 2005 (UTC)

With/without diacritics: how about "anything goes if you can prove you can clean up your own mess?"

Howzabout: there is no preference between forms with and without diacritics (analogous to the position on British versus U. S. spelling), and that moves are OK provided the person proposing the move can demonstrate convincingly that there is a team of people who are willing to do all the necessary work that the move would entail.

Stated crudely, "Anything goes, as long as you can prove you can clean up your own mess?"

Stated more precisely:

Establish some kind of project page for "diacritical place name moves" that would apply to

  • pairs of place name forms, one with diacritics, expressible in ISO 8859-1, and one without, expressible in ASCII,
  • for which the ASCII form has the visual appearance of the ISO 8859-1 form with diacritics removed, and
  • whose form with diacritics can be used as an article title without any need for HTML entities, and
  • for which there is consensus that, in English, the combined frequency of usage of the two forms exceeds the frequency of usage of any other form
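
As a rough illustration only (not part of the proposal), the first two criteria above can be checked mechanically. The sketch below assumes Python 3.7 or later; the helper names `strip_diacritics`, `is_latin1` and `qualifies` are hypothetical. It approximates "diacritics removed" by Unicode decomposition, so letters with no decomposition (such as ø or ß) are not handled. The remaining two criteria (usability as an article title, and consensus about frequency of usage) are judgement calls and are not modelled.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Remove combining marks after canonical decomposition (NFD)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def is_latin1(text: str) -> bool:
    """True if every character is expressible in ISO 8859-1."""
    try:
        text.encode("latin-1")
        return True
    except UnicodeEncodeError:
        return False


def qualifies(with_diacritics: str, ascii_form: str) -> bool:
    """Check criteria 1 and 2: an ISO 8859-1 form and an ASCII form
    that differ only by the removal of diacritics."""
    return (
        with_diacritics != ascii_form
        and is_latin1(with_diacritics)
        and ascii_form.isascii()
        and strip_diacritics(with_diacritics) == ascii_form
    )


print(qualifies("Zürich", "Zurich"))   # pair differs only by diacritics
print(qualifies("Gdańsk", "Gdansk"))   # "ń" is outside ISO 8859-1
```

Under these assumptions a pair such as Úbeda/Ubeda would fall within the proposed project page, while Gdańsk/Gdansk would not, because "ń" cannot be expressed in ISO 8859-1.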

Anyone can propose to perform the move and the associated work it necessitates. This work consists of performing the move and doing all editing work needed in all articles that reference the name to remove double redirects and make usage consistent. And whatever other tasks the community decides are needed. The proposal must include a date for completion of the work. The proposal must contain an estimate of the number of pages affected. The proposal must contain an invitation for users to sign up to perform the work.

Accompanying the proposal would be some form of vote. The vote would be based, not on the merits of the move, but on the degree of confidence that the work would, in fact, be completed by the specified date in a workmanlike manner.

On the completion date, there would be a formal review and second vote as to whether the move had, in fact, been adequately completed.

Of course, if anyone tries to perform a page move without having assembled a team and garnered a consensus vote that the team is adequate, the move would be reverted as against policy.

Note that in early stages of an article's existence, it would be plausible that a single person could do the work him- or herself, but would still have to make a public declaration and garner consensus before making the move. The main point is that the consensus would be a judgement that the person making the move would do the whole job responsibly, not a judgement on whether people agree with the move itself.

Just a thought. Dpbsmith (talk) 13:17, 13 Apr 2005 (UTC)

This is a completely unreasonable amount of bureaucracy, seemingly designed to intentionally discourage page moves. Assemble a team, two votes and a formal review for every page move? And why would all of this apply only to diacritic-related moves... why aren't you proposing this for every single move of any kind?
It's also not clear exactly what you're saying... why would articles need to be edited to remove double redirects? To remove a double redirect, you just edit the redirect itself. -- Curps 17:47, 13 Apr 2005 (UTC)
That's instruction creep. --cesarb 17:52, 13 Apr 2005 (UTC)
Sure. I would acknowledge that it's instruction hopscotch, instruction gallop, instruction pole-vault... Dpbsmith (talk) 20:40, 13 Apr 2005 (UTC)
We're not aiming for consistent use at all; that would be almost impossible. Article texts are free to refer to either the article directly, or the redirect, regardless of which is the one with diacritics. We're only discussing where to actually place the article. "Wherever you like" stops working as soon as people involved with the article disagree among themselves. dab () 18:00, 13 Apr 2005 (UTC)

Wrongtitle excess

'Someone' has gone and used the wrongtitle tag on Abraham, Mao Zedong, and Al-Khwarizmi - apparently in the view that the proper titles to be used are those in their native language. This person apparently hasn't heard of the WP:UE policy, and doesn't understand the true consistent use of wrongtitle to emphasize either a minor technical difficulty particular to the entire world wide web, or else the cultural dignity embodied in the diacritic. -SV|t 22:31, 14 May 2005 (UTC)

Looking at the histories of those pages, I see that this "someone" is you. If you haven't heard of the WP:UE policy, simply read that page. It clearly states "Only use the native spelling as an article title if it is more commonly used in English than the anglicized form." So why exactly did you add those templates? If you think that the rules for titles should be changed, simply bring them up on this talk page rather than adding them to articles where they don't belong under the current guidelines. See WP:POINT. — Ливай | 23:10, 14 May 2005 (UTC)
Doesn't work that way. One has to make a point in order for a point to be made. -SV|t 23:48, 14 May 2005 (UTC)
Points can be made with words. There are plenty of places to do so here that you can use to draw attention to your point rather than putting out-of-place things on popular articles. If you feel particular articles are holding the template that should not, don't hesitate to bring it up on the articles' talk pages. If you feel that the template itself is flawed, say so on the template talk page. If you feel this is a larger consistency issue that affects the Wikipedia more broadly, there's always the Village Pump. If you don't state explicitly what you think the problem is, nobody is going to do anything about it. I asked you why you put the templates up and you did not tell me the point you are trying to make with them; I'm still not sure what exactly your point is, and I doubt I'm alone. On Wikipedia, actions don't always speak louder than words, especially when your actions are just going to be undone by the other editors anyways. — Ливай | 01:32, 15 May 2005 (UTC)

Existence versus common.

I am trying to follow this policy, but I don't really know if I am doing so correctly. I work a lot with European works of music, and the titles are renamed and moved all the time. Everyone is trying to help. Consider Puccini's most famous opera. According to "Title your pages using the English name, if one exists", it should be called "The Bohemian Girl" -- this is the original title given by the composer for the first English edition. According to "If you are talking about a person, country, town, movie or book, use the most commonly used English version of the name for the article (as you would find it in other encyclopedias)", this would be "La bohème" (the Italian title) by the encyclopedia rule, but "La Bohème" or "La Boheme" by the commonly used rule. Opera is a rather specialized subject. When titling pages, does "commonly used" refer to the general public or to people educated in the field? I don't want to move this page; it has been through enough. (Regrettably, some of it was my doing.) I just want a better understanding of how to follow this rule when titling new pages. --DrG 13:26, 2005 Jun 13 (UTC)

You raise some good points there. In my opinion we should lean towards the more academic usage where there is a choice. This is what we're currently proposing at Wikipedia:Naming_conventions_(Old_Norse/Old_Icelandic/Old_English). To me academic usage has connotations of academic accuracy (which we probably want). In your Bohemian problem I would probably prefer "La bohème". - Haukurth 17:31, 15 Jun 2005 (UTC)

Even foreign words used as foreign words need to be fully Anglicized - Jimbo Wales

Taken from Talk:Fucking Åmål Revision as of 19:55, 7 December 2001 [1]. Here are some dated thoughts by Jimbo Wales:

-------------------------START------------------------------------------------------

I would guess that the reason we don't write Pokemon in Katakana is that the vast majority of our readers wouldn't have any clue how to pronounce it. That's not the case with Å -- it's still an "A", after all.


I suppose it's worth mentioning that these letters do actually exist in English anyway, in a few cases where we've ripped a word steaming from the chest cavity of another European language. I grant you that they're rare -- Noël is the only common one I can think of -- but they do exist. -- Paul Drye


I disagree very strongly. We should not include Katakana in the text of any articles, except perhaps an article on Katakana. Katakana is completely unreadable to people who have not studied Japanese. Browsers don't consistently render any versions of it of which I'm aware, either.


In this particular case, the Amal "a with ring" was written in some way that actually broke the link (at least on this machine, my home pink iMac using Netscape 4.7 or something like that).


I suppose it could be a matter of some controversy as to whether Noel (see, I don't even know how to type 'e with two dots') exists in English. English is a mongrel language, without even the pretense of central authority as found with, for example, French. English is as English does. My perspective is that if I don't see it on my keyboard, and if I didn't sing it in the alphabet song, it's 'fancy' and therefore should be avoided.


Try searching on the net for Gödel -- it's not a good thing. Try either Godel or Goedel, both common Anglicizations, and you're good to go.


--Jimbo Wales


This loops back to another discussion I was in a few days ago. If one's concern is getting a hit from a search engine, don't avoid variant spellings. It's not a case of Godel or Gödel, it's a case of Godel and Gödel. Designate one as the name you're going to use in the article, but be liberal about listing useful variants. It helps both machine searches, and reassures the reader that he's on the right page even if he approached it with an unusual spelling in mind. Hence the title of this article being non-accented English, and the first sentence giving us rings and the "translated non-controversial" title. We've got all the bases covered. -- Paul Drye


I find this argument completely and overwhelmingly compelling, and I withdraw all objections to placing as many variants as deemed necessary within articles, so long as they are renderable by most browsers. --Jimbo Wales



Yes, titles must not use characters that are not legal in URLs, and that precludes any non-7bit-ASCII. But we're talking about the body text of the article here. Whether or not one chooses to use diacritical marks in standard English borrowings (in words like coördinate, naïve, résumé, etc.) is a separate issue. I generally leave them out; CMS is non-committal. But when the word in question is not a borrowed one, but actually a foreign one used as such, I think it's important to get it correct in the body of the article at least in the initial sentence. If it requires non-ISO characters, then the Anglicization used for the title can also be used for the rest of the article. This has been discussed in a lot of diverse places; we should probably have a policy article that consolidates them. --LDC


But Amal is not being used as a foreign word is it? No more so than, for example, Junichiro Koizumi, Prime Minister of Japan. Shall we write Osama Bin Laden in Arabic? Amal is a name, like those.


Even foreign words used as foreign words need to be fully Anglicized, I think. Perestroika. Glasnost. Writing those in Russian characters would render the article unusable to English speakers.


Obviously, my examples don't fully address the issue. The important distinction between Japanese and Arabic names and Swedish names is that Swedish names are 'borrowable' in the sense that I can at least still read it.


Also, see above, where I agree completely with Paul Drye's reasoning, and so withdraw all these objections.


--Jimbo Wales


Also, p.s., take a look at Junichiro Koizumi. Someone has placed something fancy after his name, which I assume would render it in perfect Japanese. Perhaps this will draw Japanese speakers to wikipedia, I don't know, but that would be a good thing. But in my browser, all I see is some question marks.  :-( That doesn't strike me as a good thing.

-------------------------END------------------------------------------------------

Food for thought Philip Baird Shearer 18:57, 27 July 2005 (UTC)
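
Paul Drye's suggestion in the excerpt above (pick one form for the title, but list the useful variants so that any of them can be found) can be sketched as follows. This is an illustration under stated assumptions, not a description of how Wikipedia or any search engine works: the function name `spelling_variants` is hypothetical, and the German-style expansions (ö to oe, and so on) are an assumed rule that happens to reproduce the "Godel" and "Goedel" forms mentioned in the excerpt.

```python
import unicodedata

# Assumed German-style expansions; other languages would need other rules.
_EXPANSIONS = str.maketrans({
    "ä": "ae", "ö": "oe", "ü": "ue",
    "Ä": "Ae", "Ö": "Oe", "Ü": "Ue",
})


def strip_diacritics(text: str) -> str:
    """Remove combining marks after canonical decomposition (NFD)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def spelling_variants(name: str) -> list:
    """Return the name plus its plain-ASCII-style variants, without duplicates."""
    candidates = [name, strip_diacritics(name), name.translate(_EXPANSIONS)]
    variants = []
    for candidate in candidates:
        if candidate not in variants:
            variants.append(candidate)
    return variants


print(spelling_variants("Gödel"))
```

Listing every returned variant in the opening sentence of an article (or creating a redirect for each) is one way to cover "all the bases" in the sense described above.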

Native spelling

"Only use the native spelling as an article title if it is more commonly used in English than the anglicized form." and "If there is no commonly used English name, use an accepted transliteration of the name in the original language." What exactly is "native spelling" or "original language"? Which language is considered "native" in multilingual environments? What about disputed (or occupied) territories, rivers that flow through multiple countries/language zones, etc.? --Lysy (talk) 13:50, 31 July 2005 (UTC)

River is a very good example! w:nl:Rijn, w:de:Rhein, w:fr:Rhin. In English we find a fourth spelling, Rhine. --Patio 10:49, 7 August 2005 (UTC)
And, accordingly, Rhine is the name for the river in English, hence the correct name for the article. Any territorial or riparian disputes belong in the body of the article, not the title. Robert A West 19:27, 9 August 2005 (UTC)

Write for the reader

The target reader of the English-language Wikipedia is, perforce, a native speaker of English, the majority of whom are functional monophones. Moreover, there is no rational expectation that all of them will update to full support for any particular character set by a date certain. Most users will employ IE with the factory settings, which simply do not support many of the characters that are currently being used in titles.

I realize that those two facts displease a substantial number of people in this discussion, but it is not the mission of Wikipedia to change either fact. Some people may think the title does not matter, but it is more jarring to see an unreadable mush at the top of the screen than to find that a relatively small part of the text is unreadable (at worst) or uses unfamiliar marks (at best), especially since the title-form is the one that generally will be repeated throughout the article itself.

It is better to be reader-friendly than to be on the cutting edge of any social or technical movement. Robert A West 00:26, 10 August 2005 (UTC)

Target reader is an English native? I would not support this. I like reading the English WP, whilst being a German native. At the same time I prefer non-diacritics in article titles. I wanna read English, not internationalish. BTW even people that are not monophones seldom can read Vietnamese. Monophony shouldn't be the argument. Tobias Conradi (Talk) 20:36, 18 August 2005 (UTC)
Just because lots of people for whom English is a second language show up here, don't assume that such people are a primary target for this Wikipedia - and to the degree they are, this may well be transitory, to boot. (Right now, the English Wikipedia has a lot more articles than Wikipedias in other languages. However, over time, this will change. Once the X Wikipedia has a good percentage of the number of articles that the English one does, native speakers of X are going to look there for content.)
Your point about Vietnamese is a good one, though - we are discriminating against languages which don't use (basically) the Latin alphabet, in saying "use native spellings in German, Croat, etc, but not in Russian, Greek, Arabic, Japanese, Chinese, etc, etc, etc". Noel (talk) 23:36, 29 August 2005 (UTC)
We have to "discriminate" against languages which do not use the Latin alphabet. Cyrillic, Greek, Arabic, Japanese etc. must be transliterated for us to be able to read the word. Accents over or under existing Latin characters are, however, a much different scenario.
Any citation of technical limits on displaying such characters is not credible. I've not yet found an English system which does not come with the ability, out of the box, to display áä etc., EXCEPT for a limited version of PCs donated by Microsoft which was pure ASCII.
If you wish to eliminate non-ASCII characters entirely, it will be to the detriment of Wikipedia itself: you would have to change every word in every article to conform to the outdated ASCII format, which does not include, for example, the majority of monetary signs, to give you an idea of what a computer alphabet contains. I do wonder also what it does for the geographical knowledge of English readers; I'm not sure how far I could get in the USA if I insisted I did have a flight booked to Fönix or Vasíngton or Englaborgin. A traveller doing his research on Wikipedia will find out on arriving that his information is written wrong, and it hinders his ability in the country, having used a devalued version of a "guide".
Plus the comment about the readers being unable to comprehend non-ASCII characters in the article, completely spoiling it for them, shows a complete lack of faith in their abilities. Plus, the more they see accents, the more comfortable they will be with them, and that may lead them to be able to go through everyday life without falling into a catatonic shock at seeing a stray ä. --Stalfur 11:51, 8 October 2005 (UTC)