Wikipedia talk:Manual of Style/Arabic


The apostrophe in Qur'an

The table at WP:Manual of Style/Arabic#Examples gives “Qur'an” as the standard transliteration, but the article itself is at Quran, the link in the title above being a redirect. (I notice that the {{Cite quran}} template lacks the apostrophe as well.) But titles are sometimes subject to special policies. What is the consensus for the spelling in articles, where current practice seems to be mixed? Should the example be changed, leaving the apostrophe only in the “strict” column?—Odysseus1479 (talk) 23:02, 10 November 2012 (UTC)

In article titles, the single apostrophe (') is often used for `ayn ع (e.g. Ta'awwudh), but in ordinary transcription style it would represent a glottal stop. Not sure that such de facto quasi-contradictions between article title practices and other transcription practices have been resolved. Notice that the "strict" column uses (’), not (')... AnonMoos (talk) 05:36, 11 November 2012 (UTC)
Though in Nastaʿlīq script an actual (ʿ) appears in the article title. AnonMoos (talk) 14:28, 14 November 2012 (UTC)
Removed the contentious example. --Francis Schonken (talk) 13:36, 14 September 2015 (UTC)

Preferred form where no Wikipedia article exists

For some reason I hadn't come across these guidelines before, but I'm finding them very helpful. I'm working on several articles about the history and architecture of southeast Anatolia and al-Jazirah between the time of the Muslim conquest of this area (638) and when it was added to the Ottoman Empire (around 1515). A lot of my sources for this period are historians writing in English, who themselves rely primarily on Arab sources. These accounts tend to have quite a lot of references to historical figures who don't currently have Wikipedia articles (and may never). As I don't read Arabic, I am taking personal and geographic names from these secondary sources, where they are typically rendered in a strict transliteration (e.g. al-Muwaḥḥid ʿAbd Allāh), rather than making my own transliterations from Arabic.

These MOS guidelines are pretty clear when dealing with the subject of an article, but I can't find a clear statement of how to handle transliteration of other names and terms that don't merit their own articles. My interpretation is as follows, but please correct it if I have misunderstood.

  • For any name or term taken from Arabic use the primary transcription where one exists.
  • In the absence of a primary transcription, use the standard transliteration (ALA-LC Romanization, no diacritics).
  • If the name or term merits its own Wikipedia article, list the strict transliteration (and original Arabic form) in the lead paragraph. In many cases this would be the only place we would use the strict transliteration. Direct quotations and titles of reference works might be the other places.

Assuming that my understanding above is correct, I will need to fix a number of articles to use standard transliteration in place of strict transliteration.

Would it be possible for someone to rewrite this MOS guideline to clarify this aspect? I think what's needed is to add a subsection for "Preferred form" within the "Proposed standard" section. It could be as simple as stating that where an Arabic name or term does not meet the criteria for its own article, it should still appear in other Wikipedia articles using the approach laid out in Wikipedia:Naming conventions (Arabic).

That does raise another question for me: A lot of the relevant historical figures during this period have pretty long names with chains of nasabs (e.g. Amir Saif al-Dīn Shīrbārīk Maudūd bin ʿAlī (bin Alp-Yaruq) bin Artuq). Where the person has their own article, it seems fine that I just refer (and link) to that form. Where there is no article for that individual, it would seem best to use the full name on first reference, and then use the most common form for subsequent references. Is there an MOS guideline that covers this area? Rupert Clayton (talk) 19:44, 5 January 2015 (UTC)

Guidelines for non-Arab people in Arabic sources

I have another transliteration question related to the one above. A lot of the historical figures who appear in the articles I'm working on are non-Arabs who operated in an Arabic-dominated culture, or whom we know about primarily from Arab historians. They seem to break into three main groups:

  1. Non-Arabs operating primarily in an Arab culture: e.g. Saladin (Salah ad-Din Yusuf ibn Ayyub)
  2. Non-Arab Muslims whose lives were primarily documented in Arabic, but who may have primarily spoken other languages: e.g. the Rum Seljuks, the Türkmen Artuqids and Aq Qoyunlu, Ilkhanid Mongols.
  3. Non-Arabs who only appear in Arab sources as the object of alliances, military campaigns, etc: e.g. Byzantine Emperors, Georgians, etc.

It seems clear to me that people in group 1 should follow the guidelines for Arabic names, and people in group 3 should follow separate guidelines. Is there any guidance for people in group 2? Rupert Clayton (talk) 20:22, 5 January 2015 (UTC)

Thinking this through a little further, the languages these dynasties used were something like the following (very roughly):
| Dynasty | Court/official language | Lingua franca | Military language | Scholarly and literary language |
|---|---|---|---|---|
| Great Seljuks | Persian | Persian | Oghuz Turkish | Arabic |
| Artuqids | Arabic? | An Oghuz language similar to Azeri? | ? | Arabic |
| Rum Seljuks | Persian | Old Anatolian Turkish | ? | Persian |
| Aq Qoyunlu | Azerbaijani | Azerbaijani | Azerbaijani | Arabic and Azerbaijani |
| Ilkhanids | Persian | Mongolian? | Mongolian? | Persian? |
Most words that need transliteration are personal names of rulers and nobility, so in most cases, these can follow the transliteration style for Arabic or Persian as appropriate. In some cases, elements of names are clearly derived from a Turkic or Mongolian language. Where there is no primary transcription evident, I'm tempted to follow modern Turkish standards for Turkic names. I'll deal with the Mongols' names when I get to them. Thoughts? Rupert Clayton (talk) 01:51, 6 January 2015 (UTC)
Using Modern Turkish spelling to romanize Ottoman names is correct and standard in ALA-LC. That wouldn't extend as far back as Aq Qoyunlu or Seljuk names. For pre-Ottoman Turkic names, I think it would be best to either follow the prevailing form used by scholars or, failing that, a faithful transliteration from the Arabic alphabet. I doubt that modern Cyrillic orthography in Mongolian could be of any use for historical figures from centuries ago; you always see those names romanized from the original Classical Mongolian forms written in the vertical Uyghur script. Cyrillic Mongolian orthography is based on pronunciation changes that have taken place since the classical times and is sometimes very divergent from the original. Johanna-Hypatia (talk) 18:52, 8 February 2015 (UTC)

Misuse of the definite article

The lead of the article on the topic of the Arabic definite article states:

  • "Unlike most other particles in Arabic, al- is always prefixed to another word and it never stands alone. Consequently, most dictionaries will not list it as a separate word, and it is almost invariably ignored in collation." I am curious to know the implications of this especially as it applies to the representation of Arabic titles in English.

This comes in a context in which Wikipedia has a very clear policy / guideline at WP:THE, and content at WP:CRITERIA that presents:

  • Naturalness – The title is one that readers are likely to look or search for and that editors would naturally use to link to the article from other articles. Such a title usually conveys what the subject is actually called in English.

In many cases, unless the "al-" is familiar, it will just get in the way, and further problems arise in situations such as when a name for the article is used "to link to the article from other articles". It may often be nonsensical for an article to present something like "... of the al-Link ...." and, in these cases, the problems of WP:THE seem to me to be exacerbated.

I am no great fan of WP:THE but I think that, if anything, the tendency has been to edit Arabic titles in a way that is quite contrary to this policy / guideline, and I am curious as to why this might be.

There is currently a discussion at Talk:Masjid al-Haram#Requested move 1 May 2015 in which I have presented quotes such as from the Encyclopaedic Dictionary of Religion: G-P which presents:

Masjid al-Haram : The Grand Masjid on Makkah. The Ka'bah (the Qiblah of the Muslims) is situated within it.
and I have invited editors there to present their own research and have suggested the use of searches such as:

(I mention this because I think there may be a perhaps related issue in regard to the presentation of the most commonly used name for a subject as it is used in English).

I would be grateful for any thoughts in regard to issues mentioned and on ways in which any potential issues might best be addressed.

I will also present content on other inappropriate or possibly inappropriate uses of related titling in a sub-thread below and, at editors' discretion, suggest that any related discussion/response may continue in a "related discussion" section after that. Thanks. GregKaye 08:58, 5 May 2015 (UTC)

Examples and suspected examples of the misuse of "Al-" in title texts

Feel free to add examples of related articles:
GregKaye 08:58, 5 May 2015 (UTC)

Related discussion

Replies, thoughts, responses related to the above will be gratefully received: GregKaye 08:58, 5 May 2015 (UTC)

Sun letters, once again

The MoS has a statement in the section on definite articles: "For this manual of style, assimilated letters will be used, as it aids readers in the correct pronunciation." This seems a bit strange. Is it stating that assimilated letters must be used? If so, it seems an unusually strict requirement which opens up quarrels and confusion. ALA-LC does not use assimilated letters, and the preponderance of ALA-LC transcription in English texts (certainly in the US) means that most academic and other source material will not be using them. Are we requiring Wikipedia editors to know that the transcriptions they are coming across and using in articles are "wrong" by that policy statement? Most Wikipedia editors are not Arabic-speakers, have no idea about assimilated letters or pronunciation, and will transcribe names as they read them (e.g. "Harun al-Rashid" rather than "Harun ar-Rashid"). This seems a very strange demand to make upon them, particularly as it opens up quarrels between editors, between the few who insist on assimilated letters (unusual in the literature) and the bulk who insist on keeping it unassimilated (as they commonly read it). It will also likely introduce inconsistencies across articles, e.g. the main article being forcibly named "Harun ar-Rashid", but other editors continue to innocently write "Harun al-Rashid" in other articles, not imagining it might be inconsistent. It seems to me the WP:MOSAR statement is being unproductive here, imposing a rule that contradicts common usage and will likely lead to quarrels between editors. I would like to propose a modification in the MoS which states that assimilated letters can be used, if shown to be the preponderant transcription, and eliminate the statement that "it will be used", which seems an anomalous imposition. Walrasiad (talk) 23:48, 1 June 2015 (UTC)

I’m inclined to agree. The premise quoted above seems flawed: pronunciations and native orthographies usually appear near the beginning of the lead, so I don’t think promotion of correct pronunciation is generally an important consideration for our spelling norms, especially where the result would differ from the most frequent form. I confess, however, that I don’t understand the rationale for many of our spellings (for example how we came up with the peculiar Quran, neither the commonly encountered fish Koran nor the more precisely transcribed fowl Qur’an, so to speak), and I’m not familiar with the academic literature, so I may be missing something.—Odysseus1479 00:36, 2 June 2015 (UTC)
The project page explains that per Wikipedia's Manual of Style, if the title of an article is to be based on standard transliteration from Arabic, then assimilated letters will be used. I strongly disagree with your proposed modification. Thousands of article titles on Wikipedia, about place names, people, etc, use the assimilated letters per WP:MOSAR. Some of the titles (only where the assimilated definite articles occur at the start of the title) can be found in these lists: All pages with prefix at-, All pages with prefix ad-, All pages with prefix adh-, All pages with prefix ar-, All pages with prefix az-, All pages with prefix as-, All pages with prefix ash-, All pages with prefix an-. Khestwol (talk) 08:18, 2 June 2015 (UTC)
This is not about article titles. This is about common usage in article text. I've written hundreds of historical articles on Wikipedia, many of them relating to North Africa, with hundreds of Arabic names, and I assure you that not once did I use assimilated letters, for they were not used in my sources. This was not a conscious decision - I was simply following common usage in academic sources and other texts, and was blissfully unaware of this weird quirk in MOSAR. I presume a lot of editors are like me, in that they write based on sources, and wouldn't in a hundred years imagine that MOSAR has a tiny codicil which contradicts common usage. (Not least that a lot of editors simply don't know Arabic, and wouldn't know what a sun letter was if it hit them in the face, so wouldn't even imagine this to be a question; this codicil seems to expect that any articles that refer to an Arabic name are only going to be written by Arabic-speakers, aware of nuances of Arabic pronunciation?) In other words, articles will continue to be written with unassimilated letters. If this is a policy, then it is patently unenforceable, and will foster inconsistency and quarrels. Will Wikipedia establish a "sun letter task force" to comb through all these articles to ensure compliance with this bizarre requirement? Or is this going to remain a hidden weapon for sun letter warriors to surprise writers, contradict common usage, and provoke quarrels and acrimony between editors? This is an immensely unproductive codicil that needs to be rectified. If not rectified, then I would request that the statement at the top of MOSAR requesting editors to follow ALA-LC be corrected or removed entirely, as it is very deceptive. The MOSAR statement at the top needs to be re-written to clarify that Wikipedia intends to promote pronunciation, not common usage. Walrasiad (talk) 12:29, 2 June 2015 (UTC)
Walrasiad, BTW, I personally find that most relevant titles on Wikipedia, especially those which are about the Arabian Peninsula or areas east of the Maghreb region, do assimilate al- before a sun letter. Also see the lists I provided above. You may be right about the Maghreb, but I guess that is because the transliterations from that region are more under French influence, maybe? Many of the articles from that region do not use spelling normal in English. But I myself try to assimilate al- before a sun letter in titles. And BTW, this is coming from a non-Arabic speaker. I have very little knowledge of Arabic. However, I am aware of the correct pronunciation of the definite article so I always try to follow WP:MOSAR. Khestwol (talk) 13:10, 2 June 2015 (UTC)
I don't think it is a regional issue. It is a source issue. If articles are written based on written English sources, particularly academic or recent ones, ALA-LC will very likely be followed and sun letters will be unassimilated. While you may be aware of correct Arabic pronunciation, I don't expect most Wikipedia editors are. They write as they read. And they read unassimilated. Walrasiad (talk) 13:28, 2 June 2015 (UTC)
ALA-LC is not the only romanization. See Romanization of Arabic; there are also others in common usage. I think that most do assimilate al- before a sun letter. A change like the one you are recommending won't be constructive to implement at present. It will lead to incorrect, simplistic pronunciations which are not based on correct Arabic, and many, many articles will have to be moved away from their present, correct titles. I think the MoS is fine as is. Khestwol (talk) 13:46, 2 June 2015 (UTC)
ALA-LC is, far and away, the most widely-used romanization in modern English sources, particularly academic ones, and increasingly in journalism and elsewhere. In short, it's what Wikipedia editors are reading, and the sources they will be using to write articles. If ALA-LC leads to incorrect Arabic pronunciation, so be it. This is not an Arabic language class, it is an encyclopedia. Communicating information, not teaching pronunciation, should be the paramount concern. The question of assimilation is evidently something that both the ALA and LC carefully weighed and decided upon, and one the bulk of academics agreed with. If you think they were all wrong-headed, and you know what is best, perhaps you should write a letter to the Library of Congress expressing your misgivings. I wish you luck. That said, I remind you that I am not requesting imposing ALA-LC in Wikipedia. What I am requesting is flexibility, removing that draconian codicil so as to allow unassimilated spellings to also be used here, and ensure they aren't automatically and controversially overridden by MOSAR, regardless of common usage and causing edit wars and quarrels. Walrasiad (talk) 14:40, 2 June 2015 (UTC)
PS.- I am going to pause my commentary, to allow other editors to speak. While this is an outgrowth of Talk:Harun ar-Rashid#Requested move 19 May 2015, I do not want this to become a continuation of that. I would be interested in hearing what other editors, not involved in that particular page, think about the proposal. Walrasiad (talk) 14:50, 2 June 2015 (UTC)
Well I think that also allowing unassimilated spellings will gain for us nothing and will only cause a big WP:CONSISTENCY issue. Khestwol (talk) 15:13, 2 June 2015 (UTC)
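
To make the two spellings debated in this thread concrete, here is a minimal Python sketch that converts between unassimilated forms such as "Harun al-Rashid" and assimilated forms such as "Harun ar-Rashid". It is not part of the MoS or of any Wikipedia tool. It assumes simple, diacritic-free romanization, an article written as "al-" or "Al-" joined by a hyphen, and the usual list of sun letters as they appear in that romanization; the function names are hypothetical.

```python
import re

# Sun letters as they appear in simple (diacritic-free) romanization.
# Digraphs come first so that "sh" is tried before "s". This list is an
# assumption of the sketch, not something the MoS spells out in this form.
SUN_ONSETS = ("th", "dh", "sh", "t", "d", "r", "z", "s", "n", "l")

# "al-" at the start of a word; the lookahead captures the word that follows.
_UNASSIMILATED = re.compile(r"\b([Aa])l-(?=(\w+))")

# An assimilated article: "a" + sun letter + "-" + the same letter again.
_ASSIMILATED = re.compile(r"\b(a)(th|dh|sh|t|d|r|z|s|n)-(?=\2)", re.IGNORECASE)


def assimilate(text):
    """Rewrite 'al-Rashid' as 'ar-Rashid' wherever a sun letter follows."""
    def repl(match):
        following = match.group(2).lower()
        for onset in SUN_ONSETS:
            if following.startswith(onset):
                return match.group(1) + onset + "-"
        return match.group(0)  # moon letter: leave 'al-' untouched
    return _UNASSIMILATED.sub(repl, text)


def unassimilate(text):
    """Rewrite 'ar-Rashid' as 'al-Rashid' (the ALA-LC style spelling)."""
    return _ASSIMILATED.sub(lambda m: m.group(1) + "l-", text)
```

Normalizing both spellings to one form before comparing them is one way to spot articles that mix "al-Rashid" and "ar-Rashid", whichever form ends up being preferred.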

"Dynastic Al"

As discussed above, "Al" when not the article means "family", not "dynasty" - even if dynasties take their family's name. Whenever the name of the ancestor is well known, the descendants may be known as "Al ...". See for example Al ash-Sheikh. Naming it "dynastic" may lead people to think these are reigning families, which is not necessarily true.

This section should also be part of Al (disambiguation) and Arabic names - I copied it there. --Df (talk) 20:56, 24 June 2015 (UTC)

In-line Arabic

Some articles require the use of the original word in Arabic script. It could be me, but I am coming across an increasing use of the big template for enlarging Arabic words. Personally, I don't like this since it increases the whitespace between lines, adds to a sometimes already complex syntax code, and I can't see much difference anyway (could be my browser settings). Is this the new standard for Arabic or does a normal font size suffice? - HyperGaruda (talk) 11:19, 8 August 2015 (UTC)

By the way, I recently compared this issue on Internet Explorer 11 vs Firefox 38 (my standard browser). It seems that Firefox currently uses a different character set for Arabic in the Arial font, a set which is larger/much more legible and Arial-like. IE11 on the other hand uses the illegible Arabic character set which I was used to until some time ago and which would justify the use of the Big template. - HyperGaruda (talk) 08:02, 9 August 2015 (UTC)
I agree, {{big}} is a kludge. The problem is that Arial is terrible for mixing Arabic with other scripts. It has a very large x-height for Latin letters, but a small dal-height for Arabic. (Cyrillic and Greek are matched to the Latin in x-height, weight, baseline and stroke taper, but its Arabic differs in all four.) I wonder if something could be done with an "Arabic-script" CSS class and the default stylesheets to select a different font? Pelagic (talk) 13:54, 18 September 2015 (UTC)
There is {{Script/Arabic}} which adds a script-arabic class, but it also has its own hard-coded font list, and sets the size to 125%. Pelagic (talk) 14:26, 18 September 2015 (UTC)
When using {{lang}}, most languages in the Arabic script are displayed in a more legible font on Chrome. Is there not a way to request a change to which default font is chosen for the Arabic script? Abjiklɐm (tɐlk) 17:51, 18 September 2015 (UTC)

The apostrophe (again)

Not sure if anyone's still watching this page, but there is some discussion going on about the standard transcription of ayin and alif. At least concerning the title. The Arabic naming convention only says to use the standard transcription method, but then there's also the convention to limit apostrophe usage to ' (straight apostrophe), so that ayin and alif would be transcribed identically instead of with ` and '. Any comments on that?

If we're going to use a straight apostrophe for both letters in the title, perhaps it's also an idea to extrapolate that to the standard transcription rules. It would not be completely absurd, considering that we're also writing ض/د and ص/س with the same letters. - HyperGaruda (talk) 14:24, 16 September 2015 (UTC)

The problem I see, with `ayn in Arabic, is that the sound is described as a vocalized fricative somewhere in the back of the throat. That makes it closer to h or gh in English, though I don't know of any transliteration schemes that use those for `ayn (cf. ghayn).
Perhaps the symbol ʻ or ʿ was chosen for its similarity to Greek polytonic spiritus asper / dasia ʽ (which we transliterate as "h" when dealing with Greek)? But most English-speakers when seeing an apostrophe-like character will either ignore it, or if between vowels treat it as a hiatus or stop rather than a fricative. Given that I can't pronounce the Arabic `ayn sound, is it worse for me to mangle it to "hayn" or "ayn"?
From a purely typographic perspective, I do find backtick/grave ` visually distracting in the simple/loose transcriptions. I was in favor of substituting straight-apostrophe ', until I learned how different the sounds are for `ayn and hamzah. (Funny that it's called hamzah and not 'amzah?) Now I just don't know which approach is best (or least bad).
According to the hamza article, the form of ء is derived from ع, so it's not absurd that we might choose to conflate the two.
Is there a Wikipedia standard for transcribing Hebrew ayin in running text and page titles? Hebrew alphabet#Transliterations and transcriptions says that the "Standard Israeli transliteration" uses apostrophe ' for both alef א and ayin ע in medial positions and omits them when initial and final. This might be appropriate for modern pronunciation, but do we also use that for ancient or medieval terms?
  • Good luck finding Tiberian Hebrew in running text and titles. As far as I understand, the Hebrew wikipedians always use the modern Hebrew pronunciation in the main text (save for the strict transliterations in leads, like us), so by definition you'll only find the standard rules you mentioned. - HyperGaruda (talk) 15:58, 21 September 2015 (UTC)
You're right. I found the Hebrew naming conventions document, which also covers running text, and recommends "general-purpose, diacritic-less transliteration" from SBL Handbook of Style. That seems pretty close to the transcription for modern Hebrew. I'm not sure how much it's followed in practice (many biblical and religious terms would have established common English names), but where applied would mean apostrophe-or-omitted for both ayin and aleph. Modern Arabic still distinguishes the two sounds, so it's a different situation; I mentioned Hebrew to compare what editors are doing with another Semitic language/script. Pelagic (talk) 21:19, 22 September 2015 (UTC)
Of course we need to be really careful not to use ` for ayin in other languages like Turkish or Persian.
  • I do not think that is going to be a problem, since these languages don't have the ayin sound except in Arabic loanwords (which are covered by this MOS). It additionally seems that Persians pronounce ayin like hamza (glottal stop). - HyperGaruda (talk) 15:58, 21 September 2015 (UTC)
Pelagic (talk) 21:12, 20 September 2015 (UTC)
Although ayn is described as a fricative, it is virtually always approximated as a glottal stop or omitted when the speaker cannot pronounce it. For example, the names Ali and Omar both start with ayn. So to answer your question, it should definitely be 'ayn, not hayn since ignoring the ayn is a better alternative than to approximate it with an h. As for the hamza, the name starts with h because the actual Arabic name for this symbol also starts with h. But the sound it denotes is a glottal stop nonetheless.
About the symbol `, I also find it less than appealing. Since both the hamza and ayn are ignored in English, I think it is acceptable to transcribe both with an apostrophe within titles, and to use the strict transliteration in the lead. Abjiklɐm (tɐlk) 01:46, 21 September 2015 (UTC)
If it's acceptable for a foreigner to pronounce `ayn the same as hamza, or omit it altogether, then perhaps it is more like the situation with sīn س and sād ص than I had thought. Pelagic (talk) 21:34, 22 September 2015 (UTC)

So far it looks like we are heading towards ayin&hamza='. I propose to update the MOS on October 1st, unless someone objects in the meantime. - HyperGaruda (talk) 15:58, 21 September 2015 (UTC)

I agree with using a straight apostrophe (') for both ayn and hamza. The grave accent (`) is ugly, especially in the middle of words.--Axiom292 (talk) 04:55, 22 September 2015 (UTC)
In the original MOS discussion, Bgwhite cites some titles that might be changed including `Alya', the name of a village near Medina. Does that indicate a (maybe rare) example where conflating both ayn and hamza to the apostrophe character could increase confusion for an English-speaking reader? Doesn't 'Alya' (which is currently a redirect) look like we're using "scare quotes"? Rupert Clayton (talk) 00:09, 23 September 2015 (UTC)
That is a legitimate concern. However, after looking at that article in Arabic (ar:العالية (وادي الصفراء)), I should point out that the correct transliteration should actually be 'Aliya (al-ʻĀliyah). Abjiklɐm (tɐlk) 03:13, 23 September 2015 (UTC)
Also, the current convention says to omit alif/hamza in the initial position of a word. I think we should extrapolate that to ayin. - HyperGaruda (talk) 09:52, 23 September 2015 (UTC)
I was going to ask about the initial position. Had been expecting that initial 'ayn would have a leading apostrophe but initial hamzah would be omitted, and wondered if it would cause confusion with people mistakenly deleting apostrophes. (Though the looks-like-quotes problem hadn't occurred to me.) Pelagic (talk) 12:23, 23 September 2015 (UTC)
The initial hamza is omitted because that is specified in the ALA-LC transliteration scheme. Following these rules, we should not omit initial ʻayn. What confusion do you mean? Abjiklɐm (tɐlk) 18:32, 23 September 2015 (UTC)
Thanks, Abjiklɐm. As a non-speaker of Arabic, it didn't occur to me that there might be bigger problems with transliteration of `Alya' / 'Aliya / al-ʻĀliyah. I don't think a convention to render both 'ayn and hamza with the apostrophe character would cause confusion in this case. You point out that ALA-LC advises omitting an initial hamza, and retaining an initial 'ayn. Are there cases where the proposed rule would result in an apostrophe character for hamza or 'ayn at the end of a word or phrase that also starts with 'ayn? If so, it seems these titles could be confused with a name in quotes. Is that likely to be a significant problem? Rupert Clayton (talk) 23:32, 23 September 2015 (UTC)
I had a quick look in the Hans Wehr dictionary, and there are indeed words that start with ʻayn and finish with a hamza (ʻibʼ, ʻabāʼ). I've honestly no idea whether this will be an actual problem we're likely to run into. Is there a way to get a list of all articles starting with `?
While we're on the topic of apostrophes, I've noticed the Hawaiian ʻokina is shown with a curly ʻ even in titles. Is there any opposition to using ʼ and ʻ for Arabic titles instead of '? With a redirection from straight quotes of course. Abjiklɐm (tɐlk) 14:33, 24 September 2015 (UTC)
  • Re. Is there a way to get a list of all articles starting with `? – here they are
  • Re ...whether this will be an actual problem we're likely to run into – `Alya' is the example given above, and we ran into it.
  • Re. Is there any opposition to using ʼ and ʻ for Arabic titles instead of '? – yes, see "some discussion" link provided in the post opening this talk page section. Following the link you'll see I wrote: "Brought back some former guidance on this to WP:AT#Special characters, second bullet (this only applies to the article title, but as the discussion ... is on a page move this is rather AT than MOS matter)". Then you'd have to follow that link to the AT policy page and look at the content of the second bullet of that policy section. Sorry this discussion got a bit fractured.
  • Again, the "generally avoid apostrophe-like characters apart from the apostrophe itself and even use that one sparingly" is to make sure articles can be found by typing in the "search" box (upper right corner of any Wikipedia page) using only direct keys on a standard QWERTY/AZERTY keyboard and not knowing how to generate the other special characters there. Redirects can take care of the rest.
  • This is not about redirect pages, which, of course, can be created with all the exact characters in the article title, redirecting to the variant with the apostrophes (or without them if an even less strict transliteration or translation is chosen for the article title of the page where the content is).
  • Also, the AT policy allows ayn, hamza, okina and whatever in article titles of content pages, but considers it an exception that can only be allowed on a case by case basis, with a solid consensus for that particular case, and with all the redirects from less strict transliterations/translations in place like you can see here.
  • Also again, this is not about how such names are written in article text where exact transliterations can be used (which is MOS matter, not AT matter). --Francis Schonken (talk) 15:24, 24 September 2015 (UTC)
Thanks for breaking out the issues. Would redirects not also take care of the problem of typing special characters in the search box? I can type "Abdul-Baha" there and go straight to the `Abdu'l-Bahá article (by virtue of the redirect). Presumably that would work even if the article were titled ʻAbduʼl-Bahá. That would seem to indicate that ease of use of the search box is not by itself a strong reason to avoid special characters in titles. The policy at WP:TSC doesn't actually say why to avoid "apostrophe-like characters" in titles. Perhaps the prioritization of popular articles in the search box doesn't carry over to redirects? Perhaps external search indexing (e.g. Google) would be impacted? If there's no strong case that using ʼ and ʻ in article titles would cause harm, it's starting to look like AT policy is simply defending a convention from an earlier age, in the same way that some people insist on using two spaces after a period. Rupert Clayton (talk) 16:24, 24 September 2015 (UTC)
Answer is in 6th bullet. --Francis Schonken (talk) 16:33, 24 September 2015 (UTC)
Hmm, your 6th bullet says there's an option for case-by-case exceptions, which is not what I'm talking about at all. If "apostrophe-like characters" in titles cause no actual harm, then it would seem they should be allowed as standard and not require a special exception. Conversely, if the policy remains that there needs to be solid consensus before an exception can be made to allow "apostrophe-like characters" in a title, then presumably there's some clear downside to allowing them. What that is has not been made clear. Rupert Clayton (talk) 18:59, 24 September 2015 (UTC)Reply
See what you mean, the question at least. The problem is with the redirects not being created. --Francis Schonken (talk) 19:04, 24 September 2015 (UTC)Reply
Redirects were surely not the problem in the case of Sha'ban, yet it got this whole discussion starting again. - HyperGaruda (talk) 05:31, 25 September 2015 (UTC)Reply
Re. "redirects ... not the problem" – Correct, but lack of consensus was. Again, I draw your attention to the content of the 6th bullet in my list above: it mentions both aspects:
  1. "solid consensus"
  2. "all the redirects from less strict transliterations/translations in place"
Both conditions need to be met.
Re. "got this whole discussion starting again" – Let's face it: Wikipedia:Manual of Style/Arabic is a mess, so don't feel bad that a single example led to an invitation to shape it up. As long as that is not possible (like it has been for the last 10 years or so), the policy at WP:AT will have to make do. --Francis Schonken (talk) 08:03, 25 September 2015 (UTC)Reply

Point taken, so back to the issue at hand: the initial apostrophe for ayin. So far there have been three opinions on that:

  • Use curly apostrophes (with redirects), distinguishing between ayin (‘) and hamza/alif (’); essentially this is the strict transliteration.
  • Use straight apostrophes, but only in the case of ayin and not for hamza/alif.
  • Leave the initial apostrophe out altogether.

I still stand behind the third option, not (only) because it is my own, but because primary transcriptions (i.e. common names used in the media) tend to do it that way. For example, names like Abdullah and Ali or nouns like Eid (as in Eid al-Fitr) tend to be written without initial apostrophes despite beginning with ayin. - HyperGaruda (talk) 08:55, 25 September 2015 (UTC)Reply

Agreed with the third option. The current convention is to leave out initial apostrophes (as in Asr prayer, Eid al-Adha, etc). Khestwol (talk) 09:00, 25 September 2015 (UTC)Reply
  • I've no opposition against the third option. To be clear, does the proposition include using the same symbol for both hamza and ayn in non-initial position? Abjiklɐm (tɐlk) 16:19, 28 September 2015 (UTC)Reply
    • Yes. As far as the standard -not strict!- transcription goes, hamza and ayn will be treated exactly the same. So straight apostrophes, and only in non-initial positions. - HyperGaruda (talk) 17:22, 28 September 2015 (UTC)Reply
      • Thanks, Francis, for the explanation that AT policy avoids apostrophe-like characters because we can't rely on redirects being in place. HyperGaruda's argument from simplicity and media convention makes some sense. It looks like we're not at the point where WP can lead convention on Arabic transliteration, so options 2 and 3 (using apostrophes) seem like the best choices. So, the question is: Should WP follow ALA-LC in titles and omit the initial hamza and not the initial ʻayn? Rupert Clayton (talk) 16:43, 29 September 2015 (UTC)Reply
I thought we had already reached consensus on "apostrophe vs apostrophe-like characters" in general, a couple of comments earlier... Anyway, regarding the initial apostrophe there are two (three including me, 3.5 including Rupert's 50/50 choice) editors in favour of leaving it out altogether and none against. "Should WP follow ALA-LC...?": yes in strict transliterations; no in standard transcriptions (which includes titles and general use in the main text), which already is the case currently. I'll edit MOS-AR this afternoon (UTC) so that there is at least a visual example of what we're discussing here. - HyperGaruda (talk) 07:38, 1 October 2015 (UTC)Reply

The reason why none of these proposals are still going anywhere is imho very simple: they try to impose a rule across all articles and are still unsuccessful in dealing with the consensus that may arise around some article titles. These consensuses are real and won't be overthrown by would-be guidance. That's why the WP:TSC policy contains "If, exceptionally, other variants are used a redirect with the apostrophe variant should be created (e.g. 'Abdu'l-Bahá redirects to `Abdu'l-Bahá)." I don't think `Abdu'l-Bahá is going to change anywhere soon, so deal with it.

Another reason why these discussions progress so difficultly is that contributors don't pay much attention to what others write, they add something, apparently only half understanding what someone else wrote. E.g. above I read "...options 2 and 3 (using apostrophes)..." while "option 3" reads "Leave the initial apostrophe out altogether". So please stay focussed, and avoid fuzziness in the comments. How is someone supposed to be interested in your comment, if that comment is only paying half attention to the comments by others? --Francis Schonken (talk) 08:24, 1 October 2015 (UTC)Reply

Updated ayn

And additionally made changes to the wording of some phrases. - HyperGaruda (talk) 14:46, 1 October 2015 (UTC)Reply

Just a note that MOS:APOSTROPHE now prefers the use of {{ayin}}, {{aleph}}, and {{hamza}}. -- Beland (talk) 15:42, 6 August 2020 (UTC)Reply

Renaming suggestion

Perhaps we should rename the current standard transcription into something like basic transcription. I often get the feeling on other talk pages that people confuse standard transcription with strict transliteration. Although this might give problems with primary transcription, so maybe we should change that into common transcription. Any better naming suggestions? - HyperGaruda (talk) 19:20, 1 October 2015 (UTC)Reply

I think your choice of common transcription, basic transcription and strict transliteration describes well the different options. Abjiklɐm (tɐlk) 21:34, 1 October 2015 (UTC)Reply
Works for me! Another thing I think we might need to (re)consider is whether or not we're going to say anything about translations... The first sentence of the "Definitions" section reads "For the purposes of this convention, an Arabic word is a name or phrase that is most commonly originally rendered in the Arabic alphabet, and that in English is not usually translated into a common English word...." and that's the last thing said about translations. Now if we're going to talk about article titles in a coherent way, translations would probably need to be part of the guidance (e.g. a concept that is under a translated article title, can be transliterated in the first paragraph of the lede). Or leave article titles out of this guidance altogether? Which would mean reviving WP:Naming conventions (Arabic)... Don't know what would be best? --Francis Schonken (talk) 05:46, 2 October 2015 (UTC)Reply
Thanks for mentioning translations. It's a bit odd that the broadly titled "MoS/Arabic" hardly says anything about using translations in the first place. Then again, to the original editors it might have been obvious, that the mother of all trans-language wiki conventions comes first. I'm really against treating the article title differently, since it just adds more complexity to an already complex situation. Instead, simply stressing the importance of WP:English might be enough. - HyperGaruda (talk) 10:41, 2 October 2015 (UTC)Reply
Re. "...treating the article title differently...": well article titles are treated differently. That's basic MOS/AT distinction. The choice is whether we say something sensible about article titles in this guidance or whether we revive the derelict WP:Naming conventions (Arabic). Even if the last option is chosen WP:MOSAR should not contain guidance that is incompatible with article titling guidance, meaning: the complexities involving article titling need to be addressed anyway. That is, if we want this page to evolve from would-be guidance to an actual guideline. --Francis Schonken (talk) 11:48, 2 October 2015 (UTC)Reply
I think the current convention for article titles is a clear enough summary: Article titles are written preferentially using a common English translation. If unavailable the common transcription is used. If neither is available, the basic transcription is used. The strict transliteration should never be used in article titles. This brevity does not really warrant a complete article of its own imho. The current common and basic transcription rules are MoS-compatible, so I am not sure what other rules you want to add for article titles; care to explain? - HyperGaruda (talk) 15:21, 2 October 2015 (UTC)Reply
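The order of preference summarised in the comment above can be sketched as a simple fallback chain. This is only an illustration of the stated rule, not part of any guideline: the function name `choose_title` and its parameters are hypothetical, and it deliberately ignores the case-by-case exceptions that WP:AT allows.

```python
from typing import Optional


def choose_title(
    basic_transcription: str,
    common_translation: Optional[str] = None,
    common_transcription: Optional[str] = None,
) -> str:
    """Pick an article title following the stated order of preference.

    Hypothetical helper: it models only the three-step rule
    (translation, then common transcription, then basic transcription)
    and never returns a strict transliteration.
    """
    # 1. Prefer a common English translation when one exists.
    if common_translation:
        return common_translation
    # 2. Otherwise use the common transcription found in English sources.
    if common_transcription:
        return common_transcription
    # 3. Fall back to the diacritic-free basic transcription.
    return basic_transcription
```

For instance, `choose_title("Makkah", common_transcription="Mecca")` yields "Mecca", while a term with neither a translation nor a common transcription keeps its basic form.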
That ruleset could never account for Muḥammad ibn Mūsā al-Khwārizmī which has survived quite a number of WP:RMs (...and many other examples some of which have been mentioned above), so I attempted to rewrite in view of WP:AT compatibility, and removed the link to the failed naming convention. Hope this may work. --Francis Schonken (talk) 16:47, 2 October 2015 (UTC)Reply
That looks great. And about al-Khwarizmi: wow, who permitted that title? Ok, I see some reason in keeping the first name over the common name, but the diacritics? It's got to be an absolute pain in the a** to keep using that name in the article. - HyperGaruda (talk) 17:36, 2 October 2015 (UTC)Reply
Re current article title for al-Khwarizmi (and many others that show little coherence for how the Arabic name is rendered): shows the downside of not having had a coherent article titling guideline for Arabic names for the last ten years, so I'm happy we're going towards something that might fill the gap. If this becomes an acceptable naming convention probably some of the current article titles may benefit from being revisited. Don't want to rake up the dust though, before we're sure this is a convenient naming convention. --Francis Schonken (talk) 07:43, 3 October 2015 (UTC)Reply

<big> and <large> tags

Since the guideline is being updated, what is your opinion on using the <big> and <large> tags, presumably to make the Arabic script more legible? I'm in favor of avoiding them since I'm hoping that, in the future, more browsers will support choosing an appropriate font. These tags are visually distracting and are a cheap fix for a temporary problem. Abjiklɐm (tɐlk) 15:44, 4 October 2015 (UTC)Reply

I'm still against these ugly and static tags, especially since I've learned that the "lang" templates can take care of the problem more elegantly. Our conversation at #In-line Arabic shows that at least Firefox and Chrome adapt to Arabic script if indicated as such. Not sure about Safari, although the basic Arabic font on my iPad is already nicely legible without the "lang" template. If anyone wants to compare with/without template: Arabic diacritics#Alif waslah contains both cases. - HyperGaruda (talk) 16:33, 4 October 2015 (UTC)Reply

Capitalizing the definite articles in templates

Should the definite article be capitalized in navigation templates where Arabic words are listed in horizontal and alphabetical order? I ask because I've been discussing the Template:Mosques in Palestine with another user who is arguing that because each article stands alone, the definite article should be capitalized. I believe, because the definite article is basically a formality, that it should not be emphasized and thus should be lowercase while only the main word should be capitalized. I think capitalizing the definite article for each item in the list muddies the alphabetical order because the emphasis becomes shifted to the definite article instead of the main word getting prioritized. For example: I think "Ibrahimi Mosque • al-Jawali Mosque • Nabi Yahya Mosque • an-Nasr Mosque" is preferable to "Ibrahimi Mosque • Al-Jawali Mosque • Nabi Yahya Mosque • An-Nasr Mosque". The only time I think the definite article should be capitalized is for the first item in the list. I could not find a specific guideline in MoS Arabic for templates so I wonder what the policy should be here. --Al Ameer (talk) 05:24, 25 October 2015 (UTC)Reply

Since each entry is basically a separate sentence, I'd say capitalise the first letter, regardless of being "al" or not. The same has been the case for the initial "The" in other templates such as Template:Andrew Lloyd Webber (forgive me for comparing to a musical writer, but this came up first in my mind). Of course any other "al" should not be capitalised. - HyperGaruda (talk) 07:39, 26 October 2015 (UTC)Reply
@HyperGaruda: I see your point, but I still think only the first word of a particular row (section) should have its definite article capitalized for Arabic. With the English "The", the following word is not exactly attached to the definite article like "al-" is attached to the subject word because a hyphen is not used in the former. I also think the lowercase style takes less emphasis away from the main word (which is even more relevant for an alphabetically ordered list), and simply looks better (at least to my eyes). I'm not exactly sure what the process here is for amending or adding to the current guidelines, but it would be beneficial to MoS Arabic if a definitive policy could be set regarding the capitalization of the definite article in templates so that it could be applied throughout the encyclopedia. I understand the argument that my question is answered by the following guideline: "Al-" ... always written in lower case (unless beginning a sentence), and a hyphen separates it from the following word, but I think there should be clarification regarding stand-alone words in a list. Further input from more users like yourself who are active on MoS Arabic would be nice. --Al Ameer (talk) 03:46, 27 October 2015 (UTC)Reply
Well, I am the other editor who thinks that list elements are like the beginning of a new sentence, see the discussion at Template_talk:Mosques_in_Palestine#Capitalization_of_the_definite_article. In this regard, list elements are like Wikipedia article titles, and are as a rule capitalized, see Al-Aqsa Mosque and all others. Debresser (talk) 22:07, 27 October 2015 (UTC)Reply

Pinging Francis Schonken, Abjiklam and Khestwol for some more input. - HyperGaruda (talk) 06:00, 28 October 2015 (UTC)Reply

I have tried to find templates related to Arabic, excluding navigation and message templates. It is probably a good idea to include these in the MoS, but I would like to hear your comments about them and if there are more which may need a mention in the MoS. - HyperGaruda (talk) 08:51, 3 November 2015 (UTC)Reply

Already mentioned in MoS. - HyperGaruda (talk) 08:51, 3 November 2015 (UTC)Reply

In combination with |ar. Already mentioned in MoS. - HyperGaruda (talk) 08:51, 3 November 2015 (UTC)Reply

In combination with |ar. Same effect as {{lang}}, but used for entire paragraphs that need to be right-aligned.

Gives IPA pronunciation.

For use after external links to indicate that the linked site is written in Arabic.

Infobox with all the possible renderings of an Arabic term.

Infobox analysing the different parts of an Arabic name.

Problems with "basic transcription"

I understand that for the purposes of article titles at least, transliteration guidelines are always trumped by WP:UCN, on a case-by-case basis. A notorious case is Muammar Gaddafi -- it doesn't matter how you would transliterate the name in theory, choice of article name is guided entirely by how the relevant English-language sources tend to render it. Similarly, Quran vs. Qur'an. The second variant is more "correct", but the first is simply the more common in English-language sources.

It is perfectly fine, also, that the "basic transliteration" brings information loss, mostly losing vowel length and "emphatic" markers (ẓ vs z). These are phonological features in Arabic and it is fair enough to not preserve them in "basic transliteration", we can always give close transliteration for clarification.

But I am unhappy with the accident of collapsing hamza and ayin. These are two entirely different phonemes, and they are collapsed not because they are phonologically similar but because their romanization symbols happen to look similar.

In names with common anglicization, neither ayin nor hamza will be rendered, e.g. Amman, Iraq, Quran, etc. But in technical topics, or specialist terminology with no familiar anglicization, I would suggest it is advisable to recommend use of a distinct transliteration of ayin, e.g. Muʿtazila, Muqattaʿat, vs. the corresponding DIN symbol ʾ, or simple apostrophe, Al Wala' Wal Bara'. --dab (𒁳) 09:02, 15 November 2017 (UTC)Reply

I'm glad we're having this discussion. I'm not happy about requiring basic transliteration in article text (as MOS:ISLAM does: "Otherwise, a basic transcription should be used.") and I haven't been following this requirement myself. Academic encyclopedias use more informative transliterations and I see no reason why we shouldn't do so as well. Basic transliteration is sometimes convenient and sufficient, but I think it's misguided to attempt imposing consistency across WP at the "lowest common denominator". I have only rarely seen editors replace richer transliteration by basic transcription in article text, so I'm not sure this requirement represents an actual de facto consensus. That's my take on usage in the body of articles.
Article titles seem to be a special case. There's nothing to discourage use of non-ASCII characters in WP:TSC, but the preference for basic transliteration in MOS:ISLAM seems to be nearly universally followed in titles transliterated from Arabic or other languages written in Arabic-derived scripts. I'm not convinced by an argument based on information loss for article titles, because a strict transliteration should be found on the first line of the article (this per MOS:ISLAM). However, I'm also not sure what actual problems using richer transliterations in article titles with appropriate redirects would cause, aside from simple lack of consistency, and especially for ayin (which can be transliterated as ‘ or ʿ or ʕ, or I believe with other similar Unicode symbols).
I'll publicize this discussion on related pages so we can get broader input. Eperoton (talk) 01:21, 16 November 2017 (UTC)Reply
The above is actually a bit of a mis-analysis. UCN (WP:COMMONNAME) is not a style policy. It's the policy that tells us to use some version of Quran, versus "Koran" or "Islamic bible" or whatever. In some cases it will result in something like Muammar Gaddafi versus "Muammar Qaddafi", etc., because there are so many possible and well-sourced transliterations, we have to pick one and go with the common one. But we're actually quite tolerant of diacritics, except for names with an overwhelmingly common name in English without them (e.g. Iraq – for the same reason we use Munich not München). If a diacritic can be reliably sourced as belonging in the name, we generally keep it. It's also fine for us to use a glyph like ʿ or ʾ instead of ', as long as the alternatives redirect there; e.g., we do this with various Hawaiian names that contain an okina (but not for those assimilated so deeply into English they're rarely encountered in English with them, as with Hawaii itself). For rendering ayin, we should use whatever is most recommended by reliable sources on how to transliterate Arabic (not what is most frequently done in sources that are not reliable for language matters, like newspapers, which probably just omit any symbol at all). This is in keeping with our handling of diacritics, too. It doesn't matter one whit whether American and British newspapers and even sport governing bodies regularly misspell a sportsperson's name as "Gratic"; if RS tell us it is properly spelled Gratič, then we use that. (Unless the subject him/herself has pointedly abandoned use of the diacritic, entirely or in English, e.g. by omitting it in their official English-language website; then it becomes a WP:ABOUTSELF policy matter. Same goes for Asian name order; Utada Hikaru remains family-name first because she uses that name order on the majority of her albums, even in English; Hajime Sorayama is given-name first because he uses that name order in Western media.)  
— SMcCandlish ¢ >ʌⱷ҅ʌ<  03:29, 16 November 2017 (UTC)Reply
May I ask you all what you are actually proposing to change from the current situation? It is not that clear from your discussion.--Lüboslóv Yęzýkin (talk) 17:43, 16 November 2017 (UTC)Reply
For reference, here's what the current guideline MOS:ISLAM says:
As a general rule, diacritical marks over and under the letters should not be used in article titles or text (only in the etymology section and sometimes the first sentence of the lead section). If a non-standard form of transliteration is to be used, it must be the common transcription, based on references or self-identification. For example Mecca rather than Makkah, mosque rather than masjid etc. Otherwise, a basic transcription should be used. The characters representing the ayin (ع) and the hamza (ء) are not omitted (except when at the start of a word) in the basic form, represented both by the straight apostrophe (').
WP:MOSAR, which is just a proposed guideline, follows the general tenor of this. In response to SMcCandlish's analysis, I would just note that transliteration practices in RSs are inconsistent. Some large academic reference works impose a uniform, strict transliteration scheme, but not the same one (notoriously, the flagship encyclopedia of Islamic studies, the second edition of Encyclopaedia of Islam uses an idiosyncratic transliteration not adopted elsewhere). In the broader field of academic publications in Islamic studies, there's no consistency and at least some of the diacritics required for a strict, reversible transliteration are commonly omitted.
I don't have a strong opinion on whether we should continue applying the current requirement to titles. I do propose relaxing it for article text. I think the rule of thumb is that we should use strict(er) transliteration whenever it helps the reader, or more precisely those readers for whom the choice between basic and strict(er) transliteration would make a difference. When we write common Arabic terms, which someone who knows Arabic can easily recognize based on basic transliteration, and for which there's no common transcription, then a basic transcription should be sufficient, though not required. However, if there's potential for confusion, then using a richer transliteration is called for. I'm not sure if we should reflect this rather involved line of thought in the guideline, or simply relax the language. Eperoton (talk) 00:54, 17 November 2017 (UTC)Reply
Thanks for your explanation. I'm not that sure if I can comment here, because, even though I'm very interested in the subject, I have rarely participated in writing big chunks of text with a lot of Arabic names or terms (except when I have written about the Arabic language itself, where I always try to use strict transliteration). My impression here is that there is not and likely cannot be a universal rule; every situation requires a different approach. I'm 100% for using strict transliteration in the lead section (usually followed by the original Arabic script in brackets), or when the word is introduced for the first time (usually after the original Arabic script). But I'm not that sure that we must or may follow this every time when an unadopted Arabic term or name is used in general English text. I'm not against such a practice if a given editor is comfortable with it, but it is unlikely that many will be comfortable as well. For one reason at least. For most people it is quite a problem to enter special letters (I know there is a character picker in the edit toolbar, but still). Of course, one may devise and use his own keyboard layout or other device, but many would not bother themselves. Imagine any article about Islam, where there are dozens if not hundreds of Arabic words, and it would be definitely tiresome to use strict transliteration every time. So most will continue with the 26-letter transliteration plus may add the typewriter apostrophe (U+0027) here or there. Even requiring to discriminate hamza and 'ayn with two different quotation marks may have little effect. I also agree with Abjiklam's remark below that (non-Arab) people do not pronounce the sounds anyway. And it is not that the conflation of these two sounds is more problematic than others: consider an example of fuṣḥá/fusha. We may also look at how other languages are dealt with.
For Chinese it is a norm to use the tone marks in pinyin when immediately followed by Chinese characters, but the tones are usually omitted when written in general context.--Lüboslóv Yęzýkin (talk) 21:16, 17 November 2017 (UTC)Reply
The basic transcription is simply the strict transliteration without diacritics. It is not based on the phonological similarity of the Arabic letters. It just so happens that the two happen to coincide the majority of the time. However if the latter were true, then the basic transcription for ẓā’ would be dh, because the letter ẓā’ is not pronounced like an emphatic zāy, it is closer to an emphatic dhāl. Nevertheless, regarding the basic transcriptions of ‘ayn and hamzah, I would support them being distinct as long as we don't go back to using the grave symbol (`) for ‘ayn. Axiom292 (talk) 04:50, 17 November 2017 (UTC)Reply
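As a rough illustration of "strict transliteration without diacritics", combined with the rule quoted from MOS:ISLAM above (ayin and hamza both become a straight apostrophe and are dropped at the start of a word), here is a minimal Python sketch. The function name and the list of ayin/hamza characters are assumptions made for this example, not anything defined by the guideline, and a word is taken to be anything separated by spaces, so a symbol following "al-" is kept.

```python
import unicodedata

# Characters that various schemes use for ayin and hamza.
# Assumed, non-exhaustive list chosen for this illustration.
AYIN_HAMZA_CHARS = "ʿʾ‘’ʻʼ`"


def basic_transcription(strict: str) -> str:
    """Reduce a strict transliteration to a diacritic-free basic form."""
    # Decompose letters such as ā or ḥ into base letter + combining mark,
    # then drop the combining marks.
    decomposed = unicodedata.normalize("NFD", strict)
    plain = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Treat ayin and hamza alike: both become a straight apostrophe.
    for ch in AYIN_HAMZA_CHARS:
        plain = plain.replace(ch, "'")
    # Omit the apostrophe at the start of each space-separated word.
    return " ".join(word.lstrip("'") for word in plain.split(" "))


print(basic_transcription("al-Muwaḥḥid ʿAbd Allāh"))
print(basic_transcription("Muʿtazila"))
```

Because both symbols collapse to the same apostrophe, the mapping is not reversible, which is the information loss discussed in this thread.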
When the topic of the article is not the Arabic language itself, I don't see why we should use different symbols for hamzah and ayn. As long as the word occurs once in its strict transliteration, there should be no confusion at all. Both sounds are ignored by those who do not speak Arabic anyway. The only exception would be to differentiate two words in an article that only differ in whether they use either letter. Abjiklam (talk) 13:42, 17 November 2017 (UTC)Reply
The current guideline seems to make an incorrect assumption that the only terms we need to transliterate are the subject of any given article, whose strict transliteration and native form should be found in the opening sentence. Most transliterations I've used myself were actually proper nouns and technical concepts, such as legal terms, which occurred in running text, often just once per article. They don't necessarily have articles of their own. In those cases, I find it to be less cumbersome to use a non-ambiguous transliteration, which would easily allow an Arabic speaker to reconstruct the native spelling of the word, rather than a basic transcription followed by Arabic script. As Любослов Езыкин points out, ambiguity may arise for a number of letter pairs besides ayin and hamza. If our goal is to reflect the usage in RSs and help readers of various levels of knowledge, then I think that the current requirement to use basic transcription except in the opening sentence and etymology section is untenable. I don't think anyone is proposing a requirement to use strict transliteration. Rather, these transliteration schemes should be proposed as two reference points, while the choice of transliteration -- which may well be a hybrid between the two -- should be discretionary and context-dependent. Eperoton (talk) 02:03, 18 November 2017 (UTC)Reply
You're right that Arabic words should preferably be introduced with the Arabic script and the strict transliteration. I suppose my comment was more about words that would occur often in an article, especially words that are the subject of an article. In that case I still think that, after the word is written once with Arabic letters and a strict transliteration, a basic transcription without diacritics and a single apostrophe should usually be enough. Frankly, what I find more important is that basic transcription and strict transliteration each remain consistent across Wikipedia. The choice of one over the other is, as you said, context-dependent. Abjiklam (talk) 02:24, 18 November 2017 (UTC)Reply
I note that HyperGaruda has been an active advocate of basic transcription on this talk page. HyperGaruda hasn't been active in the last few days, and I'd like to hear their thoughts before I try to propose any specific changes to the guideline text. Eperoton (talk) 03:44, 19 November 2017 (UTC)Reply
Sorry, I have had some wonderful time off in Japan, but I am back now ☺ If I would have been given dictatorial powers, I would impose strict transliteration throughout the entire 'pedia, as that is what is used in scholarly articles. Now in a more politically correct tone: the only thing I care about is a consistent application of the guideline, regardless of which transliteration scheme is used. The only real issue I once had with the strict transliteration, is its accessibility on older machines, but now I am of the opinion that if your machine can still not read diacritics, it is so old it should not even be connected to the internet. Alternatively, if the only problem is the distinction between ayn and hamza, we could start using single quotation marks (‘) and (’) in basic transcription, as these two characters will still properly display in most older encodings. --HyperGaruda (talk) 13:11, 19 November 2017 (UTC)Reply

Very well. It looks like we may be gravitating towards consensus. I think we could start with Dbachmann's objection to mandating the use of apostrophe for ayin in the basic transcription. I had assumed that this practice was rooted in a strongly held community consensus, but it certainly doesn't seem to be the case judging from this discussion. We now have a nice discussion at Ayin#Transliteration courtesy of Dbachmann, which indicates that the LOC may be a holdout in the broader trend of adopting the raised semi-circles for ayin and hamza in the academic publishing industry. This symbol is available in the WP editor under the Arabic tab. I have just done a bit of further research and I see that while specialist Oxford encyclopedias in Islamic studies use strict transliteration, the Oxford Encyclopedia of the Modern World and MacMillan Reference (Gale Thompson) encyclopedias in Islamic studies use a scheme that seems to match our basic transcription, but with the raised semi-circles for ayin and hamza. In view of this, I start with the following proposals:

1) List the raised semi-circles as alternatives for ayin and hamza under strict transliteration.

2) List the raised semi-circles and raised commas as alternatives for ayin and hamza under basic transcription.

3) Recommend not using the apostrophe for ayin and hamza unless it is part of a common transcription.

Eperoton (talk) 02:42, 20 November 2017 (UTC)Reply

Rather than listing alternatives, I would prefer picking one or the other (raised semi-circles or raised commas / single quotes) and applying it consistently to both the strict transliteration and basic transcription. Axiom292 (talk) 03:43, 20 November 2017 (UTC)Reply
I don't think there's a case for discouraging the use of raised semi-circles, which have become a standard in the publishing industry and are even used in the Arabic tab of our own editor utility. Is there a case for discouraging the use of raised commas/quotes, adopted in ALA-LC romanization and the United Nations Group of Experts on Geographical Names? Why should we aim for consistency that doesn't reflect the body of RSs on this point? Eperoton (talk) 03:53, 20 November 2017 (UTC)Reply
A small practical question: what editor utility do you use? I use the standard source code edit window with the Help:Edit toolbar, but have never found raised semi-circles in the Special characters tab. It is not under Latin, Latin extended, IPA, Symbols, Arabic, or Arabic extended. --HyperGaruda (talk) 10:17, 20 November 2017 (UTC)Reply
Hmm, I believe I've been using the default source code editor. It has tabs called Wiki markup, Symbols, Latin, ... Arabic, etc, but not the "extended" ones. Under Arabic I have the symbols ʾ and ʿ. Eperoton (talk) 01:06, 21 November 2017 (UTC)Reply
Ahh, seems I've been doing it the hard way for years. I just found out that the toolbar at the bottom has more options beyond inserting mathematical symbols and is probably the one you are describing. The one I have used till now is the toolbar above the editing window. In that case I would support using raised semi-circles throughout, provided that we add a line to the MoS about how to enter transliteration characters. --HyperGaruda (talk) 04:28, 22 November 2017 (UTC)Reply
I was looking through Romanization of Arabic and it looks like DIN is the only one using semi-circles, in contrast to the single quotes used in most other romanizations. For the sake of representation and the fact that MOSAR is mostly based on ALA-LC, I'd rather add single quotes to the bottom toolbar and use those throughout Wikipedia. --HyperGaruda (talk) 04:39, 22 November 2017 (UTC)Reply

According to that table, the semi-circle for ayin is also part of the BS and ISO schemes, but I don't think counting distinct transliteration schemes is an appropriate methodology to use here, as it takes no account of how widely these schemes are actually adopted. Also, the table is not accurate on this point: most notably, contrary to what it states for "EI", all editions of the Encyclopedia of Islam have used the semi-circles, as one can verify on its website.

I've checked the character used for ayin in the major reference works in Islamic studies that I can quickly consult:

- Raised semi-circle: EI1,2,3; Brill's Encyclopedia of the Qur'an, The Oxford Encyclopedia of the Islamic World and other OUP encyclopedias, MacMillan/Gale Encyclopedia of Islam and the Muslim World

- Raised inverted comma: Routledge Medieval Islamic Civilization: An Encyclopedia

- Different characters used in different entries: Oxford Handbooks, The New Cambridge History of Islam, The Princeton Encyclopedia of Islamic Political Thought

So, while I don't see a rationale for discouraging raised commas, which are used in influential sources, I see even less rationale for discouraging raised semi-circles, which seem to be the most widely adopted convention in current academic publications. Eperoton (talk) 02:27, 23 November 2017 (UTC)Reply

To recap the discussion, I tried to formulate what seemed to be an emergent consensus in the following proposals:
1) List the raised semi-circles as alternatives for ayin and hamza under strict transliteration.
2) List the raised semi-circles and raised commas as alternatives for ayin and hamza under basic transcription.
3) Recommend not using the apostrophe for ayin and hamza unless it is part of a common transcription.
Axiom292 and HyperGaruda expressed a preference for using just one set of symbols for ayin and hamza. Though I think it would be a more convenient option in a less messy world, we have not come up with a rationale for discouraging either semi-circles or raised commas, given their wide adoption in RSs (and for raised semi-circles also the current design choice for the default WP editor). At that point the discussion has gone dormant. We can continue it, but in the meantime I think we have a consensus that MOSAR should be changed in some way along these lines, and implementing these proposals seems like an incremental improvement. Eperoton (talk) 02:31, 15 December 2017 (UTC)Reply
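For anyone unsure which characters are meant in this thread, the following is a minimal sketch that prints the Unicode code point and name of each candidate. The labels are mine, and the choice of U+02BB/U+02BC for the "raised commas" is an assumption; some sources use the curly quotes U+2018/U+2019 instead.

```python
import unicodedata

# Candidate characters discussed above. The labels and the mapping of
# "raised comma" to U+02BB/U+02BC are assumptions made for illustration.
CANDIDATES = {
    "raised semi-circle (ayin)": "\u02bf",
    "raised semi-circle (hamza)": "\u02be",
    "raised comma (ayin)": "\u02bb",
    "raised comma (hamza)": "\u02bc",
    "straight apostrophe": "\u0027",
}


def describe(ch: str) -> str:
    """Return the code point and official Unicode name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


for label, ch in CANDIDATES.items():
    print(f"{label}: {ch} {describe(ch)}")
```

Only the straight apostrophe is an ASCII character; the others are modifier letters, which is why they look so similar in many fonts.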

Capitalizing al- when beginning a sentence

From MOS:

"Al-" and its variants (ash-, ad-, ar-, etc.) are always written in lower case (unless beginning a sentence), and a hyphen separates it from the following word.

I think it should always be written in lowercase, even when beginning a sentence. I'm sorry if you have discussed this before, but I couldn't find any direct mention of it.

From Wikipedia Romanization of Arabic:

https://en.wiki.x.io/wiki/Romanization_of_Arabic#endnote_10

The UN system and ALA-LC prefer lowercase a and hyphens.

References cited by the Wikipedia article:

ALA-LC Romanization Tables, Rule 18(a)

https://www.loc.gov/catdir/cpso/romanization/arabic.pdf

the definite article al is given in lower case in all positions

UNGEGN Romanization Tables, Other systems of romanization

http://www.eki.ee/wgrs/rom1_ar.pdf

The original transliteration table, published in vol. II of the report on the Second UN Conference on the Standardization of Geographical Names, contains examples (but not explicit rules) where the definite article is always written with a small initial

What do you think? — Preceding unsigned comment added by Zeromido (talk • contribs) 09:53, 18 October 2019 (UTC)Reply

This page is like a mine field.

This page is like a minefield.

Only people with perfect knowledge of the Quran (the holy book of Muslims) and its meanings, and with perfect knowledge of the English language, should venture to edit it or take part in it.

Why? Because the only origin of the Arabic language is the Quran.

A statistic issued by the United Nations claimed that the three most difficult languages worldwide are the Native American languages, which have no rules; the Chinese language, which underwent a massive simplification in recent decades; and the Arabic language, which cannot be simplified by any means because the Quran cannot be changed in any way.

Just two examples: 1- The word ALLAH (the only name of God in Arabic) does not change pronunciation in most, maybe all, of the languages spoken worldwide. 2- Any other word changes from one country to another, for example: house (English) - maison (French) - Haus (German) - casa (Spanish and Italian) - بيت (العربية). Try to read them all and see if you get the same sound.

Now, any Arabic word that has an established form in a foreign language should be used in that form (for picky people like me, both should be used, for example Cairo - Alqaheera - القاهرة).

Any Arabic word that does not have an established form in the foreign language should be written as it is pronounced in Arabic. This last point is simply logical.


@Pathawi: Abdelhamidelsayed (talk) 22:48, 10 November 2020 (UTC)Reply


Your religious piety may be impeccable, but unfortunately your knowledge of linguistics appears to be weak. Mostly what is being transcribed is MSA pronunciation rather than strict Qur'anic tajwid. "Cairo" is an exonym (see the article on exonyms and endonyms)... AnonMoos (talk) 01:33, 11 November 2020 (UTC)Reply

Egypt and apostrophes

Excuse me, @Apaugasma:, but are you familiar with Arabic phonology? Have you come across Arabic names before?

  • The standard use of ⟨ج⟩ in Egypt is always /ɡ/. Transliterating it ⟨j⟩ in such cases is plain wrong. The only case where doing so is correct is when the scheme being followed is explicitly stated and uses only ⟨j⟩.
  • The glottal stop as well as the voiced pharyngeal fricative are normally never spelled with apostrophes. There are endless examples, and here are excerpts: Aliaa, Alaa, Maalouf, Semaan, Amer...

--Mahmudmasri (talk) 17:43, 24 January 2022 (UTC)Reply

Hello Mahmudmasri! I'm a trained Arabist, and have quite a lot of experience editing articles related to Arabic topics here. Let me address your two points one by one:
  1. See transliteration: it is a type of conversion of a text from one script to another that involves swapping letters (thus trans- + liter-) in predictable ways [...] Transliteration is not primarily concerned with representing the sounds of the original but rather with representing the characters, ideally accurately and unambiguously. In other words, phonology is in principle irrelevant for transliteration. Although the characters chosen will often represent similar phonemes in both scripts, what is important about transliteration is that it should be strictly consistent. The common and even the basic transcription may well be made more phonologically correct, but strict transliteration not.
  2. The apostrophe thing for hamza and ayn is a Wikipedia-specific phenomenon. It has to do with the requirements for Wikipedia article titles, which should normally only contain ASCII characters (which the straight apostrophe is, but other characters used to transliterate hamza and ayn not). See WP:TITLESPECIALCHARACTERS: "various apostrophe(-like) variants (’ ʻ ʾ ʿ ᾿ ῾ ‘ ’), should generally not be used in page titles. A common exception is the simple apostrophe character (', same glyph as the single quotation mark) itself (e.g. Anthony d'Offay), which should, however, be used sparingly". The result has been that hamza and ayn are both represented with straight apostrophe in article titles. From there on, Wikipedia editors started to use the same convention throughout articles. This is now an official part of MOS:ISLAM: The characters representing the ayin (ع) and the hamza (ء) are not omitted (except when at the start of a word) in the basic form, both represented by the straight apostrophe ('). Do note, however, that this only holds for basic transcription: if there is a common transcription, such as in the examples you link to (mostly personal names, and some place names), that should always be used instead of basic transcription.
I hope this answers your questions. Regards, ☿ Apaugasma (talk ) 18:34, 24 January 2022 (UTC)Reply
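To make the quoted MOS:ISLAM rule concrete, here is a minimal sketch of how a strict transliteration could be reduced to the basic form as far as ayn and hamza are concerned. The function name is hypothetical, the set of input characters is an assumption, and treating the position after a hyphen (e.g. after "al-") as word-initial is also my assumption, not something the guideline states. Macrons and underdots are deliberately left untouched.

```python
import re

# Characters assumed to stand for ayn or hamza in strict transliterations.
_MARKS = "ʿʾʻʼ‘’"

# An ayn/hamza mark at the start of the text, or right after whitespace or a
# hyphen, is treated as word-initial (the hyphen case is an assumption).
_INITIAL = re.compile(rf"(^|[\s\-])[{_MARKS}]")
_OTHER = re.compile(rf"[{_MARKS}]")


def basic_ayn_hamza(strict: str) -> str:
    """Drop word-initial ayn/hamza; write all others as a straight apostrophe."""
    without_initial = _INITIAL.sub(r"\1", strict)
    return _OTHER.sub("'", without_initial)


print(basic_ayn_hamza("Qurʾān"))
print(basic_ayn_hamza("al-Muwaḥḥid ʿAbd Allāh"))
```

As noted above, this would apply only where no common transcription exists; a name with an established English spelling keeps that spelling.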
Thanks dear. I am well aware of both issues and particularly the problematic apostrophe-like characters, however, I referred to over-using them when inappropriate, e.g. insisting on spelling Egyptian towns and cities with a ⟨J⟩ when clearly no one uses it locally or pronounces [(d)ʒ]. The style guide is actually confusing regarding the letter ⟨ج⟩. I rechecked the guide as there was someone asking for an opinion for a location name. A regular user would miss what is strict and what is not, and even if explicitly labeled, e.g. ALA-LC, that's a mouthful.
  1. What is wrong with just simplifying things and stating that people do actually spell /ʔ/ and /ʕ/ with neither apostrophes nor apostrophe-like characters?
  2. Why can't we make it very clear that those transliteration schemes are normally only suitable for historical and Islamic topics? Not every Wikipedian is interested in learning Arabic phonology in detail in order to deduce from the confusing guide that transliterating ⟨ج⟩ as ⟨J⟩ is not right in some cases.
  3. An ordinary reader (our main target audience) would see ⟨J⟩ and pronounce /dʒ/, not /ɡ/. How about that? Is the guide solving problems or creating additional ones?
Having seen the guide, I can only say that it was created without taking too many cases into consideration, and it was drafted by those who have very limited knowledge of Arabic and how it is used, historically and contemporarily.
Thanks again @Apaugasma: I wish to make an elaborating edit about the letter ⟨ج⟩ as well as a note on the /ʔ/ and /ʕ/, hoping the guide is more user friendly. What do you think?
--Mahmudmasri (talk) 19:07, 24 January 2022 (UTC)Reply
In my view, the guideline was created with the norms and conventions of Wikipedia in mind: the first option is to base the spelling on the common transcription as found in a majority of references in English sources. This is in line with WP:NPOV, WP:N, WP:V, etc., all of which were designed to make Wikipedia reflect the most common usage in mainstream sources. Only if no such common transcription exists do we shift to our own system of transcription, mainly basic transcription, which in turn is based on the most common transcription systems found in the sources. Yes, common transcription will often be limited to modern topics, and basic transcription to historical topics. But that is not what the difference is based on, nor what it should be based on: in Wikipedia terms, the relevant distinction depends on what we find in the sources.
Most notable modern topics will have a common transcription, which means that no problem will present itself. Where we do need to shift to basic transcription, there are several options: either we use one system for all variants of Arabic, or we use different forms of basic transcription for various local variants. Perhaps it would make sense to create different systems for strongly developed regional variants such as Egyptian Arabic. What we cannot do, however, is to adjust the basic transcription currently in use for all variants to the whims of editors who believe this or that character is a better phonological approximation. If we would allow that, we could just as well have no system of basic transcription at all. In particular, the basic transcription of hamza and ayn as it stands now should not be changed, mainly because that is how it is commonly used (remember that Wikipedia's style guidelines follow existent usage among editors, they do not prescribe it), though also because MOS:ISLAM would need to be changed too.
Yes, using /g/ for the basic transcription of ج in Egyptian Arabic is a kind of obvious thing to do. But it's still a transcription of characters (e.g., written Egyptian Arabic), not of phonemes. We're transcribing ء and ع, not /ʔ/ and /ʕ/. The Egyptian Arabic word قمر will be qamar, not 'amar. Anyway, it may be a good idea to create a separate basic transcription system for Egyptian Arabic. In the future, we may also create separate basic transcription systems for a number of other major variants such as Maghrebi, Levantine, Mesopotamian, Peninsular Arabic.
However, for any proposal you may have, I suggest you put it up here first on the talk page, and wait for input from other editors. ☿ Apaugasma (talk ) 20:23, 24 January 2022 (UTC)Reply
It is a misunderstanding on your part that I spoke about dialects. See? I carefully titled the discussion "Egypt", not Egyptian Arabic or dialect!
  • The /ɡ/ pronunciation is the standard one in Egypt, even in Literary Arabic, the written Arabic, the official Arabic used in writing, media, and by clerics. Even outsiders acknowledged that, e.g. Janet Watson (The phonology and morphology of Arabic (The Phonology of the World's Languages)) who mostly studied Yemeni dialects, and for a historical reason, ⟨ج⟩ is considered a lunar consonant.
At the beginning of the guide, it was clear that it is dormant, so I am not expecting to have lots of opinions regarding the issues.
Anyone who can read Arabic knows that those transliterations are actually not transliterations in the strict sense. E.g. ⟨السلف الصالح⟩ would have been ⟨al-Slf al-Ṣālḥ⟩.
I am following standardized schemes, the Library of Congress seems to be a preferred one here, but since Wikipedia doesn't strictly always stick to one, based on how I see the guide now and how Wikipedians transcribe in Wikipedia, it's safe to say that it is not always following any documented romanization, rather personal Wikipedians' preferences are included.
  • From the Library of Congress, DIN, and the UN (page 4 mentioned explicitly "Egypt" and ⟨G⟩), it's obvious that they have contextual variations for transliterating the same Arabic letter. Accordingly, Egypt-related topics should use ⟨G⟩. There was a mention of spelling words related to Northwestern Africa, which stated to follow their norm of using the French orthography. So, it is not a strict transliteration, rather a transcription. An interesting example for you is Moustafa Amar (and Google), not Muṣtafā Qamar, although ⟨Kamar⟩ is conventionally a more common spelling for such ⟨ق⟩ words.
As someone who speaks the language, the common errors and confusion are undeniable to me, and it pains me to see the same mistakes persistently repeated since contributors barely understand the schemes they ought to follow.
There would need to be a subsection under the definition, stating the "context".
  1. First suggestion is to edit the consonant table to reflect that ⟨G⟩ is to be used, even in the strictest transliterations, in Egyptian contexts.
  2. A "common" spelling list needs to be in the aforementioned table, which would in turn summarize most of the article by taking a quick glimpse. That should also include the /ʔ/ and /ʕ/.
  3. The definite article note is very long and vague, not explicitly mentioning that it is usually acceptable to be spelled without assimilating the ⟨L⟩, by strict schemes. That is even advantageous to reduce the words, making them more readable by non-Arabic speakers, e.g. ⟨ash-shāri‘⟩ vs ⟨al-shāri‘⟩; pronouncing the word /ælˈʃɛəri/ is definitely better than /əˈʃɛəri/.
  4. Many other sections might need rephrasing to reduce them. The guide is currently uninviting.
--Mahmudmasri (talk) 00:57, 25 January 2022 (UTC)Reply
Moustafa Amar is a common transcription. Please try to understand this. Anytime a name is spelled in a certain way in a significant majority of sources, we use that form of the name. For Egyptian names, this will always be with /g/ for ج. Basic transcription and strict transliteration, on the other hand, is to be used only when there are not enough sources establishing a common spelling. This should be a small minority of cases for modern names: if we have an article on a modern Egyptian topic, there should be sources on it, and we follow the usage of the sources. Only for topics on which there is a very small amount of English-language sources, and where transcriptions vary between the sources, do we shift to our own system of transcription/transliteration.
But transcription/transliteration is a system. It is a scholarly practice that should always be used in a consistent way. It's not based on phonology. As I've stated above, I do not agree with any unilateral change you would make to it. But it's also not supposed to be overly rigid: on articles related to modern Egyptian topics, feel free to use /g/ rather than /j/! This is a common practice. We could note that this is allowed. For more sweeping changes, I suggest making very concrete proposals and opening an RfC, notifying WikiProjects and relevant editors. ☿ Apaugasma (talk ) 18:43, 25 January 2022 (UTC)Reply
Dear Apaugasma. I'll get back in a few days, due to busyness. --Mahmudmasri (talk) 00:25, 26 January 2022 (UTC)Reply

Hello Apaugasma, now let me take your notes one by one to tell you what needs to be changed in the guide.

  • I mentioned Moustafa Amar, because, even though Gamal is a valid romanization (and the strict contextual romanizations would spell it Gamāl), you insisted on restoring a wrong "strict" romanization:
    • partial rv: strict transliteration is Jamāl (strict means strict, i.e. always the same Latin letter for the same Arabic one). Unfortunately, that's not what common romanization schemes want, and it is unambiguously understood that ⟨G⟩ and ⟨J⟩ are used for ⟨ج⟩.
    • I already mentioned earlier in this discussion: From the Library of Congress, DIN, and the UN (page 4 mentioned explicitly "Egypt" and ⟨G⟩), it's obvious that they have contextual variations for transliterating the same Arabic letter. Accordingly, Egypt-related topics should use ⟨G⟩.
    • It is worth mentioning that the word for English is ⟨إنجليزية⟩ in Arabic, using ⟨ج⟩ and is transliterated with ⟨G⟩, always, never with ⟨J⟩, as ingilīzīyah / ʾingilīziyyah / ingilīziyya...

Is it really transliteration or transcription? Or just an aid to pronounce words as correct as possible?

  • Quoting myself: Anyone who can read Arabic knows that those transliterations are actually not transliterations in the strict sense. E.g. ⟨السلف الصالح⟩ would have been ⟨al-Slf al-Ṣālḥ⟩. While the Library of Congress would transcribe it al-salaf al-ṣāliḥ.

It seems that users of the English Wikipedia prefer the Library of Congress romanization since it has fewer characters with diacritics. It is bad practice to pick and combine pieces of several romanization schemes, e.g. to romanize /ʕ/ and /ʔ/ as ⟨ʿ⟩ and ⟨ʾ⟩ (the DIN scheme) while "mostly" using the scheme of the Library of Congress, which uses ⟨ʻ⟩ and ⟨ʼ⟩ instead, + ignoring the final ⟨h⟩ for ⟨ة⟩, but also forgetting the context.

  • If you don't like having this ⟨h⟩, then you should switch to the Hans Wehr transliteration, but you should note that it does not capitalize first letters. Mixing schemes is bad and confuses those who can't read Arabic. Preserving ⟨h⟩ is often needed in grammar articles on Arabic to distinguish inflected words that end with ⟨a⟩. How would that be differentiated from ⟨ة⟩? Muʿallima, is that the feminine مُعَلِّمَة‎ or the masculine in the construct مُعَلِّمَ?

Currently, the article of Gamal Abdel Nasser has the ⟨J⟩ spelling (Jamāl ʻAbdu n-Nāṣir Ḥusayn) as a strict transliteration due to the mistake of this guide, even though he is not an ancient personality. Similarly, Saddam Hussein's surname was clearly pronounced in formal Iraqi newscasts in Literary Arabic, [sˤɐdˈdaː.meħ.seːn] rather than Ṣaddām Ḥusayn in the Wikipedia article. Such additional transliteration is actually not needed at all in such cases. Now to the "very concrete proposals":

  1. The context of romanization should be mentioned. (Briefly at the beginning of the article; under Basic transcription and Strict transliteration; elaborately under the Transliteration)
  2. The letter table should be amended to reflect the common use, the contextual use.
    • E.g. the ⟨g⟩ for ⟨ج⟩ when pronounced /ɡ/ in loanwords rewritten in Latin or Egyptian names; the ⟨ch⟩ (Northwest African use for ⟨ش⟩)...
  3. Sticking to one romanization scheme, apparently ALA-LC, and using it only in historical and Islamic topics in the article body, + ⟨h⟩ for ⟨ة⟩, but not using it in contemporary topics.
    • In case there is a need to romanize for a contemporary topic, the context must be respected, which means, the ⟨G⟩ should be used when it is about a contemporary Egyptian topic, exactly as the schemes advise.
  4. To use ⟨ʻ⟩ and ⟨ʼ⟩ for /ʕ/ and /ʔ/ when needed, but not those from other schemes, like ⟨ʿ⟩ and ⟨ʾ⟩.
  5. To reduce the lead of the Definite article subsection.
  6. Manuals of Urdu and Ottoman Turkish to be split.

I can do all of the aforementioned edits. Thanks. --Mahmudmasri (talk) 07:31, 28 January 2022 (UTC)Reply

@Atitarev, Nableezy, KarimKoueider, and Lockesdonkey: Dear users. You are requested to join the discussion about Arabic romanization summarized in the previous 6 points. --Mahmudmasri (talk) 08:06, 28 January 2022 (UTC)Reply

I support explicitly allowing an exception for ج in Egyptian contexts in our current basic transcription/strict transliteration guide: we can add In Egypt-related topics /g/ can be used instead of /j/, if applied consistently throughout the article. I oppose saying that /g/ should be used: we need to be less rigid, not more.
I will repeat that transliteration (read that article!) is not concerned with representing sounds or aiding pronunciation. It is not primarily based on phonology. You should stop using pronunciation as an argument, because it is just not valid. If you can't accept what transliteration is, you should not try to change it here.
Using -a for ta' marbuta is not ambiguous in strict transliteration as long as case endings are not represented. When case endings are represented, it will never be -a, but always -at(un/an/in/u/a/i).
There is no problem at all with 'mixing' different existing standards for our scheme. Almost all scholarly journals and publishers impose their own scheme, which in the majority of cases is a mixture of different standards. We do as scholarly journals do, and it has worked out very well until now. The only thing that is important is that our own scheme is consistent.
Where we differ from scholarly journals is that we do not rigidly impose one scheme. This is a good thing. Both -a and -ah should be allowed for ta' marbuta, both j and g (in Egyptian contexts) for ج, both ‘ and ʿ for ayn, both the non-assimilated definite article (al-shams) and the assimilated definite article (ash-shams), as long as application is consistent throughout each article. We could be somewhat more rigid on some points if others think this is necessary, but any proposal for using one and only one romanization scheme simply is a non-starter. It's what stranded this very guideline into sleep mode: we should describe established practice, not try to prescribe it.
In contemporary topics, common transcriptions should often be available. But if they are not, they should generally follow a basic transcription guideline. Sure, create different guidelines for different regional variants. But except for explicitly allowing an exception here and there, do not mess with the basic transcription guideline for the Modern Standard Arabic variant: even though real language use is always fluent, rougher distinctions are and will be made in order to have a consistent transliteration system. ☿ Apaugasma (talk ) 14:07, 28 January 2022 (UTC)Reply
@Mahmudmasri: I just want to note that I looked at your recent changes here and I agree with all of them. It really is an improvement, thanks!   ☿ Apaugasma (talk ) 09:50, 29 January 2022 (UTC)Reply
Well, you are welcome. I am very glad that we found an initial common ground. There is room for more rephrasing, but I am quite busy for that right now. --Mahmudmasri (talk) 11:15, 29 January 2022 (UTC)Reply

English spelling of Arabic place-names

This discussion was moved from Talk:Al-Shunah al-Shamalyah, with some parts specific to that page omitted.

I have brought this up on many pages, with little result. A standard system, not to be applied blindly, but as a source of reference, is needed. There are Jordanian lists of toponyms, I have once seen partial ones prepared by the Department of Antiquities, but they're not online – and I don't know if they've been worked through to a final form. Here a few options.

  • The old familiar spelling, used by many of the old British sources, which became the gold standard, and adopted from there by hugely popular travel guides, such as Lonely Planet's Jordan: -eh for the ending, with -iyeh for the long i in the fem. sing. form, and less cumbersome versions for within the text (dropping the initial article, or using English forms such as "North Shuna"). Archaeologists across the region are still using the British-era toponyms in official publications, but there are efforts underway to reform them, for whatever reason. The old spelling standards were developed over time by the PEF and the official cartography departments while preparing the SWP and Mandate-period maps, gazetteers, and so forth. They have laid the foundation (and built the first 15 floors) of cartography and archaeology in Jordan and Palestine/Israel. Local authorities often follow no rules whatsoever, so keeping the old ones would be by far better.
  • The Oxford Guide to Style (from here on OGS), an academic guide where general rules are explained by a trustworthy, specialised US-born scholar: in 2003, one year after publishing the OGS, Robert M. Ritter was Publications Manager for the Oxford Centre for Islamic Studies. It's not tailor-made for our needs, but quite useful nevertheless. It recommends (I'm using > for "preferred to"):
▶ a > e within the word
▶ ah or a > eh or e as an ending (feminine singular)
▶ definite article (al- etc.) ALWAYS joined to the noun by a hyphen
▶ When the word following al- begins with one of the 14 'sun' letters (t, th, d, dh, r, z, s, sh, ṣ, ḍ, ṭ, ẓ, l, n), the l is replaced by ('assimilated to') the 'sun' letter => we get ad-, an-, ar-, as-, at-, or az-. Exception: personal names in usual scholarly usage, where it remains al-.
▶ Everywhere, as a rule: widely used spellings remain unchanged, even against the rules chosen for the article at hand.[1]
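The sun-letter rule above is mechanical enough to sketch in code. This is only an illustration under stated assumptions: the function names are hypothetical, the input is assumed to be written with an unassimilated "al-" plus hyphen, and the digraphs th, dh and sh are assumed to yield ath-, adh- and ash- (which matches spellings such as "Ash-Shunah" used in this thread, although the OGS summary only lists the single-letter forms). It does not implement the exceptions for personal names or for widely used spellings; those remain an editorial decision.

```python
import unicodedata

# The 14 'sun' letters as romanized in the list above; digraphs come first so
# that "sh" is tried before "s", "th" before "t", and "dh" before "d".
SUN_LETTERS = ("th", "dh", "sh", "t", "d", "r", "z", "s",
               "ṣ", "ḍ", "ṭ", "ẓ", "l", "n")


def assimilate_article(word: str) -> str:
    """Turn 'al-X' into the assimilated form if X begins with a sun letter."""
    prefix, sep, rest = word.partition("-")
    if not sep or prefix.lower() != "al":
        return word  # no hyphenated definite article at the start
    # Normalize so that dotted letters such as "ṣ" are single code points.
    start = unicodedata.normalize("NFC", rest).lower()
    for letter in SUN_LETTERS:
        if start.startswith(letter):
            # Keep the original case of the initial "a"/"A".
            return f"{prefix[0]}{letter}-{rest}"
    return word


def assimilate_all(name: str) -> str:
    """Apply the rule to every space-separated word of a name."""
    return " ".join(assimilate_article(w) for w in name.split(" "))


print(assimilate_all("Al-Shunah al-Janubiyah"))
```

Note that the sketch cannot tell a true "sh" digraph from an s followed by a separate h, so its output should be checked by someone who knows the underlying Arabic.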

Ritter also mentions three other widely used standards, but I lack the time to search for those too. They have been established and are used by:

  • The gazetteer published by the US Board on Geographic Names : we can also use it to a degree, although it's quite old already (1990), but it's fully worked through and very detailed. Where they clearly diverge from the standards applied by most academic, mainly archaeology-related systems (see OGS), is that they don't use a hyphen between the definite article and the nouns & adjectives, which we should. Also, they've completely replaced e with a, which goes against the old standard developed by the Brits and the flexibility recommended by the OGS. They have influenced the authorities, with the result that administrative spelling tends to follow US rules, while academic spelling follows the long-standing British rules. As a side-effect, the universally used and (I hope) unshakeable spelling of the common noun "tell", when used in administrative toponyms, is taking the weird form "tall". Fuck knows what that's supposed to improve. Mind Ritter's (OGS) rule of NEVER changing deeply rooted, familiar spellings.

Concrete example for this article: see the US gazetteer entries from p. 237. I have grouped together all variants leading to the same recommended spelling:

▶ Ash Shūnah (for Shuna, Shunah, Maḥaṭṭat Kahrabā' ash Shūnah = Shuneh Power Station)
▶ Shūnat Nimrīn (for Shūnet Nimrīn, Shûna, Shuna al Janubiya, Shuna Janubiya)
▶ Ash Shūnah ash Shamālīyah (for Shūna, Shuna esh Shamaliya, Shuna Shamaliya)
▶ 'Ayn ash Shūnah (for 'Ain esh Shuna, 'Ain ash Shuna = Shunah Spring)
▶ Jisr ash Shūnah (for Jisr esh-Shuna = Shuna Bridge)
▶ Tall ash Shūnah
▶ Shūnat Ibn 'Adwān[2]

It's clear that we have two main places, North and South Shuna. Not clear to me where the gazetteer's Sh. Power Station, Bridge, and Tell are, nor if Shūnat Ibn 'Adwān coincides with one of the two, but this is secondary for now. Anyway, getamap.net states that "Shuneh Power Station is also known as Jisr ash Shunah, ... Shuna Bridge", in Irbid Province (so connected to North Shuna). I guess Shuna Bridge is over the King Abdallah Channel and the power station somewhere nearby - unless they are both actually relating to Rutenberg's power station (there are several bridges there), which would require for Baqura to be in the N Shuna District & Irbid Province, which is perfectly possible.

"The Hashemite Kingdom of Jordan (Archaeological Map) 1:250k, April 1978" (from here on JAM), Sheet 1, has a Kh[irbet] esh-Shuneh at North Shuneh. Maybe it's identical with our Tell esh-Shuneh.

geographic.org 's page on Jordan (see here) looks mildly useful. Some of the pins are not very accurate, but can help in broad terms. Shuna Refugee Camp is shown at/just outside South Shuna. Shūnat Ibn 'Adwān has a pin in the middle of nowhere, next to some agricultural terraces near Hisban/Husban. Sh. Power Station has its pin in an agricultural field, but it's close to N. Shuna, in Irbid Province. Tall ash Shūnah is totally misplaced, downhill from Pella, but maybe it's not by chance that it's closer to N Shuna than to S Shuna.

EcoPeace Middle East. There is also a New Shuneh near S Shuna (see p. 62, left col.). They call Tell esh-Shuneh "Tell North Shuna", but "they" aren't focused on names, they're into really moving things in the real world ("Regional NGO Master Plan for Sustainable Development in the Jordan Valley. Final Report – June 2015").

The long-standing standard for reproducing long i sounds at the end of place-names is -iyeh or -iyyeh, which looks a bit over the top. The tendency (see OGS) is to now use a for e, so we get -iya(h).

CONCLUSION: I would suggest

Ash-Shunah ash-Shamaliyah = North Shuna (it also gets a lot of Google hits, even the most if one ignores the Wiki monster we've created)
Ash-Shunah al-Janubiyah = South Shuna, aka Shunat Nimrin,

both with the equally valid alternative version without an article (ash-) at the beginning, so

Shunah ash-Shamaliyah and
Shunah al-Janubiyah. This is standard practice when writing in English.

I would personally have preferred the old forms, with -iyeh, but that seems to be sooo 1999. We can consider when to use which form for the article's title, and create redirects for those left out. Arminden (talk) 01:35, 22 January 2022 (UTC)Reply

References

  1. ^ Ritter, Robert M. (2002). The Oxford Guide to Style (PDF). Oxford: Oxford University Press. pp. 252–256. ISBN 0-19-869175-0. Retrieved 21 January 2022. (see Hart's Rules).
  2. ^ United States Board on Geographic Names (1990). Gazetteer of Jordan: Names Approved by the United States Board on Geographic Names (2 ed.). Washington, D.C.: Defense Mapping Agency. Retrieved 21 January 2022.
I created Ash-Shunah al-Janubiyah Loew Galitz (talk) 03:16, 23 January 2022 (UTC)Reply
@Arminden: Besides the discussion regarding this village, the general discussion about transliteration should be moved to MoS-Arabic, though that page appears defunct due to lack of activity, or another more comprehensive talk page. I agree with the premise: except those places where the clear common name contravenes that recommended in a proposed Manual of Style, we should have consistent transliteration. This has long been on my mind, but I honestly do not know which is the best style to apply—I just favor consistency. As such a policy, if implemented, would be far-ranging, it should be achieved by a consensus. The main questions, should such a policy be in place, are which standard or combination should be used and should there be variation by dialectal region, country, or smaller unit, and if so, should each unit have its own manual of style? Inviting A455bcd9, Attar-Aram syria, Apaugasma, Elie plus, Huldra, Mahmudmasri, Makeandtoss, Oncenawhile, SarahFatimaK, and Zero0000 for their opinions as all have worked either on the topics of Arabic linguistics or places in various Arab countries. The individual country and linguistics wikiprojects should be notified too in the hopes a wide consensus could be achieved. Al Ameer (talk) 04:15, 24 January 2022 (UTC)Reply
I strongly recommend to follow the main principles of MoS-Arabic: first differentiate between specific terms that are very widely and consistently transcribed in a certain way (e.g., al-Qaeda, Cairo, Mecca, Gamal Abdel Nasser) on the one hand, and those which are commonly transcribed in a number of different ways (e.g., al-sunna, which may commonly be found transcribed al-sunnah, as-sunna, or as-sunnah) on the other. For the first group, which apart from the names of modern Arab figures and well-known place names is a small minority, use the common transcription. For everything else, use the basic transcription system outlined in MoS-Arabic, using the strict transliteration system given there in the lead sentence. MoS-Arabic allows for some variants (assimilation of al- or not, probably should also allow -a or -ah for ta' marbuta), and I think that generally articles should be allowed to deviate from it on a few points, as long as it still resembles it and is self-consistent.
I recommend using this system because as far as I know, regional preferences for transliteration (reflecting also regional pronunciations) are so wildly diverse and unsystematic that there is no way to render them consistent. The basic transcription system in MoS-Arabic closely resembles the most commonly used systems for both Classical Arabic and Modern Standard Arabic, the latter of which remains normative for all Arabic speakers, and is still what people turn to when looking for something consistent and systematic.
I would support updating MoS-Arabic to an official part of the MOS if it would be more focused (leave out the bits on Persian, Urdu and Turkish, which should have their own guidelines) and if it would be made less rigid (explicitly allow articles to deviate on a few points, and let MOS:STYLEVAR apply).
In the case of this article: if North Shuna is in (very) wide use, use that and only transliterate once in the lead and once in the etymology section, strictly: al-Shūna al-Shamāliyya. If North Shuna is not in wide use, use the basic transcription al-Shuna al-Shamaliyya, with the strict transliteration at first mention in the lead. ☿ Apaugasma (talk ) 11:05, 24 January 2022 (UTC)Reply

Let's decide: ash- or al-?

Thanks to Loew now we have an article on South Shuna, too. There the article used is ash, here it's al – shouldn't we decide for one? There is an article (DAB, but with explanation in the lead) for ash-shamaliyah and I have updated the one on al-janubiyah. Arminden (talk) 11:25, 23 January 2022 (UTC)Reply

MoS-Arabic currently allows both, but non-assimilation (al-) is in far more common use on Wikipedia, following in this its more common use in the scholarly literature. So I say, use al- for both. ☿ Apaugasma (talk ) 11:07, 24 January 2022 (UTC)Reply

I agree that the manual of style should be followed. I even amended it since it was confusing or had multiple styles mixed without labeling.

I've always supported the following practice:

  1. In case a certain spelling seems to be imprecise, use it if it is the more common one, e.g. used in journalism and academia.
  2. If there are none in the previous case, keep names as they are conventionally written, locally. Arabic speakers of every dialect don't pronounce their places or people in Literary Arabic, rather in their respective dialects, even in formal Literary Arabic speeches.
  3. Only in religious terms, e.g. fiqh, shari'a, otherwise, please avoid Literary Arabic transliterations.
  4. If definite articles need to be written, they should remain al-, el- or a similar form, with or without hyphenation, never assimilate them, please! Common transliteration schemes avoid assimilating, anyway.
    • Arabic speakers don't consistently apply the sun and moon letter rules in pronunciation. I remember that as a young child I never assimilated the definite article at all and only learned to do so in school.

Thanks. ----Mahmudmasri (talk) 15:18, 24 January 2022 (UTC)Reply

Al Ameer son, hi. Thank you as always for your reply. To everybody else, too: I know I've written quite a lot, but it's quite a lot to cover and I've put some effort into synthesising it all, so pls do read first what I wrote at the top, as MUCH of what I'm seeing added here just repeats what's already there. I mean primarily what Ritter wrote in the Oxford Guide to Style (OGS), which is available online, plus the US gazetteer. I've summarised both using bullets and numbers.
For instance: all the popular spellings stay unchanged, of course. Different rules for scholarly articles on, say, Islam on one hand, and more popular topics on the other - people look up places they've visited and prefer phonetic spellings ("But nobody there pronounced it with an L!") and are unhappy if Wiki comes up with ways they never came across in their guidebook or during the travel. So apples and oranges. Also, Ameer, I easily agree to take into consideration local traditions. In Syria, Lebanon, most of the Maghreb, also Egypt, the French have left their mark and we need to stay connected to what's common. In Israel/Palestine and Jordan, it's been British English for maybe a century, and now the Americans start having the strongest influence. EnWiki is not a different planet, we must stay connected to reality. However, both in I/P and Jo, local street signs and alike aren't better than the Chinese attempts at English and one gets even authorities using self-invented, improvised and inconsistent ways of spelling, which I don't support. We should mention often used ones in the text, but never adopt them for the article titles and headings. Actually a mix of common sense and literacy, with a nod to Lawrence who mocked those who attempted at squeezing Arabic of all couleurs into one Oxford standard.
So yes, IMO different manuals of style for different regions. Mind that there are piles of WWI and WWII articles written based on the spelling used by the armies involved at the time, and it's NEVER a good idea to try to change history. So keep the place-names as they are used in the literature of the time and about that time, and use correct wikilinks, but invisible to the user; for instance keep "Second Transjordan attack on Shunet Nimrin and Es Salt" and use the links [[Ash-Shunah al-Janubiyah|Shunet Nimrin]] and [[As-Salt|As Salt]]. The trick is how to recognise what place is hiding behind the old-fashioned spelling, but good history sections and lots of redirects can solve that problem.
Here I'd be in favour of using Shunah ash-Shamaliyah and Shunah al-Janubiyah, with Ash- at the beginning if it must be (but I think we can do well without; you're the specialists, so you decide), and add redirects for all the common permutations (no final -h, al- everywhere, no hyphens, e instead of a, even that crazy spelling with iyy). That's what I am doing with articles I care more about. Arminden (talk) 10:43, 25 January 2022 (UTC)Reply
I was taught to write and pronounce the assimilated sun letters. I would lean toward that but it's not a big deal. Cuñado ☼ - Talk 06:01, 26 January 2022 (UTC)Reply
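The redirect permutations listed above (no final -h, al- everywhere, no hyphens, e instead of a, the iyy spelling) can be enumerated mechanically. What follows is only a rough sketch in Python: the function names `spelling_variants` and `redirect_wikitext` are invented for illustration, the regular expressions are an assumption about how each permutation applies to titles like these two, and the output would still need a human check before any redirect is created. It does not cover dropping the leading article.

```python
import re
from itertools import product

# Each toggle is one of the spelling permutations mentioned in the thread.
# The patterns are illustrative assumptions, not an agreed transliteration rule.
TOGGLES = [
    # al- everywhere: ash-, as-, at-, ... become al- (case of the A is kept)
    lambda s: re.sub(r"\b([Aa])(?:sh|th|dh|[stdrzn])-", r"\1l-", s),
    # the spelling with a doubled y
    lambda s: s.replace("iyah", "iyyah"),
    # e instead of a in the final syllable
    lambda s: re.sub(r"ah\b", "eh", s),
    # no final -h
    lambda s: re.sub(r"([ae])h\b", r"\1", s),
    # no hyphens
    lambda s: s.replace("-", " "),
]


def spelling_variants(title: str) -> set[str]:
    """Return every combination of the toggles applied to title, minus title itself."""
    variants = set()
    for switches in product([False, True], repeat=len(TOGGLES)):
        spelling = title
        for enabled, toggle in zip(switches, TOGGLES):
            if enabled:
                spelling = toggle(spelling)
        variants.add(spelling)
    variants.discard(title)
    return variants


def redirect_wikitext(target: str) -> str:
    """Wikitext body for a redirect page pointing at target."""
    return f"#REDIRECT [[{target}]]"


if __name__ == "__main__":
    target = "Ash-Shunah ash-Shamaliyah"
    for variant in sorted(spelling_variants(target)):
        print(f"{variant}  ->  {redirect_wikitext(target)}")
```

Treating each permutation as an independent on/off switch keeps the list exhaustive without hand-writing every combination; many of the generated spellings will be implausible and can simply be skipped.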

Setting a manual for spelling Arabic places

@Apaugasma:@Mahmudmasri:@Arminden: In the interest of focusing the discussion about setting a manual, or manuals, of style for Arabic place names on Wikipedia, I have moved the original thread here. Mahmud you have been active with this page in the past, what is the best approach to setting this up? Organizing the discussion here or going further by opening an RFC (also here)? Either way, the relevant wikiprojects and active users who have frequently edited this topic (i.e. places in the Arabic-speaking world) will be notified. Before we continue discussion about the substance though, we should decide the best mechanism. Al Ameer (talk) 21:35, 25 January 2022 (UTC)Reply

Thanks Al Ameer. I am quite busy, but I'll try to find a way to insert this in the guide. --Mahmudmasri (talk) 00:27, 26 January 2022 (UTC)Reply
Thank you Mahmudmasri. I was not referring specifically to the guideline about the non-assimilated, hyphenated 'al-' or 'el-' (excuse the confusion from my placement of this message under that specific thread; I have separated it now). I meant what do you believe is the best approach to hold a broader discussion about how we spell Arabic place names so that we could decide and implement a manual. For the record, I do support a guideline about using non-assimilated, hyphenated articles across the board, unless it contravenes the clear common name. Al Ameer (talk) 02:34, 26 January 2022 (UTC)Reply
You will notice that you should use the normal ⟨al⟩ or ⟨el⟩, hyphenated/capitalized or not. I am currently waiting for some other input in the above discussion, then I will make it clearer in the guide, since it is not. --Mahmudmasri (talk) 08:09, 28 January 2022 (UTC)Reply

Alphabetization

Hello @Apaugasma, @Mahmudmasri, @Tavix, @HiddenFace101, @Nehme1499, @HistoryofIran, @Fayenatic london, @WPEditor42, @BlankpopsiclesilviaASHs4, @Mandarax, @SYSS Mouse, @PamD there is ongoing discussion on 2023 AFC U-20 Asian Cup qualification regarding how Arabic surnames should be indexed/alphabetized.

  1. This manual states that when indexing the surnames the names like 'Foo Al-Foo' should be indexed using the 'A' in 'Al-Foo' and not using the 'F' in 'Foo'. The discussion for the same has been done on the talk page of this manual here - Wikipedia talk:Manual of Style/Arabic/Archive 4#Alphabetization.
  2. On the other hand WP:MCSTJR states that Arabic names should be indexed using the 'F' in 'Al-Foo' and not the 'A'.

As it stands, the two versions contradict each other, and this has caused problems on 2023 AFC U-20 Asian Cup qualification, where arguments have been made both for and against indexing Arabic surnames with 'Al' as part of the name. Can it be clarified which of these two manuals follows the correct policy, so that it can be applied to all pages in future and this discussion can serve as a reference? Anbans 586 (talk) 14:57, 17 September 2022 (UTC)Reply

A notice for the discussion to be started here has been put on here - Wikipedia talk:Categorization of people#Alphabetization for Arabic surnames. This would hopefully invite people to get into this discussion. Anbans 586 (talk) 15:02, 17 September 2022 (UTC)Reply
For all of the matter about Arabic translation to Latin (notably English), I would say it is better to understand the functioning of "Al"/"El". While it is very common to see Arab people carrying "Al" or "El" suffixes, there is no permanent law designing the use of "Al" or "El" in Arab nations about naming structure.
Bear in mind that the Arabs have a very different idea about how surname actually is. Unlike the Chinese (which place surname first and first name last) or Western, some Arabs take surname from their tribes, or some may not even have a surname at all. Those who do not have surname tend to include suffix "Al"/"El" to emphasise their rather father bloodline instead of which tribes they came from. This is why the word "Al"/"El" in Arabic name have confused some readers, some even believe it is better to include "El"/"Al" just because it is part of their naming.
But since "Al"/"El" functions in only rather a honorary title instead of being a naming thing, it is very hard to start counting of alphabetical order with these suffixes because too many Arabs carry them. There are some exception, but this is because these suffixes being combined with their surnames (for instance Libyan footballer Ali Elmusrati, his surname was written altogether instead of El Musrati because this is how translation works) How can you define them if you include these suffixes?
Arabic does not have standardised translation, and historical colonialism has also contributed to how they diverged in translation. Take the French translation of Arabic names, it still makes impact in the Maghreb states, where these suffixes are completely absent (example U-17 Tunisian footballer, Abderrahmane Argui, if translated to English version would have been Abdelrahman El-Erqi, but the French version phased out this use).
In respect, while I respect the use of suffixes (I have no opposition against it), it is still suffix, and just that. HiddenFace101 (talk) 19:37, 17 September 2022 (UTC)Reply

This is a bit tricky. For transliterated words (those converted from Arabic to Latin letters in an ad hoc fashion and according to a rigorous and fixed system), the common practice in reliable sources is to discard the al- in alphabetical sorting. So for example, al-jazira and al-qa'ida (both 'basic transcriptions' according to this guideline's fixed transliteration system) would be sorted under 'j' and 'q'. However, in words that have a common English spelling (what we in this guideline call a 'common transcription', i.e., where the transcription is established by common usage rather than by rigorously following one fixed system) the al- or el- part is often treated as a full part of the word. So for example, Al Jazeera and Al-Qaeda would be sorted under 'a', and El-Bizri under 'e'. This dual approach is what most reliable sources do, in my experience, for example in the alphabetical sorting of bibliographies.

However, this may be thought of as a complicated rule for Wikipedia, which might want to decide to either disregard all usage of al-/el- or never to disregard it. But in that case, disregarding the Al- in Al-Qaeda and sorting it under 'Q' sounds like a bad idea (most people not familiar with Arabic might find that unnatural or difficult to comprehend), so never disregarding it may be the preferred simple solution.

Whatever is decided, this guideline and WP:MCSTJR should be harmonized. ☿ Apaugasma (talk ) 16:16, 17 September 2022 (UTC)Reply

The manual of style guideline should be followed in article lists, and WP:MCSTJR should be followed in category pages. WPEditor42 (talkcontribs) 16:43, 17 September 2022 (UTC)Reply

I think the point raised above about the distinction between "established / official" names (such as "Al Jazeera") and names which can easily be transliterated in different ways depending on who does the transliteration (such as names of people) is important to take into consideration. Take Kassem El Zein: his name can easily be written as "Kassem Al Zein", or "Kasem El Zain", for example. It wouldn't be "fair" to sort his name by the "E" in "El", when the surname could have also been written as "Al Zein", and thus sorted by "A". The "defining" part of the name, in a sense, is "Zein", not the definite article. Another point: many peoples' surnames omit the definite article from their name in Latin alphabet, others use "El", others "Al". Take Hanadi Zakaria al-Hindi, Yahya El Hindi and Baba Ratan Hindi as an example. All three surnames are written as الهندي in Arabic, but would be categorized differently if we took into account the definite article (the first under "A", the second "E" and the third "H"), even though all three could be written in a variety of different ways in Latin. I think not taking into account the definite article in these cases leaves no room for inconsistency.
I don't agree that we should have different guidelines for article lists and categories. The principle is the same, and should be applied throughout. Nehme1499 18:15, 17 September 2022 (UTC)Reply
  • I've been pinged but I have no particular knowledge of Arabic names, I'd just say:
  • "al" and "el" are prefixes, not suffixes (ie they come before, not after, the main name)
  • Wikipedia needs to try to be consistent, although this will sometimes be impossible because different individuals' names have gone through different histories (eg westernisation, different transliterations). Using a different sorting system for categories and for lists within articles would be a bad idea.
  • I'd think that a WP Guideline like Wikipedia:Categorization_of_people#Other_exceptions would take priority over a 15-year-old discussion on a talk page like Wikipedia_talk:Manual_of_Style/Arabic#Alphabetization.
  • Looking to see what current practice is: let's look at a couple of categories to see what "DEFAULTSORT" has been applied. I looked at Category:Saudi Arabian footballers because this discussion started with footy: all seem to be alphabetised ignoring "al" or "el"; I then checked Category:Libyan politicians, as a contrasting category which might have a lot of Arabic names: all but the very first entry seem to be filed ignoring "al" or "el". Only a sample of two cats, but it seems to suggest that Wikipedia's current practice is as described in the Guideline, rather than as discussed in that talk page. (I then recklessly had a look at a third category: Category:Egyptian poets: Aaargh, what a mess! Quite a few sorted by first name rather than surname, and a mixture of prefixes ignored or not. Are Egyptian names different, or have there been editors in this area who don't know about DEFAULTSORT?) PamD 20:20, 17 September 2022 (UTC)Reply
  • Hey @Apaugasma, @HiddenFace101, @Nehme1499, @Fayenatic london, @WPEditor42, @PamD So here is a question. What should be done with names like 'Khaled Abu Al-Heija'? When sorting this name, should it be indexed on 'Abualheija' or 'Alheija' or 'Abuheija'? What would be the suggestion in this case? Anbans 586 (talk) 21:39, 17 September 2022 (UTC)Reply
    • I believe that should be indexed as Heija. – Fayenatic London 21:42, 17 September 2022 (UTC)Reply
    • If it's a compound surname, then it should be indexed as Abu Heija. If 'Abu' is part of the forename, it should be indexed as Heija. WPEditor42 (talkcontribs) 21:46, 17 September 2022 (UTC)Reply
    • Per WP:SUR, Modern names with Abu, Abd, Abdel, Abdul, Ben, Bin and Bent are considered compound names and particles are integral to the name. "Abu Al-Heija" should be sorted as "Abu Heija". Nehme1499 21:48, 17 September 2022 (UTC)Reply
      @Nehme1499, @Fayenatic london, @WPEditor42 I think that everyone is on the same page then, that 'Al' etc should be ignored when indexing these names. I did not like the points HiddenFace101 made, the points made contained stuff like 'how the name originally came from', 'original bloodline' etc which do not make much sense as in a discussion like this, where the sole witness for all things cultural is he/she him/herself, and what their idea is about it (no reference, or any thing to backup their argument).
      On the other hand I do agree with what Nehme1499 has to say as how 'Hanadi Zakaria al-Hindi', 'Yahya El Hindi' and 'Baba Ratan Hindi' should be grouped together at the same place since they are very much similar to each other. This argument makes a lot of sense as it is straightforward and logical, not some kind of complex mumbo-jumbo.
      So can someone please update this manual, so that the discussion made here is reflected in the manual itself. I can, but I am not very confident in my ability to phrase things correctly (most of my experience on Wikipedia over the last few years has been on editing list type pages which require minimum amount of phrasing). Thanks, good to have some constructive discussion than the one-to-one conversation I was initially having on the original page. Anbans 586 (talk) 22:41, 17 September 2022 (UTC)Reply
  •   Done – Fayenatic London 08:08, 18 September 2022 (UTC)Reply
    Thanks for that. We still have a contradiction between WP:SUR, saying Modern names with Abu, Abd, Abdel, Abdul, Ben, Bin and Bent are considered compound names and particles are integral to the name. Osama bin Laden is sorted {{DEFAULTSORT:Bin Laden, Osama}}, and our guideline here , saying For indexing, the family name designators ibn (or colloquial bin) and bint should be ignored, unless the common transliteration makes it a part of the name (as in the Saudi Binladin Group). Any chance we might resolve that too? Probably better to also adjust this guideline to be in line with WP:SUR. ☿ Apaugasma (talk ) 13:06, 18 September 2022 (UTC)Reply
    The opening proviso "Modern names" seems to imply that such words should be ignored for the purpose of indexing medieval names. Can anyone offer clarification on this, please? – Fayenatic London 19:26, 20 September 2022 (UTC)Reply
    I believe this is because with medieval names, it does not matter: as WP:SUR stipulated in the preceding paragraph, no surname is identified in medieval names anyway and they are always sorted as written out rather than as surname, given name. In medieval names Abu, Abd, ibn, etc. also are integral to the name, but that doesn't come in to play, e.g., in Muhammad ibn al-Hanafiyya, which is always sorted under 'Muhammad' rather than either under 'Ibn al-Hanafiyya' or 'Hanafiyya'. ☿ Apaugasma (talk ) 19:43, 20 September 2022 (UTC)Reply
    OK, good example – so would that be sorted as 'Muhammad ibn Hanafiyya'? – Fayenatic London 21:42, 28 September 2022 (UTC)Reply
    Sorry for the late reply (didn't log on for a few days). Muhammad ibn al-Hanafiyya would simply be sorted as 'Muhammad ibn al-Hanafiyya', because medieval names are sorted without putting the surname in front and because the 'al-' is only discarded when the word containing it is the one the entry is sorted under (e.g., Hanadi Zakaria al-Hindi being a modern name would be sorted as 'al-Hindi, Hanadi Zakaria', but 'al-Hindi' being the first word it is sorted under 'Hindi, Hanadi Zakaria al-'). Osama bin Laden, on the other hand, being a modern name would per WP:SUR put the surname in front and would be sorted as 'Bin Laden, Osama', the 'Bin' being an integral part of the name. I think it's best if MOSAR would be changed to also reflect that rule. ☿ Apaugasma (talk ) 22:47, 1 October 2022 (UTC)Reply
    Just a small note: Hanadi Zakaria al-Hindi should be sorted as "Hindi, Hanadi Zakaria", not "Hindi, Hanadi Zakaria al-", as "al-" is not part of the first name(s). Nehme1499 23:32, 1 October 2022 (UTC)Reply
    Thank you both. I have replaced the relevant paragraph in Wikipedia:Manual of Style/Arabic#Collation in alphabetical order – please review it. – Fayenatic London 15:40, 2 October 2022 (UTC)Reply
    Looks good to me. Thanks for your work on this! ☿ Apaugasma (talk ) 15:42, 2 October 2022 (UTC)Reply
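
For anyone wanting to check sort keys in bulk, the outcome of this thread can be approximated in a few lines of Python. This is only a sketch of the rules as discussed above, not a tool Wikipedia provides: the function name `sort_key` and the `medieval` flag are invented for illustration, the particle list is the one quoted from WP:SUR, and the code assumes the surname is the last word of the name (plus a preceding particle). It cannot tell whether 'Abu' belongs to the forename, which still needs a human decision.

```python
import re

# Hyphenated definite article at the start of a word (al-Hindi, El-Bizri).
ARTICLE_PREFIX = re.compile(r"^(?:al|el)-", re.IGNORECASE)
# Definite article written as a separate word (El Hindi, Al Zein).
ARTICLE_WORDS = {"al", "el"}
# Particles that WP:SUR, as quoted in this thread, treats as integral to modern names.
PARTICLES = {"abu", "abd", "abdel", "abdul", "ben", "bin", "bent"}


def sort_key(name: str, medieval: bool = False) -> str:
    """Return a DEFAULTSORT-style key for a romanized Arabic name.

    Medieval names are sorted as written. For modern names the surname
    goes first, the definite article is dropped, and particles are kept.
    """
    if medieval:
        return name
    words = name.split()
    # Assumption: the last word is the core of the surname.
    core = ARTICLE_PREFIX.sub("", words.pop())
    # Drop an article written as its own word before the surname.
    if words and words[-1].lower() in ARTICLE_WORDS:
        words.pop()
    surname = core
    # Keep a particle such as Abu or Bin as part of the surname.
    if words and words[-1].lower() in PARTICLES:
        surname = f"{words.pop().capitalize()} {core}"
    given = " ".join(words)
    return f"{surname}, {given}" if given else surname


if __name__ == "__main__":
    for modern in ("Hanadi Zakaria al-Hindi", "Yahya El Hindi", "Baba Ratan Hindi",
                   "Khaled Abu Al-Heija", "Osama bin Laden"):
        print(f"{{{{DEFAULTSORT:{sort_key(modern)}}}}}")
    print(sort_key("Muhammad ibn al-Hanafiyya", medieval=True))
```

With these rules the three الهندي examples all file under "Hindi", 'Khaled Abu Al-Heija' under "Abu Heija", and Osama bin Laden under "Bin Laden", matching what was agreed above.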

Poor accessibility of Ayin and Hamza strict transliteration

The current strict transliteration recommends ʾ for ayin and ʿ for hamza. Just kidding! It's actually ʿ for ayin and ʾ for hamza. On my system these characters look identical at the default body text size. While I'm all for following standards, standards that depend on distinguishing different apostrophes are not suitable for use on the web where we cannot control how text is rendered (font, size, rasterisation, etc).

While I can accept that for basic transcription, conflating ayin and hamza is acceptable, the purpose of the strict transliteration is explicitly to be able to distinguish each letter. Everybody should be able to distinguish it, not just people with 20/20 vision with their nose pressed against the screen.

I propose that one of apostrophes should be changed to a different glyph. It is better to change ayin, as using apostrophe for glottal stop is a stronger convention.

Here's some potential non-apostrophe options for ayin:

3 - the de facto standard transliteration of ayin as used by Arabs, plus some educational sources such as Duolingo
ﻉ - some texts use an ﻉ glyph directly inside the transliteration. But including an RTL glyph inside an LTR word is problematic for unicode bidi reasons
e - this is used by IATA (on passports) and Google Translate. But it is confusing for Egyptian Arabic where "e" represents a vowel
A - some texts represent ayin using an uppercase A

I like using "3" the best as it's a de facto standard and many people are already used to reading it.

Here's what the difference looks like:

أعرف → ʾaʿraf → ʾa3raf

رائعة → rāʾiʿa → rāʾi3a

عائلة → ʿāʾila → 3āʾila

To my eyes, that's a big improvement. I can actually read the 2nd transliteration. Alextgordon (talk) 13:04, 20 November 2022 (UTC)Reply

On Wikipedia we generally follow scholarly sources, and the de facto standard in scholarly sources these days are modifier letter left half ring (ʿ) and modifier letter right half ring (ʾ) for ayn and hamza, respectively. I'm not sure why these two look identical on your system, but on most systems they look fairly distinct, more so than the modifier letter reversed comma (ʽ) and modifier letter apostrophe (ʼ) often used in older scholarly sources. Yes, "3" of course looks much more distinct, but it also looks like SMS language, and it is never used in scholarly sources. In my opinion it would be a very bad idea for Wikipedia to be the first encyclopedia or knowledge-oriented source to start using "3" instead of ʿ or ʽ. We are not trailblazers, we follow established usage. ☿ Apaugasma (talk ) 14:09, 20 November 2022 (UTC)Reply
I strongly agree with Apaugasma about "3", but some selective use of the IPA symbols ʔ and ʕ might be helpful (not in article titles, of course). AnonMoos (talk) 00:21, 22 November 2022 (UTC)Reply
Yes, ʔ and ʕ are appropriate particularly in linguistic contexts, where reliable sources also often use them. This guideline should not be applied too strictly: each article can have its own standards according to the usage found in its sources, if that usage consistently differs from WP:MOSAR. ☿ Apaugasma (talk ) 10:37, 22 November 2022 (UTC)Reply
The use of 3 to represent 'ain is derived from slang transliteration; it has never been scholarly. Iskandar323 (talk) 07:56, 24 April 2023 (UTC)Reply
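
Since the half rings can be hard to tell apart by eye, it may help to identify them by code point instead. The short Python sketch below uses the standard `unicodedata` module to print the Unicode name of each character mentioned in this thread; the labels and the helper name `describe` are only illustrative, and the characters are written as escapes so that the source itself cannot be misread.

```python
import unicodedata

# Characters discussed in this thread, written as escapes to avoid confusion.
CANDIDATES = {
    "ayn, current strict style": "\u02bf",
    "hamza, current strict style": "\u02be",
    "reversed comma, older sources": "\u02bd",
    "modifier apostrophe, older sources": "\u02bc",
    "plain ASCII apostrophe": "\u0027",
}


def describe(ch: str) -> str:
    """Return the code point and official Unicode name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


if __name__ == "__main__":
    for label, ch in CANDIDATES.items():
        print(f"{label:36} {describe(ch)}")
```

Pasting a doubtful character from an article into `describe` shows at once whether it is the left half ring (ayn) or the right half ring (hamza), whatever the font does with it.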

Definite article capitalization in page titles

The guideline seems to be rather ambiguous about this. In the capitalization section, it mentions the definite article as an exception, but there is no real exposition of or examples of when this is the case. As it is, there seems to be a lot of cross-Wiki variation ATM, e.g. Al-Qaeda/al-Ghazali. But, IMO, articles such as al-Ghazali look pretty funky and non-standard with a lower-case initial definite article. Iskandar323 (talk) 07:53, 24 April 2023 (UTC)Reply

The difference between Al-Qaeda and al-Ghazali is analogous to the difference between Mohamed Morsi and Muhammad al-Baqir. As modern subjects, the former two have fixed spellings as used by the popular press (MOSAR calls this 'common transcription'), while the two latter subjects are historical in nature and therefore follow a specific and consistent transliteration scheme ('basic transcription' in MOSAR terms). Capitalization of al- follows this distinction: in modern subjects which have fixed spellings as used by all our sources we do not change the capitalization, while in historical subjects to which we apply transliteration we follow scholarly standards in not capitalizing al- except at the start of a full sentence. For a prominent example of such standard scholarly usage, see the capitalization in Encyclopaedia of Islam titles like [1][2][3][4][5]. ☿ Apaugasma (talk ) 12:56, 24 April 2023 (UTC)Reply

Hospitals that begin with al-

Should these be capitalized or lowercase? (This matters because it affects whether {{Lowercase title}} is placed in the article.) Also should they have a dash or no dash? They appear to be randomly mixed at the moment. Examples:

Folks are reverting me for capitalizing one of these, so was wondering if there's a standard. Thanks. cc GnocchiFan. –Novem Linguae (talk) 20:41, 18 October 2023 (UTC)Reply

I would not capitalize them, since it's not consistently being capitalized in sources, and it just means "the" and we don't capitalize that at the front of entity names except under unusual circumstances. It's "the University of Edinburgh" not "The University of Edinburgh" (except at the beginning of a sentence). We should also apply this to French l'[Something], etc. One actually has to wonder whether the al can be entirely dropped in various of these cases (just as we most often drop a leading l' or a Spanish el or la). Do the sources almost always include the al for a particular case or not? On the hyphenation, what are most English-language sources doing, and this appears to be a bigger question than hospitals in particular.  — SMcCandlish ¢ 😼  21:28, 18 October 2023 (UTC)Reply
Since these are modern subjects which often have a relatively stable yet idiosyncratic spelling (i.e., not in line with how other similar entities are transcribed), the most common spelling used in the sources for each subject (MOSAR's 'common transcription') should in each case be followed.
In my experience, in cases like this sources overwhelmingly use either capitalized "Al-" (with hyphen) or capitalized "Al " (without hyphen), though one might sometimes also find "El-" or "El ", and –in a minority of cases– uncapitalized versions of the previously mentioned options. Sources tend to copy each other and develop a certain standard spelling for each subject.
However, it's best to look at the sources: scholarly transliteration used for historical subjects always has lower case and hyphenated "al-", and although the popular press tends to use upper case and sometimes drops the hyphen, there are cases where they follow the scholarly standard of using lower case. Leaving out the al- altogether is almost never done, and so generally is a no-go. The best course of action is to start by looking at the sources the article is using. ☿ Apaugasma (talk ) 21:08, 19 October 2023 (UTC)Reply
I've wondered about this myself. À la MOS:Dates, I'd go with one style and stick with it. Consistency is key. kencf0618 (talk)

Houthi vis-à-vis Houthis

I curate the Timeline of the 2023 Israel-Hamas war, and I'm puzzled by the distinction (if any) between "Houthi" and "Houthis". I've done some digging, and in the citations at least "Houthi" predominates. But what's the proper style? Should we go orthographically by citation? Write etymologically "Drones from the tribe of Houth"? What? What? Figured that I, monoglot, would hand you this abstruse issue. Details matter. Thx. kencf0618 (talk) 14:29, 17 December 2023 (UTC)Reply