Module talk:Wikt-lang


Italicisation of Halkomelem

This discussion moved to Template talk:Lang/Archive 7#Italicisation of Halkomelem, as it pertains to {{lang}}. — Preceding unsigned comment added by Erutuon (talk • contribs) 17:13, 29 June 2018 (UTC)

missing data?


Am I mistaken, or should every language tag that is listed in the ["languages"] table have both of ["name"] and ["article"] entries? If that is so then:

these are missing ["name"]:

these are missing ["article"]:

In the above list the language names are the names that Module:Lang associates with the language tag. When a second name is listed, the Module:Lang and Wiktionary definitions of the language tag disagree. In the case of the IETF-like language tags, Module:Lang does not support extlangs and the extlangs that Wiktionary uses are not defined in the IANA language-subtag-registry file. Wiktionary names used above were taken from one of:

Am I mistaken? Is there a reason that these language tags do not have both of ["name"] and ["article"]?

Trappist the monk (talk) 15:50, 14 December 2023 (UTC)

@Editor Erutuon: you are (apparently) the original author of this module. Do you have an answer for my questions?
Trappist the monk (talk) 20:32, 17 December 2023 (UTC)
It's been a long time since I looked at this module. It doesn't currently use the article field in any code that can be invoked by a template. makeLinkedName uses the article field, but it's orphaned and apparently I never tested it, because it would give a "tried to concatenate nil" error for language codes that don't have an article field or don't have a Wikipedia_name or name field. I think I meant to use it in another invokable module function that would prepend the language name like {{lang-fr}} does, and never got around to it. But it would probably be better to have Module:Lang generate that language name. So for now the article field is unnecessary.
The name field is used to specify the canonical name (the name of the language used in level-2 headers in Wiktionary entries). If I remember right, in the code that is invoked by templates, it is only needed when mw.language.fetchLanguageName(languageCode, 'en') doesn't return the canonical name. Thus, be doesn't have a name field because mw.language.fetchLanguageName('be', 'en') returns Belarusian, the canonical name; but ab has a name field with the canonical name Abkhaz because mw.language.fetchLanguageName('ab', 'en') returns Abkhazian and aaq has Abnaki because mw.language.fetchLanguageName('aaq', 'en') returns nil.
The name fields are incomplete because I hadn't worked out how to fully populate the list of idiosyncratic Wiktionary language names and keep it up to date. There's no way for Lua modules on Wikipedia to access data modules on Wiktionary, so unless someone manually copies modules from Wiktionary to Wikipedia every time there is a change, the syncing would have to be done with JavaScript or an off-wiki script.
We don't use real extlang subtags on Wiktionary or IETF language tags, but only use idiosyncratic codes identifying languages that can have entries, or language families, or variants (dialects for instance) of languages or language families, or variants of variants. This Wikipedia module only cares about languages that have entries. (It could also be made to handle codes of variants of those languages, if we wanted to figure that out; they are listed in wikt:Module:etymology languages/data along with variants of language families.) Some languages that have entries use ISO 639 language codes (probably sometimes with a different circumscription from what ISO prescribes), and others use codes made of two or three segments, each of two or three letters, joined by hyphens. (The codes for languages that have entries all match a regex ^[a-z][a-z][a-z]?(-[a-z][a-z][a-z]?){0,2}$, but according to this table, most hyphenated codes have segments of three letters.) There's logic to how the language codes are made, such as that the first segment is usually an existing code, but as far as I know, those making Wiktionary language codes never look at the IANA subtag registry.
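As a rough illustration of the code shape described above, the following Python sketch (not the module's Lua) tests strings against the quoted regex. The helper name `looks_like_wiktionary_code` and the hyphenated sample `abc-def` are hypothetical; the pattern only checks the shape of a code, not whether Wiktionary actually defines it.

```python
import re

# Pattern quoted in the comment above: a first segment of two or three
# lowercase letters, optionally followed by one or two more such segments.
CODE_PATTERN = re.compile(r"^[a-z][a-z][a-z]?(-[a-z][a-z][a-z]?){0,2}$")


def looks_like_wiktionary_code(code: str) -> bool:
    """Return True if `code` has the shape of an entry-language code.

    This is a shape check only; it says nothing about whether the code exists.
    """
    return CODE_PATTERN.fullmatch(code) is not None


# "be", "ab", and "aaq" appear in this discussion; "abc-def" is a made-up
# hyphenated code used only to exercise the pattern.
for sample in ["be", "ab", "aaq", "abc-def", "ine-x-proto"]:
    print(sample, looks_like_wiktionary_code(sample))
```

Note that Wikipedia's private-use tag ine-x-proto does not fit the pattern, because its middle segment has only one letter.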
Wiktionary puts these idiosyncratic language codes in lang attributes in the HTML. When they happen to be valid IETF language tags (as many are), it's because Wiktionary uses an ISO language code and the IANA registry also uses it. There has been talk about translating Wiktionary language codes into IETF tags, but no work on it so far. If Wiktionarians do that, we'll have to come up with some private-use tags, as Wikipedia has done with ine-x-proto, and it could simplify things if Wiktionary and Wikipedia use the same private-use tags.
Anyway, that's a lot of background information. Syncing the name field from the Wiktionary language data modules requires 1. grabbing all the pairs of codes and canonical names of languages that have entries from wikt:Module:languages/code to canonical name (this can't be done in Lua on Wikipedia), 2. filtering out the names that are equal to the output of mw.language.fetchLanguageName(code, 'en'), 3. writing the remaining codes and names to a new data module on Wikipedia (separate from Module:Language/data, which contains some data that has to be manually edited), 4. making this module use the new data module. The name field could then be removed from Module:Language/data. There are several other problems in this module, but this would solve the name problem. — Eru·tuon 21:49, 17 December 2023 (UTC)
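Step 2 of the syncing procedure above (filtering out names that MediaWiki already returns) could be sketched off-wiki roughly as follows. This is a Python illustration under stated assumptions, not an existing script: `names_needing_override` is a hypothetical helper, and the two dictionaries are stand-ins for the Wiktionary code-to-name data and for the output of mw.language.fetchLanguageName(code, 'en'), filled in with only the three examples given earlier in this thread.

```python
from typing import Callable, Dict, Optional


def names_needing_override(
    canonical_names: Dict[str, str],
    fetch_language_name: Callable[[str], Optional[str]],
) -> Dict[str, str]:
    """Keep only codes whose Wiktionary canonical name differs from
    (or is missing from) the MediaWiki name lookup."""
    return {
        code: name
        for code, name in sorted(canonical_names.items())
        if fetch_language_name(code) != name
    }


# Stand-in data (assumption): only the three codes discussed above.
WIKTIONARY_NAMES = {"be": "Belarusian", "ab": "Abkhaz", "aaq": "Abnaki"}
MEDIAWIKI_NAMES = {"be": "Belarusian", "ab": "Abkhazian"}  # no entry for aaq

overrides = names_needing_override(WIKTIONARY_NAMES, MEDIAWIKI_NAMES.get)
print(overrides)  # {'aaq': 'Abnaki', 'ab': 'Abkhaz'}
```

Only the entries in `overrides` would then need to be written to the new data module (step 3); be drops out because both sources agree on Belarusian.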
Ah, thanks for that. My purpose is to add a level-2-header anchor to a Wiktionary interwiki link. With what you've written above, it looks like a two-step process. Step one is to query Module:Language/data for the Wiktionary preferred name. When that query returns nil, then step two: attempt to fetch the language name from MediaWiki using mw.language.fetchLanguageName(<tag>, 'en'). When both fail, use the value in |anchor= if present and emit an error category; when |anchor= is missing or empty, emit an error message and an error category so that the underlying data can be fixed. It ain't pretty but it will work I think.
Trappist the monk (talk) 23:38, 17 December 2023 (UTC)
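The two-step lookup with the |anchor= fallback described in the comment above might look roughly like this. It is a Python sketch rather than the module's Lua, and `resolve_anchor`, `AnchorResult`, the error-message wording, and the unknown tag `zzz` are all hypothetical; the dictionaries stand in for Module:Language/data and for mw.language.fetchLanguageName(<tag>, 'en').

```python
from typing import Mapping, NamedTuple, Optional


class AnchorResult(NamedTuple):
    anchor: Optional[str]         # section anchor for the Wiktionary link
    error_category: bool          # whether to add an error category
    error_message: Optional[str]  # visible message when nothing usable was found


def resolve_anchor(
    tag: str,
    wikt_names: Mapping[str, str],
    mediawiki_names: Mapping[str, str],
    anchor_param: Optional[str] = None,
) -> AnchorResult:
    # Step one: the Wiktionary-preferred name from the data module.
    name = wikt_names.get(tag)
    # Step two: fall back to the MediaWiki language name.
    if name is None:
        name = mediawiki_names.get(tag)
    if name:
        return AnchorResult(name, False, None)
    # Both lookups failed: use |anchor= if it has content, but still categorize.
    if anchor_param and anchor_param.strip():
        return AnchorResult(anchor_param.strip(), True, None)
    # Nothing usable: message (wording is an assumption) plus category.
    return AnchorResult(None, True, f"no language name found for tag '{tag}'")


# Stand-in data (assumption), using the examples from this thread.
WIKT_NAMES = {"ab": "Abkhaz", "aaq": "Abnaki"}
MW_NAMES = {"ab": "Abkhazian", "be": "Belarusian"}

print(resolve_anchor("ab", WIKT_NAMES, MW_NAMES))
print(resolve_anchor("be", WIKT_NAMES, MW_NAMES))
print(resolve_anchor("zzz", WIKT_NAMES, MW_NAMES, anchor_param="Example"))
print(resolve_anchor("zzz", WIKT_NAMES, MW_NAMES))
```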
Looks good. That's roughly what {{wikt-lang}} does already, apart from not having a |anchor= parameter. — Eru·tuon 22:06, 30 December 2023 (UTC)

Samaritan


{{lang|smp|example}} (smp is the code for Samaritan Hebrew language) adds a page to Category:Articles containing Samaritan-language text, but Samaritan language redirects to Samaritan Aramaic language (sam). -- Error (talk) 20:06, 17 January 2024 (UTC)

That is not a 'failing' of Module:Language (because that module is not used by {{lang}}). It may be a failing of Module:Lang which when queried gives these results:
{{lang|fn=category_from_tag|smp|link=yes}} → Category:Articles containing Samaritan Hebrew-language text
{{lang|fn=name_from_tag|smp|link=yes}} → Samaritan Hebrew
{{lang|fn=category_from_tag|sam|link=yes}} → Category:Articles containing Samaritan Aramaic-language text
{{lang|fn=name_from_tag|sam|link=yes}} → Samaritan Aramaic
The ISO 639-3 custodian assigns smp to Samaritan. Similarly, the IANA language-subtag-registry file also assigns smp to Samaritan. Apparently, no one at en.wiki has cared enough about which article is linked when smp is used in {{lang}} to gain a consensus to override the ISO 639-3 name or to repoint the Samaritan-language-redirect so that it points elsewhere.
Issues involving {{lang}}, {{transliteration}}, and other templates that use Module:Lang, should be discussed at Template talk:Lang, not here. Where is it best to discuss repointing the Samaritan-language-redirect? I don't know, except to say that such discussion should not be here.
Trappist the monk (talk) 20:48, 17 January 2024 (UTC)
Thanks. Asked at Template talk:lang#Samaritan --Error (talk) 01:23, 18 January 2024 (UTC)

Mis-italicization


This should not be applying italics automatically in the case of {{wikt-lang|en|...}}, only for non-English material (MOS:FOREIGN, MOS:ITALICS). If an English-language term is being used in a words-as-words manner, it can be italicized manually or by a template parameter. — SMcCandlish ¢ 😼 18:22, 25 June 2024 (UTC)

I have hacked Module:Language/sandbox to create a common function (permalink) that chooses how the 'text' is rendered. Default rendering is:
Latn-script text is rendered in italic except when
  • the language tag is en (English text renders in upright font)
  • |italics=no
English text may be rendered in italics for MOS:WORDSASWORDS by setting |italics=yes (or y or +)
non-Latn-script text is rendered in upright font except when
  • |italics=yes (or y or +)
See Template:Wikt-lang/testcases.
Keep? Discard?
Trappist the monk (talk) 19:50, 27 June 2024 (UTC)
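The default rendering rules listed in the comment above can be read as a small decision function. The following Python sketch is one possible reading, not the sandbox code: `should_italicize` is a hypothetical name, the script code is assumed to be supplied by the caller, and only the exact tag en is treated as English because the comment does not say how subtagged forms are handled.

```python
from typing import Optional

# Values of |italics= that force italic rendering, per the comment above.
ITALICS_YES = {"yes", "y", "+"}


def should_italicize(tag: str, script: str, italics: Optional[str] = None) -> bool:
    """Return True if the text should render in italics.

    `script` is an ISO 15924 code such as "Latn" or "Cyrl"; how the module
    picks it is outside this sketch.
    """
    param = (italics or "").strip().lower()
    if param in ITALICS_YES:
        return True    # explicit request, e.g. English words-as-words
    if param == "no":
        return False   # explicit override of the Latn default
    if script != "Latn":
        return False   # non-Latn text is upright by default
    # Latn text is italic by default, except English. Only the exact tag
    # "en" is checked here (assumption).
    return tag != "en"


print(should_italicize("fr", "Latn"))         # True
print(should_italicize("en", "Latn"))         # False
print(should_italicize("en", "Latn", "yes"))  # True
print(should_italicize("ru", "Cyrl"))         # False
```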

Requested move 9 July 2024

The following is a closed discussion of a requested move. Please do not modify it. Subsequent comments should be made in a new section on the talk page. Editors desiring to contest the closing decision should consider a move review after discussing it on the closer's talk page. No further edits should be made to this discussion.

The result of the move request was: moved. Admin: Please double check the subpages associated with this page to ensure nothing else needs to be moved, otherwise it looks like this should be fairly straightforward. (closed by non-admin page mover) ASUKITE 17:42, 17 July 2024 (UTC)


Module:Language → Module:Wikt-lang – This module has a very misleading name. It is solely used to create Wiktionary links with a language anchor, and should not be confused with Module:Lang, which is used for most other language-related purposes. This move will match the template name at Template:Wikt-lang. Gonnym (talk) 12:31, 9 July 2024 (UTC)

Note: not all of the /data sub-modules actually belong to the module, so they will need to be checked when disentangling this. Gonnym (talk) 12:34, 9 July 2024 (UTC)
@Gonnym or whoever understands how these work: from digging through the subpages, the only one I found that didn't mention Wiktionary was Module:Language/data/iso 15924 and its doc page. Would those potentially need to be moved to Module:Lang/data/iso 15924? ASUKITE 16:27, 17 July 2024 (UTC)
No. Module:Language/data/iso 15924 is not a submodule of Module:Lang. Do not move it there.
Trappist the monk (talk) 16:35, 17 July 2024 (UTC)
Thanks, that was the only one that was unclear; I don't want us to break anything! ASUKITE 16:38, 17 July 2024 (UTC)
Rename per nom. * Pppery * it has begun... 16:10, 9 July 2024 (UTC)
The discussion above is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.

Uncaught error


Currently, using the template without a text value produces a Lua error: Lua error in Module:Wikt-lang at line 131: bad argument #1 to 'get_best_script' (string expected, got nil). This should be handled more gracefully, like the error in line 248. Gonnym (talk) 12:14, 19 July 2024 (UTC)

Caught:
{{Wikt-lang|en|}} → [text?] Parameter 2 is required
{{Wikt-lang|en}} → [text?] Parameter 2 is required
Trappist the monk (talk) 13:53, 19 July 2024 (UTC)
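The general shape of the fix (validate parameter 2 before any script lookup sees a nil value) can be sketched as below. This is a Python illustration, not the module's Lua; `render_wikt_lang` and the placeholder return value are hypothetical, and only the error text is taken from the output shown above.

```python
from typing import Optional

MISSING_TEXT_ERROR = "[text?] Parameter 2 is required"


def render_wikt_lang(tag: str, text: Optional[str] = None) -> str:
    """Return an error message instead of failing when the text is missing."""
    # Covers both {{Wikt-lang|en}} (no parameter 2) and {{Wikt-lang|en|}} (empty).
    if text is None or not text.strip():
        return MISSING_TEXT_ERROR
    # Placeholder for the real link rendering, which is outside this sketch.
    return f"<rendered {tag!r} {text.strip()!r}>"


print(render_wikt_lang("en"))      # [text?] Parameter 2 is required
print(render_wikt_lang("en", ""))  # [text?] Parameter 2 is required
```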