WT:LANGCODE: Wikis

Wiktionary

Up to date as of January 15, 2010
(Redirected to Wiktionary:Language codes article)

Definition from Wiktionary, a free dictionary

This is a Wiktionary policy, guideline or common practices page. Specifically, it is a policy think tank, working to develop a formal policy.
Policies: CFI - ELE - BLOCK - REDIR - BOTS - QUOTE - DELETE - NPOV - AXX
Shortcut:
WT:LANGCODE

According to our Criteria for inclusion, Wiktionary is intended to include all words in all languages. One way to differentiate languages is through codes formed by a few letters. Language codes appear in topical categories, language-exclusive templates, and etymologies, among other places. In some situations language names are used instead; see also Wiktionary:Language names.

Language code templates

Main category: Language templates

Each language code is stored in its own template, whose name is the code itself. When the template is called, the result is the language name, which may or may not be automatically linked to the related Wiktionary entry. For example, when called, the template {{vot}} returns Votic. All language code templates are at Category:Language templates.
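The lookup behaviour described above can be sketched in a few lines of Python. This is only an illustration, not how the wiki software expands templates: the table LANGUAGE_NAMES and the function expand_language_templates are hypothetical names, and the sample holds only three codes mentioned on this page.

```python
import re

# Hypothetical sample of code-to-name data; the real data lives in the
# individual templates listed at Category:Language templates.
LANGUAGE_NAMES = {"vot": "Votic", "en": "English", "ang": "Old English"}

# Matches a bare template call such as {{vot}} (no parameters).
TEMPLATE_CALL = re.compile(r"\{\{([^{}|]+)\}\}")


def expand_language_templates(text):
    """Replace each {{code}} call with its language name.

    Calls whose code is not in the sample table are left untouched.
    """
    def substitute(match):
        code = match.group(1).strip()
        return LANGUAGE_NAMES.get(code, match.group(0))

    return TEMPLATE_CALL.sub(substitute, text)
```

For instance, expand_language_templates("{{vot}}") gives "Votic", while a call with an unknown code is returned unchanged.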

Linking names

Language names produced from language codes may be linked, making it easier to find information about these languages in their Wiktionary entries. Most of them are linked by default, except these 43 major languages:

Code Language
{{ar}} Arabic
{{be}} Belarusian
{{bg}} Bulgarian
{{bs}} Bosnian
{{cmn}} Mandarin
{{cs}} Czech
{{da}} Danish
{{de}} German
{{el}} Greek
{{en}} English
{{es}} Spanish
{{et}} Estonian
{{fi}} Finnish
{{fil}} Filipino
{{fr}} French
{{hr}} Croatian
{{hu}} Hungarian
{{hy}} Armenian
{{id}} Indonesian
{{is}} Icelandic
{{it}} Italian
{{ja}} Japanese
{{ka}} Georgian
{{ko}} Korean
{{la}} Latin
{{lt}} Lithuanian
{{lv}} Latvian
{{mn}} Mongolian
{{nl}} Dutch
{{no}} Norwegian
{{pl}} Polish
{{pt}} Portuguese
{{ro}} Romanian
{{ru}} Russian
{{sk}} Slovak
{{sl}} Slovene
{{sq}} Albanian
{{sr}} Serbian
{{sv}} Swedish
{{th}} Thai
{{tr}} Turkish
{{uk}} Ukrainian
{{vi}} Vietnamese
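As a rough sketch of the default linking rule, the following Python function returns a plain name for codes in the list above and a wikilink otherwise. The names UNLINKED_BY_DEFAULT and render_language_name are hypothetical, the set is an abbreviated sample of the 43 codes, and the exact markup the real templates emit is an assumption.

```python
# Abbreviated sample of the codes listed above as unlinked by default.
UNLINKED_BY_DEFAULT = {"ar", "de", "en", "es", "fr", "ja", "ru"}


def render_language_name(code, name):
    """Return the plain name for unlinked codes, otherwise a wikilink.

    The [[...]] form is ordinary wikilink syntax; whether the real
    templates emit exactly this markup is an assumption of the sketch.
    """
    if code in UNLINKED_BY_DEFAULT:
        return name
    return "[[" + name + "]]"
```

With this sample, render_language_name("en", "English") stays unlinked, while render_language_name("vot", "Votic") yields a link to the Votic entry.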

Code assignment

This is related to the discussions Duplicate Language Codes and lang= parameter in templates.

  • When possible, Wiktionary language codes come from ISO 639-1, a series of two-letter codes which covers 136 major languages.
    Examples: en is English, fa is Persian and kl is Greenlandic.
  • For languages not found on ISO 639-1, we use ISO 639-3 (which grew out of the Ethnologue codes), a series of three-letter codes which covers thousands of languages.
    Examples: ang is Old English, arz is Egyptian Arabic and cmn is Mandarin.
  • For language families, such as "Germanic", we use ISO 639-5 codes. These language families are only used in etymologies (there are no independent "Germanic" entries) and therefore the codes that represent these groupings are only used with {{etyl}}. For this reason these templates have an "etyl:" prefix.
    Examples: etyl:cel is Celtic
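The order of preference in the list above (ISO 639-1 first, then ISO 639-3, then ISO 639-5 with the "etyl:" prefix) can be sketched as a simple cascade. The three tables and the function assign_code are hypothetical and contain only the examples given here, plus the ISO 639-3 code eng for English to show that the two-letter code wins when both exist.

```python
# Hypothetical sample tables, limited to the examples on this page.
ISO_639_1 = {"English": "en", "Persian": "fa", "Greenlandic": "kl"}
ISO_639_3 = {
    "English": "eng",  # present in ISO 639-3 too, but 639-1 takes priority
    "Old English": "ang",
    "Egyptian Arabic": "arz",
    "Mandarin": "cmn",
}
ISO_639_5 = {"Celtic": "cel"}


def assign_code(name):
    """Pick a code by the stated order of preference.

    Returns None when no ISO code is found, meaning an exceptional
    code would have to be assigned instead.
    """
    if name in ISO_639_1:
        return ISO_639_1[name]
    if name in ISO_639_3:
        return ISO_639_3[name]
    if name in ISO_639_5:
        # Family codes are only used with {{etyl}}, hence the prefix.
        return "etyl:" + ISO_639_5[name]
    return None
```

A name that falls through all three tables, such as Gallo in this sample, returns None, which corresponds to the exceptional cases.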

However, not all known languages are represented by ISO, therefore exceptions should be assigned when necessary.

  • An exceptional code may simply come from ISO 639-2, from deprecated ISO 639-1 or from deprecated ISO 639-3.
    Examples: mo is Moldavian, nah is Nahuatl and sh is Serbo-Croatian.
  • An exceptional code may be borrowed directly from the Wikimedia language codes.
    Examples: cbk-zam is Zamboanga Chavacano, nah is Nahuatl and nds-nl is Dutch Low Saxon.
  • There has been little consensus on how to code language dialects. As with language families, these codes are only used in etymologies with the template {{etyl}}. Current usage employs codes similar to those in Webster's 1913 dictionary.
    Examples: etyl:AE. is American English
  • Otherwise, the exceptional code must start with a related ISO 639-5 code, because an ISO 639-1 or ISO 639-3 code would probably be used by itself. It must then be followed by - (an ordinary hyphen) and an extension formed by a few lower case letters. (Therefore no digits, upper case letters, etc.; IANA tags allow these, case-independently, but Mediawiki software is more restrictive.)
    Examples: cpe-spp is Samoan Plantation Pidgin, roa-gal is Gallo and zls-mon is Montenegrin.
  • Any code derived from community consensus, and not directly from ISO 639-1 or 639-3, is an exceptional code.
    Example: zh is Mandarin (not Chinese, due to consensus)
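The format rule for newly coined exceptional codes (an ISO 639-5 family code, an ordinary hyphen, then lower case letters only) can be checked mechanically. The sketch below is an assumption-laden illustration: FAMILY_CODES is a hypothetical sample holding just the three family codes from the examples, the function name is invented, and no upper limit on the extension length is enforced because the text only says "a few" letters.

```python
import re

# Hypothetical sample of ISO 639-5 family codes, taken from the examples.
FAMILY_CODES = {"cpe", "roa", "zls"}

# The extension may contain lower case ASCII letters only.
EXTENSION = re.compile(r"[a-z]+")


def is_well_formed_exceptional_code(code):
    """Check the 'family code + hyphen + lower case extension' shape."""
    prefix, hyphen, extension = code.partition("-")
    return (
        hyphen == "-"
        and prefix in FAMILY_CODES
        and EXTENSION.fullmatch(extension) is not None
    )
```

Codes borrowed from ISO 639-2 or from the Wikimedia language codes, such as nah or cbk-zam, fall under the other bullets and are deliberately not accepted by this check.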

ISO 639-1, 639-3, and 639-5 codes are (and should be) listed in the External links section of the language name entry (e.g. French#External links). For a generated list of all names and codes of individual languages, see Wiktionary:Index to templates/languages. For a list of (almost) all names and codes (including language families and dialects) that can be used in etymologies with {{etyl}}, see Wiktionary:Etymology/language templates.

For listing reconstructed terms in hypothetical languages, such as Proto-Germanic, use {{proto}}, which does not use language codes.

List of exceptions

Exceptional codes may be assigned to languages (including dialects and language families) not represented individually by ISO 639-1, ISO 639-3, and ISO 639-5. These exceptional codes should always be listed here.

The following list should not be confused with the list of Wikimedia language codes. See Wiktionary:Wikimedia language codes.

Languages

Main category: All languages

These are the codes used to represent individual languages.

Name Wikipedia article Wiktionary code Comments
!Kung w:!Kung language {{khi-kun}}
  • !Kung may be considered a group of dialects or related languages.
Ammonite w:Ammonite language {{sem-amm}}
Banyumasan w:Banyumasan language {{map-bms}}
Chinese w:Chinese language {{zhx-zho}}
  • The code zh is ISO 639-1 for Chinese, though recognized as Mandarin due to consensus.
Darkinjung w:Darkinjung language {{aus-dar}}
Dutch Low Saxon w:Dutch Low Saxon {{nds-nl}}
  • Mediawiki uses the code nds-nl for Dutch Low Saxon language projects.
Gabi w:Pama-Nyungan languages#Classification and Languages {{aus-gab}}
Gallo w:Gallo language {{roa-gal}}
  • Gallo may be considered a dialect of French or an individual language.
Gaulish w:Gaulish language {{cel-gau}}
Greenlandic Eskimo Pidgin w:Indigenous languages of the Americas#Pidgins, mixed languages and trade languages {{crp-gep}}
Guernésiais w:Guernésiais {{roa-grn}}
  • Guernésiais may be considered a dialect of Norman, a dialect of French or an individual language.
Gunai w:Gunai language {{aus-gun}}
  • Gunai may be considered a group of dialects or related languages.
Gutnish w:Modern Gutnish {{gmq-gut}}
  • Gutnish is the modern version of Old Gutnish.
Jèrriais w:Jèrriais {{roa-jer}}
  • Jèrriais may be considered a dialect of Norman, a dialect of French or an individual language.
Latgalian w:Latgalian language {{bat-ltg}}
Leonese w:Leonese language {{roa-leo}}
Mandarin w:Mandarin Chinese {{zh}}
  • There is the ISO 639-3 code cmn for Mandarin.
  • The code cmn should be used always; zh is merely an alternative to be avoided.
  • The code zh is ISO 639-1 for Chinese, though recognized as Mandarin due to consensus.
Maroon Spirit Language w:Jamaican Maroon Spirit Possession Language {{cpe-mar}}
Middle Chinese w:Middle Chinese {{zhx-mid}}
Middle Norwegian w:Norwegian language#From Old Norse to distinct Scandinavian languages {{gmq-mno}}
Mingo w:Mingo {{iro-min}}
Moldavian w:Moldovan language {{mo}}
  • Moldavian may be considered a dialect of Romanian or an individual language.
  • The ISO 639-1 code mo and the ISO 639-3 code mol for Moldavian are no longer active.
Montenegrin w:Montenegrin language {{zls-mon}}
  • Montenegrin may be considered a dialect of Serbian, a dialect of Serbo-Croatian or an individual language.
  • Montenegrin is currently in the process of being standardized, has an official orthography, and a request for its own ISO code has been submitted.
Nahuatl w:Nahuatl {{nah}}
  • Nahuatl may be considered a group of dialects or related languages.
  • The code nah is assigned to Nahuatl in both ISO 639-2 and ISO 639-5.
  • Mediawiki uses the code nah for Nahuatl language projects.
  • The result of an RFDO discussion [1] was to keep the language category.
Norfuk w:Norfuk language {{cpe-nor}}
  • Norfuk and Pitkern may be considered related languages or a single language.
Norman w:Norman language {{roa-nor}}
  • Norman may be considered a dialect of French or an individual language.
  • Mediawiki uses the code nrm for Norman language projects, but the ISO 639-3 code nrm is assigned to the Narom language instead.
Old Novgorod dialect w:Old Novgorod dialect {{zle-nov}}
Old Polish w:Old Polish language {{zlw-opl}}
Old Swedish w:Swedish language#Old Swedish {{gmq-osw}}
Phuthi w:Phuthi language {{bnt-phu}}
Picuris w:Picuris language {{nai-pic}}
Pitkern w:Pitkern {{cpe-pit}}
  • Norfuk and Pitkern may be considered related languages or a single language.
Pomeranian w:Pomeranian language {{zlw-pom}}
Russenorsk w:Russenorsk {{crp-rsn}}
Samoan Plantation Pidgin w:Samoan Plantation Pidgin {{cpe-spp}}
Samogitian w:Samogitian dialect {{bat-smg}}
  • Samogitian may be considered a dialect of Lithuanian or an individual language.
  • Mediawiki uses the code bat-smg for Samogitian language projects.
Serbo-Croatian w:Serbo-Croatian language {{sh}}
  • Serbo-Croatian may be considered a group of languages (Bosnian, Croatian and Serbian; Montenegrin may also be included) or an individual language.
  • The ISO 639-1 code sh for Serbo-Croatian is no longer active.
  • There is the ISO 639-3 code hbs for Serbo-Croatian.
  • Mediawiki uses the code sh for Serbo-Croatian language projects.
Slovincian w:Slovincian {{zlw-slv}}
  • Slovincian may be considered a dialect of Kashubian, a dialect of Pomeranian, a dialect of Polish or an individual language.
Sydney w:Sydney language {{aus-syd}}
  • Sydney language is also known by several other names (see Wiktionary:Language names), including “Dharuk”.
Taimyr Pidgin Russian {{crp-tpr}}
Tarantino w:Tarantino language {{roa-tar}}
Wemba-Wemba w:Wemba-Wemba {{aus-wem}}
Woiwurrung w:Woiwurrung language {{aus-wwg}}
Zamboanga Chavacano w:Chavacano language#Zamboangueño {{cbk-zam}}
  • Mediawiki uses the code cbk-zam for Zamboanga Chavacano language projects.

Families and collective languages

Main category: Language families

These are the codes used to represent groups of related languages. They should only be used in Etymologies with the template {{etyl}}.

Name Wikipedia article Wiktionary code Comments
Brythonic w:Brythonic languages cel-bry
Cariban w:Cariban languages sai-car
Gaelic w:Goidelic languages cel-gae
High German w:High German gmw-hge
Judeo-Aramaic w:Judeo-Aramaic language sem-jar
Low German w:Low German gmw-lge
  • Corresponds to "modern Low German" {{nds}}, "Middle Low German" {{gml}}, and "Old Low German" ("Old Saxon") {{osx}}

Dialects

These are the codes used to represent regional or historic varieties of individual languages. They should only be used in Etymologies with {{etyl}}.

Name Wikipedia article Wiktionary code Comments
American English w:American English AE.
Austrian German w:Austrian German AG.
Late Latin w:Vulgar Latin LL.
Mediaeval Latin w:Mediaeval Latin ML.
New Latin w:New Latin NL.
Old Latin w:Old Latin OL.
Old Northern French w:Old Northern French ONF.
Provençal w:Provençal Pr.
  • In 2007 Provençal prv was merged by ISO 639-3 into {{oci}} ("Occitan")
Viennese German w:Viennese German VG.
Vulgar Latin w:Vulgar Latin VL.
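Since family and dialect codes are only used through {{etyl}} and their templates carry an "etyl:" prefix, the naming convention can be sketched as below. The sets and the function etyl_template_name are hypothetical and hold only a few codes from the tables above; that the exceptional family codes listed here follow the same prefix convention as ISO 639-5 codes is an assumption of the sketch.

```python
# Hypothetical samples drawn from the family and dialect tables above.
FAMILY_CODES = {"cel", "cel-bry", "cel-gae", "gmw-hge", "gmw-lge"}
DIALECT_CODES = {"AE.", "AG.", "LL.", "ML.", "NL.", "OL.", "VL."}


def etyl_template_name(code):
    """Return the template name that would hold the given code.

    Family and dialect codes get the "etyl:" prefix; codes of
    individual languages are template names by themselves.
    """
    if code in FAMILY_CODES or code in DIALECT_CODES:
        return "etyl:" + code
    return code
```

For example, the dialect code AE. maps to the template name etyl:AE., while an individual language code such as vot is returned as is.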

See also

  • Wiktionary:Etymology/language templates
  • User:Robert Ullmann/L2
  • User:Robert Ullmann/Trans languages/uncoded
