
Tocharian languages




From Wikipedia, the free encyclopedia

Tocharian languages
Spoken in: Tarim Basin in Central Asia
Language extinction: 9th century
Language family: Indo-European
Writing system: Tocharian script
Language codes:
  ISO 639-1: none
  ISO 639-2: ine
  ISO 639-3: either xto (Tocharian A) or txb (Tocharian B)

Tocharian or Tokharian is an extinct branch of the Indo-European language family. The name is taken from people known to the Greeks (Ptolemy VI, 11, 6) as the Tocharians (Ancient Greek: Τόχαροι, "Tokharoi"). These are sometimes identified with the Yuezhi and the Kushans, while the term Tokharistan usually refers to 1st millennium Bactria. A Turkic text refers to the Turfanian language (Tocharian A) as twqry. Interpretation is difficult, but F. W. K. Müller has associated this with the name of the Bactrian Tokharoi.

Tocharian consisted of two languages: Tocharian A (Turfanian, Arsi, or East Tocharian) and Tocharian B (Kuchean or West Tocharian). These languages were spoken roughly from the sixth to the ninth centuries; before they became extinct, their speakers were absorbed into the expanding Uyghur tribes. Both languages were once spoken in the Tarim Basin in Central Asia, now part of the Xinjiang Autonomous Region of China.



Phonology

Phonologically, Tocharian is a "Centum" Indo-European language, meaning that it merges the palato-velar consonants (*ḱ, *ǵ, *ǵʰ) of Proto-Indo-European with the plain velars (*k, *g, *gʰ). Centum languages are mostly found in western and southern Europe (Italic, Celtic, Germanic). In that sense Tocharian, like Greek and the Anatolian languages to some extent, seems to have been an isolate in the "Satem" phonetic world of the Indo-European-speaking populations of eastern Europe and Asia. The discovery of Tocharian contributed to doubts that Proto-Indo-European had originally split into western and eastern branches.[1][2]
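The merger itself can be pictured as a simple character mapping. The Python sketch below is illustrative only: the function name `centum_merge` and the mapping table are assumptions made for this example, and it models just the palato-velar merger, not the many later sound changes that produced the attested Tocharian forms.

```python
import unicodedata

# Palato-velar -> plain velar, as in the Centum merger.
# Escapes are used so the mapping does not depend on file encoding.
CENTUM_MERGER = {
    "\u1e31": "k",  # ḱ -> k
    "\u01f5": "g",  # ǵ -> g (also covers ǵʰ -> gʰ, because ʰ is left untouched)
}


def centum_merge(form: str) -> str:
    """Replace palato-velars with plain velars in a reconstructed form."""
    # NFC turns "k" + combining acute into the single character ḱ first.
    normalized = unicodedata.normalize("NFC", form)
    return "".join(CENTUM_MERGER.get(ch, ch) for ch in normalized)


if __name__ == "__main__":
    # Reconstructed forms for "hundred", "ten" and "horse".
    for pie in ["*\u1e31mtom", "*de\u1e31m", "*e\u1e31wo-"]:
        print(pie, "->", centum_merge(pie))
```

Applied to *ḱmtom "hundred", the mapping gives *kmtom, with the same plain k- that appears in Tocharian A känt and Latin centum.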



  • Vowels: /i/, /e/, /a/ (transcribed ā), /u/, /o/, /ɨ/ (transcribed ä), /ə/ (transcribed a)
  • Diphthongs (Tocharian B only): /əi/ (transcribed ai), /oi/ (transcribed oy), /əu/ (transcribed au), /au/ (transcribed āu)


  • Stops: /p/, /t/, /c/, /k/, /kʷ/ (transcribed ku)
  • Affricates: /ts/
  • Fricatives: /s/, /ɕ/ (transcribed ś), /ʂ/ (transcribed ṣ)
  • Approximants: /w/, /j/ (transcribed y)
  • Trills: /r/
  • Nasals: /m/, /n/ (transcribed ṃ word-finally), /ɲ/ (transcribed ñ)
  • Lateral approximants: /l/, /ʎ/ (transcribed ly)

Note that the above consonantal values are largely based on the writing of Sanskrit/Prakrit loanwords. A retroflex value for /ʂ/ is particularly suspect, as the sound derives from palatalized /s/; it was probably a low-frequency sibilant /ʃ/ (like German sch), as opposed to the higher-frequency sibilant /ɕ/ (like Mandarin Pinyin x).
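The phoneme-to-transcription conventions listed above amount to a lookup table. The following Python sketch is a hypothetical illustration, not an established tool: the names `TRANSCRIPTION` and `transcribe` are invented here, the conventional ṣ for /ʂ/ and ṃ for word-final /n/ are assumed, and the phonemic analysis of the sample word is an assumption obtained by reading the table backwards.

```python
import unicodedata

# IPA value -> conventional transcription, taken from the lists above.
# Phonemes absent from this table are written with their IPA symbol.
TRANSCRIPTION = {
    "a": "ā", "ɨ": "ä", "ə": "a",
    "əi": "ai", "oi": "oy", "əu": "au", "au": "āu",
    "kʷ": "ku", "ɕ": "ś", "ʂ": "ṣ",
    "j": "y", "ɲ": "ñ", "ʎ": "ly",
}


def transcribe(phonemes):
    """Turn a list of phonemes (one word) into its transcription."""
    out = []
    last = len(phonemes) - 1
    for i, p in enumerate(phonemes):
        if p == "n" and i == last:
            out.append("ṃ")  # assumed: word-final /n/ is written ṃ
        else:
            out.append(TRANSCRIPTION.get(p, p))
    return unicodedata.normalize("NFC", "".join(out))


if __name__ == "__main__":
    # Assumed phonemic shape of the oblique singular of "teacher".
    print(transcribe(["k", "ɨ", "ʂ", "ʂ", "i", "n"]))
```

A list of phonemes is used rather than a plain string so that multi-character units such as /kʷ/ and the diphthongs can be passed as single items.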

Writing system

Wooden tablet with an inscription showing Tocharian B in its Brahmic form. Kucha, China, 5th-8th century (Tokyo National Museum)

Tocharian is documented in manuscript fragments, mostly from the 8th century (with a few earlier ones), that were written on palm leaves, wooden tablets and Chinese paper and preserved by the extremely dry climate of the Tarim Basin. Samples of the language have been discovered at sites in Kucha and Karasahr, including many mural inscriptions.

Tocharian A and B are not mutually intelligible. Strictly speaking, if twqry is indeed related to Tokharoi (a tentative interpretation), only Tocharian A should be called Tocharian, while Tocharian B could be called Kuchean (its native name may have been kuśiññe). Since the two grammars are usually treated together in scholarly works, however, the terms A and B have proven useful. A common Proto-Tocharian language must precede the attested languages by several centuries, probably dating to the 1st millennium BC. Given the small geographical range of Tocharian A and the lack of secular texts in it, it might alternatively have been a liturgical language, the relationship between the two being similar to that between Classical Chinese and Mandarin. The lack of a secular corpus in Tocharian A is by no means certain, however, because Tocharian texts in general are preserved only in fragments.

Most Tocharian texts were written in a derivative of the Brahmi alphabetic syllabary (abugida) referred to as slanting Brahmi. A smaller number, however, were written in the Manichaean script, in which Manichaean texts were recorded.[3][4] It soon became apparent that a large proportion of the manuscripts were translations of known Buddhist works in Sanskrit, and some of them were even bilingual, which facilitated decipherment of the new language. Besides the Buddhist and Manichaean religious texts, there were also monastery correspondence and accounts, commercial documents, caravan permits, medical and magical texts, and one love poem. Many Tocharians embraced Manichaean dualism or Buddhism.

In 1998, Chinese linguist Ji Xianlin published a translation and analysis of fragments of a Tocharian Maitreyasamiti-Nataka discovered in 1974 in Yanqi.[5][6][7]


Morphology

Tocharian has completely reworked the nominal declension system of Proto-Indo-European. The only cases inherited from the proto-language are nominative, genitive, and accusative; in Tocharian the old accusative is known as the oblique case. In addition to these three cases, however, each Tocharian language has six cases formed by the addition of an invariant suffix to the oblique case. For example, the Tocharian A word käṣṣi "teacher" is declined as follows:

Case          Suffix   Singular      Plural
Nominative             käṣṣi         käṣṣiñ
Genitive               käṣṣiyāp      käṣṣiśśi
Oblique                käṣṣiṃ        käṣṣis
Instrumental  -yo      käṣṣinyo      käṣṣisyo
Perlative     -ā       käṣṣinā       käṣṣisā
Comitative    -aśśäl   käṣṣinaśśäl   käṣṣisaśśäl
Allative      -ac      käṣṣinac      käṣṣisac
Ablative      -äṣ      käṣṣinäṣ      käṣṣisäṣ
Locative      -aṃ      käṣṣinaṃ      käṣṣisaṃ
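Because the six secondary cases are built by attaching a fixed suffix to the oblique form, the paradigm can be generated mechanically. The Python sketch below is a minimal illustration under stated assumptions: the names `SECONDARY_SUFFIXES` and `secondary_cases` are invented for this example, the perlative suffix -ā is read off the forms käṣṣinā and käṣṣisā, and the rewriting of final ṃ as n before a suffix is inferred from the forms in the table rather than taken from a grammar.

```python
import unicodedata

# Invariant suffixes added to the oblique form (Tocharian A).
SECONDARY_SUFFIXES = {
    "instrumental": "yo",
    "perlative": "ā",      # assumed from the forms käṣṣinā / käṣṣisā
    "comitative": "aśśäl",
    "allative": "ac",
    "ablative": "äṣ",
    "locative": "aṃ",
}


def secondary_cases(oblique: str) -> dict:
    """Build the six secondary case forms from an oblique form."""
    stem = unicodedata.normalize("NFC", oblique)
    # Assumed spelling rule: word-final ṃ (U+1E43) is written n
    # once a suffix follows it.
    if stem.endswith("\u1e43"):
        stem = stem[:-1] + "n"
    return {
        case: unicodedata.normalize("NFC", stem + suffix)
        for case, suffix in SECONDARY_SUFFIXES.items()
    }


if __name__ == "__main__":
    for label, oblique in [("singular", "käṣṣiṃ"), ("plural", "käṣṣis")]:
        for case, form in secondary_cases(oblique).items():
            print(label, case, form)
```

Fed the oblique singular käṣṣiṃ and the oblique plural käṣṣis, the function reproduces the twelve secondary forms in the table; it is not claimed to hold for every Tocharian noun class.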

Cultural significance

"Tocharian donors", with light hair and light eye color, 6th century CE fresco, Qizil, Tarim Basin. These frescoes are associated with annotations in Tocharian and Sanskrit made by their painters.

The existence of the Tocharian languages and alphabet was not even suspected until archaeological exploration of the Tarim Basin by Aurel Stein in the early 20th century brought to light fragments of manuscripts in an unknown language.[8]

This language, now known as Tocharian, turned out to belong to a hitherto unknown branch of the Indo-European family of languages. The discovery of Tocharian has upset some theories about the relations of Indo-European languages and revitalized their study.

The Tocharian languages are a major geographic exception to the usual pattern of Centum branches, being the only one that spread directly east from the theoretical Indo-European starting point in the Pontic steppe. One theory, following the "wave" theory of Johannes Schmidt, suggests that the Satem isogloss represents a linguistic innovation within the heart of the Proto-Indo-European home range. On this view, the distribution of the Centum languages simply reflects linguistic conservatism along the eastern and western peripheries of that range.

Tocharian probably died out after 840, when the Uyghurs were expelled from Mongolia by the Kyrgyz and retreated to the Tarim Basin. This theory is supported by the discovery of translations of Tocharian texts into Uyghur. During Uyghur rule, the local peoples mixed with the Uyghurs to produce much of the modern population of what is now Xinjiang.

Comparison to other Indo-European languages

Tocharian vocabulary (sample)

Modern English  Tocharian A  Tocharian B  Old Irish    Latin     Ancient Greek  Vedic Sanskrit  Proto-Indo-European
one             sas          ṣe           oen          ūnus      heis           eka             *oynos, *sems
two             wu           wi           dá           duo       duo            dva             *d(u)woh₁
three           tre          trai         trí          trēs      treis          tri             *treyes
four            śtwar        śtwer        cethair      quattuor  téssares       catur           *kʷetwores
five            päñ          piś          cóic         quīnque   pente          pañca           *penkʷe
six             ṣäk          ṣkas         sé           sex       héx            ṣáṣ             *(s)weḱs
seven           ṣpät         ṣukt         secht        septem    heptá          saptá           *septm
eight           okät         okt          ocht         octō      októ           aṣṭa            *oḱtoh₃
nine            ñu           ñu           noí          novem     ennéa          náva            *newn
ten             śäk          śak          deich        decem     deka           dáśa            *deḱm
hundred         känt         kante        cét          centum    hekatón        śatám           *ḱmtom
father          pācar        pācer        athair       pater     patēr          pitár-          *ph₂tēr
mother          mācar        mācer        máthair      mater     mētér          mātar-          *meh₂tēr
brother         pracar       procer       bráthair     frāter    phrátēr[* 1]   bhrātar-        *bhreh₂tēr
sister          ṣar          ṣer          siur         soror     éor[* 1]       svas-           *swesor
(horse)[* 2]    yuk          yakwe        ech          equus     híppos         áśva-           *eḱwo-
cow             ko           keu          bó           bos[* 3]  boûs           gáus            *gʷow-
(voice)[* 3]    vak          vek          -            vōx       épos[* 1]      vāk             *wekʷ-
name            ñom          ñem          ainmm        nōmen     ónoma          nāman-          *nomn
to milk         malk         mälk         mlig-/blig-  mulgēre   amélgein       marjati[* 1]    *melg-

  [* 1] Cognate, with shifted meaning.
  [* 2] English meaning, unrelated word.
  [* 3] Borrowed cognate, not native.



References

  1. Renfrew, Colin. Archaeology and Language (1990), p. 107
  2. Baldi, Philip. The Foundations of Latin (1999), p. 39
  3. Daniels (1996), p. 531
  4. Campbell (2000), p. 1666
  5. "Fragments of the Tocharian", Andrew Leonard, How the World Works, January 29, 2008
  6. "Review of 'Fragments of the Tocharian A Maitreyasamiti-Nataka of the Xinjiang Museum, China. In Collaboration with Werner Winter and Georges-Jean Pinault by Ji Xianlin'", J. C. Wright, Bulletin of the School of Oriental and African Studies, University of London, Vol. 62, No. 2 (1999), pp. 367-370
  7. "Fragments of the Tocharian A Maitreyasamiti-Nataka of the Xinjiang Museum, China", Ji Xianlin, Werner Winter, Georges-Jean Pinault, Trends in Linguistics, Studies and Monographs
  8. Deuel, Leo. 1970. Testaments of Time, ch. XXI, pp. 425-455. Baltimore, Pelican Books. Orig. publ. Knopf, NY, 1965.


  • Daniels, Peter (1996), The World's Writing Systems, Oxford University Press, ISBN 0195079930
  • Campbell, George (2000), Compendium of the World's Languages, Second Edition: Volume II, Ladakhi to Zuni, Routledge, ISBN 041520473
  • Levi, Sylvain. "Tokharian Pratimoksa Fragment". The Journal of the Royal Asiatic Society of Great Britain and Ireland. 1913, pp. 109–120.
  • Mallory, J.P. and Victor H. Mair. The Tarim Mummies. London: Thames & Hudson, 2000. (ISBN 0-500-05101-1)
  • Schmalstieg, William R. "Tokharian and Baltic." Lituanus. v. 20, no. 3, 1974.
  • Krause, Wolfgang and Werner Thomas. Tocharisches Elementarbuch. Heidelberg: Carl Winter Universitätsverlag, 1960.
  • Malzahn, Melanie (Ed.). Instrumenta Tocharica. Heidelberg: Carl Winter Universitätsverlag, 2007. (ISBN 978-3-8253-5299-8)
