Computational linguistics: Wikis




From Wikipedia, the free encyclopedia

Computational linguistics is an interdisciplinary field dealing with the statistical and/or rule-based modeling of natural language from a computational perspective. This modeling is not limited to any particular field of linguistics. Traditionally, computational linguistics was performed by computer scientists who had specialized in the application of computers to the processing of a natural language. Computational linguists often work as members of interdisciplinary teams, including linguists (specifically trained in linguistics), language experts (persons with some level of ability in the languages relevant to a given project), and computer scientists. In general, computational linguistics draws upon the involvement of linguists, computer scientists, experts in artificial intelligence, mathematicians, logicians, cognitive scientists, cognitive psychologists, psycholinguists, anthropologists and neuroscientists, among others.



Computational linguistics as a field predates artificial intelligence, a field under which it is often grouped. Computational linguistics originated with efforts in the United States in the 1950s to use computers to automatically translate texts from foreign languages, particularly Russian scientific journals, into English.[1] Since computers can make arithmetic calculations much faster and more accurately than humans, it was thought that it would take only a short time to work out the technical details that would give them the same remarkable capacity to process language.[citation needed]

When machine translation (also known as mechanical translation) failed to yield accurate translations right away, automated processing of human languages was recognized as far more complex than had originally been assumed. Computational linguistics was born as the name of the new field of study devoted to developing algorithms and software for intelligently processing language data. When artificial intelligence came into existence in the 1960s, the field of computational linguistics became that sub-division of artificial intelligence dealing with human-level comprehension and production of natural languages.[citation needed]

In order to translate one language into another, it was observed that one had to understand the grammar of both languages, including both morphology (the grammar of word forms) and syntax (the grammar of sentence structure). In order to understand syntax, one had to also understand the semantics and the lexicon (or 'vocabulary'), and even to understand something of the pragmatics of language use. Thus, what started as an effort to translate between languages evolved into an entire discipline devoted to understanding how to represent and process natural languages using computers.[citation needed]


Computational linguistics can be divided into major areas depending upon the medium of the language being processed, whether spoken or textual; and upon the task being performed, whether analyzing language (recognition) or synthesizing language (generation).

Speech recognition and speech synthesis deal with how spoken language can be understood or created using computers. Parsing and generation are sub-divisions of computational linguistics dealing respectively with taking language apart and putting it together. Machine translation remains the sub-division of computational linguistics dealing with having computers translate between languages.

Some of the areas of research that are studied by computational linguistics include:

The Association for Computational Linguistics defines computational linguistics as:

...the scientific study of language from a computational perspective. Computational linguists are interested in providing computational models of various kinds of linguistic phenomena[2].



  1. ^ John Hutchins. Retrospect and prospect in computer-based translation. Proceedings of MT Summit VII, 1999, pp. 30–44.
  2. ^ The Association for Computational Linguistics. What is Computational Linguistics? Published online, February 2005.



Study guide

Up to date as of January 14, 2010

From Wikiversity


Welcome to the Wikiversity Center for Computational Linguistics.


Topic:Lexicography redirects to here.


The Center for Computational Linguistics is a Wikiversity content development project where participants create, organize and develop learning resources for Computational Linguistics. This general goal intersects the Schools of Computer Science and Linguistics. It also relates to Translation, Multilingual Studies and other topics.

Specific Goals

This content development project is concerned with learning activities for Computational linguistics. We need learning activities that will help learners:

  • To become familiar with the basic terminology of the field.
  • To get to know different experiences in this field related to MediaWiki: conjugators, bots, and multilingual website approaches such as WiktionaryZ, ...
  • To practice with software tools on their own computers, such as the Natural Language Toolkit (NLTK) and Apertium.
  • To propose, discuss or even develop new applications that can be used with MediaWiki, especially to improve projects such as Wiktionary, language learning methodologies in Wikiversity or language learning books in Wikibooks.

Concepts to learn include: /concepts

Learning materials


Learning materials and learning projects are located in the main Wikiversity namespace. Simply make a link to the name of the lesson (lessons are independent pages in the main namespace) and start writing!

You should also read about the Wikiversity:Learning model. Lessons should center on learning activities for Wikiversity participants. Learning materials and learning projects can be used by multiple projects. Cooperate with other departments that use the same learning resource.


Brainstormed list of possible lessons:

  • Lesson 1: What does Computational Linguistics mean?
  • Lesson 2: Computational Morphology
    • The conjugator based on templates: An example in Wiktionary
  • Lesson 3: The corpus (corpus linguistics)
    • What is it? What can it be used for?
  • Lesson 4: The parser
  • Lesson 5: OmegaWiki as a corpus or lexicon
  • Lesson 6: Audio interfaces and the relationship between sound and meaning
  • Lesson 7: Human/Machine interfaces and linguistics framework
  • Lesson 8: Language acquisition for youngsters and their machines
  • Lesson 9: Computational applications for foreign language learning
    • The multilingual platforms: user preference selection.
  • ...Lesson brainstorm continues...
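
As a possible starting point for the corpus lesson, here is a minimal sketch of one thing a corpus can be used for: a keyword-in-context (KWIC) concordance. The function names `tokenize` and `concordance`, the `width` parameter and the sample text are all hypothetical, chosen only for illustration. They are not taken from any Wikiversity lesson or existing tool.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into word tokens (letters and apostrophes only)."""
    return re.findall(r"[a-z']+", text.lower())

def concordance(tokens, keyword, width=2):
    """Return (left context, keyword, right context) for every occurrence of keyword."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, tok, right))
    return lines

# Hypothetical toy corpus; a real lesson would load a much larger text.
corpus = "The cat sat on the mat. The dog chased the cat."
tokens = tokenize(corpus)
for left, kw, right in concordance(tokens, "cat"):
    print(f"{left:>15} [{kw}] {right}")
```

A real corpus would need more careful tokenization (punctuation, hyphenation, non-English alphabets); the regular expression here is deliberately simplistic.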

Remember: All actual learning resources should be on pages in the main namespace (page names with no prefix).


First course — Introduction
  • Introduction
    • Including Unix for Poets (how to mangle text)
  • Lexical analysis
    • Morphological analysis
      • Finite state automata and transducers
        • Tour of free-software packages (including at least SFST and lttoolbox)
        • Paradigms and lemma-paradigm pairs
      • Two-level morphology
      • POS tagging
        • HMMs
  • Syntactic analysis
      • Finite state grammars
  • Semantic analysis
    • Word sense disambiguation
  • Machine translation
    • Sub-fields: Direct, Transfer, Example-based, SMT
      • Practicals on creating MT systems for a given pair of languages within the RBMT/Transfer paradigm (using Apertium), and in the SMT paradigm (using GIZA++/Moses)
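
The "Unix for Poets" item in the outline above is usually associated with counting words and word pairs using small text-processing tools. The following is a rough Python sketch of that kind of exercise, not the course's own material; `word_frequencies`, `bigram_frequencies` and the sample sentence are assumptions made for illustration.

```python
from collections import Counter
import re

def words_of(text):
    """Lowercase the text and keep only runs of ASCII letters as words."""
    return re.findall(r"[a-z]+", text.lower())

def word_frequencies(text):
    """Count how often each word occurs."""
    return Counter(words_of(text))

def bigram_frequencies(text):
    """Count how often each pair of adjacent words occurs."""
    words = words_of(text)
    return Counter(zip(words, words[1:]))

# Hypothetical sample text; replace with the contents of a real file.
sample = "the quick brown fox jumps over the lazy dog and the fox sleeps"
freq = word_frequencies(sample)
bigrams = bigram_frequencies(sample)

for word, count in freq.most_common(2):
    print(count, word)
print(bigrams[("the", "fox")])
```

Sorted word and bigram counts like these are a common first step before the probabilistic methods covered in a later course.
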
Second course — Probabilistic methods


  • Creating templates that automatically generate the conjugated forms of regular verbs. (Regular verbs conjugated with templates in Spanish Wiktionary)
  • Using a corpus: [1]
  • Work on the Translator's Handbook sections, Machine Translation and Computer-assisted Translation
  • Participate locally at OmegaWiki or at
  • ...develop an activity...
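
To make the template-based conjugation activity above more concrete, here is a small sketch of the underlying idea in Python rather than in wiki template syntax: a regular verb is split into a stem and an infinitive ending, and a fixed table of endings (a paradigm) is attached to the stem. The names `ENDINGS`, `PERSONS` and `conjugate_present` are hypothetical, and the sketch does not reproduce how the Spanish Wiktionary templates are actually written. It covers only the present indicative of fully regular verbs.

```python
# Present indicative endings for regular Spanish verbs, keyed by infinitive ending.
ENDINGS = {
    "ar": ["o", "as", "a", "amos", "áis", "an"],
    "er": ["o", "es", "e", "emos", "éis", "en"],
    "ir": ["o", "es", "e", "imos", "ís", "en"],
}
PERSONS = ["yo", "tú", "él/ella", "nosotros", "vosotros", "ellos/ellas"]

def conjugate_present(infinitive):
    """Return the six present indicative forms of a regular verb."""
    stem, suffix = infinitive[:-2], infinitive[-2:]
    if suffix not in ENDINGS:
        raise ValueError(f"not a regular -ar/-er/-ir infinitive: {infinitive!r}")
    return [stem + ending for ending in ENDINGS[suffix]]

for person, form in zip(PERSONS, conjugate_present("hablar")):
    print(person, form)
```

Irregular and stem-changing verbs (for example "tener" or "pensar") would produce wrong forms with this table, which is why the activity is limited to regular verbs.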


Each activity has a suggested associated background reading selection.


Additional helpful readings include:

Active participants

Active participants in this Learning Group

  1. --Javier Carro 08:34, 7 October 2006 (UTC)
  2. --CQ 02:47, 30 November 2006 (UTC)
  3. --Dionysios (talk), Founder of the Wikiversity School of Advanced General Studies, Date: 2007-09-25 (September 25, 2007) Time: 2128 UTC
  4. --Copyleft 22:29, 22 June 2009 (UTC)
  5. -- ...


