Geographic distribution: Southeast Asia and the Pacific
ISO 639-5: poz

The Malayo-Polynesian languages are a subgroup of the Austronesian languages, with approximately 351 million speakers. They are widely dispersed throughout the island nations of Southeast Asia and the Pacific Ocean, with a smaller number in continental Asia. Malagasy is a geographic outlier, spoken on the island of Madagascar in the Indian Ocean.

A characteristic of the Malayo-Polynesian languages is a tendency to use reduplication (repetition of all or part of a word, such as wiki-wiki) to express the plural. Like other Austronesian languages, they have simple phonologies, so a text draws on few sounds, each occurring frequently. The majority also lack consonant clusters (e.g., [str] or [mpt] in English), and most have only a small set of vowels, five being a common number.


Map: The western Malayo-Polynesian languages, under the simplifying classification of Wouk & Ross (2002). Legend: Borneo-Philippines (not shown: Yami in Taiwan); Sunda-Sulawesi (not shown: Chamorro); Central Malayo-Polynesian; Halmahera-Geelvink Bay; the westernmost Oceanic languages.

The Malayo-Polynesian languages share several phonological and lexical innovations with the eastern Formosan languages, including the merger of Proto-Austronesian *t and *C as /t/ and of *n and *N as /n/, a shift of *S to /h/, and vocabulary such as *lima "five", which is not attested in other Formosan languages. However, Malayo-Polynesian does not align with any one Formosan branch. A 2008 analysis of the Austronesian Basic Vocabulary Database suggests that the closest connection is with Paiwan, though it assigns that connection only a 75% confidence level.

Malayo-Polynesian has traditionally been divided into Western ("Hesperonesian"), Central, and Eastern branches. While Central MP has almost no support from the data, and Eastern MP is dubious, a united Central-Eastern branch is reasonably well supported, receiving an 80% confidence level in the 2008 analysis. However, the Western branch is a purely remnant grouping: it is defined as those Malayo-Polynesian languages which fall outside the Central-Eastern branch.

Wouk and Ross (2002) proposed a Nuclear Malayo-Polynesian branch, based on a consistent simplification of the Austronesian alignment in the syntax of the Proto-Malayo-Polynesian language, which is found throughout Indonesia apart from much of Borneo and the north of Sulawesi. Because Nuclear MP included some Western MP languages along with Central-Eastern MP, Wouk and Ross split Western MP into an "Inner" group on Sulawesi and the Sunda Islands, which together with Central-Eastern formed Nuclear Malayo-Polynesian, and an "Outer" group on Borneo and the Philippines. Both are remnant groups with negative definitions: Outer WMP (Borneo-Philippines) comprises those Malayo-Polynesian languages which are not Nuclear, while Inner WMP (Sunda-Sulawesi) comprises those Nuclear languages which are not Central-Eastern. Although Nuclear MP was defined using syntactic data, it finds moderate support from lexical data.
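The negative definitions above are, in effect, set differences. The following is a minimal illustrative sketch, not part of Wouk and Ross's work: the variable names are hypothetical, and each set holds only a handful of example languages taken from this article, whereas the real groups contain hundreds.

```python
# Toy membership sets (assumption: tiny samples chosen for illustration only).
central_eastern = {"Hawaiian", "Samoan", "Gilbertese"}

# Nuclear MP = Central-Eastern plus some languages traditionally called Western.
nuclear = central_eastern | {"Javanese", "Sundanese", "Malay"}

# All of Malayo-Polynesian = Nuclear plus the non-Nuclear languages.
malayo_polynesian = nuclear | {"Tagalog", "Cebuano", "Malagasy"}

# Traditional Western MP: everything outside Central-Eastern.
western_mp = malayo_polynesian - central_eastern

# Outer WMP (Borneo-Philippines): Malayo-Polynesian but not Nuclear.
outer_wmp = malayo_polynesian - nuclear

# Inner WMP (Sunda-Sulawesi): Nuclear but not Central-Eastern.
inner_wmp = nuclear - central_eastern

print(sorted(outer_wmp))
print(sorted(inner_wmp))
```

Under these definitions the Inner and Outer groups never overlap, and together they exhaust the traditional Western branch, which is why neither needs a shared innovation of its own to be delimited.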

The 2008 analysis found three branches of Malayo-Polynesian that are fully supported by the lexical data. These were the Philippine languages, including some languages of northern Sulawesi; Sama-Bajaw, of the Sulu Archipelago between the Philippines and Borneo; and the Indo-Melanesian languages, comprising all the rest. It found moderate (75%) support for Sama-Bajaw forming a unit with the Philippine languages. Within Indo-Melanesian, it found moderate (75%) support for Nuclear Malayo-Polynesian, and lesser (65%) support for the Bornean languages as a valid group.

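Thus the internal structure of Malayo-Polynesian suggested by the 2008 study is:

Malayo-Polynesian
    Philippine languages and Sama-Bajaw (75% support as a unit)
        Philippine languages
        Sama-Bajaw
    Indo-Melanesian
        Bornean (65%)
        Nuclear Malayo-Polynesian (75%)
            Sunda-Sulawesi (remnant group)
            Central-Eastern (80%)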

The Philippine languages are spoken by 90 million people and include Tagalog, Cebuano, Ilokano, Hiligaynon, Bikolano, Kapampangan, and Waray-Waray, each with at least three million speakers.

The most widely spoken Bornean language is Malagasy, with 20 million speakers; although it is spoken in Madagascar, it is classified with the languages of Borneo.

The Sunda-Sulawesi languages (Nuclear languages outside Central-Eastern) are spoken by about 230 million people and include Malay (Indonesian and Malaysian), Sundanese, Javanese, Balinese, Acehnese, Chamorro of Guam, and Palauan.

Central-Eastern includes the Oceanic languages, among them Micronesian languages such as Gilbertese and Nauruan, and Polynesian languages such as Hawaiian, Māori, Samoan, Tahitian, and Tongan.


