English phonology is the study of the phonology (i.e., the sound system) of the English language. Like all other languages, spoken English has wide variation in its pronunciation both diachronically and synchronically from dialect to dialect. This variation is especially salient in English, because the language is spoken over such a wide territory, being the predominant language in Australia, Canada, the Commonwealth Caribbean, Ireland, New Zealand, the United Kingdom and the United States, in addition to being spoken as a first or second language by people in countries on every continent, notably South Africa and India. In general, the regional dialects of English are mutually intelligible.
Although there are many dialects of English, the following are usually used as prestige or standard accents: Received Pronunciation for the United Kingdom, General American for the United States and General Australian for Australia.
The number of speech sounds in English varies from dialect to dialect, and any actual tally depends greatly on the interpretation of the researcher doing the counting. The Longman Pronunciation Dictionary by John C. Wells, for example, using symbols of the International Phonetic Alphabet, denotes 24 consonants and 23 vowels used in Received Pronunciation, plus two additional consonants and four additional vowels used in foreign words only. For General American, it provides for 25 consonants and 19 vowels, with one additional consonant and three additional vowels for foreign words. The American Heritage Dictionary, on the other hand, suggests 25 consonants and 18 vowels (including r-colored vowels) for American English, plus one consonant and five vowels for non-English terms.
The following table shows the consonant phonemes found in most dialects of English. When consonants appear in pairs, fortis consonants (i.e., aspirated or voiceless) appear on the left and lenis consonants (i.e., lightly voiced or voiced) appear on the right:
|Plosive||p b||t d||k ɡ|
|Fricative||f v||θ ð||s z||ʃ ʒ||(x)||h|
|Approximant||ɹ||j||w|
|/ɹ/||run (also /r/, /ɻ/)||/j/||yes|
Although regional variation is very great across English dialects, some generalizations can be made about pronunciation in all (or at least the vast majority) of English accents:
The vowels of English differ considerably between dialects. Because of this, corresponding vowels may be transcribed with various symbols depending on the dialect under consideration. When considering English as a whole, no specific phonemic symbols are chosen over others; instead, lexical sets are used, each named by a word containing the vowel in question. For example, the vowel of the LOT set ("short o") is transcribed /ɒ/ in Received Pronunciation, /ɔ/ in Australian English, and /ɑ/ in General American. For an overview of the correspondences, see IPA chart for English dialects.
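As a rough illustration of how a lexical set decouples a vowel's name from its dialect-specific symbol, the lookup below stores the LOT values given above. The dictionary layout and the function name `transcribe_set` are assumptions made for this sketch, and only the LOT row is taken from the text.

```python
# Lexical-set lookup: the keyword names the set, and each dialect maps it
# to its own phonemic symbol. Only the LOT row comes from the text above.
LEXICAL_SETS = {
    "LOT": {
        "Received Pronunciation": "ɒ",
        "Australian English": "ɔ",
        "General American": "ɑ",
    },
}


def transcribe_set(keyword, dialect):
    """Return the phonemic symbol used for a lexical set in one dialect."""
    return "/" + LEXICAL_SETS[keyword][dialect] + "/"


for dialect in LEXICAL_SETS["LOT"]:
    print("LOT", dialect, transcribe_set("LOT", dialect))
```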
The monophthong phonemes of General American differ in a number of ways from Received Pronunciation:
Reduced vowels occur in some unstressed syllables. (Other unstressed syllables may have full vowels, which some dictionaries mark as secondary stress.) The number of distinctions made among reduced vowels varies by dialect. In some dialects vowels are centralized but otherwise kept mostly distinct, while in Australia, New Zealand and many US dialects all reduced vowels collapse to a schwa [ə]. In Received Pronunciation, there is a distinct high reduced vowel, which the OED writes ‹
Linguists such as Ladefoged and Bolinger argue that vowel reduction is phonemic in English, and that there are two "tiers" of vowels in English, full and reduced; traditionally many English dictionaries have attempted to mark the distinction by transcribing unstressed full vowels as having "secondary" stress, though this was later abandoned by the Oxford English Dictionary. Though full unstressed vowels may derive historically from stressed vowels, either because stress shifted over time (such as stress shifting away from the final syllable of French loan words in British English) or because of loss or shift of stress in compound words or phrases (óverseas vóyage from overséas or óverséas plus vóyage), the distinction is not one of stress but of vowel quality (Bolinger 1989:351), and over time, if the word is frequent enough, the vowel will tend to reduce.
English has up to five reduced vowels, though this varies with dialect and speaker. Schwa /ə/ is found in all dialects, and a rhotic schwa ("schwer") /ɚ/ is found in rhotic dialects. Less common is a high reduced vowel ("schwi") /ɪ̈/ (also "/ ɪ/"); many people distinguish schwa from schwi in Rosa's /ˈroʊzəz/ vs roses /ˈroʊzɪ̈z/. More unstable is a rounded schwa, /ö/ (also /ɵ/); this contrasts for some speakers in a mission /əˈmɪʃən/, emission /ɪ̈ˈmɪʃən/, and omission /ɵˈmɪʃən/. In words like following, the next vowel is preceded by a [w] even in dialects which do not otherwise have a rounded schwa: [ˈfɒlɵwɪŋ, ˈfɒləwɪŋ]. A high rounded schwa /ʊ̈/ (also "/ ʊ/") may be found in words such as into /ˈɪntʊ̈/, though in many dialects this is not distinguished from /ɵ/.
Though speakers vary, full and reduced unstressed vowels may contrast in pairs of words like Shogun /ˈʃoʊɡʌn/ and slogan /ˈsloʊɡən/, chickaree /ˈtʃɪkəriː/ and chicory /ˈtʃɪkərɪ̈/, Pharaoh /ˈfɛəroʊ/ and farrow /ˈfæroʊ/ (Bolinger 1989:348), Bantu /ˈbæntuː/ and into /ˈɪntʊ̈/ (OED).
The choice of which symbols to use for phonemic transcriptions may reveal theoretical assumptions or claims on the part of the transcriber. English "lax" and "tense" vowels are distinguished by a synergy of features, such as height, length, and contour (monophthong vs. diphthong); different traditions in the linguistic literature emphasize different features. For example, if the primary feature is thought to be vowel height, then the non-reduced vowels of General American English may be represented as follows:
|General American full vowels,
vowel height distinctive
If, on the other hand, vowel length is considered to be the deciding factor, the following symbols may be chosen:
|General American full vowels,
vowel length distinctive
(This convention has sometimes been used because the publisher did not have IPA fonts available, though that is seldom an issue any longer.)
If vowel transition is taken to be paramount, then the chart may look like one of these:
Many linguists combine more than one of these features in their transcriptions, suggesting they consider the phonemic differences to be more complex than a single feature.
|General American full vowels,
height & length distinctive
Stress is phonemic in English. For example, the words desert and dessert are distinguished by stress, as are the noun a record and the verb to record. Stressed syllables in English are louder than non-stressed syllables, as well as being longer and having a higher pitch. They also tend to have a fuller realization than unstressed syllables.
Examples of stress in English words, using capital letters to represent stressed syllables, are HOLiday, aLONE, admiRAtion, confiDENtial, deGREE, and WEAKer. Ordinarily, grammatical words (auxiliary verbs, prepositions, pronouns, and the like) do not receive stress, whereas lexical words (nouns, verbs, adjectives, etc.) must have at least one stressed syllable.
English is a stress-timed language. That is, stressed syllables appear at a roughly steady tempo, and non-stressed syllables are shortened to accommodate this.
Traditional approaches describe English as having three degrees of stress: primary, secondary, and unstressed. However, if stress is defined as relative respiratory force (that is, stressed syllables involve greater pressure from the lungs than unstressed ones), as most phoneticians argue, and is inherent in the word rather than the sentence (that is, it is lexical rather than prosodic), then these traditional approaches conflate two distinct processes: stress on the one hand, and vowel reduction on the other. In this case, primary stress is actually prosodic stress, whereas secondary stress is simple stress in some positions, and an unstressed but not reduced vowel in others. Either way, there is a three-way phonemic distinction: either three degrees of stress, or else stressed, unstressed, and reduced. The two approaches are sometimes conflated into a four-way 'stress' classification: primary (tonic stress), secondary (lexical stress), tertiary (unstressed full vowel), and quaternary (reduced vowel). See secondary stress for details.
In many English words, stress placement differs between the noun and verb senses; nouns of this kind, stressed on the first syllable, are called initial-stress-derived nouns. For example, a rebel [ˈɹɛb.ɫ̩] (stress on the first syllable) is inclined to rebel [ɹɨ.ˈbɛɫ] (stress on the second syllable) against the powers that be. The number of words following this pattern, rather than stressing the second syllable in all circumstances, has doubled every century or so, and now includes the English words object, convict, and addict.
Prosodic stress is extra stress given to words when they appear in certain positions in an utterance, or when they receive special emphasis. It normally appears on the final stressed syllable in an intonation unit. So, for example, when the word admiration is said in isolation, or at the end of a sentence, the syllable ra is pronounced with greater force than the syllable ad. (This is traditionally transcribed as /ˌædmɨˈreɪʃən/.) This is the origin of the primary stress-secondary stress distinction. However, the difference disappears when the word is not pronounced with this final intonation.
Prosodic stress can shift for various pragmatic functions, such as focus or contrast. For instance, consider the dialogue
In this case, the extra stress shifts from the last stressed syllable of the sentence, tomorrow, to the last stressed syllable of the emphasized word, dinner. Compare
Although grammatical words generally do not have lexical stress, they do acquire prosodic stress when emphasized. Compare ordinary
with more emphatic
Most languages of the world syllabify CVCV and CVCCV sequences as /CV.CV/ and /CVC.CV/ or /CV.CCV/, with consonants preferentially acting as the onset of a syllable containing the following vowel. According to one view, English is unusual in this regard, in that stressed syllables attract following consonants, so that ˈCVCV and ˈCVCCV syllabify as /ˈCVC.V/ and /ˈCVCC.V/, as long as the consonant cluster CC is a possible syllable coda. In addition, according to this view, /r/ preferentially syllabifies with the preceding vowel even when both syllables are unstressed, so that CVrV occurs as /CVr.V/. However, many scholars do not agree with this view.
The syllable structure in English is (C)³V(C)⁵, that is, up to three consonants in the onset and up to five in the coda, with a near maximal example being strengths (/ˈstrɛŋkθs/, although it can be pronounced /ˈstrɛŋθs/). Because of an extensive pattern of articulatory overlap, English speakers rarely produce an audible release in consonant clusters. This can lead to cross-articulations that seem very much like deletions or complete assimilations. For example, hundred pounds may sound like [hʌndɹɛb pʰaʊndz], but X-ray and electropalatographic studies demonstrate that inaudible and possibly weakened contacts may still be made, so that the second /d/ in hundred pounds does not entirely assimilate a labial place of articulation; rather, the labial co-occurs with the alveolar one.
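The syllable template above can be read as a simple counting constraint. The sketch below checks a consonant/vowel skeleton against it, assuming the input has already been reduced to a string of `C` and `V` symbols; the name `fits_template` is hypothetical. It only counts segments, so it accepts many skeletons whose actual clusters English does not allow.

```python
import re

# The template read as a counting constraint: zero to three onset consonants,
# one vowel nucleus, zero to five coda consonants.
# The input is a CV skeleton (e.g. "CCCVCCCC"), not an IPA transcription.
SYLLABLE_TEMPLATE = re.compile(r"C{0,3}VC{0,5}")


def fits_template(skeleton):
    """Return True if the whole skeleton matches the syllable template."""
    return SYLLABLE_TEMPLATE.fullmatch(skeleton) is not None


print(fits_template("CCCVCCCC"))  # strengths /strɛŋkθs/: s-t-r, ɛ, ŋ-k-θ-s
print(fits_template("CCCCV"))     # four onset consonants exceed the template
```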
When a stressed syllable contains a pure vowel (rather than a diphthong), followed by a single consonant and then another vowel, as in holiday, many native speakers feel that the consonant belongs to the preceding stressed syllable, /ˈhɒl.ɨ.deɪ/. However, when the stressed vowel is a long vowel or diphthong, as in admiration or pekoe, speakers agree that the consonant belongs to the following syllable: /ˈæd.mɨ.ˈreɪ.ʃən/, /ˈpiː.koʊ/. Wells (1990) notes that consonants syllabify with the preceding rather than following vowel when the preceding vowel is the nucleus of a more salient syllable, with stressed syllables being the most salient, reduced syllables the least, and secondary stress / full unstressed vowels intermediate. There are lexical differences as well, frequently but not exclusively in compound words. For example, in dolphin and selfish, he argues that the stressed syllable ends in /lf/; in shellfish, the /f/ belongs with the following syllable: /ˈdɒlf.ɪn/, /ˈsɛlf.ɪʃ/ → [ˈdɒlfɨn], [ˈsɛlfɨʃ] vs /ˈʃɛl.fɪʃ/ → [ˈʃɛlˑfɪʃ], where the /l/ is a little longer and the /ɪ/ not reduced. Similarly, in toe-strap the /t/ is a full plosive, as usual in syllable onset, whereas in toast-rack the /t/ is in many dialects reduced to the unreleased allophone it takes in syllable codas, or even elided: /ˈtoʊ.stræp/, /ˈtoʊst.ræk/ → [ˈtʰoˑʊstɹæp], [ˈtoʊs(t̚)ɹʷæk]; likewise nitrate /ˈnaɪ.treɪt/ → [ˈnʌɪtɹ̥ʷeɪt] with a voiceless /r/, vs night-rate /ˈnaɪt.reɪt/ → [ˈnʌɪt̚ɹʷeɪt] with a voiced /r/. Cues of syllable boundaries include aspiration of syllable onsets and (in the US) flapping of coda /t, d/ (a tease /ə.ˈtiːz/ → [əˈtʰiːz] vs. at ease /æt.ˈiːz/ → [æɾˈiːz]), epenthetic plosives like [t] in syllable codas (fence /ˈfɛns/ → [ˈfɛnts] but inside /ɪn.ˈsaɪd/ → [ɪnˈsaɪd]), and r-colored vowels when the /r/ is in the coda vs. labialization when it is in the onset (key-ring /ˈkiː.rɪŋ/ → [ˈkʰiːɹʷɪŋ] but fearing /ˈfiːr.ɪŋ/ → [ˈfɪəɹɪŋ]).
There is an ongoing sound change (yod-dropping) by which /j/ as the final consonant in a cluster is being lost. In RP, words with /sj/ and /lj/ can usually be pronounced with or without this sound, e.g., [suːt] or [sjuːt]. For some speakers of English, including some British speakers, the sound change is more advanced and so, for example, in General American /j/ is also not present after /n/, /l/, /s/, /z/, /θ/, /t/ and /d/. In Welsh English it can occur in more combinations, for example in /tʃj/.
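The General American pattern described above can be sketched as a rule over a list of phonemes. This is a simplified illustration: the function name `drop_yod` is hypothetical, and the input is assumed to be the phonemes of a single stressed syllable, so cases where /j/ follows one of these consonants across a syllable boundary are out of scope.

```python
# Consonants after which /j/ is absent in General American, per the text.
GA_YOD_DROP_AFTER = {"n", "l", "s", "z", "θ", "t", "d"}


def drop_yod(phonemes):
    """Remove /j/ when it directly follows one of the listed consonants.

    `phonemes` is assumed to hold the segments of one stressed syllable.
    """
    result = []
    for index, phoneme in enumerate(phonemes):
        follows_trigger = index > 0 and phonemes[index - 1] in GA_YOD_DROP_AFTER
        if phoneme == "j" and follows_trigger:
            continue  # yod is dropped in this environment
        result.append(phoneme)
    return result


print(drop_yod(["s", "j", "uː", "t"]))  # suit: /j/ removed after /s/
print(drop_yod(["k", "j", "uː", "t"]))  # cute: /j/ kept after /k/
```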
The following can occur as the onset:
|All single consonant phonemes except /ŋ/|
|Plosive plus approximant other than /j/: /pl/, /bl/, /kl/, /ɡl/,||play, blood, clean, glove, prize, bring, tree, dream, crowd, green, twin, dwarf, language, quick|
|Voiceless fricative plus approximant other than /j/:||floor, sleep, friend, three, shrimp, swing, thwart, which|
|Consonant plus /j/: /pj/, /bj/, /tj/, /dj/, /kj/, /ɡj/,||pure, beautiful, tube, during, cute, argue, music, new, few, view, thew, suit, Zeus, huge, lurid|
|/s/ plus voiceless plosive: /sp/, /st/, /sk/||speak, stop, skill|
|/s/ plus nasal:|
|/s/ plus voiceless plosive plus||split, spring, street, scream, square, smew, spew, student, skewer|
Certain English onsets appear only in contractions: e.g., /zbl/ ('sblood), /zd/ (sdein), and /zw/ or /dzw/ ('swounds or 'dswounds). Some, such as /pʃ/ (pshaw) or /fw/ (fwoosh), can occur in interjections. An archaic voiceless fricative plus nasal exists, /fn/ (fnese).
A few other onsets occur in further (anglicized) loan words, including /bw/ (bwana), /mw/ (moiré), /nw/ (noire), /pw/ (pueblo), /zw/ (zwieback), /vw/ (voilà), /kv/ (kvetch), /ʃv/ (schvartze), /tv/ (Tver), /vl/ (Vladimir), and /zl/ (zloty).
Some clusters of this type can be converted to regular English phonotactics by simplifying the cluster: e.g. /(d)z/ (dziggetai), /(h)r/ (Hrolf), /kr(w)/ (croissant), /(p)f/ (pfennig), /(f)θ/ (phthalic), and /(t)s/ (tsunami).
Others can be substituted by native clusters differing only in voice: /zb ~ sp/ (sbirro), and /zgr ~ skr/ (sgraffito).
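The main native onset patterns tabulated above can be collected into a small membership check. This is a sketch under stated assumptions: the cluster inventory is read off the example words in the table (for instance /pr/ from prize and /skw/ from square), /sm/ and /sn/ are supplied for the "/s/ plus nasal" row from general knowledge, the marginal onsets found only in contractions, interjections and loan words are left out, and dialect differences (such as /hw/ in which) are ignored. The names `SINGLE_ONSETS`, `CLUSTER_ONSETS` and `is_possible_onset` are hypothetical.

```python
# Single-consonant onsets: every consonant phoneme except /ŋ/.
SINGLE_ONSETS = set("p b t d k ɡ tʃ dʒ f v θ ð s z ʃ ʒ h m n l r j w".split())

# Cluster onsets, grouped as in the table. Every phoneme in these clusters
# is a single character, so tuple("pl") yields ("p", "l").
CLUSTER_ONSETS = {
    tuple(cluster)
    for cluster in (
        "pl bl kl ɡl pr br tr dr kr ɡr tw dw ɡw kw "     # plosive + approximant
        "fl sl fr θr ʃr sw θw hw "                        # voiceless fricative + approximant
        "pj bj tj dj kj ɡj mj nj fj vj θj sj zj hj lj "   # consonant + /j/
        "sp st sk sm sn "                                 # /s/ + voiceless plosive or nasal
        "spl spr str skr skw smj spj stj skj"             # three-consonant onsets
    ).split()
}


def is_possible_onset(phonemes):
    """Return True if the phoneme sequence is one of the listed onsets."""
    onset = tuple(phonemes)
    if not onset:
        return True  # a syllable may begin with its vowel
    if len(onset) == 1:
        return onset[0] in SINGLE_ONSETS
    return onset in CLUSTER_ONSETS


print(is_possible_onset(["s", "t", "r"]))  # street
print(is_possible_onset(["t", "l"]))       # not a listed onset
```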
The following can occur as the nucleus:
Most, and in theory all, of the following except those which end with /s/, /z/, /ʃ/, /ʒ/, /tʃ/ or /dʒ/ can be extended with /s/ or /z/ representing the morpheme -s/z-. Similarly most, and in theory all, of the following except those which end with /t/ or /d/ can be extended with /t/ or /d/ representing the morpheme -t/d-.
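The two extension rules above can be sketched as functions of the final coda consonant. The exclusion sets come from the text; choosing between /s/ and /z/ (or /t/ and /d/) by the voicing of the preceding consonant is the standard assimilation pattern and is added here as an assumption, as are the names `s_suffix` and `d_suffix`.

```python
# Codas ending in these consonants cannot take the -s/z- extension.
SIBILANTS = {"s", "z", "ʃ", "ʒ", "tʃ", "dʒ"}
# Codas ending in these consonants cannot take the -t/d- extension.
ALVEOLAR_PLOSIVES = {"t", "d"}
# Voiceless coda consonants; everything else is treated as voiced.
VOICELESS = {"p", "t", "k", "f", "θ", "s", "ʃ", "tʃ"}


def s_suffix(final_consonant):
    """Return "s" or "z" to append to the coda, or None if it cannot be extended."""
    if final_consonant in SIBILANTS:
        return None
    return "s" if final_consonant in VOICELESS else "z"


def d_suffix(final_consonant):
    """Return "t" or "d" to append to the coda, or None if it cannot be extended."""
    if final_consonant in ALVEOLAR_PLOSIVES:
        return None
    return "t" if final_consonant in VOICELESS else "d"


print(s_suffix("k"), s_suffix("d"), s_suffix("ʃ"))
print(d_suffix("s"), d_suffix("n"), d_suffix("t"))
```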
Wells (1990) argues that a variety of syllable codas are possible in English, even /ntr, ndr/ in words like entry /ˈɛntr.ɪ/ and sundry /ˈsʌndr.ɪ/, with /tr, dr/ being treated as affricates along the lines of /tʃ, dʒ/. He argues that the traditional assumption that pre-vocalic consonants form a syllable with the following vowel is due to the influence of languages like French and Latin, where syllable structure is CVC.CVC regardless of stress placement. Disregarding such contentious cases, which do not occur at the ends of words, the following sequences can occur as the coda:
|The single consonant phonemes except /h/, /w/, /j/ and, in non-rhotic varieties, /r/|
|Lateral approximant + plosive or affricate: /lp/, /lb/, /lt/, /ld/, /ltʃ/, /ldʒ/, /lk/||help, bulb, belt, hold, belch, indulge, milk|
|In rhotic varieties, /r/ + plosive or affricate: /rp/, /rb/, /rt/, /rd/, /rtʃ/, /rdʒ/, /rk/, /rɡ/||harp, orb, fort, beard, arch, large, mark, morgue|
|Lateral approximant + fricative: /lf/, /lv/, /lθ/, /ls/, /lʃ/||golf, solve, wealth, else, Welsh|
|In rhotic varieties, /r/ + fricative: /rf/, /rv/, /rθ/, /rs/, /rʃ/||dwarf, carve, north, force, marsh|
|Lateral approximant + nasal: /lm/, /ln/||film, kiln|
|In rhotic varieties, /r/ + nasal or lateral: /rm/, /rn/, /rl/||arm, born, snarl|
|Nasal + homorganic plosive or affricate: /mp/, /nt/, /nd/, /ntʃ/, /ndʒ/, /ŋk/||jump, tent, end, lunch, lounge, pink|
|Nasal + fricative: /mf/, /mθ/ in non-rhotic varieties, /nθ/, /ns/, /nz/, /ŋθ/ in some varieties||triumph, warmth, month, prince, bronze, length|
|Voiceless fricative + voiceless plosive: /ft/, /sp/, /st/, /sk/||left, crisp, lost, ask|
|Two voiceless fricatives: /fθ/||fifth|
|Two voiceless plosives: /pt/, /kt/||opt, act|
|Plosive + voiceless fricative: /pθ/, /ps/, /tθ/, /ts/, /dθ/, /dz/, /ks/||depth, lapse, eighth, klutz, width, adze, box|
|Lateral approximant + two consonants: /lpt/, /lfθ/, /lts/, /lst/, /lkt/, /lks/||sculpt, twelfth, waltz, whilst, mulct, calx|
|In rhotic varieties, /r/ + two consonants: /rmθ/, /rpt/, /rps/, /rts/, /rst/, /rkt/||warmth, excerpt, corpse, quartz, horst, infarct|
|Nasal + homorganic plosive + plosive or fricative: /mpt/, /mps/, /ndθ/, /ŋkt/, /ŋks/, /ŋkθ/ in some varieties||prompt, glimpse, thousandth, distinct, jinx, length|
|Three obstruents: /ksθ/, /kst/||sixth, next|
Note: For some speakers, a fricative before /θ/ is elided so that these never appear phonetically: /ˈfɪfθ/ becomes [ˈfɪθ], /ˈsɪksθ/ becomes [ˈsɪkθ], /ˈtwɛlfθ/ becomes [ˈtwɛlθ].
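The elision in the note above can be sketched as a single pass over a phoneme list. The function name `elide_before_theta` and the fricative inventory are assumptions made for this illustration, and stress marks are omitted from the input.

```python
# Fricative phonemes; any of these directly before /θ/ is removed.
FRICATIVES = {"f", "v", "θ", "ð", "s", "z", "ʃ", "ʒ", "h"}


def elide_before_theta(phonemes):
    """Drop a fricative that immediately precedes /θ/."""
    result = []
    for index, phoneme in enumerate(phonemes):
        next_phoneme = phonemes[index + 1] if index + 1 < len(phonemes) else None
        if phoneme in FRICATIVES and next_phoneme == "θ":
            continue  # the fricative is elided for these speakers
        result.append(phoneme)
    return result


print(elide_before_theta(["f", "ɪ", "f", "θ"]))       # fifth
print(elide_before_theta(["s", "ɪ", "k", "s", "θ"]))  # sixth
```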
Around the late 14th century, English began to undergo the Great Vowel Shift, in which
The other long vowels became higher:
Later developments complicate the picture: whereas in Geoffrey Chaucer's time food, good, and blood all had the vowel [oː] and in William Shakespeare's time they all had the vowel [uː], in modern pronunciation good has shortened its vowel to [ʊ] and blood has shortened and lowered its vowel to [ʌ] in most accents. In Shakespeare's day (late 16th-early 17th century), many rhymes were possible that no longer hold today. For example, in his play The Taming of the Shrew, shrew rhymed with woe.
æ-tensing is a phenomenon found in many varieties of American English by which the vowel /æ/ has a longer, higher, and usually diphthongal pronunciation in some environments, usually to something like [eə]. Some American accents, for example that of New York City or Philadelphia, make a marginal phonemic distinction between /æ/ and /eə/ although the two occur largely in mutually exclusive environments.
The bad-lad split refers to the situation in some varieties of southern English English and Australian English, where a long phoneme /æː/ in words like bad contrasts with a short /æ/ in words like lad.
The cot-caught merger is a sound change by which the vowel of words like caught, talk, and tall (/ɔ/) is pronounced the same as the vowel of words like cot, rock, and doll (/ɒ/ in New England, /ɑː/ elsewhere). This merger is widespread in North American English, being found in approximately 40% of American speakers and virtually all Canadian speakers.
The father-bother merger is the pronunciation of the short O /ɒ/ in words such as "bother" identically to the broad A /ɑː/ of words such as "father". It is nearly universal in the United States and Canada, except in New England and the Maritime provinces; many American dictionaries use the same symbol for these vowels in pronunciation guides.