Sounds of English provides an introduction to the attributes of the sound system of the language. It provides information on phonetics, phonology, and orthography. It also explains how to produce the sounds of English with particular focus on the bio-mechanics of articulation. Sounds of English provides background knowledge and understanding which will enable the reader to understand the system of spelling and pronunciation of modern English and its historical roots. The series consists of the following individual posts:
It is recommended that posts be read in the order above. Additional links will be made active as posts are updated.
Continue reading Part 1: Introduction
This post begins a series on the sounds of English. First, a brief discussion of phonetics, phonology, and orthography of English:
To begin with, let’s talk terminology. The concepts of phonetics, phonology, and orthography are interrelated: all involve the sounds of languages and how those sounds are combined and represented in writing. Phonetics (from Greek phone, meaning sound or voice) is, simply put, the study of sounds and how they are produced using the vocal organs (mouth, throat, vocal cords, etc.). Phonology (from Greek phone, meaning sound, and logos, meaning speech: the ‘sounds’ of ‘speech’), on the other hand, is not the study of sounds themselves but of the sound system of a particular language: how sounds are used within that language, and the rules governing them. Orthography (from Greek orthos, meaning correct, and graphein, meaning to write: the way things are written) deals with the way in which a language uses combinations of letters or symbols to represent its sounds. Another way to look at this is that in English, phonetics describes sounds and how they are produced, phonology establishes a set of rules for how to use those sounds (pronunciation), and orthography provides visual representation of those sounds (spellings that equate to those pronunciations).
Within English there are roughly 50 unique sounds (phonetics). These 50 sounds are represented by 26 letters, alone or in combination with one another (orthography). The sound system of English consists of about 2/3 consonants, which are either voiced or voiceless depending on which sounds surround them, and 1/3 vowels, which may be long or short depending on where they fall within a word (phonology).
Of these sounds, vowels are fairly well understood and will not be addressed too heavily in this series. Vowels are also more difficult to discuss definitively because many of them vary by dialect. Consonants shall be the focus of these discussions on English, and to understand consonants, it is necessary to be familiar with the organs of the vocal tract used to produce them. This is the focus of the next post.
The 30+ consonants in English consist of the following types: plosives (stops*), fricatives, affricates, nasals, liquids, and glides.
*Stops are technically the first two parts of a plosive, with the third part being a sudden expelling of air as a release. Without this ‘explosion’ of air, a plosive is merely a stop.
The first three involve some type of halting or obstructing the flow of air. They always occur as voiced and voiceless pairs, with two sounds being produced in mechanically identical ways, but with the only difference between them being the vibration (or lack of vibration) of the vocal cords. The final three types of sounds involve redirection of the air exiting the body without halting or obstructing its flow. These sounds are always voiced, but often occur in more than one form depending on how they are combined with other sounds. Each category is discussed in separate posts later in this series.
Each language has its own orthography, its way of expressing sounds with letters or symbols. These systems vary by language from very similar systems (English, German, Latin) to different but similar systems (Russian, Arabic, Hebrew), to systems that have very little in common with the standard concept of alphabet (Chinese, Japanese, Egyptian hieroglyphs). Because sounds are present in all languages regardless of orthography, linguists needed a way to represent the same sound consistently, no matter in which language it occurs. To represent the full spectrum of sounds without using different orthographic systems, a universal alphabet of sounds has been developed. The IPA, or International Phonetic Alphabet, uses a single symbol for each specific sound. Sometimes these symbols match the letters in English which represent these sounds. Sometimes they do not. IPA symbols are used throughout this series, but don’t worry: they will always be explained, and examples of each sound will be given with normal English spelling.
Continue reading Part 2: Articulation
The term articulation refers to the bio-mechanical process of altering the flow of air through the vocal tract to produce sounds.
Sounds are described not by how they sound to the ear, but rather by how they are produced in the vocal tract. In the posts below dealing with the different sounds in English, sounds are named accordingly, and each is described based on how the vocal organs interact with each other in producing it. In fact, the word articulate actually means move. Sounds are produced by moving the articulators (things that can be moved) within the vocal tract (lips, tongue, etc.). Terminology relating to the vocal organs, articulators, and points of articulation is defined below. Click the head diagram to the right for an interactive map showing the locations and shapes discussed.
Alveolar refers to the alveolar ridge (purple in diagram), which is the flat area just behind the front upper teeth but before the edge of the roof of the mouth. When pronouncing these sounds the tongue touches (/t/, /d/, /n/), or nearly touches (/s/, /z/) the alveolar ridge.
Dental refers to the teeth, particularly the front upper teeth. The tongue touches these teeth when producing the sounds /θ/ as in three and /ð/ as in there. These teeth touch the bottom lip when producing /f/ as in fair and /v/ as in very.
Glottal refers to sounds in which the airway is constricted by tightening the airway in the back of the throat. The primary glottal sounds in English are /h/ as in happy, and the vowels.
Labial refers to the lips. Sounds produced with the lips include /f/ as in fair and /v/ as in very, in which the bottom lip touches the upper front teeth; /b/ as in boy and /p/ as in pop, in which both lips are pressed together to interrupt the airflow; and /m/ as in my, in which the lips come together to fully block the airflow, directing it instead out through the nose.
Lingual refers to the tongue. Most consonants are produced by touching the tongue to another part of the mouth. Vowels are formed by changing the shape of the tongue within the mouth (it’s really a big muscle).
Nasal refers to the nose. Three sounds /m/ as in mom, /n/ as in nice, and /ŋ/ as in ring are nasal, meaning that the flow of air out of the body passes through the nose rather than through the mouth.
Palatal refers to the roof of the mouth (flat purple area in the diagram) behind the alveolar ridge but in front of the velum (see below). The tongue touches the palate in producing the sounds /ʃ/ as in shoe, /ʒ/ as in pleasure, /tʃ/ as in church, and /dʒ/ as in jelly. It almost touches the palate in /r/ as in read and /ɝ/ as in dinner.
Velar refers to the velum (green in the diagram) which is the soft portion of the roof of the mouth at the very rear of the mouth. It is generally the farthest point the tongue can reach by curling backward. Velar sounds are produced when the rear portion of the tongue is brought near the velum (/w/ as in wait), or contacts the velum (/k/ as in cat and /g/ as in good).
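The glossary above can also be read as a simple lookup table. The following is a minimal sketch, not a complete inventory: it records only the example sounds named in each entry (the vowels mentioned under glottal are left out), and the names PLACES and places_for are made up for illustration. It shows that one sound can fall under more than one term, as /f/ does.

```python
# Glossary terms mapped to the example sounds named in each entry above.
# Illustrative only: this is not a full inventory of English sounds.
PLACES = {
    "alveolar": ["t", "d", "n", "s", "z"],
    "dental": ["θ", "ð", "f", "v"],
    "glottal": ["h"],  # the glossary also lists the vowels here
    "labial": ["f", "v", "b", "p", "m"],
    "nasal": ["m", "n", "ŋ"],
    "palatal": ["ʃ", "ʒ", "tʃ", "dʒ", "r", "ɝ"],
    "velar": ["w", "k", "g"],
}


def places_for(sound):
    """Return, in alphabetical order, every glossary term that lists the sound."""
    return sorted(place for place, sounds in PLACES.items() if sound in sounds)


print(places_for("f"))  # ['dental', 'labial']
print(places_for("m"))  # ['labial', 'nasal']
```

Compound labels such as labiodental (lip plus teeth) fall out naturally from a sound appearing under two terms.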
Continue reading Part 3.1: Plosives
Affricates – an affricate is a consonant which begins as a stop (plosive), characterized by a complete obstruction of the outgoing airstream by the articulators and a build-up of air pressure in the mouth, and then releases as a fricative, a sound produced by forcing air through a constricted opening, which creates turbulence. Depending on which parts of the vocal tract are used to constrict the airflow, that turbulence gives the sound a specific character (compare pita with pizza: the only difference is the release in /t/ and /ts/). There are two types of affricate in English. For an interactive example of each sound (including descriptive animation and video), click this link, then in the window that opens, click affricate, and select the appropriate sound.
/ts/ /dz/ lingua-alveolar affricates
A lingua-alveolar (from lingua, tongue, and alveola, the ridge just behind the front upper teeth) affricate is a sound in which the flow of air out of the body is initially interrupted in the same manner as a lingua-alveolar stop /t/ or /d/, then immediately released in the same manner as a lingua-alveolar fricative /s/ or /z/. The release is constricted by bringing the tongue to the alveolar ridge (the part of the roof of the mouth just behind the upper front teeth), creating a narrow opening through which the air passes. English has two lingua-alveolar affricates: voiceless /ts/ as in pizza and its, and voiced /dz/ as in ads and adze.
/tʃ/ /dʒ/ postalveolar affricates
A postalveolar (from post-, after, and alveola, the ridge just behind the front upper teeth) affricate is a sound which is a combination of a lingua-alveolar stop /t/ or /d/ and a lingua-palatal fricative /ʃ/ or /ʒ/. Because a postalveolar affricate is a combination of two sounds with different points of articulation (in this case, the spot where the tip of the tongue contacts the top of the mouth), its point of articulation falls between that of its two component sounds. In a lingua-alveolar stop, the tongue interrupts the flow of air by pressing against the alveolar ridge, the part of the roof of the mouth just behind the upper front teeth. In a lingua-palatal fricative, the flow of air out of the body is constricted by very nearly touching the tongue to the hard palate (the part of the roof of the mouth just behind the alveolar ridge), creating a narrow opening through which the air passes. In a postalveolar affricate, the point of articulation for both the stop and the fricative release falls between these two positions, just behind the alveolar ridge but not quite on the hard palate. English has two postalveolar affricates: voiceless /tʃ/ as in cheese, catch, and ligature, and voiced /dʒ/ as in judge, magic, and jam.
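One way to picture the description above is to treat each affricate as a stop plus a fricative release, tagged with voicing and place. This is only a sketch under that assumption; the Affricate class, the AFFRICATES list, and the partner function are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Affricate:
    stop: str      # the stop that begins the sound
    release: str   # the fricative that releases it
    voiced: bool   # whether the vocal cords vibrate
    place: str     # point of articulation as named in the text

    @property
    def symbol(self):
        # The written symbol is simply the stop followed by the release.
        return self.stop + self.release


# The four affricates described in this post.
AFFRICATES = [
    Affricate("t", "s", False, "lingua-alveolar"),
    Affricate("d", "z", True, "lingua-alveolar"),
    Affricate("t", "ʃ", False, "postalveolar"),
    Affricate("d", "ʒ", True, "postalveolar"),
]


def partner(affricate):
    """Find the affricate with the same place but the opposite voicing."""
    for other in AFFRICATES:
        if other.place == affricate.place and other.voiced != affricate.voiced:
            return other
    return None


print(partner(AFFRICATES[0]).symbol)  # dz
print(partner(AFFRICATES[3]).symbol)  # tʃ
```

The pairing step mirrors the point that each voiceless affricate has a voiced counterpart produced at the same place.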
Continue reading Part 3.4: Nasals, Liquids, & Glides
The first three groups of sounds in English (plosives, fricatives, and affricates) are collectively referred to as obstruents (because they obstruct the airway). Each of these sounds involves some type of halting or obstructing the flow of air. Obstruents always occur as voiced and voiceless pairs, with two sounds being produced identically from a mechanical standpoint (which articulators do what), the only difference between them being the use of the vocal cords. In contrast, the final three types of sounds involve redirection of the air exiting the body without halting or obstructing its flow. These sounds are called sonorants. The word sonorant is a combination of sonorous (having strong resonant sound) and consonant. The name refers to the fact that these sounds reverberate or echo off the vocal organs with the breath exiting freely through either the nose or mouth (versus obstruents, where the air is constricted or obstructed so that it cannot flow freely). In English, sonorants are always voiced, but often occur in more than one form depending on how they are combined with other sounds. There are three categories of sonorants: nasals, liquids, and glides.
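The split described above can be sketched as a small classification function. This is an illustration of the paragraph's own rule (obstruents come in voiced and voiceless pairs, sonorants are treated as always voiced), not a general phonological model; the function and set names are invented for the example.

```python
# Manner categories as grouped in the paragraph above.
OBSTRUENT_MANNERS = {"plosive", "fricative", "affricate"}
SONORANT_MANNERS = {"nasal", "liquid", "glide"}


def sound_class(manner):
    """Classify a manner of articulation as 'obstruent' or 'sonorant'."""
    if manner in OBSTRUENT_MANNERS:
        return "obstruent"
    if manner in SONORANT_MANNERS:
        return "sonorant"
    raise ValueError(f"unknown manner: {manner}")


def has_voiceless_partner(manner):
    """Per the rule above, only obstruents occur in voiced/voiceless pairs."""
    return sound_class(manner) == "obstruent"


print(sound_class("affricate"))        # obstruent
print(sound_class("glide"))            # sonorant
print(has_voiceless_partner("nasal"))  # False
```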
Nasals – a nasal is a consonant produced by redirecting air out through the nose instead of allowing it to escape out of the mouth. In producing nasals, the throat and mouth act as a resonator, or place where the sound echoes about before exiting the body (in the same way that sound bounces around inside the body of a guitar or violin). The specific sound qualities of nasals differ depending on which parts of the vocal tract are used to stop the airflow and send it to the nose. Types of nasals derive their names from the articulators used. Nasals occur in pairs of very similar sounds, syllable-initial nasals and syllable-final nasals, in which the order of articulation is reversed. In other words, for the syllable-final sound, the steps required to produce the syllable-initial sound are performed in reverse order. There are three types of nasal in English. For an interactive example of each sound (including descriptive animation and video), click this link, then in the window that opens, click nasal, and select the appropriate sound (only syllable-initial sounds are represented).
/m/ /m̩/ bilabial nasals
A bilabial (from bi- two and labia lip) nasal is a sound in which the flow of air out of the body is redirected from the mouth to be made to exit through the nose by pressing both lips together, fully closing the mouth. This allows the entire mouth to act as a resonance chamber resulting in the unique full sound. English has two bilabial nasals – /m/ which occurs at the beginning of a syllable (syllable-initial) as in make, mother and hammer, and syllable-final /m̩/ which occurs at the end of a syllable as in rhythm, mom, and imply.
Production of syllable-initial /m/ is begun with the lips together, the vocal cords vibrating, and air escaping through the nose; finally the jaw is dropped which parts the lips and opens the mouth resulting in a release, restoring the usual flow of air through the mouth. For syllable-final /m̩/, the order is reversed beginning with vocal cords made to vibrate while air is allowed to escape through the mouth, then the jaw is raised and lips brought together to seal the mouth, redirecting the already flowing air through the nose. Sound is simply ended as there is no release.
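The reversal described above can be sketched as a sequence of lip and airflow states. This is a deliberately simplified model: it assumes the vocal cords vibrate throughout and it ignores the release, which the syllable-final form lacks. The names INITIAL_M and syllable_final are hypothetical.

```python
# Syllable-initial /m/ as an ordered list of (lip position, airflow) states.
# Voicing is assumed to continue throughout, so it is not listed.
INITIAL_M = [
    ("lips closed", "air exits through the nose"),
    ("lips open", "air exits through the mouth"),
]


def syllable_final(initial_steps):
    """Return the same states in reverse order, as for a syllable-final nasal."""
    return list(reversed(initial_steps))


FINAL_M = syllable_final(INITIAL_M)

for lips, airflow in FINAL_M:
    print(f"{lips}: {airflow}")
# lips open: air exits through the mouth
# lips closed: air exits through the nose
```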
/n/ /n̩/ alveolar nasals
An alveolar (from alveola, the ridge just behind the front upper teeth) nasal is a sound in which the flow of air out of the body is redirected from the mouth to be made to exit through the nose by touching the tongue to the alveolar ridge, the part of the roof of the mouth just behind the upper front teeth. This allows the rear portion of the mouth to act as a resonance chamber, resulting in a sound slightly more shallow than that of bilabial nasals. English has two alveolar nasals – /n/ which occurs at the beginning of a syllable (syllable-initial) as in need, know and running, and syllable-final /n̩/ which occurs at the end of a syllable as in can, nine, and given.
Production of syllable-initial /n/ is begun with the tongue pressed against the alveolar ridge, the vocal cords vibrating, and air escaping through the nose; finally the tongue is lowered, resulting in a release and restoring the usual flow of air through the mouth. For syllable-final /n̩/, the order is reversed, beginning with vocal cords made to vibrate while air is allowed to escape through the mouth; then the tongue is raised and pressed against the alveolar ridge, redirecting the already flowing air through the nose. Sound is simply ended with the tongue still pressed to the alveolar ridge as there is no release.
/ŋ̯/ /ŋ/ velar nasals
A velar (from velum, the soft palate) nasal is a sound in which the flow of air out of the body is redirected from the mouth to be made to exit through the nose by pressing the back of the tongue to the velum, the soft part of the roof of the mouth farthest from the front teeth (about as far back in the mouth as can be reached with the tip of the tongue). This allows only the throat to act as a resonance chamber, resulting in a shallow sound which is ended with a reduced velar stop. English has two velar nasals – /ŋ/ which occurs at the end of a syllable (syllable-final) as in ring, singer and meaning, and syllable-initial /ŋ̯/ which occurs only at the beginning of certain foreign words such as the Vietnamese surname Nguyen.
Production of syllable-final /ŋ/ is begun with the vocal cords vibrating while air is allowed to escape through the mouth; then the back of the tongue is raised and pressed against the velum, sealing the mouth and redirecting the already flowing air through the nose. Sound is ended by interrupting the flow of air with the velar stop /g/ (although the /g/ ending /ŋ/ is much weaker than the standalone lingua-velar stop). Syllable-initial /ŋ̯/ is produced similarly except that production is begun with the tongue pressed against the velum, with the initial voicing being wholly nasal. /ŋ̯/ ends in a /g/ as a velar plosive release.
Liquids – a liquid is a consonant produced when the tongue approaches a point of articulation within the mouth but does not come close enough to obstruct or constrict the flow of air enough to create turbulence (as with fricatives). Unlike nasals, the flow of air is not redirected into the nose. Instead, with liquids the air is still allowed to escape via the mouth, but its direction of flow is altered by the tongue sending it in different directions within the mouth before exiting the lips. The unique sound of each liquid is affected by the position of the tongue and the way in which the exhaling air is directed around it. There are two primary types of liquids: laterals, in which the air is directed toward the sides of the mouth, and non-laterals, in which the flow of air is altered but still directed forward. The individual sounds of each type derive their names from the points of articulation toward which the tongue is positioned. Like nasals, liquids occur in sets of very similar sounds: syllable-initial, syllable-final, and in the case of non-laterals a third form, the trill. For an interactive example of each sound (including descriptive animation and video), click this link, then in the window that opens, click liquid, and select the appropriate sound (only syllable-final sounds are represented).
/l/ /ɫ/ lateral liquids
A lateral (from Latin lateralis, of the side) liquid is a sound in which the flow of air out of the body is redirected around the tongue and toward the sides of the mouth before exiting through the lips. English has two lateral liquids. The first is the alveolar lateral approximant /l/, in which the tongue is brought near (approximates) the alveolar ridge, forcing the air around the tongue toward the sides (lateral) of the mouth before being allowed to exit. /l/ occurs in syllable-initial position, for example in like, melon, and hello. The syllable-final sound /ɫ/ is referred to as a velarized alveolar lateral approximant, meaning that in addition to the tip of the tongue being brought near the alveolar ridge, the back of the tongue is raised toward the velum as well. /ɫ/ occurs in syllable-final position, for example in full, little, and belfry. As with nasals, the order of articulation is reversed between syllable-initial and syllable-final laterals.
/ɹ/ /ɻ/ /r/ non-lateral liquids
A non-lateral (from Latin non, not, and lateralis, of the side) liquid is a sound in which the flow of air out of the body is altered by the shape of the tongue, usually flowing over the tongue and resonating near the roof of the mouth (but not toward the sides of the mouth) before exiting through the lips. English has three non-lateral liquids, with most dialects having two (rhotic), some having a third (trill), and some having only one (R-dropping).
In syllable-initial /ɹ/ as in rabbit, run, and borrow, referred to as a retroflex approximant, the tongue is brought forward, then curled backward toward the roof of the mouth (retroflexion). It comes near (approximates) the roof of the mouth but does not touch it. The sound is released by lowering the jaw and drawing the tongue back to neutral position. This is the most common r-sound in English.
Common in most dialects, syllable-final /ɻ/ is similar to the syllable-initial form. Depending on the accent of the speaker, this sound may be either an alveolar approximant or a retroflex approximant (some speakers place the tongue closer to the alveolar ridge; others put it in the same position as syllable-initial /ɹ/). The primary difference between syllable-initial and syllable-final forms is that the syllable-final sound begins and ends with the tongue and jaw in the approximant position. This differs from syllable-initial position, which ends with the jaw lowering and the tongue returning to resting position. Compare movement within the mouth between /ɹ/ in red and Robert, and /ɻ/ in car, better, and urgent.
Finally, some dialects possess a third non-lateral liquid /r/ known as a trill (and in lesser form a flap). These sounds are often referred to as rolled-r. In producing this sound the tongue is quickly and lightly (and in longer trills, repeatedly) brought into contact with the alveolar ridge. Otherwise the /r/ is produced in the same manner as syllable-initial /ɹ/ or syllable-final /ɻ/ depending on position. The sound /r/ is a primary characteristic of many Scottish accents and is also found in certain Spanish loanwords in North American English including burrito and perro.
Glides – a glide, like a liquid, is a consonant produced when the tongue approaches a point of articulation within the mouth but does not come close enough to obstruct or constrict the flow of air enough to create turbulence. Unlike nasals, the flow of air is not redirected into the nose. Instead, as with liquids, the air is still allowed to escape via the mouth, but its direction of flow is altered by having it glide over the tongue before exiting the lips. The unique sound of each glide is affected by the point at which the tongue is brought closest to the point of articulation. The primary difference between liquids and glides is that with a liquid, the tip of the tongue is used, whereas with glides, the body of the tongue and not the tip is raised. This provides a wide, shallow space over which air passes before exiting the mouth. There are two primary types of glide in English: labiovelar and palatal. Each type derives its name from the points of articulation toward which the tongue is positioned. Like nasals and liquids, glides occur in sets of very similar sounds, and in Old English there were a variety of these sounds, but Modern English possesses only one of each type in most dialects. For an interactive example of each sound (including descriptive animation and video), click this link, then in the window that opens, click glide, and select the appropriate sound.
/w/ /?/ labiovelar glide
A labiovelar (from Latin labia, lip, and velum, the soft palate) glide is a sound in which the flow of air out of the body is altered first by the shape of the tongue, with the main body of the tongue (not the tip) being raised toward the velum, the soft part of the roof of the mouth farthest from the front teeth (about as far back in the mouth as can be reached with the tip of the tongue). This creates a wide but shallow space with the air flowing over the tongue and resonating near the roof of the mouth (but not toward the sides of the mouth). The unique characteristic of labiovelar glides is that production of the sound begins with the lips pursed together, forming a narrow circular opening. The lips are then relaxed and the jaw dropped, opening the mouth. This sound, as described, is the syllable-initial form /w/ as in will, why, quick, and flower (in this case more aptly described as the pre-vocalic form, because it also appears after other consonants, but always before the vowel within a syllable). The symbol /?/ has been used to reference the possibility of other related sounds. In Old English there existed at least two w-sounds, with words currently spelled wh- representing words which initially began with this other sound. We unfortunately no longer have record of what this sound was or how it was pronounced, but it is likely similar to /w/. In Modern English there exists a second version of /w/ which occurs after the vowel (post-vocalic). This sound is not yet recognized by the IPA and thus does not have a symbol (represented with strikethrough herein). As with syllable-initial and syllable-final pairs, the post-vocalic /w/ is produced in reverse order of pre-vocalic /w/, with production of the sound beginning with the mouth opened and the lips relaxed, and ending with the lips pursed together forming a narrow round opening. Contrast the beginning and ending jaw and lip positions of pre-vocalic /w/ as in weed or wow with those of post-vocalic /w/ in chew and wow.
There is a third w-sound in Modern English which is rare but still present in modern phonology. That sound /ʍ/ known as a voiceless labiovelar is the version of /w/ in which the vocal cords are not used; compare voiced /w/ in water with voiceless /ʍ/ in the interjection whew! It is likely that the w-sound represented by wh- spellings was originally one of these two latter versions of labiovelar glide.
/j/ palatal glide
A palatal (from palate the top of the mouth) glide is a sound in which the flow of air out of the body is altered by the shape of the tongue, with the main body of the tongue (not the tip) being raised toward the hard palate — the part of the roof of the mouth, just behind the alveolar ridge and forward of the velum (for many speakers, the lateral edges of the midsection of the tongue can be felt pressing up against the molar teeth). This creates a wide and fairly shallow space with the air flowing over the tongue resonating near the roof of the mouth (but not toward the sides of the mouth) and then passing between the alveolar ridge and the downward slope of the tongue and finally out of the mouth. Modern English has only one palatal glide represented by the symbol /j/ as in you, cube, and onion.
A team at the University of Iowa developed this very useful (now abandoned) website a few years ago. It’s an interactive Flash site with both audio and visual animation showing the process of speech production. It’s excellent for showing language learners points of articulation, and if you are a linguist struggling with IPA or slight differences between languages, it is also a great reference.
Universal Points of Articulation: anatomy.htm
Language-Specific Points of Articulation (English, German, Spanish): http://www.uiowa.edu/~acadtech/phonetics/#