Meet English’s Newest Consonant


As stable as they may seem, every sound of every spoken language, at some point in the past, didn’t exist. The incessant shifting of speech involves the innovation of sounds, when either new ones are born or old ones transform. Consequently, within the sounds that a particular spoken language (such as Modern English) makes use of, we can identify both oldtimers and newcomers, differing in the duration of their employment.

This piece is all about one particular consonant sound, to be heard often in spoken English today, but not in older stages of the language. Because of this limited chronology, this consonant may well be the newest addition to the band – although I just can’t bring myself to formulate it any more strongly than “may well be”.

As I’ll first try to demonstrate, what it means to be a particular sound of a particular language is not straightforward. What it definitely is, though, is a linguistically rich and interesting topic. So, stick with me.


The frustrating elusiveness of sounds

If I sneeze in the middle of uttering an English sentence, is a sneeze one of the sounds of English speech?

This is an intentional absurdity, likely to elicit a prompt response of no, but it has its uses. It tests our conception or intuition for how some sounds, but not others, belong to the phonetic ingredients of certain spoken languages. Some of the sounds that we humans make while speaking, like sneezes and coughs, are easily excluded from that set. For one reason, they contain no obvious linguistic content, interrupting rather than contributing to what we want to get across.

There are also those sounds that are well used in the world’s tongues, just not in English. The Ll-sound of Welsh, the pharyngeal consonant behind Arabic ﻉ, and Czech’s infamous Ř are common in their respective languages, but I’d bet that neither the lay linguist nor the expert could reasonably include them when drawing a border around the sounds of English.

Then there are those sounds whose participation and status in speech is much less clear cut. English speakers, for example, produce a bewildering plethora of consonants with every sentence. Some of them seem alien in isolation, but in fact do occur in English, and with such consistency that they help to create ‘authentic’-sounding English speech.

This great video by the YouTube channel Pronunciation Studio discusses five of them, and I’ll respectfully pinch its example of the labiodental sound [ɱ]. As its IPA symbol suggests, this is a nasal sound, much like a lippy M, but with the bottom lip lightly touched by the top teeth to halt the departing airflow temporarily.

A cross-section of your mouth while making a [ɱ] sound

Uttering a repeated sequence of [ɱ] on its own sounds bizarre and unfamiliar, but it’s what English speakers tend to say in casual speech between some words (e.g. on form) and within others (e.g. symphony, infant, invisible). The option always remains, though, to pronounce the Ms and Ns in such words as [m] and [n], especially when we say the words more slowly and loudly for the purpose of emphasis or education – that is, in ‘hyperspeech’.

For another example, in my English, I might say what or cat seven out of ten times with a glottal stop [ʔ] for the final T-sound. However, if I’m asking what? with a particular degree of venom, or if I’m enthusiastically indicating a cat, I’ll most likely pronounce the T as [t].

These cases demonstrate that among all the many sounds that speech can include, there is a subset that we think of (consciously or not) as the proper ingredients of English speech. If I didn’t think of them that way, I’d have no problem with pronouncing what as [wɒʔ] in all contexts, stressed or unstressed. That may be perfectly fine for other English speakers who are fortunately not me.


That most exclusive subset of sounds contains the phonemes of a spoken language. These are its key sounds. My go-to metaphor is that phonemes are the ‘building blocks’ of speech. I don’t want to get into how some linguists have greatly objected to the concept; it suffices to mention that some linguists have greatly objected to the concept. With that debate acknowledged, the fact remains that the phoneme is a very useful and successful theoretical concept.

We can identify sounds like [t] and [m] that hold the status of independent phonemes of English. They can appear anywhere in any word and in any conversational context. When we write about a language’s phonemes, the sounds in question don slashes: /t/ and /m/. Their status accords with our intuition, although the exact reality of phonemes is again the subject of debate: are they something in the mouth or in the brain?

Meanwhile, [ʔ] and [ɱ] are possible variants or ‘allophones’: [ʔ] of /t/, and [ɱ] of /m/ or /n/. They are common sounds for sure, but dependent on the context of where and when they’re used.¹ Allophones often get replaced in instances of hyperspeech and emphasis. Phonemes, not all their various allophones, are also what we’d expect to be given their own dedicated letters in an alphabet, although this is a flimsy test for phonemes, with some good historical counterexamples.

A sample of writing in Avestan, an ancient Iranian language with a dedicated script that’s extremely phonetically precise, in order to ensure accuracy when reciting.

The classic test for phonemes is whether or not they make a meaningful difference to a word. If I were to say [mæp] and [næp], the two sounds would lead other English speakers to identify the two words map and nap. These form a minimal pair, two words that are the same in all respects except for the sound under scrutiny. Map and nap are different English words, so /m/ and /n/ gain entry to the phoneme set.

If I instead said [mæp] and [ɱæp], the second would be received as a weird, mispronounced version of the first word – at least by English speakers. The two sounds might be phonemes in another language, in which [mæp] and [ɱæp] begin with two separate consonants and function as two separate words.
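For the programmatically inclined, the minimal-pair test can be sketched in a few lines of Python. This is only a toy illustration: the two-word lexicon, the function names and the choice to store each transcription as a tuple of segments are all my own assumptions, not a standard tool.

```python
# A toy sketch of the minimal-pair test. The lexicon below is a
# hypothetical two-word sample, not a real dictionary of English.
LEXICON = {
    ("m", "æ", "p"): "map",
    ("n", "æ", "p"): "nap",
}


def is_minimal_pair(a, b):
    """True if two transcriptions differ in exactly one segment."""
    if len(a) != len(b):
        return False
    differences = [(x, y) for x, y in zip(a, b) if x != y]
    return len(differences) == 1


def distinguishes_words(a, b, lexicon=LEXICON):
    """True if a and b form a minimal pair AND map to two different words."""
    if not is_minimal_pair(a, b):
        return False
    return a in lexicon and b in lexicon and lexicon[a] != lexicon[b]


# [mæp] vs. [næp]: two different English words, so the contrast counts.
print(distinguishes_words(("m", "æ", "p"), ("n", "æ", "p")))
# [mæp] vs. [ɱæp]: the second is not a word in this lexicon.
print(distinguishes_words(("m", "æ", "p"), ("ɱ", "æ", "p")))
```

The first check succeeds and the second fails, which is the whole point: whether a difference in sound ‘counts’ depends on the words of the language in question, not on the sounds alone.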

With all this in mind, a typical inventory of the consonants of Modern English will include around twenty-four phonemes. These can be said to ‘belong’ to the sounds of English, with all due caveats and provisos applied.

Just as phonemes can differ across the geography of the world’s languages, so too can they differ across time. A sound can achieve phonemic status over time, especially if it occurs often enough and helps speakers to distinguish one word from another. Where later speakers would recognise it and say ‘Sure, you’re making an X sound’, their linguistic forebears would say ‘That sounds like a kind of weird Y sound’.

That the phonemes that define a spoken language can change is an essential fact for my purposes. I want to discuss the consonant that I think most recently joined the club of English VIPs (Very Important Phonemes).


Greet the new kid

Of course, in addition to the difficulty in defining ‘a sound of English’, there’s the further difficulty in defining English. It’s such a broad family of accents and dialects nowadays that one sound could be on its way to gaining (or losing) phoneme-hood for some speakers and not for others.

Vowel sounds in particular fluctuate like the wind, so I won’t bother trying to identify English’s newest among those. But I should have more luck among the air-restricting group of speech-sounds that we call consonants. Indeed, I have one in mind that seems to be both stable and common to most accents under the English umbrella (one exception being Philippine English).

But enough with the academic preamble and caveated caution; I’ve teased you long enough. So, what is the newest consonant of English?

Well, here it is: /ʒ/.

This is the smooth-talking fricative sound that zhooshes up English words like pleasure, visual, confusion, seizure, luxury, television, Asia, sabotage, regime and beige. The variety of methods for spelling this consonant (here: SU, ZU, XU, SI and GE) is indicative of its marginal status in the language and in speakers’ minds, and of its youth. It developed in English long after any window of opportunity for being granted its own letter.

Consequently, it’s one of the many changes in speech that standard English spelling remains largely ignorant of, a theme explored in detail in Why Q Needs U. We English readers instead know to infer /ʒ/ from written contexts like the SU in usual. But occasionally it needs to be taken out of that context, so English writers have had to improvise. ZH has stepped up to spell the sound, such as when the usual is abbreviated to ‘the uzhe’.

The consonant has no such identity issues in other languages, mind you. French has the sound and spells it with J or soft G (e.g. je, joli, manger, plonger). Ukrainian speech employs it too and renders it in writing as Ж, while Czech reserves Ž for the same consonant.


English /ʒ/ is also a variable sound, sometimes shifting from speaker to speaker, word to word. For example, it’s inconstantly present in how I pronounce nausea. I reckon I lean towards ‘norzhuh’, but trisyllabic ‘norzi-yuh’ is an option for me too. The latter is perhaps because of how I pronounce the derived adjective nauseous (‘nor-zi-us’). I can also hear variation with amnesia: ‘am-nee-zhuh’ or ‘am-nee-zi-yuh’.

You might also hear speakers pronounce seizure as ‘seez-yuh’ and visual as ‘viz-yul’, with the internal sequence of two sounds [zj] instead of just [ʒ]. Such speakers have very good reason to do so; theirs is a conservative pronunciation. The [zj] sequence in older English that they maintain has been one of the sources of the sound [ʒ].

Once, all words like measure and leisure, adopted into English from French or Norman, happened to be pronounced with the sequence [zj]. It was spelled with an S, since using Z for the consonant [z] wasn’t an established thing yet. This constancy in spelling makes it hard to pinpoint when the shift began, but we can at least say that it’s been happening within post-medieval English.

Speakers gradually merged the two sounds into one, for reasons of phonetic efficiency. The old [j] sound in the sequence (spelled with I or implicit within the ‘French U’ of measure) pulled the preceding [z] sound back in the mouth, towards the hard palate, and merged with it. The position of the stressed syllable in a particular word likely played a role in whether and when the [z] shifted.

A cross-section of your mouth while making a [ʒ] sound

These instances of [z] acquired a new palatoalveolar position in the mouth by means of the process of ‘assimilatory palatalisation’. A similar thing has gone on with the [sj] sequence once pronounced in station and pressure.
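The merger can be pictured as a simple rewrite rule over a string of sounds. Here is a rough Python sketch under heavy assumptions: the segment-by-segment transcriptions are my own simplified illustrations, the function name is made up, and the rule deliberately ignores the role of stress mentioned above.

```python
# A toy rewrite rule: [z] + [j] merge into [ʒ], and [s] + [j] into [ʃ].
# Real sound change is gradual and conditioned by stress; this is not.
COALESCENCE = {
    ("z", "j"): "ʒ",
    ("s", "j"): "ʃ",
}


def palatalise(segments):
    """Merge each [zj] or [sj] sequence into a single palatoalveolar sound."""
    result = []
    i = 0
    while i < len(segments):
        pair = tuple(segments[i:i + 2])
        if pair in COALESCENCE:
            result.append(COALESCENCE[pair])
            i += 2  # both sounds are consumed by the merger
        else:
            result.append(segments[i])
            i += 1
    return result


# A simplified, hypothetical rendering of conservative 'viz-yul'.
print(palatalise(["v", "ɪ", "z", "j", "u", "ə", "l"]))
```

Fed the conservative pronunciation, the rule returns the same word with a single [ʒ] where the [zj] sequence used to be, and leaves every other sound alone.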

The other source of [ʒ], of course, is loanwords adopted from other languages with no such questions over their phonemes. Modern French is the donor par excellence, responsible for the ‘super-soft G’ in lingerie, massage and rouge. Acquisitions from other languages are much rarer, but modern famous names, like that of the USSR’s Marshal Zhukov, may have made a small contribution.


The question does deserve to be asked and answered: is /ʒ/ really an English phoneme? Some linguists would bar it from the VIP club, on account of its limitations. They might argue instead that [ʒ] is just an allophone of /ʃ/ (the shushing sound in shoe and fish) or of /d͡ʒ/ (the jarring consonant in judge).

As mentioned, to be admitted, a phoneme needs to be able to distinguish two words of a language from each other. The number of minimal pairs for /ʒ/, /ʃ/ and /d͡ʒ/ is, well, minimal. But we do have measure and mesher (‘a person who meshes’) for the first contrast, and also pleasure and pledger (‘a person who pledges’) for the second. A near-minimal pair is offered by vision and fission. By the skin of its teeth, /ʒ/ passes this test.²

Can /ʒ/ appear anywhere in a word, thereby being independent from a particular sound-context? Genre is its best hope here, providing rare evidence of /ʒ/ at the start of a word. It’s always pronounced with an initial /ʒ/ for me, but it’s true that many people say genre with /d͡ʒ/ instead, like John.

This is a case of ‘nativising’ the sound, swapping it for one that’s much more common and comfortable. See also: the second G of garage, which is a super-soft /ʒ/ for some, but a less soft /d͡ʒ/ for others, including me. I rhyme garage with marriage.

This has a long tradition. In the late medieval and early modern eras, English consistently adopted French words with /ʒ/ but said them like /d͡ʒ/ instead. In time, though, the influx of incoming words and the new pronunciation of [zj] increased the pressure.

At last, the dam burst, and /ʒ/ has established itself as a new phoneme. As for when this happened, I’d say it occurred recently in historical-linguistics terms; I’d narrow it down to ‘over the course of the modern era’.


In a strange way, though, English speech was expecting this development. It had room ready for the newbie. The thing is, the sounds of speech work through contrasting features; English has historically relied on the quality of voicing (vibrating in your larynx) to distinguish sounds from one another.

For example, the /s/ in words like sue has no voicing, unlike the voiced /z/ in zoo. Voicing likewise separates fats from vats (/f/ vs. /v/). The difference in plosive sounds between pat and bat (/p/ vs. /b/) and tin and din (/t/ vs. /d/) is also traditionally framed as a voicing contrast, but in the English speech of today, it’s become something quite complex.

All these consonants of English had partnered up (one voiceless, the other voiced) during the Middle Ages – except for /ʃ/. What was the voiced companion for the /ʃ/ in shoe and fish? It was a lonely sound, and the English system of consonants had a vacancy.

The plosive and fricative phonemes of English in, let’s say, the year 1500

To become a phoneme, /ʒ/ sure had to work for it, but English had been ready to accept it for many centuries. It now functions as the voiced counterpart to /ʃ/, which is why I find its ZH spelling very appealing; ZH contrasts with SH, together mirroring Z and S. It’s why [ʒ] may ‘feel’ like a natural and easy sound to produce, despite its position on the periphery of speech and spelling.
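The idea of a ‘vacancy’ in the system can be made concrete with another small Python sketch. The inventory here is a deliberately tiny sample containing only the consonants named in this section, and the names in the code are my own; it is not a full account of English in any period.

```python
# Voiceless consonants mapped to their voiced partners.
# Only the pairs mentioned in this section are included (an assumption
# made for illustration, not a complete inventory).
VOICED_PARTNER = {"p": "b", "t": "d", "f": "v", "s": "z", "ʃ": "ʒ"}


def unpaired_voiceless(inventory):
    """List the voiceless consonants whose voiced partner is missing."""
    return sorted(
        consonant
        for consonant, partner in VOICED_PARTNER.items()
        if consonant in inventory and partner not in inventory
    )


# A toy 'before' inventory without /ʒ/, and an 'after' inventory with it.
before = {"p", "b", "t", "d", "f", "v", "s", "z", "ʃ"}
after = before | {"ʒ"}

print(unpaired_voiceless(before))  # the lonely sound shows up here
print(unpaired_voiceless(after))   # the vacancy has been filled
```

In the ‘before’ set, /ʃ/ is the only consonant left without a partner; add /ʒ/ and the list of loners comes back empty.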


I’ve seen it suggested that /ʒ/ should be excluded from any reckoning of the sounds of English, because it’s a ‘foreign’ sound. This is to say, because it appears exclusively in loanwords (mainly from French or Latin), it’s not a natural phenomenon within English. I disagree with this; the borders between languages are not fixed, and the bulk of the instances of /ʒ/ are due to English speakers changing the [zj] in visual and measure into a single sound. That development can’t be blamed on the French.

That said, an exotic air still lingers around /ʒ/. Touching on the domain of phonaesthetics, the sound seems to be coloured with hues and nuances of foreignness, if only in the minds and mouths of English speakers. There are words and place-names which many of us pronounce with /ʒ/, despite there being no etymological reason for us to do so.

Azerbaijan, Beijing and Taj Mahal are three prominent examples. I’ll hold my hands up to pronouncing them with /ʒ/ (that is, as if ‘Bey-zhing’). Yet that’s not backed up by their local languages. In all three (Azerbaijani, Mandarin Chinese, Hindi), the English J corresponds to an affricate sound, similar to or the same as the /d͡ʒ/ in jewel. In the absence of a better explanation, this looks like hyperforeignism and an exoticising of places abroad.

It seems that /ʒ/ can’t break free from its marginalised status within English speech – a phoneme now, yes, but still considered new and a bit weird. With time, its lot may improve. More uses for it could appear, ideally with chances to establish its ZH spelling better. It could take its place alongside not only voiceless SH, but also the CH, PH, TH and WH digraphs.

If you’re coining new words anytime soon, why not consider including it? The new phoneme on the block will thank you.

END.


Footnotes
  1. I remain uncertain of whether the definition and criteria for phonemes accidentally exclude what is arguably the most common sound in English speech, the humble schwa vowel [ə].
  2. Sometimes the vocabulary of a language has just worked out in such a way that many minimal pairs can’t be found. The two TH-sounds of English in thin and then (voiceless /θ/ and voiced /ð/) also struggle to prove their phoneme-hood this way. There’s thigh and thy at least, and mouth (the body part) and mouth (the action). I remember linguists at Edinburgh being grateful to the Scottish touristy shop Thistle Do Nicely for another example of the contrast, this time between thistle and this’ll.
Non-linked references
  • Martínez, J., & de Vaan, M. (2014). Introduction to Avestan (Sandell, R. Trans.) Brill.
  • Minkova, D. (2013). Historical phonology of English. Edinburgh University Press.

Images my own or from Wikimedia.
