Why does Czech sound like that?

Reading time: 10-15 minutes

To listen to this piece and my mixed success at pronouncing both modern-day Czech and ancient Slavic, click here:

To hear the sounds denoted between /slashes/, click here for an interactive IPA.

The Czech language has, among language learners and lovers, a fairly fierce reputation. More than the intricacies of its grammar and vocabulary, its difficulty is perhaps most associated with its phonology – that is, its sounds and how they’re put together. To best this particular linguistic beast, the learner must get comfortable with complex clusters of consonants and apparently vowel-less words and syllables, to say nothing of the infamous Ř, which I’ve tackled previously.

Czech, for example, offers the learner phonetically fiendish words like prst ‘finger’, vlk ‘wolf’, tvrdý ‘hard’, plný ‘full’ and hrnek ‘mug’, containing whole syllables without an obvious vowel, while džbán ‘jug’, chci ‘I want’, tchán ‘father-in-law’, lžíce ‘spoon’, lstivý ‘cunning’ and čtvrt ‘quarter’ begin with challenging sequences of consonants. As the ancient Czech wisdom goes, strč prst skrz krk.

But how did this sound system come to be? How did Czech phonology develop into its modern form with these distinct features, often at odds with its Slavic siblings? That’s where historical linguistics can be of help. By comparing the whole Slavic language family and studying our oldest sources, historical linguistics provides us with the story of how Czech became so darn, well, Czech-y. This post is my answer to the big question above, looking specifically at Czech’s obsession with consonants. In brief, as is so often the case, it’s due to a couple of simple, reasonable changes that had massive, language-wide consequences. Let me show you how.

Winding Back the Clock: Proto-Slavic

The common ancestor of the Slavic language family, including Czech, is a prehistoric language we call Proto-Slavic. I say “prehistoric” because we don’t have historical written records for it, but it was by no means a language of the distant past. Our oldest Slavic sources (written in Old Church Slavonic from the 10th century AD onwards) start to appear shortly after the period of general linguistic unity, when the cracks between different dialects of Slavic were only just emerging. So, even though we don’t have sources for Proto-Slavic, we can know quite a lot about it. By using what we do have, we need only wind the clock back a little bit to theorise about this ancient ancestor.

A rough map of the modern-day distribution of Slavic languages, from here.

Surprisingly, Proto-Slavic was very fussy about the structure of the syllables in its words, in stark contrast with Czech today. Proto-Slavic overwhelmingly preferred open syllables. This is to say, it liked its syllables to consist of either a sequence of consonant-then-vowel or just a vowel, but crucially without another consonant at the end to close the syllable off.

Japanese and Italian are two modern languages with this same drive towards open-ness. My own first name, Danny /dæni/, would be an acceptable Proto-Slavic word, because it has two open syllables and the general sequence Consonant>Vowel>Consonant>Vowel. My surname Bate /beɪt/ on the other hand is C>V>C and a closed syllable, and so would not be okay for the ancient Slavs.

We can see this syllable structure in our Old Church Slavonic sources. Have a look at this real example:

Dostoitъ li dati kinьsъ Kesarevi ili ni? Damъ li ili ne damъ?

‘Is it right to give tax to Caesar or not? Do we give or not give?’
Codex Zographensis. Mark 12.14. 11th century AD.

All the syllables in these two questions are open. “Kesarevi ili ni” for instance has the syllabic sequence Ke-sa-re-vi i-li ni. No final consonant is there to close any of these.
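For the programmers among you, here is a minimal sketch of that open-syllable check in Python. It rests on assumptions of mine rather than anything in the manuscript: the vowel inventory is just the Latin-letter vowels of the transliteration plus the two yers, every consonant is assigned to the start of the following syllable, and the function names are hypothetical.

```python
import re

# Assumed vowel inventory for this sketch: transliteration vowels plus the yers.
VOWELS = "aeiouyъь"

def syllabify(word: str) -> list[str]:
    """Split a word so that each syllable is (consonants +) one vowel.

    Consonants left over after the last vowel are attached to the
    final syllable, which makes that syllable closed.
    """
    word = word.lower()
    chunks = re.findall(rf"[^{VOWELS}]*[{VOWELS}]", word)
    tail = word[len("".join(chunks)):]  # whatever follows the last vowel
    if tail:
        if chunks:
            chunks[-1] += tail
        else:
            chunks = [tail]
    return chunks

def all_open(word: str) -> bool:
    """True if every syllable of the word ends in a vowel."""
    return all(syllable[-1] in VOWELS for syllable in syllabify(word))

for w in ["Kesarevi", "ili", "ni", "damъ", "muž"]:
    print(w, syllabify(w), all_open(w))
```

Under these assumptions, the Old Church Slavonic words come out as all-open (Ke-sa-re-vi, i-li, ni, da-mъ), while a modern Czech word like muž does not.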

As do-stoi-tъ illustrates, more than one consonant was allowed at the start of a syllable, and the drive towards open syllables did create some new groups of consonants. For example, many syllables at an early Proto-Slavic stage ended in the ‘liquid’ consonants /r/ and /l/. This became a target of the open-syllable push, so the liquids found themselves moved to precede the vowel. Early Proto-Slavic *gȏrdъ ‘castle’, *moldъ ‘young’ and *dòlnь ‘palm’ would become gradъ, mladъ and dlanь by the time of Old Church Slavonic. Through the early forms, we can better appreciate the ancient connections of *gȏrdъ and *moldъ to related words outside Slavic, including English garden/yard and mild.
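The liquid shuffle can be sketched as a single find-and-replace. This is a toy under loud assumptions: it only handles the sequence consonant + o + r/l + consonant, it ignores accent marks, and it hard-codes the o > a vowel change that is visible in the three Old Church Slavonic forms gradъ, mladъ and dlanь. The function name is hypothetical.

```python
import re

# Assumed vowel inventory: transliteration vowels plus the yers.
VOWELS = "aeiouyъь"

# A consonant, then o, then a liquid that is followed by another consonant,
# i.e. a liquid that closes its syllable.
CLOSED_LIQUID = re.compile(rf"([^{VOWELS}])o([rl])(?=[^{VOWELS}])")

def metathesize(word: str) -> str:
    """Move the liquid in front of the vowel: CoRC -> CRaC (sketch only)."""
    return CLOSED_LIQUID.sub(r"\1\2a", word)

for early in ["gordъ", "moldъ", "dolnь"]:
    print(early, ">", metathesize(early))
```

A liquid that is already followed by a vowel does not close its syllable, so the pattern leaves such a word untouched.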

This shifting of /r/ and /l/ is in part responsible for the diverse consonant clusters so beloved by Czech, which likewise maintains the three words above as hrad, mladý and dlaň. Yet in early Slavic we still don’t see anything near the crazy clusters that Czech lovingly inflicts upon us today. So, what happened next?

Introducing: The Yers

So, how did we get from this openness and structured simplicity to Czech’s laissez-faire attitude to sounds and their sequences? It’s to do with those odd letters ъ and ь that you can see in the OCS example above. If you know some Russian, you may recognise these as the hard and soft signs that Russian spelling uses today. Yet, once upon a time, they represented two vowels.

They’re known as the yers, and they once stood for two really common, but really short vowels. Even though they’re letters of the Cyrillic script, it’s typical to use ъ and ь even when writing Old Church Slavonic and Proto-Slavic words in English (as above), because it avoids the issue of what vowels they actually were. We can tell through connections between Slavic and other languages that they developed from very distinct /u/ and /i/ vowels, as in English food and sheep. Yet, by the time of Proto-Slavic, they were likely more like the short vowels /ʊ/ and /ɪ/ in English foot and ship, but also had a tendency to lose their distinctiveness and become simply the central vowel /ə/, as in English about.

Most important for our purposes is the fact that ъ and ь systematically disappeared from Slavic. Despite their importance for the sounds and grammar of the language, they were too short and too indistinct to survive. While ъ and ь are present in our early sources, you can tell that the authors are starting to get these letters confused. For instance, the passage above has the word damъ, but the same passage in another manuscript (Codex Marianus) has damь. These extremely common vowels were fading away, and with their departure, the whole system of open syllables would come crashing down.

A page of Codex Zographensis, showing the first page of the Gospel of Mark, written not in Cyrillic, but Glagolitic. From here.

So, while Proto-Slavic had words like *mǫ̑žь ‘man’ and *stòlъ ‘table’, with their nice (C)CVCV structure, the loss of ь and ъ from the end meant that there were now consonants there to close those first syllables off. These two have become monosyllabic muž and stůl in Czech. Lots of consonants, once separated by ъ and ь, found themselves clustered together. Proto-Slavic *tьma ‘darkness’ and *pъtakъ ‘bird’ are both nicely CV in their syllables, but if you take away ь and ъ, you’re left with Czech tma and pták.

To appreciate how many consonants in Czech have become closer neighbours over time, compare these words in both Old Church Slavonic and Modern Czech:

Old Church Slavonic    Czech    Meaning
sъpati                 spát     to sleep
bьrati                 brát     to gather, to take

By the way, the loss of the yer vowels meant that the letters ъ and ь in the Cyrillic script were later repurposed to denote not the vowels, but rather the effect those vowels had on the preceding consonant. They thus became what they are in the Cyrillic of today, the hard and soft signs.

Likely at an early stage in this whole story, the combination of a yer and either of the very sonorous liquid sounds /l/ and /r/ created something new: syllabic consonants. In a nutshell, the consonant took over the functions of the short vowels, and came to act in their stead as the core of the syllable. Meanwhile, the original yer disappeared. So, in a word like *vьrxъ ‘top’ in Proto-Slavic, the sequence *ьr in the first syllable had probably already developed into a syllabic /r̩/.

Slavic languages differ as to the development of these combinations of sounds, but Czech has largely maintained the syllabic consonants up to today; *vьrxъ has become Czech vrch ‘hill’. It’s these sounds that are behind all those seemingly ‘vowel-less’ words, like vlk ‘wolf’, prst ‘finger’, krk ‘neck’ and trh ‘market’. They’re not really made up of only consonants. Rather, /l/ and /r/ are functioning like vowels. Czech just has a liberal definition of what can perform the role of a vowel.

(Syllabic consonants, by the way, aren’t anything weird. English can have them too, at the end of words like rhythm and bottle.)

My Answer:

So, why does Czech sound like that? My main response is that so much of the shape of Czech today is the product of two very general changes in early Slavic:

  • First, the drive towards open syllables, i.e. syllables that do not end in a consonant
  • Second, the widespread loss of two short vowels, known as the yers, on which so many open syllables depended

While the first change is responsible for consonant clusters like the /ml/ in mladý, the second is responsible for the syllabic consonants in words like krk and sequences like the one in džbán. These two together, over time, greatly altered the sounds and structures of the language that would one day become Czech, and are responsible for so much of its distinct character today.

Diving Deeper: Havlík’s law

If you’re still reading, follow me a little further. What I’ve discussed so far is not the full story of the yers. The thing is that the yers disappeared “systematically”, but not completely. Under certain conditions, they in fact got ‘upgraded’ to full vowels, and are part of many Czech words today.

This insight is key to Havlík’s law, a pattern of sound change recognised by the Czech linguist Antonín Havlík in 1889. The law aims to capture the difference between ‘weak’ yers (which were lost) and ‘strong’ yers (which survived). In the case of Czech, the strong yers have become the vowel /ɛ/, as in Czech pes ‘dog’ or English yeah.

The way Havlík’s law works is this: starting from the end of the word and counting backwards, every second yer was strong and got ‘vocalised’ into /ɛ/. If the count back reaches another kind of vowel, start counting again from the next yer prior to that.
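The counting rule can be sketched in a few lines of Python. This is a toy, not a full model of Czech sound history: it applies only the count described above, turns every strong yer into plain e, and leaves all other sound changes (vowel length, consonant changes and so on) alone. The function name and the list of full vowels are my own assumptions.

```python
YERS = {"ъ", "ь"}
FULL_VOWELS = set("aeiouyě")  # assumed inventory of non-yer vowels for this sketch

def havlik(word: str, strong_reflex: str = "e") -> str:
    """Apply the yer count from the end of the word backwards.

    Odd-numbered yers are weak and dropped; even-numbered yers are strong
    and replaced by `strong_reflex`. Any full vowel resets the count.
    """
    kept = []
    count = 0
    for ch in reversed(word):
        if ch in YERS:
            count += 1
            if count % 2 == 0:
                kept.append(strong_reflex)  # strong yer survives as a full vowel
            # weak yer: nothing is appended, so it disappears
        else:
            if ch in FULL_VOWELS:
                count = 0  # restart the count at the next yer further back
            kept.append(ch)
    return "".join(reversed(kept))

for form in ["dьnь", "dьne", "dьnьsь", "pьsъ", "pьsa", "sъnъ", "sъna"]:
    print(form, ">", havlik(form))
```

Run on the Proto-Slavic forms used in this post, the sketch gives den, dne, dnes, pes, psa, sen and sna, matching the modern Czech words.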

Let me illustrate with an example. On the basis of Old Church Slavonic and general comparison, we can tell that the Proto-Slavic word for ‘day’ was *dьnь. This specific form of the word, the nominative singular, has two yers. The final ь is number 1, counting backwards, while the earlier one is number 2. Therefore, according to Havlík’s law, number 1 will be weak and will disappear, while number 2 will be strong and become /ɛ/ in Czech. This is indeed what we see: *dьnь has become Czech den.

Proto-Slavic *dь²nь¹ > Czech den

(nominative singular)

But what if we put *dьnь in another case? Its genitive singular form (‘of a/the day’) was *dьne. Now that its ending is another vowel, the first ь is number 1 according to the law, and so it will be weak and will disappear. This is again what we see in Czech. The genitive singular of den is dne.

Proto-Slavic *dь¹ne > Czech dne

(genitive singular)

Havlík’s law can therefore illuminate what is for learners of Czech a key bit of vocabulary with a strange grammar!

Déšť, Czech for ‘rain’, shows the same developments in its history, since it too comes from a word with two yers.

Proto-Slavic *dъ²zdjь¹ > Czech déšť

We can also try out Havlík’s law with a longer word and more yers. Take Proto-Slavic *dьnьsь, which means ‘today’ (literally ‘day-this’). Out of its three yers, we predict that only the second will be strong and survive into Czech today. Sure enough:

PS *dь³nь²sь¹ > Czech dnes

Pes ‘dog’ and sen ‘dream’ are further words that in their various forms in Czech today show the systematic loss of the old yers. They go back to Proto-Slavic *pьsъ and *sъnъ and, like den, they show some vowel variation.

PS *pь²sъ¹ > Czech pes

PS *sъ²nъ¹ > Czech sen

(nominative singular)

PS *pь¹sa > Czech psa

PS *sъ¹na > Czech sna

(genitive singular)

PS *pь¹si > Czech psi

PS *sъ¹ni > Czech sny

(nominative plural)

Havlík’s law, though it needs a lot of unpacking to be useful knowledge, can really shed some welcome light on a fair few quirks of Czech grammar today. If you’re learning Czech, this law really is your ally.

Bonus history!

Something I love is that the loss of the yers can allow us to say roughly how long a foreign loanword has been part of Czech vocabulary! Missa and molīnum were Latin for ‘Mass’ (the Christian service) and ‘mill’. These got borrowed into Proto-Slavic as *mьša and *mъlinъ, with the yers approximating the vowels of the original language. These words have become mše and mlýn in Czech. The fact that these words show the effects of the yers’ disappearance means that they were borrowed into Slavic before it happened – that is, a very long time ago!

That’s all from me for this month, language lovers.

Děkuji a na shledanou!



6 thoughts on “Why does Czech sound like that?”

  1. This is probably what gives Czech a sound I think could be described as ‘staccato’ (or even ‘syncopated’?) – which (in my opinion) makes it good for singing jazz tunes. If you look up Ondřej Havelka and listen to his songs in Czech (including the Czech version of ‘Happy Feet’, ‘Chodidla’), it might give you an idea of what I mean.

  2. If I were still on the birdsite, I’d give this a like, instead I’m leaving this comment. Thanks for the interesting read and writing it up!

  3. What I find interesting is that men and women sound different, because in the masculine the past tense ends in a closed syllable but in the feminine it ends in an open syllable. For example “skočil” vs. “skočila” or “četl” vs. “četla”. (And how do you like my favorite Czech word, “scvrnkls”?)


  4. Perfect. The clusters of consonants resemble a development in today’s spoken French, the disappearance of /ə/.
    Fais ce que tu veux /fɛ skty voe/. Je te donne /ʃt don/.

