The Almost Romance Languages

Reading time: 15-20 minutes

If you like languages, you’ve probably heard the terms Romance and the Romance family. Although it started life as a name for the language of medieval France, Romance has come to be the umbrella term for a big group of modern tongues, including French, Spanish, Portuguese, Catalan, Italian and Romanian. The key criterion behind this grouping is that they (and many more besides) are all linguistic descendants of Latin.

As the Roman state expanded beyond its original central-Italian heartland, its legionaries and ruling class spread its de facto official language across the conquered territories. Wherever the Romans ended up, people began to speak Romanly. In much of the empire, Latin went from an alien language of invaders, to an elite language of administrators, to a prestige language of general society, and finally to a mother tongue for all. This centuries-slow ousting of other languages is the reason behind the Romance family and its wide spread.

The modern distribution of Romance languages within Europe.

While we do have this concrete historical reason for saying whether a language is a Romance one or not, matters aren’t clear cut. All of the accepted Romance languages include things that don’t have a Latin origin. These may pre- or post-date the arrival of the Romans. The French of France, for example, has its vestigial vigesimal counting system, calculating eighty as ‘four-twenties’ (quatre-vingts), which is a possible legacy of Gaulish. Spanish has gained many words from Arabic, thanks to the Islamic states of medieval Iberia. Romanian vocabulary likewise includes words from an unclear pre-Roman substrate language, as well as from the local Slavic speech which later arrived on the scene.

Does this then make French, Spanish and Romanian ‘bad’ or ‘partial’ Romance languages? No! It does not, yet it’s a good reminder that recognising membership of a language family isn’t an easy binary yes/no, and that there aren’t some necessary ingredients or essential DNA that make a language a Romance one. It’s a metaphor and an imprecise status, based on estimation and impression, and imposed by scholars from the outside. Language is infamously hard to pin down.

Latin would have competed with local languages for some time, slowly gaining ground in different social circumstances, different demographics and different generations. The adoption of Latin was not a simple switch, but rather affected the many components of language (vocabulary, morphology, phonology, etc.) at different rates. We can imagine that a Gaul in 52 BC would have good reason to adopt some specific kinds of Latin words, like military terms, but no reason to give up the endings of Gaulish grammar. There would have been some aspects that the linguistic newcomer took much longer to affect, or even never touched at all.

So, with all that in mind, I’d like to offer some nuance to our understanding of the Romance family, and introduce you to two Almost Romance languages.

Now this is a tongue-in-cheek term of my own, with little scientific justification, but it can serve at least to make us think. Contrary to the binary distinction of ‘Romance vs. not Romance’, it is reasonable to expect that in some parts of Europe, Latin would have had a slightly weaker influence on the local languages. In such cases, that influence would still be clear today, but less thoroughly present. We might expect to see plenty of Latinate vocabulary, but also some core words and grammatical structures that have endured since pre-Roman times.

The historian Peter Brown uses the metaphor of a tide in his overview of Roman history (1971). Romanity, once bound to the Mediterranean, washed far inland and flooded much of Europe beneath its waves of influence. I like this metaphor, because the effects of Rome, like waves and the tide, had no clear edges, but rather diminished the further inland they reached. I apply it to the spread of Latin. Where Latin engulfed the linguistic scene entirely, we find the Romance languages. Where Latin gave just a very good drenching, we find Almost Romance.

English is one good candidate for this elite club. Thanks primarily to the Normans, English has been inundated with words of a Latinate origin. That being said, 1066 was some time after the disintegration of the Roman Empire in the west, and it wasn’t Latin that the invaders were speaking. Also, frankly, I’ve written enough about English.

Instead, my two candidates for Almost Romance today are Albanian and Welsh, two languages that continually fascinate me and that I adore. Allow me to share the love and show you what I mean!

Albanian


Albanian is today a language native to over seven and a half million people, most of whom live in the country of Albania or in Kosovo next door. It’s a member of the larger Indo-European family, although it forms its own solitary branch within it, something I do not dispute. Our sources for Albanian emerge pretty late by comparison with other European languages; the first mention of it dates to 1285, while our oldest written text, a mere fourteen words, dates to 1462.

This means that the prehistorical development and origins of Albanian are very murky. There have been efforts to connect it back to the Illyrian language of antiquity, spoken in the same region, but our sources for Illyrian are extremely sparse, and there’s evidence that Albanian was once spoken further inland and away from the Adriatic Sea.

Here’s the cool thing: despite the historical near-silence of Albanian between the Proto-Indo-European starting point and 1462, we can still propose two distinct developmental stages of Albanian in that long period. These stages are named ‘Pre-Proto-Albanian’ and ‘Proto-Albanian’. The former presumably existed in the 1st millennium BC, while Proto-Albanian had developed by c. 600 AD.

On what basis then can we propose this difference? In brief, the Romans!

The distribution of Albanian and its various dialects.

The Roman state first made its presence felt on the other side of the Adriatic in the 3rd century BC. Their impact went from warfare to conquest, their territorial gains consolidating into the provinces of Illyricum, Macedonia and Epirus. Wherever the linguistic ancestor of Albanian was, Latin came to exert a huge influence on it. As this contact occurred many centuries ago, Albanian has since had time to undergo certain changes, which can obscure the connections back to Latin. Most famously, the Tosk Albanian dialects, on which the modern standard is based, underwent a process of rhotacism, in which an /n/ sound became /r/. Because of this, verë ‘wine’ and rërë ‘sand’ don’t much resemble their Latin ancestors, vīnum and arēna. In Gheg Albanian to the north, venë and ranë remain closer.

Believe me though, the connections are there in their hundreds! Take a look at these words in Standard Albanian, beside their likely Latin ancestor:

Albanian    Latin Source    Translation
peshkop     episcopus       bishop
dhuroj      dōnāre          I give
vij         venīre          I come
lepur       lepus           hare, rabbit
korb        corvus          raven
kalë        caballus        horse

With all these in mind, I dare say Albanian becomes a bit more familiar to English speakers! Thanks to Latin, there are plenty of cognate connections to spot, like portë and portal, mik and amicable, mbret and emperor, or kalë and cavalry!

The specific forms of words like lepur and vjetër also tell us something about the grammatical situation of Latin at that time; they appear to come from words not in the nominative case, hinting at the general decline of the case system.

The lexical contribution of Latin to Albanian is simply vast; one previous estimate for the total today is around 600 words, another around 800 (Gramelová 2013: 101). Many loanwords are unsurprisingly culturally specific. There are those to do with government and ruling, and those to do with viniculture, both of which make sense in light of the Roman way of life. Yet there are some that belong to more basic domains of vocabulary.

For example, pak ‘little, few’ and shumë ‘many’ are common words, yet come from Latin paucus and summus. The core ingredients of a family (mother, father, sister, etc.) would have hardly differed between the Roman and the ur-Albanian cultures, so we wouldn’t expect the prestige language to have made such an impact there. However, even some family terms in Albanian seem to come from Latin!

[Table of Albanian family terms; columns: Albanian, Latin Origin, English]

Numbers usually remain stable in cases of language contact, and yet qind ‘hundred’ and mijë ‘thousand’ are also potential borrowings of Latin centum and mille. There is even some evidence for the borrowing of functional words and grammatical structures. Relative clauses with që (‘which, who, that’) could be an influence of Latin quī, but without knowledge of Pre-Proto-Albanian grammar, this remains speculative. We must also be careful not to overstate the Latin impact, as there are Albanian words that could equally be borrowed from Latin or inherited all the way from Proto-Indo-European, like qeni ‘dog’ and ve ‘egg’. Latin may have canis and ōvum, but that’s not the only way Albanian has acquired Indo-European-looking vocabulary.

In sum, it must be said that there are many ingredients that have gone into Modern Albanian; it has Greek, Latin, Slavic, Venetian, Turkish and now English loanwords, as well as of course its own inherited systems of grammar, sounds and vocabulary. Yet it was the Romans that left the biggest external impression on it. Even the Albanian for ‘Albanian’, shqip, has been derived by Hamp (1999) from a Latin verb!

However, the impression was not quite big enough to completely replace Pre-Proto-Albanian and create a new Romance language, or to qualify Albanian today for the status. So, it gains my esteemed personal accolade of Almost Romance.

Welsh


Moving over now to the west of Europe, the island of Great Britain has been home for centuries to the Celtic language family. When the Romans landed in the first century AD, the locals (at least in what is now England, Wales and southern Scotland) seem to have spoken one language, or dialects thereof. This language was Common Brythonic, or simply ‘British’.

By the time the Roman administration packed up in the fifth century, the Romans had drastically altered the linguistic lie of the land. British still thrived in some areas, but in others Latin must have become the common mother tongue, as it did on the Continent. Presumably it was into such Latinised parts that the Angles and Saxons later migrated, as otherwise we have a more difficult time explaining the conspicuous absence of Brythonic influences in the newcomer to Britain, Old English.

Brythonic unity was not to last. Out of the changing socio-political situation in Britain arose three distinct languages: Breton, Cornish and Welsh. These three bear witness today to the linguistic landscape of Britain in those first few centuries AD. Welsh, while still a Brythonic language, was born out of that complex landscape and shows all the signs of intense and long-term contact with Latin.

Percentage of Welsh speakers in Wales according to the 2011 UK census.

One such sign is vocabulary. Parina (2010) has compiled a set of the thousand most common words in Welsh, out of a corpus of one million. Of those thousand, 87 are of Latin origin, which even my poor maths can calculate as a noticeable minority of 8.7%.

Many of these still resemble their Latin ancestor, such as eglwys ‘church’, capel ‘chapel’, corff ‘body’ and syml ‘simple’. Many are quite basic in their meaning, such as pobl ‘people’, llaeth ‘milk’ and cadair ‘chair’, which come from Latin populus, lac and cathedra. Many unsurprisingly concern Christianity, a religion transmitted via and later promoted by the Roman state. Nadolig, the Welsh for ‘Christmas’, goes back to a Latin adjective for things to do with birthdays (nātālicius), while ysbryd ‘spirit’ and esgob ‘bishop’ derive respectively from spīritus and episcopus. Meanwhile, the pagan gods of Rome endure in the Welsh days of the week!

Welsh          Latin Source      Translation
Dydd Llun      Diēs Lūnae        Monday (the day of the Moon)
Dydd Mawrth    Diēs Mārtis       Tuesday (the day of Mars)
Dydd Mercher   Diēs Mercuriī     Wednesday (the day of Mercury)
Dydd Iau       Diēs Iovis        Thursday (the day of Jupiter)
Dydd Gwener    Diēs Veneris      Friday (the day of Venus)
Dydd Sadwrn    Diēs Sāturnī      Saturday (the day of Saturn)
Dydd Sul       Diēs Sōlis        Sunday (the day of the Sun)

As with Albanian, Welsh has had many subsequent centuries to undergo additional sound changes, altering and obscuring the Latinity of some words. Llyfr ‘book’ looks very Welsh and has the famous /ɬ/ sound, but it derives from Latin liber. A library is a llyfrgell, which a Roman might recognise as a cella (in English: cell) for books.

Welsh also spoils us with the glimpses it offers into ordinary Latin in Roman times, often called ‘Vulgar Latin’. Like any language, Latin had its differences in prestige and register, but it can be hard to see the daily low-register language of the majority in our surviving sources. Welsh can help. The Welsh colour coch and Albanian kuq together tell us that people widely used a word for ‘red’ (or some similar colour) derived from coccum, a berry that produced a scarlet dye. Likewise, diwrnod ‘day’ goes back to Latin diurnāta, apparently an alternative word for ‘day’ that also led to French journée and Italian giornata.

Latin words written with an initial H appear borrowed into Welsh without an /h/ sound, such as awr ‘hour’ from Latin hōra. This absence of H is unlikely to be the result of Welsh’s assimilation of loanwords, since Welsh does and did have a /h/ sound. Instead these H-less words may reflect the widespread loss of /h/ within the Late Latin input, which is a well-known step in the development of Romance. Furthermore, words that started with sc- and st- in Latin now begin with ysg- and yst- in Welsh, having gained an extra vowel.

[Table of Welsh words in ysg- and yst-; columns: Welsh, Latin Source, Translation]

This phenomenon, called prothesis, has been productive in Welsh, and hasn’t only affected words of Latin origin. However, it’s plausible to connect it to the prothesis seen in Romance languages, such as French and Spanish, in which a vowel was added in the same contexts. Latin schola and status have become not only ysgol and ystad in Welsh, but also escuela and estado in Spanish and école and état in French. Welsh may have therefore taken on and preserved a common phonetic feature of everyday Latin in the Roman Empire.

A bilingual sign for the Caernarfon Record Office. Alongside the Latinate days of the week, we also have oriau ‘hours’ from Latin hōra ‘hour’, and addysg ‘education’ from Latin discere ‘to learn’.

So, not only do we have evidence for Latin’s lexical influence, but also for a lasting legacy in Welsh phonology. Furthermore, various grammatical features of Welsh have been laid at Latin’s door too. Russell (2011) discusses four of the big contenders, and dismisses two, but concludes that the creation of compound prepositions and a pluperfect tense pass his strict criteria for a Latin origin. All in all, while still meriting the status of a Celtic and Brythonic language, Welsh is also a product of Roman Britain and as such it maintains the effects of that interesting chapter in British language. With all due respect and much affection, it joins Albanian as one of my Almost Romance languages.

To Conclude

Here are my two candidates for my not-entirely-serious language family of Almost Romance. I think considering them in this light offers necessary nuance for the history of European languages, and adds complicating colour to the life of Latin. That famous Roman tongue may have won great popularity, much of it at the point of a sword, but that popularity varied from place to place and from people to people. Latin had a spectrum of influence, with Albanian and Welsh attesting to the upper-mid-range of that scale. They therefore take their place alongside French, Spanish, Italian and the rest in illuminating the subsequent history of Latin and the creation of members of a new language family – well, members and almost members.

References



  • Brown, P. (1971). The World of Late Antiquity: AD 150–750. Harcourt Brace Jovanovich.
  • Gramelová, L. (2013). Latinské a románské prvky v albánštině. Dissertation. Filozofická fakulta Univerzity Karlovy.
  • Hamp, E. (1999). Lectures on the Albanian language: History and dialectology. Ohio State University.
  • Lloyd-Jones, J. (1910). Some Latin Loan-words in Welsh. Zeitschrift für celtische Philologie 7(1). 462–474.
  • Parina, E. (2010). Loanwords in Welsh: Frequency Analysis on the Basis of Cronfa Electronaeg o Gymraeg. Studia Celto-Slavica 3. 183–194.
  • Rusakov, A. (2017). Albanian. The Indo-European Languages. Second edition. 552–608. Routledge.
  • Russell, P. (2011). Latin and British in Roman and Post-Roman Britain: methodology and morphology. Transactions of the Philological Society 109(2). 138–157.
  • Schrijver, P. (1995). Studies in British Celtic historical phonology. Volume 5 of Leiden studies in Indo-European. Brill Rodopi.
  • Schrijver, P. (2002). The rise and fall of British Latin: evidence from English and Brittonic. The Celtic Roots of English. 87–110.
  • Geiriadur Prifysgol Cymru.

12 thoughts on “The Almost Romance Languages”

  1. Hi Danny, I tried to post a comment, but I had to go through a ridiculously convoluted process and it didn’t work. Anyhow I wanted to say that it was an interesting article.

    I have a question: I’ve read somewhere that some scholars have put forward the idea that English could be considered a Romance-Germanic Creole language. What are your thoughts on this? Best Regards, Simon

    1. Hi Simon,

      Thank you for your kind words and interesting question! It’s admittedly such a relief to know that someone liked the piece!

      The potential creole status of English is an intriguing issue that I have often pondered myself. I have no particular problem with ascribing to English a special status per se that acknowledges its mixed ancestry; the French/Latin influence has been extremely strong, and it’s reasonable to want to describe English with a term that contrasts with languages like German that show less external influence.

      My qualm about the term ‘creole’ specifically is that it’s a confusing term that means different things to different people, linguists and non-linguists alike. It’s a term that’s still caught up in the period of European colonial expansion and the languages that emerged then, and there’s a longstanding idea that creoles are of some different grammatical nature, compared with other supposed ‘real’ languages, which doesn’t stand up to scrutiny. For this reason, I myself prefer ‘mixed language’ or ‘fusion’. Regarding English though, all aspects of language considered, I’d say that it’s still a Germanic language. Its system of grammar, sounds and lexicon remain pretty Germanic, despite the countless external influences.

      I hope this makes sense and answers your question. Probably a longer answer than you were expecting!

      Best wishes,


    2. Interesting and enjoyable. Appreciate the “superficial” analysis with its various cultural implications.
      Disregard the prig. Folks like that are the reason I quit a DPhil, and made a career in high-rise construction.

  2. Wow! The schedule with the days of the week in Welsh looks as if it was in Romania (except “Friday”, in Romanian “vineri”). Interesting stuff. Thank you!

  3. I am sorry to disappoint language enthusiasts. That is not how language typology works. Suprastratic penetration of the Lexicon does not entail genetics. – Languages are considered to be related when their underlying Grammar (morphosyntax; i.e., the structure that sustains it) is evidently parallel. That is why a language such as Romanian which coincides scarcely in Lexicon with Latin is considered a Romance Language. – By your standards, Mexican Spanish should be an almost Uto-Aztec language.


    1. None of this I ever deny. As I made quite clear, the status of ‘almost’ is unscientific and unserious, and only meant to make us think and reconsider the Romance languages in their general context. I would willingly apply ‘almost’ to other circumstances, because it implies nothing in terms of language genealogy. Do read it more thoroughly and more charitably.


  4. Wonderful and fascinating article. Thank you for taking the time to research and write this. Interesting to think of how the Latinate words, as they are today, have the accents of the original learners locked in, then a millennium of further evolution on top. Much like how I imagine French is Latin spoken with a Gaulish accent, and then put in a blender on “liquify”.

  5. This is fascinating. Thank you so much for sharing this with us. I gave it a quick skim while at work and cannot wait to read it more closely tonight when I get home. Thank you again.
