Genealogical classification


(The first half of the lecture will consist of a video screening. There is a handout that helps you organise some of the information in the video.)


Ethnologue, the largest survey of languages today, first attempted a world-wide review in 1974, in which 5,687 languages were listed. The present edition (2018) lists 7,097 languages. Even if we are a little sceptical about some of these numbers (we might prefer to think of some of these ‘languages’ as varieties or dialects of a single language), there are still very many languages in the world. How did language itself arise in the human species?


If we compare the human species to other animals, it is clearly significant that the human brain is relatively larger than that of other animals. Early Homo erectus in Africa (from about 1.7 to 1 million years BC) averaged 900 cc (cm³) in brain size, but later Homo erectus specimens from 500,000 BC average 1,100–1,200 cc. Today, the average brain size is 1,400 cc. If we assume a correlation between brain size and intelligence, we might then say that language arose with increased intelligence. (The truth of the matter must be more complex than this, though. People with small brains, such as nanocephalic dwarfs, still have language.)


Secondly, bi-pedalism (standing upright) must have been a factor as well. The hands are freed up for other actions, such as carrying, which in turn frees up the mouth from needing to perform this function. The development of a resonating chamber of about 1½ inches (4 centimetres) above the larynx allowed for the development of various speech sounds.


Did language arise independently in different locations (this view is known as polygenesis)? Or are all languages ultimately evolved from a common ancestor (this view is known as monogenesis)?


We can do a small-scale study based on the instruction leaflet supplied with an item of Ikea self-assembly furniture.




Svenska [‘Swedish’]


Kontrollera först innehållet. Vad som ingår ser du längst ner på nästa sida. Om något saknas eller du får problem, kontakta ditt varuhus.



English


First check the contents. There is a list of contents on the left of the other side. If anything is missing, or you have a problem, contact your store.

Deutsch [‘German’]


Zuerst den Inhalt kontrollieren. Was dazu gehört, sehen Sie ganz links auf der nächsten Seite. Sollte etwas fehlen oder sollten Probleme auftreten, setzen Sie sich bitte mit Ihrem Einrichtungshaus in Verbindung.

Français [‘French’]


Commencez par contrôler le contenu en le comparant à la liste page suivante, à l’extrême gauche. Si quelque chose manquait ou que vous aviez un problème, contactez votre magasin.

Nederlands [‘Dutch’]


Controleer eerst de inhoud. Uiterst links op de volgende bladzijde staat alles opgesomd. Als er iets ontbreekt of als je problemen krijgt, neem dan kontakt op met het woonwarenhuis.

Español [‘Spanish’]


Verifica primero el contenido. En la página siguiente, a la izquierda, encontrarás la descripción del contenido. Monta el mueble siguiendo el orden numérico y las indicaciones de los dibujos. Si algo hace falta o si tienes dificultades, llama a tu tienda distribuidora. Al cabo de unas dos semanas debes apretar nuevamente todos los herrajes.

Italiano [‘Italian’]


Controlla prima il contenuto. Il contenuto è segnato a sinistra nella pagina seguente. Monta il mobile secondo l’ordine del disegno. Se manca qualcosa o se sorgono dei dubbi, chiama il punto vendita. Ristringere tutte le viti dopo alcune settimane.

Based on this we can say a little about:

  • grammar (a bit)
  • lexis (more)
  • pronunciation (not much)



Words similar to English in other languages | Other similar words with the same meaning
important (Fr), importante (Sp), importante (It) | viktigt (Sw), wichtig (Ge)
(none) | monterings- (Sw), montier- (Ge), montage (Fr), montage- (Du), montaje (Sp), montaggio (It)
instructions (Fr), instrucciones (Sp), instruzioni (It) | anvisning (Sw), Anleitung (Ge), aanwijzing (Du)
verifica (Sp) [like ‘verify’?] | kontrollera (Sw), kontrollieren (Ge), contrôler (Fr), controleer (Du), controlla (It)
först (Sw), eerst (Du) | primero (Sp), prima (It) [like ‘primary’?]
contenu (Fr), contenido (Sp), contenuto (It) | innehållet (Sw), Inhalt (Ge), inhoud (Du)


1. Consider the words for ‘important’, ‘instructions’, ‘first’ and ‘contents’

  • Group A – French, Spanish and Italian
  • Group B – Swedish, German and Dutch
  • English seems problematic?


2. Consider the arrangement ‘assembly instructions’ or ‘instructions for assembly’

  • Group A languages seem to prefer the noun + prepositional phrase construction
  • Group B languages seem to prefer the noun + noun construction.
  • From the point of view of the grammar of the noun phrase, English resembles the Group B languages more than the Group A languages.
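The grouping in point 1 can be mimicked mechanically. What follows is only a minimal sketch: it assumes that plain string similarity (Python's `difflib.SequenceMatcher`) is an acceptable stand-in for a linguist's judgement of resemblance, which it is not in general, since it looks at spelling and knows nothing about sound or history. The word forms are the ones listed above; the names `WORDS`, `similarity` and `nearest` are made up for this illustration.

```python
from difflib import SequenceMatcher
from itertools import combinations

# Word forms copied from the leaflet comparison above. A language is simply
# left out of a row when the notes give no form for it.
WORDS = {
    "important": {"English": "important", "French": "important",
                  "Spanish": "importante", "Italian": "importante",
                  "Swedish": "viktigt", "German": "wichtig"},
    "instructions": {"English": "instructions", "French": "instructions",
                     "Spanish": "instrucciones", "Italian": "instruzioni",
                     "Swedish": "anvisning", "German": "Anleitung",
                     "Dutch": "aanwijzing"},
    "first": {"English": "first", "Spanish": "primero", "Italian": "prima",
              "Swedish": "först", "Dutch": "eerst"},
    "contents": {"English": "contents", "French": "contenu",
                 "Spanish": "contenido", "Italian": "contenuto",
                 "Swedish": "innehållet", "German": "Inhalt",
                 "Dutch": "inhoud"},
}

GROUP_A = {"French", "Spanish", "Italian"}
GROUP_B = {"Swedish", "German", "Dutch"}


def similarity(lang1, lang2):
    """Mean spelling similarity (0 to 1) over the concepts both languages have."""
    scores = [
        SequenceMatcher(None, forms[lang1].lower(), forms[lang2].lower()).ratio()
        for forms in WORDS.values()
        if lang1 in forms and lang2 in forms
    ]
    return sum(scores) / len(scores)


def nearest(lang, candidates):
    """The candidate language whose word forms look most like those of `lang`."""
    return max((c for c in candidates if c != lang),
               key=lambda c: similarity(lang, c))


def mean(values):
    values = list(values)
    return sum(values) / len(values)


# Average resemblance inside each group versus between the two groups.
within = mean(similarity(a, b)
              for group in (GROUP_A, GROUP_B)
              for a, b in combinations(sorted(group), 2))
across = mean(similarity(a, b) for a in GROUP_A for b in GROUP_B)

if __name__ == "__main__":
    print(f"mean similarity within groups: {within:.2f}")
    print(f"mean similarity across groups: {across:.2f}")
    for lang in sorted(GROUP_A | GROUP_B | {"English"}):
        print(lang, "->", nearest(lang, GROUP_A | GROUP_B))
```

On these few words the within-group average comes out higher than the cross-group average, and English lands next to a Group A language. That is the lexical half of why English ‘seems problematic’: the measure sees only vocabulary and says nothing about the noun phrase grammar in point 2.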


How come?

(a) Hypothesis I: These were originally different languages, but because of contact between the different speakers, they were influenced by one another’s lexical items and grammatical structures.

(b) Hypothesis II: These were originally one language, only they gradually became different. Perhaps people migrated, and the language changed in different ways: lexically, grammatically and phonologically.

Hypothesis I = centripetal force (convergence)

Hypothesis II = centrifugal force (divergence).

Diagram: language origins

Figure 1

  • Cheating: the number one in the various languages:
  • en (Sw), un (Fr), ein (Ge), uno (Sp), un (It) and een (Du).
  • cf. onru (Tamil), hitotsu or ichi (Japanese), moja (Swahili), satu (Malay), yi (Mandarin Chinese) and yat (Cantonese Chinese).

  We can imagine a common source for the ‘original’ Group A and Group B languages.

Image on source language


  • Group B languages are usually called the Germanic or Teutonic group of languages
  • Other existing Germanic languages include Afrikaans (sometimes called Cape Dutch and spoken in southern Africa), Danish (spoken in Denmark), Icelandic (spoken in Iceland), Norwegian (spoken mainly in Norway) and Yiddish (the language of the Jews in Europe and America).
  • The Group A languages are commonly called the Romance (= vernacular) languages.
  • These were originally all local or vernacular spoken versions of Latin (‘vulgar Latin’) used throughout the Roman empire.
  • This group is called the Italic group of languages.
  • Other existing Italic (Romance) languages include Catalan (spoken mainly in some parts of Spain and France), Portuguese (spoken mainly in Portugal and Brazil) and Romanian (spoken mainly in Romania).

The Germanic group, the Italic group and other groups of languages form a larger family of languages. This is called the Indo-European family of languages.

branches of Indo-European languages


Whilst many had been aware of the similarities between the Romance languages (French, Spanish, Italian, Portuguese, etc.), and therefore of their common source, it took a British judge stationed in India, Sir William Jones, to cast the net much wider and notice similarities between apparently very different languages. He made systematic comparisons of the lexis and the grammar of languages like Greek, Latin, English and Sanskrit. This provided strong evidence for the existence of a so-called family of Indo-European languages, with an ultimate common source that was now extinct.


This is what he had to say:


The Sanscrit language, whatever be its antiquity, is of a wonderful structure; more perfect than the Greek, more copious than the Latin, and more exquisitely refined than either, yet bearing to both of them a stronger affinity, both in the roots of verbs and the forms of grammar, than could possibly have been produced by accident; so strong indeed, that no philologer could examine them all three, without believing them to have sprung from some common source, which, perhaps, no longer exists: there is a similar reason, though not quite so forcible, for supposing that both the Gothick and the Celtick, though blended with a very different idiom, had the same origin with the Sanscrit, and the old Persian might be added to this family, if this were the place for discussing any question concerning the antiquities of Persia.


This eventually gave rise to a way of classifying languages, known as the genetic classification or the genealogical classification. The main metaphor that has been employed for talking about languages this way is the metaphor of the family tree (introduced by the German linguist Schleicher, who thought of language as an organism that could grow and decay). This method compares different languages and uses as many written remains as are available. Clearly, this method is more successful in places where more written records are available. Where there are gaps in the tree, reconstruction is possible (indicated by an asterisk below) by comparing cognate forms.


The table above shows that Italian, Spanish, French, Portuguese and Catalan are ‘sister languages’, derived from the parent language Latin. We can extend the diagram and show ‘daughter languages’ for Sanskrit and Gothic for example as well. The name given to the ‘common source’ is Proto-Indo-European or PIE. It is thought to have been spoken before 3000 BC and to have split up into different languages, so that by 2000 BC many of the linguistic differences had been established.


PIE speakers would seem to have lived in the steppe region of southern Russia around 4000 BC but subsequently migrated to other regions. Experts have been able to reconstruct some vocabulary items, including many items pertaining to family. There were also many words for in-laws, used only in relation to the bride – which suggests that the society must have been patriarchal in nature. There are also terms for domesticated animals, suggesting that there was farming activity. There are words for the body, tools and weapons, as well as abstract notions relating to law, religious belief and social status. Numerals were available up to at least 100.


There are no written records of PIE, which suggests that PIE was not a written language. PIE sounds have therefore had to be reconstructed. (Some suggest that PIE was a development from an even earlier language, sometimes called Nostratic; see the transcript of the programme ‘In Search of the First Language’.) If we examine the table above, it is clear that cognate words are pronounced differently in the various languages in the same family. When we examine a sufficient number of words, we will notice that the changes are not haphazard but often quite systematic. The 19th-century German philologist Jakob Grimm (1785–1863) worked out a sound law, known as Grimm’s law or ‘the first sound-shifting’, describing how some Germanic consonants diverged from those of PIE. This is illustrated in the table below.


Aspirated voiced stops → Voiced stops → Voiceless stops → Voiceless fricatives
bh → b → p → f
dh → d → t → θ
gh → g → k → h
Examples of words that show Grimm’s law include the following.


Change illustrated | (Old) English
p → f |
p → f | feoh ‘cattle, money’
p → f |
t → θ |
t → θ |
t → θ/ð |
d → t |
d → t | to wit
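Because Grimm’s law is systematic, it can be written down as a simple lookup. The sketch below is an illustration only: it assumes the usual simplified textbook statement of the law (aspirated voiced stops become voiced stops, voiced stops become voiceless stops, voiceless stops become voiceless fricatives), ignores all exceptions, and looks only at the first consonant of a word. The helper names and the Latin and English word pairs are not taken from the table above; Latin is used as a stand-in for the unshifted consonant.

```python
# Grimm's law as a lookup table (standard textbook statement, simplified).
GRIMM = {
    "bh": "b", "dh": "d", "gh": "g",  # aspirated voiced stops -> voiced stops
    "b": "p", "d": "t", "g": "k",     # voiced stops -> voiceless stops
    "p": "f", "t": "θ", "k": "h",     # voiceless stops -> voiceless fricatives
}


def germanic_reflex(consonant):
    """Germanic outcome of one PIE stop; other sounds are returned unchanged."""
    return GRIMM.get(consonant, consonant)


def shift_initial(form):
    """Apply the shift to the first consonant of `form` only (hypothetical helper)."""
    for stop in sorted(GRIMM, key=len, reverse=True):  # try "bh" before "b"
        if form.startswith(stop):
            return GRIMM[stop] + form[len(stop):]
    return form  # the word does not begin with one of the stops


# Latin stands in for the unshifted consonant; the English word shows the
# shifted one. Only the first sound of each pair is being compared.
PAIRS = [("pater", "father"), ("tres", "three"), ("decem", "ten")]

if __name__ == "__main__":
    for latin, english in PAIRS:
        print(f"{latin} -> {shift_initial(latin)}   (cf. English {english})")
```

Each sound is looked up exactly once. Feeding the output back into the table would wrongly push bh all the way to f, which is why the three changes are treated as one simultaneous shift rather than three steps applied in turn.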


When you’re ready to take the quiz based on this topic, go to the IVLE page and click on ‘Assessment’ on the left, and then on ‘beginnings’.

