Matt Spike

the life logistic



Week 4 reading: Evolution of Speech

This week’s lecture will be on the evolution of speech, the default (but not the only) human modality for language. The lecture will focus on the evolution of the mechanisms underlying speech and vocal learning, and on how learning sets up the self-organisation of contrastive systems of speech sounds. For this reading, we’d like you to get some relevant background on the physical apparatus behind speech (mainly the vocal tract) by reading Fitch Chapter 8.

The bulk of this chapter is on the position of the larynx in humans and other animals – Fitch has done important work here, using comparative data to overturn some widely-accepted but incorrect assumptions about the uniqueness of the human descended larynx. As described in the reading, most animals, and very young humans, have a larynx that is positioned high in the throat, which allows them to engage the larynx directly with the velum, forming a nice tight seal which prevents food/liquid being inhaled.

Figure (“Breathing while swallowing”): In many other animals, and young human infants, the larynx engages with the velum and nasal cavity, making it possible to breathe through the nose (red arrows) while swallowing (blue arrows). Image from http://thebrain.mcgill.ca/flash/capsules/outil_bleu21.html.

The resting position of the human larynx is lower in the throat, too far down to engage with the velum in this way. The larynx moves to this lower position early in development (around age 3 months), with a second descent to an even lower position occurring during puberty in males only (which turns out to be informative about possible evolutionary functions for the low position of the larynx in humans, see below). This low larynx position increases (at least hypothetically) the risk of choking: every time you swallow, whatever you are swallowing has to pass over the top of your windpipe and into the oesophagus, while the epiglottis folds down to cover the opening to the windpipe.

It has long been thought that the risks of the descended larynx – the danger of choking on food or liquid – must be outweighed by some advantage, perhaps one related to language.


One possibility is that the lower position of the larynx also drags the tongue root down into the pharynx, which gives us a two-tube vocal tract: we can manipulate the size of the oral cavity with the tongue tip, and independently manipulate the pharyngeal tube by pushing the tongue backwards and forwards – this is really nicely illustrated in the MRI images on this page, under section 2 “Vowel Articulation”. The idea is that this two-chamber vocal tract gives access to a wider range of formant frequencies, boosting the range of distinctive speech sounds we can produce.
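To make the formant argument a bit more concrete, here is a minimal Python sketch (not taken from Fitch). It rests on a textbook simplification: each tube is treated as an independent quarter-wavelength resonator (closed at one end, open at the other). That ignores the acoustic coupling between the two tubes and is only a rough fit to a narrow-pharynx, wide-mouth vowel shape. The speed of sound, the 17.5 cm tract length, the tube lengths and the function names are all illustrative assumptions.

```python
# Rough two-tube sketch: treat the pharyngeal (back) and oral (front) tubes as
# independent quarter-wavelength resonators. This ignores coupling between the
# tubes, so the numbers are only indicative.

SPEED_OF_SOUND_CM_S = 35000.0  # assumed speed of sound in warm, humid air


def quarter_wave_resonances(length_cm, count=2):
    """Resonances (Hz) of a tube closed at one end and open at the other."""
    return [
        (2 * n - 1) * SPEED_OF_SOUND_CM_S / (4.0 * length_cm)
        for n in range(1, count + 1)
    ]


def two_tube_formants(back_cm, front_cm, count=2):
    """Lowest `count` resonances when both tubes resonate independently."""
    pooled = quarter_wave_resonances(back_cm, count) + quarter_wave_resonances(
        front_cm, count
    )
    return sorted(pooled)[:count]


TOTAL_CM = 17.5  # illustrative total vocal tract length

# A single uniform tube of this length has fixed resonances.
print("uniform tube:", [round(f) for f in quarter_wave_resonances(TOTAL_CM)])

# Moving the boundary between the two tubes shifts the lowest resonances.
for back_cm in (6.0, 8.0, 10.0):  # hypothetical pharyngeal tube lengths
    front_cm = TOTAL_CM - back_cm
    f1, f2 = two_tube_formants(back_cm, front_cm)
    print(f"back={back_cm} cm, front={front_cm} cm: F1={f1:.0f} Hz, F2={f2:.0f} Hz")
```

In this simplified picture, a single uniform tube of fixed length has its lowest two resonances pinned in place, whereas shifting the boundary between the two tubes moves them around. That is the sense in which a two-tube tract widens the available formant space.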

However, as explained in section 8.4 of the Fitch reading, it turns out that the low position of the larynx/tongue root in humans isn’t actually very unusual. First, many other mammals can dynamically reconfigure their vocal tract during vocalisation – they pull their larynx low in the throat while vocalising, giving them (temporarily) a two-tube vocal tract very similar in configuration to ours. You can see this in action in the X-ray movies of a vocalising goat (clip uploading soon). Second, it turns out that many other species (e.g. koalas, big cats, deer) have a permanently low larynx position, giving them a two-tube vocal tract similar to humans – you can see the low resting position of the larynx (the big lump in the throat), and also the fact that these animals pull it even lower when vocalising, in this video of a vocalising red deer (also coming later today).

There are two consequences of this. Firstly, since many mammals can reconfigure their vocal tract while vocalising, previous attempts to figure out the position of the larynx/tongue root in fossil hominids are rather pointless – even if we were able to infer a high resting position, there would still be the possibility that the vocal tract was reconfigured during vocalisation. Second, since a (permanently or temporarily) descended larynx/tongue root is seen in species that don’t have language or even complex vocalisations (e.g. deer, big cats – they basically just roar, which I think we can all agree is pretty spectacular but not language), there must be some other pressure that explains the convergent evolution of the descended larynx in humans and these other species.

Fitch suggests that the most likely explanation is size exaggeration – the lower your larynx, the longer your vocal tract, and the bigger you sound. This can be ‘faked’ to a certain extent by pulling your larynx lower, but is intrinsically an honest signal – you can’t pull your larynx outside your body, so bigger individuals have longer vocal tracts and sound bigger. Size exaggeration might be useful in sexual displays, in male-male competition (think of the roaring red deer) or more generally in territory defence (both male and female big cats roar, and both have a permanently descended larynx that emphasises their size for anyone hearing it). In humans, the second descent of the larynx in puberty in males suggests a sexual signalling function; the descent of the larynx in both sexes at age 3 months is presumably driven by different pressures, applying to both sexes. The descended larynx may therefore give us a nice vocal tract for producing contrastive speech sounds, but needn’t (initially) have been selected for this function. Instead, the descended larynx might be a preadaptation for speech: the reconfigured vocal tract was originally selected for due to fitness payoffs associated with size exaggeration, then subsequently re-tooled for the benefits it offered for complex speech.
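The link between vocal tract length and "sounding bigger" can be sketched numerically. The Python below assumes the vocal tract is a uniform tube closed at the larynx and open at the lips, so that its resonances fall at F_n = (2n - 1)c / 4L and neighbouring formants are spaced c / 2L apart. The lengths (15 cm at rest, 18 cm with the larynx pulled down), the speed of sound and the function names are illustrative assumptions, not measurements from the reading.

```python
# Uniform-tube sketch: a longer vocal tract gives lower, more closely spaced
# formants, which is the acoustic cue a listener could use to judge size.

SPEED_OF_SOUND_CM_S = 35000.0  # assumed speed of sound in warm, humid air


def uniform_tube_formants(length_cm, count=3):
    """Formants (Hz) of a uniform tube closed at one end, open at the other."""
    return [
        (2 * n - 1) * SPEED_OF_SOUND_CM_S / (4.0 * length_cm)
        for n in range(1, count + 1)
    ]


def formant_spacing(formants):
    """Average gap (Hz) between neighbouring formants."""
    gaps = [high - low for low, high in zip(formants, formants[1:])]
    return sum(gaps) / len(gaps)


def apparent_length_cm(spacing_hz):
    """Tube length (cm) implied by a given formant spacing: L = c / (2 * spacing)."""
    return SPEED_OF_SOUND_CM_S / (2.0 * spacing_hz)


# Hypothetical lengths: resting position vs. larynx pulled 3 cm lower.
for label, length_cm in (("resting", 15.0), ("larynx lowered", 18.0)):
    formants = uniform_tube_formants(length_cm)
    spacing = formant_spacing(formants)
    print(
        f"{label}: formants={[round(f) for f in formants]} Hz, "
        f"spacing={spacing:.0f} Hz, implied length={apparent_length_cm(spacing):.1f} cm"
    )
```

Under these assumptions, lowering the larynx narrows the formant spacing, so the caller sounds as if it has a longer vocal tract. The limit on how far the larynx can be pulled down is what keeps the signal tied to actual body size.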

The overall picture on the human vocal tract is therefore that, sadly (?), humans aren’t as special as we thought in physical apparatus for speech production. Fitch briefly reviews the evidence suggesting that we also aren’t special in our auditory apparatus for speech perception – we have a standard mammal ear, and other mammals have been shown to exhibit e.g. categorical perception effects that were at one point assumed to be uniquely human. For instance, chinchillas trained in the lab to discriminate between /t/ and /d/ show the same sort of discrimination curve as adult humans. The original paper showing this is quite short, if a little stomach-churning in terms of the training method used.

The conclusion is therefore that any human-unique adaptations for speech must be in the brain, not in the peripheral apparatus.

This was originally written by James Winters in 2015, with edits by Kenn Smith in 2016, Marieke Schowstra in 2017, Christine Cuskley in 2018, and Matt Spike in 2021.