The Statistical Structure of Human Speech Sounds Predicts Musical Universals

David A. Schwartz, Catherine Q. Howe, and Dale Purves

The Journal of Neuroscience, August 6, 2003, 23(18):7160-7168

Department of Neurobiology and Center for Cognitive Neuroscience, Duke University Medical Center, Duke University, Durham, North Carolina 27710

Abstract:

The similarity of musical scales and consonance judgments across human populations has no generally accepted explanation. Here we present evidence that these aspects of auditory perception arise from the statistical structure of naturally occurring periodic sound stimuli. An analysis of speech sounds, the principal source of periodic sound stimuli in the human acoustical environment, shows that the probability distribution of amplitude-frequency combinations in human utterances predicts both the structure of the chromatic scale and consonance ordering. These observations suggest that what we hear is determined by the statistical relationship between acoustical stimuli and their naturally occurring sources, rather than by the physical parameters of the stimulus per se. (Bold text emphasis by Martin Braun)

Comment:

This study shows that the spectral content of speech sounds is universally biased towards the frequency ratios of the consonant intervals of the common 12-tone scale. It was already known that the auditory system of humans, and of other mammals, is biased towards these frequency ratios, as seen in psychoacoustic and behavioral results and in the anatomy of the assumed apparatus of pitch extraction. On evolutionary and developmental grounds, one would expect this bias in hearing to be an adaptation to the bias in vocalization. The new results confirm this expectation and establish an important link between speech and hearing. An interesting side result of the study is that male speech, but not female speech, includes a bias towards the minor third, whereas female speech includes a stronger bias towards the major third than male speech does (Fig. 2D). The reason for this sex difference is the lower fundamental frequency of male voices, which means that different partials are favored by the resonance of the vocal tract. A corresponding sex difference had previously been found in the distribution of frequency ratios in more than 5000 pairs of spontaneous otoacoustic emissions (Braun, 1997). This detail further emphasizes the adaptation of the auditory system to the physical parameters of the organs of vocalization. (Comment by Martin Braun)
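The arithmetic behind the frequency ratios mentioned in the comment can be illustrated with a short Python sketch. It is not the authors' analysis, which was based on recorded speech. It only lists the ratios between low partials of a harmonic series and finds the nearest step of the 12-tone scale for each. The function names, the use of equal temperament as the reference tuning, and the cutoff at the sixth partial are all assumptions made for illustration.

```python
import math
from fractions import Fraction

# Conventional names for the 13 steps from unison to octave (illustrative labels).
INTERVAL_NAMES = [
    "unison", "minor second", "major second", "minor third", "major third",
    "perfect fourth", "tritone", "perfect fifth", "minor sixth", "major sixth",
    "minor seventh", "major seventh", "octave",
]


def cents(ratio):
    """Size of a frequency ratio in cents (1200 cents = one octave)."""
    return 1200.0 * math.log2(float(ratio))


def harmonic_ratios(max_harmonic=6):
    """Distinct ratios n/m between partials m <= n, limited to one octave."""
    ratios = set()
    for m in range(1, max_harmonic + 1):
        for n in range(m, min(2 * m, max_harmonic) + 1):
            ratios.add(Fraction(n, m))
    return sorted(ratios)


def nearest_scale_step(ratio):
    """Return (semitone step, deviation in cents) for the nearest
    equal-tempered step. Equal temperament is an assumption of this sketch."""
    size = cents(ratio)
    step = round(size / 100.0)
    return step, size - 100.0 * step


if __name__ == "__main__":
    for r in harmonic_ratios(6):
        step, deviation = nearest_scale_step(r)
        name = INTERVAL_NAMES[step]
        print(f"{r.numerator}:{r.denominator}  {name:<15} {deviation:+.1f} cents")
```

In this sketch, the ratio between partials 5 and 6 (6:5) lands on the minor third and the ratio between partials 4 and 5 (5:4) on the major third, which is why a shift in which partials the vocal tract favors would change which of the two intervals is emphasized.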
