Reaction to “Speech perception as categorization” by Holt & Lotto, 2010

Reaction to Holt & Lotto (2010) by Richard Thripp
EXP 6506 Section 0002: Fall 2015 – UCF, Dr. Joseph Schmidt
October 21, 2015 [Week 9]

Holt and Lotto discuss several open research questions in speech perception (SP). A key area they clarify involves categorical perception (pp. 4–6): specifically, the unsupported belief that the “abrupt shift” in perception as sounds are manipulated along a continuum (e.g., a continuum between “ba” and “pa” sounds) is unique to speech perception. This “abrupt shift” has also been found for human faces, facial expressions, and many other stimuli. Nevertheless, many laypersons still believe categorical perception is unique to SP. While visual models are of some use, auditory perception has its own challenges and special considerations (p. 6). Unfortunately, it remains an under-researched field.

If research on auditory processing is too simplistic (p. 10), I can only imagine that research on the other three senses is far worse! Yet even many visual tasks do not translate to the real-world settings the researchers are supposedly trying to elucidate, and vision is the flagship sense for experimental research. When one takes into account how amazing the brain is at recognizing speech even when it is distorted and compressed in numerous ways (p. 8), how can we fathom implementing effective artificial intelligence for the task, or even understanding its underlying mechanics in their full complexity? Speech processing itself seems capable of operating independently of “attention, executive processing, or working memory” (p. 8), which is an amazing feat for such a complicated, voluntary task.

It would be interesting to see the authors relate SP to other sounds, such as singing, music, and engine noises. It has been shown that singing is a separate process from speaking that activates different brain regions; even people with expressive aphasia can sing. How, then, do we perceive singing as compared to speaking? Given that there are differences in the production of song versus speech, might there also be differences in perception? The authors draw a rather strong conclusion in this statement: “The fact that nonspeech signals shift the mapping from speech acoustics to perceptual space demonstrates that general auditory processes are involved in relating speech signals and their contexts” (p. 7). However, the contextual framework for understanding speech seems to be different from the framework for understanding other sounds; a language-specific pruning process even goes on in early childhood (p. 4), resulting in children losing the ability to distinguish certain phonemes that do not need to be distinguished in their language(s). Does this pruning process go on for music, animal noises, etc.? If a young child is exposed to many different cats with distinctive “meows,” might he or she retain a better ability to distinguish cat noises than a child who is exposed to only one cat? While Holt and Lotto have digested several difficult concepts, it would be enlightening to read a more thorough comparison and contrast of speech perception and other types of auditory perception.


Holt, L. L., & Lotto, A. J. (2010). Speech perception as categorization. Author manuscript, Department of Psychology, Carnegie Mellon University, Pittsburgh, PA.
