Reaction to “Speech perception as categorization” by Holt & Lotto, 2010

Reaction to Holt & Lotto (2010) by Richard Thripp
EXP 6506 Section 0002: Fall 2015 – UCF, Dr. Joseph Schmidt
October 21, 2015 [Week 9]

Holt and Lotto discuss several open research questions in speech perception (SP). A key area they clarify involves categorical perception (pp. 4–6), specifically the unsupported belief that the “abrupt shift” in perception that occurs as sounds are manipulated along a continuum (e.g., between “ba” and “pa”) is unique to speech perception. This “abrupt shift” has also been found for human faces, facial expressions, and many other stimuli. Nevertheless, many laypersons still believe categorical perception is unique to SP. While visual models are of some use, auditory perception has its own challenges and special considerations (p. 6). Unfortunately, it remains an under-researched field.

If research on auditory processing is too simplistic (p. 10), I can only imagine that research on the other three senses is far worse! Yet, even many visual tasks do not translate to the real-world settings the researchers are supposedly trying to elucidate, and vision is the flagship sense for experimental research. When one takes into account how amazing the brain is at recognizing speech even when it is distorted and compressed in numerous ways (p. 8), how can we fathom implementing effective artificial intelligence for the task, or even understanding its underlying mechanics in their full complexity? Speech processing itself seems capable of operating independently of “attention, executive processing, or working memory” (p. 8), which is an amazing feat for such a complicated, voluntary task.

It would be interesting to see the authors relate SP to other sounds, such as singing, music, and engine noises. It has been shown that singing is a separate process from speaking that activates different brain regions; even people with expressive aphasia can sing. How, then, do we perceive singing as compared to speaking? Are there differences? Given that there are differences in production of song versus speech, might there also be differences in perception? The authors draw a rather strong conclusion in this statement: “The fact that nonspeech signals shift the mapping from speech acoustics to perceptual space demonstrates that general auditory processes are involved in relating speech signals and their contexts” (p. 7). However, the contextual framework for understanding speech seems to be different from the framework for understanding other sounds; a language-specific pruning process even goes on in early childhood (p. 4), resulting in children losing the ability to distinguish certain phonemes that are not required to be distinguished in their language(s). Does this pruning process go on for music, animal noises, etc.? If a young child is exposed to many different cats with distinctive “meows,” might he or she retain improved abilities to distinguish cat noises compared with a child who is only exposed to one cat? While Holt and Lotto have digested several difficult concepts, it would be enlightening to read a more thorough comparison and contrast between speech perception and other types of auditory perception.


Holt, L. L., & Lotto, A. J. (2010). Speech perception as categorization. Author manuscript, Department of Psychology, Carnegie Mellon University, Pittsburgh, PA.

Reaction to “Basic objects in natural categories” by Rosch, Mervis, Gray, Johnson, & Boyes-Braem (1976)

Reaction to Rosch, Mervis, Gray, Johnson, & Boyes-Braem (1976) by Richard Thripp
EXP 6506 Section 0002: Fall 2015 – UCF, Dr. Joseph Schmidt
October 22, 2015 [Week 9]

This was a ground-breaking and lengthy study with multiple experiments designed to attack the issue of categories from several angles. Naturally, being 40 years old, the study can be criticized for many practices which were common at the time but frowned upon now, such as simply recruiting undergraduate students from the psychology department as participants. However, such issues are mitigated by the careful thought the authors have given to the messy issue that categorization is—such as in their discussions of the philosophical issues concerning superordinate, basic, and subordinate categories.

While the authors state that more than three categories are often in play, it would be nice to see them address this in one of their experiments and discuss it in more depth. Importantly, the biological taxonomies seemed significantly less useful and had to be excluded in Experiment 5 (p. 408). Their superordinate categories are basically “timeless,” such as fruit, trees, fishes, and birds, but were problematic; they were akin to the “basic” levels from categories such as musical instruments and clothing (pp. 392–393), though the latter may be more influenced by culture. However, even the biological and fruit taxonomies have a cultural component: obviously, different specimens of these categories appear on different continents, in different regions, and across different cultures. It is unfortunate the authors did not give due consideration to culture, though it is welcome that they performed a speculative experiment regarding sign language (pp. 426–428).

One of the issues with recruiting homogeneous participants such as undergraduate psychology students is cohort effects. Even something as basic as common age can skew results. For example, participants in their early 20s might identify clothing or vehicles differently than their parents would, because the participants grew up in the 1960s and their parents probably grew up in the 1930s or 1940s. While recruiting undergraduates is definitely convenient, particularly when participation is compulsory as in Experiment 6 (participants received “course credit,” not extra credit, for their participation; what were the costs of declining to participate in terms of reduced grades?), I cannot see how researchers, even in the 1970s, could be blind to the issues it raises. It is alarming that even in Experiments 8 and 9, where the authors were methodologically compelled to recruit children, they still managed to find a place for their undergraduate students (pp. 416, 419)!

I was pleased to see that two and a half pages of discussion were dedicated to the effects of existing knowledge and expertise (pp. 430–432), since I was constantly thinking about this when reading the experiments, and the ichthyologist example (p. 393) was not satisfying. If the authors had replaced airplanes with boats, would enthusiasts for each of these vehicles yield different results for these categories? Obviously, such differences should not be important in a heterogeneous, sufficiently large sample, but they are interesting enough to explore in their own right. Figure 4 (p. 431) is manifestly crude and pointless; obviously experts on a topic are going to have more intermediate categorization levels than laypersons. Consider that the present taxonomy for the classification of life has eight levels (domain, kingdom, phylum, class, order, family, genus, and species), and we can see that three levels might not represent the norm, but rather the bare minimum for a taxonomy. Intermediate levels are clearly evident even in the authors’ choices: in Table 1 (p. 388), even musically illiterate individuals can identify a class of categories between the superordinate category of “musical instruments” and the basic categories of individual instruments, consisting of categories such as strings, woodwinds, percussion, etc. The authors’ focus on the three-level superordinate–basic–subordinate paradigm eschews additional intermediate levels that are definitely present and probably important.

On a lighter, concluding note, it is rare that a typo can make me laugh out loud, but I would love to see a car equipped with a “food pedal” (p. 395).


Rosch, E., Mervis, C. B., Gray, W. D., Johnson, D. M., & Boyes-Braem, P. (1976). Basic objects in natural categories. Cognitive Psychology, 8, 382–439.

Reaction to “Vision: Are models of object recognition catching up with the brain?” by Poggio & Ullman (2013)

Reaction to Poggio & Ullman (2013) by Richard Thripp
EXP 6506 Section 0002: Fall 2015 – UCF, Dr. Joseph Schmidt
October 13, 2015 [Week 8]

Poggio and Ullman (2013) give us an intriguing overview of many reasons why computers have gotten much better at recognizing objects over the past decade, such as implementing learning via examples rather than design (p. 74), feature databases (pp. 74–75), and hierarchization and related algorithmic improvements (pp. 75–76).

I was surprised that no mention is given of the improvements in computer performance over the past decade. Jeff Preshing (2012), a Canadian computer programmer, created the following graph of the single-threaded floating-point performance of common computer processors from 1995 to 2011, showing an annual performance increase of approximately 21% from 2004 through 2011. While this increase is fairly modest compared to the 64% average annual increase he calculated from 1995 through 2003, multicore computing has also become very common since 2004. Since this graph only concerns single cores, the processing performance increase for the past decade is likely ten-fold, conservatively.

Graph of Single-Threaded Floating-Point Performance Improvements by Jeff Preshing
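The ten-fold estimate can be sanity-checked with a little compound-growth arithmetic. The sketch below is only illustrative: the 21% rate is the figure quoted from Preshing (2012), while the seven-year span and the four-core count with ideal scaling are my own assumptions, not numbers from either source.

```python
def compound_growth(annual_rate: float, years: int) -> float:
    """Return the total multiplier after `years` of growth at `annual_rate`."""
    return (1.0 + annual_rate) ** years

# 21% per year (quoted from Preshing, 2012) over the seven
# year-to-year steps from 2004 to 2011 (the span is an assumption).
single_thread_gain = compound_growth(0.21, 7)

# Hypothetical: a quad-core processor with perfectly parallel work,
# which is an upper bound that real workloads rarely reach.
ASSUMED_CORES = 4
multicore_gain = single_thread_gain * ASSUMED_CORES

print(f"single-threaded gain: {single_thread_gain:.1f}x")
print(f"with {ASSUMED_CORES} cores (ideal scaling): {multicore_gain:.1f}x")
```

Under these assumptions the single-threaded gain alone is a little under four-fold, so a ten-fold overall figure depends on multicore parallelism being used reasonably well.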

Poggio and Ullman (2013) do not discuss CPU (central processing unit) improvements, nor the vast increases in RAM (random access memory) and GPU (graphics processing unit) speed and capacity that have occurred over the past decade. Such improvements have allowed more data to be maintained in memory and operated upon more quickly. The authors state that “high-performance computer vision systems require millions of labeled examples to achieve good performance,” but do not discuss that the computing power required for the millions of examples is much cheaper and more readily available now. This seems a rather large oversight.

Poggio and Ullman (2013) mention the PASCAL Visual Object Classes challenge (p. 73), the requirements for which do not discuss the performance or computing power requirements of the object categorization algorithms (Everingham, Gool, Williams, Winn, & Zisserman, 2012). However, the contest ran from 2005 to 2012, and notably became more complex each year. This does not necessarily mean that computer programmers are simply getting better—they may be aided by better equipment and tools, as well as past experience and emerging cognitive and neuropsychological research.

An image categorization algorithm may achieve incrementally improved accuracy at immense computational cost, and this is perhaps sufficient for a proof-of-concept paper or contest entry. However, deploying the algorithm in the real world, as in Facebook’s 2015 rollout of automatic tagging of people in uploaded photos (Woollaston, 2015), requires not only algorithmic optimization and compromises but also cheaper computing resources, a development that has occurred but that Poggio and Ullman (2013) did not broach. Deploying image categorization algorithms on a massive scale, as Facebook has done, may provide critical insights into improving object recognition and closing the performance gap that Poggio and Ullman (2013) have identified (pp. 77–78).


Everingham, M., Gool, L., Williams, C., Winn, J., & Zisserman, A. (2012). The PASCAL visual object classes homepage. Retrieved October 13, 2015, from

Poggio, T., & Ullman, S. (2013). Vision: Are models of object recognition catching up with the brain? Annals of the New York Academy of Sciences, 1305, 72–82. doi:10.1111/nyas.12148

Preshing, J. (2012). A look back at single-threaded CPU performance. Preshing on Programming. Retrieved October 13, 2015, from

Woollaston, V. (2015). Facebook can tag you in photos AUTOMATICALLY: Social network starts rolling out DeepFace recognition feature. Daily Mail Online. Retrieved October 13, 2015, from

Reaction to “Psychological science can improve diagnostic decisions” by Swets, Dawes, & Monahan (2000)

Reaction to Swets, Dawes, & Monahan (2000) by Richard Thripp
EXP 6506 Section 0002: Fall 2015 – UCF, Dr. Joseph Schmidt
October 7, 2015 [Week 7]

Swets, Dawes, and Monahan (2000) have given us a strong exposition of probability modeling, including engaging and practical applications, with the intention to shift public policy for the better (p. 23). I love this topic, and the need for it is basically summed up in this quote: “The distribution of the degrees of evidence produced by the positive condition overlaps the distribution of the degrees produced by the negative condition” (p. 2), meaning that diagnostics is a tradeoff: since there is an overlapping range where scores can indicate both having and not having a condition (such as glaucoma or dangerously cracked airplane wings), whatever decision model is adopted will yield both false positives and false negatives.
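The tradeoff created by overlapping distributions can be made concrete with a small sketch. Everything numeric here is hypothetical: the two normal distributions, their means, and the thresholds are chosen purely for illustration and are not taken from Swets et al. (2000).

```python
from statistics import NormalDist

# Hypothetical evidence distributions in arbitrary units (not from the article).
negative = NormalDist(mu=0.0, sigma=1.0)  # condition absent
positive = NormalDist(mu=1.5, sigma=1.0)  # condition present

def error_rates(threshold: float) -> tuple[float, float]:
    """Return (false_positive_rate, false_negative_rate) when scores
    above `threshold` are called positive."""
    false_positive = 1.0 - negative.cdf(threshold)  # healthy cases above the cutoff
    false_negative = positive.cdf(threshold)        # true cases below the cutoff
    return false_positive, false_negative

# Moving the threshold trades one kind of error for the other.
for threshold in (0.0, 0.75, 1.5):
    fp, fn = error_rates(threshold)
    print(f"threshold={threshold:4.2f}  false positives={fp:.2f}  false negatives={fn:.2f}")
```

Because the two distributions overlap, no threshold drives both error rates to zero; a stricter cutoff lowers false positives only by raising false negatives.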

I cannot understand why the authors never compare decision making to type I and type II errors from statistical hypothesis testing. For the statistically inclined, this seems an analogy with immense expository power. While the graphs and explanations in the article are helpful and clear, they do become repetitive—9 of 12 figures are receiver operating characteristic (ROC) curves and 2 more are ROC scatterplots. Figures relating to statistical prediction rules (SPRs) would have been welcome, such as a graph showing how reliability increases with number of cases (p. 8).

The possibilities with probability are endless, and while it may initially appear that probabilistic reasoning is valuable only to highly educated professionals such as actuaries and medical diagnosticians, it is actually quite relevant even for personal financial literacy. For instance, I was recently tempted by a postcard advertisement to enter a $5,000 sweepstakes that requires calling in and listening to a life insurance pitch. However, after noticing the odds were listed as 1:250,000, I realized that entrants would earn, on average, 2¢. If the phone call takes five minutes of undivided attention, that is 24¢ per hour, a shockingly low return. Would not many of our decisions and practices be changed by a habit of seeking solid probabilistic statistics? For example, we might drive far less after realizing that our risk of bodily injury or death is so high, and we would understand a possible reason why auto insurance is expensive.
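The sweepstakes arithmetic above can be written out as a short expected-value calculation. The constant names are mine, and the listed odds are read as a 1-in-250,000 chance of winning, which is an assumption about how the postcard meant them.

```python
PRIZE_DOLLARS = 5_000
ODDS_AGAINST = 250_000   # listed odds of 1:250,000, read as 1 in 250,000
CALL_MINUTES = 5         # assumed length of the required phone call

# Expected winnings per entry, in dollars.
expected_value = PRIZE_DOLLARS / ODDS_AGAINST

# Scale the five-minute call up to an hourly rate.
hourly_return = expected_value * (60 / CALL_MINUTES)

print(f"expected value per entry: {expected_value * 100:.0f} cents")
print(f"equivalent hourly return: {hourly_return * 100:.0f} cents")
```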

One grievance with Swets et al. (2000) is that they focused heavily on binary decisions. Diagnosing cancer (pp. 11–15) is a true/false decision, as are the decisions to utilize a particular test. While there may be a choice between tests of varying accuracy and expense, you cannot do a little bit of a biopsy because you are a little bit concerned about breast cancer—you either choose to perform or not perform the test. This might be a criticism of the field in general, since SPRs and ROCs are obviously geared toward binary tests—and a whole bunch of binary tests can approximate a continuous scale. Nonetheless, I would have liked more examples regarding non-binary decisions—for example, deciding what interest rate and credit line to extend to a borrowing customer, or rating the structural integrity of a bridge. We do have the weather forecasting example, but it was only briefly discussed (p. 18).

Screenings with a low frequency of “hits” are an interesting topic (pp. 16, 19). A detector for plastic explosives produces 5 million false positives per true positive (p. 19); 85% of low-risk HIV patients might receive a false positive diagnosis (p. 16). Statistics like these prompt us to question whether we should even bother with tests in low-risk cases. However, airport security is an area where comprehensive screening is required: we cannot simply select every nth passenger, because the costs of missing a terrorist are so high and terrorists are so rare. On the other hand, the U.S. Postal Service does not need to open every package sent via media mail to ensure the contents are eligible, since there is no loss of life at stake. Of course, when both false positives and false negatives are costly, such as with detecting cracks in airplane wings (pp. 16–18), detecting HIV, or detecting terrorists, SPRs and ROCs shine. We can then choose how many true positives we want and exactly how far beyond the point of diminishing returns we are willing to go.
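The reason rare conditions generate so many false alarms follows from Bayes’ theorem, which a minimal sketch can show. The 99% sensitivity and specificity and the three prevalence values are hypothetical and are not the figures reported by Swets et al. (2000).

```python
def positive_predictive_value(prevalence: float, sensitivity: float,
                              specificity: float) -> float:
    """P(condition present | positive test), by Bayes' theorem."""
    true_positives = prevalence * sensitivity
    false_positives = (1.0 - prevalence) * (1.0 - specificity)
    return true_positives / (true_positives + false_positives)

# Hypothetical test: 99% sensitive and 99% specific (illustrative only).
for prevalence in (0.5, 0.01, 0.0001):
    ppv = positive_predictive_value(prevalence, 0.99, 0.99)
    print(f"prevalence={prevalence:<7} share of positives that are true: {ppv:.1%}")
```

With the same hypothetical test, the share of positive results that are correct collapses as the condition becomes rarer, which is why low-frequency screenings are dominated by false positives.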


Swets, J. A., Dawes, R. M., & Monahan, J. (2000). Psychological science can improve diagnostic decisions. Psychological Science in the Public Interest, 1(1), 1–26.

Reaction to “Signal detection theory and the psychophysics of pain” by Lloyd & Appel (1976)

Reaction to Lloyd & Appel (1976) by Richard Thripp
EXP 6506 Section 0002: Fall 2015 – UCF, Dr. Joseph Schmidt
October 7, 2015 [Week 7]

Lloyd and Appel (1976) review many articles involving signal detection theory (SDT) and pain research, reaching the conclusion that greater methodological consistency is needed (p. 79). This seems a valid point: 3 of the 9 studies from Table 2 (p. 89) did not even report (and may not have used) a zero pain stimulus, so how can they make inferences about pain detection without even knowing what their subjects report at the baseline? The authors rightly criticize the existing literature for other issues such as too few repetitions and lack of consistent standards for SDT measures (p. 88). Further, d′ itself (the sensitivity measure) tends to be higher with a binary measure than with a rating system with more categories (p. 90), to the point that using both types of measures simultaneously has been advocated. Few if any of the available articles considered this.
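To see why a zero pain stimulus matters, it may help to sketch how d′ is usually computed under the standard equal-variance Gaussian model: it needs a false-alarm rate, which can only come from trials with no painful stimulus. The function names and the hit and false-alarm rates below are hypothetical and are not drawn from any study Lloyd and Appel reviewed.

```python
from statistics import NormalDist

# Inverse of the standard normal CDF (the z-transform).
z = NormalDist().inv_cdf

def d_prime(hit_rate: float, false_alarm_rate: float) -> float:
    """Sensitivity: d' = z(hit rate) - z(false-alarm rate)."""
    return z(hit_rate) - z(false_alarm_rate)

def criterion(hit_rate: float, false_alarm_rate: float) -> float:
    """Response bias c; positive values mean a reluctance to report pain."""
    return -0.5 * (z(hit_rate) + z(false_alarm_rate))

# Two hypothetical conditions. The false-alarm rate is the proportion of
# "pain" reports on zero-stimulus trials, so it cannot be estimated
# when a study omits those trials.
for label, hits, false_alarms in (("baseline", 0.80, 0.20),
                                  ("treatment", 0.60, 0.08)):
    print(f"{label:9s} d'={d_prime(hits, false_alarms):.2f}  "
          f"c={criterion(hits, false_alarms):.2f}")
```

In this made-up example the two conditions have nearly the same d′ but a different criterion, the pattern one would expect if a treatment changed willingness to report pain rather than sensitivity to it.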

Unfortunately, the authors fail to draw sufficient connections between the articles or critically analyze them as a whole, in my opinion. While several connections and comparisons are made, there is no real discussion section to synthesize the research, and the “summary and conclusions” section is less than a single page. While this may be partly forgiven due to a scarcity of SDT research as of 1976, the authors discussed at least 13 articles, which may be enough to attempt such a synthesis. We do have a “Summary of Criticisms” table (p. 89) that helps us identify possible flaws in nine of the articles. Unfortunately, this table is half-baked—all cells are binary “Yes” or “No” data, without any indications of magnitude in columns such as “change in sensitivity” and “change in response bias.” Though yes/no might be a welcome relief from the overwhelming quantity of numbers in recent cognitive psychology literature, the authors could have, at minimum, used superscripts and footnotes to indicate particularly large changes or egregious methodological problems. Further, though an accurate criticism of much of the research, what constitutes “sufficient stimulus presentations” is not specified in the table. Finally, the authors included a table only for the nine modification studies they reviewed (pp. 84–91), neglecting to include a table for the four normative and comparative studies (pp. 91–92).

Lloyd and Appel have produced a strong primer on signal detection theory, complete with copious graphs and figures (pp. 80–84). It would be nice to see some of this explanatory power applied to the literature review, which is devoid of graphs and figures. For instance, they could have provided bar graphs comparing sensitivity and response bias for the different studies by Clark and associates involving diazepam, suggestion, and acupuncture (pp. 85–87). While the authors have explained that comparisons are difficult due to different standards between the studies (p. 88), it seems that studies with the same principal investigator (Clark) should be easier to compare. Nevertheless, the authors have voiced solid criticisms, and I found the section discussing response biases and placebo effects to be particularly cogent (p. 84).


Lloyd, M. A., & Appel, J. B. (1976). Signal detection theory and the psychophysics of pain: An introduction and review. Psychosomatic Medicine, 38(2), 79–94.
