Reaction to “Psychological science can improve diagnostic decisions” by Swets, Dawes, & Monahan (2000)

Reaction to Swets, Dawes, & Monahan (2000) by Richard Thripp
EXP 6506 Section 0002: Fall 2015 – UCF, Dr. Joseph Schmidt
October 7, 2015 [Week 7]

Swets, Dawes, and Monahan (2000) have given us a strong exposition of probability modeling, including engaging and practical applications, with the intention of shifting public policy for the better (p. 23). I love this topic, and the need for it is essentially summed up in this quote: “The distribution of the degrees of evidence produced by the positive condition overlaps the distribution of the degrees produced by the negative condition” (p. 2). In other words, diagnosis involves a tradeoff—because there is an overlapping range where scores can indicate both having and not having a condition (such as glaucoma or dangerously cracked airplane wings), whatever decision model is adopted will yield both false positives and false negatives.
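
To make that overlap concrete, here is a minimal sketch of my own (not from the article) that models the two “degree of evidence” distributions as overlapping normals and shows how moving the decision threshold trades misses against false alarms; the means, standard deviations, and thresholds are arbitrary values chosen purely for illustration.

```python
# Illustration only: two overlapping "degree of evidence" distributions,
# one for the positive condition (e.g., disease present) and one for the
# negative condition. Any threshold produces both kinds of errors.
from scipy.stats import norm

positive = norm(loc=2.0, scale=1.0)  # assumed evidence distribution, condition present
negative = norm(loc=0.0, scale=1.0)  # assumed evidence distribution, condition absent

for threshold in (0.5, 1.0, 1.5, 2.0):  # "call it positive" above this score
    hit_rate = positive.sf(threshold)          # true positives (sensitivity)
    false_alarm_rate = negative.sf(threshold)  # false positives (1 - specificity)
    miss_rate = positive.cdf(threshold)        # false negatives
    print(f"threshold={threshold:.1f}  hits={hit_rate:.2f}  "
          f"false alarms={false_alarm_rate:.2f}  misses={miss_rate:.2f}")
```

Raising the threshold cuts false alarms but raises misses; plotting hit rate against false-alarm rate across all thresholds traces exactly the kind of ROC curve the authors rely on.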

I cannot understand why the authors never compare diagnostic decision making to Type I and Type II errors from statistical hypothesis testing. For the statistically inclined, this seems an analogy with immense expository power. While the graphs and explanations in the article are helpful and clear, they do become repetitive—9 of the 12 figures are receiver operating characteristic (ROC) curves and 2 more are ROC scatterplots. Figures relating to statistical prediction rules (SPRs) would have been welcome, such as a graph showing how reliability increases with the number of cases (p. 8).

The possibilities with probability are endless, and while it may initially appear that they are valuable only to highly educated professionals such as actuaries and medical diagnosticians, they are actually quite relevant even for personal financial literacy. For instance, I was recently tempted by a postcard advertisement to enter a $5,000 sweepstakes that requires calling in and listening to a life insurance pitch. However, after noticing the odds were listed as 1:250,000, I realized that entrants would earn, on average, 2¢. If the phone call takes five minutes of undivided attention, that is 24¢ per hour—a shockingly low return. Would not many of our decisions and practices change if we made a habit of seeking out solid probabilistic statistics? For example, we might drive far less after realizing how high our risk of bodily injury or death is—and we would understand a possible reason why auto insurance is expensive.
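
As a worked check of that arithmetic, using only the figures stated above:

```python
# Expected value of the sweepstakes described above.
prize = 5000.00          # advertised prize in dollars
p_win = 1 / 250_000      # stated odds of winning
minutes_required = 5     # assumed length of the phone call

expected_value = prize * p_win                          # $0.02 per entry
hourly_rate = expected_value * (60 / minutes_required)  # $0.24 per hour
print(f"Expected winnings per call: ${expected_value:.2f}")
print(f"Implied hourly rate: ${hourly_rate:.2f}")
```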

One grievance with Swets et al. (2000) is that they focused heavily on binary decisions. Diagnosing cancer (pp. 11–15) is a true/false decision, as is the decision to use a particular test. While there may be a choice among tests of varying accuracy and expense, you cannot perform a little bit of a biopsy because you are a little bit concerned about breast cancer—you either perform the test or you do not. This might be a criticism of the field in general, since SPRs and ROCs are obviously geared toward binary tests—and a battery of binary tests can approximate a continuous scale. Nonetheless, I would have liked more examples regarding non-binary decisions—for example, deciding what interest rate and credit line to extend to a borrowing customer, or rating the structural integrity of a bridge. We do have the weather forecasting example, but it is only briefly discussed (p. 18).

Screenings with a low frequency of “hits” are an interesting topic (pp. 16, 19). A detector for plastic explosives produces 5 million false positives per true positive (p. 19); 85% of positive HIV results among low-risk patients may be false positives (p. 16). Statistics like these prompt us to question whether we should even bother testing in low-risk cases. However, airport security is an area where comprehensive screening is required—we cannot simply select every nth passenger, because the cost of missing a terrorist is so high even though terrorists are so rare. On the other hand, the U.S. Postal Service does not need to open every package sent via media mail to ensure the contents are eligible—there is no loss of life at stake. Of course, when both false positives and false negatives are costly, such as with detecting cracks in airplane wings (pp. 16–18), detecting HIV, or detecting terrorists, SPRs and ROCs shine. We can then choose how many true positives we want and exactly how far beyond the point of diminishing returns we are willing to go.
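
The base-rate logic behind such numbers can be made explicit with Bayes’ theorem. The sensitivity, specificity, and prevalence below are placeholders I chose for illustration, not figures from the article; the point is only that a rare condition makes most positives false even for a very accurate test.

```python
# Illustration only: why an accurate test still yields mostly false positives
# when the condition is rare (hypothetical numbers, not from Swets et al.).
sensitivity = 0.99    # P(test positive | condition present)
specificity = 0.99    # P(test negative | condition absent)
prevalence = 0.0001   # P(condition present) in a low-risk population

p_positive = sensitivity * prevalence + (1 - specificity) * (1 - prevalence)
p_condition_given_positive = sensitivity * prevalence / p_positive

print(f"P(condition | positive test) = {p_condition_given_positive:.3f}")
print(f"Share of positives that are false: {1 - p_condition_given_positive:.3f}")
```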

Reference

Swets, J. A., Dawes, R. M., & Monahan, J. (2000). Psychological science can improve diagnostic decisions. Psychological Science in the Public Interest, 1(1), 1–26.

Reaction to “Signal detection theory and the psychophysics of pain” by Lloyd & Appel (1976)

Reaction to Lloyd & Appel (1976) by Richard Thripp
EXP 6506 Section 0002: Fall 2015 – UCF, Dr. Joseph Schmidt
October 7, 2015 [Week 7]

Lloyd and Appel (1976) review many articles involving signal detection theory (SDT) and pain research, reaching the conclusion that greater methodological consistency is needed (p. 79). This seems a valid point—3 of the 9 studies in Table 2 (p. 89) did not even report (and may not have used) a zero-pain stimulus, so how can they make inferences about pain detection without even knowing what their subjects report at baseline? The authors rightly criticize the existing literature for other issues, such as too few repetitions and a lack of consistent standards for SDT measures (p. 88). Further, d′ itself (the sensitivity measure) tends to be higher with a binary measure than with a rating system with more categories (p. 90)—to the point that using both types of measures simultaneously has been advocated. Few if any of the available articles considered this.
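
For readers unfamiliar with d′, it is conventionally computed for a yes/no task as the difference of the z-transformed hit and false-alarm rates. The rates in this minimal sketch are my own hypothetical values, not data from the reviewed studies.

```python
# Standard yes/no signal detection measures from hit and false-alarm rates.
# The rates are hypothetical, chosen only to illustrate the computation.
from scipy.stats import norm

hit_rate = 0.80          # P("pain" response | painful stimulus)
false_alarm_rate = 0.20  # P("pain" response | zero-pain stimulus)

d_prime = norm.ppf(hit_rate) - norm.ppf(false_alarm_rate)              # sensitivity
criterion = -0.5 * (norm.ppf(hit_rate) + norm.ppf(false_alarm_rate))   # response bias (c)
print(f"d' = {d_prime:.2f}, criterion c = {criterion:.2f}")
```

Without a zero-pain stimulus there is no false-alarm rate, so neither measure can be computed, which underscores the authors’ criticism.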

Unfortunately, in my opinion, the authors fail to draw sufficient connections between the articles or to critically analyze them as a whole. While several connections and comparisons are made, there is no real discussion section to synthesize the research, and the “summary and conclusions” section is less than a single page. While this may be partly forgiven due to the scarcity of SDT research as of 1976, the authors discussed at least 13 articles, which may be enough to attempt such a synthesis. We do have a “Summary of Criticisms” table (p. 89) that helps us identify possible flaws in nine of the articles. Unfortunately, this table is half-baked—all cells are binary “Yes” or “No” entries, without any indication of magnitude in columns such as “change in sensitivity” and “change in response bias.” Though yes/no might be a welcome relief from the overwhelming quantity of numbers in recent cognitive psychology literature, the authors could have, at minimum, used superscripts and footnotes to indicate particularly large changes or egregious methodological problems. Further, though the criticism applies to much of the research, the table does not specify what constitutes “sufficient stimulus presentations.” Finally, the authors included a table only for the nine modification studies they reviewed (pp. 84–91), neglecting to include one for the four normative and comparative studies (pp. 91–92).

Lloyd and Appel have produced a strong primer on signal detection theory, complete with copious graphs and figures (pp. 80–84). It would be nice to see some of this explanatory power applied to the literature review, which is devoid of graphs and figures. For instance, they could have provided bar graphs comparing sensitivity and response bias across the studies by Clark and associates involving diazepam, suggestion, and acupuncture (pp. 85–87). While the authors explain that comparisons are difficult due to differing standards between the studies (p. 88), studies with the same principal investigator (Clark) should be easier to compare. Nevertheless, the authors have voiced solid criticisms, and I found the section discussing response biases and placebo effects to be particularly cogent (p. 84).

Reference

Lloyd, M. A., & Appel, J. B. (1976). Signal detection theory and the psychophysics of pain: An introduction and review. Psychosomatic Medicine, 38(2), 79–94.

Reaction to “The what, where, and why of priority maps and their interactions with visual working memory” by Zelinsky & Bisley (2015)

Reaction to Zelinsky & Bisley (2015) by Richard Thripp
EXP 6506 Section 0002: Fall 2015 – UCF, Dr. Joseph Schmidt
September 30, 2015 [Week 6]

Zelinsky and Bisley (2015) have presented a literature review regarding visual working memory and priority maps, reaching the conclusion that a vital relationship exists between these concepts, even though researchers often ignore the connection (p. 159). Further, the authors believe priority maps play an integral role in goal-directed behavior, and they propose the common source hypothesis: that visual working memory is the foundation for goal prioritization, which is “propagated to all the effector systems” through priority maps tailored to each system, all working toward a common goal such as making a cup of tea (pp. 159–160). Priority maps may prevent “interrupts” (distractions) from hijacking priority and derailing the goal.

Zelinsky and Bisley spend a great deal of time talking about the oculomotor system (pp. 156–158). They argue that it and visual working memory provide us with the model for priority maps and that this model generalizes to other visuomotor systems. They discuss evidence of a transformation from retinotopic to motor reference frames as priority maps move from the parietal cortex to the frontal lobe, and predict that a similar transformation will be found for responses in the premotor cortices (pp. 157–158).

The authors seem to have conflated general working memory (“WM”) with visual working memory (“VWM”)—they refer to WM only in regard to the tea-brewing task (p. 160) and argue for the centrality and singular importance of visual working memory throughout their paper. They reach the perhaps regrettable conclusion that all priority maps must have a topographical representation (p. 156). They give plausible examples involving arm movements (p. 158), saccades (p. 159), and choosing to run right or left (p. 161), while conveniently leaving out discussion of hearing, smell, taste, and touch. Can we not have an auditory or olfactory priority map? Mechanics might listen for particular sounds to diagnose their machines; humans in general may have priority maps for particular smells and tastes to warn them of spoiled or poisonous food. How are these maps topographical? Just because there is a glut of research on vision and visual working memory does not mean that we should simply extrapolate such findings to other domains without supporting evidence. Perhaps “priority map” is not the best term, since it is admittedly “by definition, organized into a map of some space” (p. 161). Zelinsky and Bisley seem to want to generalize priority maps to all domains of human attention, and yet the analogy is ill-suited to many of them.

Notably, Baddeley and Hitch’s working memory model and its derivatives focus on the senses that are most salient and important to survival: sight (the visuo-spatial sketchpad) and hearing (the phonological loop). However, congenitally blind subjects have been found to have significantly better tactile working memory than even semi-blind subjects who were equally fluent in Braille (Cohen, Scherzer, Viau, Voss, & Lepore, 2011). Zelinsky and Bisley (2015) do not once discuss blindness, nor the possibility that visual working memory’s dominance is experiential in origin. In their defense, congenitally blind subjects have been found to have spatial recognition for Braille reading and to use the same pathways for Braille reading that are typically used by the visual system (Cohen et al., 2011). Nevertheless, the role of experience should always be considered—the priority maps of sighted, congenitally blind, acquired-blind, and semi-sighted individuals may differ in distinct ways. While it is easy to gloss over blind individuals due to their rarity, there may be much to learn from studying blindness. The authors may have benefited from identifying blindness as an area requiring further research, rather than extrapolating over it.

Cohen et al. (2011) present an intriguing possibility: working memory might have a higher capacity when spread over multiple modalities. Could this allow for several simultaneously operating priority maps? Consider a hunter-gatherer exploring a forest—he or she may have multiple priority maps for sight, hearing, smell, and touch (e.g., wind direction and skin temperature), each contributing to finding food and avoiding danger. It is then apparent that a more fundamental analysis, rather than the technical analysis that Zelinsky and Bisley have provided, may be in order. Not only should experiential sensory history and alternate models be considered, but also the possible evolutionary origins of priority maps.

References

Cohen, H., Scherzer, P., Viau, R., Voss, P., & Lepore, F. (2011). Working memory for braille is shaped by experience. Communicative & Integrative Biology, 4(2), 227–229. doi:10.4161/cib.4.2.14546

Zelinsky, G. J., & Bisley, J. W. (2015). The what, where, and why of priority maps and their interactions with visual working memory. Annals of the New York Academy of Sciences, 1339, 154–164. doi:10.1111/nyas.12606

Reaction to “Different states in visual working memory” by Olivers, Peters, Houtkamp, & Roelfsema (2011)

Reaction to Olivers, Peters, Houtkamp, & Roelfsema (2011) by Richard Thripp
EXP 6506 Section 0002: Fall 2015 – UCF, Dr. Joseph Schmidt
September 30, 2015 [Week 6]

Olivers, Peters, Houtkamp, and Roelfsema (2011) have presented a review of literature regarding interactions between visual working memory and attentional deployment with respect to search tasks. A major focus of their review is on orthogonal coding, where different informational sources are represented by different coding patterns within the same neuronal populations (p. 327). Their literature review concludes, purportedly by convergent evidence, that “only one memory representation can serve as a search template, and this representation blocks attentional guidance by other memory representations” (p. 330).

For me, the idea that only one “search template” can be loaded into visual working memory for active processing, while other templates must be held in abeyance, brings two computing analogies to mind. First, the Microsoft Windows “Clipboard”—a space where text, images, files, or other data may be held, but only one item or set of items can be held at a time, whether a single character of text or a massive folder with hundreds of files and subfolders. While the virtually unlimited capacity of the Windows clipboard is not analogous, the idea of having to swap things in and out is, and it becomes particularly salient when you have two types of content that you want to paste into a file in multiple different places, or when you must remember not to accidentally overwrite your clipboard contents. Second, the entire concept reminds me of paging and swap files. Modern computer operating systems exchange information between random access memory or RAM (lower capacity but very fast access) and hard disk drives or solid-state drives (higher capacity but much slower access). In this analogy, the active search template is loaded into RAM for efficient processing, while the accessory item(s) are maintained on the HDD or SSD. Swapping search templates is not trivial—RAM is often 1,000 times faster than a conventional hard disk drive. While this latency difference is much greater than the sub-5% latency differences shown in typical experiments (p. 329), it represents a conceptually similar process.
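
The analogy can be made concrete as a data structure: a single “active” slot plus a set of accessory representations that must be swapped in before they can guide search. This is purely my own illustrative sketch of the one-template idea, with hypothetical names and a token swap cost, not a model proposed in the article.

```python
# Toy illustration of the one-active-template idea: only the item in the
# "active" slot guides search; accessory items must first be swapped in,
# and the swap itself carries a cost (tracked here as a simple counter).
class TemplateStore:
    def __init__(self, templates):
        self.active = templates[0]           # the single search template in use
        self.accessory = set(templates[1:])  # held in memory but not guiding search
        self.swaps = 0

    def guide_search(self, target):
        # Only the active template guides attention; accessory items are inert.
        return target == self.active

    def swap_in(self, template):
        # Making another memory representation the search template has a cost.
        self.accessory.add(self.active)
        self.accessory.discard(template)
        self.active = template
        self.swaps += 1

store = TemplateStore(["red mug", "blue pen"])
print(store.guide_search("blue pen"))  # False: not the active template
store.swap_in("blue pen")
print(store.guide_search("blue pen"))  # True, at the cost of one swap
print("swaps:", store.swaps)
```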

If the search target is used repeatedly, the search process is offloaded to “less demanding memory representations” and becomes automated (pp. 328–329), thus freeing up explicit, effortful working memory for a new search template. This is seen in the differences between color search tasks for 1 of 3 colors as compared to 2 of 3 colors—the former is more efficient and neither distractor color captures attention, whereas the lone distractor color does capture attention when looking for 2 of 3 colors (p. 330). The authors ask whether this generalizes to other types of memory, and lament that there is a lack of research in this area (p. 332). There is potential conceptual overlap with other types of memory—for example, one’s name might be an example of an automatized search template with respect to auditory cognition and might have explanatory power for the cocktail party phenomenon (Wood & Cowan, 1995). Text search may be another area of interest—for example, say you are searching a printed bank statement for two transactions of different amounts. Should you try to load both search templates at once, or should you make two passes over the statement, looking for only one amount on each pass? How would completion time and error rates vary? While text search involves vision, it is also distinct from colors or objects and involves different considerations such as language, words versus numbers, context, etc. Moreover, implications drawn from visual working memory research might apply in many other areas. At the very least, they can aid in developing research questions.

References

Olivers, C. N. L., Peters, J., Houtkamp, R., & Roelfsema, P. R. (2011). Different states in visual working memory: When it guides attention and when it does not. Trends in Cognitive Sciences, 15(7), 327–334. doi:10.1016/j.tics.2011.05.004

Wood, N., & Cowan, N. (1995). The cocktail party phenomenon revisited: How frequent are attention shifts to one’s name in an irrelevant auditory channel? Journal of Experimental Psychology: Learning, Memory, and Cognition, 21(1), 255–260.

Reaction to “Visual attention within and around the field of focal attention: A zoom lens model” by Eriksen & St. James (1986)

Reaction to Eriksen & St. James (1986) by Richard Thripp
EXP 6506 Section 0002: Fall 2015 – UCF, Dr. Joseph Schmidt
September 24, 2015 [Week 5]

Eriksen and St. James (1986) believe their experiments support the zoom lens model, which differs from the spotlight model in that it proposes we can vary our attentional distribution on a continuum from a wide field of view down to a fraction of a degree (pp. 226–227). The spotlight model typically posits a discrete, even binary, choice between a broadly and a narrowly focused attentional field, with restricted or nonexistent options in between these two poles (p. 226).

As a photographer, I could not help but think of analogies to camera lenses and digital processing chips while reading this article, particularly since that is the crux of the authors’ own analogy. In the discussion of experiment 1, the authors indicate that a 50-ms stimulus onset asynchrony (SOA) does not allow time for the attentional field to “zoom in,” so an incompatible noise letter three positions away from the cued area delays the subject’s response. Given 100 ms, however, the delayed reaction time is not observed, which may indicate the noise letter has been excluded or “cropped out,” so to speak (p. 233). This reminds me of the autofocus delay on cameras, which often measures in the hundreds of milliseconds and can prevent the photographer from capturing desired moments.

Regarding the displays in experiment 1 where 3 of 8 letters were cued, reaction times paradoxically increased in the 200-ms SOA condition as compared to the 100-ms or even 50-ms conditions. As an explanation, the authors raise the possibility that with 3 of 8 letters cued, attending to the whole display may be nearly as efficient as attending to the cued area; thus, subjects may have elected to attend to the whole field on some displays, increasing reaction time (p. 234). Unfortunately, this is a post-hoc explanation, and the experiments did not collect data that could support it. While the authors believe experiment 2 verified this explanation, it also had a very small sample size (n = 6), fewer trials, and an incompatible noise letter that was comparatively ineffective (p. 239). Fortunately, the authors seem to have produced a stronger argument that the cued letters are searched simultaneously rather than serially—specifically, that reaction times across 1, 2, and 3 cued positions increased far less than they should have if the positions were searched serially (p. 234).
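
The logic of that argument can be sketched with hypothetical numbers (mine, not the authors’): a strictly serial search predicts reaction time growing by a fixed increment for each additional cued position, whereas an idealized parallel model predicts a nearly flat slope.

```python
# Hypothetical reaction-time predictions for serial vs. parallel search over
# 1-3 cued positions. The base time and per-item cost are invented for
# illustration; the point is the shape of the predictions, not the values.
base_rt_ms = 450       # assumed reaction time with a single cued position
serial_cost_ms = 40    # assumed extra time per additional position, if serial

for n_cued in (1, 2, 3):
    serial_prediction = base_rt_ms + serial_cost_ms * (n_cued - 1)
    parallel_prediction = base_rt_ms  # idealized parallel model: no per-item cost
    print(f"{n_cued} cued: serial ~{serial_prediction} ms, "
          f"parallel ~{parallel_prediction} ms")
```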

In both experiments, multiple cued letters were always adjacent to each other in the circle. It would be interesting to see 2 cued locations that are not adjacent—would the subject revert to processing the whole display, or somehow divide attention between the non-adjacent cued locations? How would this fit into the zoom lens model? Also, the authors assume that with no precues, all display elements are processed in parallel (pp. 232–233). In experiment 2, they include displays where all 8 letters are precued (p. 237). It would be nice to see whether precueing all the positions versus none of them has any implications. When all the letters are underlined, does the underlining have any effect on reaction time? Neither experiment included a condition with no cued or precued letters.

The authors’ ANOVA results are impressively statistically significant, and they purposely used methodology similar to past research in the hopes of allowing compatible comparisons (p. 229). However, I have lingering doubts about unspecified variables. We are given very little detail about the subjects—only that they are right-handed University of Illinois students who self-reported having normal or corrected vision (pp. 229, 237). Who is to say these self-reports were accurate? Why did the authors not bother with a visual acuity test? What were the ages of the participants? Did they have any other visual or attentional problems? The sample sizes of 8 and 6 are fairly small, meaning that even a few non-equivalent participants could have thrown off the results. This research was published in 1986 and used a tachistoscope with individually constructed slides with affixed letters, rather than computer displays (pp. 229–230). The care and uniformity with which these slides were constructed is not specified. We are told subjects received reaction time feedback after each trial (p. 230), but not how this feedback was structured or conveyed. Whether the feedback was spoken by the researchers or conveyed in text or graphs may have implications. Further, encouraging participants to keep their error rate below 10% (p. 230) could have been a factor in the unusual reaction time pattern shown in figure 4 (p. 232)—perhaps this pattern is not indicative of parallel processing, but rather of error avoidance. Despite the statistical strength of their results, the authors may be overreaching, given the assuredness of their conclusions.

Reference

Eriksen, C., & St. James, J. (1986). Visual attention within and around the field of focal attention: A zoom lens model. Perception & Psychophysics, 40(4), 225–240. doi:10.3758/BF03211502