Tables and Rates: Summarizing Task Performance

Tabulating outcomes

When we perform a block of trials, we can count up the occurrences of each of our four outcomes:

                     Respond 'present'      Respond 'absent'
Signal present       Hit                    Miss
Signal absent        False alarm            Correct rejection
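
As a concrete sketch of this bookkeeping in Python (the trial data and outcome labels below are invented for illustration, not taken from the demo), we might tally the outcomes like so:

```python
from collections import Counter

# Each trial is recorded as a (signal, response) pair; these data are made up.
trials = [
    ("present", "present"),   # hit
    ("present", "absent"),    # miss
    ("absent", "present"),    # false alarm
    ("absent", "absent"),     # correct rejection
    ("present", "present"),   # hit
]

# Map each (signal, response) pair to its outcome label.
OUTCOMES = {
    ("present", "present"): "hit",
    ("present", "absent"): "miss",
    ("absent", "present"): "false alarm",
    ("absent", "absent"): "correct rejection",
}

counts = Counter(OUTCOMES[trial] for trial in trials)
print(counts)  # Counter({'hit': 2, 'miss': 1, 'false alarm': 1, 'correct rejection': 1})
```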

By counting our outcomes across all of the trials in the block, we now have our first aggregate measures of performance. However, the counts depend not only on our level of performance, but also on the total number of trials, which often isn’t that interesting to us. We can account for that by calculating rates of performance.

Accuracy

Perhaps the most familiar rate is accuracy, which summarizes our overall performance with a single number by aggregating across all trials. Accuracy tells us, on average, how often we were correct versus how often we made an error:

$$\text{Accuracy} = \frac{\text{Hits} + \text{Correct Rejections}}{\text{Hits} + \text{Misses} + \text{False Alarms} + \text{Correct Rejections}}$$

Hits and correct rejections are counted together since they are both correct responses, while false alarms and misses are counted together since they are both errors.

This is a live equation. You can edit the number of Hits, Misses, False Alarms, or Correct Rejections and the Accuracy will update immediately!
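
For a static analogue of that computation, here is a minimal Python sketch; the function name and example counts are our own, not from the interactive page:

```python
def accuracy(hits, misses, false_alarms, correct_rejections):
    """Proportion of all trials on which the response was correct."""
    correct = hits + correct_rejections
    total = hits + misses + false_alarms + correct_rejections
    return correct / total

print(accuracy(hits=40, misses=10, false_alarms=20, correct_rejections=30))  # 0.7
```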

Hit rate (HR)

It turns out (as we will discuss in more detail below) that accuracy often isn’t a great measure to use. We can learn more by narrowing our focus to particular subsets of trials.

If we focus only on trials when the signal was actually present, and calculate our average performance, we get a hit rate (HR), also called a true positive rate:

$$\text{HR} = \frac{\text{Hits}}{\text{Hits} + \text{Misses}}$$

The hit rate tells us the proportion of trials when the signal was present that we correctly responded ‘present’.

Another live equation! Edit Hits or Misses and the Hit Rate updates immediately.

False alarm rate (FAR)

Likewise, if we focus only on trials when the signal was actually absent, and calculate our average performance, we get a false alarm rate (FAR), also called a false positive rate:

$$\text{FAR} = \frac{\text{False Alarms}}{\text{False Alarms} + \text{Correct Rejections}}$$

The false alarm rate tells us the proportion of trials when the signal was absent that we erroneously responded ‘present’.
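
Because the hit rate and false alarm rate are computed the same way, just over complementary subsets of trials, one short Python sketch can cover both (function names and example counts are again our own):

```python
def hit_rate(hits, misses):
    """Proportion of signal-present trials on which we responded 'present'."""
    return hits / (hits + misses)


def false_alarm_rate(false_alarms, correct_rejections):
    """Proportion of signal-absent trials on which we responded 'present'."""
    return false_alarms / (false_alarms + correct_rejections)


print(hit_rate(hits=40, misses=10))                              # 0.8
print(false_alarm_rate(false_alarms=20, correct_rejections=30))  # 0.4
```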

Tabulating outcomes and rates

Now that we have a few rates to work with, we can add them to our table. Since the hit rate summarizes performance when the signal was present and the false alarm rate summarizes performance when the signal was absent, we can add them to the corresponding rows. And since accuracy is an overall summary, we’ll stick it in the bottom right corner.

Run a block of trials, and see how the table of outcomes and rates provides a running overview of performance:
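
As a rough sketch of how such a table could be assembled in code (the layout and example counts are our own invention, not the interactive table's), one possibility:

```python
def summary_table(hits, misses, false_alarms, correct_rejections):
    """Print outcome counts with HR and FAR on their rows, accuracy in the corner."""
    hr = hits / (hits + misses)
    far = false_alarms / (false_alarms + correct_rejections)
    total = hits + misses + false_alarms + correct_rejections
    acc = (hits + correct_rejections) / total
    print(f"{'':16}{'present':>10}{'absent':>10}{'rate':>12}")
    print(f"{'signal present':16}{hits:>10}{misses:>10}{f'HR = {hr:.2f}':>12}")
    print(f"{'signal absent':16}{false_alarms:>10}{correct_rejections:>10}{f'FAR = {far:.2f}':>12}")
    print(f"{f'accuracy = {acc:.0%}':>48}")

summary_table(hits=40, misses=10, false_alarms=20, correct_rejections=30)
```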

Exploring outcomes and rates

Let’s learn a little more about the relationships between our rates and measures by playing around with hypothetical values:

This is a live table. Change any count or rate and observe the effect on other values. Note that since the counts are necessarily whole numbers, whereas the rates are real numbers, small changes in a rate may not be reflected in the counts due to rounding.

If you play around a bit, you may notice that, even if you keep the total number of trials constant, you can come up with more than one set of values that has the same accuracy. Hmmm, interesting…

Accuracy, the great deceiver

At first glance, accuracy seems like a very convenient way to summarize overall performance in a single number. But as students of Signal Detection Theory, we must be very wary of it. Indeed it is the great deceiver! Let us see why:

Consider this table of outcomes:

                     Respond 'present'      Respond 'absent'
Signal present       Hits: 100              Misses: 0                  HR = 1.00
Signal absent        False alarms: 100      Correct rejections: 0      FAR = 1.00
                                                                       Accuracy = 50%

And now consider this table of outcomes:

                     Respond 'present'      Respond 'absent'
Signal present       Hits: 0                Misses: 100                HR = 0.00
Signal absent        False alarms: 0        Correct rejections: 100    FAR = 0.00
                                                                       Accuracy = 50%

First, note that the accuracy is identical in the two tables, at 50%. But now note that the actual patterns of performance are completely different! In the first table, the participant had one hundred hits and zero misses, whereas in the second the participant had the exact opposite. Likewise, the first participant had one hundred false alarms and zero correct rejections, whereas again the second participant had the exact opposite. Indeed, if we look down the columns of the tables, we can observe that the first participant always responded ‘present’, whereas the second participant always responded ‘absent’. And yet, despite these completely different patterns of outcomes, both participants ended up with the same accuracy. It turns out that in the context of signal detection, accuracy is a surprisingly poor indicator of performance.

On the other hand, the hit rate and false alarm rate are completely different for our two participants, clearly communicating the vast differences in performance. By telling us how the participant performed when the signal was present and when it was absent, the combination of HR and FAR gives us a more complete summary picture of performance.
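
A short Python sketch makes the contrast concrete, reproducing the two hypothetical participants from the tables above:

```python
# Two hypothetical participants with identical accuracy but opposite behavior.
participants = {
    "always 'present'": dict(hits=100, misses=0, false_alarms=100, correct_rejections=0),
    "always 'absent'": dict(hits=0, misses=100, false_alarms=0, correct_rejections=100),
}

for name, c in participants.items():
    acc = (c["hits"] + c["correct_rejections"]) / sum(c.values())
    hr = c["hits"] / (c["hits"] + c["misses"])
    far = c["false_alarms"] / (c["false_alarms"] + c["correct_rejections"])
    print(f"{name}: accuracy = {acc:.2f}, HR = {hr:.2f}, FAR = {far:.2f}")

# always 'present': accuracy = 0.50, HR = 1.00, FAR = 1.00
# always 'absent': accuracy = 0.50, HR = 0.00, FAR = 0.00
```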

Now, you may be mumbling to yourself that something else peculiar is going on in this example, namely that neither participant seems to be paying any attention to the stimuli at all. (Always responding either ‘present’ or ‘absent’ can be done without regard to the stimulus.) You, my astute reader, are on to something important, which the next few pages will bring into focus.