ROC & AUC, animated

How well can you tell two stimuli apart from a noisy firing rate alone? Following Maneesh Sahani’s Theoretical Neuroscience slides on rate codes, suppose a neuron must support a binary choice — present / absent, up / down, horizontal / vertical. Call the two stimuli $\mathtt{s_0}$ and $\mathtt{s_1}$. On any trial the response (say a spike count) is $n$, drawn from one of two distributions:

$$ P(n\mid \mathtt{s_0})\quad\text{and}\quad P(n\mid \mathtt{s_1}). $$

A decision rule sets a criterion $c$ and reports “$\mathtt{s_1}$” whenever $n > c$. Two numbers summarise it, exactly as on the slides: the hit rate, which Wikipedia calls the true-positive rate (TPR), and the false-alarm rate, which it calls the false-positive rate (FPR):

$$ \begin{aligned} \text{TPR}(c)\ \ (\text{hit rate}) &= P(n>c \mid \mathtt{s_1}) = \int_{c}^{\infty}\! P(n\mid \mathtt{s_1})\,dn,\\[3pt] \text{FPR}(c)\ \ (\text{false-alarm rate}) &= P(n>c \mid \mathtt{s_0}) = \int_{c}^{\infty}\! P(n\mid \mathtt{s_0})\,dn. \end{aligned} $$

Both depend on the criterion $c$ — that is the whole point — so we carry the $(c)$ everywhere.
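As a rough numerical sketch of the two tail integrals, the snippet below evaluates $\text{FPR}(c)$ and $\text{TPR}(c)$ for two unit-variance Gaussians. The placement of the means at $\mp d'/2$ and the helper names `phi` and `rates` are assumptions made for illustration, not something fixed by the slides.

```python
import math

def phi(z: float) -> float:
    """Standard normal CDF, written with the complementary error function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def rates(c: float, d_prime: float) -> tuple[float, float]:
    """Return (FPR(c), TPR(c)) for the rule 'report s1 when n > c'.

    Assumption: unit-variance Gaussians centred at -d'/2 (s0) and +d'/2 (s1).
    """
    mu0, mu1 = -d_prime / 2.0, d_prime / 2.0
    fpr = phi(mu0 - c)  # P(n > c | s0) = 1 - Phi(c - mu0)
    tpr = phi(mu1 - c)  # P(n > c | s1) = 1 - Phi(c - mu1)
    return fpr, tpr

fpr, tpr = rates(c=0.0, d_prime=1.5)
print(f"FPR = {fpr:.2f}, TPR = {tpr:.2f}")  # FPR = 0.23, TPR = 0.77
```

With the criterion midway between the two means the errors are symmetric, so the miss rate $1-\text{TPR}$ equals the false-alarm rate.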

There is no single best $c$: lowering it catches more real $\mathtt{s_1}$ trials but also raises false alarms. Sweep $c$ across the whole axis and plot $\big(\text{FPR}(c),\,\text{TPR}(c)\big)$ — the receiver operating characteristic. Press play to sweep the criterion, drag the purple criterion line, or grab the operating point on the ROC plot directly; change $d'$ — how far apart the two distributions sit — to see the curve bow toward the perfect corner.

[Interactive figure: Receiver operating characteristic, sweep the criterion. Left panel: response distributions P(n | s), drag to move the criterion. Right panel: ROC space, drag the operating point. Controls: criterion c and separation d′. Example readout at d′ = 1.50, c = 0.00: TPR(c) (hit rate) 0.77, FPR(c) (false-alarm rate) 0.23, AUC 0.86.]

Legend: TP (hit) is the area under P(n|s₁) with n > c; FP (false alarm) is the area under P(n|s₀) with n > c; FN is a miss; TN is a correct rejection. Also marked are the criterion c (threshold) and the operating point (FPR(c), TPR(c)).

Each criterion c gives one (false-alarm, hit) point; sweeping c from left (call everything "s₁": top-right corner) to right (call nothing "s₁": bottom-left) traces the whole curve. AUC, the shaded area, is the probability that a random s₁ response outranks a random s₀ response; for equal-variance Gaussians it equals Φ(d′/√2). The dashed diagonal is the no-discrimination line (AUC = ½, chance).

Reading the picture

The decision rule cuts the response axis at the criterion $c$: everything to the right is predicted “$\mathtt{s_1}$” (positive), everything to the left “$\mathtt{s_0}$” (negative). That splits each distribution into the four cells of the Wikipedia confusion matrix — true/false × positive/negative:

	predicted $\mathtt{s_1}$: $n>c$	predicted $\mathtt{s_0}$: $n\le c$
condition $\mathtt{s_1}$ — positives (P)	true positive · TP (hit)	false negative · FN (miss)
condition $\mathtt{s_0}$ — negatives (N)	false positive · FP (false alarm)	true negative · TN (correct rejection)

Each cell is an area under the relevant distribution, and because the rule is “positive when $n>c$”, each ROC axis is a tail probability. In Wikipedia’s canonical form:

$$ \begin{aligned} \text{TPR}(c) &= P(n>c \mid \mathtt{s_1}) = \frac{\mathrm{TP}}{\mathrm{TP}+\mathrm{FN}} = \int_{c}^{\infty}\! P(n\mid \mathtt{s_1})\,dn,\\[4pt] \text{FPR}(c) &= P(n>c \mid \mathtt{s_0}) = \frac{\mathrm{FP}}{\mathrm{FP}+\mathrm{TN}} = \int_{c}^{\infty}\! P(n\mid \mathtt{s_0})\,dn. \end{aligned} $$

(TPR is also called sensitivity or recall; FPR is fall-out, $1-\text{specificity}$.) So the teal tail under $P(n\mid \mathtt{s_1})$ is $\text{TPR}(c)$ and the terracotta tail under $P(n\mid \mathtt{s_0})$ is $\text{FPR}(c)$; drop both onto the ROC axes and you get one point.
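The same ratios can be estimated from simulated trials by counting the four cells directly. The following is a minimal sketch, assuming unit-variance Gaussian responses with means at $\mp d'/2$; the function name `confusion_counts`, the seed and the trial count are illustrative choices only.

```python
import random

def confusion_counts(s0_responses, s1_responses, c):
    """Count the four confusion-matrix cells for the rule 'report s1 when n > c'."""
    tp = sum(n > c for n in s1_responses)  # hits
    fn = len(s1_responses) - tp            # misses
    fp = sum(n > c for n in s0_responses)  # false alarms
    tn = len(s0_responses) - fp            # correct rejections
    return tp, fn, fp, tn

rng = random.Random(0)  # fixed seed, only so the sketch is reproducible
d_prime, c, trials = 1.5, 0.0, 20_000
# Assumption: unit-variance Gaussian responses centred at -d'/2 and +d'/2.
s0 = [rng.gauss(-d_prime / 2, 1.0) for _ in range(trials)]
s1 = [rng.gauss(+d_prime / 2, 1.0) for _ in range(trials)]

tp, fn, fp, tn = confusion_counts(s0, s1, c)
tpr_hat = tp / (tp + fn)  # empirical hit rate
fpr_hat = fp / (fp + tn)  # empirical false-alarm rate
print(f"TPR ~ {tpr_hat:.3f}, FPR ~ {fpr_hat:.3f}")
```

With this many trials the estimates should land close to the tail areas, roughly 0.77 and 0.23 for these settings.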

The operating point is the criterion

Each value of $c$ produces exactly one pair $\big(\mathrm{FPR}(c),\,\mathrm{TPR}(c)\big)$ — a single dot in ROC space, the operating point, drawn here in the same purple as the criterion line because it is the same thing seen twice. The ROC curve is nothing more than the image of the whole criterion axis under that map,

$$ c\ \longmapsto\ \big(\mathrm{FPR}(c),\,\mathrm{TPR}(c)\big), \qquad c:\ +\infty\to-\infty \ \Longrightarrow\ (0,0)\to(1,1), $$

so the two panels share a single degree of freedom: sliding the purple line and dragging the purple dot are the same knob. Raise the threshold (push $c$ right) and the point slides down the curve toward the origin — few hits but few false alarms, a conservative “rarely say $\mathtt{s_1}$”. Lower it (pull $c$ left) and the point climbs toward $(1,1)$ — a liberal “almost always say $\mathtt{s_1}$”. The map is monotone, so the operating point can only travel along the curve, never off it: the criterion chooses where on the curve you sit, while the separation $d'$ below sets which curve you are on.
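To make the map $c\mapsto(\mathrm{FPR}(c),\mathrm{TPR}(c))$ concrete, here is a small sketch that sweeps the criterion from high to low and checks that the operating point only ever moves up and to the right. It again assumes unit-variance Gaussians with means at $\mp d'/2$; `tail` and `roc_points` are hypothetical helper names.

```python
import math

def tail(c, mu):
    """P(n > c) for a unit-variance Gaussian with mean mu."""
    return 0.5 * math.erfc((c - mu) / math.sqrt(2.0))

def roc_points(d_prime, c_values):
    """Map each criterion c to its operating point (FPR(c), TPR(c)).

    Assumption: s0 is centred at -d'/2 and s1 at +d'/2.
    """
    mu0, mu1 = -d_prime / 2.0, d_prime / 2.0
    return [(tail(c, mu0), tail(c, mu1)) for c in c_values]

# Sweep c from high (conservative) to low (liberal): 6.0, 5.9, ..., -6.0.
c_values = [6.0 - 0.1 * k for k in range(121)]
points = roc_points(1.5, c_values)

# Lowering c can only add area to both tails, so neither coordinate decreases.
nondecreasing = all(
    f2 >= f1 and t2 >= t1
    for (f1, t1), (f2, t2) in zip(points, points[1:])
)
print(points[0], points[-1], nondecreasing)
```

The first point sits essentially at $(0,0)$ and the last at $(1,1)$; changing the `d_prime` argument swaps in a different curve, while the sweep only moves along it.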

What AUC measures

The area under the ROC curve is a single, threshold-free summary of how separable the two distributions are. Equivalently, it is the probability that a randomly drawn $\mathtt{s_1}$ response outranks a randomly drawn $\mathtt{s_0}$ response,

$$ \mathrm{AUC}=P\!\big(n_1 > n_0\big),\qquad n_1\sim P(n\mid \mathtt{s_1}),\; n_0\sim P(n\mid \mathtt{s_0}). $$

Derivation: why AUC = P(n₁ > n₀)

Write the ROC curve parametrically in the criterion $c$, keeping the rule “report $\mathtt{s_1}$ when $n>c$”. With $F_0,F_1$ the cumulative distribution functions of the two responses and $f_0=P(\,\cdot\mid \mathtt{s_0})$ the $\mathtt{s_0}$ density,

$$ \mathrm{FPR}(c)=P(n_0>c)=1-F_0(c),\qquad \mathrm{TPR}(c)=P(n_1>c)=1-F_1(c). $$

The AUC is the area under the curve — $\mathrm{TPR}(c)$ integrated against $\mathrm{FPR}(c)$ as $c$ sweeps the axis. Substituting $d\,\mathrm{FPR}=-f_0(c)\,dc$ and noting that $c:+\infty\to-\infty$ drives $\mathrm{FPR}:0\to1$, the two sign flips cancel:

$$ \mathrm{AUC}=\int_0^1 \mathrm{TPR}\;d\,\mathrm{FPR} =\int_{+\infty}^{-\infty}\!\big(1-F_1(c)\big)\big(-f_0(c)\big)\,dc =\int_{-\infty}^{\infty}P(n_1>c)\,f_0(c)\,dc. $$

Now read $f_0(c)\,dc$ as the probability that the $\mathtt{s_0}$ draw lands in the small interval $[c,\,c+dc]$. Because $n_0$ and $n_1$ are independent, $P(n_1>c)=P(n_1>n_0\mid n_0=c)$, so integrating over $c$ is exactly the law of total probability:

$$ \mathrm{AUC}=\int_{-\infty}^{\infty}P\!\big(n_1>n_0\mid n_0=c\big)\,f_0(c)\,dc =P\!\big(n_1>n_0\big). $$

So the area under the ROC curve is the probability that a random response to $\mathtt{s_1}$ outranks a random response to $\mathtt{s_0}$. (For continuous distributions ties have probability zero. For discrete responses such as spike counts, each tie counts as $\tfrac12$, so $\mathrm{AUC}=P(n_1>n_0)+\tfrac12 P(n_1=n_0)$.)
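The equivalence can be checked on a handful of discrete responses, where ties actually occur. The sketch below compares the pairwise-ranking estimate (ties counted as one half) with the trapezoidal area under the empirical ROC. The spike counts are made-up toy values, and both function names are hypothetical.

```python
def auc_by_ranking(s0, s1):
    """Estimate P(n1 > n0) + 0.5 * P(n1 == n0) over all (s0, s1) response pairs."""
    score = 0.0
    for n1 in s1:
        for n0 in s0:
            if n1 > n0:
                score += 1.0
            elif n1 == n0:
                score += 0.5  # a tie contributes one half
    return score / (len(s0) * len(s1))

def auc_by_sweep(s0, s1):
    """Trapezoidal area under the empirical ROC traced by lowering the criterion."""
    thresholds = sorted(set(s0) | set(s1), reverse=True)
    points = [(0.0, 0.0)]  # criterion above every response: nothing is called s1
    for c in thresholds:
        # n >= c is the rule n > c' for a criterion c' just below c.
        fpr = sum(n >= c for n in s0) / len(s0)
        tpr = sum(n >= c for n in s1) / len(s1)
        points.append((fpr, tpr))
    area = 0.0
    for (f1, t1), (f2, t2) in zip(points, points[1:]):
        area += (f2 - f1) * (t1 + t2) / 2.0
    return area

# Toy spike counts, invented purely for illustration.
s0_counts = [2, 3, 3, 5]
s1_counts = [3, 4, 6, 6]
print(auc_by_ranking(s0_counts, s1_counts), auc_by_sweep(s0_counts, s1_counts))
# 0.8125 0.8125
```

Of the 16 pairs here, 12 are wins for $\mathtt{s_1}$ and 2 are ties, giving $(12+1)/16$.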

For two unit-variance Gaussians a distance $d'=\mu_1-\mu_0$ apart — the equal-variance case drawn here — this has a closed form,

$$ \mathrm{AUC}=\Phi\!\left(\frac{d'}{\sqrt2}\right), $$

so $d'=0$ gives AUC $=\tfrac12$ (the dashed no-discrimination diagonal, pure chance) and larger $d'$ pushes the curve toward the perfect corner $(0,1)$ with AUC $\to 1$. Pull the $d'$ slider to watch the distributions separate and the area fill in.
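One way to see where the closed form comes from: for independent unit-variance Gaussians the difference $n_1-n_0$ is Gaussian with mean $d'$ and variance $2$, so $P(n_1>n_0)=\Phi(d'/\sqrt2)$. The sketch below checks this against a direct midpoint-rule evaluation of $\int P(n_1>c)\,f_0(c)\,dc$. The means at $\mp d'/2$, the integration limits and the step count are assumptions chosen for illustration.

```python
import math

def phi(z):
    """Standard normal CDF."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def gauss_pdf(x, mu):
    """Density of a unit-variance Gaussian with mean mu."""
    return math.exp(-0.5 * (x - mu) ** 2) / math.sqrt(2.0 * math.pi)

def auc_closed_form(d_prime):
    """AUC = Phi(d'/sqrt(2)) for equal, unit-variance Gaussians."""
    return phi(d_prime / math.sqrt(2.0))

def auc_numeric(d_prime, lo=-10.0, hi=10.0, steps=20_000):
    """Midpoint-rule estimate of the integral of P(n1 > c) * f0(c) over c."""
    mu0, mu1 = -d_prime / 2.0, d_prime / 2.0  # assumed placement of the means
    dc = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        c = lo + (k + 0.5) * dc
        total += phi(mu1 - c) * gauss_pdf(c, mu0) * dc  # P(n1 > c) * f0(c) * dc
    return total

for d in (0.0, 1.5, 3.0):
    print(d, round(auc_closed_form(d), 4), round(auc_numeric(d), 4))
```

The two columns should agree to the printed precision, with $d'=0$ giving one half and $d'=1.5$ giving roughly 0.86.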

Notation and the criterion-sweep construction after Maneesh Sahani, Theoretical Neuroscience (rate codes / signal detection, p. 301 ff). The FPR–TPR axes, the confusion-matrix terminology (TP/FP/FN/TN, sensitivity / fall-out) and the no-discrimination diagonal follow Wikipedia: Receiver operating characteristic; the criterion and its operating point share one colour to mark them as the same object.