Turnout didn’t decide the election. Preferences did.

Ahead of the election, I was chatting with a reporter. They mentioned that everyone was talking about turnout. Philadelphia had seen weak relative turnout in the last two years, they pointed out, and that would decide this race.

I pushed back. Turnout is not the story here. We know basically what turnout will be. The big open question is preferences.

I wish I had gone on the record. I would have looked like a genius.

Put another way, as I explained on that Friday ahead of the election, if there were one piece of information that would help me predict the outcome, it would not be the relative turnouts of Voting Blocs. It would be Helen Gym’s performance in the Black Wards. If she gets 30%, she wins. If she gets 10%, she loses. Both seemed in play. She ended up at 14%.

This brings up an important point about this Municipal Primary:

Preferences decided the election, not turnout

First, let’s be clear what I mean. It is trivially true that if zero of a candidate’s supporters turned out to vote, she would have lost. So in a completely uninteresting sense, turnout mattered.

A more useful statement is that the plausible range of turnouts in Voting Blocs had much less impact on the final result than the plausible range of preferences. The Black Voters Divisions were certainly going to represent between 35% and 45% of the City’s votes.

But with only limited polling, we had only a rough guess at voters’ preferences. Support for Cherelle Parker in those Divisions could have plausibly been anywhere from 30% to 60% in this crowded race. She ended up at 57%.

The plausible range of turnouts in Black Voter Divisions could have swung the topline result +/- 2.5 percentage points. The plausible range of preferences in those Black Voter Divisions could have swung it +/- 6. Cherelle Parker won because she did extremely well among people we always expected to vote, and not by achieving an extreme turnout among her base.
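
As a back-of-the-envelope check on those two figures, under assumptions that are purely illustrative: take the Black Voter Divisions at about 40% of the vote (the midpoint of the range above), let their share move by ±5 points, and suppose a 50-point gap between Parker’s support inside and outside those Divisions (a supposition, not a reported result). The preference range of 30% to 60% is ±15 points around its midpoint.

\[ \underbrace{0.05 \times 0.50}_{\text{turnout swing}} = 0.025, \qquad \underbrace{0.40 \times 0.15}_{\text{preference swing}} = 0.06 \]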

Some Math

Let’s formalize this. A candidate’s overall proportion of the vote is the average of their proportions \(p_i\) in each geography \(i\), weighted by turnout \(t_i\).

\[ p = \frac{1}{\sum t_i} \sum t_i p_i \]

We can normalize turnout using \(\tilde{t}_i = \frac{t_i}{\sum t_i}\), so that \(\tilde{t}_i\) is each geography’s proportion of total turnout. Then

\[ p = \sum \tilde{t}_i p_i \]
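
A minimal Python sketch of that weighted average. The function name `topline_share` and the two-bloc numbers are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def topline_share(turnout, support):
    """Overall vote share: support in each geography, weighted by turnout."""
    total = sum(turnout)
    weights = [t / total for t in turnout]  # normalized turnout, sums to 1
    return sum(w * p for w, p in zip(weights, support))


# Hypothetical two-bloc city: raw votes cast and the candidate's share in each.
turnout = [100_000, 150_000]
support = [0.57, 0.17]

overall = topline_share(turnout, support)
print(round(overall, 3))
```

Here the first bloc casts 40% of the votes, so the overall share is 0.4 × 0.57 + 0.6 × 0.17 = 0.33.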

Ahead of the election, we don’t know what each \(\tilde{t}_i\) and \(p_i\) is. Instead, we have priors with variances. The Law of Total Variance tells us

\[ Var(p) = E_p[Var( \sum \tilde{t}_i p_i |\vec{p})] + Var_p(E[ \sum \tilde{t}_i p_i |\vec{p}]) \] where \(\vec{p}\) is the vector of all \(p_i\).

The expectations in the second term simply add, and I treat turnout as independent of preferences, so that \(E[\tilde{t}_i | \vec{p}] = E[\tilde{t}_i]\). We will also assume that a candidate’s \(p_i\) is uncorrelated across geographies (that is, of course, the definition of my Voting Blocs), so the variance of the weighted sum is the sum of the variances, each scaled by its squared weight.

\[\begin{align*} Var(p) &= E_p[Var( \sum \tilde{t}_i p_i |\vec{p})] + Var_p( \sum E[\tilde{t}_i] p_i) \\ &= E_p[Var( \sum \tilde{t}_i p_i |\vec{p})] + \sum E[\tilde{t}_i]^2 Var(p_i) \end{align*}\]

The variances in the first term are a little bit complicated, since the normalization of \(\tilde{t}\) means they will negatively covary across Divisions.

If we simplify to assuming only two geographies (for example, two Voting Blocs), then the Blocs will have a perfect -1 correlation in \(\tilde{t}\).

\[ Var(\tilde{t}) = \left[\begin{array}{rr} \sigma_t^2 & -\sigma_t^2 \\ -\sigma_t^2 & \sigma_t^2 \end{array}\right] \]

In this case, the conditional variance is \(\sigma_t^2 (p_1 - p_2)^2\), and the uncertainty in the topline result reduces to

\[ Var(p) = \sigma_t^2 E[(p_1 - p_2)^2] + (E[\tilde{t}_1]^2 Var(p_1) + E[\tilde{t}_2]^2 Var(p_2)) \]

Which of these terms is bigger? In this past election, the second was much bigger than the first.

The standard deviation of uncertainty in turnout proportions (\(\sigma_t\)) was maybe 0.04, with expected performance differences between Blocs of maybe \(p_1 - p_2 \approx 0.4\), to be generous. That gives a contribution of \(0.04^2 \times 0.4^2 = 0.016^2\).

The standard deviation of uncertainty in candidate preferences (\(\sqrt{Var(p_i)}\)) was maybe 0.10. With two Blocs at roughly 40% and 60% of the vote, the squared weights sum to \(0.4^2 + 0.6^2 = 0.52\), so that contributes \(0.52 \times 0.10^2 \approx 0.072^2\).

The result: uncertainty in candidates’ performance in each Bloc contributed more than four times as much uncertainty as that of Blocs’ relative turnout!

So yeah, while it’s trivially true that with zero turnout, a candidate can’t win, it was Cherelle Parker’s high percentage among those who we expected to vote, and not a surprisingly high turnout among any group, that won the day.
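
The two-Bloc comparison can also be checked by brute force. What follows is a toy Monte Carlo sketch under stated assumptions: every number in it (a 40% expected turnout share with standard deviation 0.04, support of 50% and 10% in the two Blocs with standard deviation 0.10) is illustrative, and the function name `simulate_sd` is made up for this example. It freezes one source of uncertainty at a time and measures how much the topline moves from the other.

```python
import random
import statistics


def simulate_sd(vary_turnout, vary_preferences, n_draws=100_000, seed=0):
    """Standard deviation of the topline share in a two-bloc toy model.

    All inputs below are illustrative assumptions, not estimates from data.
    Gaussian draws are not clipped to [0, 1]; this is a variance check only.
    """
    rng = random.Random(seed)
    mean_t1, sd_t = 0.40, 0.04   # bloc 1's share of total turnout
    mean_p = (0.50, 0.10)        # candidate's expected support in each bloc
    sd_p = 0.10                  # uncertainty in support within each bloc
    draws = []
    for _ in range(n_draws):
        t1 = rng.gauss(mean_t1, sd_t) if vary_turnout else mean_t1
        p1 = rng.gauss(mean_p[0], sd_p) if vary_preferences else mean_p[0]
        p2 = rng.gauss(mean_p[1], sd_p) if vary_preferences else mean_p[1]
        # Turnout shares sum to 1, so bloc 2's share is 1 - t1.
        draws.append(t1 * p1 + (1 - t1) * p2)
    return statistics.pstdev(draws)


sd_turnout_only = simulate_sd(vary_turnout=True, vary_preferences=False)
sd_preferences_only = simulate_sd(vary_turnout=False, vary_preferences=True)
sd_both = simulate_sd(vary_turnout=True, vary_preferences=True)
print(sd_turnout_only, sd_preferences_only, sd_both)
```

With these inputs, the preferences-only standard deviation should come out several times larger than the turnout-only one, and the combined figure should sit only slightly above the preferences-only one.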

A caveat

It’s important to note this analysis is for a municipal election with many candidates and sparse, uneven polling. Uncertainty in \(p\) was huge! In a Presidential election, comparatively, we know a lot more about \(p\). Uncertainty in \(t\) is probably more comparable in size to uncertainty in \(p\), though I doubt enough to actually become more important.