The Turnout Funnel

The turnout in November was unprecedented. Nationally, we had the highest midterm turnout since the passage of the Voting Rights Act. In Philadelphia, some 53% of registered voters voted, a huge increase over four years ago, or really over any midterm election in recent memory. Turnout can be hard to wrap your mind around, with a lot of factors affecting who votes and in which elections. What drives those numbers? Is it differences in registration? Or variations in the profile of the election? And how is that structured by neighborhood?

Decomposing the Funnel
I’m conceptualizing the entire process that filters the whole population down to actual voters as a funnel, with four stages at which a fraction of the population drops out.

Let’s break down the funnel using the following accounting equation:

Voters per mile =
(Population over 18 per mile) x
(Proportion of over-18 population that are citizens) x
(Proportion of citizens that are registered to vote) x
(Proportion of registered voters that voted in 2016) x
(Proportion of those who voted in 2016 that voted in 2018).
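
To make the decomposition concrete, here’s a minimal sketch of the arithmetic for a single hypothetical ward. Every number and variable name below is made up for illustration; the real inputs come from the ACS and the voter file.

```python
# A minimal sketch of the funnel for one hypothetical ward.
# All numbers are illustrative, not actual data.

adults_per_mile = 20_000   # population over 18 per square mile
p_citizen = 0.92           # proportion of adults who are citizens
p_registered = 0.93        # proportion of citizens who are registered
p_voted_2016 = 0.68        # proportion of registered voters who voted in 2016
p_voted_2018 = 0.70        # proportion of 2016 voters who voted in 2018

voters_per_mile_2018 = (
    adults_per_mile * p_citizen * p_registered * p_voted_2016 * p_voted_2018
)
print(f"2018 voters per square mile: {voters_per_mile_2018:,.0f}")
```

Because the funnel is a simple product, a shortfall at any one stage scales the final number of voters down by the same factor, which is what makes the stage-by-stage maps below comparable.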

The implications are very different depending on which of these steps a neighborhood lags in. If a ward has a low proportion of citizens that are registered, registration drives make sense. If a neighborhood has very low turnout in midterms versus presidential elections, then awareness and motivation are what matter.

We are also going to see that using metrics based on Registered Voters in Philadelphia is flawed. The lack of removal of voters from the rolls—which is actually a good practice, since the alternative is to aggressively remove potential voters—means that the proportion of registered voters that vote is not a useful metric.

Funnel Maps
Let’s walk through the equation above, map by map.

Overall, 44% of Philadelphia’s adults voted in 2018. That represented high turnout (measured as a proportion of adults) from the usual suspects: the ring around Center City, Chestnut Hill, Mount Airy, and the Oak Lanes.

How does the funnel decompose these differences?

First, consider something not in that map: population density. This is obviously, almost definitionally, important. Wards with more population will have more voters.

The ring around Center City, North Philly, and the lower Northeast have much higher densities than the rest of the city. [Note: See the appendix for a discussion of how I mapped Census data to Wards]

We don’t allow all of those adults to vote. I don’t have data on explicit voter eligibility, but we can at least look at citizenship.

The city as a whole is 92% citizens, and some 53 of the city’s 66 wards are more than 90% citizen. Notable exceptions include University City, and immigrant destinations in South Philly and the lower Northeast.

The next step in the funnel: what fraction of those citizens are registered to vote? Here’s where things get hairy.

Overall in the city, there are 93% as many registered voters as there are adult citizens. That’s high, and a surprising 22 wards have *more registered voters than the census thinks there are adult citizens*. What’s up? Philadelphia is very slow to remove voters from the rolls, and this imbalance likely represents people who have moved, but haven’t been removed.

While people tend to use facts like this to suggest conspiracy theories, Philadelphia’s ratio is actually quite typical for the state, across wealthy and poor, Republican and Democratic counties. [See below!]

And it’s very (very) important to point out that this is good for democracy: being too aggressive in removing voters means disenfranchising people who haven’t moved. And we have no evidence that anybody who has moved is actually *voting*, just that their name remains in the database.

It does, however, mean that measuring turnout as a fraction of registered voters is misleading.

I explore the registered voter question below, but for the time being let’s remove the registration process from the equation above to sidestep the issue:

Voters per mile =
(Population over 18 per mile) x
(Proportion of over-18 population that are citizens) x
(Proportion of citizens that voted in 2016) x
(Proportion of those who voted in 2016 that voted in 2018).
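
Plugging the rounded citywide figures quoted in this post into the reduced identity (per adult, ignoring the density term) gives a rough sense of the implied 2016-to-2018 retention. This is back-of-the-envelope arithmetic on rounded numbers, not a reported figure.

```python
# Back-of-the-envelope, using the rounded citywide numbers from this post.
p_adults_voted_2018 = 0.44     # proportion of adults who voted in 2018
p_citizen = 0.92               # proportion of adults who are citizens
p_citizens_voted_2016 = 0.63   # proportion of adult citizens who voted in 2016

# Reduced identity: p_adults_voted_2018 = p_citizen * p_citizens_voted_2016 * retention
retention_2016_to_2018 = p_adults_voted_2018 / (p_citizen * p_citizens_voted_2016)
print(f"implied 2016-to-2018 retention: {retention_2016_to_2018:.0%}")  # roughly 76%
```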

Why break out 2016 voters first, then 2018? Because there are deep differences in the processes that lead people to vote in Presidential elections versus midterms, and low participation in each has different solutions. Presidential elections are obviously high-attention, high-energy affairs. If someone didn’t vote in 2016, didn’t turn out for the highest-profile, biggest-budget election of the cycle, they either face steep barriers to voting (time constraints, bureaucratic blockers, awareness), or are seriously disengaged. If someone didn’t vote in 2016, it’s hard to imagine you’d get them to vote in 2018.

Compare that to people who voted in 2016 but not in 2018. These voters (a) are all registered, (b) are able to get to the polling place, and (c) know where their polling place is. This group is potentially easier to get to vote in a midterm: they’re either unaware of or uninterested in the lower-profile races of the midterm, or have made the calculated decision to skip it. Whichever the reason, it seems a lot easier to get presidential voters to vote in a midterm than to get the citizens who didn’t even vote in the Presidential race.

So which citizens voted in 2016?

Overall, 63% of Philadelphia’s adult citizens voted. This percentage has large variation across wards, ranging from the low 50s along the river, to 81% in Grad Hospital’s Ward 30 [Scroll to the bottom for a map with Ward numbers]. Wards 28 and 47 in Strawberry Mansion and North Philly have high percentages here, and also had high percentages in the prior map of registered voters, which to me suggests a combination of high registration rates in the neighborhood and a low population estimate from the ACS (see the discussion of registered voters below).

How many of these 2016 voters came out again two years later?

This map is telling. While there were fine-grained changes in the map versus 2014’s midterm, the overall pattern of who votes in midterms didn’t fundamentally change: it’s the predominantly White neighborhoods in Center City, its neighbors, and Chestnut Hill and Mount Airy, coupled with some high-turnout predominantly Black Wards in Overbrook, Wynnefield, and West Oak Lane.

The dark spot is the predominantly Hispanic section of North Philly. It’s not entirely surprising that this region would have low turnout in a midterm, but remember that this map has as its denominator *people who voted in 2016*! So there’s even disproportionate fall-off among people who voted just two years ago.

So that’s the funnel. We have small variations in citizenship, confusing variations in the proportions registered, steep differences in who voted in 2016, and then severely class- and race-based differences in who comes out for the midterm. Chestnut Hill and Center City score high on basically all of these dimensions (except for Chestnut Hill’s low population density and Center City’s relatively high proportion of non-citizens), leading to their electoral dominance. Upper North Philly and the Lower Northeast are handcuffed at each stage, with telling correlations between non-citizen residents and the later stages of the funnel, even stages that could in principle be unrelated, such as turnout among citizens.

What’s going on with the Registered Voters?
It’s odd to see more registered voters than the Census claims there are adult citizens. I’ve claimed this is probably due to failing to remove people who move from the rolls, so let’s look at some evidence.

First, let’s consider the problem of uncertainty in the population estimates. The American Community Survey is a random sample, meaning the population counts have uncertainty. I’ve used the 5-year estimates to minimize that uncertainty, but some still exists. The median margin of error in the population estimates among the wards is +/- 7%. This uncertainty is larger for less populous wards: the margin of error for Ward 28’s population is +/-12%. So the high registration-to-population ratios may be partially driven by an unlucky sample giving a small population estimate. But the size of the uncertainty isn’t enough to explain the values that are well above one, and even if it were large enough to bring the ratios just below one, the idea that nearly 100% of the population would be registered would still be implausibly high.
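
As a back-of-the-envelope check on that claim, here’s how the margin of error bounds the ratio for a hypothetical ward; the counts are invented, and only the +/-12% margin comes from the discussion above.

```python
# Can ACS sampling error alone explain a registration ratio above one?
# The counts here are hypothetical; the 12% margin matches Ward 28's.

registered = 11_000      # registered voters on the rolls (hypothetical)
citizens_est = 10_000    # ACS estimate of adult citizens (hypothetical)
moe = 0.12               # +/- 12% margin of error on the estimate

observed_ratio = registered / citizens_est                      # 1.10
# If the true population sat at the top of the ACS interval, the true
# ratio would be as small as:
lowest_plausible_ratio = registered / (citizens_est * (1 + moe))

print(f"observed ratio:         {observed_ratio:.2f}")
print(f"lowest plausible ratio: {lowest_plausible_ratio:.2f}")  # about 0.98
# Even in the friendliest case, nearly 100% of adult citizens would have
# to be registered, which is still implausibly high.
```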

So instead, let’s look at evidence that these high ratios might be systematic, and due to lags in removal. First, consider renters.

Wards with more renters have higher registration rates in the population, including many rates over 100%. On the one hand, this is surprising, because we actually expect renters to be *less* likely to vote; they are often newer to their home and to the city, and less invested in local politics. On the other hand, they have higher turnover, so any lag in removing voters will mean more people registered at a given address. The visible positive correlation suggests that the second effect is strong enough to overcome the first.
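
The correlation itself is easy to check at the ward level; this sketch assumes a hypothetical summary table with illustrative column names (pct_renter, reg_per_citizen), not my actual files.

```python
import pandas as pd

# Hypothetical ward-level summary table; column names are illustrative.
wards = pd.read_csv("ward_summary.csv")

# Correlation across the 66 wards between the share of households that rent
# and the registered-voters-per-adult-citizen ratio.
r = wards["pct_renter"].corr(wards["reg_per_citizen"])
print(f"renter share vs. registration ratio: r = {r:.2f}")
```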

(For reference, here’s a map of Ward numbers)

There’s an even stronger signal than renters, though. Consider the Wards that lost population between 2010 and 2016, according to the Census.

The Wards that saw the greatest declines in population also have the highest proportions of registered voters (except for 65, which is a drastic outlier). This correlation suggests that departing residents may not be removed from the rolls, further pointing the finger at lagging removal as the culprit.

Finally, let’s look at Philadelphia versus *the rest of the state*. It turns out that Philadelphia’s registered voter to adult citizen ratio is typical of the state, especially when you consider its high turnover.

First, there isn’t a strong correlation between registered voters per adult citizen and how Democratic a county is.

Allegheny and Philadelphia, the two most Democratic counties, are 1st and 6th, respectively, but predominantly Republican Pike County, in the state’s northeastern corner, is second, and Philadelphia’s wealthier suburban counties are 3rd, 4th, 5th, and 7th. (I’m not totally sure what’s going on with Forest County, at the bottom of the plot, but the next scatterplot helps a little bit).

More telling is the fraction of a county’s residents who have moved in since 2000: counties with higher turnover have higher ratios, which is what we would expect if the culprit were lagging removal.

Philadelphia here looks entirely normal: counties with more recent arrivals have higher ratios. This also sheds some light on Forest County. With a population of only 7,000, it’s an outlier in terms of long-term residents, and thus has a *very* low registration-to-citizen ratio.

Apologies for the long explanation. I didn’t want to just ignore this finding, but I’m terrified it’ll be discovered and used by some conspiracy theorist.

Exactly where the voters fall out of the funnel matters
Philadelphia’s diverse wards have a diverse array of turnout challenges. Unfortunately, the voter registration rolls are pretty unhelpful as a signal, at least in the simplistic aggregate way I considered them here. (Again: good for democracy, bad for blogging).

Which stage of the funnel matters most depends on two things: how large each drop-off is, and how easy each is to move. Is it more plausible to get out the citizens who didn’t vote in 2016? Or to get the 2016 voters out again in 2018? Where will registration drives help, and where is registration not the issue?

Next year’s mayoral primary, a local election with an incumbent mayor, will likely be an exaggerated version of the midterm, with even more significant fall-off than we saw in November. More on that later.

Appendix: Merging Census data with Wards
I use population data from the Census Bureau’s American Community Survey, an annual sample of the U.S. population with an extensive questionnaire on demographics and housing. Because it’s a sample, I use the five-year aggregate collected from 2012-2016. This minimizes the sampling uncertainty in the estimates, but means there could be some bias in regions where the population has been changing since that period.

The Census collects data in its own regions: block groups and tracts. These don’t line up with Wards, so I’ve built a crosswalk from census geographies to political ones:
– I map the smallest geography, Census Blocks, to the political wards.
– Using that mapping, I calculate the fraction of each block group’s 2010 population that falls in each Ward. I use 2010 because block-level populations are only released with the Decennial Census. Any changes over time in block populations are largely swamped by across-block variations that are consistent through time.
– I then apportion each block group’s ACS counts, of adults and of citizens, according to those proportions.
This method is imperfect; it will be off wherever a block group is split by a ward boundary and is internally segregated, but I think it’s the best available approach.
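
Here’s a minimal sketch of that apportionment, assuming the block-to-ward assignment has already been made and using illustrative file and column names (block_group_geoid, ward, pop_2010, adults, adult_citizens); it isn’t my actual pipeline.

```python
import pandas as pd

# Hypothetical inputs; file and column names are illustrative.
# blocks: one row per 2010 Census block, already assigned to a ward
# acs_bg: one row per block group, with ACS 2012-2016 counts
blocks = pd.read_csv("blocks_with_wards.csv", dtype={"block_group_geoid": str})
acs_bg = pd.read_csv("acs_block_groups.csv", dtype={"block_group_geoid": str})

# Fraction of each block group's 2010 population that falls in each ward.
bg_ward = (blocks.groupby(["block_group_geoid", "ward"])["pop_2010"]
                 .sum().reset_index())
bg_total = bg_ward.groupby("block_group_geoid")["pop_2010"].transform("sum")
bg_ward["share"] = bg_ward["pop_2010"] / bg_total   # block groups with zero
                                                    # 2010 population need care

# Apportion each block group's ACS counts to wards by those shares.
merged = bg_ward.merge(acs_bg, on="block_group_geoid", how="left")
for col in ["adults", "adult_citizens"]:
    merged[col] = merged[col] * merged["share"]

ward_totals = merged.groupby("ward")[["adults", "adult_citizens"]].sum()
print(ward_totals.head())
```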

The State House Model did really well. But it was broken.

Before November’s election, I published a prediction for the lower house of the Pennsylvania General Assembly: Democrats would win 101.5 seats on average (half of the 203), and were slight favorites (53%) to win the majority. Then I found a bug, and published an updated prediction: Democrats would win 96 seats on average (95% confidence interval from 88 to 104), and had only a 13% chance of taking the House. This prediction, while still giving Republicans the majority, represented an average 14-seat pickup for Democrats over 2016.

That prediction ended up being correct: Democrats currently have the lead in 93 seats, right in the meat of the distribution.

But as I dug into the predictions, seat-by-seat, it looked like there were a number of seats that I got wildly wrong. And I’ve finally decided that the model had another bug; probably one that I introduced in the fix to the first one. In this post I’ll outline what it was, and what I think the high-level modelling lessons are for next time. 

Where the model did well, and where it didn’t

The mistake ended up being in the exact opposite place from the fix I implemented: in races with no incumbents.

The State House has 203 seats. This year, there were 23 uncontested Republicans and 55 uncontested Democrats. I got all of them right 😎. The party imbalance in uncontested seats was actually unprecedented in at least the last two decades; they’re usually about even. Among the 125 contested races, 21 had a Democratic incumbent, 76 a Republican incumbent, and 28 no incumbent. I was a little worried that this new imbalance meant the process of choosing which seats to contest had changed, and that past results in contested races would look different from this year’s. Perhaps Democrats were contesting harder seats, and would win a lower percentage of them than in the past. That didn’t prove true.

The aggregate relationship between my predictions and the results looks good. I got the number of incumbents that would lose, both Democratic and Republican, right. In races without incumbents, I predicted an even split of 14 Democrats and 14 Republicans, and the final result of 17 R – 11 D might seem… ok? Within the range of error, as we saw in the histogram above. But the scatterplot tells a different story.

Above is a scatterplot of my predicted outcome (the X axis) versus the actual outcome (the Y axis). Perfect predictions would fall along the 45 degree line. Points are colored by the party of the incumbent (if there was one). The blue dots and the red dots look fine; my model actually expected any given point to be centered around this line, plus or minus 20 points, so that distribution was exactly what was expected. But the black dots look horribly wrong. Among those 28 races without incumbents, I predicted all but three to be close contests, and missed many of them wildly. (That top black dot, for example, was Philadelphia’s District 181, where I predicted Malcolm Kenyatta (D) slightly losing to Milton Street (R). That was wrong, to put it mildly. Kenyatta won 95% of the vote.)

What happened? My new model specification, done in haste to fix the first bug, imposed a faulty logic. It forced the past Presidential races to carry the same information about races without incumbents as about races with them, even though races with incumbents had other information. I should have allowed the model to fall back on district partisanship when it didn’t have incumbent results, but the equivalent of a modelling typo didn’t allow that. Instead, all of these predictions ended up at a bland, basically even race, because the model couldn’t use the right information to differentiate them. My overall House prediction ended up being good only because just 28 of the 203 districts were affected, and getting three too many wrong didn’t make the topline look too bad. But it was a bug.

Modelling Takeaways
I’m new to this world of publishing predictive models based on limited datasets and with severe time constraints (I can’t spend the months on a model that I would in grad school or at work). What are the lessons of how to build useful models under these constraints?

Lesson 1: Go through every single prediction. I never looked at District 181. If I had seen that prediction, I would have realized something was terribly wrong. Instead, I looked at the aggregate predictions (similar to the table above), and things looked okay enough. Next time, I’ll force myself to go through every single prediction (or a large enough sample of predictions if there are too many). When I tried to hand-pick sanity checks based on my gut, I happened to not choose “a race with no incumbent, but which had an incumbent for decades, and which voted for Clinton at over 85%”.

Lesson 2: Prefer clarity of the model’s calculations over flexibility. I fell into the trap of trying to specify the full model in a single linear form. Through generous use of interactions, I thought I would give the model the flexibility to identify different relationships between historic presidential results and incumbency. This would have been correct, if I had implemented it bug-free. But I happened to leave out an important three-way interaction. If I had instead fit separate models for different classes of races, perhaps allowing the estimates to be correlated across models, I would have immediately noticed the differences.
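
For illustration, here’s roughly what the two specifications look like side by side. The column names (dem_share, pres_dem_share_2016, incumbent_party, years_incumbent) are hypothetical, and the actual model was more involved (it also estimated correlated errors across districts); this is a sketch of the structural choice, not the model I ran.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical district-level data; column names are illustrative.
df = pd.read_csv("state_house_districts.csv")

# Option A: one model with generous interactions. Flexible, but fragile:
# drop one interaction term and, for example, open seats get forced to
# share coefficients with incumbent-held seats.
interacted = smf.ols(
    "dem_share ~ pres_dem_share_2016 * incumbent_party * years_incumbent",
    data=df,
).fit()

# Option B: separate, simpler models for each class of race. Less flexible,
# but each fit is easy to inspect (and sanity-check) on its own.
open_seats = smf.ols(
    "dem_share ~ pres_dem_share_2016",
    data=df[df["incumbent_party"] == "none"],
).fit()
held_seats = smf.ols(
    "dem_share ~ pres_dem_share_2016 + incumbent_party * years_incumbent",
    data=df[df["incumbent_party"] != "none"],
).fit()
```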

Lesson 2b: I actually learned this extension to Lesson 2 in the process of fitting, but the post-hoc assessment drives it home: when you have good information, the model can be quite simple. In this case, the final predictions did well even with the bug because the aggregate result was really pretty easy to predict given three valuable pieces of information: (a) incumbency, (b) past state house results, and (c) FiveThirtyEight’s Congressional predictions. The last was vital: it’s a high-quality signal about the overall sentiment of this year’s election, which is the biggest factor after each district’s partisanship. A model that used only these three data points and then correctly estimated districts’ correlated errors around those trends would have gotten this election spot on.

Predictions will be back
For better or worse, I’ve convinced myself that this project is actually possible to do well, and I’m going to take another stab at it in upcoming elections. First up is May’s Court of Common Pleas election. These judges are easy to predict: nobody knows anything about the candidates, so you can nail it with just structural factors. More on that later!