Who’s requesting mail-in ballots?

Covid-19 and PA’s first-ever mail-in ballot election.

With Coronavirus upon us, Philadelphians are re-evaluating how we might vote in the June 2nd Primary. Coincidentally, this is also the first year that Pennsylvanians can request mail-in ballots even if they won’t be absentee. This combination has led to a huge amount of requests, and big questions about what this might mean for turnout and the results.

The friendly folks at the Commissioners’ Office have sent me all of Philadelphia’s requests through Thursday May 7th. Let’s dig in.

An astounding 91,587 voters have requested ballots. Philadelphia has 913,000 active registered voters. In the 2019 Primary, 217,000 people voted. For a brand new program, that number is a huge percent.

Who is requesting ballots?

As with everything political in this city, that pattern wasn’t equal across the city.

View code
library(tidyr)
library(dplyr)
library(readr)
library(ggplot2)
library(sf)
library(magrittr)
library(lubridate)

setwd("C:/Users/Jonathan Tannen/Dropbox/sixty_six/posts/mail_in/")
source("../../admin_scripts/util.R")

fve <- readRDS("../../data/processed_data/voter_db/fve.Rds")
v2e <- readRDS("../../data/processed_data/voter_db/voters_to_elections.Rds")
elections <- readRDS("../../data/processed_data/voter_db/elections.Rds")

mail_in <- read_csv("../../data/voter_registration/mail_in/Absentee_Mail-in Ballot Listing (5-7-20).csv")

mail_in %<>% filter(!duplicated(`ID Number`))
mail_in %<>% left_join(fve, by=c("ID Number" = "voter_id"))
# table(mail_in$voter_status, useNA = "always")

divs <- st_read("../../data/gis/warddivs/201911/Political_Divisions.shp")divs %<>% mutate(warddiv = pretty_div(DIVISION_N))

mail_in_div <- mail_in %>% 
  group_by(warddiv) %>%
  summarise(n_mail = n())

fve_div <- fve %>%
  filter(voter_status == "A") %>%
  group_by(warddiv) %>%
  summarise(n_reg = n())

df_major <- readRDS("../../data/processed_data/df_major_20191203.Rds")
df_major %<>% mutate(warddiv = pretty_div(warddiv))

turnout <- df_major %>%
  filter(is_topline_office) %>%
  group_by(warddiv, year, election) %>%
  summarise(turnout = sum(votes))

turnout_wide <- turnout %>%
  unite("key", year, election) %>%
  spread(key, turnout)

divs %<>% left_join(mail_in_div) %>% mutate(n_mail = ifelse(is.na(n_mail), 0, n_mail))
divs %<>% left_join(fve_div)
divs %<>% left_join(turnout_wide)

These requestors are much more likely to have voted than a typical Philadelphian. A whopping 53% of them voted in the 2019 Primary (overall turnout among active voters was 24%). Some 52% of them voted in at least three of the last four elections, and 44% in six of the last eight (the number is presumably lower because of people who didn’t live in the state eight years ago).

View code
mail_in_v2e <- v2e %>%
  mutate(mail_in_reqd = voter_id %in% mail_in$`ID Number`) %>%
  inner_join(
    elections %>% filter(is_general | is_primary, year >= 2016) %>%
      select(election_date, year, fve_election_id)
  ) %>%
  group_by(voter_id, mail_in_reqd) %>%
  summarise(
    voted_2019 = !is.na(vote_method[election_date == "2019-05-21"]),
    voted_l4 = sum(!is.na(vote_method[year >= 2018])),
    voted_l8 = sum(!is.na(vote_method)),
    cnt = n()
  ) %>%
  group_by(mail_in_reqd) %>%
  summarise(
    voted_2019 = sum(voted_2019),
    voted_l4 = sum(voted_l4 >= 3),
    voted_l8 = sum(voted_l8 >= 6),
    cnt = n()
  )

mail_in_v2e <- bind_rows(
  mail_in_v2e %>% mutate(mail_in_reqd = ifelse(mail_in_reqd, "Requested Mail-In", "No Request")),
  mail_in_v2e %>% mutate(mail_in_reqd = "All") %>% group_by(mail_in_reqd) %>% summarise_all(sum)
)

ggplot(
  mail_in_v2e %>% filter(mail_in_reqd %in% c("Requested Mail-In", "All")) %>%
    mutate(mail_in_reqd = forcats::fct_inorder(mail_in_reqd)) %>%
    gather("key", "value", voted_2019, voted_l4, voted_l8) %>%
    mutate(
      key = case_when(
        key == "voted_2019" ~ "Voted 2019 Primary",
        key == "voted_l4" ~ "Voted in 3 of last 4",
        key == "voted_l8" ~ "Voted in 6 of last 8"
      )
    ),
  aes(x=key, y=100*value / cnt, fill=mail_in_reqd)
) + 
  geom_bar(stat="identity", position="dodge") +
  theme_sixtysix() +
  scale_fill_manual(values = c(strong_blue, grey(0.70))) +
  labs(
    x = NULL,
    y = "Percent",
    fill = NULL,
    title = "Requestors are much more likely to have voted.",
    subtitle = "Percents among voters who requested mail-in ballots"
  )

By Age

The mail-in requests are coming disproportionately from younger voters. The cohort aged 30-40 made up only 15% of votes in the 2019 Primary, but over 23% of the requests so far.

View code
age_cuts <- c(18, seq(10, 120, 10))
fve_age <- fve %>%
  filter(voter_status == "A") %>%
  mutate(
    age = (lubridate::ymd("2020-06-02") - lubridate::ymd(dob)) / dyears(1),
    age_cat = cut(age, age_cuts)
  ) %>%
  group_by(age_cat) %>%
  summarise(n = n()) %>%
  ungroup() %>%
  mutate(prop = n / sum(n))

fve_age_2019 <- fve %>%
  inner_join(
    v2e %>% 
      inner_join(elections %>% filter(election_date == "2019-05-21")) %>%
      filter(!is.na(vote_method))
  ) %>%
  mutate(
    age = (lubridate::ymd("2020-06-02") - lubridate::ymd(dob)) / dyears(1),
    age_cat = cut(age, age_cuts)
  ) %>%
  group_by(age_cat) %>%
  summarise(n = n()) %>%
  ungroup() %>%
  mutate(prop = n / sum(n))

mail_in_age <- mail_in %>%
  mutate(
    age = (lubridate::ymd("2020-06-02") - lubridate::ymd(dob)) / dyears(1),
    age_cat = cut(age, age_cuts)
  ) %>%
  group_by(age_cat) %>%
  summarise(n = n()) %>%
  ungroup() %>%
  mutate(prop = n / sum(n))

age_cats <- bind_rows(
  mail_in_age %>% mutate(source = "Mail-In Requests"),
  fve_age_2019 %>% mutate(source = "2019 Primary Voters"),
  fve_age %>% mutate(source = "Active Registered Voters")
) %>%
  mutate(source = forcats::fct_inorder(source))


ggplot(
  age_cats %>% 
    filter(!is.na(age_cat), !age_cat %in% c("(0,10]","(100,110]","(110,120]")) %>%
    mutate(
      age_cat = droplevels(age_cat)
    ),
  aes(x = age_cat, y = 100 * prop, fill=source)
) + geom_bar(
  stat="identity",
  position="dodge"
) +
  theme_sixtysix() +
  labs(
    y="Percent of Total",
    x = "Age",
    fill = ""
  ) +
  scale_fill_manual(values = c(strong_blue, grey(0.70), grey(0.9))) +
  ggtitle("Voters under 40 are disproportionately requesting ballots")

By Neighborhood

Geographically, the patterns are even starker. More than 13,000 of the requests came from the two Center City wards–5 and 8–alone.

[See the Appendix for Division-level maps.]

View code
library(leaflet)

make_leaflet <- function(
  sf, 
  color_col, 
  title,
  is_percent=FALSE,
  pal="viridis"
){
  zoom <- 11
  
  min_value <- min(sf[[color_col]], na.rm=TRUE)
  max_value <- max(sf[[color_col]], na.rm=TRUE)
  
  sigfig <- round(log10(max_value-min_value)) - 1
  step_size <- 2 * 10^(sigfig)
  
  min_value <- step_size * (min_value %/% step_size)
  max_value <- step_size * (max_value %/% step_size + 1)
  
  color_numeric <- colorNumeric(
    pal, 
    domain=c(min_value, max_value)
  )
  
  legend_values <- seq(min_value, max_value, step_size)
  legend_colors <- color_numeric(legend_values)
  if(is_percent){
    legend_labels <- sprintf("%s%%", legend_values)
  } else{
    legend_labels <- scales::comma(legend_values)
  }
  
  bbox <- st_bbox(sf)
  
  leaflet(
    options=leafletOptions(
      minZoom=zoom, 
      # maxZoom=zoom,
      zoomControl=TRUE
      # dragging=FALSE
    )
  )%>% 
    addProviderTiles(providers$Stamen.TonerLite) %>%
    addPolygons(
      data=sf$geometry,
      weight=0, 
      color="white", 
      opacity=1, 
      fillOpacity = 0.8, 
      smoothFactor = 0,
      fillColor = color_numeric(sf[[color_col]]),
      popup=sf$popup
    ) %>%
    setMaxBounds(
      lng1=bbox["xmin"],
      lng2=bbox["xmax"],
      lat1=bbox["ymin"],
      lat2=bbox["ymax"]
    ) %>%
    addControl(title, position="topright", layerId="map_title") %>% 
    addLegend(
      layerId="geom_legend",
      position="bottomright",
      colors=legend_colors,
      labels=legend_labels
    )
}

wards <- divs %>% 
  mutate(ward = substr(warddiv, 1, 2)) %>%
  group_by(ward) %>%
  summarise_at(
    .funs=list(sum), 
    .vars=vars(n_mail, n_reg, `2002_general`:`2019_primary`)
  )

wards %<>% 
  mutate(
    popup = sprintf(
      paste(
        c(
          "<b>Ward %s</b>",
          "Active Registered Voters: %s",
          "Mail-In Requests: %s",
          "Votes in the 2019 Primary: %s",
          "Votes in the 2016 Primary: %s"
        ),
        collapse = "<br>"
      ),
      ward,
      scales::comma(n_reg),
      scales::comma(n_mail),
      scales::comma(round(`2019_primary`)),
      scales::comma(round(`2016_primary`))
    )
  )

wards$color_mail <- wards$n_mail
wards_mail <- make_leaflet(
  wards,
  "color_mail",
  "Number of Mail-In Requests"
)
library(widgetframe)
knitr::opts_chunk$set(widgetframe_self_contained = TRUE) 
frameWidget(wards_mail)

As a percent of all voters, Center City, the wealthy ring around it, and Mount Airy/Chestnut Hill stand out. These Wards make up what I’ve dubbed the Wealthy Progressives.

View code
wards$color_mail_vs_reg <- 100 * wards$n_mail / wards$n_reg
mail_reg <- make_leaflet(
  wards,
  "color_mail_vs_reg",
  "Mail-In Requests as % of Active Registered Voters",
  is_percent=TRUE
)
knitr::opts_chunk$set(widgetframe_self_contained = TRUE)
frameWidget(mail_reg)
View code
wards$color_mail_vs_2019 <- 100 * wards$n_mail / wards$`2019_primary`
mail_2019 <- make_leaflet(
  wards,
  "color_mail_vs_2019",
  "Mail-in requests as % of 2019 Primary voters",
  is_percent=TRUE
)
knitr::opts_chunk$set(widgetframe_self_contained = TRUE)
frameWidget(mail_2019)

In fact, aggregating by those Division Cohorts paints a stark picture.

View code
div_cats <- readRDS("../../data/processed_data/div_cats_20200411.RDS")
divs %<>% left_join(div_cats %>% select(warddiv, cat))
cat_colors <- c(
  "Black Voters" = light_blue, 
  "Wealthy Progressives" = light_red, 
  "White Moderates" = light_orange, 
  "Hispanic North Philly" = light_green
)

div_cats <- divs %>% 
  as.data.frame() %>%
  select(-geometry) %>%
  group_by(cat) %>%
  summarise_at(
    .funs=list(sum), 
    .vars=vars(n_mail, n_reg, `2002_general`:`2019_primary`)
  ) %>%
  ungroup() 

ggplot(
  div_cats,
  aes(x = cat, y = n_reg, fill=cat)
) + geom_bar(
  stat="identity",
  position="dodge"
) +
  theme_sixtysix() + # %+replace% theme(axis.text.x = element_blank())+
  scale_y_continuous(labels=scales::comma) +
  labs(
    y=NULL,
    x = NULL,
    fill = "",
    title="Black Voter Divisions represent nearly half of Active Voters",
    subtitle="Active Registered Voters"
  ) +
  scale_fill_manual(values = cat_colors, guide=FALSE)

View code
ggplot(
  div_cats %>%
    mutate(prop_mail = n_mail / n_reg),
  aes(x = cat, y = 100 * prop_mail, fill=cat)
) + geom_bar(
  stat="identity",
  position="dodge"
) +
  theme_sixtysix() %+replace% theme(title = element_text(size = 10)) +
  labs(
    y=NULL,
    x = NULL,
    fill = "",
    title="But Wealthy Progressive Divisions have the most mail-in requests",
    subtitle="Mail-in requests as a percent of active registered voters"
  ) +
  scale_fill_manual(values = cat_colors, guide=FALSE)

View code
ggplot(
  div_cats %>%
    mutate(prop_2019 = n_mail / `2019_primary`),
  aes(x = cat, y = 100 * prop_2019, fill=cat)
) + geom_bar(
  stat="identity",
  position="dodge"
) +
  theme_sixtysix() + # %+replace% theme(axis.text.x = element_blank())+
  labs(
    y=NULL,
    x = NULL,
    fill = "",
    title="Mail-in requests as percent of 2019 Primary turnout",
    subtitle="Requests from Wealthy Progressive Divisions are nearly 80% of their 2019 turnout."
  ) +
  scale_fill_manual(values = cat_colors, guide=FALSE)

What this means for the election

Since we’ve never had mail-in voting before and it’s been over 100 years since the last pandemic, it’s impossible to know what this means for total turnout in June.

Two things are possible:

  1. These mail-in percents represent the actual turnout. Voters decide not to vote in person, and Wealthy Progressives vastly over-represent the electorate.

  2. Voters that aren’t requesting mail-in ballots are just planning to vote in person, and the gap totally closes on Election Day. The city-wide turnout looks normal.

Probably, the outcome will be somewhere in between. What does that mean for PA-1 and other local races? Coming soon!

Appendix: Division Maps

View code
divs %<>% 
  mutate(
    popup = sprintf(
      paste(
        c(
          "<b>Division %s</b>",
          "Active registered voters: %s",
          "Mail-in requests: %s",
          "Votes in the 2019 Primary: %s",
          "Votes in the 2016 Primary: %s"
        ),
        collapse = "<br>"
      ),
      warddiv,
      scales::comma(n_reg),
      scales::comma(n_mail),
      scales::comma(round(`2019_primary`)),
      scales::comma(round(`2016_primary`))
    )
  )


divs$color_mail <- divs$n_mail
div_color_mail <- make_leaflet(
  divs,
  "color_mail",
  "Number of mail-in requests"
)
knitr::opts_chunk$set(widgetframe_self_contained = TRUE)
frameWidget(div_color_mail)
View code
divs$color_mail_vs_reg <- 100 * divs$n_mail / divs$n_reg
l0 <- make_leaflet(
  divs,
  "color_mail_vs_reg",
  "Mail-in requests as % of active registered voters",
  is_percent=TRUE
)
knitr::opts_chunk$set(widgetframe_self_contained = TRUE)
frameWidget(l0)
View code
divs$color_mail_vs_2019 <- 100 * divs$n_mail / divs$`2019_primary`
l1 <- make_leaflet(
  divs,
  "color_mail_vs_2019",
  "Mail-in requests as % of 2019 Primary voters",
  is_percent=TRUE
)
knitr::opts_chunk$set(widgetframe_self_contained = TRUE)
frameWidget(l1)
View code
divs$color_mail_vs_2016 <- 100 * divs$n_mail / divs$`2016_primary`
l2 <- make_leaflet(
  divs,
  "color_mail_vs_2016",
  "Mail-in requests as % of 2016 Primary voters",
  is_percent=TRUE
)
knitr::opts_chunk$set(widgetframe_self_contained = TRUE)
frameWidget(l2)

The race for PA Senate 1

Party control of the Pennsylvania Senate will be contested this November, especially important in 2020 as the winners will draw the post-Census boundaries.

But that’s in November. Before that, in April’s Primary, we are going to have a rare contested senate race right here in our city. Incumbent Senator Larry Farnese is being challenged by fellow (former) Ward leader Nikil Saval (Saval stepped down from the second to be able to run for this seat). Saval, a co-founder of Reclaim Philadelphia, is presumably trying to win by running to Farnese’s left, a heat check of the recent successes of Larry Krasner and Kendra Brooks.

What are the dynamics at play? Let’s break down Philadelphia’s State Senate PA-1.

View code
library(tidyverse)
library(sf)
library(ggmap)
library(magrittr)
source("../../admin_scripts/util.R")

df_major <- readRDS("../../data/processed_data/df_major_2019-12-03.Rds")
df_major %<>% mutate(warddiv = pretty_div(warddiv))

divs <- st_read("../../data/gis/warddivs/201911/Political_Divisions.shp") %>%
  mutate(
    warddiv = pretty_div(DIVISION_N),
    area = as.numeric(st_area(geometry)) / (1609^2)  # mi^2
  )wards <- st_read("../../data/gis/warddivs/201911/Political_Wards.shp") %>%
  mutate(ward = sprintf("%02d", asnum(WARD_NUM)))sts <- st_read("../../data/gis/state_senate/tigris_upper_house_2015.shp") %>%
  st_transform(4326)sts_1 <- sts %>% filter(SLDUST == "001")
expand <- function(x, factor){
  mean_x <- mean(x)
  return(mean_x + factor * (x - mean_x))
}

expand_bb <- function(bb, factor){
  bb[c(1,3)] <- expand(bb[c(1,3)], factor)
  bb[c(2,4)] <- expand(bb[c(2,4)], factor)
  bb
}

st_bbox <- function(obj, expand, ...){
  bb <- sf::st_bbox(obj, ...)
  bb %<>% expand_bb(expand)
  names(bb) <- c("left", "bottom", "right", "top")
  bb
}

bb <- st_bbox(sts_1, expand=1.2)

basemap <- ggmap::get_stamenmap(
  bbox=bb, 
  zoom = 12, 
  maptype="toner-lite"
)

get_labpt_df <- function(sf){
  df <- st_centroid(sf) %>%
    mutate(
      x=sapply(geometry, "[", 1),
      y=sapply(geometry, "[", 2)
    ) %>%
    as.data.frame() %>%
    select(-geometry)
  return(df)
}

district_map <- ggmap(
  basemap, 
  extent="device",
  base_layer = ggplot(sts),
  maprange = FALSE
) +
  theme_map_sixtysix() %+replace%
  theme(legend.position = c(0.8, 0.2))

The geography of PA-1

[Reminder: You can make all of these maps and more on the Ward Portal!]

PA-1 covers most of South Philly (minus Point Breeze), Center City, and reaches North into Brewerytown, Kensington, and Port Richmond. Both of the candidates were until recently Ward Leaders in the district: Farnese still is in West Center City’s 8th Ward, and Saval was in Queen Village’s 2nd until resigning to run for this seat.

View code
phila_whole <- st_union(wards)
bb_phila <- st_bbox(phila_whole, 1.2)

ggplot(phila_whole) +
  geom_sf(fill=strong_green, color=NA, alpha=0.4) +
  geom_sf(
    data=st_crop(sts, bb_phila),
    fill=strong_green, color = "white", size = 1, alpha=0.3
  ) +
  geom_sf(
    data=sts %>% filter(SLDUST == "001"),
    fill=strong_green, color = "white", size = 1
  ) +
  geom_sf_text(
    data = st_crop(sts, bb_phila),
    aes(label=asnum(SLDUST))
  ) +
  ggtitle("State Senate Districts") +
  theme_map_sixtysix() %+replace%
  theme(panel.grid.major = element_blank())

View code
divs$sts <- divs %>%
  st_centroid() %>%
  st_covered_by(sts) %>%
  (function(x){sts$SLDUST[unlist(x)]})

district <- "001"
district_map +
  geom_sf(
    data = wards %>% filter(ward %in% c("02", "08")),
    fill="black",
    alpha = 0.5,
    color=NA
  ) +
  geom_sf(
    aes(alpha = (SLDUST == district)),
    fill="black",
    color = "grey50",
    size=2
  ) +
  geom_sf_text(
    data=wards %>% filter(ward %in% c("02", "08")),
    aes(label = sprintf("Ward %i", asnum(ward))),
    color="white"
  ) +
  scale_alpha_manual(values = c(`TRUE` = 0.2, `FALSE` = 0), guide = FALSE) +
  ggtitle(sprintf("State Senate District %s", asnum(district)))

The vast majority of the district’s votes come from Center City, South Philly, and Fairmount, which represented the highest-turnout neighborhoods in the entire city in November’s General.

View code
turnout <- df_major %>%
  filter(is_topline_office) %>%
  group_by(warddiv, year, election) %>%
  summarise(turnout = sum(votes))

maxval <- 20e3
labelmax <- function(x, max) {
  ifelse(
    x == max, 
    paste0(scales::comma(x), "+"), 
    scales::comma(x)
  )
}

district_map +
  geom_sf(
    data=divs %>% 
      left_join(
        turnout %>% filter(year == 2019, election == "general")
      ),
    aes(fill = pmin(turnout/area, maxval)),
    color=NA,
    alpha=0.8
  ) +
  geom_sf(
    data=st_crop(sts, bb_phila), 
    fill=NA, 
    color="white",
    size=2
  ) +
  scale_fill_viridis_c(
    "Votes/\n mile^2", 
    labels=function(x) labelmax(x, maxval)
  ) +
  ggtitle("Votes in the 2019 General")

The district’s residents are mostly White: the ACS 2017 estimates have the district 61% White, 13% Black, 12% Asian, and 11% Hispanic. The boundaries manage to almost perfectly gerrymander out the predominantly-Black neighborhoods of Point Breeze, carving out below Washington, above Passyunk, and West of Broad; and North Philly, running along the racial boundary of Girard.

View code
census <- read_csv("../../data/census/acs_2013_2017_phila_bg_race_income.csv")

maxpop_name <- function(wht, blk, hisp, asn){
  maxes <- apply(cbind(wht, blk, hisp, asn), 1, which.max)
  maxes <- sapply(maxes, function(x) if(length(x) == 1) x else NA)
  c("white", "black", "hispanic", "asian")[maxes]
}

maxpop_val <- function(wht, blk, hisp, asn){
  apply(cbind(wht, blk, hisp, asn), 1, max)
}

census <- census %>%
  mutate(tract = substr(as.character(Geo_FIPS), 1, 11)) %>%
  group_by(tract) %>%
  summarise_at(
    vars(starts_with("pop_")), 
    funs(sum)
  ) %>%
  mutate(pop_total = pop_hisp + pop_nh) %>%
  mutate(
    pwht = pop_nh_white / pop_total,
    pblk = pop_nh_black / pop_total,
    phisp = pop_hisp / pop_total,
    pasn = pop_nh_asian / pop_total
  ) %>%
  mutate(
    maxpop = maxpop_name(pwht, pblk, phisp, pasn),
    maxpop_val = maxpop_val(pwht, pblk, phisp, pasn)
  )

tracts <- st_read("../../data/gis/census/cb_2015_42_tract_500k.shp") %>%
  filter(COUNTYFP == "101") %>%
  st_transform(4326) %>%
  left_join(census, by=c("GEOID" = "tract"))district_map +
  geom_sf(
    data=tracts %>% 
      filter(!is.na(maxpop_val)) %>%
      mutate(maxpop = format_name(maxpop)),
    aes(
      fill=maxpop,
      alpha = 100 * maxpop_val
    ),
    color=NA
  ) +
  geom_sf(
    data=st_crop(sts, bb_phila), 
    fill=NA, 
    color="white",
    size=2
  ) +
  scale_alpha_continuous(
    range=c(0, 1),
    guide=FALSE
  ) +
  scale_fill_manual(
    "Predominant\n Race/Ethn.",
    values = c(
      Black = light_blue,
      White = light_red,
      Hispanic = light_green,
      Asian = strong_orange
    )
  ) +
  ggtitle(
    "Predominant Race and Ethnicity", 
    "2017 ACS 5-year estimates. Opacity = Percent."
  )

PA-1’s recent elections

Farnese hasn’t been plausibly challenged since 2008, when he won the seat vacated by indicted Senator Vince Fumo.

Ironically, Farnese won that race on the votes of Center City’s wealthy progressives, edging out Johnny Doc 43% to 38% (candidate Anne Dicker won the last 19%).

View code
sts_2008 <- df_major %>% 
  filter(
    election == "primary", party == "DEMOCRATIC", 
    year == 2008, office == "SENATOR IN THE GENERAL ASSEMBLY",
    district == 1
  ) %>%
  group_by(ward, candidate) %>%
  summarise(votes = sum(votes)) %>%
  group_by(ward) %>%
  mutate(
    total_votes = sum(votes), 
    pvote=100 * votes/total_votes,
    candidate=format_name(candidate),
    is_winner = rank(desc(pvote)) == 1,
    margin = pvote - pvote[rank(desc(pvote)) == 2]
  )

district_map +
  geom_sf(
    data = st_intersection(wards, sts_1) %>% 
      inner_join(sts_2008 %>% filter(is_winner)),
    aes(fill=candidate, alpha=margin),
    color=NA,
  ) +
  scale_fill_manual(
    "Winner",
    values=c(light_green, light_purple)
  ) +
  scale_alpha_continuous("Margin (%)") +
  theme(legend.box = "vertical") +
  ggtitle("The 2008 PA-1 Democratic Primary")

Those progressive districts have only become more influential. In more recent elections, the district has run significantly more progressive than the city as a whole. Bernie Sanders performed 7 percentage points better than in the city as a whole (though still lost the district to Clinton), Krasner 6 pp better, and Brooks 1.6 pp.

View code
races <- tribble(
  ~year, ~election, ~office, ~short_office, ~candidates,
  "2016", "primary", "PRESIDENT OF THE UNITED STATES", "President", list("HILLARY CLINTON", "BERNIE SANDERS"),
  "2017", "primary", "DISTRICT ATTORNEY", "District Attorney",list("LAWRENCE S KRASNER", "JOE KHAN"),
  "2019", "general", "COUNCIL AT LARGE", "City Council", list("KENDRA BROOKS", "DAVID OH")
)

div_cats <- readRDS("../../data/processed_data/div_cats_2019-12-03.RDS")
divs %<>% left_join(div_cats)

df_cand <- df_major %>% 
  inner_join(races) %>%
  filter(election == "general" | party == "DEMOCRATIC") %>%
  left_join(divs %>% as.data.frame() %>% select(warddiv, sts, cat)) %>%
  group_by(warddiv, year, office) %>%
  mutate(total_votes = sum(votes), pvote=100 * votes/total_votes) %>%
  filter(mapply(`%in%`, candidate, candidates))

df0 <- df_cand %>%
  group_by(candidate, year, election, short_office) %>%
  summarise(
    pvote_city = weighted.mean(pvote, total_votes),
    pvote_sts01 = weighted.mean(pvote, ifelse(sts == "001", total_votes, 0))
  ) %>%
  ungroup()

df0 %>%
  mutate(candidate = format_name(candidate)) %>% 
  arrange(year) %>%
  mutate(election = paste(year, format_name(election))) %>%
  select(election, candidate, pvote_city, pvote_sts01) %>%
  knitr::kable(digits=1, col.names=c("Election", "Candidate", "City %", "PA-1 %"))
Election Candidate City % PA-1 %
2016 Primary Bernie Sanders 37.0 44.2
2016 Primary Hillary Clinton 62.6 55.3
2017 Primary Joe Khan 20.3 24.7
2017 Primary Lawrence S Krasner 38.2 44.6
2019 General David Oh 4.0 4.5
2019 General Kendra Brooks 4.5 7.1

Revisit Philadelphia’s Voting Blocs. The district is mage up mostly of Wealthy Progressive and White Moderate divisions.

View code
bloc_colors <- c(
  "Black Voters" = light_blue,
  "Wealthy Progressives" = light_red,
  "White Moderates" = light_orange,
  "Hispanic North Philly" = light_green
)
district_map +
  geom_sf(
    data=divs,
    aes(fill = cat),
    color=NA,
    alpha=0.8
  ) +
  geom_sf(
    data=st_crop(sts, bb_phila), 
    fill=NA, 
    color="white"
  ) +
  scale_fill_manual(
    "Voting Bloc",
    values=bloc_colors
  ) +
  ggtitle("Voting Blocs of PA-1")

(Notice that my blocs based on voting patterns don’t perfectly match the Census populations; my Black Voter bloc extends above Washington Ave and the Census population doesn’t. That’s because the blocs are based on all elections back to 2002, while the Census data is more recent. That’s an emergent boundary that’s moving South as Point Breeze gentrifies.)

View code
turnout_nums <- turnout %>% 
  filter(year %in% c(2016, 2017, 2019)) %>% 
  ungroup() %>% 
  inner_join(
    divs %>%
      filter(sts == "001") 
  ) %>% 
  group_by(year,election,cat) %>%
  summarise(turnout = sum(turnout)) %>% 
  group_by(year, election) %>%
  mutate(pturnout = turnout / sum(turnout)) %>% 
  as.data.frame()

The Wealthy Progressives cast 67% of the votes in the 2016 Primary, 73% in the 2017 Primary, and 68% in the 2019 General. Their votes drive the overall results of the district.

View code
cand_nums <- df_cand %>%
  group_by(year, election, short_office) %>%
  mutate(candnum = as.numeric(factor(candidate))) %>%
  select(year, election, short_office, candidate, candnum) %>%
  unique()

ggplot(
  df_cand %>% 
    ungroup() %>%
    filter(sts == "001") %>%
    group_by(candidate, year, election, short_office, cat) %>%
    summarise(votes = sum(votes)) %>%
    left_join(cand_nums)
) +
  geom_bar(
    aes(x = cat, y = votes, fill=cat),
    stat="identity"
  ) +
  scale_fill_manual(NULL, values=bloc_colors) +
  facet_grid(paste(year, format_name(election)) ~ candnum) +
  geom_label(
    data=cand_nums, 
    aes(label=format_name(candidate)), 
    x = 4.5, y=20e3,
    hjust=1,
    vjust=0
  ) +
  scale_y_continuous("Votes", labels = scales::comma) +
  theme_sixtysix() %+replace% 
  theme(
    strip.text.x = element_blank(),
    axis.text.x = element_blank()
  ) +
  labs(
    title="Recent results in PA-1",
    x=NULL
  )

The recent left-wing support comes slightly more from the gentrifying ring around Center City than from Center City itself, while deeper South Philly and Northern Kensington and Port Richmond remain more conservative.

View code
district_map +
  geom_sf(
    data=divs %>% 
      left_join(df_cand) %>%
      # left_join(office_max) %>%
      left_join(cand_nums) %>%
      group_by(year, election, short_office, warddiv) %>%
      mutate(pvote_head2head = 100 * pvote / sum(pvote)),
    aes(fill=pvote_head2head),
    color=NA,
    alpha=0.8
  ) +
  geom_sf(
    data=st_crop(sts, bb_phila), 
    fill=NA, 
    color="white"
  ) +
  geom_label(
    data = cand_nums,
    aes(label=format_name(candidate)),
    hjust=1.1,
    vjust=1,
    x = bb["right"],
    y = bb["top"]
  ) +
  facet_grid(year + election ~ candnum) +
  theme(
    strip.text.x = element_blank(),
    legend.position = "right"
  ) +
  scale_fill_viridis_c("Head-to-head\n % of Vote") +
  ggtitle("Recent Elections in PA-1", "Head-to-head Vote Percentages")

Farnese vs Saval in April

This race will test a dynamic that the city hasn’t really tested yet in the post-2016 world: a candidate who historically has performed well among Wealthy Progressive divisions being challenged from the left.

Because of that, it’s hard to use past elections to predict this one. There aren’t easily predictable fault-lines, the way there were in 2019’s Council races for District 2 and District 3.

But let’s assume Farnese does better than Saval in the White Moderate and Black Voter divisions, maybe by a gap of 70-30. And assume that the Wealthy Progressive divisions again make up 70% of the vote. Saval would need to win the Wealthy Progressives by 59 – 41 to win the district. That’s a steep challenge, but we’ve seen steep challenges overcome in recent months.

View code
prop_wealthy_progressive <- 0.7
line_slope <- -prop_wealthy_progressive / (1-prop_wealthy_progressive)
y_intercept <- 0.5 / (1-prop_wealthy_progressive)

cand_scatter <- df_cand %>% 
  ungroup() %>%
  filter(sts == "001") %>%
  mutate(
    is_wealthy = ifelse(
      cat == "Wealthy Progressives", "wealthyprogs", "other"
    )
  ) %>%
  group_by(candidate, year, election, short_office, is_wealthy) %>%
  summarise(votes = sum(votes)) %>%
  group_by(year, election, short_office, is_wealthy) %>%
  mutate(pvote_h2h = votes / sum(votes)) %>%
  ungroup() %>%
  select(year, election, short_office, candidate, is_wealthy, pvote_h2h) %>%
  spread(key=is_wealthy, value=pvote_h2h)
    

ggplot(
  cand_scatter,
  aes(x = 100*wealthyprogs, y=100*other)
) +
  geom_label(aes(label = format_name(candidate))) +
  geom_abline(
    slope = line_slope, 
    intercept = 100 * y_intercept,
    linetype="dashed"
  ) +
  coord_fixed() + 
  xlim(c(0,100)) +
  ylim(c(0, 100)) +
  annotate(
    "text",
    y=0,
    # x=-100*y_intercept/line_slope,
    x=0,
    label=sprintf("Assuming Wealthy Progressives have %s%% of turnout.\nCandidates are at their head-to-head percentages.", round(100*prop_wealthy_progressive)),
    hjust=0,
    vjust=-0.1,
    color="grey50"
  ) +
  labs(
    title="Win Line for PA-1",
    subtitle="Candidates to the top-right win.",
    x="Percent in Wealthy Progressive divisions",
    y="Percent in other divisions"
  ) +
  theme_sixtysix()

Need more? Create your own analyses on the Ward Portal!