Covid-19 and PA’s first-ever mail-in ballot election.

With Coronavirus upon us, Philadelphians are re-evaluating how we might vote in the June 2nd Primary. Coincidentally, this is also the first year that Pennsylvanians can request mail-in ballots even if they won’t be absentee. This combination has led to a huge amount of requests, and big questions about what this might mean for turnout and the results.

The friendly folks at the Commissioners’ Office have sent me all of Philadelphia’s requests through Thursday May 7th. Let’s dig in.

An astounding 91,587 voters have requested ballots. Philadelphia has 913,000 active registered voters. In the 2019 Primary, 217,000 people voted. For a brand new program, that number is a huge percent.

Who is requesting ballots?

As with everything political in this city, that pattern wasn’t equal across the city.

View code

library(tidyr)
library(dplyr)
library(readr)
library(ggplot2)
library(sf)
library(magrittr)
library(lubridate)

setwd("C:/Users/Jonathan Tannen/Dropbox/sixty_six/posts/mail_in/")
source("../../admin_scripts/util.R")

fve <- readRDS("../../data/processed_data/voter_db/fve.Rds")
v2e <- readRDS("../../data/processed_data/voter_db/voters_to_elections.Rds")
elections <- readRDS("../../data/processed_data/voter_db/elections.Rds")

mail_in <- read_csv("../../data/voter_registration/mail_in/Absentee_Mail-in Ballot Listing (5-7-20).csv")

mail_in %<>% filter(!duplicated(`ID Number`))
mail_in %<>% left_join(fve, by=c("ID Number" = "voter_id"))
# table(mail_in$voter_status, useNA = "always")

divs <- st_read("../../data/gis/warddivs/201911/Political_Divisions.shp")divs %<>% mutate(warddiv = pretty_div(DIVISION_N))

mail_in_div <- mail_in %>% 
  group_by(warddiv) %>%
  summarise(n_mail = n())

fve_div <- fve %>%
  filter(voter_status == "A") %>%
  group_by(warddiv) %>%
  summarise(n_reg = n())

df_major <- readRDS("../../data/processed_data/df_major_20191203.Rds")
df_major %<>% mutate(warddiv = pretty_div(warddiv))

turnout <- df_major %>%
  filter(is_topline_office) %>%
  group_by(warddiv, year, election) %>%
  summarise(turnout = sum(votes))

turnout_wide <- turnout %>%
  unite("key", year, election) %>%
  spread(key, turnout)

divs %<>% left_join(mail_in_div) %>% mutate(n_mail = ifelse(is.na(n_mail), 0, n_mail))
divs %<>% left_join(fve_div)
divs %<>% left_join(turnout_wide)

These requestors are much more likely to have voted than a typical Philadelphian. A whopping 53% of them voted in the 2019 Primary (overall turnout among active voters was 24%). Some 52% of them voted in at least three of the last four elections, and 44% in six of the last eight (the number is presumably lower because of people who didn’t live in the state eight years ago).

View code

mail_in_v2e <- v2e %>%
  mutate(mail_in_reqd = voter_id %in% mail_in$`ID Number`) %>%
  inner_join(
    elections %>% filter(is_general | is_primary, year >= 2016) %>%
      select(election_date, year, fve_election_id)
  ) %>%
  group_by(voter_id, mail_in_reqd) %>%
  summarise(
    voted_2019 = !is.na(vote_method[election_date == "2019-05-21"]),
    voted_l4 = sum(!is.na(vote_method[year >= 2018])),
    voted_l8 = sum(!is.na(vote_method)),
    cnt = n()
  ) %>%
  group_by(mail_in_reqd) %>%
  summarise(
    voted_2019 = sum(voted_2019),
    voted_l4 = sum(voted_l4 >= 3),
    voted_l8 = sum(voted_l8 >= 6),
    cnt = n()
  )

mail_in_v2e <- bind_rows(
  mail_in_v2e %>% mutate(mail_in_reqd = ifelse(mail_in_reqd, "Requested Mail-In", "No Request")),
  mail_in_v2e %>% mutate(mail_in_reqd = "All") %>% group_by(mail_in_reqd) %>% summarise_all(sum)
)

ggplot(
  mail_in_v2e %>% filter(mail_in_reqd %in% c("Requested Mail-In", "All")) %>%
    mutate(mail_in_reqd = forcats::fct_inorder(mail_in_reqd)) %>%
    gather("key", "value", voted_2019, voted_l4, voted_l8) %>%
    mutate(
      key = case_when(
        key == "voted_2019" ~ "Voted 2019 Primary",
        key == "voted_l4" ~ "Voted in 3 of last 4",
        key == "voted_l8" ~ "Voted in 6 of last 8"
      )
    ),
  aes(x=key, y=100*value / cnt, fill=mail_in_reqd)
) + 
  geom_bar(stat="identity", position="dodge") +
  theme_sixtysix() +
  scale_fill_manual(values = c(strong_blue, grey(0.70))) +
  labs(
    x = NULL,
    y = "Percent",
    fill = NULL,
    title = "Requestors are much more likely to have voted.",
    subtitle = "Percents among voters who requested mail-in ballots"
  )

By Age

The mail-in requests are coming disproportionately from younger voters. The cohort aged 30-40 made up only 15% of votes in the 2019 Primary, but over 23% of the requests so far.

View code

age_cuts <- c(18, seq(10, 120, 10))
fve_age <- fve %>%
  filter(voter_status == "A") %>%
  mutate(
    age = (lubridate::ymd("2020-06-02") - lubridate::ymd(dob)) / dyears(1),
    age_cat = cut(age, age_cuts)
  ) %>%
  group_by(age_cat) %>%
  summarise(n = n()) %>%
  ungroup() %>%
  mutate(prop = n / sum(n))

fve_age_2019 <- fve %>%
  inner_join(
    v2e %>% 
      inner_join(elections %>% filter(election_date == "2019-05-21")) %>%
      filter(!is.na(vote_method))
  ) %>%
  mutate(
    age = (lubridate::ymd("2020-06-02") - lubridate::ymd(dob)) / dyears(1),
    age_cat = cut(age, age_cuts)
  ) %>%
  group_by(age_cat) %>%
  summarise(n = n()) %>%
  ungroup() %>%
  mutate(prop = n / sum(n))

mail_in_age <- mail_in %>%
  mutate(
    age = (lubridate::ymd("2020-06-02") - lubridate::ymd(dob)) / dyears(1),
    age_cat = cut(age, age_cuts)
  ) %>%
  group_by(age_cat) %>%
  summarise(n = n()) %>%
  ungroup() %>%
  mutate(prop = n / sum(n))

age_cats <- bind_rows(
  mail_in_age %>% mutate(source = "Mail-In Requests"),
  fve_age_2019 %>% mutate(source = "2019 Primary Voters"),
  fve_age %>% mutate(source = "Active Registered Voters")
) %>%
  mutate(source = forcats::fct_inorder(source))


ggplot(
  age_cats %>% 
    filter(!is.na(age_cat), !age_cat %in% c("(0,10]","(100,110]","(110,120]")) %>%
    mutate(
      age_cat = droplevels(age_cat)
    ),
  aes(x = age_cat, y = 100 * prop, fill=source)
) + geom_bar(
  stat="identity",
  position="dodge"
) +
  theme_sixtysix() +
  labs(
    y="Percent of Total",
    x = "Age",
    fill = ""
  ) +
  scale_fill_manual(values = c(strong_blue, grey(0.70), grey(0.9))) +
  ggtitle("Voters under 40 are disproportionately requesting ballots")

By Neighborhood

Geographically, the patterns are even starker. More than 13,000 of the requests came from the two Center City wards–5 and 8–alone.

[See the Appendix for Division-level maps.]

View code

library(leaflet)

make_leaflet <- function(
  sf, 
  color_col, 
  title,
  is_percent=FALSE,
  pal="viridis"
){
  zoom <- 11
  
  min_value <- min(sf[[color_col]], na.rm=TRUE)
  max_value <- max(sf[[color_col]], na.rm=TRUE)
  
  sigfig <- round(log10(max_value-min_value)) - 1
  step_size <- 2 * 10^(sigfig)
  
  min_value <- step_size * (min_value %/% step_size)
  max_value <- step_size * (max_value %/% step_size + 1)
  
  color_numeric <- colorNumeric(
    pal, 
    domain=c(min_value, max_value)
  )
  
  legend_values <- seq(min_value, max_value, step_size)
  legend_colors <- color_numeric(legend_values)
  if(is_percent){
    legend_labels <- sprintf("%s%%", legend_values)
  } else{
    legend_labels <- scales::comma(legend_values)
  }
  
  bbox <- st_bbox(sf)
  
  leaflet(
    options=leafletOptions(
      minZoom=zoom, 
      # maxZoom=zoom,
      zoomControl=TRUE
      # dragging=FALSE
    )
  )%>% 
    addProviderTiles(providers$Stamen.TonerLite) %>%
    addPolygons(
      data=sf$geometry,
      weight=0, 
      color="white", 
      opacity=1, 
      fillOpacity = 0.8, 
      smoothFactor = 0,
      fillColor = color_numeric(sf[[color_col]]),
      popup=sf$popup
    ) %>%
    setMaxBounds(
      lng1=bbox["xmin"],
      lng2=bbox["xmax"],
      lat1=bbox["ymin"],
      lat2=bbox["ymax"]
    ) %>%
    addControl(title, position="topright", layerId="map_title") %>% 
    addLegend(
      layerId="geom_legend",
      position="bottomright",
      colors=legend_colors,
      labels=legend_labels
    )
}

wards <- divs %>% 
  mutate(ward = substr(warddiv, 1, 2)) %>%
  group_by(ward) %>%
  summarise_at(
    .funs=list(sum), 
    .vars=vars(n_mail, n_reg, `2002_general`:`2019_primary`)
  )

wards %<>% 
  mutate(
    popup = sprintf(
      paste(
        c(
          "<b>Ward %s</b>",
          "Active Registered Voters: %s",
          "Mail-In Requests: %s",
          "Votes in the 2019 Primary: %s",
          "Votes in the 2016 Primary: %s"
        ),
        collapse = "<br>"
      ),
      ward,
      scales::comma(n_reg),
      scales::comma(n_mail),
      scales::comma(round(`2019_primary`)),
      scales::comma(round(`2016_primary`))
    )
  )

wards$color_mail <- wards$n_mail
wards_mail <- make_leaflet(
  wards,
  "color_mail",
  "Number of Mail-In Requests"
)
library(widgetframe)
knitr::opts_chunk$set(widgetframe_self_contained = TRUE) 
frameWidget(wards_mail)

As a percent of all voters, Center City, the wealthy ring around it, and Mount Airy/Chestnut Hill stand out. These Wards make up what I’ve dubbed the Wealthy Progressives.

View code

wards$color_mail_vs_reg <- 100 * wards$n_mail / wards$n_reg
mail_reg <- make_leaflet(
  wards,
  "color_mail_vs_reg",
  "Mail-In Requests as % of Active Registered Voters",
  is_percent=TRUE
)
knitr::opts_chunk$set(widgetframe_self_contained = TRUE)
frameWidget(mail_reg)

View code

wards$color_mail_vs_2019 <- 100 * wards$n_mail / wards$`2019_primary`
mail_2019 <- make_leaflet(
  wards,
  "color_mail_vs_2019",
  "Mail-in requests as % of 2019 Primary voters",
  is_percent=TRUE
)
knitr::opts_chunk$set(widgetframe_self_contained = TRUE)
frameWidget(mail_2019)

In fact, aggregating by those Division Cohorts paints a stark picture.

View code

div_cats <- readRDS("../../data/processed_data/div_cats_20200411.RDS")
divs %<>% left_join(div_cats %>% select(warddiv, cat))
cat_colors <- c(
  "Black Voters" = light_blue, 
  "Wealthy Progressives" = light_red, 
  "White Moderates" = light_orange, 
  "Hispanic North Philly" = light_green
)

div_cats <- divs %>% 
  as.data.frame() %>%
  select(-geometry) %>%
  group_by(cat) %>%
  summarise_at(
    .funs=list(sum), 
    .vars=vars(n_mail, n_reg, `2002_general`:`2019_primary`)
  ) %>%
  ungroup() 

ggplot(
  div_cats,
  aes(x = cat, y = n_reg, fill=cat)
) + geom_bar(
  stat="identity",
  position="dodge"
) +
  theme_sixtysix() + # %+replace% theme(axis.text.x = element_blank())+
  scale_y_continuous(labels=scales::comma) +
  labs(
    y=NULL,
    x = NULL,
    fill = "",
    title="Black Voter Divisions represent nearly half of Active Voters",
    subtitle="Active Registered Voters"
  ) +
  scale_fill_manual(values = cat_colors, guide=FALSE)

View code

ggplot(
  div_cats %>%
    mutate(prop_mail = n_mail / n_reg),
  aes(x = cat, y = 100 * prop_mail, fill=cat)
) + geom_bar(
  stat="identity",
  position="dodge"
) +
  theme_sixtysix() %+replace% theme(title = element_text(size = 10)) +
  labs(
    y=NULL,
    x = NULL,
    fill = "",
    title="But Wealthy Progressive Divisions have the most mail-in requests",
    subtitle="Mail-in requests as a percent of active registered voters"
  ) +
  scale_fill_manual(values = cat_colors, guide=FALSE)

View code

ggplot(
  div_cats %>%
    mutate(prop_2019 = n_mail / `2019_primary`),
  aes(x = cat, y = 100 * prop_2019, fill=cat)
) + geom_bar(
  stat="identity",
  position="dodge"
) +
  theme_sixtysix() + # %+replace% theme(axis.text.x = element_blank())+
  labs(
    y=NULL,
    x = NULL,
    fill = "",
    title="Mail-in requests as percent of 2019 Primary turnout",
    subtitle="Requests from Wealthy Progressive Divisions are nearly 80% of their 2019 turnout."
  ) +
  scale_fill_manual(values = cat_colors, guide=FALSE)

What this means for the election

Since we’ve never had mail-in voting before and it’s been over 100 years since the last pandemic, it’s impossible to know what this means for total turnout in June.

Two things are possible:

These mail-in percents represent the actual turnout. Voters decide not to vote in person, and Wealthy Progressives vastly over-represent the electorate.
Voters that aren’t requesting mail-in ballots are just planning to vote in person, and the gap totally closes on Election Day. The city-wide turnout looks normal.

Probably, the outcome will be somewhere in between. What does that mean for PA-1 and other local races? Coming soon!

Appendix: Division Maps

View code

divs %<>% 
  mutate(
    popup = sprintf(
      paste(
        c(
          "<b>Division %s</b>",
          "Active registered voters: %s",
          "Mail-in requests: %s",
          "Votes in the 2019 Primary: %s",
          "Votes in the 2016 Primary: %s"
        ),
        collapse = "<br>"
      ),
      warddiv,
      scales::comma(n_reg),
      scales::comma(n_mail),
      scales::comma(round(`2019_primary`)),
      scales::comma(round(`2016_primary`))
    )
  )


divs$color_mail <- divs$n_mail
div_color_mail <- make_leaflet(
  divs,
  "color_mail",
  "Number of mail-in requests"
)
knitr::opts_chunk$set(widgetframe_self_contained = TRUE)
frameWidget(div_color_mail)

View code

divs$color_mail_vs_reg <- 100 * divs$n_mail / divs$n_reg
l0 <- make_leaflet(
  divs,
  "color_mail_vs_reg",
  "Mail-in requests as % of active registered voters",
  is_percent=TRUE
)
knitr::opts_chunk$set(widgetframe_self_contained = TRUE)
frameWidget(l0)

View code

divs$color_mail_vs_2019 <- 100 * divs$n_mail / divs$`2019_primary`
l1 <- make_leaflet(
  divs,
  "color_mail_vs_2019",
  "Mail-in requests as % of 2019 Primary voters",
  is_percent=TRUE
)
knitr::opts_chunk$set(widgetframe_self_contained = TRUE)
frameWidget(l1)

View code

divs$color_mail_vs_2016 <- 100 * divs$n_mail / divs$`2016_primary`
l2 <- make_leaflet(
  divs,
  "color_mail_vs_2016",
  "Mail-in requests as % of 2016 Primary voters",
  is_percent=TRUE
)
knitr::opts_chunk$set(widgetframe_self_contained = TRUE)
frameWidget(l2)

13 Replies to “Who’s requesting mail-in ballots?”

Howard Bilofsky says:

May 13, 2020 at 5:33 pm

Any chance you can keep this uptodate every few days or so until
June 2? It will energize those of us working to get more applications completed.
Thanks
Howard
1. Jonathan Tannen says:
  
  May 16, 2020 at 10:55 pm
  
  Good idea! I can try manually, whenever I get data from the office.
Jerry Tanenbaum says:

May 26, 2020 at 10:40 am

What is the basis upon which to dub the more wealthy Democrats as “Progressives”??

Aren’t they more likely to be Liberals (moderates)?
1. Jonathan Tannen says:
  
  June 1, 2020 at 12:05 pm
  
  That’s a good question, and I spent some time debating it.
  
  These voting blocs all come directly from which candidates did particularly well, and in this case those candidates are Gym, Brooks, and to a lesser extent Krasner. There are definitely more nuances at a finer grain, but at this crude four-bloc level, the candidates that do especially well in Center City and (especially) the ring around it actually do tend to be the progressives.
Pingback: In Pa., officials prepare for coronavirus, civil unrest to disrupt Tuesday primary | Seeing2020
Pingback: In Pa., officials prepare for coronavirus, civil unrest to disrupt Tuesday primary | TheTop10News | Breaking world news, photos & videos.
Pingback: Pa. officials prepare for coronavirus, civil unrest to disrupt Tuesday primary -
Pingback: In Pennsylvania, officials prepare for coronavirus, civil unrest to disrupt Tuesday primary | TheTop10News | Breaking world news, photos & videos.
Pingback: Some voters in 8 statewide primaries and D.C. face confusion today about where to vote, long lines and poor social distancing – one fast2host
Pingback: Primary election day in 8 state and D.C. marked by confusion | Corona Virus 19
Pingback: Primary election day in 8 state and D.C. marked by confusion - The Washington Post
Pingback: Some voters in 8 statewide primaries and D.C. face confusion today about where to vote, long lines and poor social distancing -
Pingback: Primary election day in 8 state and D.C. marked by confusion | piinions

Comments are closed.

Covid-19 and PA’s first-ever mail-in ballot election.

Who is requesting ballots?

By Age

By Neighborhood

What this means for the election

Appendix: Division Maps

Related

13 Replies to “Who’s requesting mail-in ballots?”