Who will win the Court of Common Pleas?

On May 21st, Philadelphia won’t just be voting for Mayor, City Council, and a few “row offices.” We will also choose nine judges: two for the Superior Court, one for Municipal Court, and six for the Court of Common Pleas. (Really, this is just the primary, but the Common Pleas and Municipal nominees will almost certainly win in November.)

I’ve spent time here before looking at the Court of Common Pleas. The court is responsible for the city’s major civil and criminal trials, and its judges are elected to ten-year terms. And we effectively elect them by drawing names out of a coffee can: ballot positions are assigned by lottery, and ballot position goes a long way toward deciding who wins.

The result is that Philadelphia often elects judges who are unfit for the office. In 2015, Scott DiClaudio won; months later he would be censured for misconduct and negligence, and then get caught having made illegal donations surfaced by the probe into DA Seth Williams. He was in the first position on the ballot. Lyris Younge was at the bottom of the first column that year and also won. She has since been removed from Family Court for violating families’ rights, and made headlines by evicting a full building of residents with less than a week’s notice.

I’ve looked before at the effect of ballot position on the Court’s elections: being in the first column nearly triples your votes. Today, I’ll use that model to simulate who will win in the upcoming race.

It’s easy to predict the Common Pleas Election

Predicting elections is hard, especially without surveys. When I tried it for November’s state house election, I could only make imprecise predictions, and even then had mixed results. Why would this time be any different?

The key is that voters know nothing about the race. In May, voters are selecting six Common Pleas judges from among twenty-five candidates. The median voter will know the name of exactly zero of them before they enter the booth.

This lack of knowledge means that structural components end up mattering a lot. What column your name is listed in, whether the Inquirer endorses you, and at how many polling places your name gets handed out on a piece of paper outside all dictate who will win. We can observe or guess each of those, and come up with pretty accurate predictions.

When I did this exercise two years ago, I got the number of winners from the first column, and the number endorsed by the DCC, exactly right (yes, it’s that easy).

Electing qualified judges

These races matter. Philadelphia regularly elects judges who should not be judges, granting them the authority over a courtroom that decides the city’s most important cases.

As a measure of judicial quality, I use the recommendations from the Judicial Commission of the Philadelphia Bar Association. The Commission evaluates candidates through an interview, a questionnaire, and interviews with people who work with them. It then rates candidates as Recommended or Not Recommended. It usually recommends about two-thirds of the candidates (many more than can win), so it’s useful as a lower-bar measure of candidate quality. My understanding is that when a candidate isn’t recommended, there’s a significant reason, though the Commission’s exact findings are kept confidential.

The ratings are so useful that in 2015 the Philadelphia Inquirer stopped endorsing judicial candidates on its own, and began printing the Commission’s recommendations (this also makes the ratings much more important for candidates).

Recently, the Commission introduced a Highly Recommended category. Unfortunately, it’s too early to know how effective it’s been. In 2015 there were three Highly Recommended candidates, and all three won. But they didn’t do statistically significantly better than the plain old Recommended candidates in terms of votes (albeit with just three observations). In 2017, there were no Highly Recommended candidates.

This time around, there are four Highly Recommended candidates: James Crumlish, Anthony Kyriakakis, Chris Hall, and Tiffany Palmer (a fifth, Michelle Hangley, dropped out because of her unlucky ballot position). None of those four are in the first column, so this year could prove a useful measure of the Bar’s impact.

One note: candidates who do not submit questionnaires are not rated Recommended. Rather than reward the perverse incentive not to submit, I will treat candidates who have not yet submitted paperwork as Not Recommended.
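
In the data prep below, this amounts to recoding missing ratings to zero before building the Recommended flags; a minimal sketch (the actual code applies a small replace_na helper to the 2019 ballot data):

ballot$philacommrec[is.na(ballot$philacommrec)] <- 0  # no rating / no questionnaire treated as Not Recommended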

Where will ballot position matter most?

When I analyzed the determinants of Common Pleas voting, I found that being in the first column nearly tripled a candidate’s votes. Endorsements from the Democratic City Committee (DCC) and the Inquirer roughly doubled a candidate’s votes (though the causal direction here is more dubious). Remember that the Inquirer now simply prints the Bar’s recommendations, so the Inquirer’s historical importance should transfer to the Philadelphia Bar.
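
Those multipliers are just exponentiated coefficients from models fit on the log scale. A minimal sketch of the conversion, using the ward-level mixed model rfit that is fit in the code further down (coefficient names as they appear there):

# run after fitting rfit below; fixef() is from lme4
exp(fixef(rfit)[c("col1TRUE", "dcc", "is_recTRUE")])
# multiplicative effects of the first column, a DCC endorsement, and a Bar recommendation,
# relative to the baseline; a value near 3 for col1TRUE corresponds to "nearly triples" above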

The ballot this year is wide: just four rows and seven columns. With 25 candidates vying for 6 spots, a number of later-column candidates will almost certainly win.

Two years ago when I simulated the race, I did so at the city-level, ignoring neighborhood patterns. But we might see vastly disproportionate turnout in some neighborhoods, and it happens that those are the neighborhoods where recommended candidates do best. So let’s be more careful. First, how much does each determinant of candidates’ votes vary by neighborhood?

View code
library(ggplot2)
library(dplyr)
library(tidyr)
library(tibble)
library(readr)
library(forcats)

source("../../admin_scripts/util.R")

ballot <- read.csv("../../data/common_pleas/judicial_ballot_position.csv")
ballot$name <- tolower(ballot$name)
ballot$name <- gsub("[[:punct:]]", " ", ballot$name)
ballot$name <- trimws(ballot$name)

years <- seq(2009, 2017, 2)
dfs <- list()
for(y in years){
  dfs[[as.character(y)]] <- read_csv(paste0("../../data/raw_election_data/", y, "_primary.csv")) %>% 
    mutate(
      year = y,
      CANDIDATE = tolower(CANDIDATE),
      CANDIDATE = gsub("\\s+", " ", CANDIDATE)
    ) %>%
    filter(grepl("JUDGE OF THE COURT OF COMMON PLEAS-D", OFFICE))
  print(y)
}

df <- bind_rows(dfs)

df <- df %>% 
mutate(WARD = sprintf("%02d", WARD)) %>%
group_by(WARD, year, CANDIDATE) %>% 
summarise(VOTES = sum(VOTES))

df_total <- df %>% 
group_by(year, CANDIDATE) %>% 
summarise(VOTES = sum(VOTES))

election <- data.frame(
year = c(2009, 2011, 2013, 2015, 2017),
votefor = c(7, 10, 6, 12, 9)
)

election <- election %>% left_join(
ballot %>% group_by(year) %>% 
summarise(
nrows = max(rownumber),
ncols = max(colnumber), 
ncand = n(),
n_philacomm = sum(philacommrec),
n_inq = sum(inq),
n_dcc = sum(dcc)
)
)

df_total <- df_total %>% 
left_join(election) %>%
group_by(year) %>%
arrange(desc(year), desc(VOTES)) %>%
mutate(finish = 1:n()) %>%
mutate(winner = finish <= votefor)

df_total <- df_total %>% inner_join(
ballot,
by = c("CANDIDATE" = "name", "year" = "year")
)

df_total <- df_total %>%
group_by(year) %>%
mutate(pvote = VOTES / sum(VOTES))

df_total <- df_total %>%
filter(CANDIDATE != "write in") 

prep_df_for_lm <- function(df, use_candidate=TRUE){
  df <- df %>% mutate(
    rownumber = fct_relevel(factor(as.character(rownumber)), "3"),
    colnumber = fct_relevel(factor(as.character(colnumber)), "3"),
    col1 = colnumber == 1,
    col2 = colnumber == 2,
    col3 = colnumber == 3,
    row1 = rownumber == 1,
    row2 = rownumber == 2,
    is_rec = philacommrec > 0,
    is_highly_rec = philacommrec == 2,
    inq = inq > 0
  )
  if(use_candidate)
    df <- df %>% mutate(
      candidate_year = paste(CANDIDATE, year, sep="::")
    )
  return(df)
}

df_complemented <- df %>% 
filter(CANDIDATE != "write in") %>%
group_by(WARD) %>%
mutate(pvote = VOTES / sum(VOTES)) %>%
inner_join(
df_total %>% prep_df_for_lm(),
by = c("year", "CANDIDATE"),
suffix = c("", ".total")
) 

# fit_model <- function(df){
#   lmfit <- lm(
#     log(pvote + 0.001) ~ 
#       row1 + row2 +
#       # col1*I(votefor - nrows) + 
#       # col2*I(votefor - nrows) + 
#       # col3*I(votefor - nrows) +
#       I(gender == "F") +
#       col1 + col2 + col3 +
#       inq + dcc + 
#       is_rec + is_highly_rec +
#       factor(year),
#     data = df_total %>% prep_df_for_lm()
#   )
#   return(lmfit)
# }
# 
# lmfit <- fit_model(df_total)
# summary(lmfit)
View code
library(lme4)

## better opt: https://github.com/lme4/lme4/issues/98
library(nloptr)
defaultControl <- list(
  algorithm="NLOPT_LN_BOBYQA", xtol_rel=1e-6, maxeval=1e5
)
nloptwrap2 <- function(fn, par, lower, upper, control=list(), ...) {
  for (n in names(defaultControl)) 
    if (is.null(control[[n]])) control[[n]] <- defaultControl[[n]]
  res <- nloptr(x0=par, eval_f=fn, lb=lower, ub=upper, opts=control, ...)
  with(res, list(
    par=solution,
    fval=objective,
    feval=iterations,
    conv=if (status > 0) 0 else status,
    message=message
  ))
}

rfit <- lmer(
  log(pvote + 0.001) ~ 
    (1 | candidate_year) +
    row1 + row2 +
    I(gender == "F") +
    col1 + col2 +
    dcc + 
    is_rec + is_highly_rec +
    factor(year) +
    (
      # row1 + row2 +
      # col1*I(votefor - nrows) + 
      # col2*I(votefor - nrows) + 
      # col3*I(votefor - nrows) +
      I(gender == "F") +
        col1 + col2 + #col3 +
        dcc +
        is_rec + is_highly_rec 
      # factor(year)
      | WARD
    ),
  df_complemented
)

ranef <- as.data.frame(ranef(rfit)$WARD) %>% 
  rownames_to_column("WARD") %>%
  gather("variable", "random_effect", -WARD) %>%
  mutate(
    fixed_effect = fixef(rfit)[variable],
    effect = random_effect + fixed_effect
  )

Recommended candidates receive about 1.8 times as many votes on average, drawing almost all of that advantage from Center City and Chestnut Hill & Mount Airy. While overall we didn’t see a benefit to being Highly Recommended, in the neighborhood drill-down, we do see tentative evidence that those candidates did even better in the wealthier wards (a highly recommended candidate would receive the sum of the Recommended + Highly Recommended effects below).
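
To read the maps for a Highly Recommended candidate, add the two effects before exponentiating. A quick sketch using the ranef data frame built in the code above:

ranef %>%
  filter(variable %in% c("is_recTRUE", "is_highly_recTRUE")) %>%
  group_by(WARD) %>%
  summarise(highly_rec_multiplier = exp(sum(effect)))  # combined multiplicative effect, by ward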

View code
library(sf)

wards <- read_sf("../../data/gis/2016/2016_Wards.shp")

ward_effects <- wards %>% 
mutate(WARD = sprintf("%02d", WARD)) %>%  
left_join(
ranef,
by=c("WARD" = "WARD")
)

format_effect <- function(x){
paste0("x", round(exp(x), 1))
}

fill_min <- ward_effects %>%
filter(
variable %in% c(
"col1TRUE", "col2TRUE", "dcc", "is_recTRUE", "is_highly_recTRUE"
)
)  %>%
with(c(min(effect), max(effect)))

format_variables <- c(
is_recTRUE="Recommended",
is_highly_recTRUE="Highly Recommended",
dcc = "Dem. City Committee Endorsement",
col1TRUE = "First Column",
col2TRUE = "Second Column"
)

ward_effects$variable_name <- factor(
format_variables[ward_effects$variable],
levels = format_variables
)

ggplot(
ward_effects %>% 
filter(variable_name %in% c(format_variables[1:2]))
) + 
geom_sf(aes(fill=effect), color = NA) +
facet_wrap(~variable_name) +
scale_fill_viridis_c(
"Multiplicative\nDifference in Votes", 
labels=format_effect, 
breaks = seq(-2, 3, 0.4)
) +
theme_map_sixtysix() %+replace%
theme(legend.position="right") +
expand_limits(fill = fill_min) +
ggtitle("Recommended candidates do better\n  in wealthier wards") 

What’s going on in the other wards? The Democratic City Committee’s endorsement matters most in the traditionally strong Black wards. (Note: I can’t tell from this data whether that’s because those wards strongly adopt the party’s endorsement, or because the party endorses candidates who would already do well there.) Interestingly, the party’s endorsement wasn’t as important in the Hispanic wards of North Philly or the Northeast.

View code
ggplot(
ward_effects %>% 
filter(variable_name %in% c(format_variables[3]))
) + 
geom_sf(aes(fill=effect), color = NA) +
facet_wrap(~variable_name) +
scale_fill_viridis_c(
"Multiplicative\nDifference in Votes", 
labels=format_effect, 
breaks = seq(-2, 3, 0.4)
) +
theme_map_sixtysix() %+replace%
theme(legend.position="right") +
expand_limits(fill = fill_min) +
ggtitle("Party-endorsed candidates\ndo better in predominantly-Black wards")

Unfortunately, all of these effects are swamped by ballot position. Candidates in the first column receive twice as many votes in every single type of ward, and even more in lower-income wards.
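
For a rough sense of that spread, here’s a one-off summary of the ward-level first-column multipliers (again using the ranef data frame from above):

ranef %>%
  filter(variable == "col1TRUE") %>%
  summarise(
    weakest_ward = exp(min(effect)),    # smallest first-column multiplier across wards
    strongest_ward = exp(max(effect))   # largest first-column multiplier across wards
  )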

View code
ggplot(
ward_effects %>% 
filter(variable_name %in% c(format_variables[4:5]))
) + 
geom_sf(aes(fill=effect), color = NA) +
facet_wrap(~variable_name) +
scale_fill_viridis_c(
"Multiplicative\nDifference in Votes", 
labels=format_effect, 
breaks = seq(-2, 3, 0.4)
) +
theme_map_sixtysix() %+replace%
theme(legend.position="right") +
expand_limits(fill = fill_min) +
ggtitle(
"First-column candidates do better everywhere", 
"Relative to third column or later"
)

Simulating the election

Predicting the election comes down to applying these correlations, and then adding random noise of the appropriate size.

I use each candidate’s ballot position and endorsements to come up with a baseline estimate of how they’ll do in each ward. There is a lot of uncertainty for a given candidate, so I add random noise to each candidate (candidate-level effects that aren’t explained by my model have a standard deviation of about +/- 30% of their votes.)

I scale up the ward performance by my turnout projection. I’m using my high-turnout projections, which assume that the post-2016 surge continues in Center City and its ring, and will in general help recommended candidates, who do better in those wealthier wards.

View code
turnout_2019 <- read.csv(
"../turnout_2019_primary/turnout_projections_2019.csv"
) %>%
mutate(WARD = sprintf("%02d", WARD16)) %>%
group_by(WARD) %>%
summarise(
high_projection = sum(high_projection, na.rm = TRUE),
low_projection = sum(low_projection, na.rm = TRUE)
)

replace_na <- function(x, r=0) ifelse(is.na(x), r, x)

df_2019 <- ballot %>% 
filter(year == 2019) %>%
mutate(
philacommrec = replace_na(philacommrec),
dcc = replace_na(dcc),
inq = (philacommrec > 0),
year = 2017  ## fake year to trick lm
) %>%
prep_df_for_lm(use_candidate = FALSE) %>%
left_join(
expand.grid(
name = unique(ballot$name),
WARD = unique(turnout_2019$WARD)
)
) %>% left_join(turnout_2019)


## pretend it's one candidate, but then marginalize over candidates
df_2019$log_pvote <- predict(
rfit,
newdata = df_2019 %>% 
mutate(candidate_year = df_complemented$candidate_year[1])
)

df_2019 <- df_2019 %>%
mutate(pvote_prop = exp(log_pvote))

sd_cand <- sd(ranef(rfit)$candidate_year$`(Intercept)`)
simdf <- expand.grid(
sim = 1:1000,
name = unique(df_2019$name)
) %>%
mutate(cand_re = rnorm(n(), sd = sd_cand))

## https://econsultsolutions.com/simulating-the-court-of-common-pleas-election/
votes_per_voter <- 4.5

simdf <- df_2019 %>%
left_join(simdf) %>%
mutate(pvote_prop_sim = pvote_prop * exp(cand_re)) %>%
group_by(WARD, sim) %>%
mutate(pvote = pvote_prop_sim / sum(pvote_prop_sim)) %>%
group_by() %>%
mutate(votes = high_projection * votes_per_voter * pvote) %>%
group_by(sim, name) %>%
summarise(votes = sum(votes)) %>%
left_join(ballot %>% filter(year == 2019)) %>%
prep_df_for_lm(use_candidate = FALSE)

simdf <- simdf %>%
group_by(sim) %>%
mutate(
vote_rank = rank(desc(votes)),
winner = rank(vote_rank) <= 6
)

remove_na <- function(x, r=0) return(ifelse(is.na(x), r, x))

winner_df <- simdf %>% 
group_by(sim) %>%
summarise(
winners_rec = sum(is_rec * winner),
winners_highly_rec = sum(is_highly_rec * winner),
winners_col1 = sum(col1 * winner),
winners_col2 = sum(col2 * winner),
winners_col3 = sum(col3 * winner),
winners_dcc = sum(remove_na(dcc) * winner),
winners_women = sum((gender == "F") * winner)
)

Under the hood, the model has an estimate for each candidate. But I’m not totally comfortable with blasting those out (and what feedback loops that might cause), so let’s look at the high-level predictions instead.

View code
col_sim <- winner_df %>%
select(winners_col1, winners_col2) %>%
mutate(
`Third Column or later` = 6 - winners_col1 - winners_col2
) %>%
rename(
`First Column` = winners_col1,
`Second Column` = winners_col2
)

rec_sim <- winner_df %>%
select(winners_rec, winners_highly_rec, winners_dcc) %>%
mutate(
`Not Recommended` = 6 - winners_rec
) %>%
rename(
`All Recommended` = winners_rec,
`Highly Recommended` = winners_highly_rec,
`DCC Endorsed` = winners_dcc
)

gender_sim <- winner_df %>%
select(winners_women) %>%
mutate(
`Men` = 6 - winners_women
) %>%
rename(
`Women` = winners_women
)

plot_winners <- function(
sim_df, 
title, 
facet_order,
colors
){
gathered_df <- sim_df %>%
gather("facet", "n_winners") %>%
mutate(
facet = factor(
facet,
facet_order
)
) %>%
group_by(facet, n_winners) %>%
count() %>%
group_by(facet) %>%
mutate(prop = n / sum(n))

facet_lev <- levels(gathered_df$facet)
names(colors) <- facet_lev
ggplot(
gathered_df,
aes(x=n_winners)
) +
geom_bar(aes(y = prop, fill = facet), stat="identity") +
theme_sixtysix() +
expand_limits(x=c(0,7)) +
scale_x_continuous("Count of winners", breaks = 0:7) +
ylab("Proportion of simulations") +
scale_fill_manual(values = colors, guide=FALSE)+
facet_grid(facet~.) +
geom_vline(xintercept=6, linetype="dashed") +
ggtitle(title)
}

The model is really optimistic about how many Recommended candidates win, mostly because there’s only one Not Recommended candidate in the first two columns. In 66% of simulations all six winners are Recommended, and in 33% all but one are (Jon Marshall at the bottom of the first column is usually the lone Not Recommended winner).
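
Those percentages are simple tabulations over the 1,000 simulations. For example, using winner_df from the code above:

mean(winner_df$winners_rec == 6)  # share of simulations where all six winners are Recommended (about 0.66)
mean(winner_df$winners_rec == 5)  # share where five of the six are (about 0.33)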

View code
plot_winners(
rec_sim, 
"Simulations by Recommendation", 
c("Highly Recommended", "All Recommended", "Not Recommended", "DCC Endorsed"),
c(strong_blue, strong_green, strong_red, strong_grey)
)

Highly Recommended candidates do less well; we get no Highly Recommended winners in 46% of simulations, and only one in another 47%. Remember that the model doesn’t think that being Highly Recommended helps more than just regular Recommended, and this year’s candidates have bad ballot position. Their performance this year will be a barometer for the power of the Bar’s endorsements; getting two winners (let alone three or four) would be a huge achievement (and presumably good for the citizens of Philadelphia, too).

DCC endorsees win an average of 3.1 of the six seats.

Of course, the true determinant is the first column.

View code
plot_winners(
col_sim, 
"Simulations by Column Position", 
c("First Column", "Second Column", "Third Column or later"),
c(strong_purple, strong_orange, strong_grey)
)

The first column produces 2.5 winners on average; the most likely outcome is three of the first-column candidates winning (45% of simulations), the second most likely is two (40%). The second column still produces 1.0 winners on average, with the remaining 2.4 winners coming from the final five columns.

View code
wincount_df <- simdf %>% 
group_by(sim) %>%
mutate(pvote= votes/sum(votes)) %>%
filter(vote_rank == 6)

How many votes will it take to win? The average sixth-place winner receives 5.1% of the vote (remember that voters can vote for multiple candidates). Assuming that 218,000 people vote, and an average of 4.5 candidates selected per voter, that comes out to about 50,000 votes.
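
The back-of-the-envelope arithmetic:

projected_voters <- 218000   # turnout projection
votes_per_voter <- 4.5       # average judicial votes cast per voter
pvote_sixth <- 0.051         # average share for the sixth-place finisher
projected_voters * votes_per_voter * pvote_sixth   # roughly 50,000 votes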

View code
ggplot(
wincount_df,
aes(x = pvote * 100)
) +
geom_histogram(
aes(y=stat(count) / sum(stat(count))),
boundary=5,
binwidth=0.2,
fill = strong_green
) +
ylab("Proportion of Simulations") +
xlab("Percent of vote received by sixth place") +
geom_vline(xintercept = 100 * mean(wincount_df$pvote), color = "black") +
annotate(
"text", 
label = sprintf("Mean = %.1f %%", 100 * mean(wincount_df$pvote)),
x = 100 * mean(wincount_df$pvote),
y = 0.05,
angle = 90,
vjust = 1.1
)+ 
theme_sixtysix() +
ggtitle("Win Count for Common Pleas") 

This year may be… not too bad?

In 2015, three Not Recommended candidates became ten-year judges. In 2017, three more did. This year, at worst, probably only one will. Why? Mostly luck: all six of those unqualified winners were in the first column, and this year seven of the eight candidates in the first two columns are Recommended.

Instead, the open question is just how qualified our judges will be. Will we stay on par with the past, which would see my model’s predicted zero or only one Highly Recommended candidate win? Or will the Bar’s Highly Recommended ratings assert themselves, and prove a bigger player this year?

The neighborhoods that decide Council District 2

[Note 2019-03-09: this post has been heavily updated thanks to an insightful suggestion from @DanthePHLman]

Could Kenyatta Lose?

Kenyatta Johnson, the two-term councilmember from Southwest and South Philly’s District 2, is being challenged by Lauren Vidas, the former assistant finance director under Mayor Nutter. Johnson dominated a challenge from developer Ori Feibush four years ago, but has since been mired in land-deal scandals. In Wednesday’s post, I claimed District 3’s challenger faced a plausible but steep path. How about District 2? What would it take for Vidas to win?

Johnson’s District 2 is quite different from West Philly’s District 3. The gentrification has covered less ground, and Graduate Hospital didn’t take to Bernie Sanders and Larry Krasner in the same way that University City did. On the other hand, Johnson’s recent scandals will likely kneecap his 2015 popularity, and Vidas occupies a quite different lane than developer Feibush.

What are the neighborhood cohorts that will decide District 2? If Johnson holds, what neighborhoods will he have done well in? If Vidas’s challenge is successful, which neighborhoods’ vote will she have monopolized?

District 2’s voting blocs

The voting blocs for District 2 are less distinct than those for District 3: there’s the pro-Kenyatta base of Point Breeze, the challenger base of Grad Hospital and a nub of East Passyunk, and then there’s Southwest Philly, which is somewhere in between.

View code
library(tidyverse)
library(rgdal)
library(rgeos)
library(sp)
library(ggmap)

sp_council <- readOGR("../../../data/gis/city_council/Council_Districts_2016.shp", verbose = FALSE)
sp_council <- spChFIDs(sp_council, as.character(sp_council$DISTRICT))

sp_divs <- readOGR("../../../data/gis/2016/2016_Ward_Divisions.shp", verbose = FALSE)
sp_divs <- spChFIDs(sp_divs, as.character(sp_divs$WARD_DIVSN))
sp_divs <- spTransform(sp_divs, CRS(proj4string(sp_council)))

load("../../../data/processed_data/df_major_2017_12_01.Rda")

ggcouncil <- fortify(sp_council) %>% mutate(council_district = id)
ggdivs <- fortify(sp_divs) %>% mutate(WARD_DIVSN = id)
View code
## Need to add District 2 election from 2015
raw_d2 <-  read.csv("../../../data/raw_election_data/2015_primary.csv") 
raw_d2 <- raw_d2 %>% 
  filter(OFFICE == "DISTRICT COUNCIL-2ND DISTRICT-DEM") %>%
  mutate(
    WARD = sprintf("%02d", asnum(WARD)),
    DIV = sprintf("%02d", asnum(DIVISION))
  )

load('../../../data/gis_crosswalks/div_crosswalk_2013_to_2016.Rda')
crosswalk_to_16 <- crosswalk_to_16 %>% group_by() %>%
  mutate(
    WARD = sprintf("%02s", as.character(WARD)),
    DIV = sprintf("%02s", as.character(DIV))
  )

d2 <- raw_d2 %>% 
  left_join(crosswalk_to_16) %>%
  group_by(WARD16, DIV16, OFFICE, CANDIDATE) %>%
  summarise(VOTES = sum(VOTES * weight_to_16)) %>%
  mutate(PARTY="DEMOCRATIC", year="2015", election="primary")
df_major <- bind_rows(df_major, d2)
View code
races <- tribble(
  ~year, ~OFFICE, ~office_name,
  "2015", "MAYOR", "Mayor",
  "2015", "DISTRICT COUNCIL-2ND DISTRICT-DEM", "Council 2nd District",
  "2015", "COUNCIL AT LARGE", "City Council At Large",
  "2016", "PRESIDENT OF THE UNITED STATES", "President",
  "2017", "DISTRICT ATTORNEY", "District Attorney"
) %>% mutate(election_name = paste(year, office_name))

candidate_votes <- df_major %>% 
  filter(election == "primary" & PARTY == "DEMOCRATIC") %>%
  inner_join(races %>% select(year, OFFICE)) %>%
  mutate(WARD_DIVSN = paste0(WARD16, DIV16)) %>%
  group_by(WARD_DIVSN, OFFICE, year, election) %>%
  mutate(
    total_votes = sum(VOTES),
    pvote = VOTES / sum(VOTES)
  ) %>% 
  group_by()

turnout_df <- candidate_votes %>%
  filter(!grepl("COUNCIL", OFFICE)) %>% 
  group_by(WARD_DIVSN, OFFICE, year, election) %>%
  summarise(total_votes = sum(VOTES)) %>%
  left_join(
    sp_divs@data %>% select(WARD_DIVSN, AREA_SFT)
  )

turnout_df$AREA_SFT <- asnum(turnout_df$AREA_SFT)

The second council district covers Southwest Philly, and parts of South Philly including Point Breeze and Graduate Hospital.

View code
get_labpt_df <- function(sp){
  mat <- sapply(sp@polygons, slot, "labpt")
  df <- data.frame(x = mat[1,], y=mat[2,])
  return(
    cbind(sp@data, df)
  )
}

ggplot(ggcouncil, aes(x=long, y=lat)) +
  geom_polygon(
    aes(group=group),
    fill = strong_green, color = "white", size = 1
  ) +
  geom_text(
    data = get_labpt_df(sp_council),
    aes(x=x,y=y,label=DISTRICT)
  ) +
  theme_map_sixtysix() +
  coord_map() +
  ggtitle("Council Districts")

[Figure: Council Districts]

View code
DISTRICT <- "2"
sp_district <- sp_council[row.names(sp_council) == DISTRICT,]

bbox <- sp_district@bbox
## expand the bbox 20%for mapping
bbox <- rowMeans(bbox) + 1.2 * sweep(bbox, 1, rowMeans(bbox))

basemap <- get_map(bbox, maptype="toner-lite")

district_map <- ggmap(
  basemap, 
  extent="normal", 
  base_layer=ggplot(ggcouncil, aes(x=long, y=lat, group=group)),
  maprange = FALSE
) 
## without basemap:
# district_map <- ggplot(ggcouncil, aes(x=long, y=lat, group=group))

district_map <- district_map +
  theme_map_sixtysix() +
  coord_map(xlim=bbox[1,], ylim=bbox[2,])


sp_divs$council_district <- over(
  gCentroid(sp_divs, byid = TRUE), 
  sp_council
)$DISTRICT

sp_divs$in_bbox <- sapply(
  sp_divs@polygons,
  function(p) {
    coords <- p@Polygons[[1]]@coords
    any(
      coords[,1] > bbox[1,1] &
      coords[,1] < bbox[1,2] &
      coords[,2] > bbox[2,1] &
      coords[,2] < bbox[2,2] 
    )
  }
)

ggdivs <- ggdivs %>% 
  left_join(
    sp_divs@data %>% select(WARD_DIVSN, in_bbox)
  )

district_map +
  geom_polygon(
    aes(alpha = (id == DISTRICT)),
    fill="black",
    color = "grey50",
    size=2
  ) +
  scale_alpha_manual(values = c(`TRUE` = 0.2, `FALSE` = 0), guide = FALSE) +
  ggtitle(sprintf("Council District %s", DISTRICT))

[Figure: Council District 2]
Despite the large expanse of land, the vast majority of the district’s votes come from Center City and northern South Philly.

View code
# hist(turnout_df$total_votes / turnout_df$AREA_SFT, breaks = 1000)

turnout_df <- turnout_df %>%
  left_join(races)

district_map +
  geom_polygon(
    data = ggdivs %>%
      filter(in_bbox) %>%
      left_join(turnout_df, by =c("id" = "WARD_DIVSN")),
    aes(fill = pmin(total_votes / AREA_SFT, 0.0005))
  ) +
  scale_fill_viridis_c(guide = FALSE) +
  geom_polygon(
    fill=NA,
    color = "white",
    size=1
  ) +
  facet_wrap(~ election_name) +
  ggtitle(
    "Votes per mile in the Democratic Primary", 
    sprintf("Council District %s", DISTRICT)
  )

[Figure: Votes per mile in the Democratic Primary, Council District 2]
In fact, so few votes come from the industrial southernmost tip of the city that we’ll drop it from the maps. Sorry, Navy Yard, but you’re ruining my scale.

View code
d2_subset <- sp_divs[sp_divs$council_district == DISTRICT,]
d2_subset <- d2_subset[
  d2_subset$WARD_DIVSN %in% 
    turnout_df$WARD_DIVSN[turnout_df$total_votes / turnout_df$AREA_SFT > 0.0001],
]

bbox <- gUnionCascaded(d2_subset)@bbox
## expand the bbox 20%for mapping
bbox <- rowMeans(bbox) + 1.2 * sweep(bbox, 1, rowMeans(bbox))

basemap <- get_map(bbox, maptype="toner-lite")

district_map <- ggmap(
  basemap, 
  extent="normal", 
  base_layer=ggplot(ggcouncil, aes(x=long, y=lat, group=group)),
  maprange = FALSE
) 
## without basemap:
# district_map <- ggplot(ggcouncil, aes(x=long, y=lat, group=group))

district_map <- district_map +
  theme_map_sixtysix() +
  coord_map(xlim=bbox[1,], ylim=bbox[2,])


sp_divs$council_district <- over(
  gCentroid(sp_divs, byid = TRUE), 
  sp_council
)$DISTRICT

First, let’s look at the results from five recent, compelling Democratic Primary races: 2015 City Council At Large, City Council District 2, and Mayor; 2016 President; and 2017 District Attorney. The maps below show the vote for the top two candidates in District 2 (except for City Council in 2015, where I use Helen Gym and Isaiah Thomas, who were 4th and 5th in the district, and 5th and 6th citywide).

View code
candidate_votes <- candidate_votes %>%
  left_join(sp_divs@data %>% select(WARD_DIVSN, council_district))

## Choose the top two candidates in district 3
## Except for city council, where we choose Gym and Thomas
# candidate_votes %>%
#   group_by(OFFICE, year, CANDIDATE) %>%
#   summarise(
#     city_votes = sum(VOTES),
#     district_votes = sum(VOTES * (council_district == DISTRICT))
#   ) %>%
#   arrange(desc(city_votes)) %>%
#   filter(OFFICE == "DISTRICT ATTORNEY")

candidates_to_compare <- tribble(
  ~year, ~OFFICE, ~CANDIDATE, ~candidate_name, ~row,
  "2015", "COUNCIL AT LARGE", "HELEN GYM", "Helen Gym", 2,
  "2015", "COUNCIL AT LARGE", "ISAIAH THOMAS", "Isaiah Thomas", 1,
  "2015", "DISTRICT COUNCIL-2ND DISTRICT-DEM", "KENYATTA JOHNSON", "Kenyatta Johnson", 1,
  "2015", "DISTRICT COUNCIL-2ND DISTRICT-DEM", "ORI C FEIBUSH", "Ori Feibush", 2,
  "2015", "MAYOR", "JIM KENNEY", "Jim Kenney",  2,
  "2015", "MAYOR", "ANTHONY HARDY WILLIAMS", "Anthony Hardy Williams", 1,
  "2016", "PRESIDENT OF THE UNITED STATES", "BERNIE SANDERS", "Bernie Sanders", 2,
  "2016", "PRESIDENT OF THE UNITED STATES", "HILLARY CLINTON", "Hillary Clinton", 1,
  "2017", "DISTRICT ATTORNEY", "LAWRENCE S KRASNER", "Larry Krasner", 2,
  "2017", "DISTRICT ATTORNEY", "JOE KHAN","Joe Khan", 1
)

candidate_votes <- candidate_votes %>%
  left_join(races) %>%
  left_join(candidates_to_compare)

vote_adjustment <- function(pct_vote, office){
  ifelse(office == "COUNCIL AT LARGE", pct_vote * 4, pct_vote)
}

district_map +
  geom_polygon(
    data = ggdivs %>%
      filter(in_bbox) %>%
      left_join(
        candidate_votes %>% filter(!is.na(row))
      ),
    aes(fill = 100 * vote_adjustment(pvote, OFFICE))
  ) +
  scale_fill_viridis_c("Percent of Vote") +
  theme(
    legend.position =  "bottom",
    legend.direction = "horizontal",
    legend.justification = "center"
  ) +
  geom_polygon(
    fill=NA,
    color = "white",
    size=1
  ) +
  geom_label(
    data=candidates_to_compare %>% left_join(races),
    aes(label = candidate_name),
    group=NA,
    hjust=0, vjust=1,
    x=bbox[1,1],
    y=bbox[2,2]
  ) +
  facet_grid(row ~ election_name) +
  theme(strip.text.y = element_blank()) +
  ggtitle(
    sprintf("Candidate performance in District %s", DISTRICT), 
    "Percent of vote (times 4 for Council, times 1 for other offices)"
  )

[Figure: Candidate performance in District 2]
Notice two things. First, the section of Point Breeze that dominated for Kenyatta Johnson in 2015 also voted disproportionately for Isaiah Thomas, Anthony Hardy Williams, and Hillary Clinton. These are predominantly Black neighborhoods that didn’t bite on Helen Gym, Jim Kenney, or Bernie Sanders. Unlike in West Philly, Krasner did even better in Black Point Breeze than he did in the White, gentrified Graduate Hospital, where Joe Khan did unusually well. East Passyunk showed Krasner excitement similar to University City’s.

Second, note that Washington Avenue provides the stark boundary between pro-Kenyatta Point Breeze and the pro-Gym, pro-Feibush, pro-Kenney Graduate Hospital (interested in this emergent boundary? Boy, have I got a dissertation for you!). The divisions above Washington (along with the nub of East Passyunk that extends into the east of the district) both support the farther-left challengers and turn out in force, although they didn’t support Sanders and Krasner as sharply as other gentrified parts of the city did.

The district had one more coalition, hidden by these maps: Trump supporters.

View code
usp_2016 <- df_major %>%
  filter(
    election=="general"&
      year == 2016 &
      OFFICE == "PRESIDENT OF THE UNITED STATES" &
      CANDIDATE %in% c("DONALD J TRUMP", "HILLARY CLINTON")
    ) %>%
  mutate(WARD_DIVSN = paste0(WARD16, DIV16)) %>%
  group_by(WARD_DIVSN, CANDIDATE) %>%
  summarise(VOTES = sum(VOTES)) %>%
  group_by(WARD_DIVSN) %>%
  summarise(
    turnout = sum(VOTES),
    pdem = sum(VOTES * (CANDIDATE == "HILLARY CLINTON")) / sum(VOTES)
  )

district_map +
  geom_polygon(
    data = ggdivs %>%
      filter(in_bbox) %>%
      left_join(
        usp_2016
      ),
    aes(fill = 100 * (1-pdem))
  ) +
  scale_fill_gradient2(
    "Percent for Donald Trump",
    low = strong_blue, mid = "white", high = strong_red, midpoint = 50
  )+
  theme(
    legend.position =  "bottom",
    legend.direction = "horizontal",
    legend.justification = "center"
  ) +
  geom_polygon(
    fill=NA,
    color = "white",
    size=1
  ) +
  expand_limits(fill = 80) +
  ggtitle("South of Passyunk went for Trump", "Percent of two-party vote in the 2016 Presidential election")

[Figure: South of Passyunk went for Trump]

South of Passyunk voted Trump, with up to 60% of the vote! Coupled with parts of the Northeast, this represents Philadelphia’s Trump Democrats. We’ll treat them separately.

To simplify the analysis, let’s divide the district into coalitions. We’ll use four: “Johnson’s Base” of Point Breeze; the “Gentrified Challengers” of Graduate Hospital and East Passyunk; “Southwest Philly”, which supported Johnson, but not homogeneously; and “Trumpist South Philly”, below Passyunk.
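
A condensed sketch of the assignment rules (the actual code below derives is_sw from division centroids west of the Schuylkill, trump_winner from the 2016 general election, and splits the East along the arbitrary line drawn in the scatterplot that follows):

category <- case_when(
  is_sw ~ "Southwest",
  trump_winner ~ "Trumpists",
  y_pvote > 3 * x_pvote - 1.30 ~ "Gentrified Challengers",  # Krasner share vs. Johnson share, as proportions
  TRUE ~ "Johnson Base"
)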

View code

xcand <- "Kenyatta Johnson"
ycand <- "Larry Krasner"

## Everything west of the Schuylkill call Southwest.
div_centroids <- gCentroid(sp_divs[sp_divs$council_district == DISTRICT,], byid=TRUE)
sw_divs <- attr(div_centroids@coords, "dimnames")[[1]][div_centroids@coords[,1] < -75.20486]

## Pull out the places trump won
trump_winners <- usp_2016 %>%
  inner_join(sp_divs@data %>% filter(council_district == DISTRICT)) %>%
  filter(pdem < 0.5)

district_categories <- candidate_votes %>%
  filter(!is.na(candidate_name)) %>%
  group_by(WARD_DIVSN) %>%
  mutate(votes_2016 = total_votes[candidate_name == 'Bernie Sanders']) %>%
  group_by() %>%
    filter(
      council_district == DISTRICT & 
        candidate_name %in% c(xcand, ycand)
    ) %>%
    group_by(WARD_DIVSN, votes_2016) %>%
    summarise(
      x_pvote = pvote[candidate_name == xcand],
      y_pvote = pvote[candidate_name == ycand]
    ) %>%
  mutate(
    is_sw = WARD_DIVSN %in% sw_divs,
    trump_winner = WARD_DIVSN %in% trump_winners$WARD_DIVSN,
    cat = ifelse(is_sw, "Southwest", ifelse(trump_winner, "Trumpists", "East"))
  ) 

# district_categories <- district_categories %>% left_join(turnout_wide, by = "WARD_DIVSN")

ggplot(
  district_categories,
  aes(x = 100 * x_pvote, y = 100 * y_pvote)
  # aes(x = 100 * x_pvote, y = (votes_2017 - votes_2015) * 5280^2)
) +
  geom_point(aes(size = votes_2016), alpha = 0.3) +
  scale_size_area("Total Votes in 2016")+
  theme_sixtysix() +
  xlab(sprintf("Percent of Vote for %s", xcand)) +
  ylab("Change in Votes/Mile, 2015 - 2017") + 
  ylab(sprintf("Percent of Vote for %s", ycand)) +
  coord_fixed() +
  geom_abline(slope = 3, intercept =  -130) +
  # geom_hline(yintercept=0) +
  # geom_vline(xintercept=60, linetype="dashed") +
  # geom_abline(slope = 100, intercept =  -7000) +
  geom_text(
    data = data.frame(cat = rep("East", 2)),
    x = c(28, 80),
    y = c(70, 10),
    hjust = 0.5,
    label = c("Challenger\nBase", "Johnson\nBase"),
    color = c(strong_green, strong_purple),
    fontface="bold"
  ) +
  facet_wrap(~cat)+
  ggtitle("Divisions' vote", sprintf("District %s Democratic Primary", DISTRICT))

[Figure: Divisions’ vote in the District 2 Democratic Primary]
Surprisingly, the vote for Krasner is only weakly negatively correlated with the vote for Johnson. I’ve drawn an arbitrary line that appears to divide the clusters. We’ll call the divisions in the East above the line the Gentrified Challengers, and those below the line Johnson’s Base.

Here’s the map of the cohorts that this categorization gives us.

View code
district_categories$category <- with(
  district_categories,
  ifelse(
    cat != "East", cat,
    ifelse(y_pvote > 3 * x_pvote - 1.30, "Gentrified Challengers", "Johnson Base")
  )
)

cohort_colors <- c(
      "Johnson Base" = strong_purple,
      "Gentrified Challengers" = strong_green,
      "Southwest" = strong_orange,
      "Trumpists" = strong_red
    )

district_map + 
  geom_polygon(
    data = ggdivs %>% 
      left_join(district_categories) %>% 
      filter(!is.na(category)),
    aes(fill = category)
  ) +
  scale_fill_manual(
    "Cohort",
    values=cohort_colors
  ) +
  ggtitle(sprintf("District %s neighborhood divisions", DISTRICT))+
  theme(legend.position = c(0,1), legend.justification = c(0,1))

[Figure: District 2 neighborhood divisions]
Looks reasonable.

How did the candidates do in each of the sections? The boundaries separate drastic performance splits.

View code
neighborhood_summary <- candidate_votes %>% 
  inner_join(candidates_to_compare) %>%
  group_by(candidate_name, election_name) %>%
  mutate(
    citywide_votes = sum(VOTES),
    citywide_pvote = 100 * sum(VOTES) / sum(total_votes)
  ) %>%
  filter(council_district == DISTRICT) %>%
  left_join(district_categories) %>%
  group_by(candidate_name, citywide_votes, citywide_pvote, election_name, category) %>%
  summarise(
    votes = sum(VOTES),
    pvote = 100 * sum(VOTES) / sum(total_votes),
    total_votes = sum(total_votes)
  ) %>%
  group_by(candidate_name, election_name) %>%
  mutate(
    district_votes = sum(votes),
    district_pvote = 100 * sum(votes) / sum(total_votes)
  ) %>% select(
    election_name, candidate_name, citywide_pvote, district_pvote, category, pvote, total_votes
  ) %>%
  gather(key="key", value="value", pvote, total_votes) %>%
  unite("key", category, key) %>%
  spread(key, value)
  

neighborhood_summary %>%
  knitr::kable(
    digits=0, 
    format.args=list(big.mark=','),
    col.names=c("Election", "Candidate", "Citywide %", sprintf("District %s %%", DISTRICT), "Gentrified Challengers %", "Gentrified Challengers Turnout", "Johnson Base %", "Johnson Base Turnout", "Southwest %", "Southwest Turnout", "Trumpist %", "Trumpist Turnout")
  )
| Election | Candidate | Citywide % | District 2 % | Gentrified Challengers % | Gentrified Challengers Turnout | Johnson Base % | Johnson Base Turnout | Southwest % | Southwest Turnout | Trumpist % | Trumpist Turnout |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 2015 City Council At Large | Helen Gym | 8 | 9 | 14 | 24,471 | 7 | 25,139 | 4 | 15,569 | 5 | 5,338 |
| 2015 City Council At Large | Isaiah Thomas | 7 | 7 | 5 | 24,471 | 10 | 25,139 | 8 | 15,569 | 2 | 5,338 |
| 2015 Council 2nd District | Kenyatta Johnson | 62 | 62 | 44 | 6,669 | 79 | 9,508 | 64 | 5,894 | 36 | 2,057 |
| 2015 Council 2nd District | Ori Feibush | 38 | 38 | 55 | 6,669 | 21 | 9,508 | 36 | 5,894 | 64 | 2,057 |
| 2015 Mayor | Anthony Hardy Williams | 26 | 30 | 11 | 7,606 | 41 | 10,127 | 45 | 6,740 | 2 | 2,580 |
| 2015 Mayor | Jim Kenney | 56 | 56 | 70 | 7,606 | 46 | 10,127 | 44 | 6,740 | 89 | 2,580 |
| 2016 President | Bernie Sanders | 37 | 37 | 39 | 11,702 | 38 | 14,293 | 29 | 9,224 | 47 | 2,051 |
| 2016 President | Hillary Clinton | 63 | 63 | 60 | 11,702 | 62 | 14,293 | 71 | 9,224 | 52 | 2,051 |
| 2017 District Attorney | Joe Khan | 20 | 23 | 33 | 6,985 | 18 | 6,569 | 13 | 3,354 | 14 | 982 |
| 2017 District Attorney | Larry Krasner | 38 | 39 | 43 | 6,985 | 41 | 6,569 | 32 | 3,354 | 18 | 982 |

The turnout splits are fascinating. The Johnson Base represented a consistent 37-ish percent of the votes, dominating the elections of 2015 and 2016, but it was surpassed by the Gentrified Challengers’ 39% in 2017. Still, the Southwest typically represents 25% of the votes (this fell to 19% in 2017), so Johnson’s Base combined with the Southwest made up a strong 63% of the 2016 vote, and about 55% of the 2017 vote.

View code
cohort_turnout <- neighborhood_summary %>%
  group_by() %>%
  filter(election_name %in% c("2015 Mayor", "2016 President", "2017 District Attorney")) %>%
  select(election_name, ends_with("_total_votes")) %>%
  gather("cohort", "turnout", -election_name) %>%
  unique() %>%
  mutate(
    year = substr(election_name, 1, 4),
    cohort = gsub("^(.*)_total_votes", "\\1", cohort)
  ) %>%
  group_by(year) %>%
  mutate(pct_turnout = turnout / sum(turnout))
  
ggplot(cohort_turnout, aes(x=year, y=100*pct_turnout)) +
  geom_line(aes(group=cohort, color=cohort), size=2) +
  geom_point(aes(color=cohort), size=4) +
  scale_color_manual(values=cohort_colors, guide=FALSE) +
  theme_sixtysix() +
  expand_limits(y=0) +
  expand_limits(x=4)+
  geom_text(
    data = cohort_turnout %>% filter(year == 2017),
    aes(label = cohort, color = cohort),
    x = 3.05,
    fontface="bold",
    hjust = 0
  ) +
  ylab("Percent of District 2's votes") +
  xlab("") +
  ggtitle(
    "Cohorts' electoral strength", "Percent of District 2's votes in the Democratic Primary"
  )

[Figure: Cohorts’ electoral strength]

How does the combination of (a) the sheer size of Johnson’s Base and (b) the Gentrifiers’ surge in voting affect the election? It comes down to the percent of the vote in each region. In 2015, Kenyatta won only 44% in the Challenger Base even as he dominated his own Base and the Southwest, with 79 and 64%. Feibush won 64% from the South Philly Trumpists. Vidas, who is a very different candidate from Feibush (to put it mildly), would have to do much, much better in the Gentrified regions, and hope Johnson’s dominance of Point Breeze has fallen.

The relative power of the district’s cohorts

How much does the power shift among these cohorts? Let’s do some math.

How much does a candidate need from each of the sections to win? Let t_i be the relative turnout in section i, defined as the proportion of total votes. So in the 2017 District Attorney Race, t_i was 0.39 for the Gentrified Challengers, and 0.37 for the Johnson Base. Let p_ic be the proportion of the vote received by candidate c in section i, so in 2017, p is 0.41 for Krasner in the Johnson Base.

Then a candidate wins a two-way race whenever the turnout-weighted proportion of their vote is greater than 0.5: sum_over_i(t_i p_ic) > 0.5.
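
As a sanity check, here is the 2015 Council District 2 primary (the one true two-way race above) plugged into that formula, using the rounded turnout and percentages from the table:

turnout <- c(gentrified = 6669, johnson_base = 9508, southwest = 5894, trumpist = 2057)
t_i <- turnout / sum(turnout)   # relative turnout by section
p_johnson <- c(gentrified = 0.44, johnson_base = 0.79, southwest = 0.64, trumpist = 0.36)
sum(t_i * p_johnson)   # about 0.62: Johnson's district-wide share, comfortably above 0.5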

Since we’ve divided District 2 into four sections, it’s hard to plot on two axes. For simplicity, I’ll combine the Johnson Base with Southwest Philly, and the Gentrified Challengers with the Trumpists (these are, in my opinion, the likely race-correlated dynamics that will play out). On the x-axis, let’s map a candidate’s percent of the vote among the Gentrifiers + Trumpists, and on the y-axis, a candidate’s percent of the vote in Southwest + the Johnson Base (assuming a two-person race). The candidate wins whenever the average of their proportions, weighted by t, is greater than 50%. The dashed lines show the win boundaries; candidates to the top-right of the lines win. Turnout matters less in District 2 than in District 3 because it swings less; District 2 didn’t experience the Krasner bump in 2017.

I’ll plot only the two-candidate vote for the top two candidates in the district for each race, to emulate a two-person race. (For City Council in 2015, I use Helen Gym and Isaiah Thomas, who were 4th and 5th in the district, and 5th and 6th citywide.)

View code
get_line <- function(x_total_votes, y_total_votes){
  ## solve p_x t_x+ p_y t_y > 50
  tot <- x_total_votes + y_total_votes
  tx <- x_total_votes / tot
  ty <- y_total_votes / tot

  slope <- -tx / ty
  intercept <- 50 / ty  # use 50 since proportions are x100
  c(intercept, slope)
}

line_2017 <- with(
  neighborhood_summary,
  get_line(
    (`Gentrified Challengers_total_votes` + Trumpists_total_votes)[candidate_name == "Larry Krasner"],
    (`Johnson Base_total_votes` + `Southwest_total_votes`)[candidate_name == "Larry Krasner"]
  )
)

line_2015 <- with(
  neighborhood_summary,
  get_line(
    (`Gentrified Challengers_total_votes` + Trumpists_total_votes)[candidate_name == "Jim Kenney"],
    (`Johnson Base_total_votes` + `Southwest_total_votes`)[candidate_name == "Jim Kenney"]
  )
)

## get the two-candidate vote
neighborhood_summary <- neighborhood_summary %>%
  group_by(election_name)  %>% 
  mutate(
    challenger_pvote_2cand = (
      `Gentrified Challengers_pvote` + Trumpists_pvote
      ) / sum(`Gentrified Challengers_pvote` + Trumpists_pvote),
    kenyatta_pvote_2cand = (`Southwest_pvote` + `Johnson Base_pvote`)/sum(`Southwest_pvote` + `Johnson Base_pvote`)
  )


library(ggrepel)

ggplot(
  neighborhood_summary,
  aes(
    x=100*challenger_pvote_2cand,
    y=100*kenyatta_pvote_2cand
  )
) +
  geom_point() +
  geom_text_repel(aes(label=candidate_name)) +
  geom_abline(
    intercept = c(line_2015[1], line_2017[1]),
    slope = c(line_2015[2], line_2017[2]),
    linetype="dashed"
  ) +
  coord_fixed() + 
  scale_x_continuous(
    "Gentrified Challenger + Trumpist percent of vote",
    breaks = seq(0,100,10)
  ) +
  scale_y_continuous(
    "Johnson Base + Southwest percent of vote",
    breaks = seq(0, 100, 10)
  ) +
  annotate(
    geom="text",
    label=paste(c(2015, 2017), "turnout"),
    x=c(10, 8),
    y=c(
      line_2015[1] + 10 * line_2015[2],
      line_2017[1] + 8 * line_2017[2]
    ),
    hjust=0,
    vjust=-0.2,
    angle = atan(c(line_2015[2], line_2017[2])) / pi * 180,
    color="grey40"
  )+
  annotate(
    geom="text",
    x = 80,
    y=75,
    label="Candidate wins",
    fontface="bold",
    color = strong_green
  ) +
  geom_hline(yintercept = 50, color="grey50") +
  geom_vline(xintercept = 50, color="grey50")+
  expand_limits(x=100, y=80)+
  theme_sixtysix() +
  ggtitle(
    "The relative strengths of District 2 neighborhoods",
    "Candidates to the top-right of the lines win. Points are two-candidate vote."
  )

[Figure: The relative strengths of District 2 neighborhoods]

Hillary Clinton, Larry Krasner, and Jim Kenney won the two-way votes in all sections. Kenyatta lost the Gentrified + Trumpist vote 59-41, but dominated Point Breeze and Southwest Philly. (Notice the points don’t match the table above because these are two-candidate votes.)

What would be Vidas’s path to victory? Helen Gym looks like a prototype (remember that there were actually 16 candidates for five spots, so this head-to-head analysis is hypothetical). Developer Ori Feibush didn’t do nearly well enough in Grad Hospital and East Passyunk to win. If Vidas burnishes her progressive credentials and pushes that percentage up to 80%, she could win even if Johnson doesn’t lose any support in his base.

Looking to May

We’re left in a grey area. There are reasons to believe that the recent scandals could drastically change Johnson’s support from 2015, but without polling, we have no way to tell exactly how much. It would take a huge change from 2015 for him to lose, but the combination of scandal and not running against Feibush could be that change.

Up next, I’ll stick with scandal-plagued incumbents and look at Henon’s District 6. Stay tuned!