View code
### load required libraries
library(tidyverse)
library(magrittr)
library(extrafont)
library(rphl)
library(ggiraph)
### load required data
## connect to base ArcGIS Rest API for citywide mortality metrics
death_url <- httr::parse_url("https://services.arcgis.com/fLeGjb7u4uXqeF9q/ArcGIS/rest/services/Vital_Mortality_Cty/FeatureServer/0/query")
## build query for ArcGIS Rest API
death_url$query <- list(where = "METRIC_NAME = 'life_expectancy' AND AGE_CATEGORY = 'At birth'",
# allows > 2,000 records to be loaded
resultType = "standard",
# returns all fields
outFields = "*",
returnGeometry = FALSE,
# specifies format of data as JSON
f = "json")
## load and format data frame of citywide mortality metrics
death_metrics <- jsonlite::fromJSON(httr::build_url(death_url))$features$attributes %>%
janitor::clean_names() %>%
select(-c(objectid,rank)) %>%
mutate(metric_name = "Life expectancy at birth",
quality_flag = case_when(race_ethnicity %in% c("Asian (NH)","Hispanic") ~ "Some concerns due to small population size",
TRUE ~ "None"),
year = as.character(year))
### create functions for figures
## create general ggplot themes
theme_cdip_data_lab <- function() {
theme_bw(base_family = "Open Sans") +
theme(
panel.grid.minor = element_blank(),
panel.border = element_blank(),
axis.title.x = element_blank(),
axis.title.y = element_text(size=16),
axis.ticks = element_blank(),
legend.title = element_blank(),
legend.key = element_rect(colour = "transparent", fill = "white"),
plot.title = element_text(size=15,face = "bold"),
plot.caption = element_text(size=13,colour = "#7a8489"),
axis.text = element_text(size=14),
legend.text = element_text(size=14),
strip.background = element_rect(color = "#999690", fill = "#999690"),
strip.text = element_text(face = "bold", color = "#ffffff", size = 14),
panel.spacing.y = unit(1.5,"lines")
)
}
## function to format numbers nicely
comma <- function(x) format(x, big.mark = ",",scientific=FALSE)
## set race/ethnicity colors for all plots
race_eth_colors <- c("All races/ethnicities"="#2176d2",
"Asian (NH)"="#96c9ff",
"Black (NH)"= "#f99300",
"Hispanic"="#58c04d",
"White (NH)"="#f3c613")
## set sex colors for all plots
sex_colors <- c("All sexes"="#2176d2",
"Female"="#f3c613",
"Male"="#96c9ff")
## function to make life expectancy figures
e_0_figure <- function(race_eth_input,fill_variable,fill_colors) {
base_plot <- death_metrics %>%
# filter data to life expectancy metric in 2019
filter(race_ethnicity %in% race_eth_input,
year == 2019,
metric_name == "Life expectancy at birth") %>%
# set race_ethnicity as x-axis variable
ggplot(aes(x=race_ethnicity,
# set metric_value (i.e., life expectancy) as y-axis variable
y=metric_value,
# pipe in desired fill variable (sex or race_ethnicity)
fill = {{fill_variable}},
# define text in tooltip
tooltip = str_c("Value: ",comma(round(metric_value,2)),
"\nMetric: ",metric_name,
"\nYear: ",year,
"\nRace/ethnicity: ",race_ethnicity,
"\nQuality concerns: ",quality_flag))) +
# manually set color scheme
scale_fill_manual(values=fill_colors) +
# specify column geometry
geom_col_interactive() +
# facet by sex
facet_wrap(~sex) +
# apply pre-defined ggplot theme
theme_cdip_data_lab() +
# set figure and axis titles
labs(title = "Life expectancy at birth in Philadelphia, PA (2019)",
caption = "Source: PA Vital Registration System; Notes: NH = non-Hispanic",
y = "Years")
# if race/ethnicity input is longer than 1, set x-axis labels to 45 degree angle
if (length(race_eth_input) > 1) {
base_plot <- base_plot +
guides(x = guide_axis(angle = 56))
} else {base_plot}
# convert ggplot object to girafe object
girafe_plot <- girafe(ggobj = base_plot,
width_svg = 12,
height_svg = 6)
# make tooltip color match color of corresponding point
girafe_options(x = girafe_plot,
opts_tooltip(use_fill = TRUE,
css = "font-family: 'Open Sans';font-size: 14px;color:white;"))
}
In part two of this series, we will explore life expectancy at birth
in Philadelphia. Missed part one? Catch up here.
How does life expectancy at birth differ by sex and
race/ethnicity?
Life expectancy at birth is typically estimated in a given year as
the number of years a person born in that year would live, assuming
mortality rates remain constant throughout that person’s life. This
assumption is false, of course; a person born in 2022 would not
experience the same mortality rates at age 65 as those who were 65 in
2022. Instead, they would experience mortality rates at age 65 in 2087,
which we might expect to be lower than the 2022 rates given recent
trends in mortality reduction over time.
Because of this assumption, life expectancy is sensitive to mortality
shocks. For example, life expectancy plummeted during the 1918 influenza
pandemic; however, life expectancy was calculated for a hypothetical
person who would live every year of their life in 1918, which very few
people actually experienced. The same is true for the declines in life
expectancy we expect to see during the COVID-19 pandemic when 2020 data
are released: most people (except for infants who die the same year
they’re born) will not actually live their entire life subject to these
elevated mortality rates.
View code
e0_sex <- e_0_figure(race_eth_input = "All races/ethnicities",
fill_variable = sex,
fill_colors = sex_colors)
Click here
to expand and visit PhilaStats
to customize.
Life expectancy in Philadelphia was stable at around 76 years between
2012 and 2019. Mirroring a trend we see pretty much everywhere in the
world, life expectancy in 2019 was higher for women (80.1 years)
than it was for men (73.0 years). Life expectancy in Philadelphia was
lower than in the US as a whole
– only by a few months for women, but by more than three years for
men.
View code
e0_race <- e_0_figure(race_eth_input = c("All races/ethnicities","Asian (NH)","Black (NH)","Hispanic","White (NH)"),
fill_variable = race_ethnicity,
fill_colors = race_eth_colors)
Click here
to expand and visit PhilaStats
to customize.
In addition to differences by sex, there was substantial variation in
life expectancy by race/ethnicity as well. Of any sex and race/ethnicity
combination, life expectancy in 2019 was lowest for non-Hispanic Black
men at just 69.0 years. In comparison, life expectancy for non-Hispanic
white men and Hispanic men was 74.8 years and 75.8 years, respectively.
There was a similar trend among women in 2019: non-Hispanic Black women
had a life expectancy of 77.4 years, compared to 81.0 years for
non-Hispanic white women and 85.2 years for Hispanic women.
I haven’t yet mentioned life expectancy among non-Hispanic Asian
groups in Philadelphia. You’ll see from the above graph that
non-Hispanic Asian men and women have the highest life expectancy of any
race/ethnicity group, at 90.7 years for both women and men.
Interestingly, there’s no sex difference! This is unusual. Even more
striking is the size of the non-Hispanic Asian advantage in life
expectancy. Life expectancy for non-Hispanic Asian women is about 10.5
years higher than life expectancy for all women in Philadelphia. Life
expectancy for non-Hispanic Asian men is nearly 18 years higher
than it is for all men.
What could be accounting for the large non-Hispanic Asian life
expectancy advantage we see in Philadelphia’s vital statistics? Is this
difference real? There are two possibilities: either vital statistics
are accurately capturing longer lives lived by non-Hispanic Asian
Philadelphians compared to the rest of the population; or there could be
problems in our data that suggest an advantage that doesn’t exist or
overestimate the size of the advantage.
We have several reasons to believe that non-Hispanic Asian
Philadelphians truly do live longer than the rest of the population.
This pattern – albeit at a smaller scale – has been found for the US as
a whole.
Research has found
that Asian American populations have lower mortality rates than other
race/ethnicity groups in the US, no matter what specific cause of death
is considered. Another stream of research has found that foreign-born
Americans live longer than US-born
Americans, and Asian-born Americans have exceptionally
high longevity.
Are non-Hispanic Asian Philadelphians were more likely that other
Philadelphians to be foreign-born? If so, this could contribute to the
life expectancy advantage. This is a question we can answer relatively
easily using Kyle Walker’s brilliant R package tidycensus.
View code
### Step 1: Set up for accessing data -----------------------------
## load required packages for accessing and analyzing Census data
library(tidycensus)
library(srvyr)
## Set Census API key (can be obtained from http://api.census.gov/data/key_signup.html)
# census_api_key("enter API key here", install = T)
## pums_variables: searchable data catalog built into tidycensus
pums_vars <- pums_variables %>%
filter(year == 2019, survey == "acs5") %>%
distinct(var_code, var_label, data_type, level) %>%
filter(level == "person", !str_detect(var_label, "allocation flag"))
## list of Philadelphia pumas
phila_pumas <- c("03201", "03202", "03203", "03204", "03205", "03206", "03207", "03208","03209", "03210", "03211")
### Step 2: Importing data from tidycensus::get_pums() ---------------------
if(file.exists("cached/phila_pums_2019.RDS")){
phila_pums_2019 <- readRDS("cached/phila_pums_2019.RDS")
} else {
phila_pums_2019 <- get_pums(
state = "PA",
puma = phila_pumas,
survey = "acs5",
year = 2019,
recode = TRUE, # note: not available for 2020 yet
# select desired variables to include
variables = c("RAC1P","HISP","NATIVITY"),
# load person level weights for weighted survey object
rep_weights = "person") %>%
# recode NATIVITIY variable into binary native and foreign born variables
mutate(native_born = NATIVITY == 1,
foreign_born = NATIVITY == 2,
# create recoded race/ethnicity field based on RAC1P and HISP variables
race_ethnicity = factor(
case_when(HISP != "01" ~ "Hispanic",
RAC1P == 6 & HISP == "01" ~ "Asian, non-Hispanic",
RAC1P == 2 & HISP == "01" ~ "Black, non-Hispanic",
RAC1P == 1 & HISP == "01" ~ "White, non-Hispanic",
TRUE ~ "Other, non-Hispanic")
),
# rearrange levels of factor so "Other, non-Hispanic" is last
race_ethnicity = fct_relevel(race_ethnicity,"Other, non-Hispanic",after=4L))
}
### Step 3 - Create weighted survey object ----------------------------
## create weighted survey object with person-level weights
w_phila_pums_2019 <- to_survey(phila_pums_2019, type = "person")
### Step 4 - Generate table output ----------------------------
## create formatted output HTML table
w_phila_pums_2019 %>%
# group by race and Hispanic ethnicity variables
group_by(race_ethnicity) %>%
# select native_born and foreign_born columns to summarise
summarise(across(ends_with("_born"),
# create list of functions to apply to selected columns: survey mean and total
# vartype = NULL excludes uncertainty estimates from output
.fns = list(p = ~survey_mean(.x, vartype = NULL),
t = ~survey_total(.x, vartype = NULL)))) %>%
# add total population summary row
bind_rows(w_phila_pums_2019 %>%
# select native_born and foreign_born columns to summarise
summarise(across(ends_with("_born"),
# create list of functions to apply to selected columns: survey mean and total
# vartype = NULL excludes uncertainty estimates from output
.fns = list(p = ~survey_mean(.x, vartype = NULL),
t = ~survey_total(.x, vartype = NULL)))) %>%
mutate(race_ethnicity = "Total Population")) %>%
# format %s and totals nicely
mutate(across(ends_with("_p"),~str_c(round(.x*100),"%")),
across(ends_with("_t"),~comma(.x))) %>%
# convert table to HTML format
knitr::kable(format = "html",
# clean up column names
col.names = c("Race/Ethnicity",
"% Native Born",
"# Native Born",
"% Foreign Born",
"# Foreign Born"),
caption = "Nativity by Race/Ethnicity in Philadelphia, PA (2019)") %>%
# add footnote with data source to table
kableExtra::footnote(general = "Source: American Community Survey Public Use Microdata Series") %>%
# add nice styling to HTML table
kableExtra::kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
full_width = T)
Nativity by Race/Ethnicity in Philadelphia, PA (2019)
Race/Ethnicity
|
% Native Born
|
# Native Born
|
% Foreign Born
|
# Foreign Born
|
Asian, non-Hispanic
|
34%
|
38,492
|
66%
|
75,241
|
Black, non-Hispanic
|
93%
|
595,028
|
7%
|
47,731
|
Hispanic
|
81%
|
187,232
|
19%
|
45,000
|
White, non-Hispanic
|
91%
|
495,663
|
9%
|
48,834
|
Other, non-Hispanic
|
85%
|
39,645
|
15%
|
6,967
|
Total Population
|
86%
|
1,356,060
|
14%
|
223,773
|
Note:
|
Source: American Community Survey Public Use Microdata
Series
|
Click here
to expand.
The table above shows that 66% of non-Hispanic Asian Philadelphians
were born outside of the US, compared to only 14% for the city as a
whole. So, non-Hispanic Asian Philadelphians are disproportionately
foreign-born, which may explain why life expectancy for this group is so
much higher than any other race/ethnicity group.
Okay, so we hope we’ve convinced you that non-Hispanic Asian
Philadelphians really do have a life expectancy advantage over other
race/ethnicity groups. But the size of the advantage might be
overestimated because of inaccuracies in the data.
One documented
problem is that race/ethnicity is inconsistency recorded, especially for
Asian Americans. Estimates of mortality, which feed into life expectancy
calculations, typically use data from two sources: the race/ethnicity
recorded on death certificates is used for the numerator (number of
deaths) and the race/ethnicity from population estimates like the Census
is used for the denominator (number of people). When race/ethnicity is
recorded in the Census, it’s almost always reported by the person
themselves, or by another member of the household. Race/ethnicity on a
death certificate, however, is more likely to be reported by someone who
may not know the person’s race/ethnicity, such as a physician or medical
examiner.
Studies have compared how the race/ethnicity of specific individuals
is recorded in various data sources over time to assess the consistency
and accuracy of these data sources. For example, a person’s
race/ethnicity may be recorded in one way in a decennial Census (when
race/ethnicity is reported either by themselves or a household member),
and another way on their death certificate (which may be reported by
someone who does not know the person’s race/ethnicity). These studies
have found
that race/ethnicity is reported fairly accurately on the death
certificate – meaning, the same race/ethnicity is identified on the
death certificate as was self-selected on an earlier Census form or
survey – only for non-Hispanic white and non-Hispanic Black Americans.
Hispanic Americans, Asian Americans, and especially American Indians and
Alaska Natives (AIAN) were underreported on death certificates. That is,
someone who self-identified as Asian was more likely to be misclassified
as some other race/ethnicity on their death certificate.
If death certificates underestimate the number of Asian Americans (or
Asian Philadelphians) who die, this means the mortality rate for Asian
Americans is underestimated, and thus life expectancy for Asian
Americans is overestimated. The same is true for Hispanic Americans and
AIAN Americans. When you see standard life expectancy estimates based on
data from death certificates and the Census for Asian, Hispanic, and
AIAN populations in the US, they are likely too high. We don’t know
exactly how elevated these estimates are, but some recent research suggests
the overestimation is significant in statistical terms.
So our conclusion about non-Hispanic Asian life expectancy in
Philadelphia is the following. Non-Hispanic Asian Philadelphians live
quite a bit longer than non-Asian Philadelphians. This is likely due to
the fact that Non-Hispanic Asian Philadelphians are disproportionately
foreign-born compared to other groups. However, these data are not
perfect, and we think the true life expectancy advantage enjoyed by
non-Hispanic Asian Philadelphians may be slightly smaller than what we
see on PhilaStats.
Thanks for exploring PhilaStats
with us!
We would love to hear how you’re using PhilaStats. You can reach
the Data Lab by email, or find Megan,
the Director of the Data Lab, on Twitter.