library(tigris)
library(tidycensus)
library(tidyverse)
library(acs) # only used for lists of counties
library(leaflet)
Interactive Choropleths of Census Data with R
In this short blog post, I’ll illustrate how to acquire and visualize US Census data choropleths using R. Since the last time I worked on spatial data, the R ecosystem surrounding Census data and geospatial visualization has evolved considerably. As a result, while I make an effort here to show the use of the latest packages, I’m limited a bit in my knowledge of what’s going on behind the scenes.
Our aim in this blog post will be to construct a choropleth map showing the percentage of Black residents in each of the block groups in Providence, with selected locations highlighted on top. We are just going to do a minimal version of this task: there are lots of customization and appearance possibilities for the map that we won’t mess with.
We’ll use the following R packages:
Next we’ll acquire a shapefile of Census blockgroups in Providence:
<- fips.county %>%
counties filter(State == "RI") %>%
filter(County.Name == "Providence County") %>%
select(County.ANSI) %>%
unlist()
<- tigris::block_groups(state = "RI", county = counties, cb=TRUE) %>%
bg ::st_transform('+proj=longlat +datum=WGS84') sf
Now we’ll acquire Census demographic data, in this case from the 2015-2020 American Community Survey (ACS). We’ll acquire the ACS’ basic breakdown of race and ethnicity. Census Reporter is a useful site for finding the table codes for various topics of interest. We are using the tidycensus package to obtain the data.
# if you haven't done so before, you need to acquire and
# log a Census API key prior to the get_acs call.
# API keys can be obtained here:
# http://api.census.gov/data/key_signup.html
# once you have your key, run
# census_api_key("text-of-key")
<- load_variables(2020, "acs5", cache = TRUE) %>%
vars filter(grepl("B03002", name),
str_sub(label, start=-1) != ":")
<- get_acs(geography = "block group",
race_ethnicity variables = vars$name,
year = 2020,
state = "RI")
The race_ethnicity
data frame now contains the ACS estimates for each column (representing a racial group) and blockgroup (representing a geographic area). From this, we compute the percentage of Black residents in each blockgroup. We use both the columns B03002_014
(Black non-Hispanic) and B03002_004
(Black Hispanic).
<- race_ethnicity %>%
pct_black group_by(GEOID) %>%
summarize(pct_black = sum(estimate*(variable %in% c("B03002_014","B03002_004")))/sum(estimate))
Now we can join the results to the blockgroups, joining on the unique GEOID
identifier for each area.
<- bg %>%
bg left_join(pct_black, by = c("GEOID"))
Our final data source is a CSV file of locations. This file contains simply latitude, longitude, and the name of the location, although richer information could also be provided and displayed.
<- read_csv("data/locations.csv")
locations locations
# A tibble: 2 × 3
lat lon name
<dbl> <dbl> <chr>
1 41.8 -71.4 ICERM
2 41.8 -71.4 Den Den
Ok, plot time!
# for colors
<- colorNumeric( palette="inferno", domain=bg$pct_black, na.color="transparent")
mypalette
# construct the map
<- leaflet(bg) %>%
pvd_map # background street map
addProviderTiles("CartoDB.Positron") %>%
# choropleth: % Black per blockgroup
addPolygons(fillColor = ~mypalette(pct_black),
stroke=FALSE,
weight = 2,
fillOpacity = 0.7,
smoothFactor = 0) %>%
# locations
addCircleMarkers(lng = ~lon,
lat = ~lat,
popup = ~name,
data = locations,
color = "white",
fill = "black",
radius = 3) %>%
setView(lng = -71.40702260001015, lat = 41.82458541770865, zoom = 13)
Lighter colors correspond to higher proportions of Black residents, while darker colors correspond to lower proportions. This map is draggable, zoomable, and clickable.
There are some issues here with the polygons not being totally correctly aligned. I believe that this is an issue with map projection that can be easily solved, but I haven’t looked into it further.