Interactive Choropleths of Census Data with R


Phil Chodrow


June 19, 2022

In this short blog post, I’ll illustrate how to acquire and visualize US Census data choropleths using R. Since the last time I worked on spatial data, the R ecosystem surrounding Census data and geospatial visualization has evolved considerably. As a result, while I make an effort here to show the use of the latest packages, I’m limited a bit in my knowledge of what’s going on behind the scenes.

Our aim in this blog post will be to construct a choropleth map showing the percentage of Black residents in each of the block groups in Providence, with selected locations highlighted on top. We are just going to do a minimal version of this task: there are lots of customization and appearance possibilities for the map that we won’t mess with.

We’ll use the following R packages:

library(acs)       # only used for lists of counties

Next we’ll acquire a shapefile of Census blockgroups in Providence:

counties <- fips.county %>% 
    filter(State == "RI") %>% 
    filter(County.Name == "Providence County") %>%
    select(County.ANSI) %>% 

bg <- tigris::block_groups(state = "RI", county = counties, cb=TRUE) %>%
  sf::st_transform('+proj=longlat +datum=WGS84')

Now we’ll acquire Census demographic data, in this case from the 2015-2020 American Community Survey (ACS). We’ll acquire the ACS’ basic breakdown of race and ethnicity. Census Reporter is a useful site for finding the table codes for various topics of interest. We are using the tidycensus package to obtain the data.

# if you haven't done so before, you need to acquire and 
# log a Census API key prior to the get_acs call. 
# API keys can be obtained here: 
# once you have your key, run 
# census_api_key("text-of-key")

vars <- load_variables(2020, "acs5", cache = TRUE) %>% 
    filter(grepl("B03002", name),
           str_sub(label, start=-1) != ":") 

race_ethnicity <- get_acs(geography = "block group", 
                variables = vars$name, 
                year = 2020, 
                state = "RI")

The race_ethnicity data frame now contains the ACS estimates for each column (representing a racial group) and blockgroup (representing a geographic area). From this, we compute the percentage of Black residents in each blockgroup. We use both the columns B03002_014 (Black non-Hispanic) and B03002_004 (Black Hispanic).

pct_black <- race_ethnicity %>% 
    group_by(GEOID) %>% 
    summarize(pct_black = sum(estimate*(variable %in% c("B03002_014","B03002_004")))/sum(estimate))

Now we can join the results to the blockgroups, joining on the unique GEOID identifier for each area.

bg <- bg %>% 
    left_join(pct_black, by = c("GEOID"))

Our final data source is a CSV file of locations. This file contains simply latitude, longitude, and the name of the location, although richer information could also be provided and displayed.

locations <- read_csv("data/locations.csv")
# A tibble: 2 × 3
    lat   lon name   
  <dbl> <dbl> <chr>  
1  41.8 -71.4 ICERM  
2  41.8 -71.4 Den Den

Ok, plot time!

# for colors
mypalette <- colorNumeric( palette="inferno", domain=bg$pct_black, na.color="transparent")

# construct the map
pvd_map <- leaflet(bg) %>% 
    # background street map
    addProviderTiles("CartoDB.Positron") %>%  
    # choropleth: % Black per blockgroup
    addPolygons(fillColor = ~mypalette(pct_black), 
                weight = 2,
                fillOpacity = 0.7,
                smoothFactor = 0) %>% 
    # locations
    addCircleMarkers(lng = ~lon, 
                     lat = ~lat, 
                     popup = ~name, 
                     data = locations, 
                     color = "white",
                     fill = "black", 
                     radius = 3) %>% 
    setView(lng = -71.40702260001015, lat = 41.82458541770865, zoom = 13)

Lighter colors correspond to higher proportions of Black residents, while darker colors correspond to lower proportions. This map is draggable, zoomable, and clickable.

There are some issues here with the polygons not being totally correctly aligned. I believe that this is an issue with map projection that can be easily solved, but I haven’t looked into it further.