27  Making maps

There are many ways to make maps with R. The two general types are: vector maps where regions are represented by a set of points and lines around region boundaries and tile maps where a pre-drawn map is downloaded from a cloud service such as Google maps. In both cases, points, lines, and colour tiles can be added to display data on the map. Vector maps are drawn from a series of points and so can be drawn using many different projections, giving you the freedom to choose the projection most suitable for your map. Tile maps are images and can’t be reprojected, but can have a lot of information on them in the form of colours for terrain, labels, and points of interest. Tile maps can be used in a pan-and-zoom mode like many familiar online mapping tools.

In this lesson we will look at drawing vector maps.

27.1 Vector map

Here is a map of the 48 continental US states, with a quantitative variable used to shade each region. To change the variable used to colour the states, simply provide a new dataset with a numeric column and a text column called “state”. The map is drawn with ggplot, so the other features of ggplot including annotation, setting colour scales, labeling axes, etc., are all available to you and work the same way as for other visualizations we have created. The function expand_limits sets the range of latitudes and longitudes that will be displayed. Here I’m using a base-R syntax (states_map$long) to access these data and get the range (smallest and largest) longitude and latitude. Finally, the coord_map function creates a projection from a piece of a sphere onto a flat page. Other projections you can try include Mollweide (coord_map("mollweide")) and azimuthal equal area (coord_map("azequalarea")). There are many projections. See the help for coord_map for more information.

states_map <- map_data("state")
crimes <- as_tibble(USArrests, rownames="state") |> mutate(state = tolower(state))
ggplot(crimes, aes(map_id = state)) +
    geom_map(aes(fill = Murder), map = states_map) +
    expand_limits(x = states_map$long, y = states_map$lat) +
  coord_map("albers", 40, 100)

The map_data function works with maps from the maps package, including two world maps (world and world2) and detailed maps of France, Italy, New Zealand, the USA and its states. The maps package has several other datasets including a list of Canadian cities with population greater than about 1000. The world is a large and complex place and you will often need to obtain data and map boundaries for regions which are not readily available in this package. Some guidance appears at the end of the lesson, but this can be a challenging task.

The surface of the Earth is curved, so choices need to be made when plotting it on a flat surface. These choices are called projections. Here’s a map of France using an azimuthal equidistant projection (see mapproj::mapproject() for more examples):

map_data <- map_data('france') 
ggplot() + 
       geom_map(map = map_data,
                data = map_data,
                aes(map_id=region),
                fill = "white", colour = "#7f7f7f")  +
  expand_limits(x = map_data$long, y = map_data$lat) +
  coord_map("azequidistant")

Not all maps are political: here is a map of large lakes throughout the world.

lake_data <- map_data('lakes') 
ggplot() + 
       geom_map(data = lake_data, 
                aes(map_id=region),
                map = lake_data,
                fill = "blue", colour = NA, alpha = 1.0)  +
  expand_limits(x = c(-130, 120), y = c(-50, 70)) + 
  coord_map("albers", 0, 0) +
  labs(x= "", y = "")

Incidentally, a frequently used projection for the USA is the Bonne. Revise the USA map to use that projection by adding the following code + coord_map("bonne", 45). (Albers: coord_map("albers", 40, 100) and Lambert: coord_map("lambert", 40, 100) are also used, although Lambert makes the USA look very wide in the North.)

Political boundaries for the world are available as “world” (centred on the Atlantic Ocean) or “world2” (centered on the Pacific Ocean.)

WorldData <- map_data('world')  
ggplot() + 
  geom_map(data = WorldData, 
           aes(map_id=region),
           map = WorldData,
           fill = "white", colour = "#7f7f7f", alpha = 1, linewidth=0.25) +
  expand_limits(x = c(-180, 180), y = c(-80, 80)) + 
  theme_bw() +
  theme(panel.background = element_rect(fill = "darkblue")) +
  labs(x="", y="")

You can select a specific country if you want, for example, Canada. This is a reasonable map of Canada made up of 141 regions (main land mass plus islands), but it doesn’t have any provincial boundaries, lakes, or other detail:

my_map <- map_data('world', region='Canada')  
ggplot() + 
  geom_map(data = my_map, aes(map_id=region),
           map = my_map,
           fill = "#5C4033", colour = "white") +
  theme_bw() +
  expand_limits(x = my_map$long, y = my_map$lat) + 
  theme(panel.background = element_rect(fill = "#00AAAA"),
        panel.border = element_blank(),
        axis.text = element_blank(),
        axis.ticks = element_blank()) +
  labs(x="", y="") +
  coord_map("albers", 60, 90)

Here is a list of 252 regions available in the world map (abbreviated here to the first 40).

WorldData |> pull(region) |> unique() |> sort() |> head(40) |> cat(sep = ", ")
Afghanistan, Albania, Algeria, American Samoa, Andorra, Angola, Anguilla, Antarctica, Antigua, Argentina, Armenia, Aruba, Ascension Island, Australia, Austria, Azerbaijan, Azores, Bahamas, Bahrain, Bangladesh, Barbados, Barbuda, Belarus, Belgium, Belize, Benin, Bermuda, Bhutan, Bolivia, Bonaire, Bosnia and Herzegovina, Botswana, Brazil, Brunei, Bulgaria, Burkina Faso, Burundi, Cambodia, Cameroon, Canada

The map of the globe and even of Canada does not look good at high latitudes, especially if either pole is included. Here is a projection that is a bit more suitable for those regions. The geom_map function is not perfect; it creates stray lines when a region is clipped by the projection.

p1 <- ggplot() +
  geom_map(data = WorldData, 
           map = WorldData, 
           aes(map_id = region),
           fill = "gray80", colour = "#7f7f7f", alpha = 0.5, linewidth=0.5) +
  labs(x="", y="") + theme_bw() +
  theme(axis.text = element_blank(), 
        axis.ticks = element_blank(), 
        rect = element_blank()) +
  expand_limits(x = c(-180, 180), y = c(-90, 90))
p2a <- p1 + coord_map("perspective", 2.5, 
                      orientation=c(60, -100, 0))
      # observer distance 2.5 Earth radii 
p2b <- p1 + coord_map("perspective", 2.5, 
                      orientation=c(-60, 80, 0)) 
      # try also orthographic, with no observer distance
p2a + p2b

You should experiment with the geom_map function. Here are the essential details you must include for each map:

  • You need a data frame with map data, including columns called long and lat and a categorical variable for each polygon usually called region
  • You need to provide geom_map with the name of the data frame (data =) and the name of the map data (we’ve used the same for both here, but see below for other examples)
  • You need to provide the aesthetic mapping from the variable region to the aesthetic map_id
  • If you are just showing a map (not plotting data – see below) you need to set the limits of the plot (expand_limits is one way) or you will get an empty plot

27.2 Detailed maps of Canada

Detailed maps of Canada are not part of the maps package in R, but map data are available in many places on the internet. I obtained a “shapefile” from Statistics Canada and transformed it to work with R. The packages needed to do this have been reworked recently, so lots of advice on the internet is out of date. Look for the packages sf, stars, terra, geojsonio and rmapshaper.

I used shape files for the 2011 census divisions in Canada. There are many other options. Choose ARCGis .shp file format. Pick the cartographic boundary file. You should get a zip file called gpr_000b11a_e. The shapefiles are very detailed and need to be simplified before being plotted with R. You can obtain the simplified files here.

27.2.1 Draw the map

Download the file, read the map data into R and get to mapmaking.

canada_cd <- st_read("static/canada_cd_sim.geojson", quiet = TRUE) 

The official projection for maps of Canada is the Lambert conformal conic (“lcc”) with the following latitude and longitude parameters. These complicated strings are now considered the old way of making projections. A function crs in the package terra can now be used instead – details not shown!

crs_string = "+proj=lcc +lat_1=49 +lat_2=77 +lon_0=-91.52 +x_0=0 +y_0=0 +datum=NAD83 +units=m +no_defs"

Here is a theme for the map to remove axes, ticks, borders and more.

theme_map <- function(base_size=9, base_family="") { 
    theme_bw(base_size=base_size, base_family=base_family) %+replace%
        theme(axis.line=element_blank(),
              axis.text=element_blank(),
              axis.ticks=element_blank(),
              axis.title=element_blank(),
              panel.background=element_blank(),
              panel.border=element_blank(),
              panel.grid=element_blank(),
              panel.spacing=unit(0, "lines"),
              plot.background=element_blank(),
              legend.justification = c(0,0),
              legend.position = c(0,0)
        )
}

Now we draw the map with some colours from a palette.

map_colors <- RColorBrewer::brewer.pal(9, "Pastel1") |> rep(2)   # 18 colours, 9 repeated

ggplot() +
    geom_sf(aes(fill = PRUID), color = "gray60", linewidth = 0.1, data = canada_cd ) +
    coord_sf(crs = crs_string) + 
  scale_fill_manual(values = map_colors) +
    guides(fill = FALSE) +
    theme_map() +
    theme(panel.grid.major = element_line(color = "white"))
Warning: The `<scale>` argument of `guides()` cannot be `FALSE`. Use "none" instead as
of ggplot2 3.3.4.

If we have a list of latitudes and longitudes, we can add points to the map.

city_coords <- tribble( ~city, ~lat, ~long,
"Vancouver",    49.2827,    -123.1207,
"Calgary",  51.0447,    -114.0719,
"Edmonton", 53.5461,    -113.4938,
"Toronto",  43.6532,    -79.3832,
"Ottawa",   45.4215,    -75.6972,
"Montreal", 45.5017,    -73.5673,
"Halifax", 44.6488, -63.5752)

Since our axes on our plot are not just longitude and latitude, we need to convert (project) the longitudes and latitudes to map coordinates. There are several steps: select only the numerical latitude and longitude coordinates, convert the data frame to a matrix, convert to a “simple feature” consisting of a set of points, make it back into a table, and add a projection onto to the data.

sf_cities = city_coords |>
    select(long, lat) |> 
    as.matrix() |> 
    st_multipoint(dim = 'XY') |> 
    st_sfc() |> 
    st_set_crs(4326) # 4269 also works

Make the map with projected points.

ggplot() +
    geom_sf(aes(fill = PRUID), color = "gray60", linewidth = 0.1, data = canada_cd) +
    geom_sf(data = sf_cities, color = '#001e73', alpha = 0.5, size = 3) + # 17
    coord_sf(crs = crs_string) +
    scale_fill_manual(values = map_colors) +
    guides(fill = FALSE) +
    theme_map() +
    theme(panel.grid.major = element_line(color = "white"),
          legend.key = element_rect(color = "gray40", size = 0.1))
Warning: The `size` argument of `element_rect()` is deprecated as of ggplot2 3.4.0.
ℹ Please use the `linewidth` argument instead.

It’s easy to focus in on the Maritimes region of Canada – just filter the data to include only the province or census districts you want. Here are some of the names in the map data that you can use for filtering.

canada_cd |> pull(PRNAME) |> unique()
 [1] "Manitoba"                                           
 [2] "British Columbia / Colombie-Britannique"            
 [3] "Alberta"                                            
 [4] "Saskatchewan"                                       
 [5] "Ontario"                                            
 [6] "Quebec / Québec"                                    
 [7] "Newfoundland and Labrador / Terre-Neuve-et-Labrador"
 [8] "Nova Scotia / Nouvelle-Écosse"                      
 [9] "New Brunswick / Nouveau-Brunswick"                  
[10] "Prince Edward Island / Île-du-Prince-Édouard"       
[11] "Northwest Territories / Territoires du Nord-Ouest"  
[12] "Nunavut"                                            
[13] "Yukon"                                              
canada_cd |> pull(CDNAME) |> unique() |> sample(10)
 [1] "Le Domaine-du-Roy"              "L'Islet"                       
 [3] "L'Assomption"                   "Region 5"                      
 [5] "Cumberland"                     "Stormont, Dundas and Glengarry"
 [7] "Les Jardins-de-Napierville"     "Division No. 20"               
 [9] "Prince Edward"                  "Rimouski-Neigette"             

I’ve changed the latitude and longitude parameters for the projection to values more suitable for this part of Canada.

crs_string2 = "+proj=lcc +lat_1=40 +lat_2=50 +lon_0=-75 +x_0=0 +y_0=0 +datum=NAD83 +units=m +no_defs"
ggplot() +
    # geom_sf(data = sf_cities, color = '#001e73', alpha = 0.5, size = 3) +
    geom_sf(aes(fill = PRUID), color = "gray60", linewidth = 0.1, 
            data = canada_cd |> 
              filter(PRNAME %in% c("Nova Scotia / Nouvelle-Écosse", "New Brunswick / Nouveau-Brunswick", "Prince Edward Island / Île-du-Prince-Édouard"))) +
    coord_sf(crs = crs_string2) +
    scale_fill_manual(values = map_colors) +
    guides(fill = FALSE) +
    theme_map() +
    theme(panel.grid.major = element_line(color = "white"),
          legend.key = element_rect(color = "gray40", size = 0.1))

You can also crop the data to be drawn or the projected map. This is a bit more complex than you might expect since you need to be sure you are specifying the area to be plotted and the actual projected coordinates in the same coordinate system. You can easily get errors (invalid points in a projection), empty maps, or croppings that don’t look right. See the link at the start of this paragraph for several approaches.

crs_string2 = "+proj=lcc +lat_1=40 +lat_2=50 +lon_0=-75 +x_0=0 +y_0=0 +datum=NAD83 +units=m +no_defs"
zoom_to <- c(-64.3683, 45.8979) # Sackville, New Brunswick; could try c(-63.5752, 44.6488)  # Halifax
zoom_level <- 6
C <- 40075016.686   # ~ circumference of Earth in meters
x_span <- C / 2^zoom_level
y_span <- C / 2^(zoom_level)
zoom_to_xy <- st_transform(st_sfc(st_point(zoom_to), crs = 4326),
                           crs = crs_string2)
disp_window <- st_sfc(
    st_point(st_coordinates(zoom_to_xy - c(x_span / 2, y_span / 2))),
    st_point(st_coordinates(zoom_to_xy + c(x_span / 2, y_span / 2))),
    crs = crs_string2
)
ggplot() +
    geom_sf(aes(fill = PRUID), color = "gray60", size = 0.1, 
            data = canada_cd) +
  # geom_sf(data = sf_cities, color = '#001e73', alpha = 0.5, size = 3) + # 17
  geom_sf(data = zoom_to_xy, color = 'red') +
  coord_sf(xlim = st_coordinates(disp_window)[,'X'],
           ylim = st_coordinates(disp_window)[,'Y'],
           crs = crs_string2, datum = crs_string2) +
    scale_fill_manual(values = map_colors) +
    guides(fill = FALSE) +
    theme_map() +
    theme(panel.grid.major = element_line(color = "white"),
          legend.key = element_rect(color = "gray40", size = 0.1))

27.2.2 Just the map, please

Here is a file of province boundaries that can be used without showing the census district regions. I’ve added a latitude-longitude grid to the map.

canada_prov <- geojson_read("https://gist.githubusercontent.com/mikelotis/2156d7c170d10d2c77cb79424fe2137d/raw/7a13748ed7ea5ba64876c77c53b6cb64dd5c3ab0/canada-province.geojson", what="sp") |> st_as_sf()

ggplot() +
    geom_sf(aes(fill = name), 
            color = "gray60", size = 0.1, 
            data = canada_prov ) +
    coord_sf(crs = crs_string) + 
  scale_fill_manual(values = map_colors) +
    guides(fill = FALSE) +
    theme_map() +
    theme(panel.grid.major = element_line(color = "lightgray"),
          panel.grid.minor = element_line(color = "lightgray")) +
  scale_x_continuous(breaks = seq(-160, 0, 10)) +
  scale_y_continuous(breaks = seq(40, 85, 5))

27.3 Summary

Map making is complex for at least two reasons: obtaining the data to describe complex political boundaries and using suitable projections for your data. This lesson introduced you to some simple solutions to both problems and gave a starting point for learning more about the complexity of making customized maps.

27.4 Packages used

In addition to tidyverse which has ggplot and the geom_map function, I’ve used

  • mapproj for map projections
  • sf for working with “simple features”, meaning points, lines, regions and other features to draw on maps, e.g., functions st_read, st_as_sf
  • geojsonio for working with some map data you might find on the internet
  • RColorBrewer for making colour scales; more on this in a later lesson
  • rmapshaper
  • patchwork for combining ggplots together

If you are not sure a package is needed, don’t include it in your R markdown document, but if something doesn’t work, you have this list to help you figure out what package is required.

27.5 Further reading

27.5.1 Maps of Canada

  • The blog I used as source material for the detailed maps of Canada
  • Province and census division shape files from Statistics Canada
  • Election data for boundaries and chloropleths and populated weighted chloropleths
  • A map showing wind turbines in Maritimes