Here is a brief introduction to drawing maps and working with map coordinates in R.



Read me

This page uses maps in rnaturalearth and uses the sf package to handle spatial data. The previous standard rgdal was retired in 2023. Here is an introduction to sf objects. Here is a longer explanation.

The maps use latitude and longitude coordinates by default (= coordinate reference system WGS84; its EPSG code is 4326). Greenwich is the 0 longitude point (prime meridian). The same coordinate reference system is used by GPS.

Packages used here

library(ggplot2)
library(ggrepel)
library(sf)
library(rnaturalearth)
library(rnaturalearthdata)
library(measurements)



Draw a map

Extract preexisting maps from rnaturalearth. Three scales are available: 110 (“small”), 50 (“medium”), or 10 (“large”). 110 is the coarsest scale, and 10 is the finest scale.

In ggplot(), theme_bw() includes grid lines on the map whereas theme_classic() leaves them out.


World coastlines

coord_sf() adds numbered axes for latitude and longitude.

By default, the international date line at -180/180 degrees forms the left and right edges of this plot.

To include country borders, use ne_countries() instead of ne_coastline().

world <- ne_coastline(scale = "medium", returnclass = "sf")
ggplot(world) +
  geom_sf(lwd = 0.2, fill = "white") +
  coord_sf(expand = FALSE) + 
  theme_bw()


Zoom in

Make a map of an area defined by latitude and longitude. For example, the Western hemisphere north of the Tropic of Capricorn.

Here I’m showing the use of ne_countries() to include country borders.

world <- ne_countries(scale = "medium", returnclass = "sf")
ggplot(world) +
  geom_sf(lwd = 0.2) +
  scale_x_continuous(limits = c(-179, -40)) +
  scale_y_continuous(limits = c(23.4, 90)) +
  coord_sf(expand = FALSE) +
  theme_classic()


A continent

africa <- ne_countries(scale = "large", returnclass = "sf", continent = "Africa")
ggplot(africa) +
  geom_sf(lwd = 0.2) +
  theme_classic()

To see all the continents available, use the following.

z <- ne_countries(scale = 110, returnclass = "sf", type = "countries")
unique(z$continent)


A country

canada <- ne_countries(country = "Canada", scale = "large", returnclass = "sf")
ggplot(canada) +
  geom_sf(lwd = 0.2) +
  theme_classic()

To see all the countries available, enter the following.

z <- ne_countries(scale = 110, returnclass = "sf", type = "countries")
unique(z$admin)


A province or state

canada <- ne_states(country = "Canada", returnclass = "sf")
bc <- canada[canada$name == "British Columbia", ]
ggplot(bc) +
  geom_sf(lwd = 0.2) +
  theme_classic()

usa <- ne_states(country = "United States of America", returnclass = "sf")
ca <- usa[usa$name == "California", ]
ggplot(ca) +
  geom_sf(lwd = 0.2) +
  theme_bw()


Equal area projection

Use st_transform() to convert map coordinates to R’s equal area projection, which retains the approximate relative size of geographic areas.

Use scale_x_continuous() and scale_y_continuous() to control the axis ticks, which might otherwise tend to be sparse.

canadaEA <- st_transform(canada, crs = "+proj=eqearth +wktext")
ggplot(canadaEA) +
  geom_sf(lwd = 0.2) +
  scale_x_continuous(breaks = seq(-120, 60, by = 20)) +
  theme_classic()


Span international date line

By default, the world map is centred at Greenwich (longitude 0 degrees) and the edge of maps occurs at the international date line in the mid-Pacific (-180/180 degrees). Re-centering a map from a Pacific perspective (by added 360 to all negative longitudes) doesn’t work in all cases. Map components (polygons) are drawn from one side to the other, causing horizontal lines to appear. Run the following to see what I mean.

# Not run
world = ne_coastline(scale = "medium", returnclass = "sf")
world_2 = st_shift_longitude(world)
ggplot(world_2) +
  geom_sf(lwd = 0.2, fill = "white") +
  coord_sf(expand = FALSE) + 
  theme_classic()

The best solution is to change the coordinate reference system so that the meridian is set at a new location (e.g., to the international date line at 180 degrees).

The following map of the North Pacific splits the map components (polygons) at the new meridian before projecting to a new re-centered coordinate system. Code is based on this page.

world = ne_coastline(scale = "medium", returnclass = "sf")

# Decide new meridian
meridian_2 <- -180

# Split world at new meridian
world_2 <- st_break_antimeridian(world, lon_0 = meridian_2)
# Warning: attribute variables are assumed to be spatially constant throughout all geometries

# Now reproject to new meridian center
world_3 <- st_transform(world_2, 
  crs = paste("+proj=latlong +datum=WGS84 +lon_0=", meridian_2) )
ggplot(world_3) +
  geom_sf(lwd = 0.2, fill = "white") +
  # coord_sf(xlim = c(-100, 100), ylim = c(0, 90), expand = FALSE) +
  coord_sf(expand = FALSE) +
  scale_x_continuous(limits = c(-100, 100)) +
  scale_y_continuous(limits = c(0, 90)) +
  theme_bw()



Plot points on map

As an example, we’ll plot the ignition locations of all recorded forest fires in British Columbia in 1990. The data were downloaded from the Canadian National Fire DataBase.

BCfires1990 <- read.csv("https://www.zoology.ubc.ca/~schluter/R/csv/BCfires1990.csv")
head(BCfires1990)
##   SRC_AGENCY     FIRE_ID FIRENAME LATITUDE LONGITUDE YEAR MONTH DAY   REP_DATE ATTK_DATE OUT_DATE    DECADE SIZE_HA CAUSE PROTZONE
## 1         BC 1990-R90003       NA   59.929  -128.200 1990     3  15 1990-03-15        NA       NA 1990-1999     0.1     H       NA
## 2         BC 1990-R90004       NA   59.911  -128.431 1990     5   2 1990-05-02        NA       NA 1990-1999     0.1     H       NA
## 3         BC 1990-R90005       NA   59.920  -128.484 1990     6  16 1990-06-16        NA       NA 1990-1999      NA     H       NA
## 4         BC 1990-R90012       NA   59.848  -128.168 1990     7  26 1990-07-26        NA       NA 1990-1999     1.0     H       NA
## 5         BC 1990-R90014       NA   59.661  -128.718 1990     7  30 1990-07-30        NA       NA 1990-1999     0.1     L       NA
## 6         BC 1990-R90027       NA   59.321  -129.056 1990     9  18 1990-09-18        NA       NA 1990-1999     0.2     H       NA
##   FIRE_TYPE MORE_INFO          CFS_REF_ID CFS_NOTE1 CFS_NOTE2   ACQ_DATE SRC_AGY2 ECOZONE ECOZ_REF         ECOZ_NAME           ECOZ_NOM
## 1      Fire        NA BC-1990-1990-R90003        NA        NA 2020-05-05       BC      12       12 Boreal Cordillera CordillCre boreale
## 2      Fire        NA BC-1990-1990-R90004        NA        NA 2020-05-05       BC      12       12 Boreal Cordillera CordillCre boreale
## 3      Fire        NA BC-1990-1990-R90005        NA        NA 2020-05-05       BC      12       12 Boreal Cordillera CordillCre boreale
## 4      Fire        NA BC-1990-1990-R90012        NA        NA 2020-05-05       BC      12       12 Boreal Cordillera CordillCre boreale
## 5      Fire        NA BC-1990-1990-R90014        NA        NA 2020-05-05       BC      12       12 Boreal Cordillera CordillCre boreale
## 6      Fire        NA BC-1990-1990-R90027        NA        NA 2020-05-05       BC      12       12 Boreal Cordillera CordillCre boreale


Plot points

Extract the latitude and longitude of every fire and place in a data frame. Then use geom_point() to add these points to a map of the province.

# Data frame of x and y coordinates of all 1990 fires
dat <- data.frame(longitude = BCfires1990$LONGITUDE, 
          latitude = BCfires1990$LATITUDE)

# Obtain the map of BC from rnaturalearth
canada <- ne_states(country = "Canada", returnclass = "sf")
bc <- canada[canada$name == "British Columbia", ]
# Make the plot with fire points
ggplot(bc) +
  geom_sf(lwd = 0.2) +
  geom_point(data = dat, aes(x = longitude, y = latitude), 
    size = 0.2, color = "firebrick") +
  theme_classic()


Remove out-of-bounds points

You can see that some of the points lie outside the province, indicating errors in the database. For now, delete the points that are out of bounds.

First, convert the data frame of points to an sf object (4326 is the EPSG code for WGS84, which uses latitude/longitude). Then identify the points that are within bounds (indicated by an sfdat$in_bounds value of 1).

sfdat <- st_as_sf(dat, coords = c("longitude", "latitude"), crs = 4326)
head(sfdat)
## Simple feature collection with 6 features and 0 fields
## Geometry type: POINT
## Dimension:     XY
## Bounding box:  xmin: -129.056 ymin: 59.321 xmax: -128.168 ymax: 59.929
## Geodetic CRS:  WGS 84
##                  geometry
## 1   POINT (-128.2 59.929)
## 2 POINT (-128.431 59.911)
## 3  POINT (-128.484 59.92)
## 4 POINT (-128.168 59.848)
## 5 POINT (-128.718 59.661)
## 6 POINT (-129.056 59.321)
sfdat$in_bounds = lengths(st_within(sfdat, bc))
table(sfdat$in_bounds)
## 
##    0    1 
##   71 3184

Finally, drop the out-of-bounds points and replot.

BCfires1990_fixed <- BCfires1990[sfdat$in_bounds == 1, ]
dat <- data.frame(longitude = BCfires1990_fixed$LONGITUDE, 
          latitude = BCfires1990_fixed$LATITUDE)
ggplot(bc) +
  geom_sf(lwd = 0.2) +
  geom_point(data = dat, aes(x = longitude, y = latitude), 
    size = 0.2, color = "firebrick") +
  theme_classic()



Add labels to points

Add site labels and data points to the map.

The following example file has sampling latitude and longitude of locations where marine threespine stickleback were collected.

mydata <- read.csv("https://www.zoology.ubc.ca/~schluter/R/csv/sampleSites.csv")
head(mydata)
##             Site   ocean latitude longitude  N
## 1     Bering Sea Pacific  58.0890 -166.6430  1
## 2 Barrow, Alaska Pacific  71.3350 -156.6060  1
## 3 Seward, Alaksa Pacific  60.1240 -149.3950  1
## 4  Rabbit Slough Pacific  61.5132 -149.3454 43
## 5       Mud Lake Pacific  61.4975 -149.2590 51
## 6       Auke Bay Pacific  58.3841 -134.6461  4

The function geom_text_repel() reduces overlap between labels but it also drops labels if crowding is too great. In this case you’ll need to zoom in to a finer scale to see them all.


The world

Plot the locations and names of sites on a world map.

world = ne_coastline(scale = "medium", returnclass = "sf")
ggplot(world) +
  geom_sf(lwd = 0.2) +
  geom_point(data = mydata, aes(x = longitude, y = latitude), 
      size = 0.4, color = "firebrick") +
  geom_text_repel(data = mydata, 
      aes(x = longitude, y = latitude, label = Site), 
      size = 1.5, max.overlaps = 20) +
  coord_sf(expand = FALSE) +
  theme_classic()


Zoom in

The second example zooms in to include only latitudes and longitudes that cover the eastern Pacific Ocean.

world = ne_coastline(scale = "medium", returnclass = "sf")
ggplot(world) +
  geom_sf(lwd = 0.2) +
  geom_point(data = mydata, aes(x = longitude, y = latitude), 
      size = 0.4, color = "firebrick") +
  geom_text_repel(data = mydata, 
      aes(x = longitude, y = latitude, label = Site), 
      size = 1.5, max.overlaps = 50) +
  coord_sf(expand = FALSE) +
  scale_x_continuous(limits = c(-179, -110)) +
  scale_y_continuous(limits = c(30, 80)) +
  theme_classic()



Plot geographic ranges

Range data can be in the form of point occurrences or as range map shapefiles.


Download GBIF data

The Global Biodiversity Information Facility provides geographical records of occurrence of species drawn from various sources ranging from museum collections to smartphone identifications. Access to the dataset is enabled using the rgbif package. to download more than 100K records you’ll need to register (free) at the GBIF site.

I downloaded more than 200K occurrences of the threespine stickleback with the following steps.

First, find the taxonKey values (also called usageKey – confusing) for the species. The name_backbone() command found 5 keys, of which one was associated with the vast majority of occurrences. I used that one only. This did not find all the taxonKey values because of alternate names and synonyms in the literature. The rgbif vignettes introduces this and other ways to find occurrence keys.

library(rgbif)

# Find taxonKey (= usageKey) associated with the species name
backbone <- name_backbone(name = "Gasterosteus aculeatus", verbose = TRUE)
backbone$usageKey

## 4286327 11306040 4352560 9166461 4286348

# Number of occurrences associated with each taxonKey
sapply(backbone$usageKey, function(x){occ_count(taxonKey = x)})

## 208608 3 5 10 0


The next step is to download the records. Here is the code I used – I saved the results to a file that you can download.

# For now, download only occurrences associated with the first backbone$usageKey,
# since it is associated with the vast majority of records.
# I used the USERNAME, PASSWORD, and EMAIL associated with my GBIF account.
occ_download(pred_in("taxonKey", backbone$usageKey[1] ), 
             format = "SIMPLE_CSV", user = "USERNAME", 
             pwd = "PASSWORD", email = "EMAIL")

# Import the downloaded data -- DOWNLOAD KEY is provided by occ_download()
d <- occ_download_get('DOWNLOAD KEY') %>% occ_download_import()

# Save a subset of variables to a .csv file
write.csv(d[, c("species", "verbatimScientificName", "individualCount", 
      "decimalLatitude", "decimalLongitude", "eventDate")], 
      "~/Desktop/rgbif_stickleback_occurrences.csv", row.names = FALSE)

The file can be downloaded here.

stickleData <- 
  read.csv("https://www.zoology.ubc.ca/~schluter/R/csv/rgbif_stickleback_occurrences.csv")
head(stickleData)
##                  species                verbatimScientificName individualCount decimalLatitude decimalLongitude  eventDate
## 1 Gasterosteus aculeatus Gasterosteus aculeatus Linnaeus, 1758              NA        48.66301         6.668630 2010-06-15
## 2 Gasterosteus aculeatus                Gasterosteus aculeatus              NA        53.51938        -3.011979 1992-07-14
## 3 Gasterosteus aculeatus                Gasterosteus aculeatus              NA        53.37632        -2.918391 2011-05-19
## 4 Gasterosteus aculeatus                Gasterosteus aculeatus              NA        53.41370        -2.708566 2019-04-29
## 5 Gasterosteus aculeatus                Gasterosteus aculeatus              NA        53.43168        -2.708865 2004-02-05
## 6 Gasterosteus aculeatus                Gasterosteus aculeatus              NA        53.39474        -2.858634 2017-03-27


Plot occurrences

Extract the latitude and longitude of every record and place in a data frame. Then use geom_point() to plot these points on a map. Each record is here given a single point regardless of how many individuals of the species were observed.

# Data frame of latitude and longitude
dat <- data.frame(longitude = stickleData$decimalLongitude, 
          latitude = stickleData$decimalLatitude)

# Obtain the map of the world from rnaturalearth
world <- ne_coastline(scale = "medium", returnclass = "sf")

Finally, plot the occurrences on the world map. Some of the occurrences are in unlikely locations (e.g., Africa). This site gives some tips on how to clean species occurrence data.

ggplot(world) +
  geom_sf(lwd = 0.2) +
  geom_point(data = dat, aes(x = longitude, y = latitude), 
    size = 0.2, color = "firebrick") +
  coord_sf(expand = FALSE) +
  theme_classic()


Plot range map

Range maps for given species are often available online as a database containing a shapefile. For example, I downloaded the range map for the Patagonian rockfish from IUCN\(^1\) and overlay it on a map of South America.

First, read the shapefile into R. A shapefile is a folder of files (with extensions like .shp, .shx, .dbf) that together define geographic features. To read the shapefile you can give st_read() the name of the whole folder. Or, you can give it the name of the .shp file as long as the associated files are in the same folder.

rockfish_shapefile <- st_read("csv/IUCN_Patagonian_Rockfish")
# or
rockfish_shapefile <- st_read("csv/IUCN_Patagonian_Rockfish/data_0.shp")

Next, overlay the shapefile on your map.

samerica <- ne_countries(scale = "large", returnclass = "sf", continent = "South America")
ggplot(samerica) +
  geom_sf(data = rockfish_shapefile, fill = "goldenrod1", color = "goldenrod1") +
  geom_sf(lwd = 0.2) +
  theme_classic()



Convert coordinates

Different programs and agencies might use different coordinate systems for locating points on a map, but you can convert between them. The default coordinate reference system of rnaturalearth is WGS84 (EPSG code 4326) in decimal degrees.


Convert to decimal degrees

Units of latitude and longitude might be given in degrees, minutes, and seconds, or in degrees decimal minutes instead of decimal degrees. I use the measurements package to convert to decimal degrees.

For example, a spot along the shoreline of Paxton Lake, BC is located at latitude 49\(\circ\)42’23.298”N and longitude 124\(\huge\circ\)31’19.236”W in degrees, minutes and seconds. Convert to decimal degrees as follows (degrees S or W are entered as negative numbers).

conv_unit(c("49 42 23.298", "-124 31 19.236"), from = "deg_min_sec", to = "dec_deg")
## [1] "49.7064716666667" "-124.52201"

Similarly, the coordinates 49\(\circ\)42.3883N’, 124\(\circ\)31.3206W in degrees decimal minutes can be converted to decimal degrees as follows.

conv_unit(c("49 42.3883", "-124 31.3206"), from = "deg_dec_min", to ="dec_deg")
## [1] "49.7064716666667" "-124.52201"


Convert UTM to WGS84

The BC government uses UTM (Universal Transverse Mercator) coordinates for the latitude and longitude of lakes (EPSG code is prefix 326 with 2-digit zone number appended). Coordinates are referred to as Eastings and Northings and they come with a specific zone number. Zone 10 is for most of the Pacific Northwest (EPSG code 32610). Parts of Vancouver Island are Zone 09 (EPSG code 32609).

For example, here are the BC government UTM coordinates for the point locations of Paxton and Priest lakes on Texada Island. Convert to WGS84 as follows.

# Put points into a data frame
x <- data.frame(easting = c(390347,387959), northing = c(5507797, 5511150), 
        row.names = c("Paxton", "Priest"))
x
##        easting northing
## Paxton  390347  5507797
## Priest  387959  5511150
# Convert data frame to an sf object with UTM coordinates
x.sf <- st_as_sf(x, coords = c("easting", "northing"))
x.sf <- st_set_crs(x.sf, value = 32610)

# Convert coordinates to WGS84 (EPSG code 4326)
y <- st_transform(x.sf, crs = st_crs(4326))
y
## Simple feature collection with 2 features and 0 fields
## Geometry type: POINT
## Dimension:     XY
## Bounding box:  xmin: -124.5551 ymin: 49.71269 xmax: -124.521 ymax: 49.74239
## Geodetic CRS:  WGS 84
##                     geometry
## 1  POINT (-124.521 49.71269)
## 2 POINT (-124.5551 49.74239)

\(^1\)IUCN (International Union for Conservation of Nature) 2023. Sebastes oculatus. The IUCN Red List of Threatened Species Version 2023-1. https://www.iucnredlist.org. Downloaded on 17 December 2023.

 

© 2009-2024 Dolph Schluter