We can map almost anything (even Pokémon or Middle Earth)
There’s no such thing as “a map”!
Maps can serve different purposes
There are many types of maps
Maps are more than routing maps (usually powered by Google)
Maps are more than Google Maps or political maps
Maps can be static (images) or interactive (web, apps)
Maps can visualise information in different ways (points, choropleth, facets…)
Maps are subjective (aka political) (e.g. north, centre, projections…)
But there’s something they all have in common…
Spatial data
Spatial data: overview
Spatial data situates something in space (the world*). Because the Earth is round and maps are usually flat, spatial data needs to be projected using a Coordinate Reference System (CRS).
Spatial data is different from regular data (e.g. a data frame) because it contains:
Geometry: coordinates defining and situating a point, line or polygon
Attributes for every geometry (variables and values)
Coordinate Reference System (CRS)
Currently, R’s data structure for storing spatial data is an sf object (or spatial data frame), provided by {sf} - Simple Features for R.
* But it could be something else, like the Moon, Middle Earth…
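The three ingredients listed above (geometry, attributes and CRS) can be seen by building a small sf object by hand. This is only a minimal sketch: the `places` data frame and its rounded coordinates are illustrative, and EPSG:4326 (WGS 84 longitude/latitude) is assumed as the CRS.

```r
library(sf)

# Illustrative data frame: one row per place, coordinates as plain numbers
places <- data.frame(
  name = c("Coventry", "University of Warwick"),
  lat  = c(52.4, 52.4),
  lng  = c(-1.51, -1.56)
)

# Turn it into an sf object: x (longitude) comes first, then y (latitude).
# crs = 4326 is an assumption: WGS 84, the usual CRS for lat/long pairs.
places_sf <- st_as_sf(places, coords = c("lng", "lat"), crs = 4326)

class(places_sf)             # an sf object is still a data frame
st_geometry(places_sf)       # the geometry column (points)
st_drop_geometry(places_sf)  # the attributes, without geometry
st_crs(places_sf)$epsg       # the CRS, as an EPSG code
```

Printing `places_sf` shows all three parts at once: a header with the geometry type and CRS, followed by the attribute columns and the geometry column.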
Types of spatial data
According to how spatial information is stored:
Vector: geometry is defined by their coordinates
Raster: made of pixels (grid cells). Often thematic (e.g. land cover)
According to their geometry (vectors):
Points: expressed as pairs of lat, long, or x, y (e.g. markers).
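As a rough illustration of vector geometries, the sketch below builds one point, one line and one polygon with {sf} constructors. All coordinates are made up for the example, and EPSG:4326 is an assumed CRS.

```r
library(sf)

# A point: a single x, y (longitude, latitude) pair
pt <- st_point(c(-1.51, 52.4))

# A line: an ordered sequence of points (one row per vertex)
ln <- st_linestring(rbind(c(-1.56, 52.38), c(-1.51, 52.41)))

# A polygon: a closed ring (the first and last vertices are the same)
poly <- st_polygon(list(rbind(
  c(-1.6, 52.3), c(-1.4, 52.3), c(-1.4, 52.5), c(-1.6, 52.5), c(-1.6, 52.3)
)))

# Bundle them into a geometry column with an (assumed) CRS
geoms <- st_sfc(pt, ln, poly, crs = 4326)
st_geometry_type(geoms)
```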
Note that the R ecosystem has changed a lot over time, and older tutorials may rely on superseded or retired packages such as sp, rgdal, rgeos… (terra, by contrast, is the current successor to raster.)
Case: Geocoding with {tidygeocoder}
Translating addresses to geographical coordinates
locations <- c("Coventry", "University of Warwick", "Centre for Interdisciplinary Methodologies")
# tidygeocoder needs a data frame structure.
locations_df <- as.data.frame(locations)
library(tidygeocoder)
# Geocode the addresses
lat_longs <- locations_df |>
  geocode(locations, method = 'osm', lat = latitude, long = longitude)
lat_longs
# A tibble: 3 × 3
locations latitude longitude
<chr> <dbl> <dbl>
1 Coventry 52.4 -1.51
2 University of Warwick 52.4 -1.56
3 Centre for Interdisciplinary Methodologies 52.4 -1.56
We will read a csv file containing the latitude and longitude of cities in the world:
cities_world <- read.csv("data/worldcities.csv")
# Check the class
class(cities_world)
[1] "data.frame"
head(cities_world, 3)
city city_ascii lat lng country iso2 iso3 admin_name capital
1 Tokyo Tokyo 35.6897 139.6922 Japan JP JPN Tōkyō primary
2 Jakarta Jakarta -6.1750 106.8275 Indonesia ID IDN Jakarta primary
3 Delhi Delhi 28.6100 77.2300 India IN IND Delhi admin
population id
1 37732000 1392685764
2 33756000 1360771077
3 32226000 1356872604
Using base R to visualise the “spatial” data
Latitude and longitude are numbers that can position a point in the world, in a similar way to how x and y work in Cartesian space, so we can use base R’s plot():
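A minimal sketch of that idea, using only the three rows printed by head(cities_world, 3) above rather than the full CSV. The `cities` data frame is a stand-in built for this example; longitude goes on the x axis and latitude on the y axis.

```r
# Stand-in for cities_world: the three rows shown in the head() output
cities <- data.frame(
  city = c("Tokyo", "Jakarta", "Delhi"),
  lat  = c(35.6897, -6.1750, 28.6100),
  lng  = c(139.6922, 106.8275, 77.2300)
)

# Longitude plays the role of x and latitude the role of y
plot(cities$lng, cities$lat,
     xlab = "Longitude", ylab = "Latitude",
     pch = 19, asp = 1)
# Label each point with the city name, placed above the marker
text(cities$lng, cities$lat, labels = cities$city, pos = 3)
```

This treats the coordinates as plain numbers: there is no projection, no basemap and no CRS, which is why the data is only “spatial” in quotes at this point.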
# Plotting a categorical attribute
plot(world["continent"])
# Plotting a continuous attribute
plot(world["lifeExp"], main = "Life Expectancy per country")
Admittedly, we could use our expertise in data visualisation and base R graphics to dramatically improve this and better communicate our findings (e.g. colour scale, breaks, legend, title…), but it’s good enough. And quick!
Visualising with {ggplot2}
Some may be more familiar with {ggplot2}, so we can use it, too (using “new” geom_ functions):
library(ggplot2)
plot_ggchoro <- ggplot(world) +
  # geom_sf is used to visualise sf geometries
  geom_sf(color = "white", aes(fill = lifeExp)) +
  # geom_sf_text visualises the value of a column based on their geometries
  geom_sf_text(aes(label = iso_a2), size = 2) +
  # Add a title and hide x and y labels.
  labs(title = "Life expectancy by country", x = NULL, y = NULL) +
  theme_minimal()
# Result in next slide
plot_ggchoro
But there is more!
Because we are working with an sf object, we can visualise it in a different Coordinate Reference System (CRS):
plot_ggchoro + # We are reusing the ggplot from the previous code
  coord_sf(crs = st_crs(3035)) # st_crs is provided by sf
plot_ggchoro + # We are reusing the ggplot from the previous code
  coord_sf(crs = "+proj=moll") # a PROJ string (here, Mollweide) works too
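coord_sf() only changes the CRS used for drawing. To change the CRS of the data itself, {sf} provides st_transform(). The sketch below is an illustration with a single made-up point near Coventry, reprojected from EPSG:4326 (longitude/latitude in degrees) to EPSG:3035 (a European equal-area CRS in metres).

```r
library(sf)

# One illustrative point in longitude/latitude (EPSG:4326)
coventry <- st_sfc(st_point(c(-1.51, 52.4)), crs = 4326)

# Reproject the data to EPSG:3035 (ETRS89-extended / LAEA Europe)
coventry_laea <- st_transform(coventry, 3035)

st_coordinates(coventry)       # degrees
st_coordinates(coventry_laea)  # metres, in the projected CRS
st_crs(coventry_laea)$epsg     # the CRS now attached to the data
```

The place is the same in both objects; only the numbers used to describe its position change.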
We can use plotly::ggplotly() to turn a ggplot map into an interactive map:
plotly::ggplotly(plot_ggchoro)
Recap
In this example we have used the visualisation tools and methods we are familiar with to visualise spatial data (as opposed to regular data in the previous example).
But… we have also seen some of the particularities of using spatial data
In the next examples, we will be using specialised packages for visualising geospatial data
Case: Interactive map using {leaflet}
We will create a couple of (basic) interactive maps like the ones you’re used to seeing on many websites.
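As a minimal sketch of what such a map can look like with {leaflet}: the example below draws the default OpenStreetMap tiles and a single marker. The marker position and popup text are illustrative, reusing the rounded Coventry coordinates from the geocoding example.

```r
library(leaflet)

map_leaflet <- leaflet() |>
  # Default basemap (OpenStreetMap tiles)
  addTiles() |>
  # One illustrative marker; clicking it opens the popup
  addMarkers(lng = -1.51, lat = 52.4, popup = "Coventry")

# Printing the object renders the interactive map (in the viewer or a browser)
map_leaflet
```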
library(tmap)
# World is an example dataset shipped with {tmap}
data("World")
map_demo <- tm_shape(World) +
  tm_polygons("well_being", legend.title = "Happy Planet Index", palette = "viridis") +
  tm_dots(size = "inequality", alpha = 0.5, col = "red") +
  tm_scale_bar() +
  # tmap tries to position scale and compass in an empty area, but we can control their position
  tm_compass(position = c("left", "top")) +
  tm_layout(main.title = "Tmap showcase")
map_demo
Case: Basic spatial manipulation and visualisation with {mapsf}
Overview
We are going to create a slightly more complex choropleth.
We will be joining two datasets (an sf and a data frame) and overlaying another layer with points.
To do so, we will be using a specialised library: {mapsf}.
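The join step can be sketched with toy data. Everything here is hypothetical: the `areas` polygons, the `area_code` key and the `unempl` values are made up to show the mechanics, and merge() is just one way to do it (dplyr::left_join() would be another).

```r
library(sf)

# Hypothetical helper: a unit square whose lower-left corner is at (x0, 0)
unit_square <- function(x0) {
  st_polygon(list(rbind(
    c(x0, 0), c(x0 + 1, 0), c(x0 + 1, 1), c(x0, 1), c(x0, 0)
  )))
}

# An sf object: geometry plus a key column (made-up codes)
areas <- st_sf(
  area_code = c("A1", "A2"),
  geometry  = st_sfc(unit_square(0), unit_square(1))
)

# A regular data frame: same key, no geometry (made-up values)
stats <- data.frame(area_code = c("A1", "A2"), unempl = c(4.2, 6.1))

# Join on the shared key; with the sf object first, the result is still sf
joined <- merge(areas, stats, by = "area_code")
class(joined)
names(joined)
```

Keeping the sf object as the first argument matters: the geometry column is carried through the join, so the result can be mapped directly.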
Loading data
englandHealthStats.csv: A regular data frame with the National Statistics Health Index for Upper Tier Local Authorities and Regions over 2015-2018 (no geometry)
# Read the csv
englandHealthStats <- read.csv("data/England_all_geog_aggregated_2018.csv")
head(englandHealthStats, 4)
library(mapsf)
mf_map(x = sdf, type = "choro", var = "low_pay")
We can improve it further. {mapsf} follows the same logic (and parameters) as base R graphics:
mf_map(x = sdf, type = "choro", var = "unempl",
       leg_title = "Unemployment", breaks = "jenks")
# Highlight region with top value
# (order() sorts ascending by default, hence decreasing = TRUE)
mf_map(x = head(sdf[order(sdf$unempl, decreasing = TRUE), ], 1), var = "ctyua19nm",
       type = "base", border = "red", lwd = 1, col = NULL, alpha = 0, add = TRUE)
# Labels for the top value
mf_label(x = head(sdf[order(sdf$unempl, decreasing = TRUE), ], 1), var = "ctyua19nm",
         cex = 0.5, halo = TRUE, r = 0.15)
# Inset map
mf_inset_on(x = "worldmap", pos = "topleft")
mf_worldmap(sdf)
mf_inset_off()
# Layout (title, scale, north and credits)
mf_layout(title = "Health indicators per Counties and Unitary Authorities",
          credits = "Sources: National Statistics Health Index 2015-2018, (December 2019)",
          frame = TRUE)
We have a better understanding of what geospatial data and maps are (and what they entail!)
We have been introduced to a number of packages, each one performing different functions
We have been exposed to code showing affordances and potentialities of packages, to inform our decisions
More…
Of course, there’s way more to know about working with geospatial data in R. If you’re interested, please refer to Geocomputation with R, by Robin Lovelace, Jakub Nowosad and Jannes Muenchow.
…or wait to hear Robin sharing that in an upcoming WRUG session (stay tuned!)