A geographic bubble chart is a straightforward method to visualise quantitative information with a geospatial relationship. Last week I was in Vietnam helping the Phú Thọ Water Supply Joint Stock Company with their data science. They asked me to create a map of a sample of their water consumption data. In this post, I share this little ditty to explain how to plot a bubble chart over a map using the ggmap package.
Load and Explore the Data
The sample data contains a list of just over 100 readings from water meters in the city of Việt Trì in Vietnam, plus their geospatial location. This data uses the World Geodetic System of 1984 (WGS84), which is compatible with Google Maps and similar systems.
# Load the data
water <- read.csv("PhuTho/MeterReads.csv")
water$Consumption <- water$read_new - water$read_old

# Summarise the data
head(water)
summary(water$Consumption)
The consumption at each connection is between 0 and 529 cubic metres, with a mean consumption of 23.45 cubic metres.
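Before plotting, it can be worth checking that every reading has coordinates and a non-negative consumption. The sample above has a minimum of 0, so this is a general precaution and not a fix for this data set. The sketch below uses a small made-up data frame with the same column names (lon, lat, read_old, read_new); the values and the names meters, valid and clean are hypothetical.

```r
# Hypothetical stand-in for the meter readings, using the same column names
meters <- data.frame(
  lon      = c(105.401, 105.403, NA),
  lat      = c(21.321, 21.323, 21.325),
  read_old = c(100, 250, 80),
  read_new = c(123, 240, 95)
)

# Consumption is the difference between the two readings
meters$Consumption <- meters$read_new - meters$read_old

# Keep only rows with both coordinates and a non-negative consumption
meters$valid <- !is.na(meters$lon) & !is.na(meters$lat) & meters$Consumption >= 0
clean <- meters[meters$valid, ]
clean
```

Only the first row survives here: the second has a new reading below the old one, and the third has no longitude.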
Visualise the data with a geographic bubble chart
With the ggmap extension of the ggplot2 package, we can visualise any spatial data set on a map. The only condition is that the spatial coordinates use the WGS84 datum. The ggmap package adds a geographical layer to ggplot2 by providing a Google Maps or OpenStreetMap canvas.
The first step is to download the map canvas. To do this, you need to know the centre coordinates and the zoom factor. Finding the right zoom factor requires some trial and error. The ggmap package provides various map types, which are described in detail in the documentation.
# Load map library
library(ggmap)

# Find the middle of the points
centre <- c(mean(range(water$lon)), mean(range(water$lat)))

# Download the satellite image
viettri <- get_map(centre, zoom = 17, maptype = "hybrid")
g <- ggmap(viettri)
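The trial and error for the zoom factor can start from a rough estimate. The sketch below assumes Web Mercator tiling, where 256 pixels cover 360 degrees of longitude at zoom 0 and the width doubles with every zoom level, and a map that is 640 pixels wide. The function estimate_zoom and the example longitudes are hypothetical and not part of ggmap. It only looks at longitude, so treat the result as a starting point.

```r
# Estimate the largest zoom level at which a longitude span fits on the map.
# Assumption: 256 pixels span 360 degrees at zoom 0, doubling per zoom level.
estimate_zoom <- function(lon, map_px = 640, margin = 0.1) {
  span <- diff(range(lon)) * (1 + margin)  # padded longitude span in degrees
  if (span == 0) return(21)                # arbitrary cap for a single point
  zoom <- floor(log2(360 * map_px / (256 * span)))
  max(0, min(21, zoom))                    # keep within a plausible zoom range
}

# Hypothetical longitudes, not taken from the article's data set
lon_example <- c(105.400, 105.402, 105.405)
estimate_zoom(lon_example)
```

For these example values the estimate is 17. With the real data you would pass water$lon and then adjust the zoom up or down by eye.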
The ggmap package follows the same conventions as ggplot2. We first call the map layer and then add any required geom. The point geom creates a nice bubble chart when used in combination with the scale_size_area option. This option makes the area of each point proportional to its value, up to a chosen maximum size, so that the points are easily visible. The transparency (alpha) minimises problems with overplotting. This last code snippet plots the map with water consumption.
# Add the points
g + geom_point(data = water, aes(x = lon, y = lat, size = Consumption),
               shape = 21, colour = "dodgerblue4", fill = "dodgerblue", alpha = .5) +
  scale_size_area(max_size = 20) + # Size of the biggest point
  ggtitle("Việt Trì sự tiêu thụ nước") # Title: "Việt Trì water consumption"
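To see what scaling by area means for the bubbles, here is a sketch of the arithmetic in base R. It is an illustration of the idea, not ggplot2's actual code: when area is proportional to the value, the point size grows with the square root of the value. The consumption values are made up, apart from the maximum of 529 cubic metres reported earlier, and the variable names are hypothetical.

```r
# Illustrative consumption values in cubic metres (hypothetical, except for
# the maximum of 529 mentioned in the summary)
consumption <- c(0, 5, 23, 132.25, 529)
max_size <- 20

# Area-proportional scaling: size grows with the square root of the value,
# so a quarter of the maximum consumption gets half the maximum size
point_size <- max_size * sqrt(consumption / max(consumption))
round(point_size, 2)
```

This is why a customer using a quarter of the largest consumption still shows up as a bubble half as wide as the biggest one, while a zero reading disappears entirely.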
This map visualises water consumption in the targeted area of Việt Trì. The larger the bubble, the larger the consumption. It is no surprise that two commercial customers used the most water. ggplot2 automatically adds the legend for the consumption variable.
You can find the code and data for this article on my GitHub repository. With thanks to Ms Quy and Mr Tuyen of Phú Thọ Water for their permission to use this data.