5 checkpoints for GGPLOT Maps in R
Covid cases in Australia
Hi! I am a noob to R and data viz. This is me sharing what I figured out while trying out the cool stuff taught at the R-Ladies meetup in Sri Lanka, “Around the world in 30 minutes” (map creation with R), conducted by data analyst Stephanie Kobakian. Refer: https://srkcolombo.netlify.app/#1
This article walks through five ggplot tweaks:
01 | Reversing palette
02 | Labelling
03 | Gradients
04 | Rainbow or no rainbow palette
05 | Color deficiencies
Context: Covid cases in Australian regions
The code looks like this, and the full code is available at the link above.
The only part you need to focus on is the final ggplot() call; everything before it just prepares the data.
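library(tidyverse)  # dplyr, ggplot2, readr, purrr
library(maps)       # data behind map_data("world")
library(sf)         # spatial data frames for geom_sf()
library(ozmaps)     # ozmap_data(), ozmap_states
library(knitr)      # kable()
library(polite)     # bow(), scrape()
library(rvest)      # html_table()

# World map, filtered to Australia
world <- map_data("world")
Australia <- world %>% filter(region == "Australia")
Australia
ggplot(Australia) +
  geom_polygon(aes(x = long, y = lat, group = group))

# Country outline and state boundaries as sf objects
sf_oz <- ozmap_data("country")
sf_oz %>% kable()
ggplot(sf_oz) + geom_sf()
sf_states <- ozmap_data("states")
sf_states %>% kable()
ggplot(sf_states) + geom_sf()

# Scrape the cases table
covid_url <- "https://covidlive.com.au/report/cases"
covid_data <- bow(covid_url) %>%
  scrape() %>%
  html_table() %>%
  purrr::pluck(2) %>%
  as_tibble()

# Expand state abbreviations so they match the map's NAME column
covid_data <- covid_data %>%
  mutate(STATE = case_when(
    STATE == "NSW" ~ "New South Wales",
    STATE == "WA" ~ "Western Australia",
    STATE == "SA" ~ "South Australia",
    STATE == "NT" ~ "Northern Territory",
    STATE == "ACT" ~ "Australian Capital Territory",
    TRUE ~ STATE)) %>%
  mutate(CASES = parse_number(CASES))

# Join cases onto the state polygons and plot
covid_states <- left_join(ozmap_states, covid_data,
                          by = c("NAME" = "STATE"))
covid_states <- covid_states %>%
  filter(!(NAME == "Other Territories"))
ggplot(covid_states) +
  geom_sf(aes(fill = CASES))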
The resulting map looks like this,
Since the ggplot2 package was used, I got the map above with its default continuous color scale, a blue gradient running from dark (#132B43, low values) to light (#56B1F7, high values).
01 | Color palette reverse? change?
Hmm, I felt that the higher the value, the darker the blue should be (varying luminance), so I made a small change: the scale_fill_gradient() call below swaps the two end colors.
ggplot(covid_states) +
  geom_sf(aes(fill = CASES)) +
  # swap the default end colours so that dark blue = more cases
  scale_fill_gradient(high = "#132B43",
                      low = "#56B1F7")
Then I got the map below, and now I can really see where the Covid cases are concentrated! (PS: There are other ways to reverse a palette, but I am not expert enough to apply them :P)
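One of those other ways, as a minimal sketch: instead of swapping hex codes by hand, a ColorBrewer palette can be pointed in the right direction. The helper name map_dark_is_high() is made up for illustration, and it assumes the covid_states data frame built earlier.

```r
library(ggplot2)

# Hypothetical helper (not from the original code): draws the same map,
# but lets ColorBrewer's sequential "Blues" palette run light -> dark
# as CASES increases.
map_dark_is_high <- function(covid_states) {
  ggplot(covid_states) +
    geom_sf(aes(fill = CASES)) +
    # scale_fill_distiller() defaults to direction = -1;
    # direction = 1 keeps the palette order (light = low, dark = high).
    scale_fill_distiller(palette = "Blues", direction = 1)
}

# Usage, once covid_states exists:
# map_dark_is_high(covid_states)
```

Swapping low and high, as I did above, gives the same kind of result; this version just saves picking the two colors yourself.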
Huh! But then what am I explaining?? I can’t say which region is high or low unless I am an Australian or know for sure which state is which (:P)!
PS: I am from Sri Lanka, in case you missed it!
02 | Monica’s label maker
So here I tried labelling (the geom_sf_label() line is the change),
ggplot(covid_states) +
  geom_sf(aes(fill = CASES)) +
  scale_fill_gradient(high = "#132B43", low = "#56B1F7") +
  geom_sf_label(aes(label = NAME), colour = "black", size = 2.5)
Reference: https://yutani.rbind.io/post/geom-sf-text-and-geom-sf-label-are-coming/
PS: Both geom_sf_label() and geom_sf_text() are good options.
Okay, I see that Victoria has around 20,000 Covid cases while New South Wales has between 5,000 and 10,000, but then how about the others?
Am I the only one curious about the other regions? I mean, those are below 5,000, but is it 0 or 4,999? So I tried the following,
03 | Gradients for a clearer picture
ggplot(covid_states) +
  geom_sf(aes(fill = CASES)) +
  scale_fill_gradient(low = "white", high = "black") +
  geom_sf_label(aes(label = NAME), colour = "black", size = 2.5)
or maybe one like this,
ggplot(covid_states) +
  geom_sf(aes(fill = CASES)) +
  scale_fill_gradientn(
    colours = rainbow(5),
    values = NULL,
    space = "Lab",
    na.value = "grey50",
    guide = "colourbar",
    aesthetics = "fill") +
  geom_sf_label(aes(label = NAME), colour = "black", size = 2.5)
Reference on color gradients: https://www.datanovia.com/en/blog/ggplot-colors-best-tricks-you-will-love/
From the orange-ish shade you can tell that Queensland, Western Australia and South Australia are clearly not zero, but how about the red-colored regions? Well, I am the noob, you tell me! I mean, should we try to show the value on the map, or does it really matter to know the exact number of Covid cases in every region from a color palette alone?
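If the exact numbers do matter, one option is to print them in the label itself. This is only a sketch: label_with_count() and map_with_counts() are names I made up, "Region A" and "Region B" are placeholders, and the map function assumes the covid_states data frame from earlier.

```r
library(ggplot2)

# Hypothetical helper: turns a name and a count into a two-line label,
# with thousands separators and no scientific notation.
label_with_count <- function(name, cases) {
  paste0(name, "\n",
         format(cases, big.mark = ",", scientific = FALSE, trim = TRUE))
}

# The "is it 0 or 4,999?" question, answered in the label:
label_with_count(c("Region A", "Region B"), c(0, 4999))
# returns "Region A\n0" "Region B\n4,999"

# Hypothetical wrapper around the labelled map from section 02.
map_with_counts <- function(covid_states) {
  ggplot(covid_states) +
    geom_sf(aes(fill = CASES)) +
    scale_fill_gradient(high = "#132B43", low = "#56B1F7") +
    geom_sf_label(aes(label = label_with_count(NAME, CASES)),
                  colour = "black", size = 2.5)
}

# Usage, once covid_states exists:
# map_with_counts(covid_states)
```

With the numbers on the map, the color only has to carry the rough pattern, not the exact value.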
04 | I hate Ross, I love Ross (Ross = Rainbows)
And oh, by the way, it is not wise to use a rainbow palette for continuous data: the spectrum is not perceptually uniform (plus many other perceptual reasons), so it can mislead and confuse. So don’t be mad at me for using it here, I just experimented ;)
BUT then again, if you read the reference paper closely, it states that,
While the RGB rainbow() is very unbalanced, the HCL rainbow_hcl() (or also qualitative_hcl()) is (by design) balanced with respect to luminance.
Meaning, if you really want to use a rainbow, you now have a better version: rainbow_hcl(). It looks like this,
Reference: Paper on “colorspace: A Toolbox for Manipulating and Assessing Colors and Palettes” https://arxiv.org/pdf/1903.06490.pdf
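To see what "balanced with respect to luminance" means in numbers, here is a small sketch using the colorspace package. The luminance() helper is my own name, not part of the package; it converts hex colors to polar LUV (HCL) coordinates and reads off the L column.

```r
library(colorspace)

# Hypothetical helper: luminance (L in polar LUV / HCL space) of hex colours.
luminance <- function(cols) {
  coords(as(hex2RGB(cols), "polarLUV"))[, "L"]
}

rgb_rainbow <- rainbow(5)      # base R, RGB-based
hcl_rainbow <- rainbow_hcl(5)  # colorspace, HCL-based

round(luminance(rgb_rainbow))  # jumps around from colour to colour
round(luminance(hcl_rainbow))  # stays roughly constant

# The spread tells the same story in one number each:
sd(luminance(rgb_rainbow))
sd(luminance(hcl_rainbow))
```

To try it on the map, swap colours = rainbow(5) for colours = rainbow_hcl(5) in the scale_fill_gradientn() call from section 03.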
05 | Traffic light is not for everyone
Finally and most importantly (since I am passionate about inclusive design), I tried something like this: a different color palette that works better for people with color vision deficiencies (CVD).
The colorspace package in R provides the simulate_cvd() function,
which can take any vector of valid R colors and transform them according to a certain CVD transformation matrix and transformation equation. The convenience interfaces deutan(), protan(), and tritan() are the high-level functions for simulating the corresponding kind of color blindness with a given severity (calling simulate_cvd() internally)
Reference: http://colorspace.r-forge.r-project.org/articles/color_vision_deficiency.html
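Before touching the map, the convenience functions can be tried on a plain vector of colors. A minimal sketch with the colorspace package, using the same rainbow(5) palette as in section 03; the object names are mine.

```r
library(colorspace)

pal <- rainbow(5)

# Each function returns the colours as they would roughly appear
# under that kind of colour vision deficiency.
cvd <- data.frame(
  original = pal,
  deutan   = deutan(pal),
  protan   = protan(pal),
  tritan   = tritan(pal)
)
cvd

# severity runs from 0 (no deficiency) to 1 (full); default is 1.
deutan(pal, severity = 0.5)

# Side-by-side swatches of the original and simulated palettes.
swatchplot(
  "Original" = pal,
  "Deutan"   = deutan(pal),
  "Protan"   = protan(pal),
  "Tritan"   = tritan(pal)
)
```

If two swatches in a simulated row look almost the same, the regions filled with those colors will be hard to tell apart on the map too.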
So if somebody used the usual RGB rainbow palette, this is how it would look to people with color vision deficiencies,
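library(cowplot)
library(colorspace)
library(colorblindr)
# the RGB rainbow map from section 03, stored so it can be passed to cvd_grid()
gcovid <- ggplot(covid_states) +
  geom_sf(aes(fill = CASES)) +
  scale_fill_gradientn(colours = rainbow(5))
cvd_grid(gcovid)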
Again, HCL-based palettes are FTW (For The Win)!! They hold up better under many color vision deficiencies. The HCL palette Geyser (a diverging palette from colorspace) gives the following,
ggplot(covid_states) +
  geom_sf(aes(fill = CASES)) +
  scale_fill_gradientn(
    colors = divergingx_hcl(11, "Geyser", rev = TRUE),
    values = NULL,
    space = "Lab",
    na.value = "grey50",
    guide = "colourbar",
    aesthetics = "fill")
So, again, here is how it looks under the color deficiencies,
So what do you think? I know you got more to add here! Let me know :)
Reference on the package: https://www.rdocumentation.org/packages/colorblindr/versions/0.1.0
R ladies meetup event held: https://www.meetup.com/rladies-colombo/events/276103634/
Further info on the code by Stephanie: https://github.com/srkobakian/R-ladies-colombo