Lesson 11-13: Data visualization

Mark Sibbald, Jurre Hageman

2025-10-17







Once the data has been read and cleaned up, it is time to start analyzing and presenting it. In these lessons, we will look at the visualization part. We will use different plots to show the analysis and see what it takes to make the data sets usable for each type of plot.

First, let’s load a data set that has already been cleaned up. As in previous lessons, we start by setting up the layout of the tibbles we create in this part, using the tidyverse and kableExtra libraries.

library(tidyverse)
library(kableExtra)
library(knitr)
library(pillar)
formatted_table <- function(df) {
  col_types <- sapply(df, pillar::type_sum)
  new_col_names <- paste0(names(df), "<br>", "<span style='font-weight: normal;'>", col_types, "</span>")
  kbl(df, col.names = new_col_names, escape = FALSE, format = "html") %>%
    kable_styling(bootstrap_options = c("striped", "hover", "responsive"))
}
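
The header built in formatted_table() combines each column name with its abbreviated type. A minimal sketch of that step on a small made-up tibble (toy_df and its columns are illustrative and not part of the lesson data):

```r
library(tibble)
library(pillar)

# Toy data, only for illustration.
toy_df <- tibble(name = c("a", "b"), size = c(1.5, 2), count = c(1L, 2L))

# type_sum() gives the short type label that a tibble prints under a column name.
toy_types <- sapply(toy_df, pillar::type_sum)
toy_types

# Same pattern as in formatted_table(): column name, line break, type.
toy_headers <- paste0(names(toy_df), "<br>", toy_types)
toy_headers
```

The HTML line break only has an effect because the table is rendered with escape = FALSE.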

Download the file dinoDatasetCSV.csv and check in a text editor which delimiter the file uses. Read the file into R.

# Read the data on dinosaurs.
dino_data <- read_csv2("./files_13_data_visualization_exercises/add_exercises/dinoDatasetCSV.csv")
# Replace any missing data with NA values.
# Hint: check which columns are of character type but contain numbers.
tibble1 <- tibble(dino_data) %>%
  replace(.== "?", NA) %>%
  mutate(length_m = as.numeric(length_m)) %>%
  mutate(weight_kg = as.numeric(weight_kg)) %>%
  mutate(height_m = as.numeric(height_m))
formatted_table(head(tibble1))
scientific_name <chr> | common_name <chr> | meaning <chr> | diet <chr> | length_m <dbl> | weight_kg <dbl> | height_m <dbl> | locomotion <chr> | period <chr> | lived_in <chr> | behavior_notes <chr> | first_discovered <chr> | fossil_location <chr> | notable_features <chr> | intelligence_level <chr> | source_link <chr> | row_index <dbl>
Abelisaurus | Abelisaurus | Abel’s lizard | Carnivore | 7.0 | 1500 | 2.4 | Bipedal | Late Cretaceous | Argentina | Large theropod | 1985 | Argentina | Short arms | Medium | https://en.wikipedia.org/wiki/Abelisaurus | 0
Abrictosaurus | Abrictosaurus | Wakeful lizard | Herbivore | 1.5 | 15 | 0.5 | Bipedal | Early Jurassic | South Africa | Small herbivore | 1974 | South Africa | Unique teeth | Medium | https://en.wikipedia.org/wiki/Abrictosaurus | 1
Abrosaurus | Abrosaurus | Delicate lizard | Herbivore | 9.0 | 2000 | 4.5 | Quadrupedal | Middle Jurassic | China | Delicate skull | 1959 | China | Delicate skull | Medium | https://en.wikipedia.org/wiki/Abrosaurus | 2
Abydosaurus | Abydosaurus | Abydos lizard | Herbivore | 18.0 | 30000 | 6.0 | Quadrupedal | Early Cretaceous | USA | Basal sauropod | 2010 | USA | Complete skull | Medium | https://en.wikipedia.org/wiki/Abydosaurus | 3
Acantholipan | Acantholipan | Spiny shield | Herbivore | 5.0 | 2500 | 1.5 | Quadrupedal | Late Cretaceous | Mexico | Armored nodosaur | 2011 | Mexico | Clubless armored tail | Medium | https://en.wikipedia.org/wiki/Acantholipan | 4
Acanthopholis | Acanthopholis | Spiny scales | Herbivore | 4.0 | 1000 | 1.2 | Quadrupedal | Early Cretaceous | UK | Spiny armor | 1865 | UK | Dermal armor | Medium | https://en.wikipedia.org/wiki/Acanthopholis | 5
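
The cleaning step can also be written per column with na_if() and across(). The sketch below uses a small made-up tibble (toy_dino is hypothetical, only the "?" placeholder is taken from the lesson) and is one possible alternative, not the required solution:

```r
library(dplyr)
library(tibble)

# Toy data with "?" as placeholder for missing values.
toy_dino <- tibble(
  scientific_name = c("A", "B", "C"),
  length_m = c("7.0", "?", "9.0"),
  weight_kg = c("1500", "15", "?")
)

# Turn "?" into NA and convert the text to numbers in one step per column.
toy_clean <- toy_dino %>%
  mutate(across(c(length_m, weight_kg), ~ as.numeric(na_if(.x, "?"))))
toy_clean
```

Replacing the placeholder before calling as.numeric() avoids the "NAs introduced by coercion" warning.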


Summarize each

Let’s create a summary of the dinosaurs that lived in the Middle Cretaceous period. We are only interested in the scientific name, length, weight and height of the animals.

# Select the dinosaurs from the Middle Cretaceous period and drop the rows that have NA values.
cretaceous <- tibble1 %>%
  filter(period == "Middle Cretaceous") %>%
  drop_na()
 
# Sort on scientific name and select only the columns containing the period, scientific name, length, weight and height.
sel_data <- cretaceous %>%
  arrange(scientific_name) %>%
  select(period, scientific_name, length_m, weight_kg, height_m)
# Change the colnames.
colnames(sel_data) <- c("Period", "Scientific name", "Length (m)", 
                       "Weight (kg)", "Height (m)")
formatted_table(head(sel_data))
Period <chr> | Scientific name <chr> | Length (m) <dbl> | Weight (kg) <dbl> | Height (m) <dbl>
Middle Cretaceous | Dongyangosaurus | 15 | 12000 | 4.0
Middle Cretaceous | Elaltitan | 21 | 15000 | 5.0
Middle Cretaceous | Epachthosaurus | 17 | 14000 | 4.5
Middle Cretaceous | Ichthyovenator | 9 | 3500 | 3.5
Middle Cretaceous | Ornithomimoides | 4 | 100 | 1.5
Middle Cretaceous | Stegosaurides | 7 | 3000 | 3.0

We will use this data to make different plots.


Bar chart

Create a bar chart with ggplot of the weight of the dinosaurs of the Cretaceous period. Create a title (main and axis titles) and give the bars a steelblue color. Make sure that the labels on the x-axis are placed at a 45 degree angle.

# Create a bar chart with ggplot of the weight.
bar_dino <- ggplot(data = sel_data, aes(x = `Scientific name`, y = `Weight (kg)`)) +
  geom_bar(stat="identity", fill="steelblue") +
  labs(title="Weight of dinosaurs from the Cretaceous period") +
  theme(axis.text.x = element_text(angle = 45, hjust=1))
bar_dino

From the graph it is clear that Udelartitan, Volgatitan and Zhuchengtitan were the heaviest dinosaurs in this selection (the ‘titan’ in their names is also a hint).

Grouped Bar chart

Let’s compare the length to the height of the dinosaurs in one bar chart. First, you will have to make the data tidy with pivot_longer(). Then you can create the grouped bar chart.

# Make the data tidy. Check if the data is indeed tidy (length and height should be indicated in a column called dimension).
tidy_data <- sel_data %>%
  pivot_longer(c(`Length (m)`, `Height (m)`), names_to = "Dimension", 
               values_to = "Size (m)")
formatted_table(head(tidy_data))
Period <chr> | Scientific name <chr> | Weight (kg) <dbl> | Dimension <chr> | Size (m) <dbl>
Middle Cretaceous | Dongyangosaurus | 12000 | Length (m) | 15.0
Middle Cretaceous | Dongyangosaurus | 12000 | Height (m) | 4.0
Middle Cretaceous | Elaltitan | 15000 | Length (m) | 21.0
Middle Cretaceous | Elaltitan | 15000 | Height (m) | 5.0
Middle Cretaceous | Epachthosaurus | 14000 | Length (m) | 17.0
Middle Cretaceous | Epachthosaurus | 14000 | Height (m) | 4.5
# Plot the grouped bar chart.
dino_size1 <- ggplot(tidy_data, aes(`Scientific name`, `Size (m)`, 
                                    fill = Dimension)) +
  geom_bar(stat = "identity", position = "dodge") +
  labs(title="Length and height for dinosaurs from the Cretaceous period", 
       y = "Size (m)") +
  theme(axis.text.x = element_text(angle = 45, hjust=1, size = 7),
        axis.title.x = element_blank()) # leave out title of the x-axis
dino_size1
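
To see what pivot_longer() does to the shape of the data, here is a minimal sketch with two of the dinosaurs from the table (toy_wide and toy_long are illustrative names):

```r
library(dplyr)
library(tidyr)
library(tibble)

# Two rows in wide format: one column per dimension.
toy_wide <- tibble(
  `Scientific name` = c("Elaltitan", "Epachthosaurus"),
  `Length (m)` = c(21, 17),
  `Height (m)` = c(5, 4.5)
)

# Long format: one row per dinosaur per dimension.
toy_long <- toy_wide %>%
  pivot_longer(c(`Length (m)`, `Height (m)`), names_to = "Dimension",
               values_to = "Size (m)")
dim(toy_wide)
dim(toy_long)
```

The two measurement columns become one Dimension column and one Size (m) column, so the number of rows doubles. This is the shape ggplot needs to map Dimension to the fill color.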


Percent bar chart

You can create a percent bar chart to see, for each dinosaur, how body length and body height relate to each other as a share of their sum.

# Create a percentage bar chart.
perc_dino_size1 <- ggplot(tidy_data, aes(`Scientific name`, `Size (m)`, 
                                        fill = `Dimension`)) +
  geom_bar(stat = "identity", position="fill") +
  labs(title="Length and height for dinosaurs from the Cretaceous period") +
  theme(axis.text.x = element_text(angle = 45, hjust=1, size = 7),
        axis.title.x = element_blank())
perc_dino_size1
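
With position = "fill" every bar is rescaled so that its parts add up to 1. The sketch below computes the same shares by hand for two dinosaurs from the table (toy_sizes and share are illustrative names):

```r
library(dplyr)
library(tibble)

toy_sizes <- tibble(
  name = rep(c("Elaltitan", "Epachthosaurus"), each = 2),
  Dimension = rep(c("Length (m)", "Height (m)"), times = 2),
  size = c(21, 5, 17, 4.5)
)

# Share of each dimension within one dinosaur, as shown in a filled bar.
toy_share <- toy_sizes %>%
  group_by(name) %>%
  mutate(share = size / sum(size)) %>%
  ungroup()
toy_share
```

The bars in such a chart therefore show proportions, not sizes in meters.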


Switching the order in a group

If you would like to present the data in a different order, you need to change the column with the groups to the data type factor. You can put the length before the height in this way.

# First, change the column of 'Dimension' to a factor type of data.
tidy_data <- tidy_data %>%
  mutate(Dimension = factor(Dimension, 
                            levels = c("Length (m)", "Height (m)")))
levels(tidy_data$Dimension)
## [1] "Length (m)" "Height (m)"
# Second, plot the grouped bar chart.
dino_size2 <- ggplot(tidy_data, aes(`Scientific name`, `Size (m)`, 
                                    fill = Dimension)) +
  geom_bar(stat = "identity", position = "dodge") +
  labs(title="Length and height for dinosaurs from the Cretaceous period", 
       y = "Size (m)") +
  theme(axis.text.x = element_text(angle = 45, hjust=1, size = 7),
        axis.title.x = element_blank())
dino_size2

And use the same data to create a percentage bar chart.

# Create the percentage bar chart.
perc_dino_size2 <- ggplot(tidy_data, aes(`Scientific name`, `Size (m)`, 
                                        fill = `Dimension`)) +
  geom_bar(stat = "identity", position="fill") +
  labs(title="Length and height for dinosaurs from the Cretaceous period") +
  theme(axis.text.x = element_text(angle = 45, hjust=1, size = 8), 
        axis.title.x = element_blank())
perc_dino_size2
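
The reason the order changes is that ggplot follows the order of the factor levels. A short sketch on a toy vector (toy_dim is an illustrative name) shows the difference between the default levels and explicitly chosen levels:

```r
toy_dim <- c("Length (m)", "Height (m)", "Length (m)")

# Without the levels argument the levels are sorted alphabetically.
default_levels <- levels(factor(toy_dim))
default_levels

# With the levels argument you decide the order yourself.
custom_levels <- levels(factor(toy_dim, levels = c("Length (m)", "Height (m)")))
custom_levels
```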


Pie Chart

For a pie chart we will use a selection of the dinosaurs from the Cretaceous period; otherwise the pie would be divided into too many pieces, one for every dinosaur from this period.

We will take the 7 longest dinosaurs and see how the total length is distributed among these ‘long’ dinosaurs.

# Select the seven longest dinosaurs. Check your bar chart with the length and height of each Cretaceous dinosaur. Save these seven to a vector.
long_dino <- c("Dongyangosaurus", "Elaltitan", "Titanomachya", "Epachthosaurus", "Udelartitan",
               "Volgatitan", "Zhuchengtitan")
# Filter the data set using this vector.
df_long_dino1 <- sel_data %>%
  filter(`Scientific name`%in%long_dino)
formatted_table(df_long_dino1)
Period <chr> | Scientific name <chr> | Length (m) <dbl> | Weight (kg) <dbl> | Height (m) <dbl>
Middle Cretaceous | Dongyangosaurus | 15 | 12000 | 4.0
Middle Cretaceous | Elaltitan | 21 | 15000 | 5.0
Middle Cretaceous | Epachthosaurus | 17 | 14000 | 4.5
Middle Cretaceous | Titanomachya | 15 | 15000 | 6.0
Middle Cretaceous | Udelartitan | 30 | 35000 | 7.0
Middle Cretaceous | Volgatitan | 25 | 35000 | 6.0
Middle Cretaceous | Zhuchengtitan | 25 | 30000 | 6.0
# OR select by using `slice_max()` and change the `Scientific name` to factor type of data (creates order in the pie chart):
df_long_dino2 <- sel_data %>%
  slice_max(order_by = `Length (m)`, n = 7) %>%
  mutate(`Scientific name` = factor(`Scientific name`, 
                                levels = `Scientific name`))
formatted_table(df_long_dino2)
Period <chr> | Scientific name <fct> | Length (m) <dbl> | Weight (kg) <dbl> | Height (m) <dbl>
Middle Cretaceous | Udelartitan | 30 | 35000 | 7.0
Middle Cretaceous | Volgatitan | 25 | 35000 | 6.0
Middle Cretaceous | Zhuchengtitan | 25 | 30000 | 6.0
Middle Cretaceous | Elaltitan | 21 | 15000 | 5.0
Middle Cretaceous | Epachthosaurus | 17 | 14000 | 4.5
Middle Cretaceous | Dongyangosaurus | 15 | 12000 | 4.0
Middle Cretaceous | Titanomachya | 15 | 15000 | 6.0
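
Keep in mind that slice_max() keeps ties by default, so it can return more rows than n when several rows share the cut-off value (in the table above, two dinosaurs share a length of 15 m). A small sketch with made-up lengths (toy_lengths is an illustrative name):

```r
library(dplyr)
library(tibble)

toy_lengths <- tibble(name = c("A", "B", "C", "D"),
                      length_m = c(30, 25, 25, 21))

# Default: ties on the last position are all kept.
with_ties <- toy_lengths %>%
  slice_max(order_by = length_m, n = 2)

# with_ties = FALSE returns exactly n rows.
no_ties <- toy_lengths %>%
  slice_max(order_by = length_m, n = 2, with_ties = FALSE)

nrow(with_ties)
nrow(no_ties)
```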

Now create the pie chart.

# Create the pie chart based on the length of these dinosaurs.
dino_pie1 <- ggplot(df_long_dino2, aes(x = "", y = `Length (m)`, 
                                    fill = `Scientific name`))+
  geom_bar(stat="identity", width = 1) +
  coord_polar("y", start=0, direction = -1) +
  labs(title="Length for the largest Cretaceous dinosaur species") +
  geom_text(aes(label = `Length (m)`), position = position_stack(vjust = 0.5)) + # add values to the pieces of the pie.
  theme_void() # remove background, grid, numeric labels
dino_pie1

It seems that the ‘titans’ are the largest and heaviest dinosaurs (except the Elaltitan).
If you really want to make it a fancy pie chart, you can try using the ggrepel library.

# First, the positions of the labels outside the pie chart have to be determined.
library(ggrepel)
fancy_pie1 <- df_long_dino2 %>%
  # position the labels outside the pie chart correctly
  mutate(csum = rev(cumsum(rev(`Length (m)`))),
         pos = `Length (m)`/2 + lead(csum, 1),
         pos = if_else(is.na(pos), `Length (m)`/2, pos))

# Now plot the pie chart and insert the labels with `geom_label_repel()`.
dino_pie2 <- ggplot(fancy_pie1, 
                    aes(x = "", y = `Length (m)`, 
                        fill = fct_inorder(`Scientific name`))) +
  geom_bar(stat="identity", width = 1) +
  coord_polar("y", start=0, direction = -1) +
  scale_fill_brewer(palette = "Set3") + # color set for the pie chart
  # Create labels outside the pie chart
  geom_label_repel(data = fancy_pie1, 
                   aes(y = pos, label = `Length (m)`), 
                   size = 3.5, nudge_x = 0.75, show.legend = FALSE) +

  labs(title="Length for the largest Cretaceous dinosaur species") +
  guides(fill = guide_legend(title = "Dinosaur")) + 
  theme_void() # remove background, grid, numeric labels
dino_pie2
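
The label positions are the midpoints of the slices along the stacked axis. A worked sketch with three made-up values (toy_pie is an illustrative name) shows what the csum and pos columns contain, assuming the first row ends up on top of the stack:

```r
library(dplyr)
library(tibble)

toy_pie <- tibble(value = c(30, 25, 25)) %>%
  # Running total counted from the last row upwards.
  mutate(csum = rev(cumsum(rev(value))),
         # Midpoint: half of the own slice plus everything stacked below it.
         pos = value / 2 + lead(csum, 1),
         # The last row has nothing below it, so lead() returns NA there.
         pos = if_else(is.na(pos), value / 2, pos))
toy_pie
```

For the first row this gives 30 / 2 + 50 = 65, for the second 25 / 2 + 25 = 37.5 and for the last 25 / 2 = 12.5.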


Or with percentages:

# First add a column with the calculated percentages (of the total length).
fancy_pie2 <- fancy_pie1 %>%
  mutate(percentage = round(`Length (m)` / sum(`Length (m)`) * 100, 0))
# Then plot the same pie chart, but with the percentages instead of the length.
dino_pie3 <- ggplot(fancy_pie2, 
                    aes(x = "", y = `Length (m)`, 
                        fill = fct_inorder(`Scientific name`))) +
  geom_bar(stat="identity", width = 1) +
  coord_polar("y", start = 0, direction = -1) +
  scale_fill_brewer(palette = "Set3") + # color set for the pie chart
  # Create labels outside the pie chart
  geom_label_repel(data = fancy_pie2, 
                   aes(y = pos, label = paste0(percentage, "%")), 
                   size = 3.5, nudge_x = 0.75, show.legend = FALSE) +

  labs(title="Length for the largest Cretaceous dinosaur species") +
  guides(fill = guide_legend(title = "Dinosaur")) + 
  theme_void() # remove background, grid, numeric labels
dino_pie3
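
Note that rounded percentages do not always add up to exactly 100. A quick check with the seven lengths from the table above (lengths_m and pct are illustrative names):

```r
# Lengths (m) of the seven selected dinosaurs, taken from the table above.
lengths_m <- c(30, 25, 25, 21, 17, 15, 15)

# Percentage of the total length, rounded to whole numbers.
pct <- round(lengths_m / sum(lengths_m) * 100, 0)
pct
sum(pct)
```

Here the rounded values sum to 99. If that matters for your figure, one option is to show one decimal instead.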


Boxplot

We can use the same data for the boxplot. Let’s look at the height of the dinosaurs in different periods of time.

# Exclude (use `filter()`on tibble1) the dinosaurs that are heavier than 10000 kg
# Drop any NA values.
# Select the columns period, scientific name, length, weight and height,
# Make the data in the column period factor type data: Triassic < Jurassic < Cretaceous,
# Save the data in tibble2.
tibble2 <- tibble1 %>%
  filter(weight_kg <= 10000) %>%
  drop_na() %>%
  select(period, scientific_name, length_m, weight_kg, height_m) %>%
  mutate(period = factor(period, levels = c("Early Triassic", "Middle Triassic", "Late Triassic", 
                                            "Early Jurassic", "Middle Jurassic", "Late Jurassic", 
                                            "Early Cretaceous", "Middle Cretaceous", "Late Cretaceous")))
colnames(tibble2) <- c("Period", "Scientific name", "Length (m)", 
                       "Weight (kg)", "Height (m)")
formatted_table(head(tibble2))
Period <fct> | Scientific name <chr> | Length (m) <dbl> | Weight (kg) <dbl> | Height (m) <dbl>
Late Cretaceous | Abelisaurus | 7.0 | 1500 | 2.4
Early Jurassic | Abrictosaurus | 1.5 | 15 | 0.5
Middle Jurassic | Abrosaurus | 9.0 | 2000 | 4.5
Late Cretaceous | Acantholipan | 5.0 | 2500 | 1.5
Early Cretaceous | Acanthopholis | 4.0 | 1000 | 1.2
Late Cretaceous | Achelousaurus | 6.0 | 2500 | 2.0
# Create a boxplot for the height of the dinosaurs for each period in time.
height_period <- ggplot(tibble2, aes(x = `Period`, y = `Height (m)`)) + 
  geom_boxplot() +
  labs(title="Height of dinosaurs in different periods in time",
       x = "Period", y = "Height (m)") +
  theme(axis.text.x = element_text(angle = 45, hjust=1))
height_period

It seems that the dinosaurs grew larger towards the Jurassic period and maintained that size during the Cretaceous period.
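
The box in a boxplot runs from the first to the third quartile, with a line at the median. You can calculate these numbers yourself to check a plot. A minimal sketch on made-up heights (toy_heights and its values are hypothetical):

```r
library(dplyr)
library(tibble)

toy_heights <- tibble(
  Period = rep(c("Early Jurassic", "Late Cretaceous"), each = 5),
  `Height (m)` = c(0.5, 1, 2, 3, 4.5, 1.2, 1.5, 2, 2.4, 6)
)

# Quartiles and median per period.
toy_summary <- toy_heights %>%
  group_by(Period) %>%
  summarise(q1 = unname(quantile(`Height (m)`, 0.25)),
            median = median(`Height (m)`),
            q3 = unname(quantile(`Height (m)`, 0.75)),
            .groups = "drop")
toy_summary
```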

Grouped boxplot

Of course it is also possible to create grouped boxplots. Let’s use the length and height again and plot them against the period in time.

# First make a tidy tibble from tibble2.
tidy_period <- tibble2 %>%
  pivot_longer(c(`Length (m)`, `Height (m)`), names_to = "Dimension", 
               values_to = "Size (m)")
formatted_table(head(tidy_period))
Period <fct> | Scientific name <chr> | Weight (kg) <dbl> | Dimension <chr> | Size (m) <dbl>
Late Cretaceous | Abelisaurus | 1500 | Length (m) | 7.0
Late Cretaceous | Abelisaurus | 1500 | Height (m) | 2.4
Early Jurassic | Abrictosaurus | 15 | Length (m) | 1.5
Early Jurassic | Abrictosaurus | 15 | Height (m) | 0.5
Middle Jurassic | Abrosaurus | 2000 | Length (m) | 9.0
Middle Jurassic | Abrosaurus | 2000 | Height (m) | 4.5
# Now create the boxplot.
size_period1 <- ggplot(tidy_period, aes(x = `Period`, y = `Size (m)`, fill = Dimension)) + 
  geom_boxplot() +
  labs(title="Length and height for dinosaurs in different time periods") +
  theme(axis.text.x = element_text(angle = 45, hjust=1))
size_period1


If you would like to change the order of the height and length, you will have to make the Dimension column a factor.

# Make the column for 'Dimension' a factor type.
tidy_period <- tidy_period %>%
  mutate(Dimension = factor(Dimension, levels = c("Length (m)", "Height (m)")))
formatted_table(head(tidy_period))
Period <fct> | Scientific name <chr> | Weight (kg) <dbl> | Dimension <fct> | Size (m) <dbl>
Late Cretaceous | Abelisaurus | 1500 | Length (m) | 7.0
Late Cretaceous | Abelisaurus | 1500 | Height (m) | 2.4
Early Jurassic | Abrictosaurus | 15 | Length (m) | 1.5
Early Jurassic | Abrictosaurus | 15 | Height (m) | 0.5
Middle Jurassic | Abrosaurus | 2000 | Length (m) | 9.0
Middle Jurassic | Abrosaurus | 2000 | Height (m) | 4.5
# Create the boxplot.
size_period2 <- ggplot(tidy_period, aes(x = `Period`, y = `Size (m)`, fill = Dimension)) + 
  geom_boxplot() +
  labs(title="Length and height for dinosaurs in different time periods") +
  theme(axis.text.x = element_text(angle = 45, hjust=1))
size_period2


Violin Chart

Create a violin chart with the same data. Although these plots show more information about the distribution of the data, they are more difficult to interpret.

# Create a violin chart from the Height (m) for the different time periods.
height_violin <- ggplot(tibble2, aes(x = `Period`, y = `Height (m)`)) + 
  geom_violin() +
  labs(title="Height of dinosaurs in different time periods") +
  theme(axis.text.x = element_text(angle = 45, hjust=1))
height_violin
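
One way to make a violin chart easier to read is to draw a narrow boxplot inside each violin, so that the median and quartiles are visible as well. This is an optional extra and not part of the exercise. The sketch uses simulated heights (toy_violin is a hypothetical data set):

```r
library(ggplot2)

# Simulated heights, only for illustration.
set.seed(1)
toy_violin <- data.frame(
  Period = rep(c("Jurassic", "Cretaceous"), each = 30),
  height_m = c(runif(30, 0.5, 5), runif(30, 1, 7))
)

# Violin for the shape of the distribution, boxplot for median and quartiles.
violin_box <- ggplot(toy_violin, aes(x = Period, y = height_m)) +
  geom_violin() +
  geom_boxplot(width = 0.1) +
  labs(title = "Simulated heights per period", y = "Height (m)")
violin_box
```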


Line plots

For the following plots we will use the Climate disease dataset.

# Read the data from the file and save it in df1.
df1 <- read.csv2("./files_13_data_visualization_exercises/add_exercises/climate_disease_dataset.csv")
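# Filter on the countries of the Netherlands, Sweden, Portugal and Hungary and keep the 480 rows
# with the highest values in the date column. Note: date is still text at this point, so the
# ordering is alphabetical rather than chronological.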
df2 <- df1 %>%
  filter(country == "Netherlands" | country == "Sweden" | country == "Portugal" |
           country == "Hungary") %>%
  slice_max(order_by = date, n = 480)
# Turn the first column to dates.
df2$date <- as.Date(df2$date, "%d/%m/%Y")
formatted_table(head(df2))
date <date> | country <chr> | region <chr> | avg_temp_F <dbl> | precipitation_mm <dbl> | air_quality_index <dbl> | uv_index <dbl> | malaria_cases <int> | dengue_cases <int> | population_density <int> | healthcare_budget <int> | X <dbl>
2023-12-01 | Portugal | West | 81.31159 | 216.72804 | 8.030935 | 12.000000 | 136 | 134 | 394 | 2520 | 178.3609
2023-12-01 | Sweden | West | 58.50859 | 318.78748 | 14.187193 | 9.076136 | 113 | 60 | 214 | 2681 | 137.3155
2023-12-01 | Hungary | Central | 62.04476 | 56.07903 | 0.000000 | 9.390880 | 83 | 52 | 107 | 4705 | 143.6806
2023-12-01 | Netherlands | Central | 83.97252 | 264.04019 | 71.588550 | 12.000000 | 92 | 66 | 421 | 1202 | 183.1505
2022-12-01 | Portugal | West | 75.53251 | 166.15460 | 59.467505 | 9.694721 | 65 | 119 | 394 | 2520 | 167.9585
2022-12-01 | Sweden | West | 60.34775 | 264.77139 | 0.000000 | 7.465901 | 67 | 102 | 214 | 2681 | 140.6259
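
In the code above the date column is still text in day/month/year format when the rows are selected, and it is converted with as.Date() afterwards. Text is ordered alphabetically, which is not the same as chronologically. A small sketch with three made-up dates (toy_dates is an illustrative name) shows the difference:

```r
toy_dates <- c("01/12/2022", "01/02/2023", "01/11/2023")

# As text: compared character by character, so month 12 wins from 2023.
latest_as_text <- max(toy_dates)
latest_as_text

# As dates: compared chronologically.
latest_as_date <- max(as.Date(toy_dates, "%d/%m/%Y"))
latest_as_date
```

If you need the most recent rows, converting the column to dates before selecting is the safer order of steps.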

Now we can create a line plot.

# Create a line plot of the average precipitation against the date for the four selected countries with the data from df2.
line_plot1 <- ggplot(df2, aes(x = date, y = precipitation_mm, group = country)) +
  geom_line() +
  labs(title="Average precipitation (in mm) per month", x = "Date", 
       y = "Precipitation (mm)")
line_plot1

This is not very clear since the data of the individual countries cannot be told apart. Let’s use different line types to distinguish the data for the different countries.

# Use different types of lines to distinguish the data from the different countries.
line_plot2 <- ggplot(df2, aes(x = date, y = precipitation_mm, group = country)) +
  geom_line(aes(linetype = country)) +
  labs(title="Average precipitation (in mm) per month", x = "Date", 
       y = "Precipitation (mm)")
line_plot2


This looks a bit better, but for this plot it is better to use colors to distinguish the data from the different countries.

# Create the same line plot as before, but use colors to distinguish the data for the different countries.
line_plot3 <- ggplot(df2, aes(x = date, y = precipitation_mm, group = country)) +
  geom_line(aes(color = country)) +
  labs(title="Average precipitation (in mm) per month", x = "Date", 
       y = "Precipitation (mm)")
line_plot3


And add a trendline.

# Add a trendline to the plot.
line_plot4 <- ggplot(df2, aes(x = date, y = precipitation_mm)) +
  geom_line(aes(color = country)) +
  labs(title="Average precipitation (in mm) per month") +
  geom_smooth(method="lm")
line_plot4
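
With method = "lm", geom_smooth() draws a straight line fitted by linear regression. The same kind of fit can be made directly with lm() if you want to know the slope. A minimal sketch on made-up data (toy_trend and its values are hypothetical):

```r
# Made-up data that lies exactly on the line y = 3 + 2 * x.
toy_trend <- data.frame(month = 1:10)
toy_trend$precipitation_mm <- 3 + 2 * toy_trend$month

# Fit the linear model and look at intercept and slope.
fit <- lm(precipitation_mm ~ month, data = toy_trend)
trend_coef <- coef(fit)
trend_coef
```

Because color is only mapped inside geom_line() in the plot above, geom_smooth() fits one line through the data of all countries together.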

Radar chart

For the radar chart we will use the Global Ecological Footprint data of 2023. To create radar charts you need the ggradar library, which is installed from GitHub with the remotes package.

# REMOVE THE HASH TAGS IN THE NEXT TWO LINES IF YOU HAVE NOT INSTALLED THE REMOTES PACKAGE YET.
#install.packages("remotes")
#remotes::install_github("ricardo-bion/ggradar")
library(ggradar)

Read the data from the Global Ecological Footprint data file.

# Read the data and store it in a data frame.
footprint <- read_csv("./files_13_data_visualization_exercises/add_exercises/Global_Ecological_Footprint_2023.csv")
# Check which four European countries have the highest population.
# Select the columns for the country and the footprints (use the `ends_with()` function) for the four selected European countries.
big4 <- footprint %>%
  filter(Region == "EU-27") %>%
  slice_max(order_by = `Population (millions)`, n = 4) %>%
  arrange(Country) %>%
  select(Country, ends_with("Footprint"))
formatted_table(head(big4))
Country <chr> | Cropland Footprint <dbl> | Grazing Footprint <dbl> | Forest Product Footprint <dbl> | Carbon Footprint <dbl> | Fish Footprint <dbl>
France | 1.0 | 0.3 | 0.5 | 2.2 | 0.2
Germany | 0.9 | 0.2 | 0.5 | 2.7 | 0.1
Italy | 0.8 | 0.3 | 0.5 | 2.0 | 0.2
Spain | 1.2 | 0.2 | 0.2 | 1.8 | 0.5

Now create the radar chart.

# Create a radar chart from the data with the footprints for the 4 European countries.
# grid.min, grid.mid and grid.max are set to match the labels in values.radar.
big4_fp <- ggradar(big4, grid.min = 0, grid.mid = 1.5, grid.max = 3,
                   values.radar = c("0", "1.5", "3.0"),
                   legend.text.size = 8, axis.label.size = 2.5,
                   grid.label.size = 3, legend.position = "right") +
  labs(title = "Ecological footprints of the 4 most populated European countries") +
  theme(plot.title = element_text(size = 14))
big4_fp


Bubble chart

Bubble charts are useful when you have an extra dimension that you would like to show in the plot.

# Use the original data frame.
# Select the four European countries with the highest population and keep all columns.
big4_fp <- footprint %>%
  filter(Region == "EU-27") %>%
  slice_max(order_by = `Population (millions)`, n = 4)
formatted_table(head(big4_fp))
Country <chr> | Region <chr> | SDGi <dbl> | Life Exectancy <dbl> | HDI <dbl> | Per Capita GDP <chr> | Income Group <chr> | Population (millions) <dbl> | Cropland Footprint <dbl> | Grazing Footprint <dbl> | Forest Product Footprint <dbl> | Carbon Footprint <dbl> | Fish Footprint <dbl> | Built up land…14 <dbl> | Total Ecological Footprint (Consumption) <dbl> | Cropland <dbl> | Grazing land <dbl> | Forest land <dbl> | Fishing ground <dbl> | Built up land…20 <dbl> | Total biocapacity <dbl> | Ecological (Deficit) or Reserve <dbl> | Number of Earths required <dbl> | Number of Countries required <dbl>
Germany | EU-27 | 82.2 | 81 | 0.94 | $54,192 | HI | 83.9 | 0.9 | 0.2 | 0.5 | 2.7 | 0.1 | 0.2 | 4.5 | 0.6 | 0.1 | 0.6994546 | 0.0745313 | 0.1780850 | 1.6145357 | -2.884143 | 2.978833 | 2.786361
France | EU-27 | 81.2 | 82 | 0.90 | $47,995 | HI | 65.6 | 1.0 | 0.3 | 0.5 | 2.2 | 0.2 | 0.2 | 4.3 | 1.0 | 0.2 | 0.9738287 | 0.1110340 | 0.1509272 | 2.4583718 | -1.854753 | 2.855967 | 1.754464
Italy | EU-27 | 78.3 | 83 | 0.90 | $43,010 | HI | 60.3 | 0.8 | 0.3 | 0.5 | 2.0 | 0.2 | 0.1 | 4.0 | 0.4 | 0.1 | 0.3408854 | 0.0657797 | 0.0912344 | 0.9711389 | -2.980057 | 2.616313 | 4.068621
Spain | EU-27 | 79.9 | 83 | 0.91 | $39,753 | HI | 46.7 | 1.2 | 0.2 | 0.2 | 1.8 | 0.5 | 0.1 | 3.9 | 1.1 | 0.1 | 0.4113462 | 0.0606709 | 0.0573795 | 1.7221554 | -2.193411 | 2.592720 | 2.273643
# Create a bubble chart of the 'Number of Countries required' vs the 'Total biocapacity', with the 'Population (millions)' as third dimension.
bubble_big4 <- ggplot(big4_fp, aes(x = `Total biocapacity`, y = `Number of Countries required`)) + 
  geom_point(aes(color = Country, size = `Population (millions)`), alpha = 0.5) +
  scale_size_area(max_size = 10)
bubble_big4
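
The sketch below repeats the bubble chart on a small data frame typed in by hand, with values rounded from the table above, and adds a title and readable legend titles. The names toy_bubble and bubble_sketch and the column names are illustrative, and the labels are just one suggestion:

```r
library(ggplot2)

# Values rounded from the table above.
toy_bubble <- data.frame(
  Country = c("Germany", "France", "Italy", "Spain"),
  biocapacity = c(1.61, 2.46, 0.97, 1.72),
  countries_required = c(2.79, 1.75, 4.07, 2.27),
  population = c(83.9, 65.6, 60.3, 46.7)
)

# scale_size_area() makes the area of a bubble proportional to the value.
bubble_sketch <- ggplot(toy_bubble, aes(x = biocapacity, y = countries_required)) +
  geom_point(aes(color = Country, size = population), alpha = 0.5) +
  scale_size_area(max_size = 10) +
  labs(title = "Biocapacity and number of countries required",
       x = "Total biocapacity", y = "Number of countries required",
       size = "Population (millions)")
bubble_sketch
```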


Learning outcomes

In this lesson you have learned to visualize data using ggplot by:
- creating a basic bar chart,
- creating a grouped bar chart,
- creating a percentage bar chart,
- creating a box plot,
- changing the order in a grouped bar chart or box plot,
- creating a violin chart,
- creating a radar chart,
- creating a bubble chart.


— The end —






This web page is distributed under the terms of the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Creative Commons License: CC BY-SA 4.0.