R
Data Cleaning Exercises
This file can be downloaded here
Exercise 1
This first data set was not loaded correctly in Excel. Load the Excel file as a tibble in R and separate the columns correctly. Hint: use the separate_wider_delim function from tidyverse. Consult the documentation for this function with ?separate_wider_delim. At the bottom of the help page you will find examples you can run. Take a close look at the first example; it should give you a clue how to solve this problem.
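As a starting point, here is a minimal sketch of how separate_wider_delim behaves. It does not use the exercise file: the tibble raw, its single column combined, the ";" delimiter and the new column names are all made up for illustration. With the real file you would first load it (for example with readxl::read_excel, which returns a tibble) and inspect which delimiter it actually uses.

```r
library(tibble)
library(tidyr)

# Hypothetical stand-in for a sheet in which all values ended up in one column.
# The column name, the delimiter and the values are assumptions.
raw <- tibble(combined = c("1;Alice;34", "2;Bob;28"))

# Split the single column into three columns at every ";".
clean <- separate_wider_delim(
  raw,
  cols = combined,
  delim = ";",
  names = c("id", "name", "age")
)

print(clean)
```

The new columns are still text after the split, so a conversion step (for example as.numeric on the numeric columns) may be needed before calculating with them.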
Exercise 2
The data in this file contains the units in the cells. This makes it impossible to perform calculations (as the data type is a text string). Remove the units in order to make calculations possible.
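One possible approach is sketched below with readr's parse_number, which keeps the first number it finds in each string and drops the surrounding text. The tibble measurements, its column names and the units are invented for this example and will differ from the exercise file.

```r
library(dplyr)
library(readr)

# Hypothetical data: numbers stored as text together with their units.
measurements <- tibble(
  sample = c("a", "b", "c"),
  mass   = c("12.5 kg", "7 kg", "0.3 kg"),
  length = c("10 cm", "25 cm", "4.5 cm")
)

# Convert the two text columns to numeric by stripping the unit.
cleaned <- mutate(measurements, across(c(mass, length), parse_number))

# Calculations now work because the columns are numeric.
mean_mass <- mean(cleaned$mass)
print(cleaned)
print(mean_mass)
```

This only makes sense when every value in a column uses the same unit. If a column mixes, say, g and kg, the unit has to be handled before it is thrown away.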
Exercise 3
This data set contains rows with duplicate data. Load the data and remove the duplicates from the data table.
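A short sketch of one way to do this uses distinct from dplyr, which keeps only the first occurrence of each unique row. The tibble visits and its contents are hypothetical and only serve to show the behaviour.

```r
library(dplyr)

# Hypothetical data with one fully duplicated row (rows 1 and 3 are identical).
visits <- tibble(
  patient = c("p01", "p02", "p01", "p03"),
  visit   = c(1, 1, 1, 2)
)

# Keep one copy of every unique row.
deduplicated <- distinct(visits)

n_removed <- nrow(visits) - nrow(deduplicated)
print(deduplicated)
print(n_removed)
```

Called without column arguments, distinct compares entire rows, so rows that differ in even one column are all kept.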
Exercise 4
This data set contains empty data. Make the empty elements more explicit in R by converting them to NA. Count how many elements you have:
- in total
- with missing data
- without missing data
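The three counts can be obtained along the following lines. This sketch assumes that "elements" means individual cells and that the empty values arrived in R as empty strings in text columns; the tibble survey is made up for illustration. Depending on how the file is read, empty cells may already be NA, in which case the conversion step does nothing.

```r
library(dplyr)

# Hypothetical data in which missing values are stored as empty strings.
survey <- tibble(
  name  = c("Ann", "", "Cy"),
  city  = c("Ghent", "Leuven", ""),
  score = c(7, 8, 9)
)

# Replace empty strings by NA in every character column.
explicit <- mutate(survey, across(where(is.character), ~ na_if(.x, "")))

# Count cells: in total, with missing data, without missing data.
n_total   <- nrow(explicit) * ncol(explicit)
n_missing <- sum(is.na(explicit))
n_present <- sum(!is.na(explicit))

print(c(total = n_total, missing = n_missing, present = n_present))
```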
Exercise 5
This data set also contains missing data in many fields. However, instead of leaving the cells empty, the author used a character to indicate a missing value. Open the file in a text editor to find which character has been used for missing data and replace this character with NA.
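One way to handle such a marker is to declare it as a missing-value string while reading the file, as sketched below with the na argument of readr's read_csv. The "?" marker, the comma-separated layout and the column names are assumptions chosen for the example; the character used in the exercise file is for you to find. Wrapping the text in I() tells read_csv (readr 2.0 or later) to treat it as literal data instead of a file path.

```r
library(readr)

# Hypothetical file contents; "?" is an assumed placeholder for missing values.
csv_text <- "id,height,weight\n1,172,?\n2,?,80\n3,165,58\n"

# Treat "?" as NA, in addition to the defaults "" and "NA".
dat <- read_csv(I(csv_text), na = c("", "NA", "?"), show_col_types = FALSE)

n_missing <- sum(is.na(dat))
print(dat)
print(n_missing)
```

Declaring the marker at read time also lets the affected columns be parsed as numbers instead of text. If the data are already loaded, dplyr's na_if applied to the character columns is an alternative.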
This web page is distributed under the terms of the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Creative Commons License: CC BY-SA 4.0.