A LinkedIn Learning course by Mike Chapple
Tidy data is a standardized way of organizing the values within a dataset. By applying tidy data principles, statisticians, analysts, and data scientists can spend less time cleaning data and more time on the compelling aspects of data analysis. In this course, learn the principles of tidy data and discover how to create and manipulate tibbles, transforming source data into tidy formats. Instructor Mike Chapple uses the R programming language and the tidyverse packages to teach data wrangling: the data cleaning and data transformation tasks that consume a substantial portion of analysts’ time. He wraps up with three hands-on case studies that reinforce the data wrangling principles and tactics covered in the course.
- What’s tidy data?
- Using the tidyverse
- Working with tibbles
- Subsetting and filtering tibbles
- Importing data into R
- Making wide datasets long with gather()
- Making long datasets wide with spread()
- Converting data types in R
- Detecting outliers
- Manipulating strings in R with stringr