A brief introduction to the 'Tidy Tuesday' project

A different dataset every Tuesday

A number of my posts source data via the ‘Tidy Tuesday’ project, so I thought it would make sense to provide some further information on it. Every Tuesday a new dataset is provided and people are encouraged to wrangle the data and create a visualisation using the R tidyverse (although other code-based approaches are also welcome). People can post their code and output on Twitter using the #TidyTuesday hashtag. The project was co-founded by Thomas Mock in 2018.

The Tidy Tuesday GitHub repository is an excellent starting point to learn more about the project. It contains some background to the project and guidelines for participants, along with all the weekly datasets.

Owing to the popularity of the project, an R package, tidytuesdayR, was also developed. This allows for easy access to the datasets from within R. For example, if I want to access the original craft beer dataset I used in this post, I can retrieve a list of all the Tidy Tuesday datasets and then load the relevant dataset using its Tidy Tuesday date.

# Load the tidytuesdayR package
library(tidytuesdayR)
# Obtain all the available Tidy Tuesday datasets
# Do a check first to make sure the GitHub API rate limit has not been reached
# Note, I don't execute this code here as it returns many, many rows
if (rate_limit_check(quiet = TRUE) > 10) {
  all_available_datasets <- tt_available()
  print(all_available_datasets)
}

The craft beer dataset was used for Tidy Tuesday on July 10th 2018, so it can be imported via the tidytuesdayR package by passing this date to tt_load().

# Using the Tidy Tuesday date for the craft beer dataset, load the dataset
# Do a check first to make sure the GitHub API rate limit has not been reached
if (rate_limit_check(quiet = TRUE) > 10) {
  craft_beer_data <- tt_load("2018-07-10")
  head(craft_beer_data$week15_beers)
}
## 
##  Downloading file 1 of 1: `week15_beers.xlsx`
## # A tibble: 6 x 8
##   count   abv   ibu    id name                style            brewery_id ounces
##   <dbl> <dbl> <dbl> <dbl> <chr>               <chr>                 <dbl>  <dbl>
## 1     1 0.05     NA  1436 Pub Beer            American Pale L~        408     12
## 2     2 0.066    NA  2265 Devil's Cup         American Pale A~        177     12
## 3     3 0.071    NA  2264 Rise of the Phoenix American IPA            177     12
## 4     4 0.09     NA  2263 Sinister            American Double~        177     12
## 5     5 0.075    NA  2262 Sex and Candy       American IPA            177     12
## 6     6 0.077    NA  2261 Black Exodus        Oatmeal Stout           177     12
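
As an alternative to the date string, tt_load() also accepts a numeric year together with a week argument. The following is only a minimal sketch: I am assuming the craft beer data corresponds to week 15 of 2018 (inferred from the file name week15_beers.xlsx in the output above), and the names tt_year, tt_week and craft_beer_by_week are my own.

```r
# Load the tidytuesdayR package
library(tidytuesdayR)

# Year and week number for the craft beer dataset
# (assumption: week 15, inferred from the file name week15_beers.xlsx)
tt_year <- 2018
tt_week <- 15

# Do a check first to make sure the GitHub API rate limit has not been reached
if (rate_limit_check(quiet = TRUE) > 10) {
  # Passing a numeric year plus a week number instead of a date string
  craft_beer_by_week <- tt_load(tt_year, week = tt_week)
  head(craft_beer_by_week$week15_beers)
}
```

If the week number assumption holds, this should return the same data as the date-based call; the date string is probably the safer choice when you already know the Tuesday in question.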

I also found this R Shiny app useful for browsing submissions people made under the #TidyTuesday Twitter hashtag. It doesn’t look like it has been updated in a while, but it is still very interesting.

Another good resource is this list of YouTube videos by David Robinson. In each video, David takes a Tidy Tuesday dataset and live-codes his analysis and visualisations in R.

If you are searching for inspiration for a small personal data project, then I recommend looking through the datasets in the Tidy Tuesday GitHub repository and checking out the submissions based on those datasets under the #TidyTuesday hashtag on Twitter. If you are not on Twitter, you can access the raw tweets via the GitHub repository. See this article for details on how to do that.

Conor Buckley

My interests include data wrangling, using the R tidyverse, and making insightful charts.