3 Big-Data Lab (January 30): More plotting practice and beginning with Twitter data

3.1 Objectives of this lab

This tutorial will cover the following:

  • Practice more plots in ggplot.
  • Practice ggplot plots using the public health data frame (obesity_cov_2013) from the previous lab.
  • Begin to use Twitter data from an archived dataset.
  • If you want to access the Twitter API yourself (not required), you’ll need a Twitter account (see below).

The R functions in this lab should:

  • Find the maximum number of pages to be queried
  • Generate all the subpages that make up the reviews
  • Scrape the information from each of them
  • Combine the information into one comprehensive data frame
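The four steps listed above can be sketched in base R as follows. This is only an illustration under stated assumptions: the address in base_url, the value of max_pages, and the helper scrape_page() are all hypothetical, and scrape_page() returns made-up rows instead of downloading anything.

```r
# Hypothetical stand-in for a real scraper: it returns a one-row data
# frame for a page instead of downloading and parsing HTML.
scrape_page <- function(url) {
  data.frame(
    url = url,
    review = paste("review from", basename(url)),
    stringsAsFactors = FALSE
  )
}

base_url  <- "https://example.com/reviews"  # hypothetical address
max_pages <- 3  # step 1: in practice this would be read from the site

# Step 2: build the address of every subpage
page_urls <- paste0(base_url, "/page", seq_len(max_pages))

# Step 3: scrape each subpage (here: call the stand-in function)
pages <- lapply(page_urls, scrape_page)

# Step 4: stack the per-page results into one data frame
reviews <- do.call(rbind, pages)
print(reviews)
```

Keeping the per-page work in one function means the same lapply() and rbind() pattern applies no matter how many pages there are.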

3.2 R CODE FOR TWITTER ASSIGNMENT

PACKAGES

3.5 MORE PRACTICE GGPLOT ON OBESITY AND COVARIATES

Here you will need the data frame from the obesity lab, obesity_cov_2013 (you may have to read it in again). Feel free to include the results in your assignment.

Replace x-axis label
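A minimal sketch of replacing the x-axis label with xlab(). The data frame toy is a made-up stand-in for obesity_cov_2013, and the column names State and pct_obese are assumptions, not the real column names.

```r
library(ggplot2)

# Made-up stand-in for obesity_cov_2013 (column names are assumptions)
toy <- data.frame(
  State = rep(c("Ohio", "Texas", "Utah"), each = 4),
  pct_obese = c(30, 32, 29, 31, 28, 33, 35, 30, 24, 25, 26, 23)
)

p <- ggplot(toy, aes(x = State, y = pct_obese)) +
  geom_boxplot() +
  xlab("U.S. state")  # replaces the default label "State"

print(p)
```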

Or just remove x-axis label
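One way to drop the label is to pass NULL to xlab(). As before, toy and its column names are made up for illustration and are not the real obesity_cov_2013 columns.

```r
library(ggplot2)

# Made-up stand-in for obesity_cov_2013 (column names are assumptions)
toy <- data.frame(
  State = rep(c("Ohio", "Texas", "Utah"), each = 4),
  pct_obese = c(30, 32, 29, 31, 28, 33, 35, 30, 24, 25, 26, 23)
)

p <- ggplot(toy, aes(x = State, y = pct_obese)) +
  geom_boxplot() +
  xlab(NULL)  # NULL removes the x-axis title entirely

print(p)
```

Adding theme(axis.title.x = element_blank()) instead of xlab(NULL) is another common way to hide the title.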

Too many states on the x-axis? Try rotating the boxplot with coord_flip()

Notice that after coord_flip() the states are still labeled with xlab(), even though they now appear on the vertical axis.
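A small sketch of the flipped boxplot, assuming a made-up data frame toy with hypothetical columns State and pct_obese in place of obesity_cov_2013. The states stay mapped to x, so xlab() still names them after the flip.

```r
library(ggplot2)

# Made-up stand-in for obesity_cov_2013 (column names are assumptions)
toy <- data.frame(
  State = rep(c("Ohio", "Texas", "Utah"), each = 4),
  pct_obese = c(30, 32, 29, 31, 28, 33, 35, 30, 24, 25, 26, 23)
)

p <- ggplot(toy, aes(x = State, y = pct_obese)) +
  geom_boxplot() +
  coord_flip() +            # draw the x aesthetic on the vertical axis
  xlab("U.S. state") +      # still labels the states
  ylab("Percent obese")     # labels the values, now drawn horizontally

print(p)
```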

Want average values for each state? Get the mean of columns 7 through 13 for each state with this command:
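One possible way to compute these state means is base R's aggregate(); this is a sketch, not necessarily the command used in the lab. The data frame toy is a made-up 13-column stand-in for obesity_cov_2013, and the column names State, County, and X1 to X11 are assumptions.

```r
# Made-up stand-in for obesity_cov_2013 with 13 columns:
# State, County, then eleven numeric columns named X1..X11,
# so columns 7 through 13 are X5..X11.
toy <- data.frame(
  State = rep(c("Ohio", "Texas", "Utah"), each = 4),
  County = paste0("County", 1:12),
  matrix(seq_len(12 * 11), nrow = 12)
)

# Mean of columns 7 through 13 within each state.
# Every county counts equally; nothing is weighted by population.
States_2013 <- aggregate(
  toy[, 7:13],
  by = list(State = toy$State),
  FUN = mean
)

print(States_2013)
```

With the real data you would replace toy with obesity_cov_2013 and use whatever the state column is actually called.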

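States_2013 now holds, for each state, values averaged across counties (an unweighted mean, not weighted by population). Is it useful? Note that a boxplot of these state-averaged data shows much less, because each state is now reduced to a single value per variable.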
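To see why averaging across counties differs from averaging across population, here is a tiny made-up example. The columns population and pct_obese and all the numbers are invented for illustration.

```r
# Made-up county-level data (all names and numbers are assumptions)
toy <- data.frame(
  State = c("Ohio", "Ohio", "Utah", "Utah"),
  population = c(1000, 9000, 5000, 5000),
  pct_obese = c(20, 30, 25, 35)
)

# Unweighted: every county counts the same
unweighted <- tapply(toy$pct_obese, toy$State, mean)

# Population-weighted: large counties count more
weighted <- sapply(
  split(toy, toy$State),
  function(d) weighted.mean(d$pct_obese, d$population)
)

print(rbind(unweighted, weighted))
```

In this toy data the two Ohio counties differ greatly in size, so the weighted and unweighted means differ for Ohio, while Utah's equal-sized counties give the same answer either way.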