Course Outline

Course Overview

Modules

Assignments

Resources

Policies

Insurmountable Coding Problems


Some of my best friends are Zombies…





Must be submitted for Peer Commentary by 5:00 pm Wednesday, October 11th.


Peer Commentary and Final Homework DUE at 5:00 pm Wednesday, October 18th.



Create a new GitHub repo and git-referenced Rstudio Project called “AN588_Zombies_BUlogin”. Within that repo, create a new .Rmd file called “BUlogin_OriginalHomeworkCode_03”. Modules 03-08 will each have concepts and example code that will help you complete this assignment. Don’t forget to add your Peer Group and instructor as collaborators, and to accept their invitations to you. Making sure to push both the markdown and knitted .html files to your repository, do the following:

Load in the dataset “zombies.csv” from my GitHub repo. This data includes the first name, last name, and gender of the entire population of 1000 people who have survived the zombie apocalypse and are now ekeing out an existence somewhere on the East Coast, along with several other variables (height, weight, age, number of years of education, number of zombies they have killed, and college major; see here for info on important post-zombie apocalypse majors).

  1. Calculate the population mean and standard deviation for each quantitative random variable (height, weight, age, number of zombies killed, and years of education). NOTE: You will not want to use the built in var() and sd() commands as these are for samples.
  2. Use {ggplot} to make boxplots of each of these variables by gender.
  3. Use {ggplot} to make scatterplots of height and weight in relation to age. Do these variables seem to be related? In what way?
  4. Using histograms and Q-Q plots, check whether the quantitative variables seem to be drawn from a normal distribution. Which seem to be and which do not (hint: not all are drawn from the normal distribution)? For those that are not normal, can you determine from which common distribution they are drawn?
  5. Now use the sample() function to sample ONE subset of 30 zombie survivors (without replacement) from this population and calculate the mean and sample standard deviation for each variable. Also estimate the standard error for each variable, and construct the 95% confidence interval for each mean. Note that for the variables that are not drawn from the normal distribution, you may need to base your estimate of the CIs on slightly different code than for the normal…
  6. Now draw 99 more random samples of 30 zombie apocalypse survivors, and calculate the mean for each variable for each of these samples. Together with the first sample you drew, you now have a set of 100 means for each variable (each based on 30 observations), which constitutes a sampling distribution for each variable. What are the means and standard deviations of this distribution of means for each variable? How do the standard deviations of means compare to the standard errors estimated in [5]? What do these sampling distributions look like (a graph might help here)? Are they normally distributed? What about for those variables that you concluded were not originally drawn from a normal distribution?

Your Final Assignment for Homework 03

Your final assignment, due to me by 5:00 pm on October 18th, is to have in your AN588_Zombies_BUlogin repo only the following files (aside from repo basics like an .Rproj and README file):
    1. The FINAL PUSH of your Original Homework Code, including the five challenges you faced, as pushed to your repo by 5:00 pm on Wednesday October 18th, an R Markdown file named BUlogin_OriginalHomeworkCode_03.
    2. The FINAL PUSH of the Peer Commentary made on your code, an R Markdown file named Peerlogin_PeerCommentary_BUlogin_03 or PeerGroupX_PeerCommentary_BUlogin_03.
    3. The FINAL PUSH of your Final Homework Code, which has taken into account any changes recommended from the Peer Commentary and notes, an R Markdown file named BUlogin_FinalHomeworkCode_03.

Formatting Instructions: Please use the readthedown theme (from the {rmdformats} package) in your Final Homework Code, with a Header 2 title for each question you answer. These headers should be organized in a Table of Contents. Add to the top of the document, under your title, your favorite GIF, picture, or video related to zombies that you’re able to find online.


Homework 03 Solutions will be posted on October 18th.


NOTE: If you want your homework code to look nice (beyond being very well annotated and commented), and be easy to use by others, you can check out the relatively simple example R Markdown templates in the AN588_Week_3_caschmit repo.

Please also consider consulting the following helpful guidelines on how to write effective R Markdown documents (also available at the end of Module 03), which go well beyond the simple formatting of the templates.