ST314D F23-Data-Analysis-03

School: Oregon State University, Corvallis
Course: 314
Subject: Statistics
Date: Apr 3, 2024

© F2023 Intellectual Property of Kelsi Espinoza

ST314 Data Analysis #3

Question 1. (6 Points) Suppose a local Corvallis survey asks: “Do you think that adults that refuse to get the 2023’s flu vaccine should be subject to a fine?”

a. (2 points) How might only offering “yes” or “no” as possible answers impact the responses the researchers receive?
Some respondents might feel unsure or conflicted about the issue. With only two polar options and no neutral choice, these respondents are forced to pick a side, potentially skewing the results.

b. (2 points) In general, what could be some issues with this question and how it is worded?
The question presupposes that a fine is the primary or only response to consider. This narrows the scope of the conversation and might not capture the full range of public opinion on how to address those who don't get the flu vaccine.

c. (2 points) Suppose the survey first asked, “When was the last time that you got a flu vaccine?” How might the response change if this was asked before asking the question about fines?
Respondents would first reflect on their own behavior. Those who have recently gotten a flu vaccine might feel more justified or inclined to support fines for others who haven't. Conversely, those who haven't gotten the vaccine recently might feel defensive and be less likely to support fines.

Question 2. (9 Points) Investigators from PublicOpinions.com want to explore national voter opinions on providing healthcare to all (US) Americans following the pandemic. They post, from the PublicOpinions.com Snapchat account, an advertisement asking followers the following question: “Should everyone, regardless of employment status, have access to healthcare? Visit our website at PublicOpinions.com and fill out the survey by 11:59pm (Pacific time) and we’ll post the results to Snapchat.”

a. (3 points) What population are the investigators trying to sample, and do they obtain a representative sample from this sampling scheme? Explain.
The investigators are trying to sample US voters nationwide to learn their opinions on providing healthcare to all Americans following the pandemic. However, the sampling scheme is unlikely to yield a representative sample of that population, because Snapchat users who follow the account may differ from voters as a whole in age, other demographics, political leanings, technology usage, and so on.

b. (2 points) Specifically, what type of sampling scheme has been used?
Convenience sampling

c. (2 points) What type of sampling bias(es) may have occurred? Explain.
Demographic (undercoverage) bias: Snapchat users tend to be younger, which could result in an underrepresentation of older voters' opinions.
Non-response bias: Not everyone who sees the Snapchat ad will respond. The opinions of those who choose not to respond might be systematically different from those who do. Because participation is opt-in, this is also a form of voluntary response bias.

d. (2 points) What else could be a source of issue when it comes to this scenario? Explain.
Time constraints: The survey has a deadline of 11:59 pm Pacific time. This might discourage participation by people in other time zones or by those who see the ad late.

Use the following for Question 3 and Question 4:
In this section, use the R script, DA3_One_Variable_Plots_and_Summary_Stats.R, and the ST314 student survey dataset, ST314_SIS_F23.csv, to explore one categorical variable and one quantitative variable of your choice (excluding “subject” and “emails”). Download the R script and the dataset, open the R script and follow the command instructions. Check out the dataset legend to see what the variables represent. Then answer the following questions:

Question 3. (5 points) Categorical Variable

a. (1 point) Choose a categorical variable to explore. Which variable did you choose? Note: “subject” is off-limits because this was my example in the R code. Choose a different categorical variable.
temp_pref

b. (2 points) Paste the table of counts and bar chart for the categorical variable of your choosing. Include color and appropriate title/labels.
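
The pasted table and chart are not reproduced in this copy, and the course's own script is in R. As a rough illustration only, a table of counts and percents for temp_pref could be sketched in Python as below. The counts are the approximate values quoted in the answer to part c (about 70 “too cold”, about 23 “too hot”); treating these as exact and as the only two categories is an assumption.

```python
# Approximate counts quoted in the answer to part c.
# ASSUMPTION: these are the only two levels of temp_pref and the counts are exact.
counts = {"too cold": 70, "too hot": 23}

total = sum(counts.values())

# Percent of responses in each category, rounded to one decimal place.
percents = {level: round(100 * n / total, 1) for level, n in counts.items()}

# How many times more often "too cold" was chosen than "too hot".
ratio = counts["too cold"] / counts["too hot"]

# Print a small table of counts and percents.
print(f"{'temp_pref':<10}{'count':>7}{'percent':>9}")
for level, n in counts.items():
    print(f"{level:<10}{n:>7}{percents[level]:>9}")
```

Under these assumptions the split is roughly three to one, so reporting percents alongside counts makes the imbalance easier to read than counts alone.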

c. (2 points) Briefly, describe the distribution in context. Recall, categorical variables are summarized by counts and/or percents.
Between “too cold” and “too hot”, the distribution favors “too cold”. With about 70 responses, “too cold” was chosen roughly three times as often as “too hot”, which has around 23.

Question 4. (15 points) Quantitative Variable

a. (1 point) Choose a quantitative variable to explore. Which variable did you choose? Note: “emails” is off-limits because it was the example in the R code. Choose a different quantitative variable.
States

b. (2 points) Create a histogram of the variable. Include color and an appropriate title on your plot. Paste plot.

c. (2 points) Create a boxplot of the variable. Include color and an appropriate title on your plot. Paste plot.

d. (1 point) Which plot do you prefer to visualize the variable? Why?
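
To make the comparison in part d concrete, the sketch below computes what each plot summarizes: bin counts for a histogram, and the five-number summary plus the common 1.5 × IQR outlier rule for a boxplot. It is written in Python rather than the course's R script, and the values are hypothetical, not taken from ST314_SIS_F23.csv. The bin width and the quantile method are also illustrative choices, and plotting software may compute quartiles slightly differently.

```python
import statistics
from collections import Counter

# HYPOTHETICAL values for a count-type quantitative variable;
# these are NOT taken from ST314_SIS_F23.csv.
values = [1, 2, 2, 3, 3, 3, 4, 5, 5, 6, 8, 10, 12, 15, 30]

# What a boxplot summarizes: the five-number summary.
q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
five_number = (min(values), q1, median, q3, max(values))

# Common 1.5 * IQR rule for flagging points drawn individually as outliers.
iqr = q3 - q1
outliers = [v for v in values if v < q1 - 1.5 * iqr or v > q3 + 1.5 * iqr]

# What a histogram summarizes: how many values fall in each bin.
# Each bin is labeled by its lower edge; a width of 5 is an arbitrary choice.
bin_width = 5
bins = Counter((v // bin_width) * bin_width for v in values)

print("five-number summary:", five_number)
print("outliers:", outliers)
for lower_edge in sorted(bins):
    print(f"[{lower_edge}, {lower_edge + bin_width}): {bins[lower_edge]}")
```

With data like these, the bin counts show the right-skewed shape directly, while the boxplot summary isolates the center, spread, and the single extreme value.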