Refer back to the data in Exercise 4, in which y = ammonium concentration (mg/L) and x = transpiration (ml/h). Summary quantities include n = 13, Σxi = 303.7, Σyi = 52.8, Sxx = 1585.230769, Sxy = −341.959231, and Syy = 77.270769.
- a. Obtain the equation of the estimated regression line and use it to calculate a point prediction of ammonium concentration for a future observation made when transpiration is 25 ml/h.
- b. What happens if the estimated regression line is used to calculate a point estimate of true average concentration when transpiration is 45 ml/h? Why does it not make sense to calculate this point estimate?
- c. Calculate and interpret s.
- d. Do you think the simple linear regression model does a good job of explaining observed variation in concentration? Explain.
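The quantities asked for in parts a to d follow directly from the summary statistics quoted in the exercise. The sketch below is one way to carry out the arithmetic; the variable names and the `predict` helper are illustrative, and the shortcut SSE = Syy − b1·Sxy is the standard computing formula rather than anything specific to this page.

```python
import math

# Summary quantities quoted in the exercise statement
n = 13
sum_x = 303.7        # sum of transpiration values (ml/h)
sum_y = 52.8         # sum of ammonium concentrations (mg/L)
Sxx = 1585.230769
Sxy = -341.959231
Syy = 77.270769

# Least-squares slope and intercept
x_bar = sum_x / n
y_bar = sum_y / n
b1 = Sxy / Sxx
b0 = y_bar - b1 * x_bar


def predict(x):
    """Point prediction of ammonium concentration at transpiration x."""
    return b0 + b1 * x


# Error sum of squares, residual standard deviation, coefficient of determination
sse = Syy - b1 * Sxy
s = math.sqrt(sse / (n - 2))
r_squared = 1 - sse / Syy

print(f"estimated line: y = {b0:.4f} + ({b1:.4f}) x")
print(f"prediction at x = 25: {predict(25):.4f}")
print(f"prediction at x = 45: {predict(45):.4f}")
print(f"s = {s:.4f}, r^2 = {r_squared:.4f}")
```

The prediction at x = 45 comes out negative, which is the point of part b: a concentration cannot be negative, so the fitted line should not be used that far from the observed data.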
a.
Find the interval estimate for the slope of the population regression.
Answer to Problem 35E
The 95% confidence interval for the slope of the population regression is (0.632, 2.440).
Explanation of Solution
Given info:
The summary statistics of the data correspond to the variables motion sickness dose (x) and % reported nausea (y).
Calculation:
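Linear regression model:
A simple linear regression model is given as $\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$, where $\hat{y}$ is the predicted value of the response variable and $x$ is the predictor variable.
Y-intercept:
In the linear equation $\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$, the constant $\hat{\beta}_0$ is the y-intercept, the value of $\hat{y}$ when $x = 0$.
The general formula to obtain the y-intercept is $\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$.
Slope:
In the linear equation $\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$, the constant $\hat{\beta}_1$ is the slope, the expected change in $y$ for a 1-unit increase in $x$.
The general formula to obtain the slope is $\hat{\beta}_1 = S_{xy} / S_{xx}$, where $S_{xy} = \sum x_i y_i - (\sum x_i)(\sum y_i)/n$ and $S_{xx} = \sum x_i^2 - (\sum x_i)^2/n$.
Thus, the point estimate of the slope is $\hat{\beta}_1 \approx 1.536$, the midpoint of the confidence interval (0.632, 2.440) obtained in this part.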
Total sum of squares (SST):
The total variation in the observed values of the response variable is defined as the total sum of squares. The formula for the total sum of squares is $SST = S_{yy} = \sum (y_i - \bar{y})^2 = \sum y_i^2 - (\sum y_i)^2/n$.
Regression sum of squares (SSR):
The variation in the observed values of the response variable explained by the regression is defined as the regression sum of squares. The formula for the regression sum of squares is $SSR = \sum (\hat{y}_i - \bar{y})^2 = \hat{\beta}_1 S_{xy}$.
Error sum of squares (SSE):
The variation in the observed values of the response variable that is not explained by the regression is defined as the error sum of squares. The formula for the error sum of squares is $SSE = \sum (y_i - \hat{y}_i)^2$.
The general formula to obtain the error sum of squares is $SSE = SST - SSR = S_{yy} - \hat{\beta}_1 S_{xy}$.
Estimate of error standard deviation:
The general formula for the estimate of the error standard deviation is $s = \sqrt{SSE/(n-2)}$.
Estimate of the standard deviation of the slope coefficient:
The general formula for the estimate of the standard deviation of the slope coefficient is $s_{\hat{\beta}_1} = s / \sqrt{S_{xx}}$.
The defining formula for $S_{xx}$ is $S_{xx} = \sum (x_i - \bar{x})^2$.
Thus, the estimate of the standard deviation of the slope coefficient is $s_{\hat{\beta}_1} \approx 0.424$, the value consistent with the interval (0.632, 2.440) and the test statistic 3.6226 reported in this solution.
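Confidence interval:
The general formula for the confidence interval for the slope of the regression line is $\hat{\beta}_1 \pm t_{\alpha/2,\,n-2}\, s_{\hat{\beta}_1}$.
Where, $\hat{\beta}_1$ is the point estimate of the slope, $t_{\alpha/2,\,n-2}$ is the critical value of the t-distribution, and $s_{\hat{\beta}_1}$ is the estimated standard deviation of the slope coefficient.
Since the level of confidence is not specified, the conventional 95% confidence level is used.
Critical value:
For the 95% confidence level, $\alpha = 0.05$ and $\alpha/2 = 0.025$.
Degrees of freedom:
The sample size is $n = 17$.
The degrees of freedom is $n - 2 = 17 - 2 = 15$.
From Table A.5 of the t-distribution in Appendix A, the critical value corresponding to the right tail area 0.025 and 15 degrees of freedom is 2.131.
Thus, the critical value is $t_{0.025,\,15} = 2.131$.
The 95% confidence interval is $1.536 \pm (2.131)(0.424) \approx (0.632, 2.440)$.
Thus, the 95% confidence interval for the slope of the population regression is (0.632, 2.440).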
Interpretation:
We are 95% confident that the expected change in % reported nausea associated with a 1-unit increase in motion sickness dose lies between 0.632 and 2.440.
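The interval arithmetic for part a can be sketched in a few lines. The critical value 2.131 is the table value quoted in the solution; the slope 1.536 and standard error 0.424 are assumptions backed out from the reported interval and test statistic, since the page does not show them explicitly. The function name is illustrative.

```python
def slope_confidence_interval(b1, se_b1, t_crit):
    """Return (lower, upper) for b1 +/- t_crit * se_b1."""
    margin = t_crit * se_b1
    return b1 - margin, b1 + margin


# Assumed inputs: slope and standard error recovered from the reported
# interval (0.632, 2.440); 2.131 is the quoted table value for 15 df.
lower, upper = slope_confidence_interval(b1=1.536, se_b1=0.424, t_crit=2.131)
print(f"95% CI for the slope: ({lower:.3f}, {upper:.3f})")
```

Because the interval lies entirely above zero, it already suggests the conclusion of part b: a slope of zero is not a plausible value.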
b.
Test whether there is enough evidence to conclude that the predictor variable motion sickness dose is useful for predicting the value of the response variable % reported nausea.
Answer to Problem 35E
There is sufficient evidence to conclude that the predictor variable motion sickness dose is useful for predicting the value of the response variable % reported nausea.
Explanation of Solution
Calculation:
From part (a), the slope coefficient of the regression line is $\hat{\beta}_1 \approx 1.536$.
The test hypotheses are given below:
Null hypothesis: $H_0: \beta_1 = 0$
That is, there is no useful linear relationship between the variables motion sickness dose and % reported nausea.
Alternative hypothesis: $H_a: \beta_1 \neq 0$
That is, there is a useful linear relationship between the variables motion sickness dose and % reported nausea.
T-test statistic:
The test statistic is $t = \hat{\beta}_1 / s_{\hat{\beta}_1}$.
Degrees of freedom:
The sample size is $n = 17$.
The degrees of freedom is $n - 2 = 17 - 2 = 15$.
Thus, the degrees of freedom is 15.
Level of significance:
Here, the level of significance is not given.
So, the conventional level of significance $\alpha = 0.05$ is used.
For the level of significance $\alpha = 0.05$, a two-tailed test places area $\alpha/2 = 0.025$ in each tail.
From Table A.5 of the t-distribution in Appendix A, the critical value corresponding to the right tail area 0.025 and 15 degrees of freedom is 2.131.
Thus, the critical value is $t_{0.025,\,15} = 2.131$.
From part (a), the estimate of the standard deviation of the slope coefficient is $s_{\hat{\beta}_1} \approx 0.424$.
Test statistic under null hypothesis:
Under the null hypothesis, the test statistic is obtained as follows: $t = 1.536 / 0.424 \approx 3.6226$.
Thus, the test statistic is 3.6226.
Decision criteria for the classical approach:
If $|t| \geq t_{\alpha/2,\,n-2}$, then reject the null hypothesis $H_0$.
Conclusion:
Here, the test statistic is 3.6226 and the critical value is 2.131.
The t statistic is greater than the critical value.
That is, $3.6226 > 2.131$.
Based on the decision rule, the null hypothesis is rejected.
Hence, there is a linear relationship between the predictor variable motion sickness dose and the response variable % reported nausea.
Therefore, there is sufficient evidence to conclude that the predictor variable motion sickness dose is useful for predicting the value of the response variable % reported nausea.
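The classical decision rule can be checked numerically. As a sketch, the code below recovers the slope and its standard error from the reported interval endpoints and the quoted critical value, then forms the t statistic; the function name is hypothetical, and because the endpoints are rounded the statistic agrees with the reported 3.6226 only to about two decimal places.

```python
def slope_t_test(ci_low, ci_high, t_crit):
    """Two-sided test of H0: beta1 = 0 using quantities backed out of a t interval."""
    b1 = (ci_low + ci_high) / 2                # interval midpoint = point estimate
    se_b1 = (ci_high - ci_low) / (2 * t_crit)  # half-width / critical value
    t_stat = b1 / se_b1
    reject = abs(t_stat) >= t_crit
    return b1, se_b1, t_stat, reject


# Interval (0.632, 2.440) and critical value 2.131 are taken from the solution text
b1, se_b1, t_stat, reject = slope_t_test(0.632, 2.440, 2.131)
print(f"b1 = {b1:.3f}, se = {se_b1:.3f}, t = {t_stat:.3f}, reject H0: {reject}")
```

Rejecting the null hypothesis here is equivalent to observing that the 95% interval for the slope excludes zero.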
c.
Check whether it is plausible to estimate the expected % reported nausea when the motion sickness dose is 5.0 using the obtained regression line.
Answer to Problem 35E
No, it is not plausible to estimate the expected % reported nausea when the motion sickness dose is 5.0 using the obtained regression line.
Explanation of Solution
Calculation:
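Linear regression model:
A simple linear regression model is given as $\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$.
Y-intercept:
In the linear equation $\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$, the constant $\hat{\beta}_0$ is the y-intercept.
The general formula to obtain the y-intercept is $\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$.
The y-intercept of the regression model is obtained by substituting the sample means and the slope into this formula.
From part (a), the slope coefficient of the regression line is $\hat{\beta}_1 \approx 1.536$.
Therefore, the regression equation of the variables motion sickness dose (x) and % reported nausea (y) is $\hat{y} = \hat{\beta}_0 + 1.536x$.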
Predicted value of % reported nausea when the motion sickness dose is 5.0:
The predicted value of % reported nausea when the motion sickness dose is 5.0 is obtained by substituting $x = 5.0$ into the regression equation.
Thus, the predicted value of % reported nausea for a motion sickness dose of 5.0 is –7.947.
Here, the predicted % reported nausea is negative, which is not possible in reality.
Thus, the predicted value is not meaningful.
Moreover, it is given that the values of the variable motion sickness dose range from 6.0 to 17.6.
The value 5.0 is outside this range. That is, no observation was made at or near a dose of 5.0, so the prediction is an extrapolation.
Hence, the regression line may not give a good estimate of the expected % reported nausea when the motion sickness dose is 5.0.
Therefore, it is not plausible to estimate the expected % reported nausea when the motion sickness dose is 5.0 using the obtained regression line.
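The two objections raised in part c (the dose lies outside the observed range, and the predicted percentage is negative) can be written as a small sanity check. This is only a sketch: the function name and the lower bound of 0 for a percentage are illustrative assumptions, while the range 6.0 to 17.6 and the prediction –7.947 come from the solution text.

```python
def check_prediction(x0, x_min, x_max, y_hat, y_lower_bound=0.0):
    """List the reasons a point prediction at x0 should not be trusted."""
    problems = []
    if not (x_min <= x0 <= x_max):
        problems.append("extrapolation")      # x0 outside the observed x range
    if y_hat < y_lower_bound:
        problems.append("impossible value")   # e.g. a negative percentage
    return problems


# Dose 5.0, observed dose range 6.0 to 17.6, predicted nausea -7.947 (from the text)
issues = check_prediction(x0=5.0, x_min=6.0, x_max=17.6, y_hat=-7.947)
print(issues)
```

Either problem alone would be enough to distrust the estimate; here both occur together.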
d.
Find the interval estimate for the slope of the population regression after eliminating the observation with motion sickness dose 6.0.
Comment whether the observation
Answer to Problem 35E
The 95% confidence interval for the slope of the population regression after eliminating the observation with motion sickness dose 6.0 is (0.3719, 2.7301).
Yes, the observation
Explanation of Solution
Calculation:
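Linear regression model:
A simple linear regression model is given as $\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$, where $\hat{y}$ is the predicted value of the response variable and $x$ is the predictor variable.
Here, the observation with motion sickness dose 6.0 has to be eliminated from the data.
That is, the value 6.0 has to be removed from the variable motion sickness dose, together with the corresponding value of % reported nausea.
The summary statistics after eliminating the observation are as follows:
Sample size: $n = 17 - 1 = 16$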
Sum of the variable:
Sum of squares of the variable:
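Y-intercept:
In the linear equation $\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$, the constant $\hat{\beta}_0$ is the y-intercept.
The general formula to obtain the y-intercept is $\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$.
Slope:
In the linear equation $\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$, the constant $\hat{\beta}_1$ is the slope.
The general formula to obtain the slope is $\hat{\beta}_1 = S_{xy} / S_{xx}$.
Thus, the point estimate of the slope is $\hat{\beta}_1 \approx 1.551$, the midpoint of the confidence interval (0.3719, 2.7301) obtained in this part.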
Total sum of squares (SST):
The total variation in the observed values of the response variable is defined as the total sum of squares. The formula for the total sum of squares is $SST = S_{yy} = \sum (y_i - \bar{y})^2$.
Regression sum of squares (SSR):
The variation in the observed values of the response variable explained by the regression is defined as the regression sum of squares. The formula for the regression sum of squares is $SSR = \sum (\hat{y}_i - \bar{y})^2 = \hat{\beta}_1 S_{xy}$.
Error sum of squares (SSE):
The variation in the observed values of the response variable that is not explained by the regression is defined as the error sum of squares. The formula for the error sum of squares is $SSE = \sum (y_i - \hat{y}_i)^2$.
The general formula to obtain the error sum of squares is $SSE = SST - SSR = S_{yy} - \hat{\beta}_1 S_{xy}$.
Estimate of error standard deviation:
The general formula for the estimate of the error standard deviation is $s = \sqrt{SSE/(n-2)}$.
Estimate of the standard deviation of the slope coefficient:
The general formula for the estimate of the standard deviation of the slope coefficient is $s_{\hat{\beta}_1} = s / \sqrt{S_{xx}}$.
The defining formula for $S_{xx}$ is $S_{xx} = \sum (x_i - \bar{x})^2$.
Thus, the estimate of the standard deviation of the slope coefficient is $s_{\hat{\beta}_1} \approx 0.5497$, the value consistent with the interval (0.3719, 2.7301) and the critical value 2.145 used in this part.
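Confidence interval:
The general formula for the confidence interval for the slope of the regression line is $\hat{\beta}_1 \pm t_{\alpha/2,\,n-2}\, s_{\hat{\beta}_1}$.
Where, $\hat{\beta}_1$ is the point estimate of the slope, $t_{\alpha/2,\,n-2}$ is the critical value of the t-distribution, and $s_{\hat{\beta}_1}$ is the estimated standard deviation of the slope coefficient.
Since the level of confidence is not specified, the conventional 95% confidence level is used.
Critical value:
For the 95% confidence level, $\alpha = 0.05$ and $\alpha/2 = 0.025$.
Degrees of freedom:
The sample size is $n = 16$.
The degrees of freedom is $n - 2 = 16 - 2 = 14$.
From Table A.5 of the t-distribution in Appendix A, the critical value corresponding to the right tail area 0.025 and 14 degrees of freedom is 2.145.
Thus, the critical value is $t_{0.025,\,14} = 2.145$.
The 95% confidence interval is $1.551 \pm (2.145)(0.5497) \approx (0.3719, 2.7301)$.
Thus, the 95% confidence interval for the slope of the population regression is (0.3719, 2.7301).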
Interpretation:
We are 95% confident that the expected change in % reported nausea associated with a 1-unit increase in motion sickness dose lies between 0.3719 and 2.7301.
Comparison:
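The 95% confidence interval for the slope of the population regression with the observation is (0.632, 2.440).
The 95% confidence interval for the slope of the population regression after eliminating the observation is (0.3719, 2.7301).
Here, by observing both the intervals it is clear that the interval is wider after the observation is eliminated, while its midpoint changes only slightly (from about 1.536 to about 1.551).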
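One way to make the comparison concrete is to look at the center and width of each interval. The sketch below uses only the two intervals reported in this solution; the helper name and the choice of center shift and width ratio as summary measures are illustrative, not part of the textbook's method.

```python
def summarize(lower, upper):
    """Center (point estimate of the slope) and width of a confidence interval."""
    return {"center": (lower + upper) / 2, "width": upper - lower}


full = summarize(0.632, 2.440)        # all 17 observations
reduced = summarize(0.3719, 2.7301)   # after dropping the dose-6.0 observation

center_shift = reduced["center"] - full["center"]
width_ratio = reduced["width"] / full["width"]

print(f"slope estimate: {full['center']:.3f} -> {reduced['center']:.3f} "
      f"(shift {center_shift:+.3f})")
print(f"interval width: {full['width']:.3f} -> {reduced['width']:.3f} "
      f"(ratio {width_ratio:.2f})")
```

The slope estimate moves by about 0.015 while the interval becomes roughly 30% wider, so the deleted point mainly affects the precision of the slope estimate rather than its value.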
Chapter 12 Solutions
WEBASSIGN ACCESS FOR PROBABILITY & STATS