Examlex
One of the most frequently used summary statistics for a baseball hitter's performance is the batting average. In essence, it measures hits as a fraction of the opportunities to hit (appearances "at the plate"). The management of a professional team has hired you to predict next season's performance of a certain hitter who is up for contract renegotiation after a particularly great year. To analyze the situation, you search the literature and find a study of players who had at least 50 at bats in both 1997 and 1998. There were 379 such players.
(a) The reported regression line in the study is
$\widehat{\text{BattingAvg}}_{1998} = 0.138 + 0.467 \times \text{BattingAvg}_{1997}$; $R^2 = 0.17$
and the intercept and slope are both statistically significant. What does the regression imply about the relationship between past performance and present performance? What values would the slope and intercept have to take on for the future performance to be as good as the past performance, on average?
(b) Being somewhat puzzled about the results, you call your econometrics professor and describe the results to her. She says that she is not surprised at all, since this is an example of "Galton's Fallacy." She explains that Sir Francis Galton regressed the height of offspring on the mid-height of their parents and found a positive intercept and a slope between zero and one. He referred to this result as "regression towards mediocrity." Why do you think econometricians refer to this result as a fallacy?
(c) Your professor continues by mentioning that this is an example of errors-in-variables bias. What does she mean by that in general? In this case, why would batting averages be measured with error? Are baseball statisticians sloppy?
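To make the errors-in-variables idea in (c) concrete, here is a minimal simulation sketch. It is not taken from the study: the population of 1000 players, the spread of true abilities (mean .270, standard deviation .020), and the 300 at bats per season are all hypothetical choices. Each player's true ability is held fixed across two seasons, and the observed average differs from it only through sampling noise in a finite number of at bats.

```python
import random

def simulate_season(true_avg, at_bats, rng):
    """Observed batting average: each at bat is a hit with probability true_avg."""
    hits = sum(rng.random() < true_avg for _ in range(at_bats))
    return hits / at_bats

def ols(x, y):
    """Return (intercept, slope) from a least-squares regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical population: true ability does not change between seasons.
true_avgs = [rng.gauss(0.270, 0.020) for _ in range(1000)]
season_1 = [simulate_season(p, 300, rng) for p in true_avgs]
season_2 = [simulate_season(p, 300, rng) for p in true_avgs]

# Regress the second season's observed average on the first season's.
intercept, slope = ols(season_1, season_2)
print(f"intercept = {intercept:.3f}, slope = {slope:.3f}")
```

Under these assumptions the estimated slope should fall well below one and the intercept above zero, even though no player's underlying ability changed. The observed average is a noisy measure of ability, and that noise in the regressor pulls the slope toward zero.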
(d) The top three performers in terms of highest batting averages in 1997 were Tony Gwynn (.372), Larry Walker (.366), and Mike Piazza (.362). Given your answers for the previous questions, what would be your predictions for the 1998 season?
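As a starting point for (d), the sketch below mechanically applies the reported regression line to the three 1997 averages. The function and variable names are illustrative only, and whether the fitted line is the right forecasting tool is part of what the question asks you to judge. It also computes the break-even average at which the line predicts no change from one season to the next.

```python
# Coefficients as reported in the study's regression line.
INTERCEPT = 0.138
SLOPE = 0.467

def predict_1998(avg_1997):
    """Predicted 1998 batting average from the reported regression line."""
    return INTERCEPT + SLOPE * avg_1997

top_1997 = {"Tony Gwynn": 0.372, "Larry Walker": 0.366, "Mike Piazza": 0.362}

predictions = {name: predict_1998(avg) for name, avg in top_1997.items()}
for name, pred in predictions.items():
    print(f"{name}: {pred:.3f}")
# Tony Gwynn: 0.312
# Larry Walker: 0.309
# Mike Piazza: 0.307

# Average at which predicted 1998 equals 1997: solve x = INTERCEPT + SLOPE * x.
break_even = INTERCEPT / (1 - SLOPE)
print(f"break-even average: {break_even:.3f}")
# break-even average: 0.259
```

All three players sit far above the break-even point, so the line predicts a lower average for each of them in 1998 than in 1997.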