<HTML>
<HEAD>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=windows-1252">
<META NAME="Generator" CONTENT="Microsoft Word 97">
<TITLE>a. X values being closer to their mean implies that Sxx is smaller. From equations (3.18) and (3.19), we see that a smaller Sxx means a larger variance. Thus the estimates are less precisely estimated and the statement is FALSE.</TITLE>
<META NAME="Template" CONTENT="C:\Program Files\Microsoft Office\Office\html.dot">
</HEAD>
<BODY LINK="#0000ff" VLINK="#800080">

<P ALIGN="JUSTIFY">I. </P>
<P ALIGN="JUSTIFY">a. (4 points) X values being closer to their mean implies that S<SUB>xx</SUB> is smaller. From equations (3.18) <IMG SRC="Image1.gif" WIDTH=96 HEIGHT=24>and (3.19) <IMG SRC="Image2.gif" WIDTH=138 HEIGHT=22>, we see that a smaller S<SUB>xx</SUB> means a larger variance. Thus the estimates are less precisely estimated, and the statement is FALSE.</P>
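<P ALIGN="JUSTIFY">The effect in (a) can be sketched numerically. This is an illustrative calculation with made-up numbers, assuming the textbook formula Var(b<SUB>2</SUB>-hat) = <FONT FACE="Symbol">s</FONT><SUP>2</SUP>/S<SUB>xx</SUB>:</P>

```python
# Illustrative sketch (numbers assumed, not from the assignment):
# under Var(b2_hat) = sigma^2 / Sxx, X values closer to their mean
# give a smaller Sxx and hence a larger slope variance.

def sxx(xs):
    """Sum of squared deviations of xs about its own mean."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs)

sigma2 = 4.0                       # assumed error variance
spread = [1, 3, 5, 7, 9]           # X values far from their mean (5)
tight = [4, 4.5, 5, 5.5, 6]        # same mean, X values close to it

var_spread = sigma2 / sxx(spread)  # 4 / 40  = 0.1
var_tight = sigma2 / sxx(tight)    # 4 / 2.5 = 1.6
print(var_spread, var_tight)
```

<P ALIGN="JUSTIFY">The tighter X sample gives the same mean but a slope variance sixteen times larger, which is the point of the answer.</P>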
<P ALIGN="JUSTIFY">b. (4 points) FALSE because for unbiasedness we need Assumption 3.3 (each u is a random variable with E(u)=0) and Assumption 3.4 (each X<SUB>t</SUB> is given and not a random variable). Violation of Assumption 3.4 implies that unbiasedness no longer holds.</P>
<P ALIGN="JUSTIFY">c. (4 points) Assumption 3.8 (Each u<SUB>t</SUB> is distributed as N(0,<FONT FACE="Symbol">s</FONT> <SUP>2</SUP>)) is needed only for hypothesis testing. Thus BLUE still holds and the statement is FALSE.</P>
<P ALIGN="JUSTIFY">d. (4 points) TRUE because t- and F- distributions for the test statistics were derived from the assumption of normality which is a must for hypothesis testing.</P>
<P ALIGN="JUSTIFY">e. (4 points) TRUE because the width of a confidence interval directly depends on the standard error of an estimate.</P>
<P ALIGN="JUSTIFY">f. (4 points) TRUE because if Var(x) is large, then from equations (3.18) <IMG SRC="Image1.gif" WIDTH=96 HEIGHT=24>and (3.19) <IMG SRC="Image2.gif" WIDTH=138 HEIGHT=22>the variance of the estimate will be smaller and hence the confidence interval will be narrower.</P>
<P ALIGN="JUSTIFY">g. (4 points) FALSE because a high p-value means that rejecting H<SUB>0</SUB> would carry a high probability of Type I error. So we should not reject H<SUB>0</SUB>, and therefore we should not conclude that the coefficient is significant.</P>
<P ALIGN="JUSTIFY">h. (4 points) TRUE because a higher level of significance means a lower value for t* and hence actual |t<SUB>c</SUB>| is more likely to be to the right of t*. Also, a higher level of significance means a greater chance for p-value to be below it and hence more likely for the null hypothesis to be rejected, implying significance of a coefficient.</P>
<P ALIGN="JUSTIFY">i. (4 points) PARTLY TRUE. Violation of Assumption 3.5 (all the u's are identically distributed with the same variance <FONT FACE="Symbol">s</FONT> <SUP>2</SUP>) and Assumption 3.6 (the u's are independently distributed) only affects the BLUE property. Thus the estimators are still unbiased and consistent, but no longer BLUE.</P>
<P ALIGN="JUSTIFY">j. (4 points) FALSE. The null hypothesis is a statement about whether or not the parameter has a certain value. This is either true or not true and therefore it is meaningless to attribute a probability to whether H<SUB>0</SUB> is true or not. However, the rejection of a true hypothesis, which is Type I error, is a random event because it can change from trial to trial. The p-value is the probability of making this type of mistake.</P>
<P ALIGN="JUSTIFY">II. (20 points) Turn in printout.</P>
<P>Model A:<FONT SIZE=2> Dependent variable - ATTEND</FONT></P>
<PRE>VARIABLE&#9;COEFFICIENT&#9; STDERROR&#9; T STAT&#9;&#9; 2Prob(t &gt; |T|)
0) constant&#9;-861.272511&#9; 577.486631&#9; -1.491415&#9; 0.140282
2) POP&#9;&#9;0.231068&#9; 0.042602&#9; 5.42388&#9; &lt; 0.0001 ***
4) PRIORWIN&#9;16.617537&#9; 3.868392&#9; 4.295722&#9; &lt; 0.0001 ***
5) CURNTWIN&#9;16.003534&#9; 6.325326&#9; 2.530073&#9; 0.013624 **
10) G5&#9;&#9;-16.869136&#9; 6.698056&#9; -2.518512&#9; 0.01404 **
12) OTHER&#9;-524.333953&#9; 122.677053&#9; -4.2741&#9; &lt; 0.0001 ***
13) TEAMS&#9;-206.042514&#9; 59.757073&#9; -3.448002&#9; 0.000953 ***

Mean of dep. var.&#9; 1782.86491&#9; S.D. of dep. variable&#9; &#9;597.994624
Error Sum of Sq (ESS)&#9; 6.810621e+06&#9; Std Err of Resid. (sgmahat)&#9;309.716378
Unadjusted R-squared&#9; 0.753&#9;&#9; Adjusted R-squared&#9;&#9;0.732
F-statistic (6, 71)&#9; 36.008266&#9; pvalue = Prob(F &gt; 36.008) is&#9;&lt; 0.0001
Durbin-Watson Stat.&#9; 2.236561&#9; First-order auto corr coeff&#9;-0.130</PRE>
<P ALIGN="JUSTIFY">&nbsp;</P>
<P ALIGN="JUSTIFY">Model B: Dependent variable - ATTEND</P>
<PRE>VARIABLE&#9; COEFFICIENT&#9; STDERROR&#9; T STAT&#9;&#9; 2Prob(t &gt; |T|)
0) constant&#9; -886.421023&#9; 573.215343&#9; -1.546401&#9; 0.126517
2) POP&#9;&#9; 0.217937&#9; 0.043216&#9; 5.042996&#9; &lt; 0.0001 ***
4) PRIORWIN&#9; 17.706879&#9; 3.910038&#9; 4.528569&#9; &lt; 0.0001 ***
5) CURNTWIN&#9; 15.712971&#9; 6.278862&#9; 2.502519&#9; 0.014669 **
6) G1&#9;&#9; -22.263022&#9; 15.264011&#9; -1.45853&#9; 0.149167
10) G5&#9;&#9; -13.162167&#9; 7.114941&#9; -1.849933&#9; 0.068544 *
12) OTHER&#9; -509.632258&#9; 122.131256&#9; -4.172824&#9; &lt; 0.0001 ***
13) TEAMS&#9; -190.289705&#9; 60.263974&#9; -3.157603&#9; 0.002348 ***</PRE>
<FONT SIZE=2><P>&nbsp;</P></FONT>
<PRE>Mean of dep. var.&#9; 1782.86491&#9; S.D. of dep. variable&#9; &#9;597.994624
Error Sum of Sq (ESS)&#9; 6.609749e+06&#9; Std Err of Resid. (sgmahat)&#9;307.286498
Unadjusted R-squared&#9; 0.760&#9;&#9; Adjusted R-squared&#9;&#9;0.736
F-statistic (7, 70)&#9; 31.65818&#9; pvalue = Prob(F &gt; 31.658) is&#9;&lt; 0.0001
Durbin-Watson Stat.&#9; 2.18034&#9; First-order auto corr coeff&#9;-0.102</PRE>
<P ALIGN="JUSTIFY">&nbsp;</P>
<P ALIGN="JUSTIFY">1. (8 points) Using the data-based model reduction technique, we arrive at Model A (the last model): at each step we omitted the variable with the least significant coefficient (the highest p-value).</P>
<P ALIGN="JUSTIFY">2. (8 points) Each step reduces the model selection statistics. Note that Model B (the 2<SUP>nd</SUP>-to-last model) has the lowest model selection statistics, and all the coefficients except G1's are highly significant. Also, the coefficients for PRIORWIN, G5, and OTHER differ considerably between Models A and B, so the bias from omitting G1 might be serious. Based on 5 of the 8 model selection criteria, Model B appears to be &#145;best&#146; and is chosen as the final model for interpretation.</P>
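<P ALIGN="JUSTIFY">The fit statistics in the printouts can be reproduced from the reported ESS values. A minimal sketch, assuming n = 78 (implied by Model A's F(6, 71) statistic) and using the standard log-likelihood-based AIC up to an additive constant, which may differ from the textbook's exact criterion:</P>

```python
import math

# Reproduce R-squared, adjusted R-squared, and an AIC-style criterion
# for Models A and B from the printouts. n = 78 follows from F(6, 71).
n = 78
sd_y = 597.994624                    # S.D. of dependent variable ATTEND
tss = (n - 1) * sd_y ** 2            # total sum of squares

models = {"A": (7, 6.810621e6),      # (number of coefficients, ESS)
          "B": (8, 6.609749e6)}

results = {}
for name, (k, ess) in models.items():
    r2 = 1 - ess / tss
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k)
    aic = n * math.log(ess / n) + 2 * k   # AIC up to an additive constant
    results[name] = (round(r2, 3), round(adj_r2, 3), aic)

print(results)  # R-squared figures match the printout: A 0.753/0.732, B 0.760/0.736
```

<P ALIGN="JUSTIFY">Model B has the lower AIC despite the extra coefficient, consistent with its selection above.</P>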
<P ALIGN="JUSTIFY">3. (8 points) The population of a city, the capacity of the stadium, and the home team's wins in the previous and current years are likely to affect attendance at baseball games positively. The measures G1 through G5, GF, OTHER, and TEAMS are likely to have negative effects.</P>
<P ALIGN="JUSTIFY">In the regression, POP, PRIORWIN, G5, OTHER, and TEAMS had coefficients significant at levels less than 7 percent. All the coefficients had expected signs. </P>
<P ALIGN="JUSTIFY">4. (8 points) H<SUB>0</SUB>: <FONT FACE="Symbol">b</FONT> <SUB>3</SUB> = <FONT FACE="Symbol">b</FONT> <SUB>7</SUB> = <FONT FACE="Symbol">b</FONT> <SUB>8</SUB> = <FONT FACE="Symbol">b</FONT> <SUB>9</SUB> = <FONT FACE="Symbol">b</FONT> <SUB>11</SUB> = 0 H<SUB>1</SUB>: At least one of <FONT FACE="Symbol">b</FONT> <SUB>3</SUB>, <FONT FACE="Symbol">b</FONT> <SUB>7</SUB>, <FONT FACE="Symbol">b</FONT> <SUB>8</SUB>, <FONT FACE="Symbol">b</FONT> <SUB>9</SUB> or <FONT FACE="Symbol">b</FONT> <SUB>11</SUB> is not zero</P>
<P>(ESSR-ESSU)*DFU/(NR*ESSU) = (6.60e+06-6.26e+06)*65/(5*6.26e+06) = 0.71, F<SUB>5,65</SUB> <FONT FACE="Symbol">»</FONT> 1.95, so we cannot reject the null. The Wald test thus leads us to conclude that <FONT FACE="Symbol">b</FONT> <SUB>3</SUB>, <FONT FACE="Symbol">b</FONT> <SUB>7</SUB>, <FONT FACE="Symbol">b</FONT> <SUB>8</SUB>, <FONT FACE="Symbol">b</FONT> <SUB>9</SUB>, and <FONT FACE="Symbol">b</FONT> <SUB>11</SUB> are jointly insignificant.</P>
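<P ALIGN="JUSTIFY">The F-computation above can be checked in a few lines, using the rounded ESS values as quoted in the text:</P>

```python
# Wald F-test sketch using the rounded ESS values quoted above.
ess_r, ess_u = 6.60e6, 6.26e6   # restricted and unrestricted ESS
n_r, df_u = 5, 65               # number of restrictions, unrestricted d.f.

f_stat = ((ess_r - ess_u) / n_r) / (ess_u / df_u)
print(round(f_stat, 2))         # 0.71, below the quoted critical value of about 1.95
```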
<P>5. (8 points) The estimated value of each coefficient gives the marginal effect on ATTEND. For example, an increase in POP of 1,000 persons, holding all other variables constant, raises attendance by about 218 on average, which is a sensible value.</P></BODY>
</HTML>
