
White's special case heteroskedasticity test

Posted: Wed Nov 13, 2013 12:13 pm
by mumpenhour
When you run a regression and test for heteroskedasticity with the White test, the degrees of freedom get eaten up rather quickly. To mitigate this, there is a special case of the test that uses the fitted values and the squares of the fitted values to save on degrees of freedom.

I'm using Wooldridge's Introductory Econometrics with the dataset gpa3.dta, with the sample restricted to "spring=1". When I run a regression of cumgpa on c sat hsperc tothrs black white female, I get the proper R-squared and coefficients. After running the regression, I conduct the White test using the built-in option and conclude that I have heteroskedasticity. I then create the variable residsq=resid^2 and use the forecast button to get the fitted values, called yhat. I then create another variable, yhat2=yhat^2. When I run the regression of residsq on yhat and yhat^2, I get the right coefficients but the wrong R-squared. In gretl I run the same experiment and get a different R^2. I have run this test many times in EViews, but I always get the wrong R^2, and so the test fails to reject the null of homoskedasticity even though the model has heteroskedasticity. I believe it is due to human error on my side, but the same procedure in gretl gives the right conclusion of heteroskedasticity.

Here are the outputs of the tests
Eviews:
Dependent Variable: RESIDSQ
Method: Least Squares
Date: 11/13/13 Time: 13:43
Sample: 1 732 IF SPRING=1
Included observations: 366

Variable Coefficient Std. Error t-Statistic Prob.

YHAT 0.1629499424398224 0.0524990865430791 3.103862432084619 0.002059592533591627
YHATSQ -0.02952983614856306 0.02115793778331273 -1.395685933619355 0.1636598500795978

R-squared -0.001912477039020866 Mean dependent var 0.2160172954547178
Adjusted R-squared -0.004664983844073101 S.D. dependent var 0.3488463619036611
S.E. of regression 0.3496590964792706 Akaike info criterion 0.7417333439654725
Sum squared resid 44.50318008525475 Schwarz criterion 0.763059209175316
Log likelihood -133.7372019456815 Hannan-Quinn criter. 0.750207645375661

Gretl
Model 3: OLS, using observations 1-366
Dependent variable: usq1

coefficient std. error t-ratio p-value
-------------------------------------------------------
yhat1 0.162950 0.0524991 3.104 0.0021 ***
sq_yhat1 -0.0295298 0.0211579 -1.396 0.1637

Mean dependent var 0.216017 S.D. dependent var 0.348846
Sum squared resid 44.50318 S.E. of regression 0.349659
R-squared 0.276336 Adjusted R-squared 0.274348
F(2, 364) 69.49810 P-value(F) 2.73e-26
Log-likelihood -133.7372 Akaike criterion 271.4744
Schwarz criterion 279.2797 Hannan-Quinn 274.5760

There might be a bug related to sample restrictions that causes this error. I do not know. Any help would be appreciated.

Re: White's special case heteroskedasticity test

Posted: Wed Nov 13, 2013 2:34 pm
by EViews Glenn
If I had to guess I'd say that gretl is doing a different R2 computation in the case with no intercept. EViews always uses the intercept only model as a base.

Re: White's special case heteroskedasticity test

Posted: Wed Nov 13, 2013 4:38 pm
by mumpenhour
When I add the intercept in both EViews and gretl, the output from each is identical. Only when the intercept is removed is there a difference.
Dependent Variable: RESIDSQ
Method: Least Squares
Date: 11/13/13 Time: 18:39
Sample: 1 732 IF SPRING=1
Included observations: 366

Variable Coefficient Std. Error t-Statistic Prob.

C 0.797966696561924 0.5706942110563645 1.398238638315385 0.1628950978634598
YHAT -0.5270282772418827 0.4962399616402944 -1.062043200833361 0.2889221237967429
YHATSQ 0.1159055044090274 0.1061378459214828 1.09202804525325 0.2755450161819077

R-squared 0.003454787242872404 Mean dependent var 0.2160172954547178
Adjusted R-squared -0.002035819989949062 S.D. dependent var 0.3488463619036611
S.E. of regression 0.3492012755589166 Akaike info criterion 0.7418264054556822
Sum squared resid 44.2647756992667 Schwarz criterion 0.7738152032704475
Log likelihood -132.7542321983898 Hannan-Quinn criter. 0.7545378575709648
F-statistic 0.6292176978568838 Prob(F-statistic) 0.5335889263168632

Model 4: OLS, using observations 1-366
Dependent variable: usq1

coefficient std. error t-ratio p-value
-------------------------------------------------------
const 0.797967 0.570694 1.398 0.1629
yhat1 -0.527028 0.496240 -1.062 0.2889
sq_yhat1 0.115906 0.106138 1.092 0.2755

Mean dependent var 0.216017 S.D. dependent var 0.348846
Sum squared resid 44.26477 S.E. of regression 0.349201
R-squared 0.003455 Adjusted R-squared -0.002036
F(2, 363) 0.629218 P-value(F) 0.533589
Log-likelihood -132.7542 Akaike criterion 271.5084
Schwarz criterion 283.2163 Hannan-Quinn 276.1608

Re: White's special case heteroskedasticity test

Posted: Wed Nov 13, 2013 4:46 pm
by EViews Gareth
Seems like Glenn's guess was pretty good then...

Re: White's special case heteroskedasticity test

Posted: Wed Nov 13, 2013 4:58 pm
by mumpenhour
No, you misunderstand; his point is not correct with regard to the test. Both of these models are supposed to show heteroskedasticity through either the LM statistic or the F statistic. This test is supposed to mimic the White test that is built into the program. As I have seen in other forums for programs like Stata, you exclude the intercept and keep just the fitted values and fitted values^2 to get the specialized White test.

Re: White's special case heteroskedasticity test

Posted: Wed Nov 13, 2013 5:06 pm
by mumpenhour
Here is the regression step by step so people understand:
Dependent Variable: CUMGPA
Method: Least Squares
Date: 11/13/13 Time: 18:59
Sample: 1 732 IF SPRING=1
Included observations: 366

Variable Coefficient Std. Error t-Statistic Prob.

C 1.470065 0.229803 6.397063 0.0000
SAT 0.001141 0.000179 6.388504 0.0000
HSPERC -0.008566 0.001240 -6.906003 0.0000
TOTHRS 0.002504 0.000731 3.425510 0.0007
BLACK -0.128284 0.147370 -0.870486 0.3846
WHITE -0.058722 0.140990 -0.416497 0.6773
FEMALE 0.303433 0.059020 5.141165 0.0000

R-squared 0.400560 Mean dependent var 2.334153
Adjusted R-squared 0.390542 S.D. dependent var 0.601126
S.E. of regression 0.469286 Akaike info criterion 1.343732
Sum squared resid 79.06233 Schwarz criterion 1.418372
Log likelihood -238.9029 Hannan-Quinn criter. 1.373392
F-statistic 39.98208 Prob(F-statistic) 0.000000

Built in White test

Heteroskedasticity Test: White

F-statistic 3.629836 Prob. F(23,342) 0.0000
Obs*R-squared 71.81422 Prob. Chi-Square(23) 0.0000
Scaled explained SS 89.84839 Prob. Chi-Square(23) 0.0000


Test Equation:
Dependent Variable: RESID^2
Method: Least Squares
Date: 11/13/13 Time: 19:00
Sample: 1 732 IF SPRING=1
Included observations: 366
Collinear test regressors dropped from specification

Variable Coefficient Std. Error t-Statistic Prob.

C 0.748114 1.006458 0.743314 0.4578
SAT^2 6.85E-07 6.21E-07 1.102059 0.2712
SAT*HSPERC -8.15E-07 5.94E-06 -0.137065 0.8911
SAT*TOTHRS -6.69E-06 3.76E-06 -1.779145 0.0761
SAT*BLACK 0.000798 0.000810 0.985402 0.3251
SAT*WHITE 0.000342 0.000722 0.473846 0.6359
SAT*FEMALE -0.000232 0.000303 -0.767388 0.4434
SAT -0.000930 0.001461 -0.636818 0.5247
HSPERC^2 7.75E-05 3.74E-05 2.069623 0.0392
HSPERC*TOTHRS -5.63E-06 2.69E-05 -0.209662 0.8341
HSPERC*BLACK 0.001311 0.005295 0.247640 0.8046
HSPERC*WHITE 0.004146 0.004907 0.845006 0.3987
HSPERC*FEMALE -0.000903 0.002409 -0.375028 0.7079
HSPERC -0.007883 0.008268 -0.953522 0.3410
TOTHRS^2 5.05E-05 1.73E-05 2.921279 0.0037
TOTHRS*BLACK -0.003979 0.005439 -0.731463 0.4650
TOTHRS*WHITE -0.002024 0.005387 -0.375695 0.7074
TOTHRS*FEMALE -0.000609 0.001214 -0.502005 0.6160
TOTHRS -0.000258 0.006853 -0.037671 0.9700
BLACK^2 -0.433255 0.896341 -0.483360 0.6291
BLACK*FEMALE -0.045258 0.319959 -0.141450 0.8876
WHITE^2 -0.305580 0.821256 -0.372089 0.7101
WHITE*FEMALE 0.127827 0.305694 0.418152 0.6761
FEMALE^2 0.190255 0.432399 0.439999 0.6602

R-squared 0.196214 Mean dependent var 0.216017
Adjusted R-squared 0.142158 S.D. dependent var 0.348846
S.E. of regression 0.323101 Akaike info criterion 0.641619
Sum squared resid 35.70277 Schwarz criterion 0.897530
Log likelihood -93.41635 Hannan-Quinn criter. 0.743311
F-statistic 3.629836 Prob(F-statistic) 0.000000

The model has tested positive for heteroskedasticity.
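
To put a number on how quickly the full White test uses up degrees of freedom, here is a minimal sketch that enumerates the candidate test regressors for the six explanatory variables above. The helper name `white_terms` is only illustrative. The `dropped` set is read off the test equation printed above, which lacks the levels of BLACK, WHITE and FEMALE and the BLACK*WHITE product. A 0/1 dummy is identical to its own square, which would explain the first three; the reason for the last one is not shown in the output.

```python
from itertools import combinations_with_replacement

def white_terms(names):
    """Levels plus all squares and pairwise products, before any collinearity check."""
    levels = list(names)
    products = [f"{a}*{b}" for a, b in combinations_with_replacement(names, 2)]
    return levels + products

regressors = ["SAT", "HSPERC", "TOTHRS", "BLACK", "WHITE", "FEMALE"]
full = white_terms(regressors)
print(len(full))  # 27: 6 levels + 21 squares and cross products

# Terms that do not appear in the printed test equation (dropped as collinear).
dropped = {"BLACK", "WHITE", "FEMALE", "BLACK*WHITE"}
kept = [t for t in full if t not in dropped]
print(len(kept))  # 23, the numerator df in "Prob. F(23,342)"

# The special-case test uses only yhat and yhat^2.
special_case_df = 2
print(special_case_df)
```

So the built-in test spends 23 degrees of freedom here, against 2 for the fitted-value version.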

Running a regression of the squared residuals on yhat and yhat^2 in both EViews and gretl (this model is supposed to mimic the White test):

Dependent Variable: RESIDSQ
Method: Least Squares
Date: 11/13/13 Time: 19:04
Sample: 1 732 IF SPRING=1
Included observations: 366

Variable Coefficient Std. Error t-Statistic Prob.

YHAT 0.162950 0.052499 3.103862 0.0021
YHATSQ -0.029530 0.021158 -1.395686 0.1637

R-squared -0.001912 Mean dependent var 0.216017
Adjusted R-squared -0.004665 S.D. dependent var 0.348846
S.E. of regression 0.349659 Akaike info criterion 0.741733
Sum squared resid 44.50318 Schwarz criterion 0.763059
Log likelihood -133.7372 Hannan-Quinn criter. 0.750208

Gretl
Model 3: OLS, using observations 1-366
Dependent variable: usq1

coefficient std. error t-ratio p-value
-------------------------------------------------------
yhat1 0.162950 0.0524991 3.104 0.0021 ***
sq_yhat1 -0.0295298 0.0211579 -1.396 0.1637

Mean dependent var 0.216017 S.D. dependent var 0.348846
Sum squared resid 44.50318 S.E. of regression 0.349659
R-squared 0.276336 Adjusted R-squared 0.274348
F(2, 364) 69.49810 P-value(F) 2.73e-26
Log-likelihood -133.7372 Akaike criterion 271.4744
Schwarz criterion 279.2797 Hannan-Quinn 274.5760

Gretl rejects homoskedasticity =/= Eviews fails to reject

Re: White's special case heteroskedasticity test

Posted: Wed Nov 13, 2013 5:24 pm
by EViews Gareth
Your posts are a little difficult to follow.

You say you ran a secondary regression yourself. In both packages. I believe you're saying that those results don't match? If they matched in every way, other than R-squared, which is what you seem to be saying, then the answer is that Gretl calculates R-squared differently when you have no intercept.

Did you run the built-in White test procedure in both EViews and Gretl (I know little about Gretl, but I assume it has a built-in White test)? If so, did they give the same results? If they give the same results, then I think we're done.

Re: White's special case heteroskedasticity test

Posted: Wed Nov 13, 2013 5:33 pm
by mumpenhour
No, you misunderstand. The problem is that the specialized White test, where you save the residuals, square them, and then regress residsq on the fitted values (yhat) and their squares (yhat^2), comes out wrong in EViews. As stated before, this test is supposed to mimic the White test while conserving degrees of freedom. Once the regression is run, there are two ways to test for heteroskedasticity: an LM test, or an F test on the two variables.
The coefficients are correct, but the special-case White test does not give the correct result in EViews. The results that I showed from gretl reject the null hypothesis of homoskedasticity, whereas the EViews regression does not.

Re: White's special case heteroskedasticity test

Posted: Wed Nov 13, 2013 6:24 pm
by EViews Gareth
If I understand what you're saying, I think you want EViews to (somehow) detect when the user is running a regression in order to perform a manual White's test, and adjust the results to match that special case. That's asking a little much.


Perhaps a better way to state your findings is as follows:
  1. There is a special form of the White test, that works in specific cases involving a specific form of R-squared.
  2. Neither EViews nor Gretl use that special form of the White test as their built-in White test.
  3. In order to perform the special form of the White test, you must manually run a regression and use the specific form of the R-squared.
  4. EViews does not automatically calculate the special form of the R-squared. Gretl does.
  5. Thus when you manually calculate the special White test with the Special R-squared, you have to manually perform a regression in both software packages, but you have to calculate the special R-squared manually in EViews.
To take that final conclusion and translate it as "EViews has a bug in its White test" is a little far-fetched.

I guess the lesson is that you should always understand a) the econometrics of what you're trying to do, and b) the econometrics of what the software package you're trying to use is doing. Presuming that the software package is making the same assumptions that you are assuming is generally a bad idea.

Re: White's special case heteroskedasticity test

Posted: Wed Nov 13, 2013 7:11 pm
by mumpenhour
First of all, it is not a special R-squared; it is the R-squared obtained from regressing the squared residuals on the fitted values and their squares. It has nothing to do with "special." My statement is not that EViews should magically know what I'm doing, either. My point is about conducting the specialized White test.
Wooldridge:
"It is possible to obtain a test that is easier to implement than the White test and more conserving on degrees of freedom. To create the test, recall that the difference between the White and Breusch-Pagan tests is that the former includes the squares and cross products of the independent variables. We can preserve the spirit of the White test while conserving on degrees of freedom by using the OLS fitted values in a test for heteroskedasticity. Remember that the fitted values are defined, for each observation i, by

$$\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_{i1} + \dots + \hat{\beta}_k x_{ik}$$

These are just linear functions of the independent variables. If we square the fitted values, we get a particular function of all the squares and cross products of the independent variables. This suggests testing for heteroskedasticity by estimating the equation

$$\hat{u}^2 = \delta_0 + \delta_1 \hat{y} + \delta_2 \hat{y}^2 + \text{error} \qquad [8.20]$$

where $\hat{y}$ stands for the fitted values. It is important not to confuse $\hat{y}$ and $y$ in this equation. We use the fitted values because they are functions of the independent variables (and the estimated parameters); using $y$ in (8.20) does not produce a valid test for heteroskedasticity. We can use the F or LM statistic for the null hypothesis $H_0: \delta_1 = 0$ and $\delta_2 = 0$ in equation (8.20)."

This is from my book and this procedure is what I'm trying to do. I hope this clears things up.
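
For reference, the F and LM statistics mentioned in the quote can be computed by hand from the auxiliary regression that includes the constant. The sketch below plugs in the R-squared and observation count from the with-constant output posted earlier in this thread. The function name is hypothetical. It assumes the usual forms LM = n * R^2 and F = (R^2/2) / ((1 - R^2)/(n - 3)), and it uses the closed form exp(-x/2) for the chi-square(2) upper tail, which holds only for 2 degrees of freedom.

```python
import math

def special_case_white_stats(r2, n):
    """LM and F statistics for H0: delta1 = 0 and delta2 = 0.

    Assumes the auxiliary regression has a constant plus yhat and yhat^2,
    i.e. 2 restrictions and 3 estimated parameters.
    """
    lm = n * r2
    f_stat = (r2 / 2.0) / ((1.0 - r2) / (n - 3))
    # Upper-tail probability of a chi-square with 2 df: exp(-x / 2).
    p_lm = math.exp(-lm / 2.0)
    return lm, f_stat, p_lm

# Values copied from the with-constant EViews output in this thread.
r2_aux = 0.003454787242872404
n_obs = 366

lm, f_stat, p_lm = special_case_white_stats(r2_aux, n_obs)
print(round(lm, 4))      # about 1.2645
print(round(f_stat, 4))  # about 0.6292, in line with the reported F-statistic
print(round(p_lm, 2))    # about 0.53
```

With these inputs neither statistic comes close to rejecting homoskedasticity, which matches the Prob(F-statistic) of 0.53 in that output.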

White's special case heteroskedasticity test

Posted: Wed Nov 13, 2013 7:14 pm
by EViews Gareth
There are many different forms/definitions of R-squared, especially in models with no constant. Gretl uses one. EViews uses another. Neither is more correct than the other.

Re: White's special case heteroskedasticity test

Posted: Wed Nov 13, 2013 9:37 pm
by mumpenhour
If it is a matter of R^2, how does EViews calculate its R^2?

Re: White's special case heteroskedasticity test

Posted: Wed Nov 13, 2013 9:59 pm
by EViews Gareth
1 - (e'e) / ((y-ybar)'(y-ybar))

Where e is the residual, y is the dependent variable, and ybar is the mean of the dependent.
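
As a rough check, both R-squared figures for the no-intercept regression can be recomputed from the summary statistics posted earlier in this thread. This is only a sketch: the variable names are illustrative, and it assumes the reported S.D. of the dependent variable uses the n-1 divisor. The "uncentered" version measures the total sum of squares around zero instead of around the mean; that it is what Gretl does is inferred here only from the fact that it reproduces the posted number.

```python
# Summary statistics copied from the no-intercept EViews output in this thread.
n = 366
ssr = 44.50318008525475       # Sum squared resid (e'e)
ybar = 0.2160172954547178     # Mean dependent var
sd = 0.3488463619036611       # S.D. dependent var (assumed n-1 divisor)

# Centered total sum of squares: (y - ybar)'(y - ybar).
tss_centered = (n - 1) * sd ** 2
# Uncentered total sum of squares: y'y = centered TSS + n * ybar^2.
tss_uncentered = tss_centered + n * ybar ** 2

r2_centered = 1 - ssr / tss_centered
r2_uncentered = 1 - ssr / tss_uncentered

print(round(r2_centered, 6))    # about -0.001912, the EViews figure
print(round(r2_uncentered, 4))  # about 0.2763, the Gretl figure

# F statistic implied by the uncentered R-squared with 2 regressors.
f_uncentered = (r2_uncentered / 2) / ((1 - r2_uncentered) / (n - 2))
print(round(f_uncentered, 1))   # about 69.5, close to the reported F(2, 364)
```

The sum of squared residuals is the same in both cases; only the baseline it is compared against changes.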

Re: White's special case heteroskedasticity test

Posted: Wed Nov 13, 2013 10:20 pm
by mumpenhour
So 1-RSS/TSS

Re: White's special case heteroskedasticity test

Posted: Thu Nov 14, 2013 5:16 pm
by EViews Glenn
I think you are missing the point. The issue in question is how one defines TSS in a model without a constant. All Gareth and I are saying is that EViews uses the deviations from mean as the residuals in forming the TSS. Our guess is that Gretl does not. This difference is an interpretive one and not enough for you to unfairly conclude that EViews is not correctly computing results for the "special case of the White test".

I must admit to not having followed this thread closely until now. Having done so, I will point out that this whole debate is really moot as the auxiliary regression *should* have a constant as the squared residuals certainly are not mean zero. Note that the Wooldridge text that you quote explicitly includes the constant in the auxiliary regression. I would be astonished, repeat astonished, if Gretl and EViews didn't provide identical results were the constant included.

Next, note that the built-in EViews routine offers a variant of the White test that economizes on d.f. but which differs from the Wooldridge special case test. In this case, EViews includes the levels and squares but not the cross-products of the original regressors in the test equation. This is simply a different "function of all of the squares and cross products of the independent variables" (in the terminology of Wooldridge). This variant may be accessed by unchecking the "Include White cross terms" checkbox in the dialog.

Lastly, I will note that in all cases, the null of homoskedasticity is rejected.