Hi everyone,
I noticed that EViews 6 offers stepwise regression. Is it possible to use a stepwise regression algorithm that uses HAC-consistent error estimates (Newey-West)? I would like to avoid collinearity between the regressors whilst accounting for autocorrelation and heteroskedasticity in the error estimates. The dialog for stepwise regression does not include specifications for the error estimates. Can anybody help? Thanks
Stepwise regression and HAC error estimates
Re: Stepwise regression and HAC error estimates
HAC is not available in the built-in stepwise routines (since HAC means you can't use many of the "tricks" we use in the internal stepwise code). However, you could always program one yourself. The program shown here:
viewtopic.php?f=15&t=383&p=1379&hilit=stepwise#p1379
is a good starting point.
Re: Stepwise regression and HAC error estimates
Okay, thanks. I thought as much. I have a VBA code that does forward stepwise regression as well as another snippet for HAC error estimates according to White and Newey-West. I shall try to implement the code in EViews when I have the time and post it again once it is in working order.
Re: Stepwise regression and HAC error estimates
PLEASE IGNORE THIS POST. REFER TO POST BELOW...
Hi Gareth
I have built on your suggested code snippet and translated an old script from R. As I don't know the command references well yet, there may be some errors or superfluous loops. The idea is to incorporate variables from a pool of potential regressors if they contribute more in terms of explanatory power than they 'cost' in terms of degrees-of-freedom reduction. Similarly, regressors are removed if the benefits from increasing the degrees of freedom outweigh the loss in explanatory power. The F statistic (F-to-enter and F-to-leave) is used to determine which variables enter/leave. Lastly, the model removes regressors if one or more other regressors explain a significant proportion of the variation in the regressor being tested (multicollinearity). In a nutshell, at every iteration:
1. variable enters based on F-to-enter
2. variable removed based on collinearity
3. variable leaves based on F-to-leave
Assume all potential regressors are grouped in 'xs' and the regressand is named 'y'. I will post amendments as I go along. Your input is greatly appreciated....
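For readers who want the logic outside EViews: the three steps above can be sketched in a language-neutral way. The following Python/NumPy sketch is my own illustration, not the poster's code; the function names `ols_ssr` and `stepwise` are invented, the default critical values mirror the 3.84/2.71 used in the EViews program below, and plain (non-HAC) OLS errors are assumed:

```python
import numpy as np

def ols_ssr(y, X):
    """SSR and coefficient count for OLS of y on a constant plus the columns of X."""
    Xc = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(Xc, y, rcond=None)
    resid = y - Xc @ beta
    return float(resid @ resid), Xc.shape[1]

def stepwise(y, X, f_enter=3.84, f_leave=2.71, max_iter=1000):
    """Forward stepwise selection via partial F tests (F-to-enter / F-to-leave)."""
    n, k = X.shape
    active, inactive = [], list(range(k))
    for _ in range(max_iter):                  # manual iteration cap
        # ENTER: add the candidate with the largest partial F, if above f_enter
        ssr_base, _ = ols_ssr(y, X[:, active])
        best_f, best_j = f_enter, None
        for j in inactive:
            ssr_new, ncoef = ols_ssr(y, X[:, active + [j]])
            f = (ssr_base - ssr_new) / (ssr_new / (n - ncoef))
            if f > best_f:
                best_f, best_j = f, j
        if best_j is None:                     # nothing significant left to enter
            break
        active.append(best_j)
        inactive.remove(best_j)
        # LEAVE: drop the weakest previously entered regressor, if below f_leave
        if len(active) > 1:
            ssr_full, ncoef = ols_ssr(y, X[:, active])
            msr = ssr_full / (n - ncoef)
            worst_f, worst_j = f_leave, None
            for j in active[:-1]:              # the variable that just entered is exempt
                rest = [c for c in active if c != j]
                ssr_red, _ = ols_ssr(y, X[:, rest])
                f = (ssr_red - ssr_full) / msr
                if f < worst_f:
                    worst_f, worst_j = f, j
            if worst_j is not None:
                active.remove(worst_j)
                inactive.append(worst_j)
        if not inactive:
            break
    return active
```

Note the usual stepwise convention of f_leave < f_enter, which prevents a variable from endlessly entering and leaving.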
p.s. it should be fairly easy to amend the code to include HAC error estimates at every step
Code:
!Fcrit1=3.84
!Fcrit2=2.71
!tolerance=0.99
!idx = 1
!k = xs.@count
!n = xs.@minobs
group xsa
group xsd
group xsi
for !i=1 to xs.@count
%n=xs.@seriesname(!i)
xsd.add {%n}
next
!cnt = 0
While xsa.@count < !k and !cnt < 1000 'max iterations
'ENTER
!cnt = !cnt + 1
!maxF = !Fcrit1
!minF = !Fcrit2
vector (!cnt) ssr
for !i=1 to xsd.@count
%n = xsd.@seriesname(!i)
xsa.add {%n}
equation e1.ls y c xsa
!currentssr= e1.@ssr
!ncoef=e1.@ncoef
!currentmsr=!currentssr/(!n-!ncoef)
if !cnt = 1 Then
!currentF = e1.@f
else
!currentF=(ssr(!cnt-1)-!currentssr)/!currentmsr
endif
if !currentF > !maxF then
!maxF = !currentF
!msr=!currentmsr
!idx = !i
ssr(!cnt) = !currentssr
endif
d e1
xsa.drop {%n}
next
If !maxF> !Fcrit1 then
!enter = 1
%n = xsd.@seriesname(!idx)
'variable enter
xsa.add {%n}
xsd.drop {%n}
else
!enter = 0
endif
stepreg(!cnt, 1) = !maxF
'COLLINEARITY (regress all x on all other xs)
!maxR = !tolerance
for !i=1 to xsa.@count
%n=xsa.@seriesname(!i)
xsa.drop {%n}
equation e1.ls {%n} c xsa
!currentR = e1.@r2
If !currentR>!maxR then
!maxR = !currentR
!idxs = !i
endif
d e1
xsa.add {%n}
next
'remove collinear regressors
if !maxR > !tolerance then
%n=xsa.@seriesname(!idxs)
xsa.drop {%n}
equation e1.ls y c xsa
ssr(!cnt)=e1.@ssr
xsi.add {%n}
endif
'LEAVE
for !i=1 to xsa.@count - 1
%n=xsa.@seriesname(!i)
xsa.drop {%n}
equation e1.ls y c xsa
!currentssr=e1.@ssr
!ncoef=e1.@ncoef
!currentmsr=!currentssr/(!n-!ncoef)
!currentF=(!currentssr-ssr(!cnt))/!msr
if !currentF < !minF then
!minF = !currentF
!idx = !i
endif
d e1
xsa.add {%n}
next
If !minF< !Fcrit2 then
%n = xsa.@seriesname(!idx)
'variable leave
xsa.drop {%n}
equation e1.ls y c xsa
ssr(!cnt)=e1.@ssr
xsd.add {%n}
for !i=1 to xsi.@count
%n = xsi.@seriesname(!i)
xsd.add {%n}
next
else
if !enter = 0 Then
exitloop
endif
endif
wend
Last edited by fboehlandt on Tue Jan 25, 2011 4:41 am, edited 2 times in total.
Re: Stepwise regression and HAC error estimates
Okay,
streamlined code, removed superfluous loops and corrected errors. This is what I have come up with:
Code:
!Fcrit1=3.84
!Fcrit2=2.71
!tolerance=0.01
table(xs.@count, 10) stepreg
!idx = 1
!k = xs.@count
!n = xs.@minobs
group xsa
group xsd
for !i=1 to xs.@count
%n=xs.@seriesname(!i)
xsd.add {%n}
next
!cnt = 0
While !cnt < !k
!cnt = !cnt + 1
if !cnt > 1 then
equation e1.ls y c xsa
!ssrr = e1.@ssr
endif
!maxF = !Fcrit1
!minF = !Fcrit2
for !i=1 to xsd.@count
%n = xsd.@seriesname(!i)
xsa.add {%n}
equation e1.ls y c xsa
!currentssr= e1.@ssr
!ncoef=e1.@ncoef
!currentmsr=!currentssr/(!n-!ncoef)
if !cnt = 1 Then
!currentF = e1.@f
else
!currentF=(!ssrr-!currentssr)/!currentmsr
endif
d e1
xsa.drop {%n}
equation e1.ls {%n} c xsa
'tolerance
!currentr2=1-e1.@r2
if !currentF > !maxF and !currentr2 > !tolerance then
!enter = 1
!maxF = !currentF
!msr=!currentmsr
!idx = !i
!ssr = !currentssr
endif
d e1
next
If !maxF> !Fcrit1 then
%n = xsd.@seriesname(!idx)
'variable enter
xsa.add {%n}
xsd.drop {%n}
else
exitloop
endif
If !cnt > 1 then
!cnt2 = 0
While !cnt2 < xsa.@count
!cnt2 = !cnt2 + 1
%n=xsa.@seriesname(1)
xsa.drop {%n}
equation e1.ls y c xsa
!currentssr= e1.@ssr
!currentF=(!currentssr-!ssr)/!msr
If !currentF < !minF then
!minF = !currentF
!idx = !cnt2
endif
xsa.add {%n}
wend
If !minF < !Fcrit2 then
%n = xsa.@seriesname(!idx)
'variable leave
xsa.drop {%n}
endif
endif
wend
As before, all regressors should be grouped and named 'xs' whereas the regressand is 'y'. This is the forward stepwise regression algorithm from Neter (1996) Applied Linear Models: pp. 348-352. You find the chosen regressors in group 'xsa'. I shall post a HAC-version shortly. Comments welcome!
Last edited by fboehlandt on Thu Jan 27, 2011 4:57 am, edited 1 time in total.
Re: Stepwise regression and HAC error estimates
This time the F-to-enter and F-to-leave calculations are based on the coefficient estimates and their standard errors. Consequently, the F statistics benefit from the Newey-West HAC adjustment at every step:
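The squared HAC t-statistics used as F values here can be illustrated outside EViews. Below is a minimal NumPy sketch of Newey-West (Bartlett-kernel) t-statistics; the function name `newey_west_tstats` and the default lag length are my own choices, and no small-sample correction is applied:

```python
import numpy as np

def newey_west_tstats(y, X, lags=4):
    """OLS coefficients and Newey-West (HAC) t-statistics, Bartlett kernel."""
    Xc = np.column_stack([np.ones(len(y)), X])   # constant plus regressors
    XtX_inv = np.linalg.inv(Xc.T @ Xc)
    beta = XtX_inv @ Xc.T @ y
    u = y - Xc @ beta
    g = Xc * u[:, None]                          # score contributions x_t * u_t
    S = g.T @ g                                  # lag-0 (White) term
    for l in range(1, lags + 1):
        w = 1.0 - l / (lags + 1.0)               # Bartlett weight
        G = g[l:].T @ g[:-l]
        S += w * (G + G.T)                       # symmetrized autocovariance term
    cov = XtX_inv @ S @ XtX_inv                  # sandwich estimator
    se = np.sqrt(np.diag(cov))
    return beta, beta / se
```

An F-to-enter for a single coefficient is then simply its squared HAC t-statistic, as in the EViews code below.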
Guys, please check the small changes added 1/27/2011 (marked with '! in the code). I will keep posting adjustments until it works flawlessly. P.S. be patient.
Code:
!Fcrit1=3.84
!Fcrit2=2.71
!tolerance=0.01
!idx = 1
!k = xs.@count
!n = xs.@minobs
group xsa
group xsd
for !i=1 to xs.@count
%n=xs.@seriesname(!i)
xsd.add {%n}
next
!cnt = 0
!enter = 1 '! line added
While !cnt < !k
!cnt = !cnt + 1
!maxF = !Fcrit1
!minF = !Fcrit2
!rowcounter = 0 '!
vector t
matrix F
for !i=1 to xsd.@count
%n = xsd.@seriesname(!i)
xsa.add {%n}
equation e1.ls(n) y c xsa
vector (xsa.@count) t
matrix (xsa.@count, xsd.@count) F
For !j = 1 to xsa.@count
t(!j)= e1.@tstats(1+!j)
F(!j, !i) = t(!j)^2
next
d e1
xsa.drop {%n}
equation e1.ls(n) {%n} c xsa
'tolerance
!r2=1-e1.@r2
if F(!enter, !i) > !maxF and !r2 > !tolerance then '! F(!cnt, !i) to !F(!enter, !i)
'! removed: !enter = 1
!maxF = F(!enter, !i) '! F(!cnt, !i) to !F(!enter, !i)
!idx = !i
endif
d e1
next
If !maxF> !Fcrit1 then
%n = xsd.@seriesname(!idx)
'variable enter
!enter = !enter + 1 '! line added
xsa.add {%n}
xsd.drop {%n}
else
exitloop
endif
If !cnt > 1 then
For !i=1 to xsa.@count
if F(!i, !idx) < !minF then
!minF=F(!i, !idx)
!jdx = !i
endif
next
If !minF < !Fcrit2 then
%n = xsa.@seriesname(!jdx)
'variable leave
!enter = !enter - 1 '! line added
xsa.drop {%n}
endif
endif
wend
Note that the results of the routine above and the results of the previous routine will almost certainly differ in the presence of heteroskedasticity and autocorrelation of the error terms. However, if HAC is not a major concern, the selected variables are likely to be the same. In many instances one could run the stepwise regression routine without HAC estimates first and then estimate HAC errors for the final group of regressors. The above approach is consistent throughout. I recommend using the model in this post and simply removing (n) from the e1 equations if HAC estimates are not desired.
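The two-stage shortcut mentioned here, selecting without HAC and then re-estimating HAC errors for the final regressors, can be sketched as follows. This is my own Python/NumPy illustration, not EViews code; `select_then_hac` and its defaults are hypothetical, and only forward selection is shown:

```python
import numpy as np

def select_then_hac(y, X, lags=4, f_enter=3.84):
    """Stage 1: forward selection with ordinary (non-HAC) partial F tests.
    Stage 2: Newey-West (Bartlett) standard errors for the final model only."""
    n, k = X.shape

    def ssr(cols):
        Xc = np.column_stack([np.ones(n), X[:, cols]])
        b, *_ = np.linalg.lstsq(Xc, y, rcond=None)
        r = y - Xc @ b
        return float(r @ r), Xc.shape[1]

    active, inactive = [], list(range(k))
    while inactive:
        base, _ = ssr(active)
        best_f, best_j = f_enter, None
        for j in inactive:
            s, p = ssr(active + [j])
            f = (base - s) / (s / (n - p))
            if f > best_f:
                best_f, best_j = f, j
        if best_j is None:
            break
        active.append(best_j)
        inactive.remove(best_j)

    # Stage 2: HAC covariance on the selected regressors only
    Xc = np.column_stack([np.ones(n), X[:, active]])
    XtX_inv = np.linalg.inv(Xc.T @ Xc)
    beta = XtX_inv @ Xc.T @ y
    u = y - Xc @ beta
    g = Xc * u[:, None]                      # score contributions x_t * u_t
    S = g.T @ g                              # lag-0 (White) term
    for l in range(1, lags + 1):
        w = 1.0 - l / (lags + 1.0)           # Bartlett weight
        G = g[l:].T @ g[:-l]
        S += w * (G + G.T)
    se = np.sqrt(np.diag(XtX_inv @ S @ XtX_inv))
    return active, beta, se
```

As the post says, the selected set will often match the fully HAC-consistent routine when heteroskedasticity and autocorrelation are mild, but only the routine above is consistent throughout.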
Last edited by fboehlandt on Thu Jan 27, 2011 8:41 am, edited 3 times in total.
Re: Stepwise regression and HAC error estimates
Hi, just seeking a clarification. Would the following syntax, for instance, be sufficient to make sure that the stepwise regression process is actually performed using HAC standard errors?
STEPLS(method=UNI,BACK,BTOL=0.1,COV=HAC,COVBW=NEWEYWEST)
Re: Stepwise regression and HAC error estimates
The whole point of this thread was that the built-in stepwise routines don't support HAC covariances, so you have to program it yourself (which fboehlandt did masterfully). Thus your syntax will not work.
Re: Stepwise regression and HAC error estimates
Thanks for the clarification. Wishful thinking on my part!
Re: Stepwise regression and HAC error estimates
Hope the comments I have included help. Please don't hesitate to contact me should you have any further questions.
p.s. make sure I didn't accidentally delete any lines of code above. I don't have EViews on this computer, so I had to view the code in a text editor.
Code:
!Fcrit1=3.84 'this line sets the F-to-enter variable
!Fcrit2=2.71 'this line sets the F-to-leave variable
!tolerance=0.01 'the tolerance allowed for (1 - R^2): variables may be highly correlated, but they may not be perfectly correlated in OLS
!idx = 1
!k = xs.@count 'the list of regressors available
!n = xs.@minobs 'the minimum number of observations (i.e. in timeseries the number of observations for the shortest series)
group xsa 'a group containing all regressors entered. Start out with 0 series
group xsd 'a group containing all regressors not entered (yet). Starts out with 0 series
'This loops enters all regressors grouped under xs into group xsd
for !i=1 to xs.@count
%n=xs.@seriesname(!i)
xsd.add {%n}
next
!cnt = 0
!enter = 1 'this counts the number of regressors entered
While !cnt < !k 'loops as long as there are regressors left to enter
!cnt = !cnt + 1
!maxF = !Fcrit1
!minF = !Fcrit2
!rowcounter = 0 '!
vector t
matrix F
'this loop enters one regressor at a time. The regressor resulting in the maximum Fstat is the first variable to enter (provided the Fstat is in excess of Fcrit).
for !i=1 to xsd.@count
%n = xsd.@seriesname(!i)
xsa.add {%n}
equation e1.ls(n) y c xsa 'this is a simple OLS estimate for xi regressed against y
vector (xsa.@count) t ' all t-values stored in vector for reference
matrix (xsa.@count, xsd.@count) F ' all F-values stored in matrix for reference
For !j = 1 to xsa.@count
t(!j)= e1.@tstats(1+!j)
F(!j, !i) = t(!j)^2
next
d e1
xsa.drop {%n}
equation e1.ls(n) {%n} c xsa
'tolerance
!r2=1-e1.@r2 'to avoid perfect collinearity, this additional restriction is imposed
if F(!enter, !i) > !maxF and !r2 > !tolerance then 'note that F-to-enter is tested against Fcritical of 3.84. For large samples, this should be a good enough approximation but you may want to change this manually
!maxF = F(!enter, !i)
!idx = !i
endif
d e1
next
If !maxF> !Fcrit1 then
%n = xsd.@seriesname(!idx)
'variable enter
!enter = !enter + 1
xsa.add {%n}
xsd.drop {%n}
else
exitloop 'it is possible that none of the regressors add any significant explanatory power, in which case the code stops and exits without entering a variable.
endif
If !cnt > 1 then
'This loop stepwise drops one variable already entered and removes variables if no/little explanatory power is lost.
For !i=1 to xsa.@count
if F(!i, !idx) < !minF then
!minF=F(!i, !idx)
!jdx = !i
endif
next
If !minF < !Fcrit2 then
%n = xsa.@seriesname(!jdx)
'variable leave
!enter = !enter - 1
xsa.drop {%n}
endif
endif
wend
'Comment: due to the outside loop and variables being moved back and forth between the xsd and xsa groups, some variables may exit at one stage, re-enter, and exit again. Although this is rarely the case with a limited number of regressors, there is a chance that the loop continues for quite a long time. In that case, one may want to implement a manual counter limiting the number of iterations to, for instance, 5000