stepwise with log of non positive values
I am using stepwise regression on a data set with missing values.
Eviews 8 Sept 20 2013 build.
In the list of search regressors, I accidentally entered the log of two binary variables. In EViews 7, this gives the error message "Log of a non-positive value". In EViews 8, I get results, but I would not classify the results as correct.
Below log(ivy) and log(priv) are logs of binary variables.
Forward selection with log(binary) in the search regressors:
STEPLS(FTOL=0.99,BTOL=0.99) LOG(SCORE) C @ LOG(ACCEPT) LOG(ACT) LOG(ALUM) LOG(ENDOW) LOG(ENDPERSTUD) LOG(ENDPERUND) LOG(GRAD) LOG(IVY) LOG(PRIV) LOG(RET) LOG(S_F) LOG(SAT) LOG(TOPTEN) LOG(TOTENROLL) LOG(UNENROLL)
Dependent Variable: LOG(SCORE)
Method: Stepwise Regression
Date: 10/11/13 Time: 09:11
Sample: 1 202
Included observations: 199
Number of always included regressors: 1
Number of search regressors: 15
Selection method: Stepwise forwards
Stopping criterion: p-value forwards/backwards = 0.9/0.9
Note: final equation sample is larger than stepwise sample (rejected
regressors contain missing values)
LOG(SCORE) = C(1) + C(2)*LOG(ENDPERUND) + C(3)*LOG(TOTENROLL) + C(4)*LOG(ALUM) + C(5)*LOG(S_F) + C(6)*LOG(GRAD) + C(7)*LOG(ACCEPT)
Variable Prob.*
C 0.3815
LOG(ENDPERUND) 0
LOG(TOTENROLL) 0
LOG(ALUM) 0
LOG(S_F) 0.0175
LOG(GRAD) 0
LOG(ACCEPT) 0
R-squared 0.916617
Identical results with .99,.99 rather than .9,.9
Combinatorial with same search regressors
STEPLS(METHOD=COMB,FTOL=0.9,BTOL=0.9,NVARS=7) LOG(SCORE) C @ LOG(ACCEPT) LOG(ACT) LOG(ALUM) LOG(ENDOW) LOG(ENDPERSTUD) LOG(ENDPERUND) LOG(GRAD) LOG(IVY) LOG(PRIV) LOG(RET) LOG(S_F) LOG(SAT) LOG(TOPTEN) LOG(TOTENROLL) LOG(UNENROLL)
Dependent Variable: LOG(SCORE)
Method: Stepwise Regression
Date: 10/11/13 Time: 09:18
Sample: 1 202
Included observations: 199
Number of always included regressors: 1
Number of search regressors: 15
Selection method: Combinatorial
Number of search regressors: 7
Note: final equation sample is larger than stepwise sample (rejected
regressors contain missing values)
LOG(SCORE) = C(1) + C(2)*LOG(ENDPERUND) + C(3)*LOG(TOTENROLL) + C(4)*LOG(ALUM) + C(5)*LOG(S_F) + C(6)*LOG(GRAD) + C(7)*LOG(ACCEPT) + C(8)*LOG(RET)
Variable Prob.*
C 0
LOG(ENDPERUND) 0.0003
LOG(TOTENROLL) 0.0001
LOG(ALUM) 0
LOG(S_F) 0.0007
LOG(GRAD) 0
LOG(ACCEPT) 0.003
LOG(RET) 0
R-squared 0.926661
LOG(RET) is in the equation with a p-value < .9 and yet is not in the forward stepwise result. The reported sample and number of included observations are the same, and I checked that the 3 missing values in each regression occur at the same observations.
At least as interesting,
STEPLS(METHOD=COMB,FTOL=0.9,BTOL=0.9,NVARS=8) LOG(SCORE) C @ LOG(ACCEPT) LOG(ACT) LOG(ALUM) LOG(ENDOW) LOG(ENDPERSTUD) LOG(ENDPERUND) LOG(GRAD) LOG(IVY) LOG(PRIV) LOG(RET) LOG(S_F) LOG(SAT) LOG(TOPTEN) LOG(TOTENROLL) LOG(UNENROLL)
produces
Dependent Variable: LOG(SCORE)
Method: Stepwise Regression
Date: 10/11/13 Time: 08:59
Sample: 1 202
Included observations: 202
Number of always included regressors: 1
Number of search regressors: 15
Selection method: Combinatorial
Number of search regressors: 8
Note: final equation sample is larger than stepwise sample (rejected
regressors contain missing values)
Variable Coefficient Std. Error t-Statistic Prob.*
C 3.887912 0.023380 166.2932 0.0000
R-squared 0.000000 Mean dependent var 3.887912
Number of combinations compared: 6435
Somehow the included observations number 202, with nothing excluded, and no variables are selected.
LS LOG(SCORE) C LOG(ENDPERUND) LOG(TOTENROLL) LOG(ALUM) LOG(S_F) LOG(GRAD) LOG(ACCEPT) LOG(RET) LOG(UNENROLL)
Produces
Dependent Variable: LOG(SCORE)
Method: Least Squares
Date: 10/11/13 Time: 09:30
Sample: 1 202
Included observations: 199
LOG(SCORE) = C(1) + C(2)*LOG(ENDPERUND) + C(3)*LOG(TOTENROLL) + C(4)*LOG(ALUM) + C(5)*LOG(S_F) + C(6)*LOG(GRAD) + C(7)*LOG(ACCEPT) + C(8)*LOG(RET) + C(9)*LOG(UNENROLL)
C 0
LOG(ENDPERUND) 0.0001
LOG(TOTENROLL) 0.6985
LOG(ALUM) 0
LOG(S_F) 0.0002
LOG(GRAD) 0
LOG(ACCEPT) 0.002
LOG(RET) 0
LOG(UNENROLL) 0.1138
R-squared 0.927622
This is an 8-variable-plus-intercept model with 199 observations and a higher R^2 than the 7-variable-plus-intercept model above. Why didn't combinatorial at least see that model? Why did it produce NO ANSWER?
In Eviews 7 the LOG(PRIV) and LOG(IVY) produce 'log non-positive number'. In Eviews 8, we get, well, I don't know.
Data set available at http://econ413.wustl.edu/pearlmank-v8.wf1
Regressions of interest at T* what WHAT*
See table RESULTS22 if you like.
Bob
-
EViews Gareth
- We came, we saw, we estimated
- Posts: 13605
- Joined: Tue Sep 16, 2008 5:38 pm
Re: stepwise with log of non positive values
Sorry, I'm having difficulty parsing your post. I can't follow what you believe is the issue.
Re: stepwise with log of non positive values
OK
I do a stepwise. Stops at 6 search variables with R^2 .916617
I do a combinatorial (to prove the point) selecting 7 of the 15 search variables.
That produces 7 search variables, adding log(ret) at a PVALUE of 0.00000000 and an R^2 = .926661
So the first stepwise, entering variables at .9, FAILS to enter log(ret), which has a p-value of 0.00000000
I don't know how to make it more clear to you that stepwise is not working correctly.
Add to that, a combinatorial of 8 produces BLANK!!!!!!!!!!!!!!!! with an R^2 = 0.00000000000000000000
You have to think that is an ISSUE!
I was careful to say that this data is special, that log(ivy) and log(priv) are logs of binary variables and in Version 7 I get 'log of a non-positive number'.
I took two hours constructing and reconstructing the example, being very careful, and writing what I thought were the pertinent points.
-
EViews Gareth
Re: stepwise with log of non positive values
What's the build date of your copy?
Re: stepwise with log of non positive values
Really, it is in the first two lines of my original post:
I am using stepwise regression on a data set with missing values.
Eviews 8 Sept 20 2013 build.
-
EViews Gareth
Re: stepwise with log of non positive values
Sorry, missed that. 32bit or 64bit?
I cannot replicate your results...
Attachment: 2013-10-12_070719.png (36.08 KiB)
Re: stepwise with log of non positive values
I should have mentioned STANDARD EDITION, XP service pack 3.
Your stepwise stops at LOG(RET), but estimating by hand
LS LOG(SCORE) C LOG(ENDPERUND) LOG(TOTENROLL) LOG(ALUM) LOG(S_F) LOG(GRAD) LOG(ACCEPT) LOG(RET) LOG(UNENROLL)
gives 199 observations. Your version stops without LOG(UNENROLL), which would enter with a p-value of .11 < .99.
-
EViews Gareth
Re: stepwise with log of non positive values
I'm somewhat confused over which issue you're asking about.
The first issue you mentioned, I believe, was that stepls with forward selection doesn't match the results given with stepls with combinatorial. However when I did it (as my screenshot shows), I obtained the same results between the two. Can you confirm that a) my interpretation of your issue is correct, b) you do not receive the same results between the two methods?
Re: stepwise with log of non positive values
Problem #1
Stepwise does not select variables which pass the selection criterion. In my case, LOG(RET) was not entered although its p-value passed the criterion. In YOUR case, LOG(UNENROLL) was not entered but passes the criterion.
I used combinatorial to show that in MY CASE the stepwise stopped too soon. I could have done it by hand but thought it better to show it via combinatorial. A MATCH is not THE issue, although it is an issue.
Issue #2: the combinatorial produces nothing, NO VARIABLES selected
STEPLS(METHOD=COMB,FTOL=0.9,BTOL=0.9,NVARS=8) LOG(SCORE) C @ LOG(ACCEPT) LOG(ACT) LOG(ALUM) LOG(ENDOW) LOG(ENDPERSTUD) LOG(ENDPERUND) LOG(GRAD) LOG(IVY) LOG(PRIV) LOG(RET) LOG(S_F) LOG(SAT) LOG(TOPTEN) LOG(TOTENROLL) LOG(UNENROLL)
Version 7 produces a 'log of non positive value' error for these regressions.
-
EViews Gareth
Re: stepwise with log of non positive values
Ah, I think I see your confusion. The key is the warning message printed at the top of the output:
Note: final equation sample is larger than stepwise sample (rejected
regressors contain missing values)
In your particular case the number of observations used during the stepwise procedure is 8. Not surprisingly, with only 8 observations, combinatorial (forcing 8 regressors) breaks down. Similarly, a forwards routine will stop before the model becomes singular due to a lack of observations.
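A minimal sketch of how the stepwise sample can shrink this far, written in Python rather than EViews and using made-up data. It assumes that the log of a non-positive value is stored as a missing value, which the warning message suggests but the thread does not state outright. The helper `log_or_missing` and the choice of 8 ones in the dummy are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_obs = 202  # same length as the workfile sample in the thread

# Hypothetical data: one strictly positive regressor and one binary dummy.
positive_var = rng.uniform(1.0, 10.0, size=n_obs)
binary_var = np.zeros(n_obs)
binary_var[:8] = 1.0  # assumption: only 8 observations have the dummy equal to 1


def log_or_missing(x):
    """Return log(x), with NaN (missing) wherever x is not strictly positive."""
    out = np.full(x.shape, np.nan)
    positive = x > 0
    out[positive] = np.log(x[positive])
    return out


search_regressors = np.column_stack([
    log_or_missing(positive_var),
    log_or_missing(binary_var),
])

# Listwise deletion: a row survives only if every search regressor is present.
common_sample = ~np.isnan(search_regressors).any(axis=1)
n_common = int(common_sample.sum())
print(n_common)
```

On the surviving rows the logged dummy is identically zero, since log(1) = 0, so in this sketch it is also collinear with the intercept there.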
Re: stepwise with log of non positive values
I am well aware of the warning message. EViews 7 did not warn; it would not do the regression.
I now see where you say 8 observations (common sample), but why then
STEPLS(METHOD=COMB,FTOL=0.9,BTOL=0.9,NVARS=7) LOG(SCORE) C @ LOG(ACCEPT) LOG(ACT) LOG(ALUM) LOG(ENDOW) LOG(ENDPERSTUD) LOG(ENDPERUND) LOG(GRAD) LOG(IVY) LOG(PRIV) LOG(RET) LOG(S_F) LOG(SAT) LOG(TOPTEN) LOG(TOTENROLL) LOG(UNENROLL)
Number of observations: 8
Are you saying you use the cross product matrix common sample for ALL the variables, make the selection, and then EXPAND the data set for that particular regression? If so, could you please point to a reference? And I would ask that the generic message be replaced with an exact message, something like 'Selection is based on ## observations while the final regression displayed is based on ## observations'. At least then someone like me will not waste time again.
-
EViews Gareth
Re: stepwise with log of non positive values
From the manual:
Following the Stepwise selection process, EViews reports the results of the final regression,
i.e. the regression of the always-included and the selected variables on the dependent variable.
In some cases the sample used in this equation may not coincide with the regression
that was used during the selection process. This will occur if some of the omitted search
variables have missing values for some observations that do not have missing values in the
final regression. In such cases EViews will print a warning in the regression output.
We'll take your suggestion on board.
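The two samples described in the manual passage can be sketched as follows. This is an illustration in Python with made-up data, not EViews code and not EViews's implementation; `sample_sizes` and the variable names are hypothetical. The message format is the one suggested earlier in the thread.

```python
import numpy as np


def sample_sizes(data, selected):
    """Count rows usable for selection and for the final regression.

    data: dict mapping variable name to a 1-D array, NaN marking missing values.
    selected: names that end up in the final regression (including the
    dependent variable).
    """
    # Selection stage: a row counts only if no variable at all is missing.
    all_cols = np.column_stack(list(data.values()))
    selection_rows = ~np.isnan(all_cols).any(axis=1)
    # Final stage: only the variables actually kept need to be present.
    final_cols = np.column_stack([data[name] for name in selected])
    final_rows = ~np.isnan(final_cols).any(axis=1)
    return int(selection_rows.sum()), int(final_rows.sum())


nan = np.nan
data = {
    "y": np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0]),
    "x1": np.array([0.5, 1.5, nan, 2.5, 3.5, 4.5]),
    "x2": np.array([nan, nan, nan, 0.0, 0.0, nan]),  # mostly missing, later rejected
}

n_selection, n_final = sample_sizes(data, ["y", "x1"])
message = (
    f"Selection is based on {n_selection} observations while the "
    f"final regression displayed is based on {n_final} observations"
)
print(message)
```

Here the rejected regressor `x2` restricts the selection stage to 2 rows, while the final regression on `y` and `x1` alone can use 5.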
Re: stepwise with log of non positive values
Well, I looked around the help files but likely did not look at the manual, only the help files. My fault, I guess.
Again, do you have a reference for this method of handling missing values in a stepwise procedure?
-
EViews Gareth
Re: stepwise with log of non positive values
Help files are identical to the manual.
We do not have a reference.
