Dear All,
I am running a simple OLS regression in STATA and EViews.
lnY = a + b*lnX + c*Z + e
I obtain exactly the same results in both programs. However, when I run the equation without a constant, I get a different R-sq and adjusted R-sq in EViews than in STATA. The R-sq in EViews (for the model without a constant) is very low and the adjusted R-sq is negative. The same model (without a constant) run in STATA gives me a very high R-sq and adjusted R-sq (even higher than in the model with a constant).
Why is it that I get identical results with a constant but not without a constant? Why do R-sqs differ in the latter case?
Thank you.
K
Different R-sq in Eviews and STATA.
Moderators: EViews Gareth, EViews Moderator
-
k.m.majkut
- Posts: 2
- Joined: Mon Feb 21, 2011 8:32 am
Re: Different R-sq in Eviews and STATA.
Right,
I managed to figure out what is going on so I'm posting it here FYI.
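The formula for R-sq is derived from SStotal = SSerror + SSregression, where SS is a sum of squares. With the sums of squares measured around Ybar, this decomposition is only guaranteed when the model includes a constant.
R2 = 1 - SSerror/SStotal = (SStotal - SSerror)/SStotal = SSregression/SStotal
STATA uses the formula:
R2 = SSregression/SStotal
With a constant in the model:
SSregression = sum (Yhat - Ybar)^2, where Yhat is the predicted value of Y and Ybar is the mean value of Y.
SStotal = sum (Y - Ybar)^2
When the OLS regression is run without a constant, the mean of Y is of course not actually 0, but Stata stops measuring the sums of squares around Ybar and measures them around 0 instead (uncentered sums of squares). We then get:
R2 = sum (Yhat^2) / sum (Y^2), which always lies between 0 and 1, because sum (Y^2) = sum (Yhat^2) + SSerror still holds without a constant. Stata uses this formula for the R-sq calculation, hence we never get a negative R-sq. When the mean of Y is far from 0, sum (Y^2) is much larger than sum (Y - Ybar)^2, which is why this R-sq can come out even higher than in the model with a constant. The two numbers are not comparable.
What is happening in EViews is that when the model without a constant provides a really poor fit, the data may show more variation around the regression line than around Ybar, in which case SSerror > SStotal (with SStotal still measured around Ybar), and then:
R2 = 1 - SSerror/SStotal < 0. This is possible because EViews appears to be using the formula R2 = 1 - SSerror/SStotal with SStotal = sum (Y - Ybar)^2, even when there is no constant in the model.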
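To see the two conventions side by side, here is a minimal sketch in plain Python. It is not taken from either program: the function name r2_no_intercept and the data are made up for illustration, and it uses a single regressor (y = b*x, no constant) rather than the two-regressor model above. It computes the centered version, 1 - SSerror/sum (Y - Ybar)^2, which is the formula attributed to EViews above, and the uncentered version, sum (Yhat^2)/sum (Y^2), which is the formula attributed to Stata above.

```python
def r2_no_intercept(x, y):
    """Fit y = b*x by OLS (no constant) and return b plus two R-squared variants.

    Illustrative helper only; not part of Stata, EViews, or any library.
    """
    # OLS slope for regression through the origin: b = sum(x*y) / sum(x^2)
    b = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)
    yhat = [b * xi for xi in x]
    ybar = sum(y) / len(y)

    ss_error = sum((yi - fi) ** 2 for yi, fi in zip(y, yhat))
    ss_total_centered = sum((yi - ybar) ** 2 for yi in y)  # measured around Ybar
    ss_total_uncentered = sum(yi ** 2 for yi in y)         # measured around 0

    # Centered convention: can be negative when the fit is worse than Ybar alone
    r2_centered = 1 - ss_error / ss_total_centered
    # Uncentered convention: always between 0 and 1
    r2_uncentered = sum(fi ** 2 for fi in yhat) / ss_total_uncentered
    return b, r2_centered, r2_uncentered


# Made-up data: Y is far from 0 on average but barely moves with X
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [10.0, 10.5, 9.5, 10.2, 9.8]

b, r2_c, r2_u = r2_no_intercept(x, y)
print("slope:", b)
print("centered R2 (1 - SSerror/SStotal around Ybar):", r2_c)
print("uncentered R2 (sum Yhat^2 / sum Y^2):", r2_u)
```

With data like these, the centered R-sq comes out well below zero while the uncentered one is high, which is the same pattern as in the original question: same coefficients, very different R-sq.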
Also, see the article by Eisenhauer ('Regression through the Origin') where he explains why one can get different results under different statistical programs for OLS models without an intercept (he compares Excel and SPSS). He also explains the above in more detail.
http://evergreen.loyola.edu/chm/www/st4 ... at-rto.pdf
Hope it helps someone.
