Regression for every industry and year

For questions regarding programming in the EViews programming language.

Moderators: EViews Gareth, EViews Moderator, EViews Jason, EViews Matt

ROSSINL
Posts: 25
Joined: Sun Jul 11, 2010 10:43 pm

Regression for every industry and year

Postby ROSSINL » Wed Jul 06, 2011 1:07 pm

Hi,

I have an unbalanced panel dataset of approximately 3,000 firms over a period of 20 years, with approximately 20,000 firm years.
Every firmyear contains information on a couple of dozen variables and every firm is part of an industry, identified by a SIC code.

Now I would like to run OLS regressions for every 2-digit SIC industry (approximately 60) and every fiscal year (20 years).
Since running 1200 regressions manually is a bit inefficient, I am looking for help to program this in Eviews.

The regression model is linear and has 3 independent variables: Y = a+b1X1+b2X2+b3X3+e

I also need to save all four parameter estimates for all the separate 1200 cross-sectional regressions and in addition, the residuals of these models, both preferably in one series.
So saving "resid" to e.g. the series "residuals" after every regression, resulting in one series that contains the residual for every observation estimated by one of the 1200 regressions. The same idea for the parameters estimates, which would result in four series that contain 1200 unique values for the constant, b1, b2 and b3. Saving in one series is not strictly necessary, but otherwise this would result in (4+1) x 1200 = 6000 series.
But perhaps there is an easier way to accomplish this...

I have already read the Eviews manual and the introduction to programming on this forum, but after hours of trying I'm afraid programming is not my strength :cry: I figured out I need to create a loop...
If someone is able to provide a program or guide me into the right direction, I would be extremely grateful! If my description is unclear, please let me know.

EViews Gareth
Fe ddaethom, fe welon, fe amcangyfrifon
Posts: 13583
Joined: Tue Sep 16, 2008 5:38 pm

Re: Regression for every industry and year

Postby EViews Gareth » Wed Jul 06, 2011 1:55 pm

Assuming, for the moment, that you had a balanced panel, rather than unbalanced (the theory for both is the same, but back-of-envelope size calculations are easier in the balanced case), you have 20,000 observations in your panel. i.e. each series is of length 20,000. You will be doing 1,200 regressions, and for each regression you'll have 1 value for coefficient 1. Thus you'll have a total of 1,200 observations. It doesn't make sense to store those coefficient values into a series of size 20,000. You're better off storing them in a matrix.

A sample bit of code would be something like:

Code: Select all

matrix(1200,4) coefs !counter = 1 for %sic siccode1 siccode2 siccode3 siccode4 'list all sic codes here. for !year=1 to 20 smpl @first+!year-1 @first+!year-1 if sic = %sic equation eq1.ls y c x1 x2 x3 coefs(!counter,1) = eq1.@coef(1) coefs(!counter,2) = eq1.@coef(2) coefs(!counter,3) = eq1.@coef(3) coefs(!counter,4) = eq1.@coef(4) series residuals = resid !counter = !counter+1 next next

ROSSINL
Posts: 25
Joined: Sun Jul 11, 2010 10:43 pm

Re: Regression for every industry and year

Postby ROSSINL » Wed Jul 06, 2011 3:44 pm

Thank you Gareth, this seems to go into the right direction!
I get a matrix with the right dimensions in the workfile, but then the program returns an error:
Error in Sample: Range Error in "SMPL @FIRST+1991-1 @FIRST+1991-1 IF SIC2 = "1""

I have the two digit SIC codes in a series called "sic2" which run from 1 to 99.
For running the regression "ni = c at sale ppent" I modified your code as follows:

Code: Select all

matrix(1200,4) coefs !counter = 1 for %sic2 1 to 99 'list all sic codes here. for !year=1991 to 2010 smpl @first+!year-1 @first+!year-1 if sic2 = %sic2 equation eq1.ls ni c at sale ppent coefs(!counter,1) = eq1.@coef(1) coefs(!counter,2) = eq1.@coef(2) coefs(!counter,3) = eq1.@coef(3) coefs(!counter,4) = eq1.@coef(4) series residuals = resid !counter = !counter+1 next next
Perhaps it's easier if you see the workfile, so I attached a trimmed version of it with data from 2005-2009 and only the key variables (due to size limit of 2MB).
Thank you in advance!

The date series id is "fyear", the cross section id is "gvkey".
Attachments
code.wf1
(988.52 KiB) Downloaded 695 times

EViews Gareth
Fe ddaethom, fe welon, fe amcangyfrifon
Posts: 13583
Joined: Tue Sep 16, 2008 5:38 pm

Regression for every industry and year

Postby EViews Gareth » Wed Jul 06, 2011 4:09 pm

Change %sic to be !sic

ROSSINL
Posts: 25
Joined: Sun Jul 11, 2010 10:43 pm

Re: Regression for every industry and year

Postby ROSSINL » Wed Jul 06, 2011 4:31 pm

I changed both %sic to !sic but got an error "syntax error in "FOR 0 1 TO 99"
If I place an equality sign as in

Code: Select all

for !sic2=1 to 99 'list all sic codes here.
the syntax error disappears but then I'm back at the previous error:
Error in Sample: Range Error in "SMPL @FIRST+1991-1 @FIRST+1991-1 IF SIC2 = "1""

EViews Gareth
Fe ddaethom, fe welon, fe amcangyfrifon
Posts: 13583
Joined: Tue Sep 16, 2008 5:38 pm

Regression for every industry and year

Postby EViews Gareth » Wed Jul 06, 2011 5:18 pm

Post your entire program

ROSSINL
Posts: 25
Joined: Sun Jul 11, 2010 10:43 pm

Re: Regression for every industry and year

Postby ROSSINL » Wed Jul 06, 2011 5:33 pm

Gareth, I think I (we) almost got it. After a lot of trial and error I tried this for a small sample of SIC industries 60-63. I've also extended the matrix for storing other statistics:

Code: Select all

matrix(1200,16) coefs !counter = 1 for !year=1991 to 2010 for !sic2=60 to 63 'list all sic codes here. smpl !year !year if sic2 = !sic2 equation eq1.ls ni c at sale ppent coefs(!counter,1) = eq1.@coef(1) coefs(!counter,2) = eq1.@coef(2) coefs(!counter,3) = eq1.@coef(3) coefs(!counter,4) = eq1.@coef(4) coefs(!counter,5) = eq1.@stderrs(1) coefs(!counter,6) = eq1.@stderrs(2) coefs(!counter,7) = eq1.@stderrs(3) coefs(!counter,8) = eq1.@stderrs(4) coefs(!counter,9) = eq1.@tstats(1) coefs(!counter,10) = eq1.@tstats(2) coefs(!counter,11) = eq1.@tstats(3) coefs(!counter,12) = eq1.@tstats(4) coefs(!counter,13) = eq1.@r2 coefs(!counter,14) = eq1.@rbar2 coefs(!counter,15) = eq1.@f coefs(!counter,16) = eq1.@regobs series residuals = resid !counter = !counter+1 next next
It works! I get 20 years x 4 SIC industries = 80 rows in the COEFS matrix and the "residuals" series contains residuals only for the observations that are part of these industry-years. Could you please give your opinion whether the above code is appropriate?

There is only one problem left: some industries have too few or no observations. So when I change the SIC range to:

Code: Select all

for !sic2=1 to 99 'list all sic codes here
the program stops at SIC=3 because there are no observations in this industry.
Can you advise how to force the program to ignore the error for that industry and proceed with the next industry in the loop? I've not found any topics that address this problem in programming.

EViews Gareth
Fe ddaethom, fe welon, fe amcangyfrifon
Posts: 13583
Joined: Tue Sep 16, 2008 5:38 pm

Regression for every industry and year

Postby EViews Gareth » Wed Jul 06, 2011 6:34 pm

When you run the program, set the number of errors to something greater than zero.

ROSSINL
Posts: 25
Joined: Sun Jul 11, 2010 10:43 pm

Re: Regression for every industry and year

Postby ROSSINL » Wed Jul 06, 2011 8:48 pm

Thank you, that did the trick!

I have one final question:

I would like the program to only perform the industry-year regression if there are at least 15 observations to avoid a poor fit.
I've tried to use an if statement in combination with the @regobs command, but that did not work (guess because @regobs becomes only known after the regression?).

Do you have any suggestions for this? Is there a command that calculates the number of observations before a regression?

The code so far:

Code: Select all

matrix(2000,21) coefs !counter = 1 for !year=1991 to 2010 for !sic2=1 to 99 'list all sic codes here. smpl !year !year if sic2 = !sic2 equation eq1.ls ni c at sale ppent coefs(!counter,1) = eq1.@coef(1) coefs(!counter,2) = eq1.@coef(2) coefs(!counter,3) = eq1.@coef(3) coefs(!counter,4) = eq1.@coef(4) coefs(!counter,5) = eq1.@stderrs(1) coefs(!counter,6) = eq1.@stderrs(2) coefs(!counter,7) = eq1.@stderrs(3) coefs(!counter,8) = eq1.@stderrs(4) coefs(!counter,9) = eq1.@tstats(1) coefs(!counter,10) = eq1.@tstats(2) coefs(!counter,11) = eq1.@tstats(3) coefs(!counter,12) = eq1.@tstats(4) coefs(!counter,13) = 2*(1-@ctdist(@abs(@tstats(1)),@regobs-@ncoef)) coefs(!counter,14) = 2*(1-@ctdist(@abs(@tstats(2)),@regobs-@ncoef)) coefs(!counter,15) = 2*(1-@ctdist(@abs(@tstats(3)),@regobs-@ncoef)) coefs(!counter,16) = 2*(1-@ctdist(@abs(@tstats(4)),@regobs-@ncoef)) coefs(!counter,17) = eq1.@r2 coefs(!counter,18) = eq1.@rbar2 coefs(!counter,19) = eq1.@f coefs(!counter,20) = @fdist(eq1.@f,@ncoef-1,eq1.@regobs-@ncoef) coefs(!counter,21) = eq1.@regobs series residuals = resid !counter = !counter+1 next next

ROSSINL
Posts: 25
Joined: Sun Jul 11, 2010 10:43 pm

Re: Regression for every industry and year

Postby ROSSINL » Thu Jul 21, 2011 7:26 am

To date, I haven't found a solution yet :(

Does anyone know how to let the program check whether there are at least # number of observations, before performing industry-year regressions?

Thank you.

EViews Gareth
Fe ddaethom, fe welon, fe amcangyfrifon
Posts: 13583
Joined: Tue Sep 16, 2008 5:38 pm

Re: Regression for every industry and year

Postby EViews Gareth » Thu Jul 21, 2011 7:56 am

You can't get that information from an equation. You'll have to get it from the individual series. I think I would put them all in a group, then use the @robs function to work which observations are available.

To be honest though, you're probably better off just estimating every equation, then only using the results if @regobs>15.

ROSSINL
Posts: 25
Joined: Sun Jul 11, 2010 10:43 pm

Re: Regression for every industry and year

Postby ROSSINL » Thu Jul 21, 2011 3:36 pm

Ok too bad, but than I'll look at the number of observations after the regression, like you suggested.
Thank you for the help Gareth.

ROSSINL
Posts: 25
Joined: Sun Jul 11, 2010 10:43 pm

Re: Regression for every industry and year

Postby ROSSINL » Mon Jul 25, 2011 7:32 pm

One quick question please:

In this section of the code above:

Code: Select all

for !sic2=1 to 99
is it possible to refer to a series containing the SIC codes instead of "1 to 99"? Since I read in the manual that "for" has to be combined with "to"? I need this because the SIC codes do not run continously from 1 to 99 but contain gaps.
I've tried a ton of things but can't get it to work...

EViews Gareth
Fe ddaethom, fe welon, fe amcangyfrifon
Posts: 13583
Joined: Tue Sep 16, 2008 5:38 pm

Regression for every industry and year

Postby EViews Gareth » Mon Jul 25, 2011 9:38 pm

Create a string containing the codes you want to use, then loop over the string.

ROSSINL
Posts: 25
Joined: Sun Jul 11, 2010 10:43 pm

Re: Regression for every industry and year

Postby ROSSINL » Tue Jul 26, 2011 9:19 pm

Thank you, that does work :)


Return to “Programming”

Who is online

Users browsing this forum: No registered users and 1 guest