
Industry dummy variables

Posted: Mon Jul 20, 2009 4:46 am
by jessisam
Hi.
I need help creating industry groups. I have a data set of companies that I want to group by industry code. For example, one firm has the value 2 (agriculture and hunting), another has the value 5 (fishing), and another has the value 13 (mining). I want to create a group called Id1 in which all three of these companies take the industry code 1, for agricultural companies. The point is that I must not include firms whose codes fall in between 2, 5, and 13, because they are not agricultural firms. Then I would like to do the same for manufacturing companies, which should take the value 2.

Afterwards when I have created these groups I would like to create an industry dummy variable, to see how the different industry groups affect our data. How do I do this?

Very grateful for any help I can get!

Re: Industry dummy variables

Posted: Mon Jul 20, 2009 6:51 am
by startz

Code:

series id1 = 0
smpl if firm=2 or firm=5 or firm=13
id1=1
smpl @all
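
The same pattern can be repeated for the other groups, and the groups can then be collected into a single categorical series. The lines below are only a sketch under stated assumptions: the codes 15, 16 and 17 are hypothetical placeholders for the manufacturing codes, and the names industry, eq_ind, y and x are made up for illustration.

```
' Hypothetical manufacturing codes; replace 15, 16, 17 with the real ones.
series id2 = 0
smpl if firm=15 or firm=16 or firm=17
id2 = 1
smpl @all

' One series holding the group number (0 = firm not assigned to any group).
series industry = 0
smpl if id1=1
industry = 1
smpl if id2=1
industry = 2
smpl @all

' y and x are placeholder names for the dependent variable and a regressor.
' @expand creates one dummy per value of industry; @dropfirst omits the first
' category (here 0) so that it serves as the reference group.
equation eq_ind.ls y c x @expand(industry, @dropfirst)
```

Using @expand avoids typing each dummy by hand; listing id1, id2 and so on explicitly in the equation should give the same kind of specification, as long as one group is left out as the reference.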

Re: Industry dummy variables

Posted: Tue Aug 04, 2009 3:25 am
by jessisam
Thanks, that worked!

I have now managed to estimate an OLS regression with the industry dummy variables as control variables, and I got reasonable results. This is what my regression specification looked like:

"Avgdivpercent c avgriskpercent avginvopp4 avgassets avgroa avgdebt avgcash avgceosallary avgceoshare id1 id2 id3 id4 id5 id6 id7 id8"

Now I want to estimate the same equation after removing 1% of the outliers in my sample. So I have created new regressors without 1% of the outliers. But I still use the same dummy variables to control for the industry. The regression line looked like this:

"Out_divpercent c out_riskpercent out_invopp4 out_assets out_roa out_debt out_cash out_ceosallary out_ceoshare id1 id2 id3 id4 id5 id6 id7 id8"

However, I get the message "near singular matrix", so I computed a correlation table for all the regressors and found only NA for industries 3 and 6. This probably means that after removing 1% of the outliers in our sample there are no observations left in these two industry groups.
But the problem is that I still need to estimate the regression while controlling for the industry dummy variables. So how should my regression be specified? Surely I cannot just remove id3 and id6? If I do remove id3 and id6, the regression runs, but the results are now totally different from those before removing the outliers! Don't all the industry variables (except one, the reference group) have to be in the regression in order to work as dummy variables?

Have I performed the regression correctly? If not, how do I solve this problem?

Thank you for your help!

Re: Industry dummy variables

Posted: Tue Aug 04, 2009 6:50 am
by startz
If you don't have any data for industries 3 and 6, you can't estimate an equation specific to those industries and can't include dummies for them.
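
One way to check this before estimating is to count how many firms remain in each industry once the trimmed variables are in use. This is a rough sketch, assuming the outliers were set to NA in the out_ series; the names chk, n_id3, n_id6 and eq_trim are hypothetical.

```
' NA propagates through arithmetic, so chk is NA whenever any trimmed variable is NA.
series chk = out_divpercent + out_riskpercent + out_invopp4 + out_assets + out_roa + out_debt + out_cash + out_ceosallary + out_ceoshare

' Keep only the observations that would enter the regression.
smpl if chk<>na

' Number of remaining firms in industries 3 and 6.
scalar n_id3 = @sum(id3)
scalar n_id6 = @sum(id6)
smpl @all

' If both counts are zero, estimate without those two dummies.
equation eq_trim.ls out_divpercent c out_riskpercent out_invopp4 out_assets out_roa out_debt out_cash out_ceosallary out_ceoshare id1 id2 id4 id5 id7 id8
```

A dummy that is zero for every observation in the estimation sample contributes nothing to the fit, so leaving it out does not by itself change the other coefficients. Differences from the untrimmed results would then come from the change in sample.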