reference dummy variable - results not the same
Posted: Wed Apr 23, 2014 3:26 am
by bibimiwi
Hello,
I estimated a probit model. As explanatory variables I included 2 of 3 dummy variables, in order to avoid the dummy variable trap. The omitted dummy variable is therefore my reference category. After estimating the equation, all the coefficients were statistically significant!
BUT then I wanted to check whether my results stay the same when I omit one of the other dummy variables and include the former reference dummy variable in my equation. Suddenly the one dummy variable I haven't touched at all is no longer significant and, according to the redundant variable test, completely irrelevant.
How can this be??
Re: reference dummy variable - results not the same
Posted: Wed Apr 23, 2014 7:00 am
by startz
The coefficient on the dummy tells the difference from the reference group. When you change the reference group you change the interpretation of the other dummy coefficients.
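As a sketch of what "difference from the reference group" means, take a linear regression on a constant and one dummy for every category except the reference. The notation here is an assumption for illustration and does not come from the thread: $r$ is the reference category, $D_{ji}$ is the dummy for category $j$, and there are no other regressors.

```latex
\[
y_i = \alpha + \sum_{j \neq r} \beta_j D_{ji} + \varepsilon_i,
\qquad
\alpha = E[y \mid \text{group } r],
\qquad
\beta_j = E[y \mid \text{group } j] - E[y \mid \text{group } r].
\]
```

In a probit the same comparison applies to the index inside the normal CDF rather than to y itself, so each $\beta_j$ is still measured relative to group $r$.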
Re: reference dummy variable - results not the same
Posted: Wed Apr 23, 2014 10:18 am
by bibimiwi
But can it really be that the dummy changes from highly significant to not significant at all just because I change the reference category? According to my books this should not be the case. They say that the results should be the same, no matter which dummy is omitted.
Re: reference dummy variable - results not the same
Posted: Wed Apr 23, 2014 10:22 am
by startz
The substantive results do not change. What a coefficient tells you about the substantive results does change.
Re: reference dummy variable - results not the same
Posted: Wed Apr 23, 2014 10:32 am
by bibimiwi
I am having difficulties distinguishing between "substantive results" and "what a coefficient tells you about the substantive results".
Can you, by any chance, think of an example that could illustrate the difference?
Because I thought that the interpretation of the coefficients gives me the substantive results...
Re: reference dummy variable - results not the same
Posted: Wed Apr 23, 2014 11:08 am
by startz
Suppose that FEMALE is coded 1 for women and 0 for men and that MALE is coded the other way around.
If you run the two regressions, one of y on a constant and FEMALE and the other of y on a constant and MALE, you will get opposite signs on the coefficients for the dummy variables because you have changed the reference group. However, the predicted values of y will be identical.
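A minimal sketch of this point in Python, assuming the two regressions are y on a constant and FEMALE, and y on a constant and MALE. The data and the helper `simple_ols` are made up for illustration; a plain least-squares fit stands in for whatever estimator is actually used.

```python
# Illustrative data (invented for this sketch): y for three men, then three women.
y = [10.0, 12.0, 11.0, 15.0, 17.0, 16.0]
female = [0, 0, 0, 1, 1, 1]
male = [1 - f for f in female]  # same information, coded the other way around


def simple_ols(x, y):
    """Return (intercept, slope) from least squares of y on a constant and x."""
    n = len(y)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


c_f, b_female = simple_ols(female, y)  # reference group: men
c_m, b_male = simple_ols(male, y)      # reference group: women

# Predicted y for every observation under each coding.
fitted_f = [c_f + b_female * d for d in female]
fitted_m = [c_m + b_male * d for d in male]

print("FEMALE regression:", c_f, b_female)
print("MALE regression:  ", c_m, b_male)
print("same fitted values:", all(abs(a - b) < 1e-9 for a, b in zip(fitted_f, fitted_m)))
```

In this made-up sample the men average 11 and the women 16, so the constant is the mean of whichever group is the reference and the dummy coefficient is the gap to the other group, with its sign flipping when the reference changes.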
Re: reference dummy variable - results not the same
Posted: Wed Apr 23, 2014 11:13 am
by bibimiwi
But I have not done it like that!
My dummies are always coded 1 if an event happens and 0 if it does not. I never changed this!
Re: reference dummy variable - results not the same
Posted: Wed Apr 23, 2014 11:19 am
by startz
I think you might want to go back and read up on dummy variables. Don't worry that you're doing a probit; the issues are the same in a linear regression.
Re: reference dummy variable - results not the same
Posted: Wed Apr 23, 2014 11:31 am
by bibimiwi
I think you just misunderstood me :(
Let's say I have 3 dummy variables D1, D2, D3. If situation 1 happens, D1 is coded 1 (0 if not) and D2 and D3 are coded 0. If situation 2 happens, D2 is coded 1 (0 if not) and D1 and D3 are coded 0, and so on.
Now I estimate an equation:
y c x D1 D2
D1 and D2 have high z-statistics and low p-values. C and x are also significant.
Now I change my equation into:
y c x D1 D3
D1 and C are no longer significant. x and D3 are significant.
Do you understand what I mean?
Re: reference dummy variable - results not the same
Posted: Wed Apr 23, 2014 11:34 am
by startz
What do you think the coefficient on D1 tells you in the first equation? And what does the same coefficient tell you in the second equation?
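One way to work through this question is a small numerical sketch. The assumptions here are loud ones: the data are invented, x is left out for brevity, a linear regression replaces the probit (the dummy-variable logic is the same), and `ols` is a hypothetical helper written for this example. Groups 1 and 2 are given similar means, and group 3 a very different one.

```python
import math


def ols(columns, y):
    """OLS of y on the given regressor columns; returns (coefs, t_stats, fitted)."""
    n, k = len(y), len(columns)
    xtx = [[sum(a * b for a, b in zip(ci, cj)) for cj in columns] for ci in columns]
    xty = [sum(a * b for a, b in zip(ci, y)) for ci in columns]
    # Gauss-Jordan elimination on [X'X | I] to obtain the inverse of X'X.
    aug = [row + [1.0 if i == j else 0.0 for j in range(k)] for i, row in enumerate(xtx)]
    for i in range(k):
        pivot_row = max(range(i, k), key=lambda r: abs(aug[r][i]))
        aug[i], aug[pivot_row] = aug[pivot_row], aug[i]
        pivot = aug[i][i]
        aug[i] = [v / pivot for v in aug[i]]
        for r in range(k):
            if r != i:
                factor = aug[r][i]
                aug[r] = [vr - factor * vi for vr, vi in zip(aug[r], aug[i])]
    inv = [row[k:] for row in aug]
    beta = [sum(inv[i][j] * xty[j] for j in range(k)) for i in range(k)]
    fitted = [sum(b * col[t] for b, col in zip(beta, columns)) for t in range(n)]
    s2 = sum((yt - ft) ** 2 for yt, ft in zip(y, fitted)) / (n - k)
    t_stats = [b / math.sqrt(s2 * inv[j][j]) for j, b in enumerate(beta)]
    return beta, t_stats, fitted


# Invented data: four observations in each of three mutually exclusive situations.
group = [1] * 4 + [2] * 4 + [3] * 4
y = [5.0, 6.0, 5.5, 6.5, 5.2, 6.1, 5.6, 6.3, 1.0, 2.0, 1.5, 2.5]
const = [1.0] * len(y)
d1 = [1.0 if g == 1 else 0.0 for g in group]
d2 = [1.0 if g == 2 else 0.0 for g in group]
d3 = [1.0 if g == 3 else 0.0 for g in group]

beta_a, t_a, fit_a = ols([const, d1, d2], y)  # like "y c D1 D2": reference is group 3
beta_b, t_b, fit_b = ols([const, d1, d3], y)  # like "y c D1 D3": reference is group 2

print("D1 vs group 3: coef %.3f, t %.2f" % (beta_a[1], t_a[1]))
print("D1 vs group 2: coef %.3f, t %.2f" % (beta_b[1], t_b[1]))
print("same fitted values:", all(abs(a - b) < 1e-9 for a, b in zip(fit_a, fit_b)))
```

In the first equation the D1 coefficient compares group 1 with group 3, and in the second it compares group 1 with group 2, so the second D1 coefficient equals the first D1 coefficient minus the first D2 coefficient. With these numbers that difference is close to zero, so D1 looks insignificant in the second equation even though the fitted values are unchanged.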