Nonlinear panel GMM: bug?
Moderators: EViews Gareth, EViews Moderator
The attached program tries to estimate a simple regression model with 1 regressor and a nonlinearity in the parameter, using a slightly unbalanced panel data set. Although the GMM objective function has just one minimum and is perfectly differentiable (as suggested by the grid search in the program), for some reasonable starting values (such as 4) the estimation procedure diverges. For other starting values (such as 3) the estimation results are fine. Is there a bug in the estimation routine?
Attachments:
dataset.wf1 (632.81 KiB)
BugQuestion.prg (651 Bytes)
 Fe ddaethom, fe welon, fe amcangyfrifon
 Posts: 11912
 Joined: Tue Sep 16, 2008 5:38 pm
Re: Nonlinear panel GMM: bug?
This isn't really a bug per se. Any nonlinear estimation will be sensitive to starting values, and there will always (well nearly always) be a range of starting values that cause the estimation to fail.
Re: Nonlinear panel GMM: bug?
Well, a grid search makes clear that the objective function is perfect (single minimum, smooth, ...). I think an estimation procedure without bugs should be able to find the minimum in this model. For completeness, the model is equation a.GMM yy = C(1) + @logit(C(2)) * yy(-1) @ yy(-1). As you see: really straightforward. So I still think there is a bug. What is wrong?

 EViews Developer
 Posts: 2599
 Joined: Wed Oct 15, 2008 9:17 am
Re: Nonlinear panel GMM: bug?
This is an interesting example. But I still don't believe it's a bug.
Your "grid search" results, while interesting, don't quite indicate what you think they do. The results obtained by fixing the value of one parameter and optimizing the function in a second parameter only say that the particular algorithm implied by your "grid" loop, which doesn't perform joint updating of parameters, works for poor starting values of C(2). But just because sequential updating of coefficients has monotonicity properties doesn't mean that those same properties hold for joint updating of coefficients.
The bottom line is that even if a function is smooth and well behaved, and a grid search in orthogonal dimensions shows a single optimum, there is nothing to keep an iterative procedure based on two-dimensional gradients from failing to converge if the starting values are poor. Another way of saying this is that at bad starting values for this problem, you are better off doing sequential updating than joint updating. Fair enough. But the solution here is simply to provide better starting values.
If you are still interested in doing a grid search, what you need to obtain is the shape of the *joint* objective surface in the neighborhood of your divergent starting values. I would bet that it is quite flat, with a ridge away from your sequential solution, which is what is leading to divergence. But that's a property of the objective function and not a bug in the optimization routine.
Your example does suggest that the ability to do single (or more generally, lower) dimensional updates to start nonlinear estimation might be a useful feature in EViews. It is not immediately apparent to me how this would work in large parameter settings, and what sort of interface we would provide for choosing how to do the dimension subsetting, but it is something for us to consider.
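For readers who want to look at the joint surface rather than one-dimensional slices, here is a minimal Python sketch that tabulates an objective over a grid in both coefficients at once. Everything in it is an assumption made for illustration: the data are simulated from an AR(1) process (this is not the poster's dataset.wf1), a least-squares criterion stands in for the GMM objective, @logit(x) is taken to mean e^x/(1+e^x), and the names `logistic`, `simulate_ar1`, `objective` and `joint_grid` are hypothetical.

```python
import math
import random


def logistic(x):
    """Numerically stable e^x / (1 + e^x), the mapping assumed for @logit."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def simulate_ar1(n=1000, const=0.0014, rho=0.76, sigma=0.02, seed=0):
    """Simulated stand-in data: y[t] = const + rho * y[t-1] + noise."""
    rng = random.Random(seed)
    y = [0.0]
    for _ in range(n):
        y.append(const + rho * y[-1] + rng.gauss(0.0, sigma))
    return y


def objective(y, c1, c2):
    """Sum of squared residuals of y[t] = c1 + logistic(c2) * y[t-1]."""
    b = logistic(c2)
    return sum((y[t] - c1 - b * y[t - 1]) ** 2 for t in range(1, len(y)))


def joint_grid(y, c1_values, c2_values):
    """Evaluate the objective at every (c1, c2) pair on the grid."""
    return {(c1, c2): objective(y, c1, c2)
            for c1 in c1_values for c2 in c2_values}


y = simulate_ar1()
c1_values = [k * 0.001 for k in range(-2, 5)]   # -0.002 ... 0.004
c2_values = [k * 0.5 for k in range(-20, 61)]   # -10 ... 30
surface = joint_grid(y, c1_values, c2_values)

# Grid point with the smallest objective value.
best_c1, best_c2 = min(surface, key=surface.get)
print("grid minimum at c1 =", best_c1, "c2 =", best_c2)

# Far from the optimum the surface barely changes with c2.
flat_change = abs(objective(y, 0.0, 20.0) - objective(y, 0.0, 30.0))
print("change in objective between c2=20 and c2=30:", flat_change)
```

On this stand-in problem the surface is essentially constant in c2 once |c2| is large, which is the kind of flat region a gradient-based step cannot recover from. Whether the poster's actual GMM surface looks the same can only be checked against the real data.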
Re: Nonlinear panel GMM: bug?
Thanks a lot for your serious answer. It is informative about the interpretation of grid results.
Notwithstanding this, I still think there is a bug.
Let us forget about the grid search. The code I enter is equation a.GMM yy = C(1) + @logit(C(2)) * yy(-1) @ yy(-1), so just a simple linear panel regression of yy on yy(-1) with parameters C(1) and C(3), where C(3) is nonlinearly transformed into C(2). If I enter equation a.GMM yy = C(1) + C(3) * yy(-1) @ yy(-1), then the estimated C(3) is 0.7637677018407483. Thus the estimate of C(2) should be log(0.764/(1-0.764)) = 1.17.
This is indeed what I get when starting from C(2)=3. But when I start from C(2)=4, EViews does not converge ("near singular matrix" error message). The estimation problem is (very) well-behaved: a single optimum when using C(3), and that does not change after a smooth nonlinear transformation of just one parameter. Hence, I really do not see any valid reason why EViews may not converge, except for the presence of a bug. I hope you can solve this issue. Thanks a lot.
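The arithmetic behind the implied value of C(2) can be checked with a few lines of Python. This is only a sketch: it assumes that @logit(x) denotes e^x/(1+e^x), and the helper names `logistic` and `log_odds` are made up for the example.

```python
import math


def logistic(x):
    """e^x / (1 + e^x), the mapping assumed for @logit."""
    return 1.0 / (1.0 + math.exp(-x))


def log_odds(p):
    """Inverse of logistic(): the x for which logistic(x) equals p."""
    return math.log(p / (1.0 - p))


c3_hat = 0.7637677018407483      # estimate from the linear specification
c2_implied = log_odds(c3_hat)    # value C(2) must take to reproduce it

print(round(c2_implied, 4))      # 1.1734
```

Mapping `c2_implied` back through `logistic` returns `c3_hat`, so the two specifications describe the same fitted model.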

 EViews Developer
 Posts: 2599
 Joined: Wed Oct 15, 2008 9:17 am
Re: Nonlinear panel GMM: bug?
I still don't believe there is a bug, nor do I think that you quite understood the essence of my earlier post.
The issue here is that, as with many nonlinear functions, gradient-based estimation of your function may not be particularly well behaved at all parameter values. This poor behavior can occur irrespective of the global properties of the function (i.e., you can have a single optimum in m dimensions, but if you are in a completely flat part of the objective, you won't be able to get a gradient-based algorithm going). Importantly, the existence of a single optimum is irrelevant for whether the estimation will converge from a given set of starting values. To offer you an extreme example, suppose you specified a starting value of C(2)=-10000000. In this case, the coefficient on YY(-1) is effectively zero and the derivative of the objective with respect to the coefficient is also zero. There is no way that a gradient-based optimizer can get started using those derivatives, since at those starting values it looks like the function is independent of the value of C(2).
While your starting value for C(2) isn't as extreme as my example, it does, when combined with particular starting values for C(1), appear to be poor enough that EViews detects a singularity. (In this regard, I note that at a starting value of C(2)=4 you have an autoregressive parameter of 0.982, which is getting close to a unit root; at C(2)=3, you're at 0.953.) As I said in the earlier post, grid search in one or more directions could help here. Your example provides evidence that it would be a useful tool to add to EViews. But the example does not in any way provide evidence of a bug in the program.
There are a large number of books on nonlinear optimization that will do a far better job than I of describing the issues involved (with far fewer heuristics and less hand-waving). In the event that the above didn't make sense, I highly recommend that you at least take a glance at one of them...
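The numbers quoted above are easy to reproduce. The Python sketch below assumes that @logit(x) means e^x/(1+e^x). It prints the implied autoregressive coefficient and the slope of the transformation at a few values of C(2), including extreme values of either sign. The function names are hypothetical.

```python
import math


def logistic(x):
    """Numerically stable e^x / (1 + e^x), the mapping assumed for @logit."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def logistic_slope(x):
    """Derivative of logistic() with respect to x, equal to p * (1 - p)."""
    p = logistic(x)
    return p * (1.0 - p)


for c2 in (3.0, 4.0, 10000000.0, -10000000.0):
    # Coefficient on the lagged variable and its sensitivity to C(2).
    print(c2, round(logistic(c2), 3), logistic_slope(c2))
```

At C(2)=3 and C(2)=4 the coefficient is 0.953 and 0.982 respectively, and at the extreme values the slope underflows to exactly zero in double precision, so a derivative-based update has nothing to work with there.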
Regards,
Re: Nonlinear panel GMM: bug?
Thanks for your answer. I still find it problematic that EViews does not give the right estimate for such a simple problem. I hope you understand my worries after reading the following comments on your note.
1) The relevance of the starting value for c(1), as you correctly noticed, was something that had slipped my mind, as I supposed that EViews would start at c(1)=0. I thus explicitly added c(1)=0 to my program, but the problem remains. Note that c(1)=0 implies a mean of 0, and that is well within the range of the dependent variable yy (which has mean=0.006, minimum=-0.13, maximum=0.06). Hence, c(1)=0 should be a good starting value and not the cause of the problem.
2) I find it hard to believe that starting at an autoregressive parameter of 0.982 (i.e., c(2)=4) is too close to 1.
3) But even if it were too close to 1, I would expect that the problem is that the objective function is too flat there, so that the algorithm does not get going (gradient effectively 0), as your example indicates. However, the problem is not that the algorithm fails to get going; the problem is that it goes in the WRONG direction. After all, the singularity is only reported after the algorithm has brought the value of c(2) from 4 to over 3000 (c(1) is still close to 0), whereas the correct estimate is below 4. Perhaps you could yourself run the program I added to my first message and see what happens.
4) I do not claim that my program proves there is a bug. I only claim that it worries me a lot and that it calls into question the reliability of EViews in general. As I like EViews, I really hope the problem can be solved. That is why I have put so much effort into this (just like you).

 EViews Developer
 Posts: 2599
 Joined: Wed Oct 15, 2008 9:17 am
Re: Nonlinear panel GMM: bug?
To be honest, I don't think that there is anything constructive that I can add to my previous response. All of the comments made earlier apply equally to your current set of comments.

 EViews Developer
 Posts: 161
 Joined: Wed Sep 17, 2008 10:39 am
Re: Nonlinear panel GMM: bug?
Just to clarify a little as to what is going on...
The steps that occur during the optimization starting from c(1)=0, c(2)=4 are:
(You can reproduce this by using the m= option to set the maximum number of iterations to halt at 1,2,3...)
Code: Select all
iter  c(1)       c(2)      Jstat
0     0.000000   4.000
1     0.001383   -8.356    286.3577693
2     0.001383   3243.5    61.69287656
3     6.18E-05   3243.5    59.26117147
Each iteration lowers the objective (the Jstat) and each step appears to be 'in the right direction' (note the sign flips in c(2)), but the steps in c(2) are 'too big', which takes us out into a region where the gradient with respect to c(2) is too small to be numerically useful.
It looks like the iterations that we're doing here are full Gauss-Newton steps. In some other places in EViews we damp these steps down a bit to reduce the possibility of the sort of explosive oscillation that you're seeing, but in this case we're simply following the rule that if the full Newton step lowers the objective, then we will take the step. We don't 'look ahead' and see that if we keep taking these steps, we are going to end up in trouble.
Note, on the other hand, what happens when we start from c(1)=0, c(2)=3. Then the iterations are:
Code: Select all
iter  c(1)          c(2)
0     0.00000       3.00000
1     0.001383024   -1.179291389
2     0.001383024   1.759428612
3     0.001383024   1.046140326
4     0.001383024   1.169511488
5     0.001383024   1.173443975
6     0.001383024   1.173448053
7     0.001383024   1.173448053
Note that the sign of C(2) still flips here, but things no longer explode. In this case the Gauss-Newton steps converge very quickly, and it's hard to argue that any other optimization method would be any more effective.
Overall, while I agree that we could possibly do better in your particular case, nonlinear optimization procedures can fail for a variety of reasons, and starting values are *always* going to be very important in nonlinear problems.
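The difference between full and damped steps can be illustrated with a small Python sketch. It is a simplified one-dimensional stand-in, not the EViews routine and not the actual GMM objective: it assumes @logit(x) means e^x/(1+e^x), holds the intercept fixed, and simply solves logistic(c2) = 0.7637677018407483 (the linear estimate quoted earlier in the thread) by Newton-type steps. The function `newton_path` and its `max_step` cap are hypothetical, and the cap is just one crude way of damping.

```python
import math

TARGET = 0.7637677018407483  # linear-model estimate of the AR coefficient


def logistic(x):
    """Numerically stable e^x / (1 + e^x), the mapping assumed for @logit."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def newton_path(c2, max_step=None, max_iter=50, tol=1e-10, min_slope=1e-12):
    """Iterate c2 <- c2 - residual / slope, optionally capping the step size."""
    path = [c2]
    for _ in range(max_iter):
        p = logistic(c2)
        slope = p * (1.0 - p)
        if slope < min_slope:
            break  # gradient too small to be numerically useful
        step = -(p - TARGET) / slope
        if max_step is not None:
            # Crude damping: never move more than max_step in one iteration.
            step = max(-max_step, min(max_step, step))
        c2 += step
        path.append(c2)
        if abs(step) < tol:
            break
    return path


full_from_3 = newton_path(3.0)                  # full steps, converges
full_from_4 = newton_path(4.0)                  # full steps, runs away
capped_from_4 = newton_path(4.0, max_step=2.0)  # capped steps, converges

print([round(v, 4) for v in full_from_3])
print([round(v, 1) for v in full_from_4])
print([round(v, 4) for v in capped_from_4])
```

In this stand-in, full steps from 3 overshoot to a negative value and then settle near 1.1734, full steps from 4 overshoot to about -8.36 and then jump past 3000 where the slope is numerically zero, and the capped version started from 4 converges to the same optimum. The magnitudes closely track the c(2) iterations listed above, which suggests the overshooting is driven by the curvature of the transformation itself.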
Re: Nonlinear panel GMM: bug?
Thanks; this clarifies the issue. Note that my simple example resulted from removing many complications in the model that I am actually interested in. In that model, I could not find good starting values at all (i.e., values that lead to convergence), even though a grid search indicated where the optimum would be. Hence, the convergence problem may occur in many more estimations. I hope my example helps to further improve the optimization routine (regarding step size) a bit in future versions of EViews. All the best.
Re: Nonlinear panel GMM: bug?
Dear EViews Chris, hello,
In your message dated 6 February 2010, you mention that we can set the m= option to view the results of each iteration step. I copied your reply below. I also have to do a grid search and would like to display the results at each iteration in a table, but couldn't figure out what you meant by using the m= option.
I would appreciate it if you could clarify this.
Br
Ahmet
"(You can reproduce this by using the m= option to set the maximum number of iterations to halt at 1,2,3...)"

 EViews Developer
 Posts: 2599
 Joined: Wed Oct 15, 2008 9:17 am
Re: Nonlinear panel GMM: bug?
m is the option for maximum number of iterations. It may be set from the command line, or from the appropriate edit field in the dialog.