BFGS stopping criterion and numerical difference tuning
Posted: Mon Mar 21, 2022 6:14 am
Dear all,
I have a question about the BFGS optimiser in EViews 12 (build 2022-01-19). Over the years, many improvements to BFGS implementations have appeared in other software packages as user-tunable optimiser parameters, yet I have been unable to find them in EViews:
1. Numerical difference computation. The 'General Options --- Estimation Options' dialogue has two options, 'Speed' and 'Accuracy', and the help says: 'speed = fewer function evaluations, accuracy = more function evaluations'. Does it mean that a higher-order finite-difference formula is used, e.g. the fourth-order central difference (f(x-2h) - 8f(x-h) + 8f(x+h) - f(x+2h))/(12h) instead of the second-order (f(x+h) - f(x-h))/(2h)? Or does it imply a different number of iterations of Richardson extrapolation, if that is used? What is the exact meaning of 'fewer or more evaluations' --- how many are there under each option? I could not find this in the documentation.
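To make the question concrete, here is a small sketch (in Python, not EViews syntax) of the two formulas I have in mind; the higher-order formula costs twice as many function evaluations but has a much smaller truncation error:

```python
import math

def d1_central2(f, x, h):
    # Second-order central difference: truncation error is O(h^2), 2 evaluations
    return (f(x + h) - f(x - h)) / (2 * h)

def d1_central4(f, x, h):
    # Fourth-order central difference: truncation error is O(h^4), 4 evaluations
    return (f(x - 2*h) - 8*f(x - h) + 8*f(x + h) - f(x + 2*h)) / (12 * h)

# Example: the derivative of exp at x = 1 is e
x, h = 1.0, 1e-4
err2 = abs(d1_central2(math.exp, x, h) - math.e)
err4 = abs(d1_central4(math.exp, x, h) - math.e)
print(err2, err4)  # the fourth-order error is several orders of magnitude smaller
```

This is exactly the kind of trade-off that 'Speed' vs 'Accuracy' seems to hint at, but the help does not say which formulas are actually used.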
2. Numerical difference epsilon. Because the truncation error shrinks with the step size while the round-off error due to finite machine precision grows as the step shrinks, the user should be able to choose the step size based on the problem at hand (for central differences, a step of order MachEps^(1/3) is optimal according to some metrics). In some applications (e.g. Hessian computation for GARCH models), I have obtained wrong results due to a suboptimal default step size. Can the numerical difference step size be chosen explicitly in EViews?
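To illustrate why the step size matters so much, here is a minimal sketch: with a step near MachEps^(1/3), truncation and round-off errors are balanced, whereas a much smaller step lets round-off error dominate and ruins the derivative:

```python
import math

EPS = 2.220446049250313e-16  # double-precision machine epsilon

def central_diff(f, x, h):
    # Second-order central difference
    return (f(x + h) - f(x - h)) / (2 * h)

x = 1.0
h_opt = EPS ** (1 / 3) * max(abs(x), 1.0)  # about 6e-6: balances both error sources
h_tiny = 1e-12                             # far too small: cancellation dominates

err_opt = abs(central_diff(math.exp, x, h_opt) - math.e)
err_tiny = abs(central_diff(math.exp, x, h_tiny) - math.e)
print(err_opt, err_tiny)  # the tiny step is dramatically less accurate
```

If EViews hard-codes something like h_tiny internally, that would explain the wrong GARCH Hessians I observed.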
3. BFGS stopping criterion. In principle, the optimiser can stop based on one of the following criteria:
--- The gradient is close to zero (e.g. Euclidean norm);
--- The relative or absolute function value improvement is close to zero (unable to reduce the function);
--- Step tolerance (maximum of the percentage changes in the scaled coefficients)---parameter `c` of `ml` in EViews 12.
--- Absolute convergence tolerance (tolerance for reaching zero for non-negative functions).
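For clarity, the four criteria above could be checked roughly as follows (a Python sketch with hypothetical tolerance names of my own invention; these are not EViews options):

```python
import math

def converged(x_new, x_old, f_new, f_old, grad,
              gtol=1e-6, ftol_rel=1e-9, xtol_rel=1e-8, abstol=1e-12):
    """Return the name of the first stopping criterion met, or None."""
    # 1. Gradient close to zero (Euclidean norm)
    if math.sqrt(sum(g * g for g in grad)) < gtol:
        return "gradient norm"
    # 2. Relative function-value improvement close to zero
    if abs(f_old - f_new) <= ftol_rel * (abs(f_old) + abs(f_new)):
        return "function improvement"
    # 3. Step tolerance: maximum percentage change in the coefficients
    #    (the `c` parameter of `ml` appears to work like this, on scaled coefficients)
    if max(abs(xn - xo) / max(abs(xo), 1.0)
           for xn, xo in zip(x_new, x_old)) < xtol_rel:
        return "step size"
    # 4. Absolute convergence tolerance for non-negative objectives
    if f_new < abstol:
        return "absolute function value"
    return None  # keep iterating
```

As far as I can tell, EViews exposes only criterion 3 (and possibly 4), which is the source of my problem below.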
Do I understand correctly that the gradient-based stopping criterion and stopping based on the lack of function value improvement are unavailable in EViews 12? I am having trouble with coefficients close to zero in a state-space model, because the relative change in a coefficient is large whenever its value is near zero, and for models with many parameters, convergence takes too long: there is no substantial improvement in the function value (< 1e-8), yet the optimiser keeps cruising through the high-dimensional parameter space.
To cite a few examples: the GNU Scientific Library (GSL) has `gsl_multimin_test_gradient` to test the gradient norm; R's `optim` has `abstol` and `reltol` (plus `pgtol`, the projected gradient tolerance, for the L-BFGS-B method); and Python's SciPy has `optimize.fmin_bfgs` with a `gtol` (gradient tolerance) argument.
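For instance, this is how SciPy lets the user make BFGS stop on the gradient norm (a minimal example on the Rosenbrock function; `gtol` here is the infinity norm of the gradient at which iteration stops):

```python
import numpy as np
from scipy.optimize import minimize

# Rosenbrock function with a known minimum at (1, 1)
def rosen(x):
    return (1 - x[0]) ** 2 + 100 * (x[1] - x[0] ** 2) ** 2

# BFGS terminates when the gradient's infinity norm drops below gtol,
# regardless of how small the coefficient steps have become
res = minimize(rosen, x0=np.array([-1.2, 1.0]), method="BFGS",
               options={"gtol": 1e-6})
print(res.x)  # close to (1, 1)
```

This is precisely the kind of switch I am looking for in EViews.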
So is it possible to request the optimiser to stop based on the lack of function value reduction or gradient norm, not small argument value change?
Thank you very much in advance!
Yours sincerely,
Andreï V. Kostyrka