
Residual Analysis Dashboard

Posted: Sat Oct 28, 2017 2:12 am
by diggetybo
This is a program that outputs 4 residual diagnostic plots.

1. Residual - Fitted
2. Quantile - Quantile
3. Scale - Location
4. Mahalanobis Distance

The first 3 plots are fairly commonplace. The 4th plot is slightly more complex and rests on additional assumptions, such as your variables being (multivariate) normally distributed. The advantage of plotting the Mahalanobis distances against a theoretical chi-squared distribution is that outliers can be detected even in high dimensions, whereas typical leverage plots may fail to convey an unusual combination of values across all dimensions (as they can only be plotted in 2 dimensions). The Mahalanobis distance works best when the mean and variance/covariance are estimated robustly. That is why you will see me using the @median call for the location vector. If you are using EViews 10 (or higher), you can replace my for loop with @cmedian. The theoretical distribution is typically the chi-squared distribution (strictly speaking, for the squared distance), but I think that could change to a beta or F distribution depending on your sample size and how the mean and covariance are estimated.
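To make the "unusual combination" point concrete, here is a small sketch in Python (not EViews, and not part of the program in this post). The two-variable covariance matrix, the zero center and the two test points are made up for illustration, and the helper names `solve` and `mahalanobis` are hypothetical. Both points are 1.5 units from the center on each variable, but only one of them goes against the positive correlation.

```python
import math

def solve(a, b):
    """Solve a*z = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], built from copies so the inputs are untouched.
    m = [list(map(float, a[i])) + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def mahalanobis(x, center, cov):
    """sqrt((x - center)' * inverse(cov) * (x - center))."""
    diff = [xi - ci for xi, ci in zip(x, center)]
    z = solve(cov, diff)  # z = inverse(cov) * diff, without forming the inverse
    return math.sqrt(sum(d * zi for d, zi in zip(diff, z)))

# Assumed example: two strongly correlated variables (correlation 0.9).
center = [0.0, 0.0]
cov = [[1.0, 0.9], [0.9, 1.0]]

with_pattern = [1.5, 1.5]      # follows the correlation
against_pattern = [1.5, -1.5]  # same size on each axis, opposite combination

d_with = mahalanobis(with_pattern, center, cov)
d_against = mahalanobis(against_pattern, center, cov)
print(round(d_with, 3), round(d_against, 3))
```

Looking at each variable separately, the two points are indistinguishable, yet the distances come out at roughly 1.54 and 6.71, so the second point stands out clearly.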

Here is the linear algebra representation:
[Image: CodeCogsEqn.gif]
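In case the image does not load, the formula can be read back from the program code. The following is a reconstruction from that code rather than a copy of the original image; the symbols x_i, mu and Sigma are my own labels for the i-th row of the regressor matrix, the vector of regressor medians and the regressor covariance matrix.

```latex
D_M(x_i) = \sqrt{(x_i - \mu)^{\top} \, \Sigma^{-1} \, (x_i - \mu)}
```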
Here is an example output:
[Image: Residual Dashboard.png]
It takes two arguments that you configure at the beginning of the program:
-your equation's name
-your dependent variable series' name

- Make sure that the desired equation is the most recent one you have estimated, or else the residuals in the resid series will not match.
- Also make sure that your specification has the following format: y c x1 x2 x3 ... (the program ignores the first 2 variables from the left, presumably the dependent variable and the constant, leaving only the independent variables for analysis)
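The effect of the second requirement can be sketched outside EViews. The Python snippet below only mimics the "drop the first two words" step; `regressors_from_spec` is a hypothetical helper name and the specification strings are made-up examples.

```python
def regressors_from_spec(spec):
    """Return every word of the specification after the first two."""
    words = spec.split()
    return words[2:]

# Expected layout: dependent variable, constant, then the regressors.
good = regressors_from_spec("y c x1 x2 x3")

# Constant in the wrong position: x1 is silently dropped and c is kept.
bad = regressors_from_spec("y x1 c x2")

print(good)
print(bad)
```

With the constant out of place, a real regressor is lost and the constant ends up in the regressor group, which is why the ordering matters.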

Code:

%y = "your_dependent_variable"
%equation = "your_equation"
'--- configuration end ---

' regressors = every variable in the specification after the first two (y and c)
%vars = {%equation}.@varlist
%right_vars = @wright(%vars, @wcount(%vars)-2)
group regressor_group {%right_vars}

'--- plot 1: residual - fitted ---
series fitted = {%y} - resid
group g1 fitted resid
freeze(resid_plot_1) g1.scat nnfit(b=.3,d=1,neval=100)
resid_plot_1.draw(line, left, color(gray), pattern(7)) 0
resid_plot_1.addtext(t, font(12pt, +b)) "Residual - Fitted"

'--- plot 2: quantile - quantile ---
series standardized_resid = resid/@stdev(resid)
freeze(resid_plot_2) standardized_resid.qqplot theory
resid_plot_2.setelem(1) axis(l)
resid_plot_2.setelem(2) axis(b)
resid_plot_2.addtext(t, font(12pt, +b)) "Quantile - Quantile"

'--- plot 3: scale - location ---
' square root of the absolute standardized residuals
' (the square root of a negative residual would be NA)
series sqr_stnd_resid = @sqrt(@abs(standardized_resid))
group g2 fitted sqr_stnd_resid
freeze(resid_plot_3) g2.scat nnfit(b=.3,d=1,neval=100)
resid_plot_3.addtext(t, font(12pt, +b)) "Scale - Location"

' clean up temporary objects
for %i sqr_stnd_resid g1 g2 standardized_resid
    d {%i}
next

'--- calculate mahalanobis distance ---
stom(regressor_group, var_matrix)
%mat = "var_matrix"
!dim = @columns({%mat})
regressor_group.cov(out=sym_)

' location vector: median of each regressor
vector(!dim) mu
for !i = 1 to regressor_group.@count
    %g_member = regressor_group.@seriesname(!i)
    mu(!i) = @median({%g_member})
next

vector(!dim) x
matrix(!dim, !dim) sigma = sym_cov
matrix(!dim, !dim) sigma_inverse
sigma_inverse = @inverse(sigma)
vector(!dim) xsigma
vector x_t = @transpose(x)
scalar inner_product
series m_distance

' one distance per observation
for !i = 1 to @obsrange
    x_t = @rowextract({%mat},!i)
    x = @transpose(x_t)
    x = x-mu
    xsigma = sigma_inverse*x
    x_t = x_t-@transpose(mu)
    inner_product = @sqrt(@inner(x_t,xsigma))
    m_distance(!i) = inner_product
next

' clean up temporary objects
for %i x sigma xsigma mu sym_cov x_t inner_product var_matrix regressor_group sigma_inverse
    d {%i}
next

'--- plot 4: mahalanobis distance ---
freeze(resid_plot_4) m_distance.qqplot theory(dist=chisq)
resid_plot_4.addtext(t, font(12pt, +b)) "Mahalanobis Distance"

'--- combine the four plots ---
graph resid_dashboard.merge resid_plot_1 resid_plot_2 resid_plot_3 resid_plot_4
show resid_dashboard
I am not sure how backwards compatible this program is; it was developed using EViews 9.

Lastly, the Mahalanobis calculation is carried out in a for loop that runs over the whole observation range. Please let me know if you are aware of a matrix language solution for this calculation, as it would speed things up.
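One possible loop-free formulation, written as math rather than tested EViews code: let X_c be the n x p matrix whose rows are the centered observations (each row of the regressor matrix minus the median vector). The symbols X_c, d_i and 1_p are my own notation, not taken from the program.

```latex
d_i^2 = \left[ X_c \, \Sigma^{-1} X_c^{\top} \right]_{ii}
      = \sum_{j=1}^{p} \left( X_c \, \Sigma^{-1} \right)_{ij} \left( X_c \right)_{ij}
\qquad\Longrightarrow\qquad
d^2 = \left( \left( X_c \, \Sigma^{-1} \right) \circ X_c \right) \mathbf{1}_p
```

Here the circle denotes the element-wise product and 1_p is a column of p ones, so the whole vector of squared distances needs one matrix product, one element-wise product and a row sum, with an element-wise square root at the end. This avoids forming the full n x n matrix, of which only the diagonal is needed.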