1. Residual - Fitted
2. Quantile - Quantile
3. Scale - Location
4. Mahalanobis Distance
The first three plots are fairly commonplace. The fourth is slightly more involved and rests on additional assumptions, such as that your variables are normally distributed. The advantage of plotting Mahalanobis distances against a theoretical chi-squared distribution is that outliers can be detected even in high dimensions, whereas typical leverage plots may fail to convey an unusual combination of values across all dimensions, since they can only be drawn in two dimensions. The Mahalanobis distance works best when the center and variance/covariance are estimated robustly; that is why you will see me using the @median call for that vector. If you are using EViews 10 (or higher), you can replace my for loop with @cmedian. The theoretical distribution is typically the chi-squared, but I think a beta or an F distribution could be more appropriate depending on your sample size and estimation.
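To see why the joint view matters, here is a minimal sketch in NumPy (not EViews — just a cross-check of the idea) with a toy 2-D dataset: one planted point is unremarkable in each dimension separately but violates the correlation structure, and the median-centered Mahalanobis distance flags it immediately.

```python
import numpy as np

rng = np.random.default_rng(0)

# toy data: two strongly positively correlated regressors
n = 500
x1 = rng.normal(0, 1, n)
x2 = 0.9 * x1 + rng.normal(0, np.sqrt(1 - 0.9**2), n)
X = np.column_stack([x1, x2])

# planted point: modest in each coordinate, but anti-correlated --
# invisible to per-variable checks, obvious in the joint geometry
X = np.vstack([X, [2.0, -2.0]])

mu = np.median(X, axis=0)          # robust center (cf. @median / @cmedian)
sigma_inv = np.linalg.inv(np.cov(X, rowvar=False))

diff = X - mu
d2 = np.einsum('ij,jk,ik->i', diff, sigma_inv, diff)  # squared distances
d = np.sqrt(d2)

print(np.argmax(d) == n)  # True: the planted point has the largest distance
```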
Here is the linear algebra representation:
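(In case the attachment does not display: for an observation vector $x$, center vector $\mu$ of per-regressor medians, and regressor covariance matrix $\Sigma$, the distance is

$$D(x) = \sqrt{(x - \mu)^{\top}\, \Sigma^{-1}\, (x - \mu)}.$$
)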
Here is an example output:
It takes two arguments that you configure at the beginning of the program:
- your equation's name
- your dependent variable series' name
- Make sure that the desired equation is the most recently estimated one, or else the residuals in the resid series will not match.
- Also make sure that your specification is of the following format: y c x1 x2 x3 ... (the program ignores the first two variables from the left, presumably the dependent variable and the constant, leaving only the independent variables for analysis)
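The variable-list handling amounts to dropping the first two tokens of the specification string; a quick illustration (in Python rather than EViews, with a hypothetical specification):

```python
# an equation specification as EViews would report it in @varlist
varlist = "y c x1 x2 x3"

# keep everything but the dependent variable and the constant,
# mirroring @wright(%vars, @wcount(%vars)-2)
tokens = varlist.split()
regressors = tokens[2:]

print(regressors)  # ['x1', 'x2', 'x3']
```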
Code:
%y = "your_dependent_variable"
%equation = "your_equation"
'--- configuration end ---
%vars = {%equation}.@varlist
' drop the first two entries (dependent variable and constant), keep the regressors
%right_vars = @wright(%vars, @wcount(%vars)-2)
group regressor_group {%right_vars}
series fitted = {%y} - resid
group g1 fitted resid
freeze(resid_plot_1) g1.scat nnfit(b=.3,d=1,neval=100)
resid_plot_1.draw(line, left, color(gray), pattern(7)) 0
resid_plot_1.addtext(t, font(12pt, +b)) "Residual - Fitted"
series standardized_resid = resid/@stdev(resid)
freeze(resid_plot_2) standardized_resid.qqplot theory
resid_plot_2.setelem(1) axis(l)
resid_plot_2.setelem(2) axis(b)
resid_plot_2.addtext(t, font(12pt, +b)) "Quantile - Quantile"
' scale-location uses the square root of the absolute standardized residuals
series sqr_stnd_resid = @sqrt(@abs(standardized_resid))
group g2 fitted sqr_stnd_resid
freeze(resid_plot_3) g2.scat nnfit(b=.3,d=1,neval=100)
resid_plot_3.addtext(t, font(12pt, +b)) "Scale - Location"
for %i sqr_stnd_resid g1 g2 standardized_resid
d {%i}
next
'--- calculate mahalanobis distance ---
stom(regressor_group, var_matrix)
%mat = "var_matrix"
!dim = @columns({%mat})
regressor_group.cov(out=sym_)
vector(!dim) mu
for !i = 1 to regressor_group.@count
%g_member = regressor_group.@seriesname(!i)
mu(!i) = @median({%g_member})
next
vector(!dim) x
matrix(!dim, !dim) sigma = sym_cov
matrix(!dim, !dim) sigma_inverse
sigma_inverse = @inverse(sigma)
vector(!dim) xsigma
rowvector(!dim) x_t
scalar inner_product
series m_distance
' loop over observations (rows of the regressor matrix; stom drops NAs)
for !i = 1 to @rows({%mat})
x_t = @rowextract({%mat},!i)
x = @transpose(x_t)
x = x - mu                          ' center the observation on the medians
xsigma = sigma_inverse*x
inner_product = @sqrt(@inner(x, xsigma))
m_distance(!i) = inner_product
next
for %i x sigma xsigma mu sym_cov x_t inner_product var_matrix regressor_group sigma_inverse
d {%i}
next
' the squared distances (not the distances themselves) are chi-squared distributed
series m_distance_sq = m_distance^2
freeze(resid_plot_4) m_distance_sq.qqplot theory(dist=chisq)
resid_plot_4.addtext(t, font(12pt, +b)) "Mahalanobis Distance"
graph resid_dashboard.merge resid_plot_1 resid_plot_2 resid_plot_3 resid_plot_4
show resid_dashboard
Lastly, the Mahalanobis calculation is carried out in a for loop over the full observation range. Please let me know if you are aware of a matrix-language solution for this calculation -- it would speed things up.
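For what it's worth, in a matrix language the per-observation loop collapses into a single quadratic-form computation. A NumPy sketch of the shape of that calculation (the EViews matrix language should admit something analogous, though I have not verified the exact calls there):

```python
import numpy as np

def mahalanobis_all(X, mu, sigma):
    """Row-wise Mahalanobis distances of X (n x k) from center mu (length k)."""
    diff = X - mu                     # broadcast subtraction, n x k
    sigma_inv = np.linalg.inv(sigma)
    # diagonal of diff @ sigma_inv @ diff.T, without forming the n x n product
    d2 = np.einsum('ij,jk,ik->i', diff, sigma_inv, diff)
    return np.sqrt(d2)

# sanity check: identity covariance reduces to Euclidean distance
X = np.array([[3.0, 4.0], [0.0, 0.0]])
print(mahalanobis_all(X, np.zeros(2), np.eye(2)))  # [5. 0.]
```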