Suggestions for my model


Agorhus
Posts: 1
Joined: Mon Jun 01, 2026 4:22 pm

Suggestions for my model

Postby Agorhus » Tue Jun 02, 2026 6:39 am

Hi everyone, I am an undergraduate economics student working on this model. I am posting here not just to get answers but to learn and to test my own understanding of the methodology I applied. Any feedback, criticism, or suggestions are welcome; I want to understand where I might be wrong.

The primary objective of this model is to isolate and quantify the effect of meteorological drought, measured by the SPEI_7 index, on annual barley production. ΔCultivatedArea is included strictly as a control variable, to prevent the drought coefficient from absorbing the effect of physical land changes, not as a variable of independent interest.

Here is my setup:

Model: Production_t = β0 + β1·SPEI7_t + β2·ΔCultivatedArea_t + ε_t

t = year; n = 26 (one observation is lost to differencing)

Where:

PRODUCTION: Annual barley production (tonnes)
SPEI_7: 7-month SPEI index for August
ΔCultivatedArea: First difference of barley cultivated area
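
To make the sample-size note concrete: first-differencing drops the first year, so the undifferenced series have to be trimmed to match before estimation. A minimal sketch in Python; the numbers are made up for illustration and are not the barley data.

```python
def first_difference(series):
    """Return x_t - x_{t-1}; the result is one observation shorter."""
    return [curr - prev for prev, curr in zip(series, series[1:])]


# Hypothetical values, for illustration only (not the actual data).
area = [100.0, 104.0, 103.0, 110.0, 108.0]
production = [250.0, 270.0, 240.0, 300.0, 280.0]
spei_7 = [-0.5, 0.3, -1.2, 0.9, 0.1]

d_area = first_difference(area)

# Drop the first year of the undifferenced series so all three line up.
production_aligned = production[1:]
spei_aligned = spei_7[1:]

print(d_area)                  # [4.0, -1.0, 7.0, -2.0]
print(len(area), len(d_area))  # 5 4
```

If the raw series cover 27 years, the same step leaves the 26 observations used in the regression.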

Steps followed:

ADF unit root tests (intercept for PRODUCTION and SPEI_7; intercept+trend for CultivatedArea due to visible deterministic trend)
First-differenced CultivatedArea to achieve stationarity
Pearson correlation matrix to check multicollinearity (r = -0.081 between SPEI_7 and ΔCultivatedArea)
OLS estimation
Breusch-Godfrey test for autocorrelation (lag=1)
Breusch-Pagan-Godfrey test for heteroskedasticity
Jarque-Bera and Shapiro-Wilk tests for normality of residuals
Ramsey RESET test for functional form (F p=0.8856)
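
For reference, the lag-1 Breusch-Godfrey step can be sketched by hand: regress the OLS residuals on the original regressors plus the lagged residual, then compare n·R² with a chi-square(1) distribution. The Python below is a rough illustration on simulated data, not the estimation actually run. The helper names (`solve`, `ols`, `breusch_godfrey_lag1`), the simulated coefficients, and the choice to set the presample residual to zero (a common convention) are all assumptions.

```python
import math
import random


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols(y, X):
    """OLS of y on X (X must contain a constant column).

    Returns (coefficients, residuals, centered R^2).
    """
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    beta = solve(xtx, xty)
    resid = [yi - sum(b * x for b, x in zip(beta, r)) for r, yi in zip(X, y)]
    ybar = sum(y) / len(y)
    ssr = sum(e * e for e in resid)
    sst = sum((yi - ybar) ** 2 for yi in y)
    return beta, resid, 1.0 - ssr / sst


def breusch_godfrey_lag1(y, X):
    """LM version of the Breusch-Godfrey test with one lag."""
    _, e, _ = ols(y, X)
    lagged = [0.0] + e[:-1]  # assumption: presample residual set to zero
    X_aux = [row + [lag] for row, lag in zip(X, lagged)]
    _, _, r2 = ols(e, X_aux)
    lm = len(y) * max(r2, 0.0)
    # Survival function of chi-square with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(lm / 2.0))
    return lm, p_value


# Simulated data with hypothetical coefficients (not the barley data).
rng = random.Random(0)
n = 26
spei = [rng.gauss(0, 1) for _ in range(n)]
d_area = [rng.gauss(0, 1) for _ in range(n)]
y = [1.0 + 2.0 * s + 0.5 * a + rng.gauss(0, 1) for s, a in zip(spei, d_area)]
X = [[1.0, s, a] for s, a in zip(spei, d_area)]

lm, p_value = breusch_godfrey_lag1(y, X)
print(f"LM = {lm:.3f}, chi-square(1) p-value = {p_value:.4f}")
```

This reproduces only the chi-square form of the statistic; the F form reported alongside it is computed differently and gives a slightly different p-value in small samples.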

Results:
SPEI_7: β=874,320, p=0.0021 (significant at 1%)
ΔCultivatedArea: β=1.983, p=0.0188 (significant at 5%)
R²=0.453, Adjusted R²=0.401, F p=0.0014
All diagnostic tests passed (no autocorrelation, no heteroskedasticity, normality satisfied, correct functional form).

MY QUESTIONS:

Two of the diagnostic tests produced borderline results that I would like to highlight:

1. Breusch-Godfrey Test (Autocorrelation)

Chi-Square p = 0.0691
F p = 0.0874
Both values exceed the 0.05 threshold, so the null hypothesis of no autocorrelation cannot be rejected. However, the margin is relatively narrow. I am wondering whether this should be a concern or whether it is simply a consequence of the small sample size (n=26).


2. Shapiro-Wilk Test (Normality of Residuals)

p = 0.0532
The null hypothesis of normality cannot be rejected, but the p-value is only marginally above the 0.05 threshold. Again, I suspect this may be related to the limited number of observations.


With only n=26 observations, ADF unit root tests are known to have low power. Is there a more appropriate test for this sample, and should I run both for robustness?

While I argue that SPEI_7 is strictly exogenous, the same argument does not hold for ΔCultivatedArea, as annual planting decisions may be correlated with omitted socioeconomic variables such as input costs or government subsidies. However, since the correlation between SPEI_7 and ΔCultivatedArea is negligible (r=-0.081, p=0.73), I argue that even if the ΔCultivatedArea coefficient is biased, this does not contaminate the SPEI7 estimate. Is this reasoning valid, or should I be more concerned about the potential endogeneity of ΔCultivatedArea?
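
One way to make this argument precise is the large-sample bias of OLS when one regressor is endogenous. A sketch, under assumptions that go beyond what is stated above: x_1 = SPEI_7 and x_2 = ΔCultivatedArea are taken as deviations from their means, with variances σ_1² and σ_2² and population correlation ρ; Cov(x_1, ε) = 0 and Cov(x_2, ε) = c ≠ 0.

```latex
\operatorname{plim}
\begin{pmatrix} \hat\beta_1 - \beta_1 \\ \hat\beta_2 - \beta_2 \end{pmatrix}
=
\begin{pmatrix} \sigma_1^2 & \rho\sigma_1\sigma_2 \\ \rho\sigma_1\sigma_2 & \sigma_2^2 \end{pmatrix}^{-1}
\begin{pmatrix} 0 \\ c \end{pmatrix}
=
\frac{1}{1-\rho^2}
\begin{pmatrix} -\rho\,\dfrac{c}{\sigma_1\sigma_2} \\[2ex] \dfrac{c}{\sigma_2^2} \end{pmatrix}
```

Under these assumptions the asymptotic bias in the SPEI_7 coefficient equals -ρ(σ_2/σ_1) times the bias in the ΔCultivatedArea coefficient. It vanishes when ρ = 0 and is small, though not exactly zero, when ρ is close to zero. Two caveats apply: r = -0.081 is only a sample estimate of ρ from 26 observations, and the result also requires SPEI_7 to be uncorrelated with the omitted variables themselves.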

startz
Non-normality and collinearity are NOT problems!
Posts: 3798
Joined: Wed Sep 17, 2008 2:25 pm

Re: Suggestions for my model

Postby startz » Tue Jun 02, 2026 10:36 am

You've done a very nice job so far and your questions are well-put. Here are a few thoughts.

"ΔCultivatedArea is included strictly as a control variable to prevent the drought coefficient from absorbing the effect of physical land changes, not as a variable of independent interest."

You may want to think about whether droughts cause land to be withdrawn from cultivation. If so, you may want to rethink whether you want to control for the area cultivated. You do speak to this a bit at the end of your post, so maybe it isn't very important.

Your evidence suggests that autocorrelation is not very important. But it's easy to add AR(1) to the regression and see if your results are sensitive.
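
A minimal sketch of that sensitivity check in Python, using Cochrane-Orcutt iteration on simulated data. This is a stand-in: EViews estimates AR terms with its own routine, so its numbers would not match this procedure exactly. The helper names and all data and coefficient values below are invented for illustration.

```python
import random


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols(y, X):
    """Return the OLS coefficients of y on the columns of X."""
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)


def cochrane_orcutt(y, X, tol=1e-8, max_iter=50):
    """Iterate between estimating rho and re-fitting on quasi-differences."""
    k = len(X[0])
    beta = ols(y, X)
    rho = 0.0
    for _ in range(max_iter):
        e = [yi - sum(b * x for b, x in zip(beta, r)) for r, yi in zip(X, y)]
        rho_new = (sum(e[t] * e[t - 1] for t in range(1, len(e)))
                   / sum(v * v for v in e[:-1]))
        # Quasi-difference every column, including the constant, so the
        # fitted coefficients stay on the scale of the original model.
        y_star = [y[t] - rho_new * y[t - 1] for t in range(1, len(y))]
        X_star = [[X[t][j] - rho_new * X[t - 1][j] for j in range(k)]
                  for t in range(1, len(y))]
        beta = ols(y_star, X_star)
        converged = abs(rho_new - rho) < tol
        rho = rho_new
        if converged:
            break
    return beta, rho


# Simulated data with AR(1) errors; every value here is hypothetical.
rng = random.Random(1)
n = 1000
true_rho = 0.5
x1 = [rng.gauss(0, 1) for _ in range(n)]
x2 = [rng.gauss(0, 1) for _ in range(n)]
u = [0.0] * n
for t in range(1, n):
    u[t] = true_rho * u[t - 1] + rng.gauss(0, 1)
y = [10.0 + 3.0 * a + 2.0 * b + err for a, b, err in zip(x1, x2, u)]
X = [[1.0, a, b] for a, b in zip(x1, x2)]

beta_ols = ols(y, X)
beta_ar, rho_hat = cochrane_orcutt(y, X)
print("OLS slope on x1:  ", round(beta_ols[1], 3))
print("AR(1) slope on x1:", round(beta_ar[1], 3))
print("estimated rho:    ", round(rho_hat, 3))
```

If the drought coefficient barely moves between the two fits, autocorrelation is probably not driving the result.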

Why would you care about multicollinearity? Multicollinearity doesn't invalidate any of the regression results.

Normality is not very important. It matters a bit for the statistical tests when you only have 26 observations, but there's not a whole lot to be done about it.

One thing you might want to do, if you haven't already, is a scatterplot of production against drought conditions to see if there is any visual indication of a nonlinear relation.

