Using EViews for a very large panel data set

CornelisV
Posts: 4
Joined: Tue Apr 02, 2013 7:33 am

Postby CornelisV » Tue Apr 02, 2013 8:03 am

Hi all,

I'm currently working on my Master's thesis, which involves a very large panel data set of European options (geographically European, not the option type) between 1997 and 2011. Essentially this means that I have thousands of options (the cross-sectional dimension), all with different starting and expiration dates (the time series dimension), which results in a dataset of more than 20 million data points. As I imagine it, each option must be forced into an (n x 1) vector (where n is the number of trading days between 1997 and 2011), which only has values during the option's lifetime. The value (or return) of the options is the dependent variable, and I also have several explanatory variables, which makes the dataset even more extensive.
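To make the size concern concrete, here is a minimal sketch comparing the "one full-length vector per option" layout described above with a stacked layout that stores one row per (option, date) actually observed. The option ids, dates, and the weekday-only calendar are made-up assumptions for illustration, not values from the real dataset.

```python
from datetime import date, timedelta

def trading_days(start, end):
    """Weekdays from start to end inclusive (holidays ignored for simplicity)."""
    days = []
    d = start
    while d <= end:
        if d.weekday() < 5:  # Monday=0 ... Friday=4
            days.append(d)
        d += timedelta(days=1)
    return days

# Hypothetical toy contracts: option id -> (first quote date, expiration date)
options = {
    "OPT_A": (date(1997, 1, 6), date(1997, 1, 17)),
    "OPT_B": (date(1997, 1, 13), date(1997, 2, 21)),
    "OPT_C": (date(1997, 2, 3), date(1997, 2, 14)),
}

calendar = trading_days(date(1997, 1, 6), date(1997, 2, 21))

# Wide layout: one (n x 1) vector per option over the whole calendar, mostly empty
wide_cells = len(options) * len(calendar)

# Stacked layout: one row per (option id, date) inside the option's lifetime
stacked_rows = [
    (opt_id, d)
    for opt_id, (first, last) in options.items()
    for d in calendar
    if first <= d <= last
]

print("wide cells:  ", wide_cells)
print("stacked rows:", len(stacked_rows))
```

With short-lived contracts spread over a long calendar, the stacked layout grows with the number of actual quotes rather than with options times trading days, which is why an unbalanced panel is usually kept in stacked form.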

As you can imagine, structuring the data in such a way that EViews recognizes it as panel data requires a lot of work, which makes me doubt whether EViews is the right program for this extensive unbalanced panel regression. Running the regressions is, I think, not even the most difficult part. What frightens me most is structuring the data correctly (i.e. removing/changing missing values, renaming series and creating several types of portfolios) and the runtime of the regressions (which I think will be hours!). I prefer working with EViews because I'm most familiar with it, but I can also switch to MATLAB, Stata or SAS (which I'll have to learn myself, but that's an issue I'll have to tackle later ;)).

My question is whether I should keep working with EViews and start structuring my dataset, or switch to one of the above-mentioned programs. Or should I perhaps combine two or more programs (e.g. structuring the data in MATLAB and running the regressions in EViews)?

Thanks in advance!

Cornelis

p.s. I'm sorry if I have posted this thread incorrectly.

EViews Gareth
Fe ddaethom, fe welon, fe amcangyfrifon
Posts: 12317
Joined: Tue Sep 16, 2008 5:38 pm

Re: Using EViews for a very large panel data set

Postby EViews Gareth » Tue Apr 02, 2013 8:13 am

You shouldn't really need to worry about the time. I just created a panel with 31m observations (8000 cross-sections, daily data from 1997 to 2011). Running a fixed effects regression with 4 regressors took 21 seconds.
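For readers unfamiliar with the term, a fixed effects regression can be computed by subtracting each cross-section's own mean from the data and then running ordinary least squares. The sketch below shows that "within" transformation for a single regressor on made-up data. It is only an illustration of the technique, not a description of how EViews computes it, and the function name `within_estimator` is hypothetical.

```python
from collections import defaultdict

def within_estimator(ids, x, y):
    """Slope of y on x after removing each cross-section's own mean."""
    # First pass: accumulate sum of x, sum of y and count per cross-section
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for i, xi, yi in zip(ids, x, y):
        s = sums[i]
        s[0] += xi
        s[1] += yi
        s[2] += 1
    # Second pass: OLS slope on the demeaned data
    sxy = sxx = 0.0
    for i, xi, yi in zip(ids, x, y):
        sx, sy, n = sums[i]
        dx = xi - sx / n
        dy = yi - sy / n
        sxy += dx * dy
        sxx += dx * dx
    return sxy / sxx

# Toy unbalanced panel: y = alpha_i + 2 * x with alpha_A = 5 and alpha_B = -3
ids = ["A", "A", "A", "B", "B"]
x = [1.0, 2.0, 3.0, 10.0, 12.0]
y = [7.0, 9.0, 11.0, 17.0, 21.0]

beta = within_estimator(ids, x, y)
print(beta)
```

The demeaning takes two linear passes over the rows, so the cost scales with the number of observations and does not require the panel to be balanced.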

EViews Glenn
EViews Developer
Posts: 2620
Joined: Wed Oct 15, 2008 9:17 am

Re: Using EViews for a very large panel data set

Postby EViews Glenn » Mon Apr 08, 2013 11:19 am

I have a question about the first part of the discussion. How are your data currently stored? In principle, it should be pretty easy for EViews to handle the unbalanced panel.

CornelisV
Posts: 4
Joined: Tue Apr 02, 2013 7:33 am

Re: Using EViews for a very large panel data set

Postby CornelisV » Tue Apr 09, 2013 10:09 am

You're right, apparently it wasn't as difficult as I thought it was. I downloaded a small part of the data from OptionMetrics (via WRDS) and by playing around a little, I was able to figure it out. The data was downloaded into a text file consisting of several columns, containing variables like 'date', 'bid/ask price', 'issuer name' and, most importantly, 'option id'. All the available option prices of a certain issuer on a specific day were put below each other, so the data more or less looked chronologically ordered. For example, all the prices of the options of AMP INC (the first company on the constituents list of the S&P 500) on the 4th of January 1996 form the first few entries of the 'bid/ask price' column. The prices for the same AMP INC options on the 5th of January 1996 follow after, and this goes on until December 31st or the expiration date of the option (which is given in a separate column). Then it switches to the second company and the process repeats itself, until it has covered all the companies of the S&P 500 in 1996. Then it starts over for 1997, 1998 and so on.
I mentioned that each option has its own option id, so it got me thinking whether I could use that to structure the data. I imported the text file as a foreign workfile into EViews as an unstructured/undated file, to see whether I could restructure the data within EViews. It turned out that changing the structure to 'panel data' and filling in the option id as the 'Cross Section ID series' and the date column as the 'Date Series' gave me the desired setup. If you don't understand precisely what I did, I can provide some screenshots.
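The restructuring described above boils down to treating (option id, date) as the panel identifiers of a stacked file. As a rough sketch of the same idea outside EViews, the snippet below reads a few made-up rows, checks that each (option id, date) pair occurs only once, and groups the rows by option id. The column names, option ids, and prices are assumptions for illustration, not the actual OptionMetrics layout, and the midquote is just one convenient value to store.

```python
import csv
import io
from collections import defaultdict

# Made-up rows in the stacked order described in the post
raw = """date,issuer,option_id,bid,ask
1996-01-04,AMP INC,1001,1.10,1.20
1996-01-04,AMP INC,1002,0.50,0.60
1996-01-05,AMP INC,1001,1.15,1.25
1996-01-05,AMP INC,1002,0.45,0.55
1996-01-08,AMP INC,1001,1.30,1.40
"""

def build_panel(text):
    """Group stacked rows into {option_id: {date: midquote}}."""
    panel = defaultdict(dict)
    for row in csv.DictReader(io.StringIO(text)):
        key, day = row["option_id"], row["date"]
        # A panel structure needs each (cross-section, date) pair to be unique
        if day in panel[key]:
            raise ValueError(f"duplicate observation for option {key} on {day}")
        panel[key][day] = (float(row["bid"]) + float(row["ask"])) / 2
    # ISO dates sort correctly as plain strings
    return {k: dict(sorted(v.items())) for k, v in panel.items()}

panel = build_panel(raw)
for option_id, series in panel.items():
    days = list(series)
    print(option_id, len(days), "obs from", days[0], "to", days[-1])
```

Each option keeps only the dates on which it was quoted, so the cross-sections can have different lengths, which is exactly the unbalanced panel case.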
I downloaded the whole dataset today by the way and it is approximately 100GB!! I never worked with a dataset this large, so I really hope the runtime remains limited.

CornelisV
Posts: 4
Joined: Tue Apr 02, 2013 7:33 am

Re: Using EViews for a very large panel data set

Postby CornelisV » Sun Apr 14, 2013 7:18 am

I got an error saying that I have reached the maximum number of observations per series. The error mentioned that I could manually raise the limit to a maximum of 15 million, but unfortunately I have series with up to 25 million data points. Is the number of observations restricted (only) by the capacity of your computer, or is it limited in EViews itself (I'm using EViews 7)? In other words, do I need to upgrade my computer?

startz
Non-normality and collinearity are NOT problems!
Posts: 3499
Joined: Wed Sep 17, 2008 2:25 pm

Re: Using EViews for a very large panel data set

Postby startz » Sun Apr 14, 2013 7:34 am

I believe that 15 million is the maximum number of observations per series in both EViews 7 and the 32-bit version of EViews 8.

The 64-bit version of EViews 8 allows 120 million observations per series.
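As a quick sanity check before importing, the per-series limits quoted in this thread can be compared against the largest series in the dataset. The figures below are simply the ones mentioned in the posts above and have not been verified against EViews documentation. The helper name `versions_that_fit` is hypothetical.

```python
# Per-series observation limits as quoted in this thread (unverified)
LIMITS = {
    "EViews 7": 15_000_000,
    "EViews 8 (32-bit)": 15_000_000,
    "EViews 8 (64-bit)": 120_000_000,
}

def versions_that_fit(n_obs):
    """Return the versions whose quoted limit can hold a series of n_obs observations."""
    return [name for name, cap in LIMITS.items() if n_obs <= cap]

# The largest series mentioned above has about 25 million data points
print(versions_that_fit(25_000_000))
```

Under these quoted limits, a 25 million observation series only fits in the 64-bit version of EViews 8.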

