Using EViews for a very large panel data set
Posted: Tue Apr 02, 2013 8:03 am
Hi all,
I'm currently working on my Master's thesis, which involves a very large panel data set of European options (European in the geographical sense, not the exercise style) between 1997 and 2011. Essentially, this means that I have thousands of options (cross-sectional dimension), all with different starting and expiration dates (time-series dimension), which results in a dataset of more than 20 million data points. As I imagine it, each option must be forced into an (n x 1) vector (where n is the number of trading days between 1997 and 2011) that only has values during the option's lifetime. The value (or return) of the options is the dependent variable, and I also have several explanatory variables, which makes the dataset even larger.
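To make the layout question concrete, here is a rough Python sketch (not EViews code) comparing the full-length vector per option that I describe above with a stacked layout that has one row per option per trading day of its lifetime. The option names, dates, the placeholder value 1.0 and the weekday-only calendar (holidays ignored) are all made up for illustration; none of this is my real data.

```python
from datetime import date, timedelta


def trading_days(start, end):
    """Weekdays from start to end inclusive (holidays ignored: a simplification)."""
    days = []
    d = start
    while d <= end:
        if d.weekday() < 5:  # Monday=0 ... Friday=4
            days.append(d)
        d += timedelta(days=1)
    return days


# Hypothetical options: id -> (first trading day, expiration day)
options = {
    "OPT_A": (date(1997, 1, 6), date(1997, 1, 10)),
    "OPT_B": (date(1997, 1, 8), date(1997, 1, 17)),
}

# Toy sample period standing in for 1997-2011
calendar = trading_days(date(1997, 1, 1), date(1997, 1, 31))

# Wide layout: one full-length column per option, None outside its lifetime
wide = {
    oid: [1.0 if first <= d <= last else None for d in calendar]
    for oid, (first, last) in options.items()
}

# Stacked (long) layout: one row per option per day it actually trades
long_rows = [
    (oid, d, 1.0)
    for oid, (first, last) in options.items()
    for d in calendar
    if first <= d <= last
]

wide_cells = sum(len(col) for col in wide.values())
long_cells = len(long_rows)
print(f"wide cells: {wide_cells}, stacked rows: {long_cells}")
```

With short-lived options, the wide layout is mostly missing values, while the stacked layout only stores the days on which an option exists, identified by an option ID and a date.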
As you can imagine, structuring the data so that EViews recognizes it as panel data requires a lot of work, which makes me doubt whether EViews is the right program for such a large unbalanced panel regression. Running the regressions is, I think, not even the most difficult part. What worries me most is structuring the data correctly (e.g. removing or changing missing values, renaming series and creating several types of portfolios) and the runtime of the regressions (which I think will be hours!). I prefer working with EViews because I'm most familiar with it, but I could also switch to MATLAB, Stata or SAS (which I would have to teach myself, but that's an issue I'll tackle later).
My question is whether I should continue with EViews and start structuring my dataset, or switch to one of the above-mentioned programs. Or should I perhaps combine two or more programs (e.g. structuring the data in MATLAB and running the regressions in EViews)?
Thanks in advance!
Cornelis
p.s. I'm sorry if I have posted this thread incorrectly.