Page 1 of 1

Stack variables in large panel

Posted: Mon Mar 23, 2015 6:29 am
by bananen
I have a panel with 1016 cross sections and total ~5 million observations. In the data set, I have 4 different series, that contains data that of the same variabel. I want to merge these 4 together, as one series in a panel.

95% of the observation are NA. There might be some overlapping, where the data should be identical. Therefore it would be ok to overwrite data, as long as the new data is not NA.

The four variables are named stack?.

My issue is that Eviews won't stack the series, either through order Stacked or Interleaved due to the workfile beeing to large. It seems that this option does not perform a simple merge of the variables, but something else.

Max observation should be 15million in one series, and the new series should only have ~5million observations when the merge is completed.

Re: Stack variables in large panel

Posted: Mon Mar 23, 2015 8:14 am
by EViews Gareth
I'm not quite sure I understand what you're trying to do.

You currently have a workfile with a panel structure with 1016 cross-sections, and 5 million total row (so, about 5,000 time periods). In that panel workfile you have 4 series. You want to stack those 4 series into one giant series. How will the stacked observations be identified? You want to create a 3d panel?

Re: Stack variables in large panel

Posted: Mon Mar 23, 2015 10:38 am
by bananen
I'm not quite sure I understand what you're trying to do.

You currently have a workfile with a panel structure with 1016 cross-sections, and 5 million total row (so, about 5,000 time periods). In that panel workfile you have 4 series. You want to stack those 4 series into one giant series. How will the stacked observations be identified? You want to create a 3d panel?
*Edit

So apparently, when you save a large panel as xls, you don't gett all the data saved, and no warning. Probably due to limits in the file format/Excel. Therefore I have not solved the problem...

I will therefore try to explain myself better. I have panel data of the same variabel, retrieved from 4 different sources. The data contains prices of different cross sectional units of the same asset. These I want to merge into one series that represent the single variable. Some of the data will be slightly overlapping, but should be exactly the same. A large majority of the observations are NA.

As the data is actually the same variable, just in different series, they use the same cross sectional identifier. I want to create a two dimensional panel.

When I started out, I had about 1000 different time series in Eviews. I then used the stack command to create the the 4 panel data series, that I now also want to stack. But due to the number of observations, it seems that Eviews cannot do it.

Do you follow my intentions?

Thanks and regards

Re: Stack variables in large panel

Posted: Mon Mar 23, 2015 10:56 am
by startz
.xls files are limited to 64k rows. That's a Microsoft limit, nothing to do with EViews.

I believe .xlsx files can have up to 1024k rows.

Re: Stack variables in large panel

Posted: Mon Mar 23, 2015 10:58 am
by EViews Gareth
Anyway you could send us the files? I think it should be pretty easy and straight forward, but I can't picture what you want in my head.

Re: Stack variables in large panel

Posted: Mon Mar 23, 2015 3:38 pm
by bananen
.xls files are limited to 64k rows. That's a Microsoft limit, nothing to do with EViews.

I believe .xlsx files can have up to 1024k rows.
Allright I had a lot more than that, but then the export did not work out well.
Anyway you could send us the files? I think it should be pretty easy and straight forward, but I can't picture what you want in my head.
I have made a sample file with the same structure with 4 separate series that I want to merge. Unfortunately I cannot attach the full file since its proprietary data, but I hope that you shall see the structure and my intentions.

The outcome I want to achieve is a panel with all the 1018 cross section, and one series that includes all data for the variable it represents. Its just a simple merger of the initial series. But as I said before, it seems not to be possible to do due to there beeing too many observations.

Thanks for the support.

Re: Stack variables in large panel

Posted: Mon Mar 23, 2015 3:46 pm
by EViews Gareth
So you don't really want to stack them at all. You want to merge them into one series that is of the same length (i.e. in the example file you want a series that has 7,208 observations).

How about putting the four series into a group, then taking the rowwise mean:

Code: Select all

group g stack1 stack2 stack3 stack4 series x = @rmean(g)

Re: Stack variables in large panel

Posted: Wed Mar 25, 2015 4:40 am
by bananen
So you don't really want to stack them at all. You want to merge them into one series that is of the same length (i.e. in the example file you want a series that has 7,208 observations).

How about putting the four series into a group, then taking the rowwise mean:

Code: Select all

group g stack1 stack2 stack3 stack4 series x = @rmean(g)
Thanks Gareth, I see why now.