Panel or Pool?

EViews Gareth
Fe ddaethom, fe welon, fe amcangyfrifon
Posts: 13305
Joined: Tue Sep 16, 2008 5:38 pm

Postby EViews Gareth » Fri Feb 25, 2011 12:30 pm

Most data comes in one of three forms: time series, where each observation is identified by a time or date; cross-section, where each observation is identified by a unique ID (such as state or country); or a mixture of the two, commonly called panel or pooled data, where each observation is identified by both a time/date and a unique ID (USA 1990, for example).

Most EViews users are very familiar with using time series data inside EViews. The standard EViews workfile is often created by specifying a date range and frequency, thus creating a time series structure. Working with panel data is less familiar to many users, however, so this thread provides some hints and definitions.

There are two distinct ways to work with panel data inside EViews. The first, called the Pool Object, has been in EViews for a long time. The second, called the Panel Workfile, was introduced in EViews 5 and is still a relatively new concept to many long-time EViews users.

First we will discuss the pool workfile, then the pool object, then panel workfiles, and finally a comparison of the two. If you are already comfortable with pools and panels, and are just interested in the advantages/disadvantages of both, skip down to the last post.

Further, if you want more information than this thread can give, we thoroughly recommend the description given in Richard Startz's book, EViews Illustrated.


The Pool Workfile
The pool workfile is a way of structuring your panel data inside a standard time-series workfile. An example of such a workfile can be seen below:

Image
The workfile can be downloaded from here (although note the data is actually just random data).

The workfile itself is a standard time series workfile. Notice how the Range statement indicates that it is quarterly data running from 1990Q1 to 2010Q4 (for a total of 84 observations). Although the structure of the workfile is time-series, the workfile does contain cross-sectional information. There are three types of data in the workfile: GDP, inflation (infl) and unemployment (unemp). There are 4 cross-sections: USA, JPN, UK and FRA. Each cross-section (a country in this case) has a separate series for each data type. Thus there is a series called GDP_FRA which contains GDP data for FRA over the periods 1990Q1-2010Q4, a series called UNEMP_JPN which contains unemployment data for JPN, and so on. Note that the naming of each series follows a defined pattern: in this case, the name of the data variable followed by an underscore, then the name of the country.

One of the advantages of this type of workfile structure is that it is very easy to perform within-country analysis. For example, say you want to regress unemployment in France on GDP in France and inflation in France. To do this, you can just specify the variables in the normal manner:

Image
Image
(remember this is random data!)

In a similar fashion you could easily regress across countries, for example regressing unemployment in France against unemployment in Japan and unemployment in the USA.

Similarly it is easy to graph unemployment in the UK and GDP in the UK together, simply by selecting the two corresponding series as a group and graphing the group:

Image
Follow us on Twitter @IHSEViews

Postby EViews Gareth » Fri Feb 25, 2011 12:48 pm

The Pool Object
So far we have only used this workfile to perform cross-section or time-series specific operations. Nothing we have done has actually taken advantage of the panel nature of this data. To do that we need to create a pool object.

At heart a pool object is simply an object that contains information on the cross-sectional nature of the series in your workfile. In our example workfile the cross-sections are our countries: _FRA, _JPN, _UK, and _USA. Thus a pool object would contain information about those cross-sections.

The easiest way to create a pool is to simply type POOL in the command window. This will bring up a new pool object with a window asking you to enter the cross-section identifiers. These are simply the naming pattern used to define each cross-section:

Image

Once you have defined the pool object, you can use it to perform many panel-based operations. The simplest of these would be some overall descriptive statistics of a variable. To do this we click on the View menu of the pool and select descriptive statistics. We will then be asked to enter a list of pool series. A pool series is simply the name of the data you want to find statistics on, with a "?" in place of the cross-section identifier. In our example, the cross-section specific part of the naming convention is "_FRA", "_JPN", etc. Thus if we wanted statistics on GDP, we would enter GDP?, since the series in the workfile are called GDP_FRA, GDP_JPN, etc.

Image
Image
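To make the "?" convention concrete, here is a small Python sketch (not EViews code) of how a pool series name can be thought of as expanding into the individual series names. The function name expand_pool_series is made up for this illustration; only the identifiers and series names come from the example workfile.

```python
def expand_pool_series(pattern, identifiers):
    """Replace the "?" placeholder with each cross-section identifier in turn."""
    if "?" not in pattern:
        raise ValueError("a pool series name needs a '?' placeholder")
    return [pattern.replace("?", ident) for ident in identifiers]


# Cross-section identifiers as entered in the pool object of the example workfile.
identifiers = ["_FRA", "_JPN", "_UK", "_USA"]

print(expand_pool_series("GDP?", identifiers))
# ['GDP_FRA', 'GDP_JPN', 'GDP_UK', 'GDP_USA']
```

The same expansion applies to every pool series you enter, so UNEMP? stands for UNEMP_FRA, UNEMP_JPN, UNEMP_UK and UNEMP_USA.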

You can perform more complicated panel statistics, such as panel unit root tests or panel cointegration tests (again, both from the View menu):
Image
Image

Note that both of these are different from the standard time-series tests that are available from a series object. Because you are inside a Pool object, EViews knows that it should perform the panel-based versions of the tests (consult your econometrics textbook for details on the panel versions of unit root and cointegration tests).


You can also use the Pool object to perform panel based estimation techniques. Clicking on the Estimate button in the Pool will bring up the estimation dialog which lets you specify the estimation procedure. This dialog looks different from the standard estimation dialog in an Equation object, since it is now based upon panel estimation:
Image

Note how, as with all pool operations, the variables entered into the dialog are entered using the "?" to stand for the cross-section identifier. Also note that the dialog offers panel-estimation options such as fixed or random effects, cross-section specific coefficients and weighting (again, consult your econometrics textbook for descriptions of these options). In our particular case we chose to have cross-section fixed effects, with GDP as the dependent variable, and GDP(-1) and UNEMP as the independent variables. The estimation output looks like:
Image

Postby EViews Gareth » Fri Feb 25, 2011 3:04 pm

The Panel Workfile
Panel workfiles are a completely different way of using panel data inside EViews. They can be thought of as the stacked version of the pool workfile. Rather than having separate series for each variable for each cross-section, each variable has only a single series, but with the cross-sections stacked on top of each other in rows. Below is the panel workfile version of the dataset used above:
Image
The workfile can be downloaded from here.

For information on the best ways to set up panel workfiles, see this thread.

The Frequency Conversion guide has instructions on how to perform frequency conversion involving panels.

Note how the Range statement at the top of the workfile now has a "x 4" appended to the end of it - this indicates that EViews recognises this as a panel workfile with 4 cross-sections. Also notice how rather than having separate series for each variable for each cross-section, there is just a single series for unemployment (unemp), a single series for inflation (infl) and a single series for gdp.

Opening up a series lets you view how EViews has organised the rows of the data:
Image

You can see that rather than each row simply having a date assigned to it, it now has both a date and a country. The observations for France between 1990Q1 and 2010Q4 come first, followed by the observations for Japan, then the UK, then the US.
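As a rough picture of what "stacked" means, the following Python sketch (not EViews code) turns pool-style series such as GDP_FRA and GDP_JPN into one long list of rows ordered by country and then by date. The function name stack_pool and the numbers are invented for the illustration.

```python
def stack_pool(wide, variable, countries, periods):
    """Turn {series_name: [values...]} into stacked (country, period, value) rows."""
    rows = []
    for country in countries:  # cross-sections are stacked one after another
        values = wide[f"{variable}_{country}"]
        for period, value in zip(periods, values):
            rows.append((country, period, value))
    return rows


periods = ["1990Q1", "1990Q2", "1990Q3"]
wide = {  # made-up numbers, one series per country as in the pool workfile
    "GDP_FRA": [1.0, 1.1, 1.2],
    "GDP_JPN": [2.0, 2.1, 2.2],
}

stacked = stack_pool(wide, "GDP", ["FRA", "JPN"], periods)
for row in stacked:
    print(row)
```

All the FRA rows come out first, followed by all the JPN rows, which mirrors the row order you see when you open a series in the panel workfile.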


The main advantage that this structure has is that it is much easier to do panel based operations on the data inside EViews. Panel operations are done in the same way you would do their non-panel equivalents. For example, in a standard time series workfile, if you wanted to perform a unit root test, you would open a series, click on View, then click on Unit Root Test. To perform a panel unit root test inside a panel workfile, you do exactly the same thing - open up the series, click on View and then click on Unit Root Test. Since EViews knows that you are in a panel workfile, it will offer the panel version of the test. Here we have opened up the unemployment series and performed a unit root test:
Image

Notice that it gives exactly the same results as the Pool equivalent above. However, it is somewhat easier to perform in the panel workfile, since you just open the series and click on a menu item rather than having to create the pool object and specify the series pattern for the test.

Similarly, a cointegration test can be performed in the same way as in a standard time-series workfile - open the two series as a group, then from the group click on View->Cointegration Test. Again since EViews knows you're in a panel workfile, it will offer the panel version of the cointegration tests:
Image


To perform panel estimation in a panel workfile, you use an equation object, exactly the same way you would in a time-series workfile. Note that although the estimation dialog front page looks identical to the standard workfile version, there is a new second page/tab that allows you to set panel options:
Image

On the second tab, as with the pool estimation dialog above, you can set panel estimation options such as fixed or random effects or GLS weights. The estimation results here are identical to those of the pool above:
Image

Although the panel estimation, unlike the pool estimation, does not display the fixed effects on the standard output, they are still available via the View menu.


Although it is not as easy as in the Pool workfile, you can still perform within-country analysis in a Panel workfile. You can use the workfile sample to define which country you want to work with. For example, to regress French unemployment against French GDP and inflation, we would set the sample to include only the observations for France. The easiest way to set the sample is via the command line:

Code:

smpl if country="FRA"


Having set the sample, we can perform the estimation using the usual equation object:
Image
Note how this matches the estimation done in the Pool workfile above.

However, unlike the Pool workfile, in a panel workfile there is no easy way to regress a variable in one country against another variable in another country.

You can also use the sample statement to compare two countries against each other in a graph:

Code:

smpl if country="FRA" or country="UK"

When graphing series or groups in a panel workfile, EViews offers you options on how you want to handle the cross-sections. You can choose to stack them (i.e. just form one long list of observations and graph them), have different cross-sections as different lines on the same graph, or have different graphs for each cross-section. You can also compute the mean across cross-sections and graph it alongside standard-deviation bounds:
Image

The first three options are shown here:
Image
Image
Image

Postby EViews Gareth » Mon Feb 28, 2011 11:13 am

Panel or Pool?

There is no right or wrong answer when deciding whether to use a panel or a pool workfile structure when working with panel data in EViews. However, here are some bullet points that trade off one against the other.
  • Pool workfiles can become complicated and messy when you have many cross-sections.
    Since you need a series per cross-section per variable, as soon as you have many cross-sections, your workfile can grow very large. In a panel, you only have one series per variable no matter how many cross-sections you have.
  • Panel workfiles can be tricky to use if you want to deal with large sub-sets of cross-sections.
    In the pool workfile, you can create new pool objects for each set of cross-sections you want to deal with. Defining those pool objects is simply a case of listing the cross-section IDs you want to use. In a panel workfile, you need to use sample if statements to define which cross-sections you want to deal with, which can become quite cumbersome:

    Code:

    smpl if country="USA" or country="FRA" or country="JPN" or country="CAN" or country="UK" ' etc.

  • Pool workfiles make it easy to perform cross-section, cross-variable analysis. Panels do not.
    In a pool workfile it is easy to group different variables for different cross-sections. For example, cointegration between Unemployment in France and GDP in the UK can be performed by simply opening those two series as a group in the pool workfile. In a panel workfile it is almost impossible to do this.
  • Panel estimation has more estimation options and techniques available
    As well as specialised panel-estimation techniques for least squares, two-stage least squares and GMM (including dynamic panel data estimation), most of the standard non-panel estimation methods (probit, LIML, quantile regression, etc.) are available on the stacked data. Further, any lags/leads will never cross seams (the last observation for one cross-section will never be included in the lags of the next cross-section).
  • Pool estimation lets you calculate cross-section specific coefficients
    The pool object lets you specify cross-section specific coefficients in least squares estimation. The panel estimator does not allow this (other than cross-section specific constants, i.e. fixed effects).
  • The panel workfile lets you generate cross-section or period statistics easily using the @statsby functions
    For example if you want to create a series containing by-cross-section means (i.e. the mean of a variable per cross-section) of a variable, Y, you could use the following:

    Code:

    series cxmeans = @meansby(Y, @crossid)

    where @crossid is a keyword meaning cross-section identifiers. To do the same for periods:

    Code:

    series pxmeans = @meansby(Y, @obsid)

    where @obsid is a keyword meaning date/time identifiers.

    This can be done in a pool workfile by using the pool object's generate procedure.
  • Panel equations have built-in forecasting. Pool objects do not; model objects must be used instead.
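Two of the points in the list above, by-group statistics and lags that never cross seams, can be illustrated with a short Python sketch. This is not EViews code and does not claim to reproduce how EViews computes anything internally. The function names means_by and lag_within are invented here, the data is made up, and the rows are assumed to be sorted by cross-section and then by period, as in a stacked panel.

```python
from collections import defaultdict


def means_by(values, keys):
    """Return, for every row, the mean of `values` over all rows sharing that row's key."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for value, key in zip(values, keys):
        totals[key] += value
        counts[key] += 1
    return [totals[key] / counts[key] for key in keys]


def lag_within(values, cross_ids):
    """One-period lag that is missing (None) on the first row of each cross-section."""
    lagged = []
    for i in range(len(values)):
        if i > 0 and cross_ids[i] == cross_ids[i - 1]:
            lagged.append(values[i - 1])
        else:
            lagged.append(None)  # never borrow the previous cross-section's last value
    return lagged


cross_ids = ["FRA", "FRA", "JPN", "JPN"]
periods = ["1990Q1", "1990Q2", "1990Q1", "1990Q2"]
y = [1.0, 3.0, 10.0, 20.0]

print(means_by(y, cross_ids))    # [2.0, 2.0, 15.0, 15.0]
print(means_by(y, periods))      # [5.5, 11.5, 5.5, 11.5]
print(lag_within(y, cross_ids))  # [None, 1.0, None, 10.0]
```

Note that the by-group means come back as a full-length series with the group value repeated on every row of the group, and that the first JPN row gets a missing lag rather than the last FRA value.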

Feel free to post any other issues with pools or panels below.

NylasVisser
Posts: 3
Joined: Mon Feb 28, 2011 1:12 pm

Postby NylasVisser » Mon Feb 28, 2011 1:29 pm

Hi,

I have a question about creating groups with stacked panel data. I've got five variables and 33 countries, and these are all stacked. I need to make two groups of countries, to check if a hypothesis holds true. What might be the best way to generate these groups?

t.i.a.

Nylas Visser

Postby EViews Gareth » Mon Feb 28, 2011 2:18 pm

Depends on what you're trying to test. If you're already in a panel workfile, it might be that you can simply create a new dummy variable series which has a 1 for observations that are in group 1 and a 0 for observations that are in group 2. You can then use Series->View->Descriptive Statistics and Tests->Equality Tests by Classification, with the dummy variable series as the classifying series.

Otherwise you're probably best off moving to the Pool workfile.

Postby NylasVisser » Tue Mar 01, 2011 3:50 am

Hello Gareth,

Thanks for your input.

To clarify, there are 5 different variables: GDP per capita growth, population growth, inflation, and two trade variables. I have done some preliminary research to classify 33 countries by stage of economic development. From this, I got 2 groups: one where the hypothesis states that the relationship between GDP per capita growth (the dependent variable) and population growth (the most important independent variable) is negative, and one where that relationship is positive. Since the data is in the form of stacked panel data, I need to exclude one group or the other to perform the analysis. A dummy might indeed work here, but if you have additional ways to make those groups, I'd love to hear them.

Nylas

Postby EViews Gareth » Tue Mar 01, 2011 9:14 am

In the panel you're going to have to define the dummy the hard way:

Code:

series dummy=0
smpl if country="UK" or country="USA" or country="FRA" '......
series dummy=1
smpl @all
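For readers who find it easier to see the logic outside EViews, the commands above amount to the following Python sketch: start every row at 0, then set it to 1 where the country is in the chosen list. The function name group_dummy and the country codes in the sample data (including DEU) are hypothetical.

```python
def group_dummy(countries, group_one):
    """Return 1 for rows whose country is in group_one and 0 for all other rows."""
    members = set(group_one)
    return [1 if country in members else 0 for country in countries]


# One entry per stacked row; the country label repeats for each period.
country = ["UK", "UK", "DEU", "DEU", "USA", "USA"]
dummy = group_dummy(country, ["UK", "USA", "FRA"])

print(dummy)  # [1, 1, 0, 0, 1, 1]
```

Once the dummy exists, either group can be selected by filtering on dummy equal to 1 or 0 instead of repeating the long list of countries.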

Muzammil
Posts: 3
Joined: Tue Mar 01, 2011 8:47 am

Postby Muzammil » Tue Mar 01, 2011 9:30 am

Hi

I am a new user of EViews 7 and am currently working on a time series using the dated panel option. The model I am working with creates a time series of recorded trade counts, with time on the x-axis. My model consists of approx. 29,000 cross-sections and approx. 800,000 rows of data, which means it would be tricky to use pool analysis, which is why I opted for panel data analysis.

I have managed to build the initial model but would like to have some help on the following:

I am trying to sum the trade count for each date across all cross-sections and end up with a time series (COB_DATE, TRADE COUNT). For that I have used the following code:

SMPL @FIRST @FIRST
SERIES COUNTSUM = COUNT
SMPL @FIRST @LAST
COUNTSUM = @SUMSBY(COUNT, COB_DATE)

However, the problem I face is that the resulting COUNTSUM series, while it does hold the sum of trade counts across all cross-sections for each cob_date, repeats this information approx. 29,000 times, i.e. once for each cross-section, with the data stacked on top of each other (91 cob_dates with the corresponding summed trade count, repeated for each cross-section). This prevents me from performing an OLS regression of COUNTSUM on cob_date, since the underlying data in the series is incorrect (repeated 29,000 times).

Could you please advise how in this case can I end up with a time series of each cob_date and the corresponding summed trade count without having that repeated for each cross section?

Please let me know should you need any details from my model. Thank you very much; I look forward to hearing from you soon.

Muzammil

Postby EViews Gareth » Tue Mar 01, 2011 9:35 am

I don't follow what you're trying to do. If the data is the same for each cross-section (and since it is the sum across all cross-sections, then it should be the same for each), then the only way to treat it is by stacking and repeating. What else did you have in mind?

Postby Muzammil » Tue Mar 01, 2011 9:58 am

Hi,

Sorry, I don't think I've explained it very well. The model is based on a set of trades. I grouped the trades (just as in a pivot table) and recorded the number of trades that fit each group.
Each day of data consists of approx. 10,000 different groups, and the data is collected every day. I have created a dated panel model for my set of data.

The objective in Eviews is to sum across all cross sections for each day of data that I have and return me a time series with a date and the corresponding summed trade count.

Since each cross-section has a different identifier, the software currently sums across all cross-sections for each day of data (which is correct and what I was after), but the same result appears 29,000 times (once per cross-section identifier).

Example:
100_Amsterdam_RMS_CZK_EUR_FX Forward_P_P_0 - 10/14/10 trade count: 1,831,730

100_Amsterdam_RMS_EUR_JPY_FX Forward_P_P_0 - 10/14/10 trade count: 1,831,730

As you can see, these are two different cross-section identifiers for the same date with the same value (the value is correct). What I would like is a single row per day giving the cob_date, i.e. 10/14/10, and the trade count, 1,831,730.

Thanks again for helping and hope to hear from you soon.

Muzammil

Postby EViews Gareth » Tue Mar 01, 2011 10:03 am

Ah, it sounds like you want to convert your panel workfile into a simple time-series workfile by summing across cross-sections. This can be done by creating a new page that has the correct time series structure, then copying from the panel page to the time-series page and using "sum" as the contraction method.
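The idea behind a "sum" contraction can be sketched in Python as follows. This is only an illustration of the arithmetic, not EViews code and not a description of the copy dialog itself. The function name sum_by_date, the short cross-section labels and the counts are all made up, and dates are written as ISO strings so that sorting them gives chronological order.

```python
def sum_by_date(rows):
    """Collapse (cross_id, date, count) rows to one (date, total) pair per date."""
    totals = {}
    for _cross_id, date, count in rows:
        totals[date] = totals.get(date, 0) + count
    return sorted(totals.items())


# Stacked panel rows: every cross-section contributes one row per date.
panel_rows = [
    ("A", "2010-10-14", 5),
    ("A", "2010-10-15", 1),
    ("B", "2010-10-14", 7),
    ("B", "2010-10-15", 2),
]

print(sum_by_date(panel_rows))
# [('2010-10-14', 12), ('2010-10-15', 3)]
```

The result has one row per date rather than one row per date per cross-section, which is why it belongs on a separate time-series page and cannot live inside the panel page.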

Postby Muzammil » Tue Mar 01, 2011 10:15 am

Thanks for that.

I am wondering whether there is another way I can do this within the panel data structure. The reason is that it would allow me, at a later stage, to sample on specific information and also perform other queries.

The code that i tried to achieve my current results is:

SMPL @FIRST @FIRST
SERIES COUNTSUM = COUNT
SMPL @FIRST @LAST
COUNTSUM = @SUMSBY(COUNT, COB_DATE)

Maybe there is something I can add to this that would suppress the identifiers and give back only the cob_date and summed trade count? If so, could you please advise on that?

Otherwise, could you please give me further details on how I can go about doing what you suggested? Apologies, I am fairly new to this software and my knowledge is quite limited.

Thanks
Muzammil

Postby EViews Gareth » Tue Mar 01, 2011 10:25 am

If you want to keep it in a panel structure, then you'll have to use the repeating structure. Every series inside a workfile has to be the same length as the workfile. You can't have series of differing lengths. The reason that EViews repeats the values is that for most econometric/mathematical operations, that is exactly how the calculations would be carried out: you would repeat the data for each cross-section.

The Frequency Conversion guide has instructions on how to convert from a panel to a non-panel.

Postby NylasVisser » Tue Mar 01, 2011 2:13 pm

Thank you very much!