Random Split Data

diggetybo
Posts: 152
Joined: Mon Jun 23, 2014 12:04 am

Random Split Data

Postby diggetybo » Mon Mar 21, 2016 8:34 am

Hey everyone,

I'm aware that certain sample selection methods can be used to do in-sample predictions and measure test error, but I would like to know if EViews has a built-in way to do a random split on the data in the workfile. A browse through the command reference suggests it is not supported. To be specific, I'm imagining a command that takes a fraction between 0 and 1. In my imaginary function:

sample sample_data =@randomsplit(.8)

80% of the workfile size would be selected at random and saved as a sample called "sample_data" and I guess the remaining 20% should be stored somewhere too.

I don't think there is a way to do this with the if clause in the sample GUI, but that might just be user error on my part.

In any event, please share some of the typical ways this is achieved in EViews.

EViews Gareth
Fe ddaethom, fe welon, fe amcangyfrifon
Posts: 13602
Joined: Tue Sep 16, 2008 5:38 pm

Re: Random Split Data

Postby EViews Gareth » Mon Mar 21, 2016 8:39 am

smpl if rnd<0.8

startz
Non-normality and collinearity are NOT problems!
Posts: 3797
Joined: Wed Sep 17, 2008 2:25 pm

Re: Random Split Data

Postby startz » Mon Mar 21, 2016 8:40 am

smpl if rnd<.8

or

series keep = rnd<.8
smpl if keep
wfsave
smpl if not keep
wfsave

diggetybo
Posts: 152
Joined: Mon Jun 23, 2014 12:04 am

Re: Random Split Data

Postby diggetybo » Mon Mar 21, 2016 8:42 am

Ok great, thanks again.

If I wanted to run some tests in the remaining 20% of the data in isolation would that be possible in the way you suggested?

Would subtracting the 80% sample from @all give me the 20%? Or do samples not work like that?
Last edited by diggetybo on Mon Mar 21, 2016 8:45 am, edited 1 time in total.

startz
Non-normality and collinearity are NOT problems!
Posts: 3797
Joined: Wed Sep 17, 2008 2:25 pm

Re: Random Split Data

Postby startz » Mon Mar 21, 2016 8:44 am

No problem. Just use
smpl if not keep
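
For anyone reading later, here is a minimal sketch of using the held-out 20% for a test-error check. It assumes the 0/1 series keep from the earlier post already exists; the series names y and x, the equation name eq_train, and the objects y_hat and rmse_test are hypothetical and only there for illustration. It writes the conditions as keep = 1 and keep = 0, which select the same observations as the if keep form used above and its complement.

```eviews
' Assumes keep (0/1) was already created with: series keep = rnd < .8
' y and x are hypothetical series names used only for illustration.

' Estimate on the randomly selected ~80%
smpl if keep = 1
equation eq_train.ls y c x

' Switch to the remaining ~20% and compute fitted values there
smpl if keep = 0
eq_train.fit y_hat

' Root mean squared error over the held-out observations
scalar rmse_test = @sqrt(@mean((y - y_hat)^2))

' Restore the full workfile sample
smpl @all
```

Note that rnd < .8 selects each observation with probability 0.8, so the split is roughly 80/20 rather than exactly 80/20.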

diggetybo
Posts: 152
Joined: Mon Jun 23, 2014 12:04 am

Re: Random Split Data

Postby diggetybo » Mon Mar 21, 2016 8:47 am

Oh I understand now, thanks!

diggetybo
Posts: 152
Joined: Mon Jun 23, 2014 12:04 am

Re: Random Split Data

Postby diggetybo » Mon Mar 21, 2016 5:42 pm

Hey, I'm back after having some time to take the sample selections you guys suggested for a test drive. I have some final questions when you have the chance.

1. The rnd < .8 if clause sample seemed very elegant in terms of the coding, so I liked it. However, even after I created a sample object in my workfile, this method would use a different random .8 sample every time I clicked Estimate. I tried creating a separate sample named "fixed_data" from my existing .8 random sample range called "random_data":

sample fixed_data = random_data

However, it said: Error, illegal date "=". So it seems assigning one sample to another is not allowed? If that's the case, is there some other way I can fix the sample after it has been randomly drawn the first time?

2. The series "keep" approach does work and doesn't recalculate each time, which is great if I need to go back and add/remove things from the estimation. The only drawback is taking up object space if you have a large data set and need many different randomly drawn samples. Also, I'm not sure why we have to call wfsave afterwards.

So I'm using the keep way for now, but ideally I'd like to find some way to get the first method to work. Let me know if there is something more I can do.

I appreciate all the help!
Last edited by diggetybo on Mon Mar 21, 2016 5:48 pm, edited 1 time in total.

startz
Non-normality and collinearity are NOT problems!
Posts: 3797
Joined: Wed Sep 17, 2008 2:25 pm

Re: Random Split Data

Postby startz » Mon Mar 21, 2016 5:48 pm

The series keep is just a bunch of ones and zeros. It takes the same amount of space as any other series, which is to say that the space in memory is negligible. You could make a sample with

sample s if keep

but that doesn't save any space. In fact, it adds an object.

diggetybo
Posts: 152
Joined: Mon Jun 23, 2014 12:04 am

Re: Random Split Data

Postby diggetybo » Mon Mar 21, 2016 6:29 pm

Hey startz,

Yea, you're right, the memory is negligible. I think it's mostly curiosity or stubbornness that has me still thinking about the rnd way, even though the series 'keep' has already proved to work for me. I would suspect, though, that if you needed many distinct, randomly drawn samples from the data, at some point clutter would be an issue. You'd have to name them very intuitively. If the keep series names get longer as a result, they will be hard to refer to without auto-complete, and if you go with keep1, keep2, etc. you'd need a legend/key. Anyway, it might complicate things if taken to the extreme.

Another thought that occurred to me is writing a program to run the tests and then delete the keep series as necessary, if you are ocd about objects in the workfile.
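
A rough sketch of what such a program might look like, assuming a simple least squares regression on each split. The series y and x, the temporary object names (keep_tmp, eq_tmp, y_hat_tmp), the results vector rmse_test, and the choice of 5 repetitions are all hypothetical:

```eviews
' Repeat a random ~80/20 split several times, keep only the test errors,
' and remove the temporary objects at the end.
' y and x are hypothetical series; !n is the number of random splits.
!n = 5
vector(!n) rmse_test

for !i = 1 to !n
  ' Reset the sample first so the indicator is redrawn for every observation
  smpl @all
  series keep_tmp = rnd < 0.8

  ' Estimate on the selected observations
  smpl if keep_tmp = 1
  equation eq_tmp.ls y c x

  ' Evaluate on the remaining observations
  smpl if keep_tmp = 0
  eq_tmp.fit y_hat_tmp
  rmse_test(!i) = @sqrt(@mean((y - y_hat_tmp)^2))
next

' Clean up: only rmse_test stays in the workfile
smpl @all
delete keep_tmp eq_tmp y_hat_tmp
```

Because the indicator series is overwritten on each pass, only one extra series exists at any time, at the cost of not being able to revisit an individual split afterwards.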

As it stands though, a static random sample using only the GUI if clause is not feasible?

startz
Non-normality and collinearity are NOT problems!
Posts: 3797
Joined: Wed Sep 17, 2008 2:25 pm

Re: Random Split Data

Postby startz » Mon Mar 21, 2016 6:36 pm

Well, you could do what you suggested above: do a random sample and then save the observations in that sample. Then just use the saved workfile without worrying about the sample.

EViews Gareth
Fe ddaethom, fe welon, fe amcangyfrifon
Posts: 13602
Joined: Tue Sep 16, 2008 5:38 pm

Re: Random Split Data

Postby EViews Gareth » Mon Mar 21, 2016 7:34 pm

You want a random sample that is persistent (i.e., the same observations are used every time). The only way to have it persistent is to keep it around (makes sense, right?!).

You don't need to keep the sample objects around though. Just keep the series of 1/0s.

Sample objects are nothing more than little text strings - literally just the text "if keep=1".
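
To illustrate, here is a minimal sketch of a persistent split built on a stored indicator series. The seed value 12345 and the sample names train_s and test_s are arbitrary choices for the example. Setting the seed with rndseed is an optional extra, not something suggested earlier in this thread; it makes the draw reproducible if the program is rerun from the start.

```eviews
' Optional: fix the random number seed so rerunning the program
' reproduces the same draw (12345 is an arbitrary value)
rndseed 12345

' Draw the indicator once over the full workfile and store it
smpl @all
series keep = rnd < 0.8

' Sample objects that refer to the stored indicator rather than to rnd,
' so they select the same observations every time they are used
sample train_s @all if keep = 1
sample test_s @all if keep = 0

' Switch between the two subsets as needed
smpl train_s
' ... estimate here ...
smpl test_s
' ... evaluate here ...
smpl @all
```

Since the sample objects only store the condition text, they pick out the same observations for as long as keep is left unchanged. Redrawing keep changes what both samples select.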

