delete duplicate observations within a group

lpm
Posts: 4
Joined: Wed Mar 02, 2016 2:20 pm

delete duplicate observations within a group

Postby lpm » Wed Jan 17, 2018 2:06 pm

Here is my data:

name value
A 1
A 1
B 2
B 2

Clearly I have duplicates. How do I delete the duplicates within each name, so that I end up with:

name value
A 1
B 2

EViews Matt
EViews Developer
Posts: 557
Joined: Thu Apr 25, 2013 7:48 pm

Re: delete duplicate observations within a group

Postby EViews Matt » Thu Jan 18, 2018 10:30 am

Hello,

Is it sufficient to construct a sample that excludes the duplicates? Under the assumption that all duplicates in "name" are contiguous, it's simple to construct a dummy series that retains the first instance within a group of duplicates.

Code:

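' flag the first observation of each run of identical names;
' @nan replaces the NA produced by the missing lag on the first row with 1
series dummy = @nan(name <> name(-1), 1)
' restrict the sample to the flagged observations
smpl if dummy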
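
To make the logic concrete outside EViews, here is a minimal Python sketch of the same idea: a row is flagged 1 when its name differs from the row immediately before it, and the first row, which has no predecessor, is flagged 1 by default. The function name `first_in_run_flags` is hypothetical, and the sketch only illustrates the comparison; it does not claim to reproduce how EViews evaluates the expression internally.

```python
def first_in_run_flags(names):
    """Return 1 for the first row of each contiguous run of equal names, else 0."""
    flags = []
    previous = None  # stands in for the missing lag on the first row
    for i, name in enumerate(names):
        if i == 0 or name != previous:
            flags.append(1)
        else:
            flags.append(0)
        previous = name
    return flags


# Contiguous duplicates: only the first row of each run is flagged.
contiguous = ["A", "A", "B", "B"]
print(first_in_run_flags(contiguous))

# Non-contiguous duplicates are NOT caught: the second "A" starts a new run.
scattered = ["A", "B", "A"]
print(first_in_run_flags(scattered))
```

The second call shows why the contiguity assumption matters: if equal names may be separated by other names, sorting by name first would restore the adjacency this comparison relies on.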

lpm
Posts: 4
Joined: Wed Mar 02, 2016 2:20 pm

Re: delete duplicate observations within a group

Postby lpm » Thu Jan 18, 2018 12:03 pm

I think this is the right approach, but it didn't work. I have thousands of observations, and some of the groups under name have multiple duplicates (up to 15). Using my data and your commands, would I be able to create a dummy such as:

Name value dummy
A 1 1
A 1 0
B 2 1
B 2 0
C 3 1
C 3 0
C 3 0

So the first observation in each group gets a value of 1, and all other observations in the group get a value of 0. I could then use pagecontract to drop all the observations where dummy is 0. Is that possible?

EViews Matt
EViews Developer
Posts: 557
Joined: Thu Apr 25, 2013 7:48 pm

Re: delete duplicate observations within a group

Postby EViews Matt » Thu Jan 18, 2018 2:27 pm

Sure, you can use my expression directly with pagecontract:

Code:

pagecontract if @nan(name <> name(-1), 1)

Obviously, "name" above should be your series holding the name information. What didn't work when you tried my commands?
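
As a rough illustration of what contracting on that condition does, the Python sketch below keeps only the first row of each contiguous run of names, using the made-up example rows from the question. The helper name `keep_first_of_each_run` is hypothetical, and this is a sketch of the logic rather than of EViews itself.

```python
# Example rows from the question: (name, value).
rows = [
    ("A", 1),
    ("A", 1),
    ("B", 2),
    ("B", 2),
    ("C", 3),
    ("C", 3),
    ("C", 3),
]


def keep_first_of_each_run(rows):
    """Keep a row only if it is the first row or its name differs from the previous row's name."""
    kept = []
    for i, row in enumerate(rows):
        name = row[0]
        if i == 0 or name != rows[i - 1][0]:
            kept.append(row)
    return kept


deduplicated = keep_first_of_each_run(rows)
print(deduplicated)
```

Runs of any length collapse to a single row, so a group with 15 duplicates is handled the same way as a group with two.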

lpm
Posts: 4
Joined: Wed Mar 02, 2016 2:20 pm

Re: delete duplicate observations within a group

Postby lpm » Fri Jan 19, 2018 6:43 am

It looks like I got it to work. I believe this is what was causing the problem: my dataset is large, and I want to look at a subset of it. To do so I used the smpl if command. From that point I wanted to delete duplicates, and I did so by applying the commands you gave me. That created the dummy variable, but lots of observations that should have been coded as 1 were instead coded as 0.

This is what I did to get it to work: instead of using the smpl if command to limit the size of the data, I used pagecontract, and then ran the command you provided. It worked perfectly. Any idea why this makes a difference? Regardless, it seems to work.

PS. Obviously my data is a lot more complicated than the simple example I provided. But I can't delete duplicates under smpl @all; I have to create subsets and then delete duplicates. That is, I need the first observation under name in each subset to take a value of 1 and the duplicate observations within that subset to take a value of 0. Using pagecontract to keep only that subset works. Is there a way to write the program so that it runs your code within each subset?
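
One possible explanation, not confirmed anywhere in this thread: if a lag such as name(-1) looks at the previous row of the full workfile rather than the previous row of the current sample, then a row that is the first of its name inside the subset can still be coded 0 when the row just before it, outside the subset, has the same name. The Python sketch below uses made-up data and hypothetical function names to contrast the two orders of operation; it is an assumption about the cause, not a diagnosis of the actual dataset.

```python
# Made-up data: (name, in_subset). This is not the poster's dataset.
rows = [
    ("A", False),
    ("A", True),
    ("A", True),
    ("B", True),
    ("B", False),
    ("B", True),
]


def flags_against_full_data(rows):
    """Compare each subset row with the previous row of the FULL data
    (mimics filtering the sample without contracting the data)."""
    flags = []
    for i, (name, in_subset) in enumerate(rows):
        if not in_subset:
            continue
        is_first = i == 0 or name != rows[i - 1][0]
        flags.append((name, 1 if is_first else 0))
    return flags


def flags_within_subset(rows):
    """Drop rows outside the subset first, then compare each remaining row
    with the previous REMAINING row (mimics contracting, then flagging)."""
    subset = [name for name, in_subset in rows if in_subset]
    flags = []
    for i, name in enumerate(subset):
        is_first = i == 0 or name != subset[i - 1]
        flags.append((name, 1 if is_first else 0))
    return flags


print(flags_against_full_data(rows))
print(flags_within_subset(rows))
```

In the first result the earliest in-subset "A" is coded 0 because an out-of-subset "A" sits directly above it, which matches the symptom described; in the second every subset gets exactly one 1 per run of names. Under this assumption, repeating the contract-then-flag order for each subset would give the per-subset dummies asked about.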

Thanks for your help. Very useful.

EViews Matt
EViews Developer
Posts: 557
Joined: Thu Apr 25, 2013 7:48 pm

Re: delete duplicate observations within a group

Postby EViews Matt » Fri Jan 19, 2018 10:18 am

Curious. Would it be possible for you to post a portion of your actual dataset (large enough to exhibit the problem you're experiencing)? If you'd rather not do so publicly on the forum, you can attach it to a private message to me.

lpm
Posts: 4
Joined: Wed Mar 02, 2016 2:20 pm

Re: delete duplicate observations within a group

Postby lpm » Fri Jan 19, 2018 12:07 pm

I can't share the data publicly or privately, but I can create a made-up dataset with the same characteristics that is easy to follow. I'm swamped at the moment; give me until about Wednesday to post the data on the public forum.

EViews Gareth
Fe ddaethom, fe welon, fe amcangyfrifon
Posts: 13294
Joined: Tue Sep 16, 2008 5:38 pm

Re: delete duplicate observations within a group

Postby EViews Gareth » Fri Jan 19, 2018 12:17 pm

You may also email the data to support@eviews.com and reference this forum thread if that is easier than posting on the forum.

