Re: Creating a non-redundant set




by blokhead (Monsignor)

on Jul 18, 2007 at 15:12 UTC


in reply to Creating a non-redundant set

If it's true that *every* line also has its partner occurring elsewhere in the file, then all you have to do is scan through the file and throw out every line of the form "ENSPxxx ENSPyyy" where xxx < yyy. This leaves only the lines where xxx > yyy, which is only one line for each pair {xxx,yyy} that occurs in the file.
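Here is a minimal sketch of that filter, written in Python purely for illustration (the same few lines port directly to Perl). It assumes each line starts with two whitespace-separated IDs of the form ENSP followed by digits, and that the digits should be compared as numbers. The function name `keep_larger_first` and the sample IDs are made up for the example.

```python
import re

# Assumed line format: "ENSP<digits> ENSP<digits>", optionally followed by other columns.
PAIR = re.compile(r"^ENSP(\d+)\s+ENSP(\d+)")

def keep_larger_first(lines):
    """Keep only the lines whose first numeric ID is greater than the second."""
    kept = []
    for line in lines:
        m = PAIR.match(line)
        # Lines that don't match the assumed format are dropped as well.
        if m and int(m.group(1)) > int(m.group(2)):
            kept.append(line)
    return kept

# Made-up sample data: every pair appears in both orientations.
sample = [
    "ENSP00000000233 ENSP00000253401",
    "ENSP00000253401 ENSP00000000233",
    "ENSP00000000412 ENSP00000000233",
    "ENSP00000000233 ENSP00000000412",
]
for line in keep_larger_first(sample):
    print(line)
```

This needs no memory beyond the current line, which is the attraction of the approach, but it silently loses any pair whose only line has xxx < yyy.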

If some lines' partners do not appear in the file, then I doubt there's a simpler way than naively going through the file and keeping track of which pairs you've already seen. The above solution also may not be appropriate if you have some requirements about the order in which the lines appear (e.g., only the first occurrence should be preserved, not necessarily the one with xxx > yyy).
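For the case where partners may be missing, the naive bookkeeping could look roughly like this. It is a sketch in Python for illustration only, assuming the two IDs are the first two whitespace-separated fields on each line. The name `first_of_each_pair` is hypothetical. It keeps the first line seen for each unordered pair, whichever way round the IDs are.

```python
def first_of_each_pair(lines):
    """Keep the first line seen for each unordered pair of IDs."""
    seen = set()
    kept = []
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        # A frozenset ignores order, so "A B" and "B A" give the same key.
        key = frozenset(fields[:2])
        if key not in seen:
            seen.add(key)
            kept.append(line)
    return kept

# Made-up sample data: the last line has no partner in the input.
sample = [
    "ENSP1 ENSP2",
    "ENSP2 ENSP1",
    "ENSP5 ENSP4",
]
for line in first_of_each_pair(sample):
    print(line)
```

The cost is that `seen` grows with the number of distinct pairs, so the whole set has to fit in memory.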

blokhead