Re: list of unique strings, also eliminating matching substrings

First method: an old-fashioned loop within a loop. A string is dropped if any other string still in the list contains it (or is equal to it).

use strict;
use warnings;

my @list = qw(AGCT AGGT GG AGCT);

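MAIN: for my $i (0..$#list) {
    # \Q...\E makes the string match literally, even if it contains regex metacharacters.
    my $substr_re = qr/\Q$list[$i]\E/;
    for my $j (0..$#list) {
        # Skip the string itself and anything already eliminated.
        next if $i == $j || ! defined $list[$j];
        if ($list[$j] =~ $substr_re) {
            # Contained in (or equal to) another string: drop it.
            undef $list[$i];
            next MAIN;
        }
    }
}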

my @unique = grep {defined} @list;

print "$_\n" for @unique;

Update: To increase efficiency, group the strings by length before processing, so that each string is compared only against strings of the same length or longer:

use strict;
use warnings;

my @list = qw(AGCT AGGT GG AGCT);

my %bucket;
for (@list) {
    push @{$bucket{length($_)}}, $_;
}

# Only want to sort these once.
my @sizes = sort {$a <=> $b} keys %bucket;

# Test with defined() so a length of 0 (an empty string) does not end the loop early.
while (defined(my $size = shift @sizes)) {
    MAIN: for my $i (0..$#{$bucket{$size}}) {
        # Same size first: drop this string if a later duplicate exists.
        for my $j ($i+1..$#{$bucket{$size}}) {
            if ($bucket{$size}[$i] eq $bucket{$size}[$j]) {
                undef $bucket{$size}[$i];
                next MAIN;
            }
        }
        
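        # Bigger strings: drop this string if any longer string contains it.
        # \Q...\E makes the string match literally.
        my $substr_re = qr/\Q$bucket{$size}[$i]\E/;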
        for my $bigger (@sizes) {
            for my $str (@{$bucket{$bigger}}) {
                if ($str =~ $substr_re) {
                    undef $bucket{$size}[$i];
                    next MAIN;
                }
            }
        }
    }
}

# Walk the buckets in length order so the output order is repeatable.
my @unique = grep {defined} map {@{$bucket{$_}}} sort {$a <=> $b} keys %bucket;

print "$_\n" for @unique;
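
For comparison, here is a more compact sketch of the same idea. It is not taken from the code above: the sub name unique_superstrings is made up for illustration, and it swaps the regex match for index(), which searches literally. It assumes it is fine to process the longest strings first, so the result comes back ordered by length rather than in the original order.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): keep a string only if
# no string that has already been kept contains it.
sub unique_superstrings {
    my @kept;
    # Longest first, so every possible container is seen before its substrings.
    for my $candidate (sort { length($b) <=> length($a) } @_) {
        # index() returns -1 when $candidate does not occur in the kept string.
        # An exact duplicate is found at position 0, so it is dropped too.
        next if grep { index($_, $candidate) >= 0 } @kept;
        push @kept, $candidate;
    }
    return @kept;
}

print "$_\n" for unique_superstrings(qw(AGCT AGGT GG AGCT));
```

With the sample list this leaves AGCT and AGGT: the second AGCT is a duplicate and GG occurs inside AGGT. Every candidate is still checked against every kept string, so the worst case is the same as the nested loops; the sort just means a string is never tested against anything shorter than itself.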

Re^2: list of unique strings, also eliminating matching substrings by lindsay_grey (Novice) on May 21, 2011 at 03:36 UTC
I was thinking that might be necessary but hoping not. Thank you for posting your code.
Re^2: list of unique strings, also eliminating matching substrings by lindsay_grey (Novice) on May 21, 2011 at 04:07 UTC
Thank you! I will definitely try this too.

