c:\@Work\Perl>perl -wMstrict -MData::Dump -le
"my $line = 'AAAATTTTCCCCGGGGAAAGGxAAAACCCCTTTTGGGGAAAGGxTTTTAAAACCCCGGGGAAAGG'
 ;
 my %unique_data;
 my $count;
 while ( $line =~ / ( .{9} [ATCG]{10} G \K G ) /gsxp ) {
    print qq{>crispr_@{[ ++$count ]} '$1' ($&) (${^MATCH})}
      unless $unique_data{$1}++;
    }
 ;;
 dd \%unique_data;
 "
>crispr_1 'AAAATTTTCCCCGGGGAAAGG' (G) (G)
>crispr_2 'AAAACCCCTTTTGGGGAAAGG' (G) (G)
>crispr_3 'TTTTAAAACCCCGGGGAAAGG' (G) (G)
{
  AAAACCCCTTTTGGGGAAAGG => 1,
  AAAATTTTCCCCGGGGAAAGG => 1,
  TTTTAAAACCCCGGGGAAAGG => 1,
}