http://qs321.pair.com?node_id=1205300


in reply to Re^2: unique sequences
in thread unique sequences

Sorry, I didn't test the code well enough. I thought the capturing parentheses would capture the look-behind pattern as well, but a look-behind is zero-width, so the text it checks is never part of the match. This code does what I originally intended:

Updated (thanks Cristoforo)

while ( $line =~ / (?<= .{9} [ATCG]{10} G ) G /gsx ) {
    my $match = substr $line, $+[ 0 ] - 21, 21;
    print $KMERS '>crispr_', ++$count, "\n$match\n" unless $unique_data{ $match }++;
}
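For anyone who wants to check the behaviour, here is a minimal sketch that wraps the loop above in a sub and runs it on a made-up line. The name find_kmers and the sample string are my own assumptions for illustration; they are not from the thread. It collects the windows in an array instead of printing to $KMERS. The point it shows: the regex itself matches only the final "G", and $+[0] (the offset just past the match) is used to step back 21 characters and pull out the whole window, so overlapping windows are still found.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): return each unique
# 21-character window made of 9 arbitrary characters, 10 [ATCG] bases,
# and a trailing "GG".
sub find_kmers {
    my ($line) = @_;
    my ( %seen, @kmers );

    while ( $line =~ / (?<= .{9} [ATCG]{10} G ) G /gsx ) {
        # Only the last "G" is consumed by the match; the look-behind
        # checks the 20 characters before it without consuming them.
        # $+[0] is the offset just past the match, so the window
        # starts 21 characters earlier.
        my $match = substr $line, $+[0] - 21, 21;
        push @kmers, $match unless $seen{$match}++;
    }
    return @kmers;
}

# Made-up sample: 9 x A, 10 x C, then GGG (22 characters in total).
my @kmers = find_kmers( 'AAAAAAAAA' . 'CCCCCCCCCC' . 'GGG' );
print "$_\n" for @kmers;
```

With this sample the loop finds two overlapping windows, one ending at each of the last two G's. Because each match consumes a single character, the next search resumes right after it, which is what lets the windows overlap.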