PerlMonks  

Re: How do i use regexes in one file to match FASTA sequences in another file

by lune (Pilgrim)
on Nov 22, 2013 at 14:46 UTC (#1063956)


in reply to How do i use regexes in one file to match FASTA sequences in another file

Basically, reading regexes from a file is no different from using predefined ones.
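One practical difference is that a pattern read at runtime may be malformed. A minimal sketch of compiling such patterns with qr// and skipping the ones that fail to compile follows; the helper name compile_patterns and the sample patterns are made up for illustration, and the "ID>>pattern" layout is assumed from the test files in this post.

```perl
use strict;
use warnings;

# Hypothetical helper: compile "ID>>pattern" entries into a hash of
# ID => compiled regex (qr// object), skipping patterns that do not compile.
sub compile_patterns {
    my @entries = @_;
    my %compiled;
    for my $entry (@entries) {
        chomp $entry;
        # Limit of 2 so that the pattern itself may contain ">>".
        my ($id, $pattern) = split />>/, $entry, 2;
        next unless defined $pattern && length $pattern;
        # A pattern interpolated into qr// is compiled at runtime;
        # eval traps the exception raised by a syntax error.
        my $re = eval { qr/$pattern/ };
        if (!defined $re) {
            warn "Skipping $id: invalid pattern '$pattern'\n";
            next;
        }
        $compiled{$id} = $re;
    }
    return %compiled;
}

# Usage: the third pattern is deliberately broken.
my %patterns = compile_patterns('ID1>>^a', 'ID2>>h$', 'ID3>>[unclosed');
for my $id (sort keys %patterns) {
    print "$id matches abcdefg\n" if 'abcdefg' =~ $patterns{$id};
}
```

Compiling once with qr// also avoids recompiling the same pattern for every input line.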

The real question is how to represent the matches and the number of matches efficiently.

I created some files with simplified test input to concentrate on the problem:

regexes.txt

    ID1>>^a
    ID2>>h$
    ID3>>b
    ID4>>[a-z]{9,10}
    ID5>>[ah]

lines.txt

    id_A: abcdefg
    id_B: bcdefgh
    id_C: cdefghijk
You will probably have to adjust the split statements to match the format of your input.
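For real FASTA input, where each record is a ">" header line followed by one or more sequence lines, a single split per line is not enough. The following is only a sketch of one way to collect such records, assuming that layout; the helper name read_fasta and the sample data are hypothetical, and the sample is read from an in-memory file so that the sketch runs without any input files.

```perl
use strict;
use warnings;

# Hypothetical helper: read FASTA records from a filehandle and return
# a hash of header => sequence, with multi-line sequences joined.
sub read_fasta {
    my ($fh) = @_;
    my (%seq_for, $header);
    while (my $line = <$fh>) {
        chomp $line;
        next if $line =~ /^\s*$/;        # skip blank lines
        if ($line =~ /^>(.*)/) {
            $header = $1;                # start of a new record
            $seq_for{$header} = '';
        }
        elsif (defined $header) {
            $seq_for{$header} .= $line;  # append to the current sequence
        }
    }
    return %seq_for;
}

# Usage with made-up data held in a string.
my $fasta = ">seq1 sample\nACGT\nTTGA\n>seq2\nGGCC\n";
open(my $fh, '<', \$fasta) or die "Cannot open in-memory file: $!";
my %seq_for = read_fasta($fh);
close($fh);

for my $header (sort keys %seq_for) {
    print "$header: $seq_for{$header}\n";
}
```

The joined sequences can then be matched against the regexes in the same way as the simplified lines in this post.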

I am storing the matches in a hash that uses the regexes as keys and array references of matching lines as values.

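    #!/usr/bin/perl -w
    use strict;
    use autodie;

    # Read "ID>>regex" pairs; the limit of 2 allows ">>" inside a regex.
    open(my $regexfile, "<", "regexes.txt");
    chomp(my @regexes = <$regexfile>);
    close($regexfile);
    my %regexes = map { split(/>>/, $_, 2) } @regexes;

    # Maps each regex to an array reference of the lines it matched.
    my %matches;

    open(my $inputfile, "<", "lines.txt");
    while (my $record = <$inputfile>) {
        chomp $record;
        # Drop the leading "id_X:" label; adjust this split to your input.
        my (undef, $line) = split(/ /, $record, 2);
        next unless defined $line;
        foreach my $regex (values %regexes) {
            # push needs an array, so dereference the (autovivified) reference.
            push @{ $matches{$regex} }, $line if $line =~ /$regex/;
        }
    }
    close($inputfile);

    # Only regexes with at least one match have an entry in %matches.
    foreach my $regex (sort keys %matches) {
        my $matched = $matches{$regex};
        print "$regex: No of matches " . scalar(@$matched) . "\n";
        foreach my $match (@$matched) {
            print "matched $match\n";
        }
    }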

Update: added autodie; warnings are already active because of -w.
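The code above keys the results by regex and discards the IDs from regexes.txt. If the report should name each ID, one possible variant is a nested hash keyed by ID. This is only a sketch under that assumption; the helper name match_by_id and the result layout are hypothetical and not part of the original code.

```perl
use strict;
use warnings;

# Hypothetical helper: for every ID return { regex => ..., hits => [...] },
# so that patterns without matches are reported with an empty hit list.
sub match_by_id {
    my ($regex_for, $lines) = @_;
    my %result;
    for my $id (sort keys %$regex_for) {
        my $re = qr/$regex_for->{$id}/;   # compile once per pattern
        $result{$id} = {
            regex => $regex_for->{$id},
            hits  => [ grep { $_ =~ $re } @$lines ],
        };
    }
    return %result;
}

# Usage with three of the test patterns and the test lines from this post.
my %regex_for = (ID1 => '^a', ID2 => 'h$', ID3 => 'b');
my @lines     = qw(abcdefg bcdefgh cdefghijk);
my %result    = match_by_id(\%regex_for, \@lines);

for my $id (sort keys %result) {
    my $hits = $result{$id}{hits};
    printf "%s (%s): %d match(es)\n", $id, $result{$id}{regex}, scalar @$hits;
    print "  matched $_\n" for @$hits;
}
```

Keying by ID also keeps two IDs apart when they happen to share the same regex.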

Replies are listed 'Best First'.
Re^2: How do i use regexes in one file to match FASTA sequences in another file
by Kenosis (Priest) on Nov 22, 2013 at 19:30 UTC

    Nice logic to share with OP. Consider, however, adding use warnings; use autodie;, the latter to handle open errors.
