Re: pattern matching to separate data

by graff (Chancellor)
on May 03, 2009 at 23:42 UTC (#761626)

in reply to pattern matching to separate data

Previous replies have given you a working solution, but in case it helps to know how the OP code went wrong:
    while ($line=<FILE>){
        $hit1= $line=~ /^(>data_\d+\s+GENEID_\d+.*\n.*)/s;
        print OUT1 "$hit1\n";
        $hit2= $line=~ /^(>data_\d+\s+PROTID_\d+.*\n.*)/s;
        print OUT2 "$hit2\n";
    }
The problems are:
  • The while loop is reading one line at a time, and printing to both output files on every iteration.

  • The input is structured as multi-line records, and the criterion for selecting the correct output file is present only on the first line of each record. So you would need to maintain a "state" variable (or use a variable for the output file handle, and assign it on reading the first line of each multi-line record) -- but your loop is pretending that every line contains the criterion for deciding which output to use.

  • You are using capturing parens in your regex match, but assigning the result to a scalar variable in scalar context, which means the value assigned is only the success flag of the match (1 if the line just read matched, the empty string if it did not), not the captured text. Note the following difference between assigning the match return in a scalar context ($c) versus a list context (@m, or $m in parens):

    $str = "text with some pattern in it";
    $c = $str =~ / (some pattern) /;      # sets $c to the numeric value 1
    @m = $str =~ / (some pattern) /;      # assigns "some pattern" as sole element of @m
    ( $m ) = $str =~ / (some pattern) /;  # sets $m to "some pattern"
The result of those points taken together is that both your output files had the same line count as your input file, and each of those lines is either "1" (the match succeeded) or empty (it failed). (When you said your "results are giving only the headers...", I suspect that you were looking at data that was not created by the code you posted.)
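
For what it's worth, here is a minimal sketch of the "variable for the output file handle" idea from the second point. The sample data is made up (I am only guessing at the record layout from the OP's regexes), and in-memory file handles stand in for FILE, OUT1 and OUT2 so the snippet runs on its own; the names $gene, $prot and $out are mine, not the OP's.

```perl
use strict;
use warnings;

# Hypothetical sample input, modelled on the header patterns in the OP's regexes.
my $input = <<'END';
>data_1 GENEID_100 some description
ACGTACGT
TTGACCA
>data_2 PROTID_200 some description
MKVLAA
END

# In-memory handles stand in for the OP's FILE, OUT1 and OUT2.
my ( $gene_text, $prot_text ) = ( '', '' );
open my $in,   '<', \$input     or die "cannot read input: $!";
open my $gene, '>', \$gene_text or die "cannot open gene output: $!";
open my $prot, '>', \$prot_text or die "cannot open protein output: $!";

my $out;    # the "state": which handle the current record is written to
while ( my $line = <$in> ) {
    # Only header lines carry the selection criterion, so only they change $out.
    if    ( $line =~ /^>data_\d+\s+GENEID_\d+/ ) { $out = $gene }
    elsif ( $line =~ /^>data_\d+\s+PROTID_\d+/ ) { $out = $prot }
    elsif ( $line =~ /^>/ )                      { $out = undef }   # unrecognised header: skip that record
    print {$out} $line if defined $out;
}
close $_ for $in, $gene, $prot;

print "GENE:\n$gene_text";
print "PROT:\n$prot_text";
```

With that sample, the GENEID header and its two sequence lines end up in $gene_text, and the PROTID header and its one sequence line end up in $prot_text. Note that the matches are used only as true/false tests here, so no capturing parens are needed at all.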