in reply to pattern matching to separate data
    while ($line=<FILE>){
        $hit1= $line=~ /^(>data_\d+\s+GENEID_\d+.*\n.*)/s;
        print OUT1 "$hit1\n";
        $hit2= $line=~ /^(>data_\d+\s+PROTID_\d+.*\n.*)/s;
        print OUT2 "$hit2\n";
    }

The problems are:
The while loop is reading one line at a time, and printing to both output files on every iteration.
The input is structured as multi-line records, and the criterion for selecting the correct output file is present only on the first line of each record. You would therefore need to maintain a "state" variable (or use a variable for the output file handle, and assign it on reading the first line of each multi-line record), but your loop pretends that every line contains the criterion for deciding which output to use.
You are using capturing parens in your regex match, but assigning the result to a scalar variable in scalar context, which means the value assigned is only the success flag of the match (1 if the line matched, an empty string if it did not), not the captured string. Note the following difference between assigning the match return in a scalar context ($c) versus a list context (@m, or $m in parens):
    $str = "text with some pattern in it";
    $c = $str =~ / (some pattern) /;      # sets $c to the numeric value "1"
    @m = $str =~ / (some pattern) /;      # assigns "some pattern" as sole element of @m
    ( $m ) = $str =~ / (some pattern) /;  # sets $m to "some pattern"
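Putting those points together, here is a minimal sketch of one way the loop could be restructured. It assumes that every record starts with a header line shaped like ">data_N GENEID_N" or ">data_N PROTID_N" and that all following lines belong to that record until the next ">" header. The sub name split_records, the sample data, and the in-memory file handles are made up for illustration; in your script you would pass real file handles instead.

```perl
use strict;
use warnings;

# Hypothetical helper: copy each multi-line record to the output handle
# selected by the record's header line.
sub split_records {
    my ($in, $gene_out, $prot_out) = @_;
    my $out;    # "state": the handle for the record currently being read

    while (my $line = <$in>) {
        # List-context assignment, so $kind receives the captured text
        # (not just a true/false flag); the condition is false on no match.
        if (my ($kind) = $line =~ /^>data_\d+\s+(GENEID|PROTID)_\d+/) {
            $out = $kind eq 'GENEID' ? $gene_out : $prot_out;
        }
        elsif ($line =~ /^>/) {
            $out = undef;    # unrecognized header: skip that whole record
        }
        print {$out} $line if defined $out;
    }
}

# Demo with in-memory file handles and assumed sample data.
my $input = join '',
    ">data_1 GENEID_11\n", "ACGT\n", "TTGA\n",
    ">data_2 PROTID_22\n", "MKV\n",
    ">data_3 GENEID_33\n", "GGCC\n";
my ($genes, $prots) = ('', '');

open my $in,   '<', \$input or die "Cannot open input: $!";
open my $out1, '>', \$genes or die "Cannot open gene output: $!";
open my $out2, '>', \$prots or die "Cannot open protein output: $!";

split_records($in, $out1, $out2);

close $_ for $in, $out1, $out2;

print "GENEID records:\n$genes";
print "PROTID records:\n$prots";
```

The output handle is reassigned only when a header line is seen, so the sequence lines that follow a header go to the same file as the header itself, and each line is printed to exactly one output.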