Re: extraction of sequences


by jethro (Monsignor)

on Oct 13, 2009 at 11:31 UTC ( [id://800890]=note )

in reply to extraction of sequences

You escape the closing brace but not the opening brace in your regexp. You should escape both.
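To illustrate the point about escaping, here is a minimal sketch. The input string is made up (your actual data isn't shown in this reply), so treat the variable names and the layout as assumptions. Both braces are escaped so they match literally:

```perl
use strict;
use warnings;

# Hypothetical input; the real data format is an assumption.
my $text = 'coding sequence {ATGGCC}';

# \{ and \} match literal braces; [^}]* matches everything up to the next '}'.
my ($seq) = $text =~ /\{([^}]*)\}/;

print "$seq\n";   # prints ATGGCC
```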

Also, there is a '\n' between the coding sequence and the protein sequence that you seem to have overlooked. But since you aren't matching line by line anyway, it might make sense to do a 'chomp @array' before you join the lines (instead of adapting the regex).
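A rough sketch of the chomp-then-join idea, assuming the lines were read into @array with their newlines still attached. The sample lines are invented for illustration:

```perl
use strict;
use warnings;

# Hypothetical lines as they might come from a file, each ending in "\n".
my @array = ("coding sequence ATG\n", "GCC\n", "protein sequence MA\n");

chomp @array;                    # removes the trailing newline from every element
my $joined = join '', @array;    # one string with no embedded newlines

print "$joined\n";
```

After this, the regex no longer needs to account for '\n' anywhere in the string.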

You might also use non-greedy .*? instead of .* to make your regex a bit faster.
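Apart from speed, the two forms can also match different text when the delimiter occurs more than once. A small sketch with an invented record (the brace-delimited layout is an assumption):

```perl
use strict;
use warnings;

# Hypothetical single-line record with two brace-delimited fields.
my $record = '{ATGGCC} {MA}';

my ($greedy) = $record =~ /\{(.*)\}/;    # .* runs on to the last closing brace
my ($lazy)   = $record =~ /\{(.*?)\}/;   # .*? stops at the first closing brace

print "greedy: $greedy\n";   # greedy: ATGGCC} {MA
print "lazy:   $lazy\n";     # lazy:   ATGGCC
```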

Note that the whole string in parentheses is captured in $1, $2, and so on. If you only want the coding and protein sequences without the surrounding text, you have to shrink the parentheses so that they sit just around the .* part.
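Here is a sketch of the difference the placement of the parentheses makes. The record and its labels are hypothetical, chosen only to resemble the kind of data discussed in this thread:

```perl
use strict;
use warnings;

# Hypothetical joined record; labels and layout are assumed for illustration.
my $record = 'coding sequence {ATGGCC} protein sequence {MA}';

# Parentheses around the whole phrase: label and braces are captured too.
my ($wide) = $record =~ /(coding sequence \{.*?\})/;

# Parentheses just around .*? : only the sequences end up in $1 and $2.
my ($coding, $protein) =
    $record =~ /coding sequence \{(.*?)\}.*?protein sequence \{(.*?)\}/;

print "wide:    $wide\n";      # wide:    coding sequence {ATGGCC}
print "coding:  $coding\n";    # coding:  ATGGCC
print "protein: $protein\n";   # protein: MA
```

Assigning the match in list context, as above, is just one way to get at the captures; reading $1 and $2 after a successful match works as well.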

General advice: Please add at least 'use warnings;' to your code; 'use strict;' is recommended too.