http://qs321.pair.com?node_id=142906

agustina_s has asked for the wisdom of the Perl Monks concerning the following question:

Hi perlmonks, I really need help with a regex. I have a program that opens an input file and creates an output file. The input and output files look like this:
INPUT
DATE 13-JUN-2000
COMMERCIAL SUPPLIERS
SEQUENCE
/exon="49-333"
/intron="1-48;334-385"
//
DATE 13-JUN-2000
COMMERCIAL SUPPLIERS
SEQUENCE
/exon=""
/intron="1-29"
//

OUTPUT
DBACC D000001
DATE "13-JUN 2002"
Exon {Translation%49-133}
Intron {Translation%1-48}
Intron (Translation%334-385}
DBACC D000002
DATE "13-JUN 2002"
Exon {Translation -}
Intron {Translation%1-29}

I have a problem printing the exon and intron entries. Each one can hold zero or more elements separated by ;. I know that * in a regex means zero or more repetitions, but in this case I'm confused about how to print all of the captured elements ($1, $2, ...).

My code partly looks like:

#!/usr/local/bin/perl -w
# A program that accept an input file: Scorpion database from Gen Bank
# and will output the database in BioWare format
my $file1="$ARGV[0]"; #var to save the input database
my $result=">".$ARGV[1];
my $counter=1;
my $no='D000001';
open(INFO1,$file1) or die "Can't open $file1.\n"; #open file1
open(OUT,$result) or die "Can't open $result.\n";
#foreach line in the files
foreach(<INFO1>) {
    if(/^DATE\s*(.*)-(.*)-(.*)/){
        print OUT "DBACC\t $no\n";
        print OUT "Date\t $1-$2-$3\n";
        $no++;
    }
    elsif(/\s*\/exon="(\d*-\d*)*"\n/){
        print OUT "Exon\t \{Translation\%$1\}\n";
    }
    elsif(/\s*\/intron="(\d*-\d*)*"\n/){
        print OUT "Intron\t \{Translation\%$1\}\n";
    }
    else{
        print OUT "line $counter\n";
    }
    $counter++;
}
close(INFO1);
close(OUT);

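For reference, here is a minimal sketch of one possible way to get at every element. The pattern (\d*-\d*)* in the exon and intron branches has no place for the ; separator, so a line like /intron="1-48;334-385" does not match at all, and in general a repeated capture group keeps only its last repetition in $1. The sketch captures the whole quoted list once and then uses split on ;. The helper name feature_lines is hypothetical, and the "Translation -" form for an empty list is an assumption copied from the sample output.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the original script): turn one
# /exon="..." or /intron="..." line into the output lines for that feature.
sub feature_lines {
    my ($line) = @_;

    # Capture the whole quoted list once; it may be empty.
    return () unless $line =~ m{^\s*/(exon|intron)="([^"]*)"};
    my $label = ucfirst($1);
    my $list  = $2;

    # Split on the separator and keep only pieces shaped like start-end.
    my @ranges = grep { /^\d+-\d+$/ } split /;/, $list;

    # Assumed format for an empty list, taken from the sample output.
    return ("$label\t {Translation -}") unless @ranges;

    # One output line per range.
    return map { "$label\t {Translation%$_}" } @ranges;
}

print "$_\n" for feature_lines('/intron="1-48;334-385"');
print "$_\n" for feature_lines('/exon=""');
```

In the main loop, each returned line could be written with print OUT "$_\n" in place of the two separate exon and intron branches.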
Also, is " considered a metacharacter? I mean, to search for a { in a string we must write \{, so to search for a " do I have to write \"? It always gives me an error.
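As a small check on the quoting question, the sketch below matches the same sample line with and without a backslash before the double quote. A " has no special meaning inside /.../, so both forms match the same text; it only needs escaping inside a double-quoted string. The variable names are illustrative only, and the sketch does not identify where the error mentioned above actually comes from.

```perl
use strict;
use warnings;

# Sample line in the same shape as the input file.
my $line = qq{/exon="49-333"\n};

# A double quote matches itself; no backslash is needed inside /.../.
my ($plain) = $line =~ /exon="(\d+-\d+)"/;

# Escaping it is harmless and matches the same text.
my ($escaped) = $line =~ /exon=\"(\d+-\d+)\"/;

# A literal { in a pattern is safest written as \{.
my $has_brace = '{Translation%49-333}' =~ /\{Translation/ ? 1 : 0;

print "$plain $escaped $has_brace\n";   # prints: 49-333 49-333 1
```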

Thanks so much...