Hi PerlMonks,
I really need help with a regex.
I have a program that opens an input file and creates an output file. The input and output files look like this:
INPUT
DATE 13-JUN-2000
COMMERCIAL SUPPLIERS
SEQUENCE
/exon="49-333"
/intron="1-48;334-385"
//
DATE 13-JUN-2000
COMMERCIAL SUPPLIERS
SEQUENCE
/exon=""
/intron="1-29"
//
OUTPUT
DBACC D000001
DATE "13-JUN 2002"
Exon {Translation%49-333}
Intron {Translation%1-48}
Intron {Translation%334-385}
DBACC D000002
DATE "13-JUN 2002"
Exon {Translation -}
Intron {Translation%1-29}
I have a problem printing the exon and intron values. Each one can contain zero or more elements separated by ;. I know that * matches zero or more of something in a regex, but in this case I'm confused about how to print all of the captured elements ($1, $2, and so on).
Part of my code looks like this:
#!/usr/local/bin/perl -w
# A program that accepts an input file (the Scorpion database from GenBank)
# and outputs the database in BioWare format
my $file1="$ARGV[0]"; # name of the input database file
my $result=">".$ARGV[1]; # name of the output file, opened for writing
my $counter=1;
my $no='D000001';
open(INFO1,$file1) or die "Can't open $file1.\n"; # open the input file
open(OUT,$result) or die "Can't open $result.\n"; # open the output file
# for each line in the input file
foreach(<INFO1>)
{
    if(/^DATE\s*(.*)-(.*)-(.*)/){
        print OUT "DBACC\t $no\n";
        print OUT "Date\t $1-$2-$3\n";
        $no++;
    }
    elsif(/\s*\/exon="(\d*-\d*)*"\n/){
        print OUT "Exon\t \{Translation\%$1\}\n";
    }
    elsif(/\s*\/intron="(\d*-\d*)*"\n/){
        print OUT "Intron\t \{Translation\%$1\}\n";
    }
    else{
        print OUT "line $counter\n";
    }
    $counter++;
}
close(INFO1);
close(OUT);
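For the exon and intron lines, a repeated capture group such as (\d*-\d*)* only keeps its last repetition in $1, and the pattern above has no place for the ; separator. One possible approach, sketched below, is to capture everything between the quotes once and then split it on ;. The helper name format_feature is hypothetical, and the "Translation -" form for an empty value is an assumption taken from the sample output.

```perl
use strict;
use warnings;

# Hypothetical helper: turn one /exon="..." or /intron="..." line into
# a list of output lines, one per range.
sub format_feature {
    my ($line) = @_;

    # Capture the feature name and everything between the double quotes.
    return unless $line =~ m{^\s*/(exon|intron)="([^"]*)"};
    my $name  = ucfirst($1);
    my $value = $2;

    # Split on ';' and keep only well-formed "start-end" ranges.
    # An empty value yields an empty list.
    my @ranges = grep { /^\d+-\d+$/ } split /;/, $value;

    # Assumption from the sample output: no ranges prints "Translation -".
    return ("$name\t {Translation -}\n") unless @ranges;
    return map { "$name\t {Translation%$_}\n" } @ranges;
}

print format_feature($_) for (
    '/exon="49-333"',
    '/intron="1-48;334-385"',
    '/exon=""',
);
```

In the main loop, the two elsif branches could then be replaced by a single call along the lines of print OUT format_feature($_), with the else branch used when the call returns nothing.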
Also, is " considered a metacharacter? I mean, to search for a { in a string we must write \{. To search for a ", do I have to write \"? I ask because it always gives me an error.
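On the quoting question: " is not a regex metacharacter in Perl, so it can appear unescaped inside /.../. Escaping it there is harmless but unnecessary. It does need a backslash inside a double-quoted string, which may be where such an error comes from (that is a guess, since the error message is not shown). A small sketch with a made-up sample line:

```perl
use strict;
use warnings;

my $line = '/intron="1-29"';   # sample line, single-quoted so " needs no escape

# A bare " inside /.../ matches a literal double quote.
print "unescaped: $1\n" if $line =~ /="([^"]*)"/;

# Escaping it also works, but is not required.
print "escaped: $1\n" if $line =~ /=\"([^\"]*)\"/;

# Inside a double-quoted string, however, a literal " must be escaped.
my $text = "value=\"1-29\"";
print "$text\n";
```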
Thanks so much...