PerlMonks  

this should do exactly what you want. i cleaned up your code a little, and added my comments with ##. you can look up info in perldoc on the items in parentheses. in particular, you may want to look at shift, FileHandle, perlre, split, and while.

some nodes you might want to read are:
while or foreach?
Opening files
Use strict warnings and diagnostics or die
Death to Dot Star!

best of luck in the future!

    #!/usr/local/bin/perl -w
    use strict;      ## use strict, use strict, use strict!!!
    $|++;            ## enable autoflush on STDOUT
    use FileHandle;

    # A program that accepts an input file (Scorpion database from GenBank)
    # and outputs the database in BioWare format

    ## used descriptive variable names
    ## used shift operator to process arguments (shift) and die with usage
    my $infile  = shift || die "usage: $0 infile outfile\n";
    my $outfile = shift || die "usage: $0 infile outfile\n";

    my $item_count = 1;
    my $item       = 'D000001';

    my $IN  = new FileHandle;
    my $OUT = new FileHandle;

    ## check status of open and print $! for descriptive error message
    open($IN,  "< " . $infile)  or die "Can't open $infile. $!";
    open($OUT, "> " . $outfile) or die "Can't open $outfile. $!";

    while(<$IN>) {
        ## remove trailing newline
        chomp;

        ## skip blank lines
        next if( /^\s*$/ );

        ## print newline if end of record
        if( m{^//$} ) {
            print $OUT "\n";
            next;
        }

        ## expects date format like 1or2-three-four characters (perlre)
        if( /^DATE\s+(..?)-(...)-(....)$/ ) {    ## very fast regex
            print $OUT "DBACC\t", $item++, "\n";
            print $OUT "DATE\t\"$1-$2 $3\"\n";
        }
        ## non-greedy match between double quotes (perlre)
        elsif( /^\s*\/exon="(.*?)"$/ ) {
            ## handle null case
            print $OUT "Exon\t{Translation -}\n" unless $1;
            ## separate the matched string and process each (split)
            for(split ';', $1) {
                print $OUT "Exon\t{Translation\%", $_ ,"}\n";
            }
        }
        ## non-greedy match between double quotes (perlre)
        elsif( /^\s*\/intron="(.*?)"$/ ) {
            ## handle null case
            print $OUT "Intron\t{Translation -}\n" unless $1;
            ## separate the matched string and process each
            for(split ';', $1) {
                print $OUT "Intron\t{Translation\%", $_ ,"}\n";
            }
        }
    }

    ## check status of close and print $! for descriptive error message
    close($IN)  or die "Can't close $infile. $!";
    close($OUT) or die "Can't close $outfile. $!";
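to see what the exon/intron branches and the `$item++` counter do without running the whole script, here is a minimal sketch. `translate_feature` is a hypothetical helper (it is not in the code above) and the sample lines are made up for illustration. it also assumes the null case should be tested with `length` rather than truthiness, so a value of "0" would not be treated as empty.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the original script): return the output
# lines that an /exon="..." or /intron="..." feature line would produce.
sub translate_feature {
    my ($line) = @_;
    my @out;

    # non-greedy match between the double quotes, as in the script above
    if ( $line =~ /^\s*\/(exon|intron)="(.*?)"$/ ) {
        my ( $label, $value ) = ( ucfirst($1), $2 );

        # null case: nothing between the quotes (assumption: test length)
        push @out, "$label\t{Translation -}" unless length $value;

        # one output line per ';'-separated piece; empty string gives none
        push @out, "$label\t{Translation%$_}" for split /;/, $value;
    }
    return @out;
}

# made-up sample input, for illustration only
print "$_\n" for translate_feature('   /exon="1..50;60..120"');
print "$_\n" for translate_feature('   /intron=""');

# magic string increment, which is what drives the DBACC counter
my $item = 'D000001';
$item++;
print "$item\n";
```

this prints two `Exon` lines (one per piece), one `Intron\t{Translation -}` line for the empty quotes, and then `D000002`. the increment works on the string because `++` has special handling for values that look like letters followed by digits (see perlop).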

~Particle


In reply to Re: about regular expression by particle
in thread about regular expression by agustina_s
