PerlMonks  

Re: parsing multiple lines

by mwah (Hermit)
on May 21, 2008 at 19:24 UTC (#687830)


in reply to parsing multiple lines

If it's an option to slurp the whole file into memory before processing, like this:

    ...
    my $fn = 'data.dat';
    my $stuff;
    open my $fh, '<', $fn or die "$fn - $!";
    { local $/; $stuff = <$fh> }   # no record separator: read the whole file at once
    close $fh;
    ...

then the program could be simplified to something like this:

    ...
    while( $stuff =~ /^ (\d+): \s+           # the number xx:       => $1
                       (\w+) \s+             # the locus            => $2
                       ( (?:.(?!^\d+:))+ )   # the remaining record => $3
                     /msgx ) {
       my ($locus, $name, $record, $kegg, $func, $proc, $comp)
          = ($2, '', $3, ('unknown') x 4);

       # every field keeps its default unless its pattern matches
       $name = $1 if $record =~ /([^\n\[]+)\s*/;
       $kegg = $1 if $record =~ /KEGG \s+ pathway: \s+ (.+?)\s+Function \s+ Evidence/sx;
       $func = $1 if $record =~ /Function  \s+ Evidence \s+ (.+?) (?:\n\n|\z)/sx;
       $proc = $1 if $record =~ /Process   \s+ Evidence \s+ (.+?) (?:\n\n|\z)/sx;
       $comp = $1 if $record =~ /Component \s+ Evidence \s+ (.+?) (?:\n\n|\z)/sx;

       print join "\n\n", $locus, $name, $kegg, $func, $proc, $comp;
    }
    ...
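To see how the outer pattern cuts the slurped text into records, here is a minimal, self-contained sketch. The sample data is made up for illustration (the real layout of data.dat is only assumed), and an in-memory filehandle stands in for the file so the snippet runs as is. The hash keys number, locus and body are names chosen just for this example.

```perl
use strict;
use warnings;

# Hypothetical sample input; the real file layout is an assumption.
my $sample = <<'END';
1: AT1G01010 first protein [some species]
KEGG pathway: none
2: AT1G01020 second protein
spans two lines
10: AT1G01030 third protein
END

# Slurp through an in-memory filehandle, the same way a real file would be read.
open my $fh, '<', \$sample or die "in-memory open failed: $!";
my $stuff = do { local $/; <$fh> };
close $fh;

my @records;
while ( $stuff =~ /^ (\d+): \s+                 # record number (capture 1)
                     (\w+)  \s+                 # locus         (capture 2)
                     ( (?: . (?! ^\d+: ) )+ )   # rest of record (capture 3):
                                                # stop before the next "NN:" line
                  /msgx ) {
    push @records, { number => $1, locus => $2, body => $3 };
}

# One line per record: its number and locus.
printf "%s %s\n", $_->{number}, $_->{locus} for @records;
```

The negative lookahead is what ends a record: a character is only consumed if the position right after it is not the start of a line beginning with digits and a colon, so each body stops just before the newline that precedes the next record header.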

Regards

mwa

Replies are listed 'Best First'.
Re^2: parsing multiple lines
by sm2004 (Acolyte) on May 22, 2008 at 01:07 UTC
    Thanks. I appreciate all the suggestions and help.
Re^2: parsing multiple lines
by sm2004 (Acolyte) on May 28, 2008 at 22:52 UTC
    Thanks a lot. I'm using your script now as it's easy to modify some of my other text files to work with this script/idea. I didn't understand some of the syntax before (I'm new to perl) but I get it now. Thanks again.
