PerlMonks  

Seeing double

by amearse (Sexton)
on Apr 20, 2001 at 01:19 UTC ( [id://73999]=perlquestion )

amearse has asked for the wisdom of the Perl Monks concerning the following question:

This script reads a text file and picks out two types of lines: customer name and Message-Id. Unfortunately, there are usually two Message-Id tags per email. Does anybody know how I would get the script to ignore the second one? Here is the code:
    #!/usr/bin/perl

    $textfile = "iwn.txt";
    open (TEXT, "$textfile") || die "Can't Open $textfile";
    @text = <TEXT>;
    close(TEXT);

    $glob = @text;
    for ($i = 0; $i < $glob; $i++) {
        $_ = $text[$i];
        if (/Email address:/) {
            print $_;
        }
        elsif (/Message-Id:/) {
            print $_;
        }
    }
Here are the current results with two emails:
    Email address: 19324213@glerp
    Message-Id: <200103160515.WAA07878@mal.elp.rr.com>
    Message-Id: <200103160515.WAA07878@mal.elp.rr.com>
    Email address: 19075680@glerp
    Message-Id: <200103150512.VAA27508@prxy3.ba.best.com>
    Message-Id: <200103150512.VAA27508@prxy3.ba.best.com>
Any advice would be much appreciated. Thanks

Replies are listed 'Best First'.
(Ovid - remove duplicates) Re: Seeing double
by Ovid (Cardinal) on Apr 20, 2001 at 01:53 UTC
    My thought: use a hash.
        #!/usr/bin/perl -w
        use strict;

        my %message_id;
        while ( <DATA> ) {
            if ( /Email address:/ ) {
                print;
            }
            elsif ( /Message-Id:/ && ! $message_id{ $_ }++ ) {
                print;
            }
        }
        __DATA__
        Email address: 19324213@glerp
        Message-Id: <200103160515.WAA07878@mal.elp.rr.com>
        Message-Id: <200103160515.WAA07878@mal.elp.rr.com>
        Email address: 19075680@glerp
        Message-Id: <200103150512.VAA27508@prxy3.ba.best.com>
        Message-Id: <200103150512.VAA27508@prxy3.ba.best.com>

    Cheers,
    Ovid
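
    The hash approach above keys on the whole line, so two Message-Id lines only count as duplicates when they match character for character. A minimal sketch of a variation that keys on the ID itself, in case the duplicate lines differ in spacing; the helper name unique_message_ids and the assumption that IDs always appear in angle brackets are mine, not part of the original post:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns the lines worth printing, keeping every
# "Email address:" line and only the first line carrying each Message-Id.
sub unique_message_ids {
    my @lines = @_;
    my %seen;    # captured ID => number of times seen so far
    my @keep;
    for my $line (@lines) {
        if ( $line =~ /Email address:/ ) {
            push @keep, $line;
        }
        elsif ( $line =~ /Message-Id:\s*(<[^>]*>)/ ) {
            # Assumes the ID is wrapped in <...>; $1 holds the ID alone.
            push @keep, $line unless $seen{$1}++;
        }
    }
    return @keep;
}

my @sample = map { "$_\n" } (
    'Email address: 19324213@glerp',
    'Message-Id: <200103160515.WAA07878@mal.elp.rr.com>',
    'Message-Id: <200103160515.WAA07878@mal.elp.rr.com>',
    'Email address: 19075680@glerp',
    'Message-Id: <200103150512.VAA27508@prxy3.ba.best.com>',
    'Message-Id: <200103150512.VAA27508@prxy3.ba.best.com>',
);
print unique_message_ids(@sample);
```

    Either way the hash remembers IDs for the whole run, so an ID that legitimately shows up again under a later email would be suppressed too.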


Re: Seeing double
by stephen (Priest) on Apr 20, 2001 at 01:58 UTC
    If what you're parsing is a file full of e-mails, I'd recommend going straight for MailTools. Mail::Util will break up a file full of mail messages (an mbox file) into individual messages. Mail::Internet can be used to parse those messages, and Mail::Header can be used to parse each message's header. Once you're dealing with headers individually, your problem becomes much simpler.

    stephen

Re: Seeing double
by indigo (Scribe) on Apr 20, 2001 at 01:35 UTC
    Sure.
        #!/usr/bin/perl -w
        use strict;

    Always use -w and use strict.

        my $textfile = "iwn.txt";

    Use my to declare variables under use strict.

        open TEXT, $textfile or die "Can't Open $textfile";
        my ($email, $id);
        while (<TEXT>) {
            $email ||= $_ if /Email address:/;
            $id    ||= $_ if /Message-Id:/;
        }

    You were using C style looping. Perl loops are much nicer.

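    $x ||= $y (or equals) is the same as $x = $x || $y. The end effect for you is that $id gets a value only if it doesn't already have one.

        close TEXT;
        print $email;
        print $id;

    Pulling these variables out of the loop ensures they only get printed once. Note that this keeps just the first Email address and the first Message-Id found in the whole file, so it fits a file that holds a single email.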
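
    To see the ||= behaviour described above in isolation, here is a small sketch. The variable names are made up for illustration, and the last part uses //= (defined-or assignment), which needs Perl 5.10 or later and is an addition of mine rather than something from the post:

```perl
use strict;
use warnings;

my $id;                 # starts out undefined, which counts as false
$id ||= "first";        # assigned, because $id was false
$id ||= "second";       # skipped, because $id is now true
print "$id\n";          # prints "first"

# Caveat: ||= tests truth, not definedness, so a legitimate false value
# such as 0 or "" gets overwritten.
my $count = 0;
$count ||= 5;
print "$count\n";       # prints "5"

# //= (Perl 5.10+) assigns only when the left side is undefined.
my $zero = 0;
$zero //= 5;
print "$zero\n";        # prints "0"
```

    For lines read from a file the distinction rarely matters, since a matching line always contains text and is therefore true.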
(jeffa) Re: Seeing double
by jeffa (Bishop) on Apr 20, 2001 at 01:26 UTC
    Here is one way, with a cheesy flag:
        use strict;

        my $textfile = "iwn.txt";
        open (TEXT, "$textfile") || die "Can't Open $textfile";
        my @text = <TEXT>;
        close(TEXT);

        my $flag = 0;
        foreach (@text) {
            if (/Email address:/) {
                print;
                $flag = 0;    # new email: allow one Message-Id again
            }
            if (/Message-Id:/ and !$flag) {
                print;
                $flag = 1;    # ignore further Message-Ids for this email
            }
        }

    Jeff

Re: Seeing double
by satchboost (Scribe) on Apr 20, 2001 at 01:26 UTC
    Use a counter. So, you'd have something like:
        elsif (/Message-Id:/) {
            $counter = 1 - $counter;
            print $_ if $counter;
        }

    Somewhere near the beginning, before the loop, declare my $counter = 0; and you should be fine. This relies on every email having exactly two Message-Id lines.
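
    Put together, the toggle might look like the sketch below. The helper name every_other_message_id is hypothetical, and the whole thing assumes each email carries exactly two Message-Id lines; with one or three, the toggle drifts out of step:

```perl
use strict;
use warnings;

# Hypothetical helper: keeps every "Email address:" line and every other
# "Message-Id:" line (the 1st, 3rd, 5th, ...).
sub every_other_message_id {
    my @lines = @_;
    my $counter = 0;
    my @keep;
    for (@lines) {
        if (/Email address:/) {
            push @keep, $_;
        }
        elsif (/Message-Id:/) {
            $counter = 1 - $counter;       # flips 0 -> 1 -> 0 -> 1 ...
            push @keep, $_ if $counter;    # keep only when flipped to 1
        }
    }
    return @keep;
}

my @sample = map { "$_\n" } (
    'Email address: 19324213@glerp',
    'Message-Id: <200103160515.WAA07878@mal.elp.rr.com>',
    'Message-Id: <200103160515.WAA07878@mal.elp.rr.com>',
    'Email address: 19075680@glerp',
    'Message-Id: <200103150512.VAA27508@prxy3.ba.best.com>',
    'Message-Id: <200103150512.VAA27508@prxy3.ba.best.com>',
);
print every_other_message_id(@sample);
```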

Re: Seeing double
by ChemBoy (Priest) on Apr 20, 2001 at 01:33 UTC

    Couple of quick ideas:

        my $flag = 0;
        for ($i = 0; $i < $glob; $i++) {
            $_ = $text[$i];
            if (/Email address:/) {
                print $_;
                $flag = 0;    # reset so the next email's Message-Id prints
            }
            elsif (/Message-Id:/ and !$flag++) {
                print $_;
            }
        }

    Or alternatively

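        my ($email, $mesgid) = ("", "");
        for ($i = 0; $i < $glob; $i++) {
            $_ = $text[$i];
            if (/Email address:/) {
                print $email, $mesgid;    # flush the previous email's pair
                ($email, $mesgid) = ($_, "");
            }
            elsif (/Message-Id:/) {
                $mesgid = $_;
            }
        }
        print $email, $mesgid;    # flush the last pair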

    And a question: why the for loop, instead of

    while (<TEXT>){ ... }
    ?
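
    For what it's worth, a rough sketch of what the while-loop version could look like, reading one line at a time instead of slurping the file into an array. The helper name first_ids_from_handle is made up, and the in-memory filehandle merely stands in for iwn.txt so the example runs without the real file:

```perl
use strict;
use warnings;

# Hypothetical helper: reads an open filehandle line by line and returns
# each "Email address:" line plus the first "Message-Id:" line after it.
sub first_ids_from_handle {
    my ($fh) = @_;
    my $seen_id = 0;    # true once this email's Message-Id has been kept
    my @keep;
    while ( my $line = <$fh> ) {
        if ( $line =~ /Email address:/ ) {
            push @keep, $line;
            $seen_id = 0;    # new email, so allow one Message-Id again
        }
        elsif ( $line =~ /Message-Id:/ && !$seen_id++ ) {
            push @keep, $line;
        }
    }
    return @keep;
}

# Stand-in data; with the real file you would instead write
#   open my $fh, '<', 'iwn.txt' or die "Can't open iwn.txt: $!";
my $data = join '', map { "$_\n" } (
    'Email address: 19324213@glerp',
    'Message-Id: <200103160515.WAA07878@mal.elp.rr.com>',
    'Message-Id: <200103160515.WAA07878@mal.elp.rr.com>',
    'Email address: 19075680@glerp',
    'Message-Id: <200103150512.VAA27508@prxy3.ba.best.com>',
    'Message-Id: <200103150512.VAA27508@prxy3.ba.best.com>',
);
open my $fh, '<', \$data or die "Can't open in-memory file: $!";
print first_ids_from_handle($fh);
close $fh;
```

    Reading line by line keeps only the current line in memory, which matters once the input file gets large.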

