PerlMonks  

Re: How to extract an email address from a mailto URL?

by eye (Chaplain)
on Dec 30, 2008 at 07:03 UTC ( [id://733200]=note )


in reply to How to extract an email address from a mailto URL?

If you want to distinguish addresses in anchor tags from other uses of "mailto:" in the file, read the entire file into memory and use the match operator (m//). As suggested previously, you should use Regexp::Common::Email::Address to help compose a regular expression for the email address and the enclosing HTML. I would use "\s+" between the "a" and "href" and "\s*" on either side of the equal sign to match HTML's treatment of whitespace. Note that HTML allows quoting with both single and double quotes. HTML (unlike XHTML) also allows you to leave the value after the equal sign unquoted in some circumstances.
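A minimal sketch of the approach described above, not a definitive implementation. It assumes a simplified email pattern in place of $RE{Email}{Address} so that it runs without extra modules. The names slurp, extract_mailto, $email, and $anchor are made up for illustration, and the sample text is adapted from the snippet posted in this thread.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Simplified stand-in for an email pattern (an assumption for this sketch).
# $RE{Email}{Address} from Regexp::Common::Email::Address is more thorough.
my $email = qr/[A-Za-z0-9._%+-]+\@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/;

# Match an anchor tag whose href is a mailto: URL.
my $anchor = qr{
    <a \s+              # tag name, then mandatory whitespace (may be a newline)
    (?: [^>]*? \s )?    # optional other attributes before href
    href \s* = \s*      # optional whitespace on either side of the equal sign
    (["']?)             # opening quote: single, double, or none
    mailto: ($email)    # the address we want
    [^"'>\s]*           # tolerate a trailing query part such as ?subject=Hi
    \1                  # closing quote must match the opening one
}xi;

# Hypothetical helper: read a whole filehandle into one string.
sub slurp {
    my ($fh) = @_;
    local $/;           # undef the record separator, so one read gets it all
    return scalar <$fh>;
}

# Hypothetical helper: return every address found inside an anchor tag.
sub extract_mailto {
    my ($html) = @_;
    my @found;
    while ($html =~ /$anchor/g) {
        push @found, $2;
    }
    return @found;
}

my $sample = <<'HTML';
Fax: (301)931-1285
<br><a
href='mailto:KHargrove@servpro1010.com'>KHargrove@servpro1010.com</a>
</td>
Plain text mentioning mailto:nobody@example.com is ignored.
HTML

# An in-memory filehandle stands in for the real file here.
open my $fh, '<', \$sample or die "Cannot open in-memory file: $!";
my @addresses = extract_mailto(slurp($fh));
close $fh;

print "$_\n" for @addresses;
```

Run against the sample, this should print only KHargrove@servpro1010.com. The tag is matched even though "a" and "href" sit on different lines, because the whole text is searched as one string, and the bare "mailto:" in the last line is skipped because it is not inside an anchor tag.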
Quoting attribute values in HTML
by dorward (Curate) on Dec 30, 2008 at 20:47 UTC
Re^2: How to extract an email address from a mailto URL?
by jdlev (Scribe) on Dec 30, 2008 at 13:17 UTC
    My experience with Perl goes back only about 3 weeks, so some of what you are saying is Greek to me. Can you provide an example of how you would do it? The source file to pull the information from has the tag as follows:

    showTollfree(1010)
    // -->
    </script>
    Fax:  (301)931-1285 
    <br><a
    href='mailto:KHargrove@servpro1010.com'>KHargrove@servpro1010.com</a>

    </td>

    I'm sorry to have to be wet nursed through this...but I have learned a ton of stuff over the last few weeks...I feel like my brain is going to explode!

      Well, first install these two modules (and their unresolved dependencies if there are any):

      Then you can do something like this (Quickshot, untested):

      #!/usr/bin/perl
      use strict;
      use warnings;
      use Regexp::Common qw(Email::Address);
      use Email::Address;

      my $filename = 'file_to_parse.dat';
      open my $rh, '<', $filename or die "$filename: $!";

      # Requirement: href=, mailto: and the mail address must be on the same line!
      # The ternary drops lines where the address itself fails to match,
      # instead of returning a stale $1.
      my @addresses =
          map  { m/mailto:($RE{Email}{Address})/o ? $1 : () }
          grep { m/href=.+?mailto:/ }
          <$rh>;
      close $rh;

      {
          # Print one address per line
          local $, = local $\ = "\n";
          print @addresses;
      }
      __END__
        Thanks, works great!
