PerlMonks  

regular expression

by ansh batra (Friar)
on Jan 07, 2013 at 09:06 UTC ( [id://1011981]=perlquestion )

ansh batra has asked for the wisdom of the Perl Monks concerning the following question:

My data is:

</div><div class='ClipPicture'><a href='http://www.laptop-keys.com/KeyboardKeys/Cart/Acer/A_Series/A110/A8A'><img alt='Clip Style Pic' src='http://www.laptop-keys.com/images/KeyboardImages/A8A.png' width='659' height='135'/></a></div><div class='RadioButtonContainerClipStyle'>
This is one line of input.
I want to get the value of the src attribute of the img tag, i.e. http://www.laptop-keys.com/images/KeyboardImages/A8A.png
My code:
if($line=~ /<img .* src=\'(.*?)\' .*\/><\/a>/) { print "got2\n"; $kb_lay_img_url=$&; }

please help

Replies are listed 'Best First'.
Re: regular expression
by frozenwithjoy (Priest) on Jan 07, 2013 at 09:38 UTC
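    You almost have it right. If I change $kb_lay_img_url=$&; to $kb_lay_img_url=$1;, it works for me. $& holds the entire matched string (here, the whole img tag), while $1 holds only what the first pair of capturing parentheses matched. (see: Special Variables)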
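To make the difference between the two variables concrete, here is a minimal sketch that runs the regex from the question against a shortened copy of the sample input and prints both values. The variable names `$whole_match` and `$first_group` are illustrative and do not come from the thread.

```perl
use strict;
use warnings;

# Shortened copy of the input line from the question.
my $line = q{<a href='http://www.laptop-keys.com/KeyboardKeys/Cart/Acer/A_Series/A110/A8A'><img alt='Clip Style Pic' src='http://www.laptop-keys.com/images/KeyboardImages/A8A.png' width='659' height='135'/></a>};

my ($whole_match, $first_group);

# Same pattern as in the question (the backslashes before the quotes are optional).
if ($line =~ /<img .* src='(.*?)' .*\/><\/a>/) {
    $whole_match = $&;   # everything the pattern matched: the full img tag plus </a>
    $first_group = $1;   # only the text inside the first pair of parentheses
}

print "whole match: $whole_match\n";
print "first group: $first_group\n";
```

The first line printed is the complete img tag, which is what the original code stored; the second is only the URL.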

    EDIT: Here, I've made this change (and added Anon's suggestion, which makes the regex safer/better):

    #!/usr/bin/env perl
    use strict;
    use warnings;
    use feature 'say';

    my $line = "</div><div class='ClipPicture'><a href='http://www.laptop-keys.com/KeyboardKeys/Cart/Acer/A_Series/A110/A8A'><img alt='Clip Style Pic' src='http://www.laptop-keys.com/images/KeyboardImages/A8A.png' width='659' height='135'/></a></div><div class='RadioButtonContainerClipStyle'>";

    if ($line =~ /<img .* src='([^']+)'.*\/><\/a>/) {
        print "got2\n";
        my $kb_lay_img_url = $1;
        say $kb_lay_img_url;
    }

    __END__
    got2
    http://www.laptop-keys.com/images/KeyboardImages/A8A.png
      thanks. :)
Re: regular expression
by choroba (Cardinal) on Jan 07, 2013 at 09:36 UTC
    Do not use regular expressions to parse HTML. Use a proper tool. For example, using XML::XSH2, you can load the whole file and find the src attribute easily:
    open :F html 1011981.html ; echo //img[@alt="Clip Style Pic"]/@src ;
    لսႽ† ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ
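The same "use a parser" approach can also be sketched from a plain Perl script. The example below assumes the CPAN module HTML::TreeBuilder is installed; that module is not mentioned in the thread and is only one of several possible choices. The helper name `img_src_by_alt` is hypothetical.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;   # CPAN module, an assumption of this sketch

# Return the src attribute of the first <img> whose alt attribute equals $alt,
# or undef if there is no such element.
sub img_src_by_alt {
    my ($html, $alt) = @_;
    my $tree = HTML::TreeBuilder->new_from_content($html);
    my $img  = $tree->look_down(_tag => 'img', alt => $alt);
    my $src  = $img ? $img->attr('src') : undef;
    $tree->delete;       # release the parse tree
    return $src;
}

# Input line from the question.
my $html = q{</div><div class='ClipPicture'><a href='http://www.laptop-keys.com/KeyboardKeys/Cart/Acer/A_Series/A110/A8A'><img alt='Clip Style Pic' src='http://www.laptop-keys.com/images/KeyboardImages/A8A.png' width='659' height='135'/></a></div><div class='RadioButtonContainerClipStyle'>};

my $src = img_src_by_alt($html, 'Clip Style Pic');
print defined $src ? "$src\n" : "no matching img found\n";
```

Because the parser works on elements and attributes, the lookup does not depend on attribute order, spacing, or which quote character the page uses.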
Re: regular expression (blah blah blah)
by Anonymous Monk on Jan 07, 2013 at 09:07 UTC
     '([^']+)'

      It is fetching the whole img tag.
      I need only the value of the src attribute.

        Most likely what Anonymous Monk showed you was not supposed to be the complete regular expression but intended to be used for matching the src attribute only. You will have to apply that hint appropriately.

        Did you try to understand my regex pattern, for example with YAPE::Regex::Explain? Put it in your regex: match up the quotes, delete your own pattern between them, and use mine.
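As a rough illustration of applying that hint, the sketch below places '([^']+)' directly after src= and reads the captured value from a list-context match. The surrounding `<img\b[^>]*\s` part and the variable names are assumptions added for this example, not something posted in the thread. A greedy `(.*)` is included only for contrast.

```perl
use strict;
use warnings;

# Input line from the question.
my $line = q{</div><div class='ClipPicture'><a href='http://www.laptop-keys.com/KeyboardKeys/Cart/Acer/A_Series/A110/A8A'><img alt='Clip Style Pic' src='http://www.laptop-keys.com/images/KeyboardImages/A8A.png' width='659' height='135'/></a></div><div class='RadioButtonContainerClipStyle'>};

# [^']+ cannot cross a single quote, so the capture stops at the end of the
# attribute value. In list context the match returns the captured groups.
my ($src) = $line =~ /<img\b[^>]*\ssrc='([^']+)'/;

# For contrast: a greedy .* runs on to the last single quote in the line.
my ($greedy) = $line =~ /src='(.*)'/;

print "negated class: $src\n";
print "greedy dot-star: $greedy\n";
```

The negated character class keeps the capture inside one attribute value no matter what follows it on the line, which is why it is generally safer than `.*` or `.*?` here.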

Node Type: perlquestion [id://1011981]
Approved by Corion