Re^5: anchor text match

by JadeNB (Chaplain)
on Dec 30, 2009 at 21:08 UTC (#814990)

in reply to Re^4: anchor text match
in thread anchor text match

You have just copied the code from Re^2: anchor text match literatim (except for some mild massaging of the input). What have you tried?

Re^6: anchor text match
by kumar801012 (Initiate) on Dec 30, 2009 at 21:59 UTC
    I also tried this:
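    use WWW::Mechanize ();

    my $mech = WWW::Mechanize->new();
    my $html = $mech->get('');

    # Collect every link whose text contains the letter "a" (case-insensitive).
    my @links = $mech->find_all_links( text_regex => qr/a/i );

    # Print the URL and text of each link that points at the target URL.
    foreach (@links) {
        if ( $_->url() eq '' ) {
            print "\n";
            print "url \n";
            print $_->url();
            print "\n";
            print " text\n";
            print $_->text();
            print "\n";
        }
    }
    __END__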

    The output is:


    text: Victoria's Secret

    In case the page had an anchor tag like below:

    <a href="" target=_blank><img src= height=11 width=11 border=0 alt="Open this result in new window"> </a>

    The above Perl script would give:


    text: Open this result in new window

    But the desired result is:


    text: IMAGE

      I think that you may have expected a ready-made solution, which is why Re^2: anchor text match surprised you. The poster there was not (I think) trying to solve your problem, but rather to indicate to you how you could solve it. (That was the meaning of the “Two clues in one” text.)

      It's not surprising that the code you indicate doesn't do what you want—the for loop makes no effort to check whether the link being processed satisfies any special conditions, and so must treat every link equally.

      To fix this, you must have something of the following shape in your code*:

          for my $link ( @links ) {
              if ( is_special $link ) {
                  do_special_thing $link
              }
              else {
                  do_ordinary_thing $link
              }
          }

      where it's up to you to determine how to write is_special and do_special_thing (you've already indicated what you want do_ordinary_thing to be). As an aid, you have the $link object to hand, and so can test its properties in as much detail as necessary.

      * I don't mean literally that your code has to contain these words; just that, without some sort of conditional, you'll never get the special treatment you like.
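For concreteness, here is one minimal sketch of that shape, not a ready-made solution. Everything named in it is an assumption made for illustration: My::Link is a hypothetical stand-in that models only the url() and text() methods of a link object, the example.com URLs are placeholders, and is_special uses a guessed heuristic (the link text matches a known image alt text, assumed to have been gathered separately, for instance from the alt attributes of the page's img tags). Your own test may need to be quite different.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a link object; only url() and text() are modelled.
{
    package My::Link;
    sub new  { my ( $class, %args ) = @_; return bless {%args}, $class }
    sub url  { $_[0]{url} }
    sub text { $_[0]{text} }
}

# Assumption: alt texts of the page's images, collected elsewhere.
my %image_alt = map { $_ => 1 } ('Open this result in new window');

# Hypothetical predicate: a link is "special" when its text is really
# the alt text of an image rather than ordinary anchor text.
sub is_special {
    my ($link) = @_;
    my $text = $link->text;
    return defined $text && exists $image_alt{$text};
}

# Special links are reported as IMAGE; ordinary links keep their own text.
sub display_text {
    my ($link) = @_;
    return is_special($link) ? 'IMAGE' : $link->text;
}

# Placeholder data standing in for the result of find_all_links.
my @links = (
    My::Link->new( url => 'http://example.com/a', text => "Victoria's Secret" ),
    My::Link->new( url => 'http://example.com/b', text => 'Open this result in new window' ),
);

for my $link (@links) {
    print 'url: ',  $link->url,           "\n";
    print 'text: ', display_text($link), "\n";
}
```

With the placeholder data above, the first link keeps its own text and the second is reported as IMAGE. The point is only the conditional inside the loop; how you decide that a link is special is the part you still have to work out for your page.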
