http://qs321.pair.com?node_id=507948

davemcgi has asked for the wisdom of the Perl Monks concerning the following question:

I'm using LWP::UserAgent and HTTP::Request to check out rankings on the "popular" search engines ... as you know, the most "popular" uses many data centers around the world, and your queries get routed to the one closest to you geographically.

Now, whilst I can retrieve the pages fine and parse out the results, is there any way to retrieve the actual remote IP that these HTTP::Requests are being served from?

I mean, if I do a GET call to www.booble.com, the page might be sent back from one of N IP numbers ... how do I hack into LWP and ascertain the IP number? Is there any low-level way to examine the TCP/IP packets of an actual LWP call as it is being used?

Obviously I had thought about a ping or something similar immediately before or afterwards, but that doesn't guarantee the exact same IP as the one the LWP was fed from ... they can change in milliseconds, purely depending on their loading conditions.

Re: LWP and Remote IP
by Kanji (Parson) on Nov 12, 2005 at 10:32 UTC

    You don't need to hack LWP at all: the information is available to you via the header() method on the response object.

    use LWP::UserAgent;
    my $ua  = LWP::UserAgent->new;
    my $res = $ua->get('http://www.booble.com/');
    print $res->header('client-peer');

    However, I don't see the availability of the client-peer header documented in the HTTP::Response or HTTP::Headers POD pages, so this may not work in previous or future LWP releases.

    (Works for me with LWP::UserAgent 2.033, HTTP::Message 1.56 and HTTP::Headers 1.62.)

        --k.


      The LWP which comes with ActivePerl 5.6.1 doesn't set this pseudo-header.
      >perl -e "use LWP; print $LWP::VERSION"
      5.64
      Kanji, thank you, that undocumented property was exactly what I was looking for. Working great :-)

      Thank (INSERT-DEITY-OF-YOUR-CHOICE) for Perl Monks
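
Since the sub-thread above notes that some older LWP builds do not set the pseudo-header at all, a slightly more defensive version of Kanji's snippet may be useful. This is only a sketch: the helper name split_client_peer is made up for illustration, and it assumes the Client-Peer value has the form address:port, which is undocumented behaviour and may differ between LWP releases. The live request runs only when a URL is passed on the command line.

```perl
use strict;
use warnings;

# Hypothetical helper: split a Client-Peer value such as "192.0.2.10:80"
# into address and port. Returns an empty list when the header is missing
# or does not look like address:port (an assumed, undocumented format).
sub split_client_peer {
    my ($value) = @_;
    return unless defined $value;
    return unless $value =~ /\A(.+):(\d+)\z/;
    return ($1, $2);
}

# Live check, run only when a URL is given on the command line.
if (@ARGV) {
    require LWP::UserAgent;
    my $ua  = LWP::UserAgent->new;
    my $res = $ua->get($ARGV[0]);

    # scalar context: a single string, or undef if the header was never set
    my ($ip, $port) = split_client_peer(scalar $res->header('Client-Peer'));

    if (defined $ip) {
        print "served from $ip (port $port)\n";
    }
    else {
        print "this LWP did not set a Client-Peer header\n";
    }
}
```

Because the header is read from the same response object that carries the page, it describes the connection that actually served the content, which is what a separate ping cannot promise.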
Re: LWP and Remote IP
by Zaxo (Archbishop) on Nov 12, 2005 at 09:02 UTC

    What you get back from GET is an HTTP::Response object. From it you can detect if you got a redirect with the is_redirect() method and the host of the responder with the base() method.

    Since HTTP::Response inherits from HTTP::Message, the methods and data of that class are also available. That includes an HTTP::Headers object available through the headers() method.

    I have no idea whether "www.booble.com" routing techniques leave traces of that kind, but it's the information that you have available right now. See perldoc of those parent and agglomerate modules for more about the info your responses carry.

    After Compline,
    Zaxo
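
To see what Zaxo's suggestions turn up in practice, something like the following could be used to dump what a response carries. It is a rough sketch rather than a definitive recipe: describe_response is a made-up name, and it relies only on the status_line(), is_redirect(), base() and headers() methods of HTTP::Response. The request is made only when a URL is passed on the command line.

```perl
use strict;
use warnings;

# Hypothetical helper: summarise what a response object reveals about
# who answered. Works on any object offering the HTTP::Response methods
# used here.
sub describe_response {
    my ($res) = @_;
    my @lines;
    push @lines, 'status:   ' . $res->status_line;
    push @lines, 'redirect: ' . ($res->is_redirect ? 'yes' : 'no');
    push @lines, 'base:     ' . $res->base;
    push @lines, 'headers:';
    # headers() returns the HTTP::Headers object; as_string gives one
    # "Name: value" pair per line
    push @lines, map { "  $_" } split /\n/, $res->headers->as_string;
    return join "\n", @lines;
}

# Live check, run only when a URL is given on the command line.
if (@ARGV) {
    require LWP::UserAgent;
    my $res = LWP::UserAgent->new->get($ARGV[0]);
    print describe_response($res), "\n";
}
```

LWP::UserAgent normally follows redirects for GET, so the final response will usually not report itself as a redirect; earlier hops, if any, can be reached through the response's previous() method.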