PerlMonks  

HTTP Proxy Detection

by spaz (Pilgrim)
on Sep 19, 2002 at 16:49 UTC ( [id://199225]=perlquestion )

spaz has asked for the wisdom of the Perl Monks concerning the following question:

Fellow Monks,

I'm currently working on a project that, among other things, needs to fetch a series of pages through a proxy server. I'm using the LWP set of modules for this purpose.

The problem is that this proxy server isn't 100% reliable and this script needs to get on with life if the proxy isn't responding. I've tried changing the timeout values in my user agent initialization, but it's not quite good enough.

I've looked through the documentation for the various LWP modules but I don't see anything about how it handles broken proxies.

I'm not against performing a quick test before fetching, but I can't really think of a way to do that.

Any advice is greatly appreciated!

-- Dave

Replies are listed 'Best First'.
Re: HTTP Proxy Detection
by dws (Chancellor) on Sep 19, 2002 at 19:03 UTC
    The problem is that this proxy server isn't 100% reliable and this script needs to get on with life if the proxy isn't responding.

    It might help if you could characterize what "isn't 100% reliable" means in your situation. Does the proxy box go down? Does it refuse connections? Does the proxy server accept connections and then hang?

    You'll probably use the same strategy (setting 'timeout' when creating an LWP::UserAgent) to deal with most of these.

    I've looked through the documentation for the various LWP modules but I don't see anything about how it handles broken proxies.

    Read the code. At a low level, it doesn't matter whether there's a proxy there or not. A request is made, and it either succeeds (possibly returning an HTTP-level error) or fails by timing out. The request may be structured for a proxy, which will forward the request, but the forwarding is beyond the client's (LWP's) visibility.
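A minimal sketch of the timeout approach described above, assuming LWP::UserAgent is installed. The sub name `fetch_via_proxy` and its arguments are hypothetical, not something from the thread. Note that LWP's `timeout` is an inactivity timeout on the connection the client opened (here, the one to the proxy), so the client cannot tell a slow origin site from a slow proxy; it can only tell whether any answer came back.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# Fetch $url through $proxy_url, giving up after $timeout seconds
# without activity. Returns a short description of the outcome.
sub fetch_via_proxy {
    my ($url, $proxy_url, $timeout) = @_;

    my $ua = LWP::UserAgent->new( timeout => $timeout );
    $ua->proxy( 'http', $proxy_url );

    my $response = $ua->get($url);
    return 'ok: ' . $response->status_line if $response->is_success;

    # LWP marks responses it generated itself (connect failure, timeout)
    # with a Client-Warning header; other errors arrived over HTTP.
    my $warning = $response->header('Client-Warning') || '';
    return $warning eq 'Internal response'
        ? 'no usable answer: ' . $response->status_line
        : 'HTTP error: ' . $response->status_line;
}
```

Something like `fetch_via_proxy( $url, "http://$cacheServer:$cachePort/", 10 )` would then report "no usable answer" when the proxy refuses the connection, and "HTTP error" when the proxy is up but passes back a failure.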

      It might help if you could characterize what "isn't 100% reliable" means in your situation.

      Usually the proxy software (squid) dies and refuses connections. Occasionally the machine itself locks up, but I'm primarily worried about squid dying.

      Read the code.

      I thought that's what documentation is for? :)

      What I'm most confused about is what 'timeout' means when using a proxy. How can I set the timeout so that I don't time out when the proxy connects to a slow site, but I do time out when the proxy itself doesn't respond? Or are these issues not visible to the agent?

      -- Dave
Re: HTTP Proxy Detection
by kabel (Chaplain) on Sep 19, 2002 at 18:38 UTC
    I think the quick test at the beginning of the script is a viable solution, but you need a clear criterion so that you can say definitively whether the proxy is up or down.

    Perhaps Net::Ping is sufficient to do it. Or perhaps you have to try fetching a website that responds very quickly; that will cost some time.

    Another way: AFAIK, squid at least can be queried via SNMP. Perhaps this works (I do not know).
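A rough sketch of the Net::Ping idea, assuming a reasonably recent Net::Ping (one that provides `port_number` and `service_check`). The sub name `proxy_port_answers` is hypothetical. One caveat: in TCP mode Net::Ping treats a refused connection as proof that the host is reachable unless `service_check` is enabled, which is the opposite of what a "is squid alive?" test wants.

```perl
use strict;
use warnings;
use Net::Ping;

# Return 1 if a full TCP connection to $host:$port succeeds within
# $timeout seconds, 0 otherwise.
sub proxy_port_answers {
    my ($host, $port, $timeout) = @_;

    my $pinger = Net::Ping->new( 'tcp', $timeout );
    $pinger->port_number($port);    # probe the proxy port, not echo
    $pinger->service_check(1);      # a refused connection counts as down

    my $alive = $pinger->ping($host);
    $pinger->close;
    return $alive ? 1 : 0;
}
```

This only shows that something is listening on the port. It does not show that the proxy is actually forwarding requests, which is where fetching a known fast page through it would be the stronger test.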
Re: HTTP Proxy Detection
by sauoq (Abbot) on Sep 19, 2002 at 19:56 UTC
    I've tried changing the timeout values in my user agent initialization, but it's not quite good enough.

    How is it "not quite good enough?" What would you like to change?

    this script needs to get on with life if the proxy isn't responding.

    What do you mean by "get on with life?" Do you want it to try another server? Avoid the proxy? Return an error to the user?

    -sauoq
    "My two cents aren't worth a dime.";
    
Found a solution
by spaz (Pilgrim) on Sep 19, 2002 at 21:32 UTC
    This will work for the time being:
    my $socket = IO::Socket::INET->new(
        PeerAddr => $cacheServer,
        PeerPort => $cachePort,
        Proto    => 'tcp'
    );
    if( $socket ) {
        print "Connected to $cacheServer at tcp port $cachePort\n";
        close( $socket );
    } else {
        print "*** Couldn't connect to $cacheServer at tcp port $cachePort!!\n";
    }
    Does anybody have any suggestions on improving this snippet? I assume it's fine as it is, but if there's anything glaring please let me know!

    -- Dave
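One possible refinement of the snippet above, offered as a sketch and not a definitive fix: passing `Timeout` to `IO::Socket::INET->new` bounds the connect attempt, which matters for the case where the machine locks up instead of refusing the connection. The sub name `proxy_accepts_connections` and the 3-second value in the usage comment are assumptions.

```perl
use strict;
use warnings;
use IO::Socket::INET;

# Return 1 if something accepts a TCP connection on $host:$port
# within $timeout seconds, 0 otherwise.
sub proxy_accepts_connections {
    my ($host, $port, $timeout) = @_;

    my $socket = IO::Socket::INET->new(
        PeerAddr => $host,
        PeerPort => $port,
        Proto    => 'tcp',
        Timeout  => $timeout,    # bounds the connect attempt
    );
    return 0 unless $socket;

    close($socket);
    return 1;
}

# Hypothetical usage: only route through the cache when it answers.
# my $ua = LWP::UserAgent->new( timeout => 30 );
# $ua->proxy( 'http', "http://$cacheServer:$cachePort/" )
#     if proxy_accepts_connections( $cacheServer, $cachePort, 3 );
```

Keeping the probe timeout short and the user agent timeout long is one way to approximate "fail fast on a dead proxy, be patient with slow sites", though a proxy that accepts connections and then hangs would still only be caught by the user agent timeout.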

Node Type: perlquestion [id://199225]
Approved by krisahoch