hacker has asked for the wisdom of the Perl Monks concerning the following question:

I'm writing a tool that needs to pick a proxy at random from a list. I'm doing that with:

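use File::Slurp;                            # assumed source of read_file
chomp(my @proxies = read_file('proxies'));  # one proxy per line; strip the trailing newlines
my $random_proxy = $proxies[rand @proxies];
$ua->proxy(['http', 'ftp'], "http://$random_proxy");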

This works great, EXCEPT when I try to use one of those proxies, and it is either down, or giving me a 500, 403 or other status indicating that I can't use it for proxying.

Is there some programmatic way of querying a proxy server to determine whether it will allow itself to be used, and if not, skip to the next $random_proxy in the list?

I was thinking that testing $res->status_line after a HEAD test on it might work, but even if the IP is accessible, does that mean the proxy can be used? I don't think so.

Has anyone tried something like this? I need to be sure my list of proxies is only pointing to those which can be used as a public proxy.


Replies are listed 'Best First'.
Re: Testing proxy "health"
by gamache (Friar) on Oct 22, 2007 at 15:47 UTC
    I once dealt with this problem by giving each proxy N chances to fail, and after one reached N failures, I'd remove it from the list. It was a pound of cure rather than an ounce of prevention, but it more or less did the job. This method was used in a program where I was hitting 60-100 proxies at a time, so having a few not working at any given moment wasn't of high importance as long as things settled out eventually.
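The "N chances" bookkeeping described above might look roughly like the sketch below. The names record_failure, %failures and the limit of 3 are made up for illustration. Whether the original program reset the count after a success isn't stated, so this version simply counts every failure.

```perl
use strict;
use warnings;

# Count one failure for $proxy. Once it has failed $max times, remove it
# from the pool (an array ref). Returns 1 if the proxy was dropped, else 0.
# Hypothetical helper: the name and interface are illustrative only.
sub record_failure {
    my ($pool, $failures, $proxy, $max) = @_;
    return 0 if ++$failures->{$proxy} < $max;
    @$pool = grep { $_ ne $proxy } @$pool;   # strike it from the list
    delete $failures->{$proxy};
    return 1;
}

my @pool = ('10.0.0.1:8080', '10.0.0.2:3128');   # placeholder addresses
my %failures;
record_failure(\@pool, \%failures, '10.0.0.1:8080', 3) for 1 .. 3;
print "Remaining proxies: @pool\n";
```

The caller would invoke record_failure whenever a request through a proxy fails, and carry on picking at random from whatever is left in @pool.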

      Can you explain to me how you were able to "kill" the running process trying to reach a dead, down, unavailable proxy?

      I've tried this, and once I attempt to access content via a down proxy in my array, I have to sit and wait until that proxy times out, before I can try again.

      I'm not sure how to capture that "downed" state, kill the process attempting to use that proxy, and then loop across another proxy until I find one that works.

        What about just setting a shorter timeout?
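A rough sketch of the shorter-timeout idea, combined with falling through to the next proxy on failure. LWP::UserAgent's timeout defaults to 180 seconds, which would explain the long wait on a dead proxy. The helper name fetch_via_any_proxy and the 5-second value are assumptions; $ua is expected to be an LWP::UserAgent object. Note that the LWP timeout is an inactivity timeout, so a single attempt can still take somewhat longer than the number you set.

```perl
use strict;
use warnings;
use List::Util qw(shuffle);

# Try the proxies in random order and return the first successful response
# together with the proxy that produced it. Returns an empty list if every
# proxy fails. Hypothetical helper, not part of LWP.
sub fetch_via_any_proxy {
    my ($ua, $url, $proxies, $timeout) = @_;
    $ua->timeout($timeout);                  # seconds of inactivity allowed
    for my $proxy (shuffle @$proxies) {
        $ua->proxy(['http', 'ftp'], "http://$proxy");
        my $res = $ua->get($url);
        return ($res, $proxy) if $res->is_success;
    }
    return;
}

# Usage, assuming LWP::UserAgent is installed and @proxies is already loaded:
#   use LWP::UserAgent;
#   my $ua = LWP::UserAgent->new;
#   my ($res, $proxy) = fetch_via_any_proxy($ua, 'http://www.example.com/', \@proxies, 5);
#   die "no working proxy found\n" unless $res;
```

There is nothing to "kill" here: a dead proxy just produces an unsuccessful response once the timeout expires, and the loop moves on.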

        non-Perl: Andy Ford

Re: Testing proxy "health"
by lorn (Monk) on Oct 22, 2007 at 16:09 UTC
Re: Testing proxy "health"
by andyford (Curate) on Oct 22, 2007 at 15:53 UTC

    Unfortunately you're not going to be able to guarantee a proxy's functionality unless you actually use it to access something.
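One way to "actually use it" is to fetch a page whose content you already know through the proxy and check that the expected text comes back. A minimal sketch follows, assuming $ua is an LWP::UserAgent; the helper name proxy_is_usable, the test URL and the expected text are placeholders you would choose yourself. Checking the body as well as the status is there because a 200 alone doesn't prove the page came from the real site rather than from the proxy itself.

```perl
use strict;
use warnings;

# Return 1 if a GET for $test_url through $proxy succeeds and the body
# contains $expected_text, otherwise 0. Hypothetical helper.
sub proxy_is_usable {
    my ($ua, $proxy, $test_url, $expected_text) = @_;
    $ua->proxy('http', "http://$proxy");
    my $res = $ua->get($test_url);
    return 0 unless $res->is_success;        # covers 403, 500, timeouts, refused connections
    my $body = $res->decoded_content;
    return 0 unless defined $body;
    return index($body, $expected_text) >= 0 ? 1 : 0;
}

# Usage, assuming LWP::UserAgent is installed:
#   use LWP::UserAgent;
#   my $ua = LWP::UserAgent->new(timeout => 5);
#   my @usable = grep { proxy_is_usable($ua, $_, 'http://www.example.com/', 'Example') } @proxies;
```

LWP reports connection failures as an internally generated 500 response, so is_success is false for a proxy that is down as well as for one that refuses the request.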

    If you're doing your connections on a periodic basis, I think you just need to keep a list of possibilities and send yourself alerts to refresh the list if it hits some low threshold as you drop non-responding proxies from the pool.
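The pool-with-threshold idea could be sketched like this. The name refresh_pool and the threshold of 2 are invented for the example, and the liveness check is passed in as a code ref so you can plug in whatever test you trust. Here the "alert" is just a warn; mailing yourself would be the obvious replacement.

```perl
use strict;
use warnings;

# Keep only the proxies for which $is_alive->($proxy) is true, and warn
# when fewer than $min_size remain. Returns the surviving proxies.
# Hypothetical helper for illustration.
sub refresh_pool {
    my ($pool, $is_alive, $min_size) = @_;
    my @alive = grep { $is_alive->($_) } @$pool;
    warn sprintf("Proxy pool is down to %d (minimum %d), time to refresh the list\n",
                 scalar @alive, $min_size)
        if @alive < $min_size;
    return @alive;
}

# Stand-in liveness check: pretend only one placeholder address still answers.
my %responding = ('10.0.0.2:3128' => 1);
my @pool = ('10.0.0.1:8080', '10.0.0.2:3128', '10.0.0.3:8000');
@pool = refresh_pool(\@pool, sub { $responding{ $_[0] } }, 2);
print "Live proxies: @pool\n";
```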

    Update: Wording updates. Remove syllogism.

    non-Perl: Andy Ford