PerlMonks  

[Resolved] Proxy link rotation

by kazak (Beadle)
on Jan 27, 2012 at 14:20 UTC ( [id://950376]=perlquestion )

kazak has asked for the wisdom of the Perl Monks concerning the following question:

Resolved:

First of all, thanks everyone for your help.

1. The client used up all the traffic on the rented proxies without warning me about it, so LWP::UserAgent was not able to use the rented parent proxies.

2. In the http://proxy... link, the last slash was missing. I tried adding and deleting it, but that was useless because of the "traffic issue". When that issue was resolved, I added the slash again and... voila, it works! In other words:

    my $cur_proxy = "http://$valid_routes[$j]";   # <--- Not working
    my $cur_proxy = "http://$valid_routes[$j]/";  # <--- Works like a charm

Maybe it's not supposed to be an issue, but for me it only works with the closing slash, at least when I tried it.

I have some remote proxies for my needs, and I'm trying to use them simultaneously. The script should change the proxy used for serving requests before each new request. Proxy IPs are stored in an array and should then be chosen randomly. I tried this approach for UserAgent rotation and it worked, but it's not working for proxy rotation. I hope someone can help me with this, thanks in advance.
    my (@agents, @raw_routes, @valid_routes);

    open( AGENTS, "<", "/etc/squid/ua.cfg" );
    while( <AGENTS> ) {
        s/#.*//;
        next if /^(\s)*$/;
        chomp;
        push @agents, $_;
    }
    close(AGENTS);

    open( ROUTES, "<", "/etc/squid/repeater/lib/routes.cfg" );
    while( <ROUTES> ) {
        s/#.*//;
        next if /^(\s)*$/;
        chomp;
        push @raw_routes, "$_,ENABLED";
    }
    close(ROUTES);

    my $ua = LWP::UserAgent->new();

    sub cb {
        my($request, $ua, $h) = @_;
        $#valid_routes = -1;
        my $i = $#agents + 1;
        $i = rand($i);
        $i = int $i;
        foreach my $item (@raw_routes) {
            $item =~ s/,ENABLED//;
            push @valid_routes, $item unless $item =~ m/DISABLED/;
        }
        my $j = $#valid_routes + 1;
        $j = rand($j);
        $j = int $j;
        my $cur_agent = $agents[$i];
        my $cur_proxy = "http://$valid_routes[$j]";
        $request->proxy_authorization_basic( 'uname', 'passwd');
        $ua->proxy(['http'], $cur_proxy);
        $request->header('User-Agent' => $cur_agent);
        $ua->timeout(120);
    }

    $ua->add_handler( request_preprepare => \&cb, { m_method => 'GET' } );
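The selection step described in the question (filter the route list, pick one entry at random, build the proxy URL with the closing slash) can be sketched in core Perl without LWP. This is only an illustration under assumptions: the helper names `parse_routes`, `pick_random` and `proxy_url`, the sample addresses, and the idea that a disabled route carries the word DISABLED on its line are all hypothetical and not taken from the original script.

```perl
use strict;
use warnings;

# Hypothetical helper: keep non-empty, non-comment, non-DISABLED lines.
sub parse_routes {
    my @lines = @_;
    my @routes;
    for my $line (@lines) {
        ( my $route = $line ) =~ s/#.*//;    # strip comments on a copy
        $route =~ s/^\s+|\s+$//g;            # trim whitespace and the newline
        next if $route eq '' or $route =~ /DISABLED/;
        push @routes, $route;
    }
    return @routes;
}

# Hypothetical helper: pick one element uniformly at random.
sub pick_random {
    my @list = @_;
    return $list[ rand @list ];    # rand gets the element count, the index is truncated
}

# Build the proxy URL with the closing slash that worked for the original poster.
sub proxy_url {
    my ($route) = @_;
    return "http://$route/";
}

# Sample data (assumed format, one host:port per line)
my @routes = parse_routes(
    "# parent proxies\n",
    "10.0.0.1:3128\n",
    "10.0.0.2:3128 DISABLED\n",
    "\n",
    "10.0.0.3:3128   # backup\n",
);

my $cur_proxy = proxy_url( pick_random(@routes) );
print "$cur_proxy\n";
```

Working on a copy of each line avoids modifying the source array through the loop alias, which the `foreach my $item (@raw_routes)` loop in the question does.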

Replies are listed 'Best First'.
Re: Proxy link rotation
by JavaFan (Canon) on Jan 27, 2012 at 15:28 UTC
    You say it's not working, without explaining what the "not working" is. Does it fail to compile? Does it die? Doesn't it use a proxy? Doesn't it rotate? Can't it read the file? Can it be something as simple as not having permission to read "/etc/squid/repeater/lib/routes.cfg"? You're not checking whether the open succeeds or not.

    Can you rephrase your question and 1) explain what goes wrong (that is, what are you seeing that makes you think "it doesn't work"), and 2) provide a short, stand-alone, runnable piece of code that shows the problem?
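The point about unchecked `open` calls can be illustrated with a small core-Perl sketch. The helper name `read_config` and the temporary file are assumptions made for the demo; the comment and blank-line handling mirrors the loops in the question.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: read a config file, dying with the OS error
# message if the file cannot be opened.
sub read_config {
    my ($path) = @_;
    open( my $fh, '<', $path ) or die "Cannot open $path: $!\n";
    my @entries;
    while ( my $line = <$fh> ) {
        $line =~ s/#.*//;            # strip comments
        next if $line =~ /^\s*$/;    # skip blank lines
        chomp $line;
        push @entries, $line;
    }
    close($fh);
    return @entries;
}

# Demo on a temporary file so the sketch does not depend on /etc/squid
my ( $tmp_fh, $tmp_path ) = tempfile( UNLINK => 1 );
print {$tmp_fh} "# comment\n10.0.0.1:3128\n\n10.0.0.2:3128\n";
close($tmp_fh);

my @entries = read_config($tmp_path);
print scalar(@entries), " entries read\n";

# A missing file now produces a visible error instead of an empty list
my @missing = eval { read_config("$tmp_path.does-not-exist") };
print "Error: $@" if $@;
```

With the check in place, a permission or path problem shows up as an error message rather than as a silently empty `@raw_routes`.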

      Sorry JavaFan, my bad again, so:

      1. Not working:

      When I try to use a random array element for rotation, the proxy goes direct, without using the parent proxies at all. But when I copy and paste any element of the "@valid_routes" array into my code and use it directly, everything goes OK and the proxy is used for serving requests. In other words:

      If I'm using:

      $cur_proxy[$j] - it's not working

      http://111.111.111.111:12345/ - it's working. This IP was pasted from a simple .txt file. As far as I can see, this should prove that the request is being prepared, that authorization on the parent proxy is going OK, and that the approach used for rotating the UserAgent string may be right. I also checked @valid_routes; it's filled before each request.

      So now I'm just running out of ideas about what it might be.
        I don't quite know what you're saying -- I cannot relate everything to the code you've posted. For instance, now you're talking about $cur_proxy[$j], as if @cur_proxy is an array, but in your original code, $cur_proxy is a string.

        Let me ask it again, can you provide us with a small, standalone piece of code that shows the erroneous behaviour?
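The scalar versus array distinction raised here can be checked mechanically. In Perl, `$cur_proxy[$j]` is an element of the array `@cur_proxy`, which is unrelated to the scalar `$cur_proxy`. The sketch below compiles a small snippet under `use strict` inside a string eval so the resulting error can be captured; the variable values are made up for illustration.

```perl
use strict;
use warnings;

# Compile the snippet in a string eval so the compile error can be
# captured instead of aborting this script.
my $snippet = <<'END';
use strict;
my $cur_proxy = "http://10.0.0.1:3128/";
my $j = 0;
my $picked = $cur_proxy[$j];   # element of @cur_proxy, not the scalar $cur_proxy
1;
END

my $ok    = eval $snippet;
my $error = $@;
print $ok ? "compiled\n" : "compile error: $error";
```

Under `use strict` the undeclared `@cur_proxy` is rejected at compile time; without strict, the expression would quietly evaluate to undef.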

Re: Proxy link rotation
by OlegG (Monk) on Jan 27, 2012 at 17:33 UTC
    Some time ago I wrote an LWP::UserAgent subclass whose main purpose is to automate proxy rotation for each request. It is not finished, not well tested and not documented yet, but you can try it: LWP::UserAgent::Proxified
    Simplest usage:
    use LWP::UserAgent::Proxified;

    my $ua = LWP::UserAgent::Proxified->new(
        proxylist => [
            ['http', 'https'] => 'http://10.0.0.1:1080',
            ['http', 'https'] => 'http://10.0.0.2:1080'
        ],
        proxyrand => 1, # choose random proxy for each request
        # other lwp options go here
    );
      I need to use 250-300 different proxies, and all of them require authorization. I also need a special filter for disabling and enabling proxies in the proxy list (something like a grey list), etc. Are there any mechanisms to implement these things within your class?
        I think you can:
        use strict;
        use LWP::UserAgent::Proxified;

        open( ROUTES, "<", "/etc/squid/repeater/lib/routes.cfg" )
            or die "open: $!";

        my @proxylist;
        while( <ROUTES> ) {
            s/#.*//;
            next if /^(\s)*$/;
            chomp;
            push @proxylist, http => "http://uname:passwd\@$_" if !/DISABLED/;
        }
        close(ROUTES);

        my $ua = LWP::UserAgent::Proxified->new(
            agent => undef,
            proxylist => \@proxylist,
            proxyrand => 1
        );

        # do the job
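The list-building part of this reply can be tried on its own, without the module installed. The sketch below only constructs the flat `scheme => url` list in the shape used in the reply above; the helper name `build_proxylist`, the sample addresses, and the assumption that a grey-listed route has the word DISABLED on its line are hypothetical. The credentials `uname` and `passwd` are the placeholders from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: turn route lines into a flat (scheme => url) list,
# embedding the credentials in each URL and skipping DISABLED entries.
sub build_proxylist {
    my ( $user, $pass, @lines ) = @_;
    my @proxylist;
    for my $line (@lines) {
        ( my $route = $line ) =~ s/#.*//;    # strip comments on a copy
        $route =~ s/^\s+|\s+$//g;            # trim whitespace and the newline
        next if $route eq '' or $route =~ /DISABLED/;
        push @proxylist, http => "http://$user:$pass\@$route";
    }
    return \@proxylist;
}

# Sample data (assumed format, one host:port per line)
my $proxylist = build_proxylist(
    'uname', 'passwd',
    "10.0.0.1:3128\n",
    "10.0.0.2:3128 DISABLED   # over quota\n",
    "# spare\n",
    "10.0.0.3:3128\n",
);

# Print the pairs, two elements at a time
for ( my $i = 0 ; $i < @$proxylist ; $i += 2 ) {
    print "$proxylist->[$i] => $proxylist->[$i + 1]\n";
}
```

Re-enabling a proxy then only means editing the routes file and rebuilding the list before constructing the user agent.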

Node Type: perlquestion [id://950376]
Approved by Eliya