PerlMonks  

Making Timeout for Yourself

by mcogan1966 (Monk)
on Dec 31, 2003 at 19:46 UTC ( [id://317980]=perlmeditation )

I've been working on a project now for a couple of months. This project involves polling a number of websites and getting specific information from each page.

From the beginning of the coding, I was intent on using LWP to get the pages, but I ran into a bit of an issue. Though LWP does have a lot of functionality, I wasn't using most of it, and I ended up wasting a lot of CPU cycles waiting for LWP overhead.

So, I changed over to using HTTP::Lite. Speed improved greatly, cutting processing and waiting times almost in half. This is a good thing, since the project required the polling of multiple pages to occur simultaneously. It's an on-demand world and I have an on-demand-er client.

HTTP::Lite lacks one small bit of functionality that I needed, though: a way to check for a timeout. What to do? Do I muddle through without it, and have some occasions with 15-20 second waits for a page? Or do I go back to using the slower, but more reliable, LWP?

Answer: Neither.

The timeout for page loading is something that I was already considering in my code, and it was contained in a variable. Since I could pass that to LWP, why not use it in another way for HTTP::Lite?

Here is the HTTP::Lite version with a timeout:

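    use HTTP::Lite;
    use Time::HiRes qw(sleep);   # the builtin sleep would truncate .1 to 0

    sub HTTP_Request {
        my (undef, $timeout, $url) = @_;
        my $http = HTTP::Lite->new;
        my $st   = time();
        my $req;
        # Retry until request() returns a status or the time budget is spent.
        until ((time() - $st > $timeout) || ($req = $http->request($url))) {
            sleep(.1);
        }
        if (!defined $req || $req eq "") {
            return "Error: Timeout";
        }
        elsif ($req ne "200") {
            return "Error: " . $http->status_message();
        }
        else {
            return $http->body();
        }
    }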
Usual caveats apply here, such as declaring variables and 'use' statements. A bonus here is that with the sleep(.1) added, the code actually runs faster.

Replies are listed 'Best First'.
Re: Making Timeout for Yourself
by pg (Canon) on Dec 31, 2003 at 20:51 UTC

    LWP's timeout is nothing more than passing the timeout parameter to the socket it holds.

    The best solution here is to do the same thing as LWP does. A quick study of the source code shows that HTTP::Lite uses a raw socket instead of the IO::Socket class, so just use a setsockopt() call to set the various timeouts.

    The socket can handle timeouts very nicely for you, so don't do it yourself.

      That would be perfect, but I don't see how to properly implement that. I found how HTTP::Lite makes the socket, but I haven't found any way of using setsockopt() to set a timeout value. Any direction as to where to look to do this?
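As a rough illustration of the setsockopt() idea discussed above, here is a minimal sketch that sets a receive timeout on a raw socket. It is not taken from HTTP::Lite or LWP: the helper name set_read_timeout is made up for this example, the socket is a local socketpair rather than an HTTP connection, and the pack format assumes struct timeval is two native longs, which holds on common Unix builds but not everywhere. How to reach the socket that HTTP::Lite opens is not shown, since that depends on the module's internals.

```perl
use strict;
use warnings;
use Socket qw(AF_UNIX SOCK_STREAM PF_UNSPEC SOL_SOCKET SO_RCVTIMEO);

# Hypothetical helper: make reads on $sock give up after $seconds.
sub set_read_timeout {
    my ($sock, $seconds) = @_;
    my $whole = int($seconds);
    my $micro = int(($seconds - $whole) * 1_000_000);
    # Assumption: struct timeval is two native longs (common on Unix;
    # Windows expects a millisecond integer instead).
    my $timeval = pack('l!l!', $whole, $micro);
    setsockopt($sock, SOL_SOCKET, SO_RCVTIMEO, $timeval)
        or die "setsockopt: $!";
    return;
}

# Demo on a local socket pair: nothing is ever written, so the read
# can only end because of the timeout.
socketpair(my $reader, my $writer, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
    or die "socketpair: $!";
set_read_timeout($reader, 1);

my $nread = sysread($reader, my $buffer, 1024);
print defined $nread ? "read $nread bytes\n" : "read gave up: $!\n";
```

With a socket option like this in place, the blocking read itself returns with an error once the time is up, so no polling loop is needed around it.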
Re: Making Timeout for Yourself
by hardburn (Abbot) on Dec 31, 2003 at 20:19 UTC

    I don't understand. Won't $http->request() block until it gets something back or the underlying socket times out? How can you check the timeout if it blocks?

    ----
    I wanted to explore how Perl's closures can be manipulated, and ended up creating an object system by accident.
    -- Schemer

    : () { :|:& };:

    Note: All code is untested, unless otherwise stated
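The blocking problem raised here can be seen in a small sketch. The slow_request function is a hypothetical stand-in for $http->request() that simply sleeps before answering; no network or HTTP::Lite is involved. The loop has the same shape as the one in the original post.

```perl
use strict;
use warnings;

# Hypothetical stand-in for $http->request(): blocks for $delay seconds,
# then reports success.
sub slow_request {
    my ($delay) = @_;
    sleep $delay;
    return "200";
}

my $timeout = 1;
my $start   = time();
my $status;

# The elapsed-time test runs once before the call and cannot run again
# until the call returns, so it never gets a chance to cut the call short.
until ((time() - $start > $timeout) || ($status = slow_request(3))) {
    sleep 1;
}

my $elapsed = time() - $start;
print "status=$status after about ${elapsed}s (timeout was ${timeout}s)\n";
```

The call takes its full three seconds even though the timeout was one second, because the time check and the request are never running at the same moment.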

      Darnit, you're right.

      I reversed the conditions, and I seem to have blocked the timer check. Grr. And here I thought I had solved the problem.

      Any thoughts on how to get around this?

        Best way is to e-mail the author of HTTP::Lite with a patch that adds a way to set a timeout value (which should be easy) and sets that timeout value on the socket it opens (which should also be easy).

Re: Making Timeout for Yourself
by revdiablo (Prior) on Dec 31, 2003 at 20:12 UTC

    This is not meant to be a criticism of your code, as I really haven't looked at it much. If it works, and you understand it, then you should probably stick with it. If I were to do this, though, I think I would go with an alarm for the timeout. This is recommended by perldoc -q timeout, and is quite simple to use. Here's an example from perldoc -f alarm:

    eval {
        local $SIG{ALRM} = sub { die "alarm\n" }; # NB: \n required
        alarm $timeout;
        $nread = sysread SOCKET, $buffer, $size;
        alarm 0;
    };
    if ($@) {
        die unless $@ eq "alarm\n";   # propagate unexpected errors
        # timed out
    }
    else {
        # didn't
    }
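The same alarm pattern could be wrapped around any blocking call. The following is only a sketch under some assumptions: the helper name with_timeout is invented for this example, and a five-second select() wait stands in for a slow page fetch. In the setting of this thread the code reference might instead be something like sub { $http->request($url) }.

```perl
use strict;
use warnings;

# Hypothetical helper: run $code, giving up after $seconds.
# Returns (1, result) on completion, or (0) on timeout.
sub with_timeout {
    my ($seconds, $code) = @_;
    my $result;
    my $ok = eval {
        local $SIG{ALRM} = sub { die "alarm\n" };   # NB: \n required
        alarm $seconds;
        $result = $code->();
        alarm 0;
        1;
    };
    return (1, $result) if $ok;
    alarm 0;                           # make sure no alarm is left pending
    die $@ unless $@ eq "alarm\n";     # propagate unexpected errors
    return (0);
}

# A 5-second blocking wait stands in for a slow page fetch.
my ($done, $value) = with_timeout(1, sub { select(undef, undef, undef, 5); "200" });
print $done ? "finished: $value\n" : "timed out\n";

# A fast call completes normally.
my ($done2, $value2) = with_timeout(5, sub { "200" });
print $done2 ? "finished: $value2\n" : "timed out\n";
```

On a Unix-like system the first call should be cut short by the alarm and the second should return its value untouched.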

      I usually try to avoid things that are not cross-platform, unless there is no other alternative. (I am definitely not saying that you are wrong.)

      alarm() is not implemented on win32, and this is true up to 5.8.2.

      The best solution is to use the timeouts that the socket itself supports, not to implement one yourself.

        Indeed, I should have mentioned that alarm doesn't work on Windows. This is brought up in the perlfaq that I mentioned, and I thought about mentioning it, but for some reason never did.

        As for your comment about avoiding things "that are not cross-platform," I agree and disagree. In this case, using socket options to set a timeout is clearly the better option. I think so not just because it's cross-platform, but also because it's a mechanism that is already there. I see writing one's own timeout when a builtin timeout is supplied as reinventing the wheel.

        I think we should be careful about being cross-platform to a fault, though. If one was trying to make a timeout for a system call that didn't have its own way of doing so, I would recommend using alarm. Always aiming for complete portability is, in my mind, somewhat akin to premature optimization. If something can be written much cleaner, simpler, and better with an unportable method, then I wouldn't immediately frown on it. Just as in the case of optimizing things before an actual performance problem is encountered, I think it's bad to make things highly portable before the need to do so is encountered.

        That's not to say I wouldn't choose the portable method with all other things being equal, but portability often requires a bit of hoop-jumping that can be a problem. Just like most decisions of this nature, there is a balance that needs to be considered. Making a statement that you "try to avoid things that are not cross-platform" is a generalization that gets bandied about a lot, and while it is a noble goal, it isn't always practical.

      I understand what you are saying. And alarm is nice, in that way. In fact, that's what I started with. And I found out that using alarm took longer to process and used more CPU to accomplish the same task. My whole purpose for doing this was to reduce the amount of CPU usage and processing time. The code as it is now already has enough forking and such to clog up the CPU.

      Just because that's the way it's been done before doesn't mean that's the way we should keep doing it.
