http://qs321.pair.com?node_id=632737

Jeppe has asked for the wisdom of the Perl Monks concerning the following question:

Esteemed Perl professionals and enthusiasts alike,

I'm looking for a solution or a product. What I need to do is to scale an application across servers.

The preferred method of doing so would be to send hundreds (or a few thousand) of messages per second to a queue; whichever node is available then grabs a message from the queue and processes it. The messages are small, just a short string containing a table name and an ID.

All nodes will have a database connection to the same database, so I am indeed considering implementing this in a database table. But are there better alternatives out there?

I know about these solutions:

Are there any I have missed that I should know about? What are your 2c?

Re: Distributed FIFO queues?
by Rhandom (Curate) on Aug 15, 2007 at 13:55 UTC
    I believe the flavor of the year is Gearman by Brad Fitzpatrick. I think that it should have plenty of power to handle what you're looking for. There is also TheSchwartz (also by Brad Fitzpatrick), which may be more reliable but has higher latency. If you combine Gearman with TheSchwartz you should be able to do just about anything with job queuing.

    update - s/TheSwartz/TheSchwartz/ (thanks to clinton)

    my @a=qw(random brilliant braindead); print $a[rand(@a)];
      How on earth does that module send coderefs to other machines? That's magic.

      -Paul

        How on earth does that module send coderefs to other machines? That's magic.

        (me - confused... ) Um. Neither I nor the OP mentioned sending coderefs to other machines. If you are basing this off of the part where I said "should be able to do just about anything," I am sorry I wasn't more clear. I also think that it would not allow you to achieve world domination or various other things unrelated to message queuing (which I didn't spell out explicitly).

        On another note - I think that if you wanted to open up your system to arbitrary code execution, and if your coderefs were simple enough, you could pass them through the message queue using B::Deparse and eval.
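A minimal sketch of that idea, using only the core B::Deparse module. The helper names `code_to_text` and `text_to_code` are made up for illustration, and the queue itself is left out: the point is only that the coderef becomes a plain string on one side and is rebuilt with a string eval on the other.

```perl
use strict;
use warnings;
use B::Deparse;

# Sender side: turn a coderef into source text that fits in a queue message.
sub code_to_text {
    my ($code) = @_;
    return B::Deparse->new->coderef2text($code);
}

# Worker side: rebuild the coderef. This runs whatever text arrives,
# so it is only sane when every sender is fully trusted.
sub text_to_code {
    my ($text) = @_;
    my $code = eval "sub $text";
    die "could not rebuild coderef: $@" unless ref $code eq 'CODE';
    return $code;
}

my $double  = sub { my ($n) = @_; return $n * 2 };
my $message = code_to_text($double);    # plain string, ready to enqueue
my $rebuilt = text_to_code($message);
print $rebuilt->(21), "\n";
```

Anything the original sub closed over (lexical variables, open handles) does not travel with the text, which is one reason this only suits simple coderefs.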

        I guess really that module doesn't do anything that you don't tell it to.

        my @a=qw(random brilliant braindead); print $a[rand(@a)];
Re: Distributed FIFO queues?
by Joost (Canon) on Aug 15, 2007 at 13:38 UTC
    I've done something like that, using custom software (no framework modules, only IO::Socket/IO::Select etc) but it would probably be easy to implement using POE (provided you know POE already) for the Queue process - i.e. use a single-threaded single-process queue.

    As it is now, the whole server code is about 400 lines of Perl, excluding the configuration parsing and daemonizing.
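For a feel of what a single-threaded, single-process queue looks like, here is a toy sketch built on IO::Socket::INET and IO::Select. It is not the 400-line server described above: the `PUT`/`GET` line protocol, the sub names and the port number are all invented for illustration, and there is no persistence, configuration or daemonizing.

```perl
use strict;
use warnings;
use IO::Socket::INET;
use IO::Select;

my @queue;    # in-memory FIFO: push at the tail, shift from the head

# Hypothetical protocol: "PUT <payload>" appends, "GET" hands out the oldest.
sub handle_line {
    my ($line) = @_;
    $line =~ s/\r?\n\z//;
    if ($line =~ /\APUT (.+)\z/) {
        push @queue, $1;
        return "OK\n";
    }
    if ($line eq 'GET') {
        return @queue ? "MSG " . shift(@queue) . "\n" : "EMPTY\n";
    }
    return "ERR unknown command\n";
}

# One process, one select loop. Simplification: getline blocks if a
# client sends only part of a line.
sub run_server {
    my ($port) = @_;
    my $listener = IO::Socket::INET->new(
        LocalPort => $port,
        Listen    => 16,
        ReuseAddr => 1,
        Proto     => 'tcp',
    ) or die "cannot listen on port $port: $!";
    my $select = IO::Select->new($listener);
    while (my @ready = $select->can_read) {
        for my $fh (@ready) {
            if ($fh == $listener) {
                my $client = $listener->accept;
                $select->add($client) if $client;
                next;
            }
            my $line = $fh->getline;
            if (defined $line) {
                $fh->print(handle_line($line));
            }
            else {
                $select->remove($fh);    # client hung up
                $fh->close;
            }
        }
    }
}

# Exercise the queue logic without opening a socket.
print handle_line("PUT orders:42\n");
print handle_line("GET\n");
print handle_line("GET\n");
# run_server(7000);    # hypothetical port; blocks forever
```

Because only one process touches `@queue`, no locking is needed, which is the main attraction of the single-process design.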

    If the requests are simple, you can probably write the client code in only a few lines using IO::Socket::INET. (Set up a connection to the queue process, and for each request do a $connection->print(), possibly followed by a $connection->readline to check the status).
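A rough sketch of such a client follows. The one-line request format and the `OK` status are assumptions, as are the sub names; the code reads the reply with `getline`, the line-reading method from IO::Handle. So that the example runs on its own, a throwaway forked listener on a local port stands in for the real queue process.

```perl
use strict;
use warnings;
use IO::Socket::INET;

# Open one connection to the queue process and keep it for all requests.
sub connect_to_queue {
    my ($host, $port) = @_;
    my $conn = IO::Socket::INET->new(
        PeerAddr => $host,
        PeerPort => $port,
        Proto    => 'tcp',
    ) or die "cannot connect to $host:$port: $!";
    $conn->autoflush(1);
    return $conn;
}

# Send one request line and return the status line (hypothetical protocol).
sub send_request {
    my ($conn, $request) = @_;
    $conn->print("$request\n");
    my $status = $conn->getline;
    die "queue closed the connection" unless defined $status;
    chomp $status;
    return $status;
}

# Demo only: a stand-in listener that answers every line with "OK".
my $listener = IO::Socket::INET->new(
    LocalAddr => '127.0.0.1',
    LocalPort => 0,              # let the OS pick a free port
    Listen    => 1,
    Proto     => 'tcp',
) or die "cannot listen: $!";

my $pid = fork();
die "fork failed: $!" unless defined $pid;
if ($pid == 0) {
    my $peer = $listener->accept or exit 1;
    while (defined(my $line = $peer->getline)) {
        $peer->print("OK\n");
    }
    exit 0;
}

my $conn   = connect_to_queue('127.0.0.1', $listener->sockport);
my $status = send_request($conn, 'PUT orders:42');
print "$status\n";
$conn->close;
waitpid $pid, 0;
```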

    As for threads: as long as you're not using the same connection from multiple threads at the same time, there shouldn't be any problem.

Re: Distributed FIFO queues?
by exussum0 (Vicar) on Aug 15, 2007 at 15:21 UTC
    The only bit I can contribute is to make it message driven rather than request driven: push instead of poll. Polling systems carry the added overhead of balancing the lag between requests against requesting too often.
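To put rough numbers on that polling trade-off, here is a back-of-the-envelope sketch. It assumes a fixed polling interval and messages arriving at moments unrelated to the polling schedule, so a lone poller notices a message half an interval late on average. The worker count and the intervals are made-up figures.

```perl
use strict;
use warnings;

# Assumption: a lone poller with a fixed interval; a message arriving at a
# random moment waits half the interval on average before it is noticed.
sub mean_lag_seconds {
    my ($interval) = @_;
    return $interval / 2;
}

# Requests per second hitting the queue from $workers pollers,
# whether or not any work is waiting.
sub polls_per_second {
    my ($workers, $interval) = @_;
    return $workers / $interval;
}

for my $interval (0.01, 0.1, 1) {
    printf "interval %.2fs: mean lag %.3fs, %.0f polls/s from 20 workers\n",
        $interval, mean_lag_seconds($interval), polls_per_second(20, $interval);
}
```

Shrinking the interval cuts the lag but multiplies the empty requests, which is the tuning a push design avoids.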

    In a very controlled, well administered messaging system, where jobs are "sent" out to others, you'll get uncanny, natural scaling vs a pull system where you're always tuning the pulling.

    Polling is great for detecting when things happen with no sense of predictability, such as figuring out if a system was completely unplugged.

    Consider that if you wish to create queue systems by hand.

    - segue here -

    Try putting permutations of those terms into Google. You may find other systems as well. Tibco is supposed to be super awesome, but pricey.

Re: Distributed FIFO queues?
by rsmah (Scribe) on Aug 15, 2007 at 23:15 UTC
    A few thousand messages per second is not trivial to implement. Luckily, the OpenAMQ project (http://www.openamq.org/) has scaled to handle 500,000 messages per second. It's supported by large institutions, it's open source and it looks well engineered. Take a gander. Bindings for Perl are available (as well as Java, C#, C/C++, TCL, etc, etc).
      That looks interesting, but you are quoting from the planned/in-progress section of the FAQ, so there are currently no language bindings available. The site also only mentions 200k, not 500k.
Re: Distributed FIFO queues?
by hubb0r (Pilgrim) on Aug 15, 2007 at 20:32 UTC
    I've had good luck using IPC::DirQueue for a FIFO queue that I run. It doesn't process as many items as you are talking about, but it is supposed to scale pretty well, using NFS mounts (I think) to distribute the queues, and it is transactional (like WebSphere MQ).

    It definitely seems to be worth a look. It works wonders for me, and has been ultra-reliable.
Re: Distributed FIFO queues?
by rcaputo (Chaplain) on Aug 17, 2007 at 09:28 UTC
Re: Distributed FIFO queues?
by NiJo (Friar) on Aug 16, 2007 at 20:23 UTC
    At a rate of 1k messages/sec you are limited by the round-trip rate. Clients spend much of their time waiting for the next work package, and that is before even considering the database.

    I'd think about queuing work packages of 100 messages. That should be a lot easier on the common infrastructure.
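A small sketch of that batching idea, with hypothetical helper names and made-up `table:id` payloads. It groups messages into packages of 100 and compares how many queue round trips per second are needed with and without batching.

```perl
use strict;
use warnings;
use POSIX qw(ceil);

# Split a list of messages into work packages of at most $size messages each.
sub make_packages {
    my ($size, @messages) = @_;
    die "package size must be at least 1" if $size < 1;
    my @packages;
    push @packages, [ splice @messages, 0, $size ] while @messages;
    return @packages;
}

# Queue round trips per second needed to hand out $rate messages per second.
sub round_trips_per_second {
    my ($rate, $size) = @_;
    return ceil($rate / $size);
}

my @messages = map { "orders:$_" } 1 .. 1000;    # hypothetical payloads
my @packages = make_packages(100, @messages);

printf "%d packages, first holds %d messages\n",
    scalar @packages, scalar @{ $packages[0] };
printf "round trips/s at 1000 msg/s: %d unbatched, %d batched\n",
    round_trips_per_second(1000, 1), round_trips_per_second(1000, 100);
```

The cost is coarser load balancing: a slow worker now holds up to 100 messages at a time instead of one.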

Re: Distributed FIFO queues?
by john_oshea (Priest) on Aug 21, 2007 at 10:42 UTC

    A bit of a late reply, but you may find xmlBlaster worth a look - it seems to be in the rough ballpark, speed-wise (672 messages/s in 'acknowledge each message' mode, a couple of thousand/s in bulk mode) and has bindings for multiple languages ("PHP, Perl, Python, C, C++, C#, Visual Basic.net, Flash, J2ME, Java client samples are delivered in the xmlBlaster distribution.").

    I've no idea how good it may prove - it's in my "could come in handy at some point" list - hope that helps.