Jeppe has asked for the wisdom of the Perl Monks concerning the following question:
Esteemed Perl professionals and enthusiasts alike,
I'm looking for a solution or a product. What I need to do is to scale an application across servers.
The preferred method of doing so would be to send hundreds (or a few thousand) messages per second to a queue, and then whoever is available will grab a message from the queue and process it. The messages are small, just a short string containing a table name and an ID.
All nodes will have a database connection to the same database, so I am indeed considering implementing this in a database table. But are there better alternatives out there?
I know about these solutions:
- Websphere MQ server - which is interesting but expensive, yet within reach for really large clients.
- Spread - which is not really suited for our purposes, as it sends one copy to each node.
- Spread::Queue - has not been maintained since 2002, not thread-safe, but looks OK otherwise.
- POE - a framework, so I need to create both server and client code.
Are there any I have missed that I should know about? What are your 2c?
Re: Distributed FIFO queues?
by Rhandom (Curate) on Aug 15, 2007 at 13:55 UTC
How on earth does that module send coderefs to other machines? That's magic.
Re: Distributed FIFO queues?
by Joost (Canon) on Aug 15, 2007 at 13:38 UTC
I've done something like that, using custom software (no framework modules, only IO::Socket/IO::Select etc) but it would probably be easy to implement using POE (provided you know POE already) for the Queue process - i.e. use a single-threaded single-process queue.
As it is now, the whole server code is about 400 lines of Perl, excluding the configuration parsing and daemonizing.
If the requests are simple, you can probably write the client code in only a few lines using IO::Socket::INET. (Set up a connection to the queue process, and for each request do a $connection->print(), possibly followed by a $connection->readline to check the status).
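A minimal sketch of such a client, under stated assumptions: the queue process is taken to speak a hypothetical line-based protocol (one message per line, one status line back per message), and `send_messages` is an illustrative name, not part of any existing module. It uses `getline`, the IO::Handle method for reading one line.

```perl
use strict;
use warnings;
use IO::Socket::INET;

# Assumption (not from the post): the queue process answers every message
# line with exactly one status line, e.g. "OK".
sub send_messages {
    my ($host, $port, @messages) = @_;

    # Set up one connection to the queue process and reuse it for all messages.
    my $connection = IO::Socket::INET->new(
        PeerAddr => $host,
        PeerPort => $port,
        Proto    => 'tcp',
    ) or die "Cannot connect to $host:$port: $@\n";

    my @statuses;
    for my $message (@messages) {
        # One message per line, e.g. "tablename id".
        $connection->print("$message\n");

        # Read the status line for this message.
        my $status = $connection->getline;
        die "Queue process closed the connection\n" unless defined $status;
        chomp $status;
        push @statuses, $status;
    }

    $connection->close;
    return @statuses;
}
```

A call might look like `my @statuses = send_messages('queuehost', 9000, 'orders 42');`, where the host name and port are placeholders.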
As for threads; as long as you're not using the same connection from multiple threads at the same time, there shouldn't be any problem.
Re: Distributed FIFO queues?
by exussum0 (Vicar) on Aug 15, 2007 at 15:21 UTC
The only bit I can contribute is making it message driven vs request driven. Push instead of poll. Poll systems have the added overhead of balancing the lag between requests against requesting too often.
In a very controlled, well administered messaging system, where jobs are "sent" out to others, you'll get uncanny, natural scaling vs a pull system where you're always tuning the pulling.
Polling is great for detecting when things happen with no sense of predictability, such as figuring out if a system was completely unplugged.
Consider that if you wish to create queue systems by hand.
- segue here -
Try putting permutations of those terms into google. You may find other systems as well. Tibco is supposed to be super awesome, but.. pricey.
Re: Distributed FIFO queues?
by rsmah (Scribe) on Aug 15, 2007 at 23:15 UTC
A few thousand messages per second is not trivial to implement. Luckily, the OpenAMQ project (http://www.openamq.org/) has scaled to handle 500,000 messages per second. It's supported by large institutions, it's open source and it looks well engineered. Take a gander. Bindings for perl are available (as well as Java, C#, C/C++, TCL, etc, etc).
that looks interesting but you are quoting from the planned/in-progress section of the faq. so there are currently no language bindings available. and the site only mentions 200k, not 500k.
Re: Distributed FIFO queues?
by hubb0r (Pilgrim) on Aug 15, 2007 at 20:32 UTC
I've had good luck using IPC::DirQueue for a FIFO queue that I run. It doesn't process as many items as you are talking about, but it is supposed to scale pretty well, and is transactional (like Websphere MQ). It can use NFS mounts (I think) to distribute the queues.
It definitely seems to be worth a look. It works wonders for me, and has been ultra-reliable.
Re: Distributed FIFO queues?
by rcaputo (Chaplain) on Aug 17, 2007 at 09:28 UTC
Re: Distributed FIFO queues?
by NiJo (Friar) on Aug 16, 2007 at 20:23 UTC
At a rate of 1k messages/sec you are limited by the round trip rate. Clients spend much of their time waiting for the next work package, before the database even comes into it.
I'd think about queuing work packages of 100 messages. That should be a lot easier on the common infrastructure.
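A rough sketch of that batching, assuming the messages are plain strings already held in memory on the producer side. `make_packages` is a hypothetical helper name, and the package size is a parameter, so 100 is only the starting point suggested above.

```perl
use strict;
use warnings;

# Group messages into work packages of at most $size messages each.
# Returns a list of array references, one per package.
sub make_packages {
    my ($size, @messages) = @_;
    die "Package size must be a positive integer\n"
        unless defined $size && $size =~ /^[1-9][0-9]*$/;

    my @packages;
    while (@messages) {
        # splice removes up to $size messages from the front of the list;
        # the last package may be smaller than $size.
        push @packages, [ splice(@messages, 0, $size) ];
    }
    return @packages;
}
```

With packages of 100, a rate of 1k messages/sec becomes roughly 10 queue round trips per second, at the cost of a worker holding up to 100 messages at a time.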
Re: Distributed FIFO queues?
by john_oshea (Priest) on Aug 21, 2007 at 10:42 UTC
A bit of a late reply, but you may find xmlBlaster worth a look - it seems to be in the rough ballpark, speed-wise (672 messages/s in 'acknowledge each message' mode, a couple of thousand/s in bulk mode) and has bindings for multiple languages ("PHP, Perl, Python, C, C++, C#, Visual Basic.net, Flash, J2ME, Java client samples are delivered in the xmlBlaster distribution.").
I've no idea how good it may prove - it's in my "could come in handy at some point" list - hope that helps.