InfiniteLoop has asked for the wisdom of the Perl Monks concerning the following question:
Greetings Monks,
In my latest project, we are planning to use a cluster of MySQL servers in a single-master/multiple-slave mode. Basically, we plan to do all writes (insert and update) to the master database, and have all reads (select) happen on the slave database(s).
In all my model classes I want all database access to happen seamlessly, i.e. the model class need not know which database it should connect to. I plan to write my own db access module, which subclasses DBI and overrides execute() for this.
Do you know of any easier ways? Are there any existing Perl modules for this?
Re: mysql cluster
by gmax (Abbot) on Aug 03, 2005 at 20:29 UTC
my own db access module, which subclasses DBI and overrides execute() for this
I advise you against doing this.
Even if you succeed in changing the host by overriding the execute() method (perhaps you should aim at prepare() instead), you would undermine any transaction that might be under way.
The MySQL replication scheme may be inconvenient in its division of write-on-master and read-from-slaves, but it's rather solid. In my experience, using it with transactional tables (InnoDB) guarantees replication of transactions without a glitch.
The solution you are proposing is filter-based, similar to the one offered by some commercial clusters (Emic Networks, for example), which lose their transaction capabilities in the bargain.
Remember that, when using transactions, you should direct the whole transaction to the master, including any read statements within it. If you don't, you may get completely wrong results.
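For instance, a read-then-update sequence must see the row as it is on the master, not a possibly stale copy on a slave. A minimal sketch, assuming $master is a DBI handle connected to the master (the table and column names are purely illustrative):

```perl
# The WHOLE transaction runs on the master handle, reads included.
# Reading the balance from a slave here could use a stale value.
$master->begin_work;
my ($balance) = $master->selectrow_array(
    'SELECT balance FROM accounts WHERE id = ?', undef, $id);
$master->do('UPDATE accounts SET balance = ? WHERE id = ?',
    undef, $balance - $amount, $id);
$master->commit;
```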
Instead of filtering, you could create an application wrapper that sends all modifying statements and transactions to the master, and rotates among the slaves for simple read statements.
I don't know of any (public) Perl modules that deal with this problem. I can only recommend reading Jeremy Zawodny's High Performance MySQL (there is a sample chapter about replication).
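The wrapper approach could be sketched like this. This is a hypothetical module (the name My::DBRouter and the constructor arguments are mine, not an existing CPAN distribution), kept deliberately minimal:

```perl
# Hypothetical sketch: one master handle for writes, a rotating list of
# slave handles for plain reads. Not a real CPAN module.
package My::DBRouter;
use strict;
use warnings;
use DBI;

sub new {
    my ($class, %args) = @_;
    # $args{master} is [$dsn, $user, $pass]; $args{slaves} is a list of same
    my $self = {
        master => DBI->connect(@{ $args{master} }, { RaiseError => 1 }),
        slaves => [ map { DBI->connect(@$_, { RaiseError => 1 }) }
                        @{ $args{slaves} } ],
        next   => 0,
    };
    return bless $self, $class;
}

# All writes -- and anything inside a transaction -- go to the master.
sub do_write {
    my ($self, $sql, @bind) = @_;
    return $self->{master}->do($sql, undef, @bind);
}

# Plain reads rotate among the slaves (simple round-robin).
sub do_select {
    my ($self, $sql, @bind) = @_;
    my $dbh = $self->{slaves}[ $self->{next}++ % @{ $self->{slaves} } ];
    return $dbh->selectall_arrayref($sql, undef, @bind);
}

1;
```

The model classes then call do_write() and do_select() and never see which host they hit. A real version would also need reconnect handling and a way to pin a session to the master during a transaction.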
Update
If you are tempted to solve your problem by looking at MySQL Cluster, be aware that it's a completely different database system. Every node is at the same time master and slave, but you pay for this luxury with a huge amount of RAM. According to the docs, in the current implementation you need enough RAM to cover twice the size of your data, plus 20%. Therefore, if your data is 10 GB, your cluster must have 24 GB of RAM available (divided among all data nodes). There are plans to change the in-memory architecture to something more resource-friendly, but don't hold your breath.
Re: mysql cluster
by Joost (Canon) on Aug 03, 2005 at 20:45 UTC
This introduction to MySQL Cluster seems to indicate that you don't need any special client code to use it. Note that this is *not* a master/slave replication setup, and that I haven't used it myself, but it might be worth checking out.
update: gmax pointed out that this setup uses in-memory databases and requires "huge amounts of RAM", so it might not be the solution you are looking for.
Re: mysql cluster
by aufflick (Deacon) on Aug 04, 2005 at 03:09 UTC
Is your main aim load sharing or redundancy?
The aim is to share the load of accessing data, hence many "read" databases. For now I plan to go with gmax's suggestion of having an application wrapper.
Re: mysql cluster
by dragonchild (Archbishop) on Aug 04, 2005 at 18:27 UTC
Similar to gmax's solution, I have a wrapper (that I've written myself) that provides, among other things, a very simple do_select() and do_write(). (It also builds the SQL and does a lot of other book-keeping for me.) So, when you create this wrapper, you'd specify a set of read_dbs and a write_db. do_select() would pick the correct read_db and do_write() would go against the write_db.
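The "pick the correct read_db" part is just round-robin selection, which can be illustrated without any database at all (the handle names below are placeholders, not real connections):

```perl
# Minimal illustration of the choice do_select() would make on each call.
use strict;
use warnings;

my @read_dbs = qw(slave1 slave2 slave3);
my $next = 0;

sub pick_read_db {
    my $db = $read_dbs[ $next % @read_dbs ];
    $next++;
    return $db;
}

print join(' ', map { pick_read_db() } 1 .. 5), "\n";
# slave1 slave2 slave3 slave1 slave2
```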
Now, it sounds like you're building some sort of distributed app, like a web application. It's not going to be simple to do this round-robin'ing across the read_dbs from within an Apache child process, since each child keeps its own state. And, frankly, I don't think you need to until you start hitting the 100 requests/second mark. That is, not with a good schema and well-tuned server parameters. And the master/slave relationship may actually SLOW you down when you're trying to speed things up.
Frankly, I'd build a basic DB server in some 2/3-tier layout and see if that's good enough. If it is, you're good. If it's not, then you have a reason to spend more money. Often, it's cheaper to build a 4-CPU Xeon w/8G of RAM than it is to develop some sort of round-robin'ing DBI wrapper. YMMV
My criteria for good software:
- Does it work?
- Can someone else come in, make a change, and be reasonably certain no bugs were introduced?