http://qs321.pair.com?node_id=514755


in reply to OT: Scalable web application architecture

It sounds like you either need a substantial rethink of how your application works and does its work or you need some database optimization foo.

The former is hard to guess at without more information on the system, but the latter is easy enough to take a few shot at.

You might find the slides from my ApacheCon/MySQL UC/OSCON talk useful.

- ask (available for short contracts working on that sort of problems)

  • Comment on Re: OT: Scalable web application architecture

Replies are listed 'Best First'.
Re^2: OT: Scalable web application architecture
by badaiaqrandista (Pilgrim) on Dec 07, 2005 at 11:40 UTC
    # Are you using MyISAM tables? Consider switching the tables that are getting "congested" to InnoDB.

    Yes I am. But I did change to InnoDB but I got more "Can't connect" error than when using MyISAM tables, so I rolled it back to MyISAM. I haven't known much about InnoDB back then, so I probably didn't set its options correctly. The "Can't connect" error happen when multiple properties are trying to fill up the cache (see next answer).

    # Is it the caching table that's getting congested with reads and writes? Maybe you can do your caching more effectively? (Maybe use memcached?)

    Yes, it is. The cache is basically just a place to hold intermediate search results on a given date, so that future searches in that date don't have to do all the computation to decide whether a room-package combination is available on that date.

    Following is a simplified list of the cache table fields:

    field nametype
    property_idINT
    dateDATE
    room_idINT
    package_idINT
    valid_priceBOOLEAN
    available_room_countINT
    room_capacityINT
    package_min_guestsINT
    package_max_guestsINT
    available_to_travel_agentBOOLEAN

    The search only need to execute one SELECT query on this table if the content reflect the actual data. But when the actual data is modified, updating this table takes a long time, because the UPDATE query competes with massive SELECT queries. The update becomes worse when the update must be done for each date.

    Therefore, simple cache mechanism like memcached is not suitable for this. I do use it to cache other fairly static data, though.

    # Maybe you can partition your data? Have "cat reservations" on one database server and "dog reservations" on another.

    Because I am the only programmer and the only DBA, I hesitate in doing this. Even now, I have difficulty maintaining such a large code base I've written in the past 6 month.

    # Also be sure to do all the "usual" MySQL optimization tricks (is it using the proper amount of memory? Do you have proper indexes on all tables? Enable the slow query log and check with EXPLAIN SELECT ... that all your queries are using indexes when possible. If it's not possible, rethink your table schema.

    That's why I am asking on perlmonks. I want to know what's the optimal database schema should look like. Currently I have one database for each property, hence we have more than 400 databases with the same schema. I know it could be better. About optimization, I have done as much as possible everything in the Optimization chapter of mysql documentation.

      I have some experience with an online reservation system.

      You say that you have 1 database for each property, 400 in total with the same schema. It sounds like you should have all the properties in a single database. You can then link all your pricing, seasons and availablity etc to each property using your unique property id.

      My guess is that you are getting slowed down with all the database connects required for each query. Plus you will be repeating the same query on each database (400 times ouch).


      mr greywolf

        That's how the database structure when I inherited the software. That's also why I created this 'search cache' thingy. I want to put all things in one table to speed up searching. But it becomes too overloaded with SELECT, INSERT, and UPDATE queries at the same time.

        Changing the whole database structure is something I don't want to do right now because we don't have automated test that can give me security in mind.

        badaiaqrandista