Re^2: Seeking advice on generating a syndication feed

by fizbin (Chaplain)
on Nov 16, 2005 at 16:41 UTC ( #509096=note )

in reply to Re: Seeking advice on generating a syndication feed
in thread Seeking advice on generating a syndication feed

In the past, we have found that adding an additional relational database to a system has had a negative impact on the reliability and maintainability of the system as a whole. I really don't want to add yet another thing for operations to have to monitor.

Although we are investigating database-based logging for other purposes, I also don't understand why it's better to turn data from a database query into HTML than it is to turn plain text sitting in files on the file system into HTML. In both cases, I query a module, get back some text, and turn it into HTML; the only difference is that the relational database needs more setup work.
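
To make the "in both cases" point concrete, here is a minimal sketch, written in Python rather than Perl purely for illustration. The function name `lines_to_html` and the sample messages are hypothetical and not taken from the system under discussion. The rendering step accepts any iterable of text lines, so it does not care whether they came from an open file or from the rows of a database query:

```python
import html


def lines_to_html(lines):
    """Render an iterable of plain-text log lines as an HTML unordered list.

    `lines` may be an open file handle, a list of strings, or text pulled
    out of database rows; the renderer only needs something iterable.
    """
    items = []
    for line in lines:
        text = line.rstrip("\n")
        if not text:
            continue  # skip blank lines
        # Escape &, <, > and quotes so log text cannot inject markup.
        items.append(f"  <li>{html.escape(text)}</li>")
    return "<ul>\n" + "\n".join(items) + "\n</ul>"


if __name__ == "__main__":
    # Hypothetical sample messages standing in for a log file's contents.
    sample = ["job <nightly> started\n", "\n", "job <nightly> finished\n"]
    print(lines_to_html(sample))
```

With a flat file the call would be something like `lines_to_html(open(path))`, where `path` is whatever log file the job wrote; with a database it would be the message column of a result set. Only the source of the lines changes.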

Am I missing something?

@/=map{[/./g]}qw/.h_nJ Xapou cets krht ele_ r_ra/; map{y/X_/\n /;print}map{pop@$_}@/for@/

Replies are listed 'Best First'.
Re^3: Seeking advice on generating a syndication feed
by Tanktalus (Canon) on Nov 16, 2005 at 17:01 UTC

    I'm not real fond of duplication of data, either. Ideally, you'd log directly to the database, as you mention you're investigating. The advantages are incredibly numerous:

    • You leverage (warning! Buzzword!) huge amounts of expertise that are likely already in your corporation on concepts such as reliability, security, scalability. That is, your DBAs can schedule backups and configure failover nodes, and even design for clustered servers. Most of this should be transparent to both the programs doing the logging and the programs that consume the logs (web app, whatever).
    • YOU don't need to worry about reliability, etc. Your RDBMS vendor has done that for you, and your DBA has been trained in how to do these things. Without the RDBMS, you now have to concern yourself with details like concurrency (writing to and reading from the same logfile) and transactional integrity (same thing - but imagine that the write is only partly finished when you try to read that record - an RDBMS is supposed to prevent that from happening). Or even power failures - if you're in the middle of a write when the power goes down, you end up with damaged data. An RDBMS is supposed to be able to either recover the damaged data or remove it (lost data - but you lose the whole transaction or none of it).
    • You aren't nearly as stuck on a single technology (e.g., Perl and CGI). You could, for example, give your DI folks a Java application using JDBC that could do different fancy things. This is a great thing if they get really finicky - you can swap out front ends without worrying about the back end, since the back end is completely standard. It's always nice to have a choice of tools - it allows you to select the best tool for the job. (If they're all on Windows, you could even use Visual Basic with ODBC, if that's what you have more skill in. Again with the choice of tools thing.)
    Imagine, for a minute, that this system becomes business critical. That means no unscheduled outages are acceptable. Are you prepared to go business critical with it? Phone calls at 3AM? Given your description of this service, I could see it becoming not only business critical, but a form of revenue. You don't want to be at the end of the "we're losing money without this working!" train. Being able to point fingers at the DBAs, who point at the DB vendor, is much more comforting. ;-)

      I keep hearing about how relational databases provide these benefits, and yet my experience is the opposite.

      This, especially, made me laugh:

      "YOU don't need to worry about reliability, etc. Your RDBMS vendor has done that for you, and your DBA has been trained in how to do these things."

      Maybe in some organizations there's a glut of experienced, helpful DBAs, but not here. We basically have this one guy. He's pretty good, but he's just one guy. Database backups and restores are known sources of operator error. I've never seen a relational database as reliable as a file system. On scalability, sure, I could believe that relational databases beat flat files there, but that's not a design consideration here (well, not yet). The concerns right now are reliability and the ability to "just work" without the operators needing to necessarily know how DI reads log messages.

      Don't get me wrong - I'm a believer in relational databases in some contexts, but they're also fragile beasts requiring lots of baggage to support. (I already get those 3AM phone calls relating to business critical processes that failed, and have no desire to introduce anything so temperamental into the system.)

      Relational logging is nice when you need the aggregation power, or the ability to say "show me the log messages issued from time A to time B by any process on any system" - that is, to perform queries across multiple jobs' log messages, but that's not what we need at all.
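
For reference, the kind of cross-job query described above might look like the following minimal sketch. It uses Python's built-in `sqlite3` module only as a stand-in, since the thread names no particular RDBMS. The table name `log`, its columns, the helper `messages_between`, and the sample rows are all hypothetical:

```python
import sqlite3


def messages_between(conn, start, end):
    """Return every log row with start <= ts <= end, from any host or job."""
    cur = conn.execute(
        "SELECT ts, host, job, message FROM log "
        "WHERE ts BETWEEN ? AND ? ORDER BY ts",
        (start, end),
    )
    return cur.fetchall()


# In-memory database with a hypothetical schema and made-up sample rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE log (ts TEXT, host TEXT, job TEXT, message TEXT)")
conn.executemany(
    "INSERT INTO log VALUES (?, ?, ?, ?)",
    [
        ("2005-11-16 16:00:00", "hostA", "job1", "started"),
        ("2005-11-16 16:30:00", "hostB", "job2", "row count mismatch"),
        ("2005-11-16 17:30:00", "hostA", "job1", "finished"),
    ],
)

# Timestamps in this fixed-width format sort correctly as plain text.
rows = messages_between(conn, "2005-11-16 16:15:00", "2005-11-16 17:00:00")
for row in rows:
    print(row)
```

Getting the same answer from per-job flat files means opening and parsing every file, which is the trade-off in question: the query is cheap to express in SQL, but only worth the extra moving parts if someone actually needs to ask it.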

      @/=map{[/./g]}qw/.h_nJ Xapou cets krht ele_ r_ra/; map{y/X_/\n /;print}map{pop@$_}@/for@/
Re^3: Seeking advice on generating a syndication feed
by ptum (Priest) on Nov 16, 2005 at 17:07 UTC
    To a guy with a hammer, everything looks like a nail -- I am guilty of that sometimes in applying databases to problems. :) I guess I would ask a few questions about your data, positive answers to which might lead me to prefer a database over plain text files:
    • Is there a lot of data?
    • Is the data scattered across multiple systems/platforms?
    • Does the data lend itself to aggregation or categorization?
    • Is the data more transient than I would prefer?
    • Are the rules to parse the data complex?
    • Is the server on which the data resides used for other mission-critical operations?
    • ... and so on
    If the answers to these questions are all 'no', then there may be no benefit to adding yet another database. But if any of the answers are 'yes', then you may see considerable benefit in terms of performance, reliability, maintainability, etc. Just a few thoughts. I've done this a couple of times and I have never (yet) said to myself, "Dang, I wish I had just done this with files." Maybe that just demonstrates that I am stubborn. :)