PerlMonks  

Re: Seeking advice on generating a syndication feed

by ptum (Priest)
on Nov 16, 2005 at 16:30 UTC


in reply to Seeking advice on generating a syndication feed

Why not create a simple relational database and redundantly log to that database whenever you log to a file? Then you can simply query your database from the webserver and get a customizable and real-time view of log activity without needing to re-parse your log files every time for every user. This approach would provide the ability to easily perform longer-term analysis of the log material as well as redundant backup of log information -- it also allows you to distribute the webserver and database on different platforms from your applications. Of course I know nothing about RSS feeds and this doesn't really answer your question, but it is perhaps an alternative you haven't considered. :)
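For what it's worth, here is a rough sketch of the "log to a file and to a database at the same time" idea. It assumes SQLite as the database and uses Python's standard logging module purely for brevity; the `SQLiteHandler` class, the `make_logger` helper, and the `log` table with its columns are all made up for illustration and are not taken from anyone's real system. It is also single-threaded as written, since a sqlite3 connection is by default tied to the thread that created it.

```python
import logging
import sqlite3


class SQLiteHandler(logging.Handler):
    """Hypothetical handler that mirrors each log record into a SQLite table."""

    def __init__(self, db_path):
        super().__init__()
        self.conn = sqlite3.connect(db_path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS log ("
            "created REAL, logger TEXT, level TEXT, message TEXT)"
        )
        self.conn.commit()

    def emit(self, record):
        try:
            with self.conn:  # commits on success, rolls back on error
                self.conn.execute(
                    "INSERT INTO log VALUES (?, ?, ?, ?)",
                    (record.created, record.name,
                     record.levelname, record.getMessage()),
                )
        except Exception:
            # A database problem should not take the application down.
            self.handleError(record)


def make_logger(name, file_path, db_path):
    """Return a logger that writes every record to both a file and the database."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.addHandler(logging.FileHandler(file_path))  # the existing flat-file log
    logger.addHandler(SQLiteHandler(db_path))          # the redundant database copy
    return logger
```

Something like `make_logger("nightly", "app.log", "log.db").info("job finished")` would then append a line to the file and a row to the table, and the web side could query the table instead of re-parsing the file.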

Re^2: Seeking advice on generating a syndication feed
by fizbin (Chaplain) on Nov 16, 2005 at 16:41 UTC

    In the past, we have found that adding an additional relational database to a system has had a negative impact on the reliability and maintainability of the system as a whole. I really don't want to add yet another thing for operations to have to monitor.

    Although we are investigating database-based logging for other purposes, I also don't understand why it's better to turn data from a database query into html than it is to turn plain text sitting in files on the file system into html. In both cases, I query a module, get back some text, and turn it into html, except that in the case of using a relational database there's more setup work.

    Am I missing something?

    --
    @/=map{[/./g]}qw/.h_nJ Xapou cets krht ele_ r_ra/; map{y/X_/\n /;print}map{pop@$_}@/for@/

      I'm not real fond of duplication of data, either. Ideally, you'd log directly to the database, as you mention you're investigating. The advantages are incredibly numerous:

      • You leverage (warning! Buzzword!) huge amounts of expertise that are likely already in your corporation on concepts such as reliability, security, scalability. That is, your DBAs can schedule backups and configure failover nodes, and even design for clustered servers. Most of this should be transparent to both the programs doing the logging and the programs that consume the logs (web app, whatever).
      • YOU don't need to worry about reliability, etc. Your RDBMS vendor has done that for you, and your DBA has been trained in how to do these things. Without the RDBMS, you now have to concern yourself with details like concurrency (writing to and reading from the same logfile) and transactional integrity (same thing - but imagine that the write is only partly finished when you try to read that record - an RDBMS is supposed to prevent that from happening). Or even power failures - if you're in the middle of a write when the power goes down, you end up with damaged data. An RDBMS is supposed to be able either to recover the damaged data or to remove it (lost data - but you lose the whole transaction or none of it).
      • You aren't nearly as stuck on a single technology (e.g., perl and CGI). You could, for example, give your DI folks a java application using JDBC that could do different fancy things. This is a great thing if they get really finicky - you can swap out front ends without worrying about the back end since the back end is completely standard. It's always nice to have choice in your tools - it allows you to select the best tool for the job. (If they're all on Windows, you could even use Visual Basic with ODBC, if that's what you have more skill in. Again with the choice of tools thing.)
      Imagine, for a minute, that this system becomes business critical. That means no unscheduled outages are acceptable. Are you prepared to go business critical with it? Phone calls at 3AM? Given your description of this service, I could see it becoming not only business critical, but a source of revenue. You don't want to be at the end of the "we're losing money without this working!" train. Being able to point fingers at the DBAs who point at the DB vendor, that's much more comforting. ;-)
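To make the "whole transaction or none of it" point from the list above a bit more concrete, here is a small sketch assuming SQLite and Python's stdlib sqlite3 module. The `write_batch` function and the `log` table are hypothetical, and the raised `ValueError` merely stands in for a failure partway through a write; it does not simulate an actual power cut.

```python
import sqlite3


def write_batch(conn, rows):
    """Insert all rows in one transaction: either every row lands or none does."""
    with conn:  # sqlite3 commits on normal exit, rolls back if an exception escapes
        for created, message in rows:
            if message is None:
                # Stand-in for a failure in the middle of the write.
                raise ValueError("incomplete record")
            conn.execute(
                "INSERT INTO log (created, message) VALUES (?, ?)",
                (created, message),
            )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE log (created REAL, message TEXT)")
conn.commit()

write_batch(conn, [(1.0, "start"), (2.0, "done")])   # succeeds, both rows kept
try:
    write_batch(conn, [(3.0, "half"), (4.0, None)])  # fails on the second row
except ValueError:
    pass

# The "half" row from the failed batch was rolled back along with it.
count = conn.execute("SELECT COUNT(*) FROM log").fetchone()[0]
messages = [m for (m,) in conn.execute("SELECT message FROM log ORDER BY created")]
```

A reader querying the table never sees the first half of the failed batch, which is the behaviour a flat file gives you only if you build it yourself.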

        I keep hearing about how relational databases provide these benefits, and yet my experience is the opposite.

        This, especially, made me laugh:

        YOU don't need to worry about reliability, etc. Your RDBMS vendor has done that for you, and your DBA has been trained in how to do these things.
        Maybe in some organizations there's a glut of experienced, helpful DBAs, but not here. We basically have this one guy. He's pretty good, but he's just one guy. Database backups and restores are known sources of operator error. I've never seen a relational database as reliable as a file system. On scalability, sure, I could believe that relational databases beat flat files there, but that's not a design consideration here (well, not yet). The concerns right now are reliability and the ability to "just work" without the operators needing to necessarily know how DI reads log messages.

        Don't get me wrong - I'm a believer in relational databases in some contexts, but they're also fragile beasts requiring lots of baggage to support. (I already get those 3AM phone calls relating to business critical processes that failed, and have no desire to introduce anything so temperamental into the system.)

        Relational logging is nice when you need the aggregation power, or the ability to say "show me the log messages issued from time A to time B by any process on any system" - that is, to perform queries across multiple jobs' log messages, but that's not what we need at all.
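For readers who haven't seen it, the kind of cross-job query meant here looks roughly like the sketch below. It assumes SQLite via Python's stdlib sqlite3 module and ISO 8601 timestamp strings (which sort correctly as plain text); the `log` table, its columns, the `messages_between` helper, and the three sample rows are all invented for illustration.

```python
import sqlite3


def messages_between(conn, start, end):
    """Return (created, host, process, message) rows with start <= created < end."""
    return conn.execute(
        "SELECT created, host, process, message FROM log "
        "WHERE created >= ? AND created < ? ORDER BY created",
        (start, end),
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE log (created TEXT, host TEXT, process TEXT, message TEXT)"
)
# Made-up sample rows from two hosts and two processes.
conn.executemany(
    "INSERT INTO log VALUES (?, ?, ?, ?)",
    [
        ("2005-11-16T16:00:00", "hostA", "loader", "started"),
        ("2005-11-16T16:30:00", "hostB", "report", "started"),
        ("2005-11-16T17:10:00", "hostA", "loader", "finished"),
    ],
)
conn.commit()

# Everything logged by any process on any host in the chosen window.
hits = messages_between(conn, "2005-11-16T16:15:00", "2005-11-16T17:00:00")
```

With per-job flat files the same question means opening and scanning every file, which is exactly the aggregation case where the database earns its keep.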

        --
        @/=map{[/./g]}qw/.h_nJ Xapou cets krht ele_ r_ra/; map{y/X_/\n /;print}map{pop@$_}@/for@/
      To a guy with a hammer, everything looks like a nail -- I am guilty of that sometimes in applying databases to problems. :) I guess I would ask a few questions about your data, positive answers to which might lead me to prefer a database over plain text files:
      • Is there a lot of data?
      • Is the data scattered across multiple systems/platforms?
      • Does the data lend itself to aggregation or categorization?
      • Is the data more transient than I would prefer?
      • Are the rules to parse the data complex?
      • Is the server on which the data resides used for other mission-critical operations?
      • ... and so on
      If the answers to these questions are all 'no', then there may be no benefit to adding yet another database. But if any of the answers are 'yes', then you may see considerable benefit in terms of performance, reliability, maintainability, etc. Just a few thoughts. I've done this a couple of times and I have never (yet) said to myself, "Dang, I wish I had just done this with files." Maybe that just demonstrates that I am stubborn. :)
