PerlMonks  

YAML for logs?

by blazar (Canon)
on Nov 25, 2006 at 11:24 UTC ( [id://586000]=perlquestion )

blazar has asked for the wisdom of the Perl Monks concerning the following question:

Since I've recently grown enthusiastic about YAML, for the happy combination of its ease of use as a data serialization format (through suitable packages) and its human readability, I've been considering using it for some logs.

Indeed, since you can have multiple YAML streams in one string/file, as a format it's good for appending. However when it comes to logs, you often want to check them. YAML::Syck, for example, lets you easily store and recover multiple streams. So far, that's fine. But what if one wants to continuously inspect the logs as new "records" are appended?

I am aware of the various strategies for dealing with "tail -f" kinda stuff, and I know how to search for more, but the problem here is that a single record would span multiple lines. I can imagine a few ways to overcome this, but they all more or less amount to doing some parsing by hand.

Or else, at least with regular enough data, would it suffice to set $/ to "--- "? (Although even in that case it would be at the beginning of each record rather than at the end, so I may still need to massage each record individually before processing it.) I would like to hear some opinions...
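To make the $/ idea concrete, here is a minimal sketch, not a tested recipe. It assumes every record starts with a "--- " header and that the string "--- " never occurs inside a value; the sample data and the read_records helper are made up for illustration. It uses only core Perl and leaves the actual YAML decoding (e.g. with YAML::Syck) out.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample: two records, each introduced by a "--- " header line.
my $log = "--- \nhost: a\nstatus: 200\n--- \nhost: b\nstatus: 404\n";

# Split text into records by treating "--- " as the input record separator.
# Assumption: "--- " never appears inside a record's own data.
sub read_records {
    my ($text) = @_;
    open my $fh, '<', \$text or die "Cannot open in-memory handle: $!";
    local $/ = "--- ";
    my @records;
    while ( my $chunk = <$fh> ) {
        chomp $chunk;                   # drops the trailing "--- ", if present
        next unless $chunk =~ /\S/;     # the first chunk is just the leading header
        push @records, "--- $chunk";    # massage: put the header back in front
    }
    close $fh;
    return @records;
}

my @records = read_records($log);
print scalar(@records), " records\n";
print "RECORD:\n$_" for @records;
```

The massaging mentioned in the question shows up in two places: the first chunk is empty and has to be skipped, and each remaining chunk needs its header restored. When following a growing file, the last record would also stay incomplete until the next "--- " arrives.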

Replies are listed 'Best First'.
Re: YAML for logs?
by Fletch (Bishop) on Nov 25, 2006 at 15:32 UTC

    My first thought (and keep in mind I'm a YAML-for-everything kind of guy) is . . . eewwww. I might see maybe having a field that's Base64 encoded YAML for some structured data about the event, but for pure logging I can't think that YAML would be an ideal, let alone good, solution. Then again I may just be so used to the *NIX-y line-at-a-time log file style I'm missing something.

      Good points. And I'm not really sure if I'm gonna need it. Or even want it. But then I can imagine situations, call them corner cases, in which that would be desirable. As always, desirability is in the eye of the beholder, but the situations I'm referring to are those in which one-line-at-a-time could become clumsy if you want to take human readability into account. Whatever, I'd still be interested in the feasibility of the thing, and in how one would go about accomplishing it.

Re: YAML for logs?
by diotalevi (Canon) on Nov 25, 2006 at 16:42 UTC

    I've seen people make the same arguments about using Lisp as a serialization format for log files. In your case, you lose if you use YAML because of the multiple version thing going on. Hopefully there's a stream reading YAML library. If so, just feed it a handle and go to town.

    Personally I hate YAML because I find it impossible to edit by hand even though people keep assuring me that it's designed for that. They're liars, all. I've found I nearly always need a computer to write the YAML for me. It's a stupid, stupid language and somehow, for some strange reason, it's the fad right now. It makes me want to puke on my shoes.

    ⠤⠤ ⠙⠊⠕⠞⠁⠇⠑⠧⠊

      Personally I hate YAML because I find it impossible to edit by hand even though people keep assuring me that it's designed for that. They're liars, all. I've found I nearly always need a computer to write the YAML for me. It's a stupid, stupid language and somehow, for some strange reason, it's the fad right now. It makes me want to puke on my shoes.

      I respect your opinion very much, so I'll take it into account and certainly won't dismiss it with an a priori "I don't agree". I may concede on the writing bit: it seems to me that it should be easy, but I've not really done that. Seriously, I mean. But here I was focusing more on the reading bit. For example, these are the last five lines of my (Apache's) access_log at http://blazar.perlmonk.org/:

      If I run them through

      #!/usr/bin/perl
      use strict;
      use warnings;
      use Parse::AccessLogEntry;
      use YAML::Syck;

      my $p = Parse::AccessLogEntry->new;
      chomp, print Dump $p->parse($_) while <>;
      __END__

      I get the following, which is only marginally more verbose than the original, but in a way that clears up the meaning of things, and is IMHO far more readable.

      I can imagine further "advantages" if some hierarchical info were to be included.

        I'm bitchy about YAML purely because the last dozen times I tried writing some by hand, I failed: YAML syntax is more obscure than my current understanding. I recall going to the YAML web site to find a quick description but there was just a big reference doc and that was more overhead than I wanted. YAML is supposed to be simple, right? Well it's not. Actually, I don't think YAML is supposed to be simple. I'd rather write my data in Perl or lisp. The former is common to everyone here and the latter is so simple that any moron can write it.

        ⠤⠤ ⠙⠊⠕⠞⠁⠇⠑⠧⠊

      You're kidding, right??!?

      I've given up sliced bread for breakfast in favour of YAML! I've done a number of projects which have a lot of configuration, and YAML makes this as easy as pie. Or sliced bread.

      Call me a liar, ... oh you did already... but which of these is easier to read and maintain:

      Granted, I can't see any purpose at all in using it for logging...

      I like YAML, but I've also never managed to write it by hand. I generally create the appropriate data structure in Perl, then serialize it to YAML and use that as a template for further files. See hack #12 in Perl Hacks, for example.

        It's simple, really. Try spending several minutes reading this cookbook.

        My only gripe with YAML is that I haven't found a module to manipulate YAML while _preserving_ the comments.

      {snip} and somehow, for some strange reason, it's the fad right now.

      It's popular with the Ruby crowd. I think Rails also uses it as an alternative to CSV files.

Re: YAML for logs?
by mattr (Curate) on Nov 26, 2006 at 07:38 UTC
    I have only used YAML a little as yet but am going to be using it a lot more because of Catalyst and also for non-Catalyst work e.g. for object template configuration or whatever. For simple things I use either Config::IniFiles or more often just a simple tab-separated text file with hash keys (for error messages) though these might all end up in YAML soon.

    I thought about setting the input record separator (see perlvar, perlrun) to "---\n" which can be done, but on the command line perl only seems able to set a single character separator with the command line switches. Then I looked at File::Tail. You might like to ask the author whether he is thinking of adding record separator support, if you have a good reason for it, and this seems like a good one (or do it yourself and submit it back to him later).

    Well I fiddled with the command line and got this far: cat test.yml|perl -nla -0777 -F/---/ -e 'foreach $s (@F) {print "***STREAM***$s\n";}'|more but it doesn't work with tail as far as I can see.

    So if you want to follow a file, I'd guess either build on File::Tail, or better yet roll your own by reading from a pipe within a perl program. In the past I've used an interactive shell based on Term::ReadLine to try out multiline scripts with the Gimp, and it worked great. Come to think of it what about ysh? ... and lo and behold I open /usr/bin/ysh and it is using Term::ReadLine. I would guess the easiest thing to do would be to just modify ysh. I got cat to work with it but not tail, not sure why. Also for some reason it gave a parse error (bad alias) for the long data structure posted in the thread.

    Oh, one more datapoint. Boulder is something like yaml made for bioperl, and used in piped workflows. So you aren't the first person to want to do this and it shouldn't be too hard. If I was doing this I would probably just roll my own program I think to watch a file and pull in lines, decoding from yaml when a separator is reached. Term::ReadLine might do it too. Bioperl does something like what you want and it might even work on yaml files as-is. It's used to slurp in long gene text files.
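For the "roll your own" route of watching a file, one well-known core-Perl approach is to keep the handle open and call seek($fh, 0, 1) before each read, which clears the EOF condition so newly appended lines become visible. The following is only a sketch under that assumption: read_new_lines is a made-up helper, the temp file stands in for a real log, and a real follower would sleep between polls and cope with a line that has been only partially written.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use IO::Handle;
use File::Temp qw(tempfile);

# Return the lines appended since the previous call.
# seek($fh, 0, 1) does not move the file position; it only resets EOF.
sub read_new_lines {
    my ($fh) = @_;
    seek $fh, 0, 1 or die "seek failed: $!";
    my @lines = <$fh>;
    return @lines;
}

# Demonstration on a throwaway temp file (a stand-in for a real log).
my ( $out, $path ) = tempfile( UNLINK => 1 );
$out->autoflush(1);
open my $in, '<', $path or die "Cannot open $path: $!";

print {$out} "--- \nhost: a\n";
my @first = read_new_lines($in);

print {$out} "--- \nhost: b\n";
my @second = read_new_lines($in);

print "first batch: ", scalar(@first), " lines\n";
print "second batch: ", scalar(@second), " lines\n";
```

This only delivers lines; grouping them into YAML records is a separate step.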

      I thought about setting the input record separator (see perlvar, perlrun) to "---\n" which can be done, but on the command line perl only seems able to set a single character separator with the command line switches.

      Well, whatever: I wouldn't be doing this from the command line anyway.

      Oh, one more datapoint. Boulder is something like yaml made for bioperl, and used in piped workflows. So you aren't the first person to want to do this and it shouldn't be too hard. If I was doing this I would probably just roll my own program I think to watch a file and pull in lines, decoding from yaml when a separator is reached.

      But the point is that "--- \n" wouldn't be a separator. It would rather be a sort of introductory line. The concept is close, but not quite the same. Which is why I wrote that I'd have to massage the chunks anyway. Perhaps, also in view of something like what was hinted at elsewhere, a multiline-per-record format based on a data serialization format "should" (from my blazarcentric POV) be paragraph oriented...
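The "introductory line" distinction can be handled by buffering: a record is only known to be complete once the next header shows up. Below is a rough sketch of that idea, assuming every record begins with a line starting with "---"; make_record_collector is a hypothetical helper, and the lines could come from any line-at-a-time source such as File::Tail. Decoding each finished record with a YAML module is left out.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build a closure that is fed one line at a time. It returns the text of the
# previous record as soon as a new "---" header arrives, and undef otherwise.
sub make_record_collector {
    my @pending;
    return sub {
        my ($line) = @_;
        my $done;
        if ( $line =~ /^---(?:\s|$)/ && @pending ) {
            $done    = join '', @pending;    # previous record is now complete
            @pending = ();
        }
        push @pending, $line;
        return $done;
    };
}

# Hypothetical input lines, as they might arrive from a followed log.
my $feed = make_record_collector();
for my $line ( "--- \n", "host: a\n", "--- \n", "host: b\n" ) {
    my $record = $feed->($line);
    print "RECORD:\n$record" if defined $record;
}
```

The drawback is visible in the demo: the second record is never printed, because nothing signals its end. A paragraph-oriented format, with a blank line closing each record, would avoid that delay.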

Node Type: perlquestion [id://586000]
Approved by Joost
Front-paged by andyford