http://qs321.pair.com?node_id=840678

back-n-black has asked for the wisdom of the Perl Monks concerning the following question:

I've never been that good at regular expressions. I want to parse many log entries and, ultimately, pull certain words out of the SQL-like expressions they contain.

For example:

$line = "05/04/2010 13:09:45 - A - somebody - ( ( my.my id >= 1 ) ) and ( ( is-relative.to code = 'sister' ) or ( is-relative.to code = 'brother' ) or ( is-mother.to code = 'dog' ) )";

What I ultimately need out of these strings is:

my.my id is-relative.to code is-relative.to code is-mother.to code

but something like this would be great!

( my.my id >= 1 ) ( is-relative.to code = 'sister' ) ( is-relative.to code = 'brother' ) ( is-mother.to code = 'dog' )

or

my.my id >= 1 is-relative.to code = 'sister' is-relative.to code = 'brother' is-mother.to code = 'dog'

I have been looking for a while for hints toward an elegant solution to this problem. There is much discussion of Text::Balanced, but the documentation does not have enough examples for my little brain to solve the riddle.

I have an example here that just pulls the expressions; I know what to do from there. I would like some ideas or code examples for a more elegant solution using one of the CPAN modules, if that is possible.

What it basically does is:

  1. Split the text at the first close paren
  2. Parse the expression out of this "before" text
    • Remove everything up to and including the last open paren
    • Remove any leading or trailing spaces
  3. Discard the "after" text up to and including its first open paren, then repeat the above operations on what remains

Here is a snippet of code that pulls the expressions:

use strict;
use warnings;

my $text = "05/04/2010 13:09:45 - A - somebody - ( ( my.my id >= 1 ) ) and ( ( is-relative.to code = 'sister' ) or ( is-relative.to code = 'brother' ) or ( is-mother.to code = 'dog' ) )";
my $new = $text;
my @list;

while ( 1 ) {
    # Split the text at the first close paren
    my $ind = index($new, ')');
    last if ( $ind < 0 );    # no close paren left at all
    my $before = substr($new, 0, $ind);
    my $after  = substr($new, $ind);
    last if ( $before eq "" );

    # Clean up the before string:
    # remove everything up to and including the last open paren,
    # then remove any leading or trailing spaces
    $before = substr($before, rindex($before, '(') + 1);
    $before =~ s/^\s+//;
    $before =~ s/\s+$//;
    push(@list, $before);

    if ( $after =~ /\)/ ) {
        # Discard chars up to and including the first open paren
        $after = substr($after, index($after, '(') + 1);
        $new = $after;
        print "\n";
    }
    else {
        last;
    }
}

foreach my $i (@list) {
    print "--" . $i . "--\n";
}
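
For comparison, the split-and-trim steps above might be condensed into a single global match. This is only a sketch using a plain regular expression rather than Text::Balanced, and it assumes the expressions of interest are always the innermost parenthesized groups and that values never contain parentheses themselves. The names innermost_groups and field_name are hypothetical helpers made up for this illustration, and the operator list in field_name is a guess at what the logs might contain.

```perl
use strict;
use warnings;

# Hypothetical helper: return the trimmed contents of every innermost
# parenthesized group. [^()] cannot cross a paren, so a group that
# contains nested parens never matches as a whole.
sub innermost_groups {
    my ($line) = @_;
    my @groups = $line =~ /\(\s*([^()]+?)\s*\)/g;
    return @groups;
}

# Hypothetical helper: keep the text to the left of the first comparison
# operator. The operator list is an assumption; returns the whole
# expression if no operator is found.
sub field_name {
    my ($expr) = @_;
    my ($field) = $expr =~ /^(.*?)\s*(?:>=|<=|<>|!=|=|<|>)/;
    return defined $field ? $field : $expr;
}

my $line = "05/04/2010 13:09:45 - A - somebody - ( ( my.my id >= 1 ) ) and ( ( is-relative.to code = 'sister' ) or ( is-relative.to code = 'brother' ) or ( is-mother.to code = 'dog' ) )";

my @exprs = innermost_groups($line);
print "--$_--\n" for @exprs;

# Field names only, space separated
print join(' ', map { field_name($_) } @exprs), "\n";
```

On the sample line this should yield the four expressions and then the field names on one line. Quoted values that themselves contain a paren (for example 'a(b)') would break the assumption and need a real parser.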