I've never been that good at regular expressions. What I want to do is parse many log entries and, ultimately, pull the field names out of the SQL-like expressions they contain.
For example:
$line = "05/04/2010 13:09:45 - A - somebody - ( ( my.my id >= 1 ) ) and ( ( is-relative.to code = 'sister' ) or ( is-relative.to code = 'brother' ) or ( is-mother.to code = 'dog' ) )";
What I ultimately need out of these strings is:
my.my id
is-relative.to code
is-relative.to code
is-mother.to code
But something like this would be great too:
( my.my id >= 1 )
( is-relative.to code = 'sister' )
( is-relative.to code = 'brother' )
( is-mother.to code = 'dog' )
or
my.my id >= 1
is-relative.to code = 'sister'
is-relative.to code = 'brother'
is-mother.to code = 'dog'
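To make the relationship between these forms concrete, here is a minimal regex sketch that goes from the log line to the innermost expressions and then to the bare field names. It assumes the innermost groups never contain parentheses themselves (for example inside a quoted value) and that the comparison operators are limited to the set in `$op`; that operator list is my assumption, not something taken from the real logs.

```perl
use strict;
use warnings;

my $line = "05/04/2010 13:09:45 - A - somebody - ( ( my.my id >= 1 ) ) and ( ( is-relative.to code = 'sister' ) or ( is-relative.to code = 'brother' ) or ( is-mother.to code = 'dog' ) )";

# Innermost groups: an open paren, text containing no parens, a close paren.
# The surrounding whitespace is left out of the capture.
my @exprs = $line =~ /\(\s*([^()]+?)\s*\)/g;

# Assumed operator set; extend it to whatever the logs really contain.
my $op = qr/>=|<=|<>|!=|=|<|>/;

my @fields;
for my $expr (@exprs) {
    # Keep everything before the first operator, minus trailing whitespace.
    my ($field) = $expr =~ /^(.*?)\s*$op/;
    push @fields, $field if defined $field;
}

print "$_\n" for @fields;
```

An expression without any of the assumed operators is skipped rather than reported, which may or may not be the right choice for real data.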
I have been looking for a while for hints toward an elegant solution to this problem. There is a lot of discussion about using Text::Balanced, but the documentation does not have enough examples for my little brain to solve the riddle.
I have an example here that just pulls out the expressions; I know what to do from there. I would like some ideas or code examples for a more elegant solution using one of the CPAN modules, if that is possible.
What it basically does is:
- Split the text at the first close paren
- Parse the expression out of this "before" text
- Remove everything up to and including the last open paren
- Remove any beginning or trailing spaces
- Split the "after" text this time, and repeat the above operations
Here is a snippet of code that pulls the expressions:
use strict;
use warnings;

my $text = "05/04/2010 13:09:45 - A - somebody - ( ( my.my id >= 1 ) ) and ( ( is-relative.to code = 'sister' ) or ( is-relative.to code = 'brother' ) or ( is-mother.to code = 'dog' ) )";
my $new = $text;
my @list;
while ( 1 ) {
    my $ind = index($new, ')');
    # Stop when there is no close paren left at all
    last if ( $ind < 0 );
    # Split the text at the first close paren
    my $before = substr($new, 0, $ind);
    my $after  = substr($new, $ind);
    last if ( $before eq "" );
    # Clean up the before string
    # Remove everything up to and including the last open paren
    # Remove any beginning or trailing spaces
    $before = substr($before, rindex($before, '(') + 1);
    $before =~ s/^\s+//;
    $before =~ s/\s+$//;
    push(@list, $before);
    if ( $after =~ /\)/ ) {
        # Discard chars up to and including the first open paren
        $after = substr($after, index($after, '(') + 1);
        $new = $after;
        print "\n";
    } else {
        last;
    }
}
foreach my $i (@list) {
    print "--".$i."--\n";
}
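For comparison, here is a sketch of the kind of Text::Balanced approach I have in mind. It relies on `extract_bracketed` from the core Text::Balanced module, called in list context with `'()'` as the delimiter set and `'[^(]*'` as the prefix pattern to skip ahead to the next open paren. The helper name `innermost_groups` is hypothetical, and the sketch assumes quoted values never contain parentheses.

```perl
use strict;
use warnings;
use Text::Balanced qw(extract_bracketed);

# Hypothetical helper: return the contents of every innermost ( ... ) group.
sub innermost_groups {
    my ($text) = @_;
    my @found;
    while (1) {
        # Skip anything that is not an open paren, then take one balanced group.
        my ($group, $rest) = extract_bracketed($text, '()', '[^(]*');
        last unless defined $group && length $group;

        # Drop the outer parens of the group that was just extracted.
        my $inner = substr($group, 1, -1);
        if ($inner =~ /\(/) {
            # Still nested: descend one level.
            push @found, innermost_groups($inner);
        } else {
            $inner =~ s/^\s+|\s+$//g;
            push @found, $inner;
        }
        $text = $rest;
    }
    return @found;
}

my $line = "05/04/2010 13:09:45 - A - somebody - ( ( my.my id >= 1 ) ) and ( ( is-relative.to code = 'sister' ) or ( is-relative.to code = 'brother' ) or ( is-mother.to code = 'dog' ) )";
my @groups = innermost_groups($line);
print "--$_--\n" for @groups;
```

Compared with the index/substr loop, this leaves the paren matching to the module and keeps only the recursion and trimming in my own code.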