Re^3: need to parse firts part of SQL-query (regex question)

by grinder (Bishop)
on Jan 17, 2008 at 17:22 UTC (#662924)


in reply to Re^2: need to parse firts part of SQL-query (regex question)
in thread need to parse firts part of SQL-query (regex question)

That doesn't seem to do what the OP wants

Come now young man, where's your sense of adventure? With a bit of lookahead and a state machine you can easily massage the token stream into something useful:

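    use SQL::Tokenizer;

    my $query = q{f1,f2, SUM(f3),CONCAT(f4,f5, f6), sum((f1+f2)*f3)};
    my @token = SQL::Tokenizer->tokenize($query);

    my $paren_depth = 0;
    my $cache = '';
    # defined() so that a literal 0 token does not end the loop early
    while (defined(my $val = shift @token)) {
        # lookahead: if the next token opens a paren, start caching now
        # so the function name is kept with its argument list
        if (@token && $token[0] eq '(') {
            $paren_depth++;
        }
        if ($val eq ')') {
            $paren_depth--;
            if ($paren_depth == 0) {
                # outermost paren closed: flush what has been collected
                print $cache;
                $cache = '';
            }
        }
        if ($paren_depth) {
            $cache .= $val;
        }
        else {
            print "$val\n";
        }
    }
    __PRODUCES__
    f1
    ,
    f2
    ,
    SUM(f3)
    ,
    CONCAT(f4,f5, f6)
    ,
    sum((f1+f2)*f3)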

That's not too shabby. The tokenizer does the heavy lifting, you just have to put the pieces back together again.
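If you would rather have the columns in a list than printed, the same idea can be sketched as a small helper. This is a variant, not the code above: it uses a plain depth counter without lookahead, drops the top-level commas, and trims each piece. The name group_columns is made up for illustration, and the token list is written by hand so the sketch runs without the module installed; in practice it would presumably come from SQL::Tokenizer->tokenize($query).

```perl
use strict;
use warnings;

# Hypothetical helper (not part of SQL::Tokenizer): join a flat token
# list back into one string per top-level, comma-separated expression.
sub group_columns {
    my @token = @_;
    my @column;
    my $current = '';
    my $depth   = 0;

    for my $val (@token) {
        if    ($val eq '(') { $depth++ }
        elsif ($val eq ')') { $depth-- }

        # a comma outside all parentheses ends the current column
        if ($val eq ',' && $depth == 0) {
            $current =~ s/^\s+|\s+$//g;
            push @column, $current;
            $current = '';
            next;
        }
        $current .= $val;
    }

    # flush whatever follows the last top-level comma
    $current =~ s/^\s+|\s+$//g;
    push @column, $current if length $current;
    return @column;
}

# Hand-written tokens standing in for tokenizer output (an assumption)
my @token = ('f1', ',', 'f2', ',', ' ', 'SUM', '(', 'f3', ')', ',', ' ',
             'sum', '(', '(', 'f1', '+', 'f2', ')', '*', 'f3', ')');
print "$_\n" for group_columns(@token);
```

Whitespace inside an expression is kept, so something like "f4 AS total" survives intact; only the leading and trailing blanks of each column are trimmed.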

• another intruder with the mooring in the heart of the Perl

Re^4: need to parse firts part of SQL-query (regex question)
by Not_a_Number (Prior) on Jan 17, 2008 at 17:37 UTC

    grinder++ !

    It's rather like a solution I came up with, albeit without using the module in question:

        my $str = 'f1,f2, SUM(f3),CONCAT(f4,f5, f6), f7';
        my ( $tok, @toks, $parens );
        while ( $str ) {
            my $char = substr $str, 0, 1, '';
            $char eq ' ' and next;
            $char eq '(' and $parens++;
            $char eq ')' and $parens--;
            $char eq ',' && ! $parens and push( @toks, $tok ), $tok = '', next;
            $tok .= $char;
            push @toks, $tok if ! $str;
        }
        print join ' -- ', @toks;

    ...but I didn't want to post it for fear that it wouldn't be very robust. :)

      fear that it wouldn't be very robust

      Yes indeed. Real-world SQL is likely to have named the columns, so you have to deal with concat(f4, f5, f6) as "foobarquux (combined)". Before you know it, you have reimplemented SQL::Tokenizer!
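To make the point concrete, here is a rough sketch of what the character scanner starts to look like once it has to leave quoted aliases alone. The helper name split_select_list is invented for illustration, and it only understands plain single- and double-quoted strings; backslash escapes, comments, and other dialect quirks are not handled, which is exactly the slope that ends at a real tokenizer.

```perl
use strict;
use warnings;

# Hypothetical helper: split a select list on top-level commas, treating
# commas and parentheses inside quoted strings as ordinary characters.
sub split_select_list {
    my ($str) = @_;
    my ( @cols, $quote );
    my ( $tok, $depth ) = ( '', 0 );

    for my $char ( split //, $str ) {
        if ( defined $quote ) {
            # inside a quoted string: only the matching quote matters
            undef $quote if $char eq $quote;
        }
        elsif ( $char eq '"' || $char eq "'" ) {
            $quote = $char;
        }
        elsif ( $char eq '(' ) { $depth++ }
        elsif ( $char eq ')' ) { $depth-- }
        elsif ( $char eq ',' && $depth == 0 ) {
            push @cols, $tok;
            $tok = '';
            next;
        }
        $tok .= $char;
    }
    push @cols, $tok if $tok =~ /\S/;

    # trim the blanks left around each column
    s/^\s+|\s+$//g for @cols;
    return @cols;
}

my $str = q{f1, concat(f4, f5, f6) as "foobarquux (combined)", f7};
print join( ' -- ', split_select_list($str) ), "\n";
```

Unlike the loop above, this keeps the spaces inside each column, so the alias comes through as written rather than squashed together.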

      • another intruder with the mooring in the heart of the Perl
