After trying to follow the
output of SQLite's "explain"
command, I waved Graph::Easy at it and produced
reasonable graphs of the possible execution flow.
The script below is quite rudimentary but I'm interested
in getting feedback early on.
I'd also like pointers to information on converting
stack-based languages like vdbe and postscript into
a conventional (applicative?) language, if it's possible.
$ sqlite3 db 'create table exampl (one, two)';
$ sqlite3 -separator $'\t' db 'explain select one from exampl'; | tee
+ops |head -n 3
0 Goto 0 10
1 Integer 0 0
2 OpenRead 0 2
# Edited output in asc format
$ ./vdbe_graph.pl -f asc ops
[...]
+------------------------+
| 2 OpenRead (0, 2) |
+------------------------+
v
+------------------------+
| 3 SetNumColumns (0, 1) |
+------------------------+
v
+------------------------+
| 4 Rewind (0, 8) | -+
+------------------------+ |
v |
+------------------------+ |
+> | 5 Column (0, 0) | |
| +------------------------+ |
| v |
| +------------------------+ |
| | 6 Callback (1, 0) | |
| +------------------------+ |
| v |
| +------------------------+ |
+- | 7 Next (0, 5) | |
+------------------------+ |
v |
+------------------------+ |
| 8 Close (0, 0) | <+
+------------------------+
v
[...]
The code for vdbe_graph.pl
#!/usr/bin/perl -w
# vdbe_graph.pl
use strict;
use YAML;
use Graph::Easy;
use Getopt::Std;
binmode STDOUT, ':utf8';
my %opt;
getopts('o:f:', \%opt);
my $g = Graph::Easy->new();
$g->set_attribute('flow', 'south');
my $prev;
my @ops;
while (<>) {
chomp;
my @fields = split /\t/, $_;
next unless $fields[0] =~ /^\d+$/;
my %curr;
@curr{qw(n op p1 p2 p3)} = @fields;
my $params = join(", ", @fields[2..$#fields]);
my $n = $g->add_node($curr{n});
$n->set_attribute(label => "$curr{n} $curr{op} ($params)");
$n->set_attribute(align => 'left');
if ($prev) {
$g->add_edge($prev->{n} => $curr{n});
}
$prev = \%curr;
push @ops, \%curr;
}
# If P2 is not zero and one or more of the entries are NULL, then jum
+p to the address given by P2
# MakeRecord MoveGe MoveGt MoveLt MoveLe Rewind
my %jumps = map { $_ => 1 } qw(
Next Prev Gosub ForceInt MustBeInt Eq Ne Lt Le Gt Ge
If IfNot IsNull NotNull MakeRecord MoveGe MoveGt MoveLt MoveLe
Distinct Found NotFound IsUnique NotExists Last Rewind
IdxGT IdxGE IdxLT FifoRead IfMemPos IfMemNeg IfMemZero VFilter VNe
+xt);
for my $n (grep { exists $jumps{$_->{op}} } @ops) {
$g->add_edge($n->{n} => $n->{p2});
}
for my $n (grep { $_->{op} eq 'Goto' } @ops) {
$g->add_edge($n->{n} => $n->{p2});
if ($n->{n} != $n->{n}+1) {
$g->del_edge($g->edge($n->{n} => $n->{n}+1));
}
}
for my $n (grep { $_->{op} eq 'Halt' } @ops) {
next unless $g->node($n->{n}+1);
$g->del_edge($g->edge($n->{n} => $n->{n}+1));
}
my $ext = ($opt{o} && $opt{o} =~ /\.(...)$/) ? $1 : '';
my %ext_formats = (
asc => 'ascii',
html=> 'html',
txt => 'txt',
box => 'boxart',
viz => 'graphviz',
png => 'graphviz',
);
my $format = $ext_formats{ $opt{f} || ''}
|| $opt{f} || $ext_formats{ext} || 'ascii';
# Where does the output go
if (defined $opt{o}) {
if ($ext eq 'png') {
open OUT, "| dot -Tpng -o $opt{o}";
} else {
open OUT, "> $opt{o}";
}
select OUT;
}
if ($format eq 'ascii') {
print $g->as_ascii;
} elsif ($format eq 'boxart') {
print $g->as_boxart;
} elsif ($format eq 'graphviz') {
print $g->as_graphviz;
} elsif ($format eq 'txt') {
print $g->as_txt;
} elsif ($format eq 'html') {
print $g->as_html;
} else {
die;
}
Convert stack-based code to "conventional" (was Re: Graphing SQLite's VDBE)
by roboticus (Chancellor) on Jan 24, 2007 at 12:39 UTC
|
bsb:
I really like your conversion from the stack-based code to the graphical format. One suggestion though: You don't show the stack-based code for the graph shown, so it's not as easy as one would like to see how they tie together.
As far as code pointers go, I'll leave that to monks better-versed in Perl than I.
Anyway, on to your other question: Assuming that by "conventional code" you mean a more algebraic form, then here's my take on it:
The most trivial form is to build it by parenthesizing every expression as you encounter it. Suppose, for example, you have the following bit of code:
[1] push 3
[2] push 5
[3] add
[4] push 7
[5] push 9
[6] add
[7] mul
[8] print
We'd build an algebraic expression with something like this:
#!/usr/bin/perl -w
use strict;
use warnings;
my @stack = ();
while (<DATA>) {
chomp;
print "TOS='" . (@stack ? $stack[$#stack] : '-nil-') . "', "
. "stmt = '" . $_ . "'\n";
my ($operator, $operand) = split;
if ($operator eq 'push') {
push @stack, $operand;
}
elsif ($operator eq 'print') {
print "out: '" . (pop @stack) . "'\n";
}
elsif ($operator eq 'add') {
my $lhs = pop @stack;
my $rhs = pop @stack;
push @stack, '(' . $lhs . ' + ' . $rhs . ')';
}
elsif ($operator eq 'mul') {
my $lhs = pop @stack;
my $rhs = pop @stack;
push @stack, '(' . $lhs . '*' . $rhs . ') ';
}
}
__DATA__
push 3
push 5
add
push 7
push 9
add
mul
print
Which gives us the output:
root@swill ~/PerlMonks
$ ./stack_to_infix.pl
TOS='-nil-', stmt = 'push 3'
TOS='3', stmt = 'push 5'
TOS='5', stmt = 'add'
TOS='(5 + 3)', stmt = 'push 7'
TOS='7', stmt = 'push 9'
TOS='9', stmt = 'add'
TOS='(9 + 7)', stmt = 'mul'
TOS='((9 + 7)*(5 + 3)) ', stmt = 'print'
out: '((9 + 7)*(5 + 3)) '
root@swill ~/PerlMonks
$
Now this trivial technique will do it correctly, but will give you too many parenthesis. With a bit more work, you can get rid of the extra parenthesis to get a minimal expression. Here's how to do it: Instead of storing a single string on the stack, keep a pair: The string and the precedence of the operator. Now, when you're working on an operator, check the precedence of the items you pop from the stack with the precedence of the operator you're processing now. If any operator has a lower precedence, wrap it in parenthesis.
--roboticus
| [reply] [d/l] [select] |
|
#!/usr/bin/perl -w
use strict;
use warnings;
my @stack = ();
sub binop {
my $op = shift;
my $prec = shift;
my $lh = pop @stack;
my ($lhop, $lhprec) = ($$lh[0], $$lh[1]);
my $rh = pop @stack;
my ($rhop, $rhprec) = ($$rh[0], $$rh[1]);
# Wrap operand(s) with lower precedence
$lhop = '(' . $lhop . ')' if $prec > $lhprec;
$rhop = '(' . $rhop . ')' if $prec > $rhprec;
push @stack, [ $lhop . $op . $rhop, $prec ];
}
while (<DATA>) {
chomp;
print "TOS='" . (@stack ? $stack[$#stack][0] : '-nil-') . "' "
. "stmt = '" . $_ . "'\n";
my ($operator, $operand) = split;
if ($operator eq 'push') {
# A constant is atomic, so it has high precedence so
# we don't wrap it
push @stack, [$operand, 999];
}
elsif ($operator eq 'print') {
my $op = pop @stack;
my ($operand, $prec) = ($$op[0], $$op[1]);
print "out: '" . $operand . "'\n";
}
elsif ($operator eq 'add') {
binop '+', 1;
}
elsif ($operator eq 'mul') {
binop '*', 2;
}
}
__DATA__
push 3
push 5
add
push 7
push 9
add
mul
print
Which gives us:
root@swill ~/PerlMonks
$ ./stack_to_infix2.pl
TOS='-nil-' stmt = 'push 3'
TOS='3' stmt = 'push 5'
TOS='5' stmt = 'add'
TOS='5+3' stmt = 'push 7'
TOS='7' stmt = 'push 9'
TOS='9' stmt = 'add'
TOS='9+7' stmt = 'mul'
TOS='(9+7)*(5+3)' stmt = 'print'
out: '(9+7)*(5+3)'
root@swill ~/PerlMonks
$
--roboticus
P.S. I don't like my code for getting arguments off the stack, but all my attempts at simplifying it failed miserably. I'm sure it's something simple, but I haven't figured it out. What's a better way to do the following? (Or is my problem a bit earlier and I'm pushing the wrong thing on the stack?)
my $op = pop @stack;
my ($operand, $prec) = ($$op[0], $$op[1]);
| [reply] [d/l] [select] |
|
my ($operand, $prec) =
map { $_->[0], $_->[1] }
pop @stack;
It seems to do what you want. Cheers, JohnGG | [reply] [d/l] |
|
OK ... now for a quick round of golf. Here's a version that's a bit tighter and easier to expand:
#!/usr/bin/perl -w
use strict;
use warnings;
my @stack = ();
sub binop {
my ($op, $prec) = @_;
my ($rhop, $rhprec) = @{pop @stack};
my ($lhop, $lhprec) = @{pop @stack};
# Wrap operand(s) with lower precedence
$lhop = '(' . $lhop . ')' if $prec > $lhprec;
$rhop = '(' . $rhop . ')' if $prec > $rhprec;
push @stack, [ $lhop . $op . $rhop, $prec ];
}
my %handlers = (
'push' => sub { push @stack, [shift, 999]; },
'print' => sub { print "out: '" . ${@{pop @stack}}[0] . "'\n";
+ },
'add' => sub { binop '+', 1; },
'mul' => sub { binop '*', 2; }
);
while (<DATA>) {
chomp;
print "TOS='" . (@stack ? $stack[$#stack][0] : '-nil-') . "' "
. "stmt = '" . $_ . "'\n";
my ($operator, $operand) = split;
if ($handlers{$operator}) {
$handlers{$operator}->($operand);
}
else {
print "Unknown operator: '$operator' for line '$_'\n";
}
}
__DATA__
push 3
push 5
add
push apples
add
push oranges
push 9
push bananas
push peels
mul
add
add
mul
print
Which gives us:
TOS='-nil-' stmt = 'push 3'
TOS='3' stmt = 'push 5'
TOS='5' stmt = 'add'
TOS='3+5' stmt = 'push apples'
TOS='apples' stmt = 'add'
TOS='3+5+apples' stmt = 'push oranges'
TOS='oranges' stmt = 'push 9'
TOS='9' stmt = 'push bananas'
TOS='bananas' stmt = 'push peels'
TOS='peels' stmt = 'mul'
TOS='bananas*peels' stmt = 'add'
TOS='9+bananas*peels' stmt = 'add'
TOS='oranges+9+bananas*peels' stmt = 'mul'
TOS='(3+5+apples)*(oranges+9+bananas*peels)' stmt = 'print'
out: '(3+5+apples)*(oranges+9+bananas*peels)'
--roboticus | [reply] [d/l] [select] |
|
You don't show the stack-based code for the graph shown, so it's not as easy as one would like to see how they tie together
Actually, most of the operations do something to the stack,
sometimes conditionally, and the textual description is the
only source of information. I suppose that I could add
some notation like "-2" for something that pops two items.
I'll think about your suggestion.
Thanks for the conversion code, I thought that I'd be
bumping into the Halting Problem but maybe your technique
would work for most or even all of the VDBE code encountered
in practice. There are loops, conditionals and off-stack
state used by the opcodes so I think the Halting Problem is
lurking there somewhere.
create table tape (cur_state, scanned, write, move, next_state);
| [reply] [d/l] |
Re: Graphing SQLite's VDBE
by bsb (Priest) on Jan 30, 2007 at 23:21 UTC
|
Updated but no stack transforming yet
| [reply] [d/l] |
|
|