While many programmers prefer processing data and structural information in their heads without any visual aids, visualization and graphical presentation are nonetheless often an efficient and effective way to convey nonlinear ideas, simple or complex.
Requirements
Visio or ArgoUML works well and is commonly used for creating flowcharts and other diagrams interactively.
But suppose interactive drawing doesn't fit the job. Your client (a bill collector) wants access control for the application you're going to build (i.e. who can see, do, and control what). To provide that, you need to find out their organizational structure. Virtually no organization has one documented, so you need to construct it from a series of triads (me, my boss, my subordinates) gathered by asking all the users/employees.
After gathering such data, it helps to construct a complete or partial org chart and ask the client to verify it. Drawing one by hand in interactive software isn't very practical. If your software has a wizard or macro that generates a data-driven org chart, by all means use it. Alternatively, you can use GraphViz.
use strict;
use GraphViz;
use XML::Twig;
# mkg($xml, $output_file_name)
mkg("@{[<DATA>]}", "orgchart");
sub mkg {
my ($xml, $file) = @_;
my $root = XML::Twig->new()->parse($xml)->root;
my $g = GraphViz->new();
render($g, $root);
$g->as_jpeg("$file.jpg");
}
sub render {
my ($g, $root) = @_;
my $super = mkname($root);
$g->add_node($super);
foreach my $child ($root->children) {
my $subord = mkname($child);
$g->add_edge($super => $subord, dir => 'back');
render($g, $child);
}
}
sub mkname {
$_[0]->att('title') . " (" . $_[0]->att('name') . ")";
}
__DATA__
<Boss title="CEO" name="Joe">
<C title="COO" name="Liz">
<VP title="VP HR" name="Fu"></VP>
<VP title="VP Sales" name="Ty"></VP>
</C>
<C title="CFO" name="Lo">
<VP title="VP HR" name="Fu"></VP>
<VP title="VP Sales" name="Ty"></VP>
</C>
<C title="CTO" name="Gi">
<VP title="VP HR" name="Fu"></VP>
<VP title="Sysadmin" name="TJ"></VP>
</C>
<C title="Adviser" name="Bob"/>
</Boss>
The code above will generate a graph like this. The example assumes your data are stored or can be converted to XML.
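The triads gathered from interviews rarely arrive as nested XML. Here is a minimal sketch of the conversion, assuming each answer has been reduced to a hypothetical record with name, title, and boss fields. The field names, the sample data, and the emp element name are all illustrative, and a real survey would also need checks for cycles, duplicates, and characters that must be escaped in XML.

```perl
use strict;
use warnings;

# Hypothetical survey records: who each person is and who they report to.
my @answers = (
    { name => 'Joe', title => 'CEO',      boss => undef },
    { name => 'Liz', title => 'COO',      boss => 'Joe' },
    { name => 'Lo',  title => 'CFO',      boss => 'Joe' },
    { name => 'Ty',  title => 'VP Sales', boss => 'Liz' },
);

# Index people by name and collect each boss's direct reports.
my (%person, %reports, @roots);
for my $rec (@answers) {
    $person{ $rec->{name} } = $rec;
    if (defined $rec->{boss}) {
        push @{ $reports{ $rec->{boss} } }, $rec->{name};
    }
    else {
        push @roots, $rec->{name};    # nobody above this person
    }
}

# Emit one element per person, nesting subordinates inside their boss.
# No XML escaping is done here; the sample values are assumed to be safe.
sub to_xml {
    my ($name, $depth) = @_;
    my $p      = $person{$name};
    my $indent = '  ' x $depth;
    my $kids   = join '', map { to_xml($_, $depth + 1) } @{ $reports{$name} || [] };
    my $attrs  = qq{title="$p->{title}" name="$p->{name}"};
    return $kids eq ''
        ? "$indent<emp $attrs/>\n"
        : "$indent<emp $attrs>\n$kids$indent</emp>\n";
}

my $xml = to_xml($roots[0], 0);
print $xml;
```

The resulting string could be passed to mkg in place of the DATA section, since the script above reads only the title and name attributes and ignores element names.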
Software such as Visio is programmable, so why not use it? Consider the following two examples. First, suppose you want to rename a shape in an existing file:
use strict;
use warnings;
use Win32::OLE;
$Win32::OLE::Warn = 3;
my $path = "c:\\files\\visio\\";
my $file = "test.vsd";
my $Visio = Win32::OLE->new('Visio.Application', 'Quit');
my $VDocs = $Visio->Documents;
my $VDoc = $VDocs->Open("$path$file");
my $VPage = $VDoc->Pages->Item(1);
my $VShapes = $VPage->Shapes;
my $VShape = $VShapes->Item(1);
$VShape->{Text} = "New Name";
print $VShape->{Text};
$VDoc->SaveAs($path."test2.vsd");
You have to go through an object hierarchy six levels deep. Programming with GraphViz is more straightforward by comparison.
use strict;
use warnings;
use GraphViz;
my $g = GraphViz->new(node => {fontsize => 10}, edge => {fontsize => 9}, rankdir => 'LR');
$g->add_node('email', label => "Email App:\nperiodic DB query\nto send emails", shape => 'box');
$g->add_node('report', label => "periodic financial\nstatement");
$g->add_node('cond', label => "amount\nowed", shape => 'Mdiamond');
$g->add_node('msgS', label => "very frightening\nmessage");
$g->add_node('msgM', label => "extremely frightening\nmessage with\nrandom death threat");
$g->add_node('msgL', label => "very frightening\nmessage without\ndeath threat");
$g->add_node('thankC', label => "thank you\nfor your business");
$g->add_node('thankD', label => "thank you\nfor your payment");
$g->add_edge('email' => 'report', label => 'creditor/collector');
$g->add_edge('email' => 'cond', label => 'debtor');
$g->add_edge('cond' => 'msgS', label => 'small');
$g->add_edge('cond' => 'msgM', label => 'medium');
$g->add_edge('cond' => 'msgL', label => 'large');
$g->add_edge('email' => 'thankC', label => 'creditor/collector');
$g->add_edge('email' => 'thankD', label => 'debtor');
$g->as_jpeg("email01.jpg");
The code above creates a flow diagram like this (the first email diagram).
The code consists basically of a series of add_node and add_edge calls. You can actually call add_edge without add_node, in which case the new nodes are created automatically. add_node gives you additional control over the appearance of an individual node.
The two code samples above show that GraphViz (and procedural programming) can sometimes make things easier than, say, Visio (and OOP). But if you need your graph to be programmable and interactive at the same time, you need OOP, where each shape has associated instance methods, which GraphViz doesn't have. (Also, GraphViz doesn't have any UML-compliant shapes of its own.)
Back to the application: the graph above is pretty much a literal translation of one of the things the client described and wanted the application to do, namely to periodically look at the database and send emails with the corresponding content to creditors or bill collectors (who use the application to collect bills) and debtors (who can pay bills online).
Let's say the client has looked at the graphs and signed off on the requirements specification. We may now proceed to the design phase.
Design
One of the major goals of Requirements Analysis and Design is minimization, from the programmers' perspective at least, as users may or may not care whether your code is efficient and effective as long as it does the work for them. (Note that minimization is not the same as generalization. A database with one Employee table is as minimized as a database with one Person table, but Person is more generalized than Employee. How much generalization there should be is an open debate.)
A simple graph doesn't always mean simple coding (there is nothing simple about "translate English into Russian"), but a complicated one is certainly a warning sign.
In fact, during Requirements Analysis and Design, there are a few questions we should always ask:
- Why do we need this requirement? What does it accomplish?
- Do these requirements complement each other? Does any one contradict another?
- How can some of them be conceptually and/or logically combined?
- Is every requirement logically complete?
- Am I working too much?
It's a common and fatal mistake to translate users' raw requirements directly into code without critically questioning the legitimacy, validity and usefulness of each requirement, as well as of the project as a whole.
A graph can help spot "loose ends" and determine the completeness of the requirements, as every edge has to end up at a "meaningful" node.
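The loose-end check can also be done mechanically on the edge list before anything is drawn. Here is a minimal sketch, assuming the edges are kept as pairs and that the set of nodes allowed to end a flow is declared by hand. The %terminal set and the sample edges, including the made-up 'escalate' node, are illustrative only.

```perl
use strict;
use warnings;

# Edges as [from, to] pairs, in the same spirit as the add_edge calls.
my @edges = (
    [ email => 'report'   ],
    [ email => 'cond'     ],
    [ cond  => 'msgS'     ],
    [ cond  => 'escalate' ],    # hypothetical node nobody has specified yet
);

# Nodes the client has agreed are legitimate end points.
my %terminal = map { $_ => 1 } qw(report msgS);

# A loose end is a node with no outgoing edge that is not a declared end point.
sub loose_ends {
    my ($edges, $terminal) = @_;
    my (%has_out, %seen);
    for my $e (@$edges) {
        my ($from, $to) = @$e;
        $has_out{$from} = 1;
        $seen{$_} = 1 for $from, $to;
    }
    my @loose = grep { !$has_out{$_} && !$terminal->{$_} } sort keys %seen;
    return @loose;
}

my @loose = loose_ends(\@edges, \%terminal);
print "loose end: $_\n" for @loose;
```

Any node this reports is a question to take back to the client rather than something to resolve in code.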
A graph is also a great aid when simplifying, by generalizing and/or minimizing the graph. In our first email graph example, the graph is logically correct, but it might mislead a designer or programmer into turning each "bubble" in the graph into a separate template or script. So we might want to simplify the graph (so that it makes sense to designers and programmers, not necessarily to the client, as it's now primarily for technical purposes). That is, to minimize the number of bubbles, so to speak.
Each bubble loosely represents an action or a message. We notice that the messages fall into two categories, a report and a thank-you note, visualized as follows.
$g = GraphViz->new(node => {fontsize => 10}, edge => {fontsize => 9}, rankdir => 'LR');
$g->add_node('email', label => "Email App:\nperiodic DB query\nto send emails", shape => 'box');
$g->add_node('report', label => "periodic financial\nstatement", cluster => 'report');
$g->add_node('cond', label => "amount\nowed", shape => 'Mdiamond');
$g->add_node('msgS', label => "very frightening\nmessage", cluster => 'report');
$g->add_node('msgM', label => "extremely frightening\nmessage with\nrandom death threat", cluster => 'report');
$g->add_node('msgL', label => "very frightening\nmessage without\ndeath threat", cluster => 'report');
$g->add_node('thankC', label => "thank you\nfor your business", cluster => 'thank');
$g->add_node('thankD', label => "thank you\nfor your payment", cluster => 'thank');
$g->add_edge('email' => 'report', label => 'creditor/collector');
$g->add_edge('email' => 'cond', label => 'debtor');
$g->add_edge('cond' => 'msgS', label => 'small');
$g->add_edge('cond' => 'msgM', label => 'medium');
$g->add_edge('cond' => 'msgL', label => 'large');
$g->add_edge('email' => 'thankC', label => 'creditor/collector');
$g->add_edge('email' => 'thankD', label => 'debtor');
$g->as_jpeg("email02.jpg");
The code above generates this second email diagram. The "cluster" attribute was added to group the bubbles.
Turning the diagram into a "design," we generate the third email diagram with the following code.
my @color = (color => 'lightgray', fontcolor => 'lightgray');
$g = GraphViz->new(node => {shape => 'box', fontsize => 10}, edge => {fontsize => 9}, rankdir => 'LR');
$g->add_node('email', label => "Email App:\nperiodic DB query\nto send emails");
$g->add_node('report', shape => 'ellipse');
$g->add_node('thank', label => "thank you\nnote", shape => 'ellipse');
$g->add_node('report XSL', @color);
$g->add_node('thank XSL', @color);
$g->add_node('XML data', @color);
$g->add_edge('email' => 'report', label => "all");
$g->add_edge('email' => 'thank', label => "all");
$g->add_edge('report' => 'report XSL', label => 'use', @color);
$g->add_edge('thank' => 'thank XSL', label => 'use', @color);
$g->add_edge('report' => 'XML data', label => 'use', @color);
$g->add_edge('thank' => 'XML data', label => 'use', @color);
$g->as_jpeg("email03.jpg");
The XSL boxes in the graph signify that we'll embed our business logic in XSL modules, whereas the XML box represents the data logic and module. Combining things into a couple of (XSL) modules may simplify the design of the application if we're allowed to compromise a little on the flexibility of the layout and content of each type of email message.
For architectural discussions, it's often more effective and efficient to look at a graphical DB schema than at a textual one or a bunch of CREATE TABLE scripts.
Suppose we have created the following tables in MySQL.
DROP TABLE IF EXISTS org;
CREATE TABLE org (
id int NOT NULL,
name varchar(255) NOT NULL,
PRIMARY KEY (id),
UNIQUE KEY id (id)
) ENGINE=InnoDB;
DROP TABLE IF EXISTS employee;
CREATE TABLE employee (
id int NOT NULL,
name varchar(255) NOT NULL,
PRIMARY KEY (id),
UNIQUE KEY id (id)
) ENGINE=InnoDB;
DROP TABLE IF EXISTS orgstruct;
CREATE TABLE orgstruct (
org_id int NOT NULL,
employee_id int NOT NULL,
subord_id int NOT NULL,
PRIMARY KEY (org_id, employee_id, subord_id),
INDEX (org_id),
INDEX (employee_id),
INDEX (subord_id),
FOREIGN KEY (org_id) REFERENCES org (id),
FOREIGN KEY (employee_id) REFERENCES employee (id),
FOREIGN KEY (subord_id) REFERENCES employee (id)
) ENGINE=InnoDB;
We can reverse-engineer the tables as simply as this:
use strict;
use warnings;
use DBI;
use GraphViz::DBI;
my $dbh = DBI->connect("DBI:mysql:test", "user", "password")
or die "Cannot connect: $DBI::errstr";
GraphViz::DBI->new($dbh)->graph_tables->as_jpeg("dbi.jpg");
$dbh->disconnect;
Here's the result.
Of course, there are plenty of powerful database tools out there that do reverse and even round-trip engineering, which you can (and probably should) use.
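When no database connection is handy, the relationships can also be read straight from the CREATE TABLE scripts. This is a rough sketch, assuming DDL written in the same style as the tables above, with one FOREIGN KEY clause per line. The fk_edges helper and its regular expressions are illustrative and are not a general SQL parser.

```perl
use strict;
use warnings;

# A fragment of DDL in the same style as the orgstruct table above.
my $ddl = <<'SQL';
CREATE TABLE orgstruct (
  org_id int NOT NULL,
  employee_id int NOT NULL,
  subord_id int NOT NULL,
  FOREIGN KEY (org_id) REFERENCES org (id),
  FOREIGN KEY (employee_id) REFERENCES employee (id),
  FOREIGN KEY (subord_id) REFERENCES employee (id)
);
SQL

# Return [referencing table, referenced table, column] for each foreign key.
sub fk_edges {
    my ($sql) = @_;
    my (@edges, $table);
    for my $line (split /\n/, $sql) {
        if ($line =~ /^\s*CREATE\s+TABLE\s+(\w+)/i) {
            $table = $1;    # remember which table the following lines belong to
        }
        elsif (defined $table
            && $line =~ /FOREIGN\s+KEY\s*\((\w+)\)\s*REFERENCES\s+(\w+)/i) {
            push @edges, [ $table, $2, $1 ];
        }
    }
    return @edges;
}

my @edges = fk_edges($ddl);
print "$_->[0] -> $_->[1] ($_->[2])\n" for @edges;
```

Each triple could then be fed to add_edge with the column name as the label, in the same way as the earlier flow diagrams.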
Coding/Testing
Many people use a script to generate an HTML directory tree to help developers browse through scripts and modules and retrieve them from CVS or wherever they are kept.
For something fancier, you could use GraphViz to generate an image map instead (which can be useful at times).
use strict;
use GraphViz;
use XML::Twig;
# write $file.jpg and $file.html to files
# mkg($xml, $output_file_name)
my $file = "modules";
my $map = mkg("@{[<DATA>]}", $file);
my $html = <<HTML;
<HTML>
<BODY>
<MAP NAME=mymap>
$map
</MAP>
<IMG SRC="$file.jpg" USEMAP="#mymap">
</BODY>
</HTML>
HTML
open my $out, '>', "$file.html" or die "Cannot write $file.html: $!";
print {$out} $html;
close $out or die "Cannot close $file.html: $!";
sub mkg {
my ($xml, $file) = @_;
my $root = XML::Twig->new()->parse($xml)->root;
my $g = GraphViz->new();
render($g, $root);
$g->as_jpeg("$file.jpg");
return $g->as_cmap;
}
sub render {
my ($g, $root) = @_;
$g->add_node($root->att('name'), URL => $root->att('src'), shape => 'record');
foreach my $child ($root->children) {
$g->add_edge($root->att('name') => $child->att('name'));
render($g, $child);
}
}
__DATA__
<script name="report" src="file://usr/perl/pl/report.pl">
<module name="MyApp::DataXML" src="file://user/perl/pm/MyApp/DataXML.pm">
<module name="DBI" src="http://search.cpan.org/author/TIMB/DBI-1.37/DBI.pm"/>
<module name="XML::LibXML" src="http://search.cpan.org/author/PHISH/XML-LibXML-1.54/LibXML.pm"/>
</module>
<module name="MyApp::RptXSL" src="file://user/perl/pm/MyApp/RptXSL.pm">
<module name="XML::LibXSLT" src="http://search.cpan.org/author/MSERGEANT/XML-LibXSLT-1.53/LibXSLT.pm"/>
</module>
<module name="Mail::Sender" src="http://search.cpan.org/author/JENDA/Mail-Sender-0.8.06/Sender.pm"/>
</script>
Here is the image map generated by the code above. The CPAN modules' links are real; the others are dummies.
GraphViz also comes with Devel::GraphVizProf, which is a graphical
version of Devel::SmallProf.
If you wish, you may also color highlight your subroutines in your profile
graph based on their relative execution times, as in the following example.
use strict;
use GraphViz;
use XML::Twig;
use List::Util qw/ max /;
my @profile;
for (<DATA>) {
chomp;
push @profile, [split /\s+/];
}
push @profile, [undef, 'end'];
my $max = max( map {$_->[0]} @profile );
my $g = GraphViz->new();
for my $i (0..($#profile-1)) {
my $w1 = ($profile[$i][0])/$max ;
my $w2 = 1-$w1/2;
my $color = "$w1,$w2,$w2";
$g->add_node($profile[$i][1], fontcolor => $color, color => $color);
$g->add_node($profile[$i+1][1]);
$g->add_edge($profile[$i][1] => $profile[$i+1][1], label => $profile[$i][0], color => $color, fontcolor => $color);
}
$g->as_jpeg("profile.jpg");
# millisec sub
__DATA__
1 fetchXML
2 preprocessXML
5 generateReport
1 randomThreat
6 generateReport
2 sendemail
Here is what the graph looks like.