epoptai has asked for the wisdom of the Perl Monks concerning the following question:
I'm working on a newest nodes client and need some advice with the code that generates a threaded display of root nodes and replies like this:
I've tried coding this from scratch twice and ended up with two different ways to make the same mistake!
The Newest Nodes XML Generator makes tags like this (for example a root node and reply):
<NODE nodetype="categorized question" author_user="11732"
node_id="76108" createtime="20010427142501">
<NODE nodetype="categorized answer" author_user="11732"
node_id="76130" createtime="20010427150308" parent_node="76108">
The strategy has been to find replies to root nodes, then find replies to the replies, and so on. But the way i'm doing it only finds replies to the depth that's been coded for. For example the following code's maximum depth is Re:(3) because there are 3 for loops, it doesn't look for Re:(4) and beyond.
sub thread
{
print '<ul>';
for my $node (sort {$b <=> $a} keys %created){ # for all nodes
# if doesn't have a parent and isn't a user
if((!$parent{$node}) && ($nodetype{$node} ne 'user')){
if($created{$node}=~/^(....)(..)(..)(..)(..)(..)$/){
$created{$node} = "$4:$5:$6 $2/$3"
}
print qq~$types{$nodetype{$node}} ($created{$node})<br>
$content{$node}
by $whom{$author{$node}}<br>~;
my$d = 0;
for my $p1 (sort {$a <=> $b} keys %parent){ # re
if( ($parent{$p1} == $node) ){
unless($d > 0){ print '<ul>'}
print qq~$content{$p1}
by $whom{$author{$p1}}<br>~;
$d++;
my$e = 0;
for my $p2 (sort {$a <=> $b} keys %parent){ # re(2)
if( ($parent{$p2} == $p1) ){
unless($e > 0){ print '<ul>'}
print qq~$content{$p2}
by $whom{$author{$p2}}<br>~;
$e++;
my$f = 0;
for my $p3 (sort {$a <=> $b} keys %parent){ # re(3)
if( ($parent{$p3} == $p2) ){
unless($f > 0){ print '<ul>'}
print qq~$content{$p3}
by $whom{$author{$p3}}<br>~;
$f++;
}
}
if($f > 0){ print '</ul>'}
}
}
if($e > 0){ print '</ul>'}
}
}
if($d > 0){ print '</ul>'}
}
}
print '</ul>';
}
Ugh! I tried again with a hash like the one in Re: Graphical Hierarchical Tree (that should've been a hash of arrays i guess), but had the same problem.
sub thread
{
my%children;
for my $root (sort {$a <=> $b} keys %created){ # for all nodes
my(@kids,$kids);
# if doesn't have a parent and isn't a user
if( ($nodetype{$root} ne 'user') ){
for my $child (sort {$a <=> $b} keys %parent){ # for all childre
+n
if( ($parent{$child} == $root) ){ # if this is a child
push @kids, $child;
} # if
$kids = join '|', @kids;
$children{$root} = $kids;
}
}
} the rest is predictable noise...
What's the best way to find the entire tree of replies?
Re: threaded display of replies
by mikfire (Deacon) on Apr 28, 2001 at 00:20 UTC
|
What if you were to turn this around a bit and not order
the datastructure but added an element that preserved
the order?
My idea would be to have the primary data structure look
something like this and keyed by the node id.
my %created = ( AUTHOR => $author,
TYPE => $node_type,
TIME => $node_time,
PARENT => $parent_id || '',
KIDS => [],
);
One pass will load the structure from the data stream. A
second pass will populate the KIDS array something like
this. If the datastream is in order, or you are willing
to assume that any referenced parent node will be included
in the stream, you could actually include this in the first
pass and save a loop. Personally, I prefer to keep the
data acquisition loop seperate from the data manipulation
loop - it makes it easier to debug.
for my $root ( sort { $a <=> $b } keys %created ) {
my $parent = $created{$root}{PARENT};
push @{$created{$parent}{KIDS}}, $root;
}
One final loop, then, can print out all the nodes in, hopefully
a hierchical order by using a bit of recursion ( breadth-first? )
sub print_nodes {
my @kids = @_;
for my $id ( @nodes ) {
print "Interesting data from node\n";
if ( @{$created{$id}{KIDS} ) {
print_nodes( @{$created{$id}{KIDS};
}
}
}
# Some work to make sure we start at the parents and move
# our way down correctly. grep would work just as well.
my @topnodes = ();
for my $id ( keys %created ) {
push @topnodes, $id unless ( defined( $created{$id}{PARENT} ) );
}
print_nodes( @topnodes );
mikfire | [reply] [d/l] [select] |
Re: threaded display of replies
by Masem (Monsignor) on Apr 28, 2001 at 00:14 UTC
|
Assuming that all the data contains is the node number and the node number of the parent node, you simply cannot find all children; you can only work backwards up from a node to the parent.
However, and where it will get messy, is to take the child node, find it's parent (and all nodes in between), then increament on the node number starting with the ultimate parent node; if the parent of that node is one that is in the thread already, add it to the thread. This will get trees like:
P
+--1
+--2
+--3
+--4
+--5 <- the current node
+--6
+--7
+--8
Without increamenting on the node number, you cannot get 6, 7 and 8. Nor would you get 2.
Dr. Michael K. Neylon - mneylon-pm@masemware.com
||
"You've left the lens cap of your mind on again, Pinky" - The Brain
| [reply] [d/l] |
Re: threaded display of replies
by ZZamboni (Curate) on Apr 29, 2001 at 07:48 UTC
|
For my PerlMonks::NewestNodes
module, I used a technique very similar to the one described
by mikfire. All the nodes are stored in a hash indexed by
node ID, and each element is a hash reference containing the
different pieces of information about the node.
The thread
subroutine receives as argument a list of nodes to thread
(this allows adding new nodes to the existing structure without
rethreading everything). Then it goes through those nodes, and
for each one it visits its parent and inserts its ID into
the parent's "kids" element, which is an array ordered
chronologically. Nodes that do not have a parent_node element
are marked as "top nodes".
This is complicated a little bit by the fact that the
Newest Nodes XML Generator will give you a node, but
not its parent, when the parent is already marked as seen.
So the thread subroutine also checks to see which nodes
it doesn't have yet, and adds those to a list of "nodes to
get". The nodes without parents are added to a list of
"nodes to rethread". Nodes that are threaded correctly are
removed from the list. At the end, any necessary nodes are
grabbed (using the Node Query XML Generator, you pass it
an argument of the form "nodes=id1,id2,..."), added to the
list, and the loop repeats.
Once every node has its kids element properly populated, it's
just a matter of descending recursively from each "top node"
to generate its tree of replies.
Shameless plug: check out Shendal's Perl/Tk Newest Nodes Client,
which uses my module to show you a very nice threaded display
of the Newest Nodes.
--ZZamboni
| [reply] |
Will this break your program?
by Sprad (Hermit) on Apr 28, 2001 at 00:21 UTC
|
What happens when somebody doesn't use the Re: Re: Re: Re: pattern, and perversely changes the title in their reply? (like I did)
Cases like this aside, though, it'd be a welcome addition.
---
A fair fight is a sign of poor planning. | [reply] |
|
Sprad, if you look at the XML stream, the parent node id
is included probably to avoid exactly this issue.
mikfire
| [reply] |
|
well, each node should be represented by a node_id in a DB or whatever file U use and this should know (store) the id of its parent, so you can't mess up things. And btw, there is no need to let anode know if it has children or not. If there are children THEY WILL KNOW their parent. Keep it simple: stupid. :-)
btw BmeCat works this way and made it to a standard.
Have a nice day
All decision is left to your taste
| [reply] |
|
so said little: And btw, there is no need to let anode [sic] know if it has children or not. If there are children THEY WILL KNOW their parent.
Unless, of course, you want to find all the children of a particular node without doing an exhaustive search of the DB. While not strictly necessary, it can make life simpler in some cases. I think this application might qualify as one of those cases (see Masem's reply). :-)
bbfu
Seasons don't fear The Reaper.
Nor do the wind, the sun, and the rain.
We can be like they are.
| [reply] |
|
|
|
|
|