Beefy Boxes and Bandwidth Generously Provided by pair Networks
Perl: the Markov chain saw
 
PerlMonks  

threaded display of replies

by epoptai (Curate)
on Apr 27, 2001 at 23:52 UTC ( [id://76249]=perlquestion: print w/replies, xml ) Need Help??

epoptai has asked for the wisdom of the Perl Monks concerning the following question:

I'm working on a newest nodes client and need some advice with the code that generates a threaded display of root nodes and replies like this:

I've tried coding this from scratch twice and ended up with two different ways to make the same mistake!

The Newest Nodes XML Generator makes tags like this (for example a root node and reply):

<NODE nodetype="categorized question" author_user="11732" node_id="76108" createtime="20010427142501"> <NODE nodetype="categorized answer" author_user="11732" node_id="76130" createtime="20010427150308" parent_node="76108">
The strategy has been to find replies to root nodes, then find replies to the replies, and so on. But the way i'm doing it only finds replies to the depth that's been coded for. For example the following code's maximum depth is Re:(3) because there are 3 for loops, it doesn't look for Re:(4) and beyond.
sub thread { print '<ul>'; for my $node (sort {$b <=> $a} keys %created){ # for all nodes # if doesn't have a parent and isn't a user if((!$parent{$node}) && ($nodetype{$node} ne 'user')){ if($created{$node}=~/^(....)(..)(..)(..)(..)(..)$/){ $created{$node} = "$4:$5:$6 $2/$3" } print qq~$types{$nodetype{$node}} ($created{$node})<br> $content{$node} by $whom{$author{$node}}<br>~; my$d = 0; for my $p1 (sort {$a <=> $b} keys %parent){ # re if( ($parent{$p1} == $node) ){ unless($d > 0){ print '<ul>'} print qq~$content{$p1} by $whom{$author{$p1}}<br>~; $d++; my$e = 0; for my $p2 (sort {$a <=> $b} keys %parent){ # re(2) if( ($parent{$p2} == $p1) ){ unless($e > 0){ print '<ul>'} print qq~$content{$p2} by $whom{$author{$p2}}<br>~; $e++; my$f = 0; for my $p3 (sort {$a <=> $b} keys %parent){ # re(3) if( ($parent{$p3} == $p2) ){ unless($f > 0){ print '<ul>'} print qq~$content{$p3} by $whom{$author{$p3}}<br>~; $f++; } } if($f > 0){ print '</ul>'} } } if($e > 0){ print '</ul>'} } } if($d > 0){ print '</ul>'} } } print '</ul>'; }
Ugh! I tried again with a hash like the one in Re: Graphical Hierarchical Tree (that should've been a hash of arrays i guess), but had the same problem.
sub thread { my%children; for my $root (sort {$a <=> $b} keys %created){ # for all nodes my(@kids,$kids); # if doesn't have a parent and isn't a user if( ($nodetype{$root} ne 'user') ){ for my $child (sort {$a <=> $b} keys %parent){ # for all childre +n if( ($parent{$child} == $root) ){ # if this is a child push @kids, $child; } # if $kids = join '|', @kids; $children{$root} = $kids; } } } the rest is predictable noise...
What's the best way to find the entire tree of replies?

Replies are listed 'Best First'.
Re: threaded display of replies
by mikfire (Deacon) on Apr 28, 2001 at 00:20 UTC
    What if you were to turn this around a bit and not order the datastructure but added an element that preserved the order?

    My idea would be to have the primary data structure look something like this and keyed by the node id.

    my %created = ( AUTHOR => $author, TYPE => $node_type, TIME => $node_time, PARENT => $parent_id || '', KIDS => [], );
    One pass will load the structure from the data stream. A second pass will populate the KIDS array something like this. If the datastream is in order, or you are willing to assume that any referenced parent node will be included in the stream, you could actually include this in the first pass and save a loop. Personally, I prefer to keep the data acquisition loop seperate from the data manipulation loop - it makes it easier to debug.
    for my $root ( sort { $a <=> $b } keys %created ) { my $parent = $created{$root}{PARENT}; push @{$created{$parent}{KIDS}}, $root; }
    One final loop, then, can print out all the nodes in, hopefully a hierchical order by using a bit of recursion ( breadth-first? )
    sub print_nodes { my @kids = @_; for my $id ( @nodes ) { print "Interesting data from node\n"; if ( @{$created{$id}{KIDS} ) { print_nodes( @{$created{$id}{KIDS}; } } } # Some work to make sure we start at the parents and move # our way down correctly. grep would work just as well. my @topnodes = (); for my $id ( keys %created ) { push @topnodes, $id unless ( defined( $created{$id}{PARENT} ) ); } print_nodes( @topnodes );
    mikfire
Re: threaded display of replies
by Masem (Monsignor) on Apr 28, 2001 at 00:14 UTC
    Assuming that all the data contains is the node number and the node number of the parent node, you simply cannot find all children; you can only work backwards up from a node to the parent.

    However, and where it will get messy, is to take the child node, find it's parent (and all nodes in between), then increament on the node number starting with the ultimate parent node; if the parent of that node is one that is in the thread already, add it to the thread. This will get trees like:

    P +--1 +--2 +--3 +--4 +--5 <- the current node +--6 +--7 +--8
    Without increamenting on the node number, you cannot get 6, 7 and 8. Nor would you get 2.


    Dr. Michael K. Neylon - mneylon-pm@masemware.com || "You've left the lens cap of your mind on again, Pinky" - The Brain
Re: threaded display of replies
by ZZamboni (Curate) on Apr 29, 2001 at 07:48 UTC
    For my PerlMonks::NewestNodes module, I used a technique very similar to the one described by mikfire. All the nodes are stored in a hash indexed by node ID, and each element is a hash reference containing the different pieces of information about the node.

    The thread subroutine receives as argument a list of nodes to thread (this allows adding new nodes to the existing structure without rethreading everything). Then it goes through those nodes, and for each one it visits its parent and inserts its ID into the parent's "kids" element, which is an array ordered chronologically. Nodes that do not have a parent_node element are marked as "top nodes".

    This is complicated a little bit by the fact that the Newest Nodes XML Generator will give you a node, but not its parent, when the parent is already marked as seen. So the thread subroutine also checks to see which nodes it doesn't have yet, and adds those to a list of "nodes to get". The nodes without parents are added to a list of "nodes to rethread". Nodes that are threaded correctly are removed from the list. At the end, any necessary nodes are grabbed (using the Node Query XML Generator, you pass it an argument of the form "nodes=id1,id2,..."), added to the list, and the loop repeats.

    Once every node has its kids element properly populated, it's just a matter of descending recursively from each "top node" to generate its tree of replies.

    Shameless plug: check out Shendal's Perl/Tk Newest Nodes Client, which uses my module to show you a very nice threaded display of the Newest Nodes.

    --ZZamboni

Will this break your program?
by Sprad (Hermit) on Apr 28, 2001 at 00:21 UTC
    What happens when somebody doesn't use the Re: Re: Re: Re: pattern, and perversely changes the title in their reply? (like I did)

    Cases like this aside, though, it'd be a welcome addition.

    ---
    A fair fight is a sign of poor planning.

      Sprad, if you look at the XML stream, the parent node id is included probably to avoid exactly this issue.

      mikfire

      well, each node should be represented by a node_id in a DB or whatever file U use and this should know (store) the id of its parent, so you can't mess up things. And btw, there is no need to let anode know if it has children or not. If there are children THEY WILL KNOW their parent. Keep it simple: stupid. :-)
      btw BmeCat works this way and made it to a standard.

      Have a nice day
      All decision is left to your taste

        so said little:

          And btw, there is no need to let anode [sic] know if it has children or not. If there are children THEY WILL KNOW their parent.

        Unless, of course, you want to find all the children of a particular node without doing an exhaustive search of the DB. While not strictly necessary, it can make life simpler in some cases. I think this application might qualify as one of those cases (see Masem's reply). :-)

        bbfu
        Seasons don't fear The Reaper.
        Nor do the wind, the sun, and the rain.
        We can be like they are.

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://76249]
Approved by root
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others cooling their heels in the Monastery: (8)
As of 2024-04-24 11:10 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found