Beefy Boxes and Bandwidth Generously Provided by pair Networks
Your skill will accomplish
what the force of many cannot
 
PerlMonks  

recursive complex structure or something like that?

by Isanchez (Acolyte)
on Jul 26, 2003 at 02:06 UTC ( [id://278024]=perlquestion: print w/replies, xml ) Need Help??

Isanchez has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks, I have a dataset that contains vocabulary entries such as:

term_Id Parent_Id
0 1 domestic_animal
2 1 dog
3 2 terrier
4 2 collie
5 3 fox_terrier

The first number in the column is a unique index for the word. The second indicates the parent category represented by some other word that it belongs to. So, terrier has the unique index 3, but it is a type of dog so its parent id is the unique Id of dog i.e. 2. Collie is another type of dog, it has the unique id 4 and since it is a dog too its parent is also 2. fox terrier is a type of terrier, so its parent is the index of terrier 3.

In the actual file there are categories that have up to 17 levels in depth. The root term, i.e. the highest node in the hierarchy is "vocabulary" and has 12 immediate doughters. I have an input such as the one above and I have to come up with an output that will have all their descendant terms. Something like:

domestic_animal dog, terrier, collie, fox_terrier, cat, chesire_cat
vehicles car, SUV, Ford, Ford_Passat, airplane, boeing_747

I imagine that a recursive function is needed to keep collecting doughter terms. and the structure in which they have to be stored is probably a hash of arrays.

Can anyone lead me at least to a begining or some piece of code, algorithm to do this?
thank you very much,
Ivo

  • Comment on recursive complex structure or something like that?

Replies are listed 'Best First'.
Re: recursive complex structure or something like that?
by diotalevi (Canon) on Jul 26, 2003 at 02:20 UTC

    I established a dictionary where each term can have children. The actual rooted structure is a tree. Or what Zaxo said. You have to watch out for circular references in my code.

    my %dict; # A reverse index on %dict by id. while (my $line = <$fh>) { my ( $id, $parent, $term ) = split ' ', $line, 3; # Fixed from -3 +per sauoq $dict{$id}{'id'} = $id; $dict{$id}{'parent'} = $parent; push @{$dict{$parent}{'children'}}, $dict{$id}; } use Data::Dumper; print Dumper( \ %dict );

    Or with the circular reference problem nixed

    use Util::Scalar 'weaken'; my %dict; # A reverse index on %dict by id. while (my $line = <$fh>) { my ( $id, $parent, $term ) = split ' ', $line, 3; # Fixed per sauo +q $dict{$id}{'id'} = $id; $dict{$id}{'parent'} = $parent; push @{$dict{$parent}{'children'}}, $dict{$id}; weaken $dict{$parent}{'children'}[-1]; } use Data::Dumper; print Dumper( \ %dict );
      = split ' ', $line, -3;

      You almost certainly want

      = split ' ', $line, 3;
      not -3.

      -sauoq
      "My two cents aren't worth a dime.";
      

      Dear Saint Diotalevi

      thank you so much for your reply,

      I used the input file:

      1 0 domestic_animal 2 1 dog 3 2 terrier 4 2 collie 5 3 fox_terrier
      I used your code (only modified for input):
      $in = "sample.txt"; open (IN, $in) or die "cant open the in\n"; # my %dict; # A reverse index on %dict by id. # my %dict; # A reverse index on %dict by id. while (not eof (IN)) { $line = <IN>; my ( $id, $parent, $term ) = split (' ', $line, 3) ; $dict{$id}{'id'} = $id; $dict{$id}{'parent'} = $parent; push @{$dict{$parent}{'children'}}, $dict{$id}; } use Data::Dumper; print Dumper( \ %dict );

      and this is what i get:

      $VAR1 = { '4' => { 'parent' => '2', 'id' => '4' }, '1' => { 'parent' => '0', 'children' => [ { 'parent' => '1', 'children' => [ { 'parent' => '2' +, 'children' => [ + { + 'parent' => '3', + 'id' => '5' + } ] +, 'id' => '3' }, $VAR1->{'4'} ], 'id' => '2' } ], 'id' => '1' }, '3' => $VAR1->{'1'}{'children'}[0]{'children'}[0], '0' => { 'children' => [ $VAR1->{'1'} ] }, '2' => $VAR1->{'1'}{'children'}[0], '5' => $VAR1->{'1'}{'children'}[0]{'children'}[0]{'children' +}[0] };

      what I would need is a print out such that each mother term has all its descendents printed to right.

      Am I doing something wrong?

      Thank you very much, Ivo

        No, its just that the code I originally provided doesn't assume we know what the root is. If you follow $dict{1} you'll notice that everything works out alright. So the data is correct, you just may be mistaken on what you think the correct answer is (and that also handles cycles. You should use my second example with weaken(). The first example leaks memory).

Re: recursive complex structure or something like that?
by Zaxo (Archbishop) on Jul 26, 2003 at 02:19 UTC

    Graph is great for this kind of thing. Use the id numbers to name nodes, the string as an attribute, and the parent number to form an edge. With Graph.pm, you don't need to worry about memory leaks due to circular parentage.

    After Compline,
    Zaxo

Re: recursive complex structure or something like that?
by gmax (Abbot) on Jul 26, 2003 at 08:10 UTC

    I believe that Tree::DAG_Node should serve your purposes.

    Just to give you an example:

    #!/usr/bin/perl -w use strict; use Tree::DAG_Node; my $vocabulary = Tree::DAG_Node->new; $vocabulary->name("vocabulary"); my $vehicles = Tree::DAG_Node->new; $vehicles->name("vehicles"); my $animals = Tree::DAG_Node->new; $animals->name("animals"); $animals->new_daughter->name("domestic"); $animals->new_daughter->name("wild"); ($animals->daughters)[0]->new_daughter->name("dog"); ($animals->daughters)[1]->new_daughter->name("tiger"); my $sciences = Tree::DAG_Node->new; $sciences->name("sciences"); $vocabulary->add_daughters($animals, $sciences, $vehicles); print "tree\n"; print map "$_\n", @{$vocabulary->draw_ascii_tree}; print "\ndump names\n"; print $vocabulary->dump_names; print "\nanimals and descendants\n"; print join(",", map {$_->name} $animals->self_and_descendants), "\n"; __END__ tree | <vocabulary> /--------------+----------\ | | | <animals> <sciences> <vehicles> /---------\ | | <domestic> <wild> | | <dog> <tiger> dump names 'vocabulary' 'animals' 'domestic' 'dog' 'wild' 'tiger' 'sciences' 'vehicles' animals and descendants animals,domestic,dog,wild,tiger

    One of the great features of Tree::DAG_Node is that you can use the module to build your structure, and then export it to a list of lists.

    my $lol =$vocabulary->tree_to_lol; my $code =$vocabulary->tree_to_lol_notation({multiline=>1}); print "$code\n"; __END__ [ [ [ [ 'dog' ], 'domestic' ], [ [ 'tiger' ], 'wild' ], 'animals' ], [ 'sciences' ], [ 'vehicles' ], 'vocabulary' ],

    You can save the code in text mode and then eval it, or you can use Storable to store it and then retrieve it from another script.

    Update. It also works the other way around, i.e. you can create a new tree from a lol using the appropriate constructor Tree::DAG_Node->lol_to_tree.

    See the basics in Introduction to Tree::DAG_Node (in Tutorials). Be aware that Tree::DAG_Node does not handle circular references, AFAIK.

     _  _ _  _  
    (_|| | |(_|><
     _|   
    
Re: recursive complex structure or something like that?
by Limbic~Region (Chancellor) on Jul 26, 2003 at 04:23 UTC
    Isanchez,
    Since you already have other apparently working solutions, I think you have domestic_animal backwards. I would think that it should be 1 0 instead of 0 1. The first number is supposed to be the index and the second is supposed to be the parent - both dog and domestic_animal have the same parent, but there is no item in the list of 1 for the index.

    I figured this might help if you were trying to track down a bug.

    Cheers - L~R

Re: recursive complex structure or something like that?
by Anonymous Monk on Jul 26, 2003 at 11:38 UTC
    Tree::Nary is a less uptodate but working alternativ for n-ary trees. If you are adventerous you can dig Tree::Nary::Extended which slurps your one-level hash or dbi table from/into a tree. bye, Murat

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://278024]
Approved by Zaxo
Front-paged by rnahi
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others chanting in the Monastery: (5)
As of 2024-04-19 04:47 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found