PerlMonks  

XPath matching for object trees

by samtregar (Abbot)
on Feb 11, 2003 at 02:49 UTC ( [id://234301]=perlmeditation )

Greetings monks. Lately I've been working on a class implementing a tree data structure. Each object of the class has a list of children, which may contain any number of other objects of the same class, which may themselves have children. Each node has a name which identifies the type of node, like 'page' or 'paragraph'.

Many of the most common operations performed on these objects only need to affect a subset of the available children. For example, a loop that wants to process all paragraphs might look like:

   foreach my $child ($obj->children()) {
       next unless $child->name eq 'paragraph';
       # code goes here
   }

More complex cases involve needing to operate on a subset of children defined by two or more levels of containership. For example, to collect all the headers on all the pages:

   foreach my $child ($obj->children()) {
       next unless $child->name eq 'page';
       foreach my $grandchild ($child->children()) {
           next unless $grandchild->name eq 'header';
           # code goes here
       }
   }

Suddenly it occurred to me that I've seen patterns like this before, in code dealing with XML. And I've seen a good solution to the problem, XPath, which I've never really gotten a good chance to use. So, today I added a method to do XPath-style matching called match(). It only supports the simplest patterns, but I plan to improve it incrementally.

For example, the first block of code can be translated to:

   foreach my $para ($obj->match('paragraph')) {
       # code goes here
   }

That's no big deal, but the fun starts translating the second one:

   foreach my $header ($obj->match('/page/header')) {
       # code goes here
   }

Once I wrote match(), I realized I had gained a tool I hadn't expected: a way to uniquely identify a node in a tree. For example, to get the third paragraph on the fourth page (note that the indexes here count from zero, whereas standard XPath counts from one):

   my ($para) = $obj->match('/page[3]/paragraph[2]');
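
For readers who want to experiment, here is a minimal sketch of how a match() along these lines could work. It is not the code from the post: the TreeNode class, its add_child() and root() helpers, and the rule that a leading '/' starts from the root's children are all assumptions made for illustration, as is the zero-based index inferred from the example above.

```perl
use strict;
use warnings;

package TreeNode;    # hypothetical class name, not from the original post
use Scalar::Util qw(weaken);

sub new {
    my ($class, $name) = @_;
    return bless { name => $name, parent => undef, children => [] }, $class;
}

sub name     { $_[0]{name} }
sub parent   { $_[0]{parent} }
sub children { @{ $_[0]{children} } }

sub add_child {
    my ($self, $child) = @_;
    $child->{parent} = $self;
    weaken($child->{parent});    # avoid a parent <-> child reference cycle
    push @{ $self->{children} }, $child;
    return $child;
}

# Walk up the parent links until a node without a parent is found.
sub root {
    my ($node) = @_;
    $node = $node->parent while $node->parent;
    return $node;
}

# Supports only name steps with an optional [n] index, e.g.
# 'paragraph', '/page/header', '/page[3]/paragraph[2]'.
# Assumption: a leading '/' means "start at the root node";
# otherwise matching starts at the node match() was called on.
sub match {
    my ($self, $path) = @_;
    my @current = $path =~ s{^/}{} ? ($self->root) : ($self);
    for my $step (split m{/}, $path) {
        my ($name, $index) = $step =~ /^(\w+)(?:\[(\d+)\])?$/
            or die "unsupported step '$step'";
        my @next;
        for my $node (@current) {
            my @named = grep { $_->name eq $name } $node->children;
            if (defined $index) {
                # Zero-based index among same-named children (assumption).
                push @next, $named[$index] if $index < @named;
            }
            else {
                push @next, @named;
            }
        }
        @current = @next;
    }
    return @current;
}

package main;

# Build a document with four pages, each holding a header and three paragraphs.
my $doc = TreeNode->new('document');
for (1 .. 4) {
    my $page = $doc->add_child(TreeNode->new('page'));
    $page->add_child(TreeNode->new('header'));
    $page->add_child(TreeNode->new('paragraph')) for 1 .. 3;
}

my @headers = $doc->match('/page/header');
my ($para)  = $doc->match('/page[3]/paragraph[2]');
printf "%d headers\n", scalar @headers;    # prints "4 headers"
```

Each step narrows the current node set to the children with a matching name, so a two-level path replaces the nested foreach loops shown earlier.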

So not only is match() a simpler and cleaner interface for selecting nodes from the tree, it also offers an entirely novel feature: unique identifiers for nodes inside the tree. I added an xpath() method to return this path for a given node. I've already found one area in the application where this significantly reduces code complexity and I expect to find more.
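
As an illustration of the idea, an xpath() method could be sketched roughly as follows. This is an assumption-laden sketch rather than the method from the post: the TreeNode class and add_child() are hypothetical, and the index is taken to be the zero-based position among same-named siblings, which is what the '/page[3]/paragraph[2]' example suggests.

```perl
use strict;
use warnings;

package TreeNode;    # hypothetical class name, not from the original post
use Scalar::Util qw(weaken);

sub new {
    my ($class, $name) = @_;
    return bless { name => $name, parent => undef, children => [] }, $class;
}

sub name     { $_[0]{name} }
sub parent   { $_[0]{parent} }
sub children { @{ $_[0]{children} } }

sub add_child {
    my ($self, $child) = @_;
    $child->{parent} = $self;
    weaken($child->{parent});    # avoid a parent <-> child reference cycle
    push @{ $self->{children} }, $child;
    return $child;
}

# Build a path such as '/page[3]/paragraph[2]' by walking from the node
# up to the root. Each step records the node's name and its zero-based
# position among siblings that share that name (assumption).
sub xpath {
    my ($self) = @_;
    my @steps;
    my $node = $self;
    while (my $parent = $node->parent) {
        my $index = 0;
        for my $sibling ($parent->children) {
            last if $sibling == $node;    # compare by reference address
            $index++ if $sibling->name eq $node->name;
        }
        unshift @steps, sprintf '%s[%d]', $node->name, $index;
        $node = $parent;
    }
    return '/' . join '/', @steps;    # the root itself yields '/'
}

package main;

# Four pages, each with a header followed by three paragraphs.
my $doc   = TreeNode->new('document');
my @pages = map { $doc->add_child(TreeNode->new('page')) } 1 .. 4;
for my $page (@pages) {
    $page->add_child(TreeNode->new('header'));
    $page->add_child(TreeNode->new('paragraph')) for 1 .. 3;
}

# Third paragraph on the fourth page (child 0 of each page is the header).
my $para = ($pages[3]->children)[3];
print $para->xpath, "\n";    # prints "/page[3]/paragraph[2]"
```

Because every step carries an index, the resulting path names exactly one node, which is what makes it usable as an identifier.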

-sam

Replies are listed 'Best First'.
Re: XPath matching for object trees
by tomhukins (Curate) on Feb 11, 2003 at 13:28 UTC

    I like this approach. I find XPath a powerful, simple language and use it often for parsing XML. The idea of using XPath to process tree structures that aren't stored natively as XML interests me.

    I wonder if writing your own match() method to implement XPath-like syntax is the best approach, though. Rather than effectively writing another XPath parser, why not serialise your data tree to XML, process it with an existing parser such as XML::LibXML or XML::XPath, then convert the XML back into your initial tree format? That would be slower, but would allow you to reuse existing modules.

    The speed issue may be significant for you, but this approach might help other monks with similar problems.
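
To make the serialisation idea concrete, here is a rough sketch. The TreeNode class and its to_xml() method are hypothetical, and the sketch assumes node names are valid XML element names and that nodes carry no data beyond their name. The XML::LibXML part only runs if that module is installed, and converting the results back into the original tree format is left out.

```perl
use strict;
use warnings;

package TreeNode;    # hypothetical class name, not from the original post

sub new {
    my ($class, $name) = @_;
    return bless { name => $name, children => [] }, $class;
}

sub name     { $_[0]{name} }
sub children { @{ $_[0]{children} } }

sub add_child {
    my ($self, $child) = @_;
    push @{ $self->{children} }, $child;
    return $child;
}

# Serialise the node and its descendants as nested XML elements.
# Assumes every node name is already a valid XML element name.
sub to_xml {
    my ($self) = @_;
    my $name = $self->name;
    my @kids = $self->children;
    return "<$name/>" unless @kids;
    return "<$name>" . join('', map { $_->to_xml } @kids) . "</$name>";
}

package main;

# Two pages, each with one header and one paragraph.
my $doc = TreeNode->new('document');
for (1 .. 2) {
    my $page = $doc->add_child(TreeNode->new('page'));
    $page->add_child(TreeNode->new('header'));
    $page->add_child(TreeNode->new('paragraph'));
}

my $xml = $doc->to_xml;
print "$xml\n";

# Optional: query the serialised tree with a real XPath engine.
# Standard XPath names the root element and counts positions from one.
if (eval { require XML::LibXML; 1 }) {
    my $dom     = XML::LibXML->load_xml(string => $xml);
    my @headers = $dom->findnodes('/document/page/header');
    printf "%d headers found via XML::LibXML\n", scalar @headers;
}
```

Note that paths written for a real XPath engine differ from the match() examples in the post: the root element appears in the path, and the second page is page[2] rather than page[1].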

Re: XPath matching for object trees
by perrin (Chancellor) on Feb 11, 2003 at 16:12 UTC
    Looks like a good approach. I've seen people solve similar problems by implementing LDAP query syntax, but that doesn't allow for your unique identity trick.
Re: XPath matching for object trees
by diotalevi (Canon) on Feb 11, 2003 at 15:32 UTC

Node Type: perlmeditation [id://234301]
Approved by pfaut
Front-paged by broquaint