(dkubb) Re: (2) XML parsing and SAX event handlers

by dkubb (Deacon)
on Jul 13, 2002 at 10:13 UTC (#181474)


in reply to Predefining complex data structures?

Many of the approaches in this thread centered on XML::Simple. Why not try XML::SAX and build your own SAX event handler? I believe it can satisfy your requirements while at the same time providing more flexibility than XML::Parser's interface.

A good introduction to creating SAX event handlers can be found at XML::SAX::Intro in the XML::SAX distribution on CPAN.

To address your question, here's a working example:

#!/usr/bin/perl -wT
use strict;
use XML::SAX;
use Data::Dumper qw(DumperX);

my $handler = My::SAXParser->new;
my $parser  = XML::SAX::ParserFactory->parser(Handler => $handler);

#pass the XML document at the bottom __DATA__ tag to the parser
$parser->parse_string(do { local $/; <DATA> });

print DumperX($handler->nodes);

{
  #this class keeps track of the processed nodes
  package My::SAXParser;
  use strict;
  use base qw(XML::SAX::Base);
  use Class::MethodMaker
    get_set => ['nodes'],
    list    => ['element_stack'];
  use constant SKIP_NODE => 'xml';

  sub start_document { shift->nodes({}) }

  sub start_element {
    my $self = shift;
    my $el   = shift;

    return if $el->{Name} eq SKIP_NODE;

    #make note of which element we are processing - in the stack
    $self->element_stack_push(\my %element);

    foreach my $attribute (values %{$el->{Attributes}}) {
      push @{$element{attributes}}, @$attribute{qw(Name Value)};
    }

    #keep track of all interesting element nodes
    push @{ $self->nodes->{$el->{Name}} }, \%element;

    return $self->SUPER::start_element($el);
  }

  sub characters {
    my $self = shift;

    #are there any pending element nodes to process?
    return unless $self->element_stack_count;

    return $self->SUPER::characters($self->element_stack->[-1]->{text} .= shift->{Data});
  }

  sub end_element {
    my $self = shift;

    #element has been processed, pop it off the stack
    $self->element_stack_pop;

    return $self->SUPER::end_element(shift);
  }
}

__DATA__
<xml>
<requirement contactname="Joe Average">A power cord.</requirement>
<requirement contactname="Jane Smith" contactnumber="555-1212">A node name</requirement>
</xml>

This should produce the following output:

$VAR1 = {
          'requirement' => [
                             {
                               'text' => 'A power cord.',
                               'attributes' => [
                                                 'contactname',
                                                 'Joe Average'
                                               ]
                             },
                             {
                               'text' => 'A node name',
                               'attributes' => [
                                                 'contactnumber',
                                                 '555-1212',
                                                 'contactname',
                                                 'Jane Smith'
                                               ]
                             }
                           ]
        };
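Because each element's attributes are stored as a flat list of name/value pairs, reading them back is usually easiest after turning that list into a hash. Here is a minimal sketch, assuming a structure shaped like the output above. The `$nodes` variable and the `describe` helper are names made up for illustration and are not part of the script:

```perl
use strict;
use warnings;

# Hypothetical copy of the structure returned by $handler->nodes
my $nodes = {
    requirement => [
        { text => 'A power cord.', attributes => [ contactname => 'Joe Average' ] },
        { text       => 'A node name',
          attributes => [ contactnumber => '555-1212', contactname => 'Jane Smith' ] },
    ],
};

# Illustrative helper: build a one-line summary of an element
sub describe {
    my ($element) = @_;

    # the flat (name, value, name, value, ...) list becomes a hash
    my %attr = @{ $element->{attributes} || [] };

    my $who = $attr{contactname} || 'unknown';
    my $tel = exists $attr{contactnumber} ? " ($attr{contactnumber})" : '';

    return "$who$tel: $element->{text}";
}

print describe($_), "\n" for @{ $nodes->{requirement} };
```

The order of the pairs inside `attributes` is not guaranteed, since the handler reads them with `values` on a hash, so looking attributes up by name is safer than relying on their position.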

I tested this code with the other XML document example you posted in this thread. It can parse it and I believe it produces a pretty reasonable output.

Also, if performance is an issue, it's possible to gain further speed by using XML::LibXML::SAX::Parser or XML::SAX::Expat. Either of these modules can pretty much just be dropped into the above script by modifying two lines of the script's code: the use and new constructor statements.
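If you would rather pin a particular parser than rely on whatever the factory picks, XML::SAX::ParserFactory honours the `$XML::SAX::ParserPackage` variable. Here is a minimal sketch, assuming XML::SAX is installed. XML::SAX::PurePerl is used only because it ships with XML::SAX, so substitute XML::SAX::Expat or XML::LibXML::SAX::Parser if you have them:

```perl
use strict;
use warnings;
use XML::SAX;
use XML::SAX::Base;

# Force a specific parser class for this scope only
local $XML::SAX::ParserPackage = 'XML::SAX::PurePerl';

# A do-nothing handler, just so the factory has something to drive
my $handler = XML::SAX::Base->new;
my $parser  = XML::SAX::ParserFactory->parser(Handler => $handler);

# Show which parser class was actually chosen
print ref($parser), "\n";
```

Printing `ref($parser)` is also a quick way to check which parser a script ends up with when the variable is left unset.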

Replies are listed 'Best First'.
Re: (4) XML parsing and SAX event handlers
by grantm (Parson) on Jul 15, 2002 at 08:16 UTC

    Either of these modules can pretty much just be dropped into the above script by modifying two lines of the script's code

    Actually, it shouldn't be necessary to modify the code at all. Your sample code uses XML::SAX::ParserFactory which will use the system default SAX parser (as defined in lib/XML/SAX/ParserDetails.ini). So if you install XML::SAX::Expat, your script will immediately make use of it.

Re: XML parsing and SAX event handlers
by Ionizor (Pilgrim) on Aug 02, 2002 at 20:13 UTC

    I found the SAX documentation rather confusing the first time I read it over, so I put it down for a while. Now I've picked it back up, and with a little help from O'Reilly's Perl and XML I'm recoding into XML::SAX.

    On a related note, I highly recommend O'Reilly's Safari service. Online books! It's very cool.
