PerlMonks  

Applying XSL stylesheet specified in XML file to the XML

by blm (Hermit)
on Mar 24, 2009 at 07:41 UTC ( [id://752807]=perlquestion )

blm has asked for the wisdom of the Perl Monks concerning the following question:

Hi,

I have a file that I fetch using WWW::Mechanize, and it is XML. It starts with:

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="/foo/test.xsl"?>

I want to parse the XML with XML::LibXML, find the XSL file URL, retrieve it and apply it. At least that is what I think I have to do. I don't know how to get XML::LibXML to give me the declarations at the top (i.e. the <? ?> things).

All I really want is the HTML that results from applying the XSL to the XML. (I know, all you really want is a pony but this is about my question at the moment ;-) )

Thanks for any and all help. I am not fixed on using XML::LibXML so I would be interested in any other useful modules.

Replies are listed 'Best First'.
Re: Applying XSL stylesheet specified in XML file to the XML
by dHarry (Abbot) on Mar 24, 2009 at 08:50 UTC

    You need libxslt for that (in a libxml context). There are alternatives of course, see CPAN. I'm not too familiar with those alternatives. My personal favorite is Xalan.

    HTH
    dHarry

      Hi, Thanks for the reply. I realized that I need libxslt. But unless I am missing something I don't see how to pull the xsl uri out of the xml and feed it to libxslt (XML::LibXSLT). (Maybe I just need to grep for it.) That is my problem. Can you show me some code?

      Here is my code:
      use lib qw|/home/blm/perl/lib|;
      use strict;
      use WWW::Mechanize;
      use XML::LibXML;
      use XML::LibXSLT;

      my $mech = WWW::Mechanize->new(
          agent => 'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.1) Gecko/2008070206 Firefox/3.0.1'
      );
      my $url = 'https://some.url.here/';

      $mech->delete_header('accept-encoding');
      $mech->get($url);
      $mech->update_html($mech->content());
      print $mech->content;

      my $parser       = XML::LibXML->new();
      my $style_parser = XML::LibXML->new();
      my $xslt         = XML::LibXSLT->new();

      my $doc = $parser->parse_string($mech->content());
      print $doc->toString();

      my $stylesheet_location = ***Here is my problem***

      $mech->get($stylesheet_location);
      my $stylesheet_string = $mech->content();
      my $styledoc   = $style_parser->parse_string($stylesheet_string);
      my $stylesheet = $xslt->parse_stylesheet($styledoc);

      # transform() is a method of the parsed stylesheet, not of the XML::LibXSLT object
      my $results = $stylesheet->transform($doc);
      # output_string() serializes the result document according to the stylesheet's output settings
      print $stylesheet->output_string($results);

        Ah, I see. I have to disappoint you: I use XML::Twig for XML processing in Perl and my own tools for XSLT stuff. I would not grep for it; there must be a more XML-ish way of doing things. After all, it's just a node of a specific type, i.e. NodeType 'processing-instruction'. So I imagine parsing the file and walking the DOM tree to retrieve the information should do the trick. Suddenly the grepping doesn't sound so bad anymore ;) Another option is to use SAX; I see a processingInstructionSAXFunc in the libxml2 API. However, there is another Perl module that might come in handy: XML::LibXML::PI. Can't you do a getData on it?

        Mind you, in "my" environment it's as simple as one method call: getAssociatedStylesheet()!
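A minimal sketch of the DOM-walking idea mentioned above, assuming XML::LibXML is installed. The inline sample document and the @pi_data array are illustrative only; in real use the XML would come from the fetched content. It relies on the node-type constants that XML::LibXML exports and on getData, the same call used elsewhere in this thread.

```perl
use strict;
use warnings;
use XML::LibXML;    # also exports node-type constants such as XML_PI_NODE

# Inline sample standing in for the fetched XML (illustrative only).
my $xml = <<'EOX';
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="/foo/test.xsl"?>
<root/>
EOX

my $doc = XML::LibXML->new->parse_string($xml);

# An xml-stylesheet PI sits before the root element, so it is a direct
# child of the document node. The <?xml ...?> declaration is not a node.
my @pi_data;
for my $node ($doc->childNodes) {
    next unless $node->nodeType == XML_PI_NODE;
    next unless $node->nodeName eq 'xml-stylesheet';
    push @pi_data, $node->getData;
}

print "PI data: $_\n" for @pi_data;
```

The data comes back as one string of pseudo-attributes, so the href still has to be picked out of it by hand.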

Re: Applying XSL stylesheet specified in XML file to the XML
by ForgotPasswordAgain (Priest) on Mar 25, 2009 at 09:26 UTC

    I don't know how to tell what href is relative to, but even just to get the value of href is a bit gimpy for "processing instructions", which is what <? ... ?> are.
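On the question of what the href is relative to: as far as I understand it, a relative href in an xml-stylesheet PI is resolved against the URL of the XML document itself. A small sketch using the URI module, where $doc_url and $href are made-up values (with WWW::Mechanize the document URL would presumably come from $mech->uri):

```perl
use strict;
use warnings;
use URI;

# Hypothetical values: the URL the XML was fetched from, and the href
# taken from its xml-stylesheet processing instruction.
my $doc_url = 'https://some.url.here/data/feed.xml';
my $href    = '/foo/test.xsl';

# new_abs() resolves a possibly relative reference against a base URL;
# an href that is already absolute comes back unchanged.
my $stylesheet_url = URI->new_abs($href, $doc_url);

print "$stylesheet_url\n";    # https://some.url.here/foo/test.xsl
```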

    #!/usr/bin/perl -w
    use strict;
    use XML::LibXML;

    my $parser = XML::LibXML->new;
    my $doc = $parser->parse_string(<<'EOX');
    <?xml version="1.0" encoding="UTF-8"?>
    <?xml-stylesheet type="text/xsl" href ='abc"efg'?>
    <_/>
    EOX

    foreach my $node ($doc->findnodes('//processing-instruction()')) {
        my $name = $node->nodeName;
        if ($name eq 'xml-stylesheet') {
            # getData is a string like q{type="text/xsl" href="/test.xsl"}
            # which is what makes it annoying
            my $attr_str = $node->getData;

            # manually parse the string like href='abc"efg';
            # there might be a better way of doing this.
            # (.*?) followed by \1 stops at the matching closing quote;
            # a backreference does not work inside a character class like [^\1]
            my (undef, $href) = $attr_str =~ m{href\s*=\s*(['"])(.*?)\1};
            $href = '' unless defined $href;
            print "$name href: >>>$href<<<\n";
        }
    }
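For completeness, a rough end-to-end sketch of what the original question was after: find the stylesheet PI, load the stylesheet, and apply it with XML::LibXSLT. To keep it self-contained, both the XML and the XSL are inline strings invented for this example. In real use the stylesheet text would be fetched from the extracted href instead. Treat it as one possible approach under those assumptions, not the definitive way to do it.

```perl
use strict;
use warnings;
use XML::LibXML;
use XML::LibXSLT;

# Stand-ins for what would be fetched over HTTP (both invented for this sketch).
my $xml = <<'EOX';
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="/foo/test.xsl"?>
<greeting>hello</greeting>
EOX

my $xsl = <<'EOS';
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>
  <xsl:template match="/">
    <html><body><p><xsl:value-of select="greeting"/></p></body></html>
  </xsl:template>
</xsl:stylesheet>
EOS

my $parser = XML::LibXML->new;
my $doc    = $parser->parse_string($xml);

# Select only top-level PIs named xml-stylesheet, then pull href out of
# the pseudo-attribute string returned by getData.
my ($pi) = $doc->findnodes('/processing-instruction("xml-stylesheet")');
my $href;
if ($pi) {
    (undef, $href) = $pi->getData =~ m{href\s*=\s*(['"])(.*?)\1};
}
print "stylesheet href: ", (defined $href ? $href : '(none)'), "\n";

# Here $xsl stands in for the content that would be retrieved from $href.
my $xslt       = XML::LibXSLT->new;
my $style_doc  = $parser->parse_string($xsl);
my $stylesheet = $xslt->parse_stylesheet($style_doc);

# transform() is called on the parsed stylesheet; output_string()
# serializes the result according to the stylesheet's xsl:output settings.
my $result = $stylesheet->transform($doc);
my $html   = $stylesheet->output_string($result);
print $html;
```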

Node Type: perlquestion [id://752807]
Approved by Corion