eyepopslikeamosquito has asked for the wisdom of the Perl Monks concerning the following question:
I've been able to mostly avoid XML until today. We need to update hundreds of MS VS2010 project (XML) files automatically. Doing this by hand would be tedious and error-prone, so I'd like to write a script to do it. I've prepared an illustrative cut-down example of such a script, which changes the directory "ReleaseDLL" to "ReleaseDLL32" in various places in the XML.
Since this is my first attempt to parse XML using Perl, I welcome any advice you may have to offer. In particular:
- After some random googling, I chose to use XML::LibXML. Is that a wise choice?
- Given that I want to make minor updates to many XML files, is the overall approach below ok? Is there a better approach?
- I had a hell of a time getting XPath to work (see code below). And I don't really understand what I did with namespaces, though it does appear to work. Suggestions welcome.
- The XPath query "PropertyGroup[contains(\@Condition,'$proj')]" is inelegant in that it selects only the required PropertyGroup, after which the script manually iterates through each element in the group. It seems better to select the required nodes directly as part of a more complicated XPath expression and avoid the iteration, but I have no clue how to write an XPath query to do that.
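As a minimal sketch of what the namespace registration is doing (not taken from the original script): the project file declares a default namespace, and an unprefixed name in XPath 1.0 only matches elements in no namespace, so a plain "//PropertyGroup" finds nothing. The prefix "msb" and the inline sample document below are assumptions made for illustration; any prefix works as long as the URI matches the one in the file.

```perl
use strict;
use warnings;
use XML::LibXML;
use XML::LibXML::XPathContext;

# Tiny inline document (an assumption for this sketch) using the same
# default namespace as the .vcxproj files.
my $xml = <<'XML';
<?xml version="1.0" encoding="utf-8"?>
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Release DLL|Win32'">
    <IntDir>.\ReleaseDLL\</IntDir>
  </PropertyGroup>
</Project>
XML

my $doc = XML::LibXML->load_xml( string => $xml );
my $xc  = XML::LibXML::XPathContext->new($doc);

# 'msb' is an arbitrary prefix chosen here; only the URI has to match.
$xc->registerNs( msb => 'http://schemas.microsoft.com/developer/msbuild/2003' );

# Unprefixed names mean "no namespace" in XPath 1.0, so this matches nothing.
my @plain = $xc->findnodes('//PropertyGroup');

# With the registered prefix, the same element is found.
my @prefixed = $xc->findnodes('//msb:PropertyGroup');

printf "unprefixed: %d, prefixed: %d\n", scalar(@plain), scalar(@prefixed);
```

The prefix lives only in the XPath context; it is not written into the document when it is serialized.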
Here is an example (cut-down) project XML file to be updated, fred.vcxproj:
<?xml version="1.0" encoding="utf-8"?>
<Project DefaultTargets="Build" ToolsVersion="4.0" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Debug Tandem|x64'">
    <OutDir>.\DebugTandem\</OutDir>
    <IntDir>.\DebugTandem\</IntDir>
    <TargetName>fred$(ProjectName)</TargetName>
  </PropertyGroup>
  <PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Release DLL|Win32'">
    <OutDir>.\../../products/bin/ReleaseDLL\</OutDir>
    <IntDir>.\ReleaseDLL\</IntDir>
    <LinkIncremental>false</LinkIncremental>
    <TargetName>fred$(ProjectName)</TargetName>
  </PropertyGroup>
  <PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Release DLL|x64'">
    <OutDir>.\../../products/bin/ReleaseDLL\</OutDir>
    <IntDir>.\ReleaseDLL\</IntDir>
    <LinkIncremental>false</LinkIncremental>
    <TargetName>fred$(ProjectName)</TargetName>
  </PropertyGroup>
</Project>
Here is my cut-down test program, txml1.pl:
use strict;
use warnings;
use XML::LibXML;
use XML::LibXML::XPathContext;

sub read_file_contents {
    my $fname = shift;
    open( my $fh, '<', $fname ) or die "error: open '$fname': $!\n";
    binmode $fh;
    local $/ = undef;    # slurp mode
    my $s = <$fh>;
    close($fh);
    return $s;
}

sub write_file_contents {
    my ( $fname, $data ) = @_;
    my $overw = -e $fname ? " (overwriting)" : "";
    print "creating '$fname'$overw...";
    open( my $fh, '>', $fname ) or die "error: open '$fname': $!";
    binmode($fh);
    print {$fh} $data or die "error: write '$fname': $!";
    close($fh);
    print "done.\n";
}

my $fname = shift or die "usage: $0 fname\n";
print "xml file : '$fname'\n";
my $xmlstring = read_file_contents($fname);

# XXX: Hack for utf8 BOM.
# my $UTF8_BOM = chr(0xef) . chr(0xbb) . chr(0xbf);
my $UTF8_BOM = "";

# XXX: Without this damned billygates namespace I could not get XPath to work.
my $xpath_ns  = 'billygates';
my $vs2010_ns = 'http://schemas.microsoft.com/developer/msbuild/2003';

my $outfile = 'fred.tmp';

my $proj = 'Release DLL|Win32';
my $targ = 'ReleaseDLL';
my $repl = 'ReleaseDLL32';

my $query    = "PropertyGroup[contains(\@Condition,'$proj')]";
my $ns_query = "//$xpath_ns:$query";

my $parser = XML::LibXML->new();
my $doc    = $parser->parse_string($xmlstring);
my $xc     = XML::LibXML::XPathContext->new( $doc->documentElement() );
$xc->registerNs( $xpath_ns => $vs2010_ns );

print "query : $ns_query:\n";
for my $q ( $xc->findnodes($ns_query) ) {
    print $q->nodeName(), ":\n";
    for my $c ( $q->childNodes() ) {
        my $name = $c->nodeName();
        my $val  = $c->textContent();
        print " ", ref($c), ":", $name, ":\n";
        if ( defined($val) && $val =~ m{[/\\](?:$targ)[/\\]} ) {
            print " $name: val=$val: matches '$targ'\n";
            for my $t ( $c->childNodes() ) {
                my $v = $t->data;
                print " ", ref($t), ":", $t->nodeName(), ":", $v, ":\n";
                print " old:", $v, ":\n";
                $v =~ s{([/\\])$targ([/\\])}{$1$repl$2} or die "oops";
                $t->setData($v);
                print " new:", $v, ":\n";
            }
        }
    }
}

write_file_contents( $outfile, $UTF8_BOM . $doc->toString(0) );
An example run of this program seems to more-or-less work, as shown below:
$ perl txml1.pl fred.vcxproj
xml file : 'fred.vcxproj'
query : //billygates:PropertyGroup[contains(@Condition,'Release DLL|Win32')]:
PropertyGroup:
 XML::LibXML::Text:#text:
 XML::LibXML::Element:OutDir:
 OutDir: val=.\../../products/bin/ReleaseDLL\: matches 'ReleaseDLL'
 XML::LibXML::Text:#text:.\../../products/bin/ReleaseDLL\:
 old:.\../../products/bin/ReleaseDLL\:
 new:.\../../products/bin/ReleaseDLL32\:
 XML::LibXML::Text:#text:
 XML::LibXML::Element:IntDir:
 IntDir: val=.\ReleaseDLL\: matches 'ReleaseDLL'
 XML::LibXML::Text:#text:.\ReleaseDLL\:
 old:.\ReleaseDLL\:
 new:.\ReleaseDLL32\:
 XML::LibXML::Text:#text:
 XML::LibXML::Element:LinkIncremental:
 XML::LibXML::Text:#text:
 XML::LibXML::Element:TargetName:
 XML::LibXML::Text:#text:
creating 'fred.tmp' (overwriting)...done.

$ diff fred.vcxproj fred.tmp
2c2
< <Project DefaultTargets="Build" ToolsVersion="4.0" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
---
> <Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003" DefaultTargets="Build" ToolsVersion="4.0">
9,10c9,10
<     <OutDir>.\../../products/bin/ReleaseDLL\</OutDir>
<     <IntDir>.\ReleaseDLL\</IntDir>
---
>     <OutDir>.\../../products/bin/ReleaseDLL32\</OutDir>
>     <IntDir>.\ReleaseDLL32\</IntDir>
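The nested loops in txml1.pl could probably be folded into the XPath expression itself, so that the query returns the text nodes to be edited directly. The following is only a sketch under stated assumptions: the helper name rename_dir, the prefix "msb" and the inline sample document are hypothetical, and $proj and $targ are assumed not to contain a single quote, since they are interpolated into the XPath string.

```perl
use strict;
use warnings;
use XML::LibXML;
use XML::LibXML::XPathContext;

my $MSBUILD_NS = 'http://schemas.microsoft.com/developer/msbuild/2003';

# Hypothetical helper: inside every PropertyGroup whose Condition mentions
# $proj, rename directory $targ to $repl. Returns the number of text nodes
# that were changed.
sub rename_dir {
    my ( $doc, $proj, $targ, $repl ) = @_;
    my $xc = XML::LibXML::XPathContext->new($doc);
    $xc->registerNs( msb => $MSBUILD_NS );    # 'msb' is an arbitrary prefix

    # Select the candidate text nodes directly: any child element of the
    # matching PropertyGroup whose text contains $targ.
    my $query = "//msb:PropertyGroup[contains(\@Condition,'$proj')]"
              . "/msb:*/text()[contains(.,'$targ')]";

    my $changed = 0;
    for my $t ( $xc->findnodes($query) ) {
        my $v = $t->data;
        # Same check as the original: $targ must be a whole path component.
        if ( $v =~ s{([/\\])\Q$targ\E([/\\])}{$1$repl$2} ) {
            $t->setData($v);
            $changed++;
        }
    }
    return $changed;
}

# Cut-down sample input, modelled on fred.vcxproj.
my $xml = <<'XML';
<?xml version="1.0" encoding="utf-8"?>
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Release DLL|Win32'">
    <OutDir>.\../../products/bin/ReleaseDLL\</OutDir>
    <IntDir>.\ReleaseDLL\</IntDir>
    <LinkIncremental>false</LinkIncremental>
  </PropertyGroup>
  <PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Release DLL|x64'">
    <IntDir>.\ReleaseDLL\</IntDir>
  </PropertyGroup>
</Project>
XML

my $doc     = XML::LibXML->load_xml( string => $xml );
my $changed = rename_dir( $doc, 'Release DLL|Win32', 'ReleaseDLL', 'ReleaseDLL32' );
my $out     = $doc->toString(0);
print "changed $changed text node(s)\n";
```

Because the helper works on an already parsed document and reports how many nodes it touched, a driver script could call it once per project file and skip rewriting files where the count is zero.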
Replies are listed 'Best First'.
- Re: Some questions from beginning user of XML::LibXML and XPath, by Corion (Patriarch) on Oct 16, 2012 at 09:16 UTC
- Re: Some questions from beginning user of XML::LibXML and XPath, by choroba (Cardinal) on Oct 16, 2012 at 10:10 UTC
- Re: Some questions from beginning user of XML::LibXML and XPath, by Jim (Curate) on Oct 16, 2012 at 16:27 UTC
- Re: Some questions from beginning user of XML::LibXML and XPath, by KevinZwack (Chaplain) on Oct 16, 2012 at 16:50 UTC
- Re: Some questions from beginning user of XML::LibXML and XPath, by Jenda (Abbot) on Oct 17, 2012 at 14:10 UTC