I've been able to mostly avoid XML until today.
We need to update hundreds of MS VS2010 project (XML) files automatically.
Doing that by hand would be tedious and error-prone, so I'd like to write a script to do it.
I've prepared an illustrative cut-down example of such a script, which changes
the directory "ReleaseDLL" to "ReleaseDLL32" in various places in the XML.
Since this is my first attempt to parse XML using Perl,
I welcome any advice you may have to offer.
In particular:
- After some random googling, I chose to use XML::LibXML. Is that a wise choice?
- Given that I want to make minor updates to many XML files, is the overall approach below ok? Is there a better approach?
- I had a hell of a time getting XPath to work (see code below). And I don't really understand what I did with namespaces, though it does appear to work. Suggestions welcome.
- The XPath query "PropertyGroup[contains(\@Condition,'$proj')]" is inelegant in that it only selects the required PropertyGroup; the script then has to iterate manually through each element in the group. It seems better to select the required nodes directly with a more complicated XPath expression and avoid the iteration, but I have no clue how to write such a query.
Here is an example (cut-down) project XML file to be updated, fred.vcxproj:
<?xml version="1.0" encoding="utf-8"?>
<Project DefaultTargets="Build" ToolsVersion="4.0" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Debug Tandem|x64'">
<OutDir>.\DebugTandem\</OutDir>
<IntDir>.\DebugTandem\</IntDir>
<TargetName>fred$(ProjectName)</TargetName>
</PropertyGroup>
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Release DLL|Win32'">
<OutDir>.\../../products/bin/ReleaseDLL\</OutDir>
<IntDir>.\ReleaseDLL\</IntDir>
<LinkIncremental>false</LinkIncremental>
<TargetName>fred$(ProjectName)</TargetName>
</PropertyGroup>
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Release DLL|x64'">
<OutDir>.\../../products/bin/ReleaseDLL\</OutDir>
<IntDir>.\ReleaseDLL\</IntDir>
<LinkIncremental>false</LinkIncremental>
<TargetName>fred$(ProjectName)</TargetName>
</PropertyGroup>
</Project>
Here is my cut-down test program, txml1.pl:
use strict;
use warnings;
use XML::LibXML;
use XML::LibXML::XPathContext;
sub read_file_contents
{
    my $fname = shift;
    open( my $fh, '<', $fname ) or die "error: open '$fname': $!\n";
    binmode $fh;
    local $/ = undef; # slurp mode
    my $s = <$fh>;
    close($fh);
    return $s;
}
sub write_file_contents
{
    my ( $fname, $data ) = @_;
    my $overw = -e $fname ? " (overwriting)" : "";
    print "creating '$fname'$overw...";
    open( my $fh, '>', $fname ) or die "error: open '$fname': $!";
    binmode($fh);
    print {$fh} $data or die "error: write '$fname': $!";
    # Buffered write errors may only surface at close, so check it too.
    close($fh) or die "error: close '$fname': $!";
    print "done.\n";
}
my $fname = shift or die "usage: $0 fname\n";
print "xml file : '$fname'\n";
my $xmlstring = read_file_contents($fname);
# XXX: Hack for utf8 BOM.
# my $UTF8_BOM = chr(0xef) . chr(0xbb) . chr(0xbf);
my $UTF8_BOM = "";
# XXX: Without this damned billygates namespace I could not get XPath to work.
my $xpath_ns = 'billygates';
my $vs2010_ns = 'http://schemas.microsoft.com/developer/msbuild/2003';
my $outfile = 'fred.tmp';
my $proj = 'Release DLL|Win32';
my $targ = 'ReleaseDLL';
my $repl = 'ReleaseDLL32';
my $query = "PropertyGroup[contains(\@Condition,'$proj')]";
my $ns_query = "//$xpath_ns:$query";
my $parser = XML::LibXML->new();
my $doc = $parser->parse_string($xmlstring);
my $xc = XML::LibXML::XPathContext->new( $doc->documentElement() );
$xc->registerNs( $xpath_ns => $vs2010_ns );
print "query : $ns_query:\n";
for my $q ( $xc->findnodes($ns_query) )
{
    print $q->nodeName(), ":\n";
    for my $c ( $q->childNodes() )
    {
        my $name = $c->nodeName();
        my $val = $c->textContent();
        print " ", ref($c), ":", $name, ":\n";
        # \Q...\E makes $targ match literally rather than as a regex.
        if ( defined($val) && $val =~ m{[/\\]\Q$targ\E[/\\]} )
        {
            print " $name: val=$val: matches '$targ'\n";
            for my $t ( $c->childNodes() )
            {
                my $v = $t->data;
                print " ", ref($t), ":", $t->nodeName(), ":", $v, ":\n";
                print " old:", $v, ":\n";
                $v =~ s{([/\\])\Q$targ\E([/\\])}{$1$repl$2} or die "oops";
                $t->setData($v);
                print " new:", $v, ":\n";
            }
        }
    }
}
write_file_contents( $outfile, $UTF8_BOM . $doc->toString(0) );
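As a possible simplification of read_file_contents and write_file_contents above, XML::LibXML can open and write the files itself via load_xml( location => ... ) and toFile. The following is only a minimal sketch: the scratch directory, the file names in.vcxproj and out.vcxproj, and the one-element sample document are hypothetical stand-ins, and how this path treats a UTF-8 BOM has not been checked here.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use XML::LibXML;

# Hypothetical scratch files standing in for a real .vcxproj and its output.
my $dir     = tempdir( CLEANUP => 1 );
my $infile  = "$dir/in.vcxproj";
my $outfile = "$dir/out.vcxproj";

# Create a tiny sample input so the sketch is self-contained.
open( my $fh, '>', $infile ) or die "error: open '$infile': $!\n";
print {$fh} <<'END_XML';
<?xml version="1.0" encoding="utf-8"?>
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
<IntDir>.\ReleaseDLL\</IntDir>
</Project>
END_XML
close($fh) or die "error: close '$infile': $!\n";

# Let the parser open and decode the file itself (no manual slurp).
my $doc = XML::LibXML->load_xml( location => $infile );

# ... modify $doc here, as in the main loop of txml1.pl ...

# Serialize straight to disk; format 0 adds no extra indentation.
$doc->toFile( $outfile, 0 );

# Re-read the output to confirm it is still well-formed XML.
my $copy      = XML::LibXML->load_xml( location => $outfile );
my $root_name = $copy->documentElement->nodeName;
print "round-tripped root element: $root_name\n";
```

Writing to a separate output file and re-parsing it before replacing the original keeps a failed run from damaging the project file.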
An example run of txml1.pl seems to more-or-less work, as shown below:
$ perl txml1.pl fred.vcxproj
xml file : 'fred.vcxproj'
query : //billygates:PropertyGroup[contains(@Condition,'Release DLL|Win32')]:
PropertyGroup:
XML::LibXML::Text:#text:
XML::LibXML::Element:OutDir:
OutDir: val=.\../../products/bin/ReleaseDLL\: matches 'ReleaseDLL'
XML::LibXML::Text:#text:.\../../products/bin/ReleaseDLL\:
old:.\../../products/bin/ReleaseDLL\:
new:.\../../products/bin/ReleaseDLL32\:
XML::LibXML::Text:#text:
XML::LibXML::Element:IntDir:
IntDir: val=.\ReleaseDLL\: matches 'ReleaseDLL'
XML::LibXML::Text:#text:.\ReleaseDLL\:
old:.\ReleaseDLL\:
new:.\ReleaseDLL32\:
XML::LibXML::Text:#text:
XML::LibXML::Element:LinkIncremental:
XML::LibXML::Text:#text:
XML::LibXML::Element:TargetName:
XML::LibXML::Text:#text:
creating 'fred.tmp' (overwriting)...done.
$ diff fred.vcxproj fred.tmp
2c2
< <Project DefaultTargets="Build" ToolsVersion="4.0" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
---
> <Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003" DefaultTargets="Build" ToolsVersion="4.0">
9,10c9,10
< <OutDir>.\../../products/bin/ReleaseDLL\</OutDir>
< <IntDir>.\ReleaseDLL\</IntDir>
---
> <OutDir>.\../../products/bin/ReleaseDLL32\</OutDir>
> <IntDir>.\ReleaseDLL32\</IntDir>
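On the last question in the list above (selecting the required nodes directly), one possible shape for such a query is sketched below. It is an assumption-laden sketch rather than a verified solution: the prefix 'msb' is an arbitrary name chosen here (only the namespace URI has to match the document), the embedded XML is a cut-down stand-in for fred.vcxproj, and it has not been run against real project files. Because the document declares a default namespace, every element step in the XPath needs a registered prefix, which is why both PropertyGroup and the wildcard step carry one.

```perl
use strict;
use warnings;
use XML::LibXML;
use XML::LibXML::XPathContext;

# Cut-down stand-in for fred.vcxproj (single-quoted heredoc: no interpolation).
my $xml = <<'END_XML';
<?xml version="1.0" encoding="utf-8"?>
<Project DefaultTargets="Build" ToolsVersion="4.0" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Release DLL|Win32'">
<OutDir>.\../../products/bin/ReleaseDLL\</OutDir>
<IntDir>.\ReleaseDLL\</IntDir>
<LinkIncremental>false</LinkIncremental>
</PropertyGroup>
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Release DLL|x64'">
<OutDir>.\../../products/bin/ReleaseDLL\</OutDir>
</PropertyGroup>
</Project>
END_XML

my $proj = 'Release DLL|Win32';
my $targ = 'ReleaseDLL';
my $repl = 'ReleaseDLL32';

my $doc = XML::LibXML->load_xml( string => $xml );
my $xc  = XML::LibXML::XPathContext->new($doc);

# 'msb' is an arbitrary prefix; it is bound to the document's default namespace URI.
$xc->registerNs( msb => 'http://schemas.microsoft.com/developer/msbuild/2003' );

# Select the text nodes themselves: text children of any element child
# of the matching PropertyGroup, restricted to text mentioning $targ.
my $query = "//msb:PropertyGroup[contains(\@Condition,'$proj')]"
          . "/msb:*/text()[contains(.,'$targ')]";

my @changed;
for my $t ( $xc->findnodes($query) ) {
    my $v = $t->data;
    # Same path-component check as txml1.pl, with $targ matched literally.
    next unless $v =~ s{([/\\])\Q$targ\E([/\\])}{$1$repl$2}g;
    $t->setData($v);
    push @changed, $t->parentNode->nodeName . '=' . $v;
}
print "$_\n" for @changed;
```

XPath 1.0 has no string replacement function, so the substitution itself still happens in Perl; the query only narrows the selection to the text nodes worth examining. On the stand-in document this should touch only the OutDir and IntDir text of the Win32 group and leave the x64 group alone.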