Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:
Is there a way to walk through an XML file? I want to parse an XML file line-by-line, taking action depending on the tag found. I'm making a simple backup script that will back up filesystems as well as {My | Postgre}SQL databases. The config file would look like this:
<config>
<logprefix>/var/log/sbackup</logprefix>
<storage>/home/backups</storage>
<item name = "SomeDatabase" type="MySQLDB">
<dbName>SomeDatabase</dbName>
<dbUser>SomeUser</dbUser>
<dbPass>password</dbPass>
</item>
<item name = "SomeFilesFiles" type="Filesystem">
<path>/home/globalherald</path>
</item>
</config>
I tried using XML::Simple, dumping the contents into a hash structure, and walking through that, but I've done something wrong. Is what I'm trying to do possible? Here's the script:
#!/usr/bin/perl
# sbackup.pl: Backing up the server. Looks for a file
# called sbackup.xml in /etc which specifies, which databases
# and which directories to back up.
#
# TODO:
# Add tar/database dump logs to logs
# Add postgresql dumper
# Add tar/database dump options to XML
use strict;
use warnings;
use XML::Simple;
use IO::File;
use vars qw($XMLConfig $logprefix $fh);
sub LogMsg
{
my $consecho = 1;
my $deadly = shift;
my $message = shift;
my $timestamp = localtime(time);
open LogFile, ">>$logprefix/sbackup.log" || die "Can't open logfile!";
print LogFile "$timestamp: $message\n";
if ($consecho == 1)
{
print "$timestamp: $message\n";
}
return;
}
sub ParseConfig
{
$fh = new IO::File('/etc/sbackup.xml') or LogMsg(1, "Can't open sbackup.xml!");
$XMLConfig = XMLin($fh);
return;
}
sub BackupMysqlDatabase
{
my $item = shift;
my $dbname = shift;
my $dbuser = shift;
my $dbpass = shift;
my $command = "mysqldump --quick --add-locks --add-drop-table -a -e -F -K -u $dbuser -p $dbpass $dbname";
#my $result = system($command);
print $command;
my $result = 0;
if ($result == 0)
{
LogMsg(0, "Backup of $item, database $dbname successful!");
}
else
{
LogMsg(0, "Backup of $item, database $item failed!");
}
return;
}
sub BackupFS
{
my $item = shift;
my $frompath = shift;
my $storage = shift;
my $command = "tar cf $storage/$item.tar $frompath";
#my $result = system($command);
print $command;
my $result = 0;
if ($result == 0)
{
LogMsg(0, "Backup of $item, pathname $frompath successful!");
}
else
{
LogMsg(0, "Backup of $item, pathname $frompath failed!");
}
return;
}
my $localstorage;
my %currentFS = ("item", "", "source", "") ;
my %currentDB = ("item", "", "dbname", "", "dbuser", "", "dbpass", "");
my $currentitem;
ParseConfig();
foreach my $element ($XMLConfig)
{
print "The element is $element!\n";
my %somehash = {$element};
my @hashkeys = keys %somehash;
print "Keys are: @hashkeys\n";
if ($element eq "Logprefix")
{
$logprefix = $XMLConfig->logprefix;
}
if ($element eq "Storage")
{
$localstorage = $XMLConfig->storage;
}
if ($element eq "Item")
{
if ($element->{type} eq "MySQLDB")
{
foreach my $element2 ($element)
{
if ($element2 eq "DbName")
{
$currentDB{item} = $element->{item};
$currentDB{dbname} = $element2->{DbName};
}
if ($element2 eq "DbUser")
{
$currentDB{dbuser} = $element2->{DbUser};
}
if ($element2 eq "DbPass")
{
$currentDB{dbpass} = $element2->{DbPass};
}
}
BackupMySQLDatabase($currentDB{item}, $currentDB{dbname}, $currentDB{dbuser}, $currentDB{dbpass});
}
if ($element->{type} eq "Filesystem")
{
foreach my $element2 ($element)
{
if ($element2 eq "Path")
{
$currentFS{item} = $element->{Item};
$currentFS{path} = $element2->{Path};
}
}
BackupFS($currentFS{item}, $currentFS{path}, $localstorage);
}
}
}
close ($fh);
Thanks everybody!
Re: Walking thru XML
by arturo (Vicar) on Nov 20, 2003 at 23:48 UTC
The two classic XML-handling strategies are tree-based and stream-based (or event-based). Tree-based modules include XML::Simple and the heavier, more full-featured XML::DOM. The best-known stream-based API is SAX, which was originally defined for Java, although it's not surprising that XML::SAX exists for Perl. The tree-based strategy loads a whole XML document into memory, which allows for some neat tricks. The stream-based strategy deals with elements as they are encountered: SAX turns the various parts of an XML file into events (e.g. "here's a start element", "here are some characters", and so forth). You say you want to process the file "line-by-line," which makes it sound as if you want a stream-based API, but your example suggests otherwise.
Your goal seems to be to take the individual <config> elements and turn them into hashes or objects. That's not a "line-by-line" strategy, that's "little trees" or, as one might call them, twigs ... (blatant plug for XML::Twig here).
Your example suggests a half-way strategy: you want to grab each config element and its subelements and deal with that chunk, processing them one at a time. You could load everything into one master tree, then "walk" through the tree selecting each config element in turn. If you have a lot of things to process, though, that could get expensive memory-wise. If that's not a problem, feel free to stick with XML::Simple.
Now, with respect to your actual goal here, XML::Simple can do a perfectly fine job, although I find it a little bit hard to use (probably because I haven't fully internalized how it turns elements and their attributes into data structures -- forgive me, grantm -- I know this behavior is configurable =). With a little study and care, you could certainly make better use of it than what follows as an example.
I do know enough to point out that you're using it incorrectly, though. $XMLConfig is a reference to a complex data structure. Assuming you have some element wrapping a bunch of config elements similar to the one you posted above, it will be a reference to a hash that has a key called config, whose value is a reference to an array of other things, which are in turn quite complex themselves. Each of those "other things" (the elements of the array reference) corresponds to a config element and its contents in your file. So the basic outer processing loop would look like this:
foreach my $config ( @{ $XMLConfig->{config} } ) {
    my $logprefix = $config->{logprefix};
    #etc ...
}
Finishing that up is left as an exercise for the reader =) If you want to get a better handle on what the data structure looks like at any point, use Data::Dumper to print out the structure for you.
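For the config file exactly as posted, where config is itself the root element, the structure is a little simpler than the wrapped case: XML::Simple discards the root element by default, so logprefix, storage and item sit at the top level of the hash. Here is a minimal sketch of that case, assuming the ForceArray and KeyAttr options suit this config. The inline $xml string and the @plan array are hypothetical, added only so the example runs on its own:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Simple;
use Data::Dumper;

# Inline copy of the posted config. Closing tags must match their opening
# tags exactly, including case (</dbUser>, not </dbuser>).
my $xml = <<'XML';
<config>
  <logprefix>/var/log/sbackup</logprefix>
  <storage>/home/backups</storage>
  <item name="SomeDatabase" type="MySQLDB">
    <dbName>SomeDatabase</dbName>
    <dbUser>SomeUser</dbUser>
    <dbPass>password</dbPass>
  </item>
  <item name="SomeFilesFiles" type="Filesystem">
    <path>/home/globalherald</path>
  </item>
</config>
XML

# ForceArray => ['item'] keeps <item> an array even when there is only one.
# KeyAttr => [] stops XML::Simple folding the items into a hash keyed on "name".
my $config = XMLin($xml, ForceArray => ['item'], KeyAttr => []);

# Show the whole structure so you can see what you are walking.
print Dumper($config);

my $logprefix = $config->{logprefix};
my $storage   = $config->{storage};

# Walk the items and decide what to do with each one, based on its type.
my @plan;
foreach my $item ( @{ $config->{item} } ) {
    if ( $item->{type} eq 'MySQLDB' ) {
        push @plan, "mysql:$item->{name}:$item->{dbName}:$item->{dbUser}";
    }
    elsif ( $item->{type} eq 'Filesystem' ) {
        push @plan, "fs:$item->{name}:$item->{path}";
    }
}
print "$_\n" for @plan;
```

In a real script you would call your backup subs where this sketch pushes onto @plan.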
As an aside, I know your code is skeletal, but note that system doesn't capture a command's output; it returns the command's exit status. You could use backticks or qx// to capture the output, but let me suggest that you pipe mysqldump's output to a file and then deal appropriately with the file.
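To illustrate the difference between an exit status and captured output, here is a rough sketch. It assumes a child perl process as a stand-in for mysqldump, so that it runs without a database; @dump_cmd and the other variable names are hypothetical:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Stand-in for mysqldump: a child perl that prints one line.
# $^X is the path of the perl interpreter running this script.
my @dump_cmd = ( $^X, '-e', 'print "-- dump of SomeDatabase\n"' );

# 1. system() returns the wait status, not the output.
#    The command's exit code is in the high byte.
my $status    = system( $^X, '-e', 'exit 3' );
my $exit_code = $status >> 8;
print "exit code: $exit_code\n";

# 2. A piped open (list form, so no shell is involved) captures the output.
open( my $pipe, '-|', @dump_cmd ) or die "Can't run dump command: $!";
my $captured = do { local $/; <$pipe> };
close($pipe) or die "Dump command failed, status $?";
print "captured: $captured";

# 3. For a backup, copy the output to a file line by line
#    rather than holding the whole dump in memory.
my ( $out, $dumpfile ) = tempfile( SUFFIX => '.sql', UNLINK => 1 );
open( my $in, '-|', @dump_cmd ) or die "Can't run dump command: $!";
print {$out} $_ while <$in>;
close($in)  or die "Dump command failed, status $?";
close($out) or die "Can't close $dumpfile: $!";
print "wrote ", -s $dumpfile, " bytes to $dumpfile\n";
```

Checking the result of close on the pipe is what tells you whether the dump command itself succeeded.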
Finally, let me give you a start on how you might use XML::Twig for this job. The basic framework might look like this:
#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;
# create a new Twig object that will call the "config"
# subroutine once it's seen a complete "config" element
my $twig = XML::Twig->new(
    twig_handlers => {
        'config' => \&config
    });
$twig->parsefile("configs.xml");
sub config {
    my ($t, $config) = @_;    # $config is a config element
    # first_child takes a tag name; child() expects a numeric offset first
    my $logprefix = $config->first_child("logprefix")->text;
    my @items = $config->children("item");
    foreach my $item ( @items ) {
        my $name = $item->att('name');
        my $type = $item->att('type');
        # and so forth
    }
}
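If you'd rather handle each item as soon as it has been parsed, a variation on the same idea is to put the handler on item instead of config. This is only a sketch under assumptions: the inline $xml string, the %settings hash and the @jobs array are hypothetical names, and it parses a string rather than a file so that it runs on its own:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;

# Inline copy of the posted config, with matching closing tags.
my $xml = <<'XML';
<config>
  <logprefix>/var/log/sbackup</logprefix>
  <storage>/home/backups</storage>
  <item name="SomeDatabase" type="MySQLDB">
    <dbName>SomeDatabase</dbName>
    <dbUser>SomeUser</dbUser>
    <dbPass>password</dbPass>
  </item>
  <item name="SomeFilesFiles" type="Filesystem">
    <path>/home/globalherald</path>
  </item>
</config>
XML

my %settings;    # logprefix and storage
my @jobs;        # one hash per <item>

my $twig = XML::Twig->new(
    twig_handlers => {
        # each handler fires once its element has been completely parsed
        logprefix => sub { $settings{logprefix} = $_[1]->text },
        storage   => sub { $settings{storage}   = $_[1]->text },
        item      => \&item,
    },
);
$twig->parse($xml);

sub item {
    my ( $t, $item ) = @_;
    my %job = (
        name => $item->att('name'),
        type => $item->att('type'),
    );
    # copy each child element (dbName, path, ...) into the hash by tag name
    $job{ $_->tag } = $_->text for $item->children('#ELT');
    push @jobs, \%job;
    $t->purge;    # free the memory used by everything parsed so far
}

print "$_->{type}: $_->{name}\n" for @jobs;
```

The purge call is what keeps memory use flat on a big file; for a config this small you could leave it out.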
YMMV, of course, but I find the twiggish way of doing it easier to understand. HTH!
If not P, what? Q maybe? "Sidney Morgenbesser"
Re: Walking thru XML
by princepawn (Parson) on Nov 20, 2003 at 22:21 UTC
Re: Walking thru XML
by mirod (Canon) on Nov 21, 2003 at 11:37 UTC
Re: Walking thru XML
by Jaap (Curate) on Nov 20, 2003 at 22:22 UTC