(tye)Re: Data::Dumper Efficiency Problem
by tye (Sage) on Jan 04, 2001 at 00:03 UTC
|
My partially wild guess is that Data::Dumper's stuffing
everything into one big string causes lots of realloc()s
that can't be done in place, because Perl malloc()s other things
in between, so the growing string is repeatedly copied
to new places where there is enough space to hold it
all in one piece.
The correct solution is for Data::Dumper to be fixed to
know how to write to a Perl file handle!
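Until something like that exists, one workaround is to dump a big hash one top-level entry at a time and print each piece as soon as it is generated. This is only a rough sketch: dump_hash_to_fh is a made-up helper name, not part of Data::Dumper, and because every entry is dumped independently, references shared between entries are not preserved.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical helper (not part of Data::Dumper): dump a hash one top-level
# entry at a time, printing each piece immediately instead of building one
# huge string. Cross-references between entries are NOT preserved.
sub dump_hash_to_fh {
    my ($fh, $name, $hash) = @_;
    local $Data::Dumper::Indent = 1;
    local $Data::Dumper::Terse  = 1;    # emit bare values, no "$VAR1 = "
    print {$fh} "\$$name = {\n";
    for my $key (sort keys %$hash) {
        # Only this one entry exists as a string at any moment.
        my $k = Dumper($key);
        my $v = Dumper($hash->{$key});
        chomp($k, $v);
        print {$fh} "$k => $v,\n";
    }
    print {$fh} "};\n";
}

# Usage: an in-memory filehandle is used here; a real file works the same way.
my %data = (alpha => [1, 2, 3], beta => { nested => 'yes' }, gamma => "it's");
my $out = '';
open(my $fh, '>', \$out) or die "cannot open in-memory handle: $!";
dump_hash_to_fh($fh, 'data', \%data);
close($fh);
print $out;
```

The largest string built at any time is the dump of a single top-level value, so this only helps when the data is spread over many entries rather than concentrated in one.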
-
tye
(but my friends call me "Tye")
Re (tilly) 1: Data::Dumper Efficiency Problem
by tilly (Archbishop) on Jan 04, 2001 at 00:08 UTC
Re: Data::Dumper Efficiency Problem
by Trinary (Pilgrim) on Jan 04, 2001 at 00:12 UTC
|
I used to swear by Data::Dumper, but I don't anymore. I'm not sure about the internals, but lately I've become really frustrated by it. Doing performance analysis under Win32 (Win32::PerfLib, not in CPAN), I end up dumping large hash structures all the time. When I was researching the format of these things, I tried to get a dump of one base-level object (System, for those who care). It ran out of memory, swap... everything. It wouldn't finish running, and it was using well over 200M of memory. I have since written my own (somewhat dumb) replacement, which took an hour or two, and I suggest either following suit or searching around here for something that has enough functionality for what you need and is simpler than Data::Dumper. If there's interest, I'll post my lil snippet, but it's basically trivial.
Trinary
|
Trinary,
Please do post! I'm very interested in this.
Thanks,
madhatter
|
Ask, and ye shall receive:
This is just a sub, pretty basic actually and probably broken in a couple of ways. It takes a ref as its argument and starts a-printin'. I haven't done any performance testing vs. Data::Dumper.
Begin code
sub dumpref {
    my $testref = shift;
    my $levels  = shift || 0;    # nesting depth; 0 at the top level
    if (ref($testref) eq 'HASH') {
        print "{\n";
        $levels++;
        my $maxlevel = scalar(keys %$testref);
        my $curlevel = 0;        # counts entries so the last one gets no comma
        foreach my $key (keys %$testref) {
            $curlevel++;
            print " " x $levels;
            print $key;
            print " => ";
            my $val = $testref->{$key};
            if (ref($val)) {
                dumpref($val, $levels);
            } else {
                # Escape every backslash and single quote in the value
                $val =~ s#\\#\\\\#g;
                $val =~ s#'#\\'#g;
                print "'$val'";
            }
            print "," if $curlevel < $maxlevel;
            print "\n";
        }
        print " " x ($levels - 1) . "}";
    } elsif (ref($testref) eq 'ARRAY') {
        print "[\n";
        $levels++;
        my $maxlevel = scalar(@$testref);
        my $curlevel = 0;        # counts elements so the last one gets no comma
        foreach my $elem (@$testref) {
            my $val = $elem;     # copy, so escaping never alters the caller's data
            $curlevel++;
            print " " x $levels;
            if (ref($val)) {
                dumpref($val, $levels);
            } else {
                # Escape every backslash and single quote in the value
                $val =~ s#\\#\\\\#g;
                $val =~ s#'#\\'#g;
                print "'$val'";
            }
            print "," if $curlevel < $maxlevel;
            print "\n";
        }
        print " " x ($levels - 1) . "]";
    } else {
        # Any other kind of ref (code, scalar, ...) is only named, not dumped
        print ref($testref);
        print "\n";
    }
}
End code
Use at your own risk, but it handles basic stuff ok, I think. =b
Trinary
|
Re: Data::Dumper Efficiency Problem
by repson (Chaplain) on Jan 04, 2001 at 06:34 UTC
|
Another option, depending on your data, is XML::Simple.
XMLout can take a filename or filehandle, which may reduce
memory use by writing output immediately instead of storing it (I don't know
if it does). XMLin is supposed to always recreate the original
data structure...
It also gives you buzzword compliance, and a structure parseable
without needing perl.
As to your original question, Data::Dumper may be
creating self-referential output, which means that it has to remember and constantly check everything
that has already passed through it. Read the module docs to find out whether this may be happening and what you should do about it (possibly call $OBJ->Reset under the OO interface, depending on how you are doing things).
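To make the self-reference point concrete, here is a small sketch (the $node structure is invented for illustration). It shows Data::Dumper emitting a back-reference for a structure that points at itself, and Reset clearing the object's table of seen references between two Dump calls:

```perl
use strict;
use warnings;
use Data::Dumper;

# A made-up structure that refers back to itself.
my $node = { name => 'root' };
$node->{self} = $node;

# Data::Dumper records every reference it has visited, so the inner link
# is written as a back-reference to $VAR1 instead of recursing forever.
$Data::Dumper::Indent = 1;
my $text = Dumper($node);
print $text;

# Under the OO interface the object holds that table of seen references.
# Reset empties it, so the second Dump starts from a clean state.
my $dumper = Data::Dumper->new([$node], ['node']);
$dumper->Sortkeys(1);            # stable key order for comparison
my $first = $dumper->Dump;
$dumper->Reset;
my $second = $dumper->Dump;
print "identical after Reset\n" if $first eq $second;
```

Whether this bookkeeping is what actually makes a large dump slow is a separate question; the sketch only shows where the state lives and how to clear it.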
|
This is not the way XML::Simple works. XML::Simple is designed to let you read an XML file (with some restrictions) and use the data it contains, or update it and write it back out. Although I have never tried it, I would bet it will not output arbitrary data structures as XML (although that might be fun!).
On the other hand, XML::Dumper and Data::DumpXML will dump data to XML. I have no idea how fast they are, though (and considering Data::DumpXML is also written by Gisle Aas, I don't think it will be faster than Data::Dumper).
|
Directly from the XML::Simple docs:
XMLout()
Takes a data structure (generally a hashref) and returns
an XML encoding of that structure. If the resulting XML is
parsed using XMLin(), it will return a data
structure equivalent to the original.
That sounds similar to what Data::Dumper is being used for here.