http://qs321.pair.com?node_id=505431

fokat has asked for the wisdom of the Perl Monks concerning the following question:

Dear monks:

I have a (very) large archive of processed reports stored in the filesystem. Those reports are actually Perl objects that have been stored using code resembling this (heavy editing for simplification):

use strict;
use warnings;
use Storable qw/nstore/;
my $rep = bless { phrase => 'All your base belong to us' }, 'SomeClass';
my $file = 'my_file_name';
eval {
    nstore($rep, $file) || warn "Storable::nstore failed with $!\n";
};
if ($@) {
    warn "Failed to store the object: $@\n";
}

By using gzip/Compress::Zlib/IO::Zlib we could be saving, on average, 24% - 30% of the disk space according to the testing we've done. Then I thought that it would be very simple to ask Storable to use IO::Zlib to read the compressed, serialized objects.

So, here I am writing this Template::Plugin that can read and present those objects... And banging my head against this code:

if ($path =~ m/\.gz$/) {
    $fh = new IO::Zlib
}
else {
    $fh = new IO::File
}
$ctx->throw('Abuse.open', "Problem opening Abuse report: $!")
    unless $fh->open($path, "r");
$rep = fd_retrieve($fh);
close $fh;

This code is throwing an exception that reads: Not a GLOB reference at /usr/lib/perl5/site_perl/5.8.5/IO/Zlib.pm line 566. when taking the IO::Zlib path, but works beautifully when taking the IO::File branch.

This is the list of things that I've tested, without success:

I can't believe that I am the only one trying to use those two modules together, although I am ready to admit that I am the only one failing :)

I know that likely, something along these lines...

my $fh = IO::File->new("gunzip $file |");

...would work, but I want to avoid calling external programs and all the issues that may come out of it, especially when all the required code is already within Perl's module library. (Yes, I know that IO::Zlib will fall back to more or less that when no Compress::Zlib is around, but then I don't have to support IO::Zlib.)
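One way to stay entirely inside Perl's module library is to skip filehandles altogether and compress the frozen image in memory. This is a minimal sketch under assumptions, not what the thread settled on: store_gz and retrieve_gz are hypothetical helper names, and the pair only reads what it wrote itself. As far as I know, files written by nstore carry a different on-disk header than nfreeze output, so this is not a drop-in reader for an existing archive that was gzipped after the fact.

```perl
use strict;
use warnings;
use Storable qw(nfreeze thaw);
use IO::Compress::Gzip qw(gzip $GzipError);
use IO::Uncompress::Gunzip qw(gunzip $GunzipError);

# Hypothetical helper: serialize in network order, then gzip to $file.
sub store_gz {
    my ($obj, $file) = @_;
    my $frozen = nfreeze($obj);
    gzip(\$frozen => $file)
        or die "gzip failed: $GzipError\n";
    return 1;
}

# Hypothetical helper: gunzip $file into memory, then thaw the image.
sub retrieve_gz {
    my ($file) = @_;
    my $frozen;
    gunzip($file => \$frozen)
        or die "gunzip failed: $GunzipError\n";
    return thaw($frozen);
}
```

Because no filehandle is ever passed to Storable, the fileno() check inside fd_retrieve never comes into play; the cost is holding the whole serialized object in memory.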

For the record, these are the relevant versions of what I am using:

$ find /usr/lib/perl* -type f -name Zlib.pm -o -name Storable.pm | xargs egrep -i '\$version = '
/usr/lib/perl5/5.8.5/i386-linux-thread-multi/Storable.pm:$VERSION = '2.15';
/usr/lib/perl5/5.8.5/Memoize/Storable.pm:$VERSION = 0.65;
/usr/lib/perl5/site_perl/5.8.5/IO/Zlib.pm:$VERSION = "1.04";
/usr/lib/perl5/site_perl/5.8.5/i386-linux-thread-multi/Compress/Zlib.pm:$VERSION = "1.41" ;

Update: Although I failed to mention this originally, I did look at the code of both IO::File and Storable as suggested by frodo72 and samtregar...

This code:

sub AUTOLOAD {
    my $self = shift;
    $AUTOLOAD =~ s/.*:://;
    $AUTOLOAD =~ tr/a-z/A-Z/;
    return tied(*{$self})->$AUTOLOAD(@_);    # line 566
}

Is calling ->READ() when ->read() is called. That call is coming from C code, part of Storable (which I'll be looking at shortly, although I'm very weak with XS).

Update: Prompted by frodo72 and samtregar, I probed deeper in the IO::Zlib and Storable code...

Adding a flag, I found this:

IO::Zlib AUTOLOAD=IO::Zlib::FILENO self=IO::Zlib=HASH(0x9d3c8a0) at /usr/lib/perl5/site_perl/5.8.5/IO/Zlib.pm line 567

The call that is causing the exception happens within fd_retrieve, which is trying to verify that it was passed a filehandle.

sub fd_retrieve {
    my ($file) = @_;
    my $fd = fileno($file);                 # <== THIS IS THE CALL
    logcroak "not a valid file descriptor" unless defined $fd;
    my $self;
    my $da = $@;                            # Could be from exception handler
    eval { $self = pretrieve($file) };      # Call C routine
    logcroak $@ if $@ =~ s/\.?\n$/,/;
    $@ = $da;
    return $self;
}

Adding this to IO::Zlib...

sub FILENO { 1 }

...gets rid of the exception, but causes the check of the version number of the serialized file to fail, so there's something else.
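The FILENO dispatch seen above can be reproduced without IO::Zlib at all. The following is a minimal sketch, assuming a design like the one described in this thread: Demo::TiedHandle is a hypothetical stand-in class, not IO::Zlib's actual source. The public object is a blessed glob, the tie object behind it is a blessed hash, and AUTOLOAD assumes it always receives the glob.

```perl
use strict;
use warnings;

# Demo::TiedHandle is a hypothetical class that mimics the design
# discussed in the thread; it is not IO::Zlib's real code.
package Demo::TiedHandle;

use Symbol qw(gensym);
our $AUTOLOAD;

sub new {
    my ($class) = @_;
    my $glob = gensym();
    tie *{$glob}, $class;          # the tie object is a blessed HASH
    return bless $glob, $class;    # the public object is a blessed GLOB
}

sub TIEHANDLE { my ($class) = @_; return bless {}, $class }

sub DESTROY { }                    # keep object destruction out of AUTOLOAD

sub AUTOLOAD {
    my $self = shift;
    (my $method = $AUTOLOAD) =~ s/.*:://;
    $method = uc $method;
    # Works when $self is the glob; dies when $self is the tied hash.
    return tied(*{$self})->$method(@_);
}

package main;

my $fh = Demo::TiedHandle->new;

# The builtin fileno() notices the tie and calls FILENO on the tied hash.
# There is no FILENO method, so AUTOLOAD runs with the hash as $self
# and the glob dereference fails.
my $fd  = eval { fileno($fh) };
my $err = $@;
print "fileno() died: $err" if $err;
```

In this sketch the failure comes from AUTOLOAD being entered a second way, through the tie interface, with an object it was not written to handle. That would explain why the argument prints as a GLOB everywhere up to the fileno() call and as a HASH inside AUTOLOAD.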

Update: Following advice from sauoq, I dropped IO::Zlib in favor of PerlIO::gzip. The now fully working code looks similar to this...

$ctx->throw('Abuse.open', "Problem opening Abuse report: $!")
    unless $fh->open($path, ($path =~ /\.gz$/ ? "<:gzip" : "<"));
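For readers outside the Template::Plugin context, the same idea can be sketched with a plain open(). This is only an illustration under assumptions: open_mode_for and retrieve_report are hypothetical names, PerlIO::gzip is a CPAN module that must be installed separately, and error handling is reduced to die.

```perl
use strict;
use warnings;
use Storable qw(fd_retrieve);

# Hypothetical helper: choose the open() mode from the file name.
sub open_mode_for {
    my ($path) = @_;
    return $path =~ /\.gz\z/ ? '<:gzip' : '<';
}

# Hypothetical helper: read one stored report, compressed or not.
sub retrieve_report {
    my ($path) = @_;

    # Load the layer only when it is needed (PerlIO::gzip is not core).
    require PerlIO::gzip if $path =~ /\.gz\z/;

    open my $fh, open_mode_for($path), $path
        or die "Problem opening report $path: $!\n";

    # A layered handle is still a real filehandle, so the fileno()
    # check inside fd_retrieve is satisfied.
    my $rep = fd_retrieve($fh);
    close $fh;
    return $rep;
}
```

The compression is handled below the filehandle as an I/O layer, so Storable sees an ordinary handle and no tied object is involved.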

Best regards

-lem, but some call me fokat

Replies are listed 'Best First'.
Re: Do Storable and IO::Zlib like to play together?
by sauoq (Abbot) on Nov 03, 2005 at 19:32 UTC

    Solving this is going to take some effort. Here is a small piece of code that reproduces the error:

    #!/usr/bin/perl
    use Storable qw(nstore_fd fd_retrieve);
    use IO::Zlib;
    use Data::Dumper;
    my $test = { foo => "bar" };
    my $fh = IO::Zlib->new("out.gz", "wb");
    nstore_fd($test, $fh);
    $fh->close;
    my $fh2 = IO::Zlib->new("out.gz", "rb");
    my $href = fd_retrieve($fh2);
    $fh2->close;
    print Dumper $href;

    The error (message) is the same but I'm not positive the error itself is. In particular, $AUTOLOAD in this instance is IO::Zlib::FILENO (not READ) and it is being called from Storable::_store_fd (which is in _store_fd.al) which is in turn being called by Storable::nstore_fd (in nstore_fd.al)... But, going back and printing the argument when in each of those functions shows it to be IO::Zlib=GLOB(0x81242f8) which is a glob ref as desired. It is only when we get to Zlib's AUTOLOAD that it prints as IO::Zlib=HASH(0x81b84b8). So, it looks like the call to fileno() in _store_fd() is where things are breaking.

    And, sure enough...

    perl -MIO::Zlib -e '$fh = IO::Zlib->new("test.gz", "wb"); print fileno($fh)'
    That's a minimal demonstration of the problem. (So, you can stop looking at Storable as the real problem, anyway.)

    Adding some quick prints to see what's going on...

    sub AUTOLOAD {
        print "AUTOLOAD(@_)\n";
        my $self = shift;
        print $self, " $AUTOLOAD\n@{[caller]}\n";
        $AUTOLOAD =~ s/.*:://;
        $AUTOLOAD =~ tr/a-z/A-Z/;
        return tied(*{$self})->$AUTOLOAD(@_);
    }
    And running that:
    $ perl -MIO::Zlib -e '$fh = IO::Zlib->new("test.gz", "wb"); print fileno($fh)'
    AUTOLOAD(IO::Zlib=HASH(0x81e8b30))
    IO::Zlib=HASH(0x81e8b30) IO::Zlib::FILENO
    main -e 1
    Not a GLOB reference at /usr/lib/perl5/site_perl/5.8.0/IO/Zlib.pm line 566.
    Changing fileno($fh) to $fh->fileno results in:
    $ perl -MIO::Zlib -e '$fh = IO::Zlib->new("test.gz", "wb"); print $fh->fileno'
    AUTOLOAD(IO::Zlib=GLOB(0x8124538))
    IO::Zlib=GLOB(0x8124538) IO::Zlib::fileno
    main -e 1
    AUTOLOAD(IO::Zlib=HASH(0x81e8b3c))
    IO::Zlib=HASH(0x81e8b3c) IO::Zlib::ILENO
    IO::Zlib /usr/lib/perl5/site_perl/5.8.0/IO/Zlib.pm 566
    Not a GLOB reference at /usr/lib/perl5/site_perl/5.8.0/IO/Zlib.pm line 566.
    Ick! Where's that "IO::Zlib::ILENO" coming from?

    Oh well, that's where I'm at now. I've got to set it down for a bit as I have real paying work to do. :-) Maybe you or someone else would like to pick up there. I'll look deeper later on if no one else nails it.

    -sauoq
    "My two cents aren't worth a dime.";
    

      Thanks sauoq.

      That's a minimal demonstration of the problem.

      Yep. I guess we both got quite close to the problem. IO::Zlib does not seem to implement fileno. But even after faking it (as shown in my updated node), something else is still going wrong.

      (So, you can stop looking at Storable as the real problem, anyway.)

      Well, I don't understand why Storable wants to verify that it got a real handle. It should try to do its thing with whatever it got as an argument.

      But in this case, this does not look like the cause.

      Best regards

      -lem, but some call me fokat

        IO::Zlib does not seem to implement fileno.

        That's not really the problem, I don't think. At least, not the first one. We are never getting out of the AUTOLOAD routine (and not because a method is missing). The problem is that a call to fileno() doesn't play well with the tied handle provided by IO::Zlib.

        It should try to do its thing with whatever it got as an argument.

        That's debatable. I'd be inclined to do it Storable's way.

        -sauoq
        "My two cents aren't worth a dime.";
        
Re: Do Storable and IO::Zlib like to play together? (one solution)
by sauoq (Abbot) on Nov 03, 2005 at 20:58 UTC

    Screw IO::Zlib. Use PerlIO::gzip instead.

    You're using 5.8.5, so that should work for you just fine. Of course, that advice won't help if you really want to be compatible with earlier perls or if you changed your defaults... It's a nice clean solution for most 5.8.0+ installations though.

    -sauoq
    "My two cents aren't worth a dime.";
    
      Screw IO::Zlib. Use PerlIO::gzip instead.

      And as is to be expected from good advice, it works...

      And since PerlIO::gzip does not depend on gzip, it works for me.

      But I am still confused about where the problem was :(

      Thanks a lot

      Best regards

      -lem, but some call me fokat

Re: Do Storable and IO::Zlib like to play together?
by polettix (Vicar) on Nov 03, 2005 at 17:41 UTC
    It will probably shed no light, but in IO::Zlib you can find this:
    sub AUTOLOAD {
        my $self = shift;
        $AUTOLOAD =~ s/.*:://;
        $AUTOLOAD =~ tr/a-z/A-Z/;
        return tied(*{$self})->$AUTOLOAD(@_);    # <== line 566
    }
    I'm no AUTOLOAD expert, but it could be that instead of Storable::fd_retrieve, the interpreter is trying to call $fh->fd_retrieve and AUTOLOAD does the rest. Try to disambiguate explicitly:
    $rep = Storable::fd_retrieve($fh);
    Hope this can help!

    Flavio
    perl -ple'$_=reverse' <<<ti.xittelop@oivalf

    Don't fool yourself.

      Yes, I did look at that code. I tried your suggestion but it does not change the outcome. However, I'll take a bigger dig into Storable to see what the XS code is doing...

      Thanks and ++ for your suggestion.

      Best regards

      -lem, but some call me fokat

Re: Do Storable and IO::Zlib like to play together?
by samtregar (Abbot) on Nov 03, 2005 at 17:49 UTC
    Have you looked at the code that's throwing the error, Zlib.pm line 566? If I were tracking this down that's where I would look. It could give you a clue to why IO::Zlib isn't behaving as you expect.

    -sam

      I did, but I'll do so again... Thanks and ++ for your suggestion.

      Best regards

      -lem, but some call me fokat