PerlMonks  

How to read zipped file in perl

by Perlseeker_1 (Acolyte)
on Jan 09, 2014 at 09:08 UTC ( [id://1069928]=perlquestion )

Perlseeker_1 has asked for the wisdom of the Perl Monks concerning the following question:

Hi Experts,

I am reading a zipped file and counting the number of occurrences based on some condition. I am using IO::Zlib to read the zipped file.

The input zipped file
01,AB1,CDEF,CHINA,T1,
02,AB2,CDEF,CHINA,T1,
03,AB1,CDEF,JAPAN,T2,
04,AB2,CDEF,JAPAN,T2,
05,AB3,CDEF,JAPAN,T2,
06,AB4,CDEF,JAPAN,T2,
07,AB3,CDEF,CHINA,T1,
08,AB4,CDEF,CHINA,T1,
09,AB1,CDEF,CHINA,T1,
10,AB2,CDEF,CHINA,T1,
11,AB1,CDEF,JAPAN,T2,
12,AB2,CDEF,JAPAN,T2,
13,AB3,CDEF,JAPAN,T2,
14,AB4,CDEF,JAPAN,T2,
15,AB3,CDEF,CHINA,T1,
16,AB4,CDEF,CHINA,T1,

$handle = IO::Zlib->new( "$file", 'rb' ) or die "Zlib failed for $file";
$result{$_}++ for map { join '|', ( split /\|/ )[4] } <$handle>;
print Dumper(%result);

Output what i am getting

$VAR1 = 'T1';
$VAR2 = 8;
$VAR3 = '';
$VAR4 = 2;
$VAR5 = 'T2';
$VAR6 = 8;

output what I am Expecting is

AB1 --> T1 --> 2
AB2 --> T1 --> 2
AB1 --> T2 --> 2
AB2 --> T2 --> 2
AB3 --> T2 --> 2
AB4 --> T2 --> 2
AB3 --> T1 --> 2
AB4 --> T1 --> 2

Any ideas on this is helpful

Replies are listed 'Best First'.
Re: How to read zipped file in perl
by kcott (Archbishop) on Jan 09, 2014 at 14:49 UTC

    G'day Perlseeker_1,

    As already stated by others, there's a disconnect between your title, code and output.

    The following produces the output you seem to want based on the data you posted. Perhaps replacing my <DATA> with your <$handle> will achieve the result you're after.

    #!/usr/bin/env perl -l
    use strict;
    use warnings;

    my %result;
    ++$result{join ' --> ' => (split /,/)[1,4]} while <DATA>;
    print "$_ --> $result{$_}" for sort keys %result;

    __DATA__
    01,AB1,CDEF,CHINA,T1,
    02,AB2,CDEF,CHINA,T1,
    03,AB1,CDEF,JAPAN,T2,
    04,AB2,CDEF,JAPAN,T2,
    05,AB3,CDEF,JAPAN,T2,
    06,AB4,CDEF,JAPAN,T2,
    07,AB3,CDEF,CHINA,T1,
    08,AB4,CDEF,CHINA,T1,
    09,AB1,CDEF,CHINA,T1,
    10,AB2,CDEF,CHINA,T1,
    11,AB1,CDEF,JAPAN,T2,
    12,AB2,CDEF,JAPAN,T2,
    13,AB3,CDEF,JAPAN,T2,
    14,AB4,CDEF,JAPAN,T2,
    15,AB3,CDEF,CHINA,T1,
    16,AB4,CDEF,CHINA,T1,

    Output:

    AB1 --> T1 --> 2
    AB1 --> T2 --> 2
    AB2 --> T1 --> 2
    AB2 --> T2 --> 2
    AB3 --> T1 --> 2
    AB3 --> T2 --> 2
    AB4 --> T1 --> 2
    AB4 --> T2 --> 2
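The suggestion to swap <DATA> for <$handle> can be sketched roughly as follows. This is only an illustration under stated assumptions: the temporary file name sample.csv.gz and the four records written to it are made up for the demonstration, and the script first writes its own gzip file so that it is self-contained.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use IO::Zlib;

# Hypothetical file, created only for this demonstration.
my $dir  = tempdir( CLEANUP => 1 );
my $file = "$dir/sample.csv.gz";

# Write a few of the sample records to a gzip-compressed file.
my @records = (
    '01,AB1,CDEF,CHINA,T1,',
    '02,AB2,CDEF,CHINA,T1,',
    '03,AB1,CDEF,JAPAN,T2,',
    '09,AB1,CDEF,CHINA,T1,',
);
my $out = IO::Zlib->new( $file, 'wb' ) or die "Cannot write $file";
$out->print("$_\n") for @records;
$out->close;

# Read the file back line by line and count each (second, fifth) field pair.
my $handle = IO::Zlib->new( $file, 'rb' ) or die "Zlib failed for $file";
my %result;
while ( my $line = <$handle> ) {
    my ( $code, $type ) = ( split /,/, $line )[ 1, 4 ];
    $result{"$code --> $type"}++;
}
$handle->close;

print "$_ --> $result{$_}\n" for sort keys %result;
```

With a real input file, only the reading half is needed; the writing half exists purely to make the sketch runnable on its own.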

    -- Ken

Re: How to read zipped file in perl
by Anonymous Monk on Jan 09, 2014 at 09:39 UTC

    "Any ideas on this is helpful"

    I have some ideas :)

    The title of your question has nothing to do with "output what I am Expecting is" -- you seem to be reading the zipped file just fine.

    The "output what I am Expecting is" has nothing to do with the output you get.

    So, maybe, try to write some code to get "output what I am Expecting is" and pretend that the part that's working doesn't exist, like this:

    #!/usr/bin/perl --
    use strict;
    use warnings;
    use Data::Dump qw/ dd /;

    open my($handle), '<:raw', \'
    01,AB1,CDEF,CHINA,T1,
    02,AB2,CDEF,CHINA,T1,
    03,AB1,CDEF,JAPAN,T2,
    04,AB2,CDEF,JAPAN,T2,
    05,AB3,CDEF,JAPAN,T2,
    06,AB4,CDEF,JAPAN,T2,
    07,AB3,CDEF,CHINA,T1,
    08,AB4,CDEF,CHINA,T1,
    09,AB1,CDEF,CHINA,T1,
    10,AB2,CDEF,CHINA,T1,
    11,AB1,CDEF,JAPAN,T2,
    12,AB2,CDEF,JAPAN,T2,
    13,AB3,CDEF,JAPAN,T2,
    14,AB4,CDEF,JAPAN,T2,
    15,AB3,CDEF,CHINA,T1,
    16,AB4,CDEF,CHINA,T1,
    ';

    my %result;
    while( <$handle>){
        my @fudge = join '|', ( split /\|/, $_ )[4] ;
        dd( $_ => \@fudge );
        $result{$_}++ for @fudge;
        dd( \%result );
    }
    dd( \%result );
    __END__

Re: How to read zipped file in perl
by Eily (Monsignor) on Jan 09, 2014 at 09:43 UTC

    This works far better than it should: (split /\|/)[4] means that you split the line on | and take the fifth field (Edit: not fourth, thanks choroba). Since your values are separated by commas, either that's not your real input or it's not the code you actually used. Do read the documentation on split if you intend to use it. The [4] means "only the fifth value", and join is meant to join several values, so in this case it is useless.
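A small sketch of the point about the separator, using one line of the posted data. The variable names $with_pipe and $with_comma are made up for the illustration, and the comma is assumed to be the real separator because that is what the posted input shows.

```perl
use strict;
use warnings;

my $line = "01,AB1,CDEF,CHINA,T1,\n";

# The line contains no '|', so split returns the whole line as one field
# and there is nothing at index 4.
my $with_pipe = ( split /\|/, $line )[4];

# Splitting on the comma yields the individual fields; index 4 is the fifth.
my $with_comma = ( split /,/, $line )[4];

print defined $with_pipe ? "pipe:  $with_pipe\n" : "pipe:  undef\n";
print "comma: $with_comma\n";
```

Under these assumptions the first line reports undef and the second reports T1, which is why the posted code cannot be the one that produced the posted output.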

    For your expected result, you could do something like that :

    while (my $line = <$handle>) {
        my ($number, $key1, $dontcare, $country, $key2) = split /:/, $line; # Something to change here
        $result{$key1}{$key2}++;
    }
    print Dumper(\%result);

    You still have to change something for the split to work. That's because I didn't want you to just copy and paste my code without understanding anything, and because I'm not really sure what your value separator actually is. In the list of variables in the my(), you should rename key1 and key2 to give them meaningful names, and you can replace all the variables you won't use with undef.
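For illustration, here is one way the undef placeholders could look. The names $code and $type are invented for this sketch, the comma separator is an assumption taken from the posted data, and the three input lines are a subset chosen to keep the example short.

```perl
use strict;
use warnings;

my @lines = (
    "01,AB1,CDEF,CHINA,T1,\n",
    "03,AB1,CDEF,JAPAN,T2,\n",
    "09,AB1,CDEF,CHINA,T1,\n",
);

my %result;
for my $line (@lines) {
    # undef placeholders discard the fields that are not needed.
    my ( undef, $code, undef, undef, $type ) = split /,/, $line;
    $result{$code}{$type}++;
}

# Walk the two-level hash to print one line per (code, type) pair.
for my $code ( sort keys %result ) {
    for my $type ( sort keys %{ $result{$code} } ) {
        print "$code --> $type --> $result{$code}{$type}\n";
    }
}
```

The nested hash keeps the two keys separate, which makes it easy to report per code or per type later, at the cost of a two-level loop when printing.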

    To get something more useful from Data::Dumper, you should write print Dumper(\%result);. The backslash returns a reference to your hash, which means that instead of calling Dumper on each element (key or value) of %result, you call it on %result as a whole.
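The difference can be seen side by side in a short sketch. The two-entry hash is made-up sample data, and setting $Data::Dumper::Sortkeys is an optional extra used here only to make the key order of the reference dump stable.

```perl
use strict;
use warnings;
use Data::Dumper;

$Data::Dumper::Sortkeys = 1;    # stable key order for the reference dump

my %result = ( T1 => 8, T2 => 8 );

# Flattened: Dumper receives four separate scalars and numbers each of them.
my $flat = Dumper(%result);

# By reference: Dumper receives one hash reference and shows its structure.
my $nested = Dumper( \%result );

print $flat, "\n", $nested;
```

The first dump lists $VAR1 through $VAR4 as unrelated scalars, while the second shows a single $VAR1 holding the key/value pairs together.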

    There is one thing you do correctly though, and it's reading from a zipped file. So you don't have to ask for that.

Re: How to read zipped file in perl
by Utilitarian (Vicar) on Jan 09, 2014 at 09:41 UTC
    Try passing a reference to the hash: print Dumper(\%result);

    print "Good ",qw(night morning afternoon evening)[(localtime)[2]/6]," fellow monks."

Node Type: perlquestion [id://1069928]
Approved by Corion