PerlMonks  

File ext number

by Kirche (Sexton)
on Mar 19, 2010 at 22:53 UTC

Kirche has asked for the wisdom of the Perl Monks concerning the following question:

I'm trying to count the number of files for each extension in a folder:
    opendir H, './';
    $ext{$1} += (m/([^.]+)$/) for grep { -f } readdir H;
    closedir H;
    print "$_ - $ext{$_}\n" for keys %ext;
I'm getting the following error: Use of uninitialized value in hash element at line 4. But if I replace += with =, it shows no error. What am I doing wrong?

Replies are listed 'Best First'.
Re: File ext number
by toolic (Bishop) on Mar 19, 2010 at 23:24 UTC
    Since I find your code very difficult to understand, I really do not know what the exact problem is. But, since you are having difficulties, try re-coding, like:
        use strict;
        use warnings;

        my %ext;
        opendir H, './';
        my @files = grep { -f } readdir H;
        closedir H;
        for (@files) {
            my ($e) = ($_ =~ m/([^.]+)$/);
            $ext{$e}++;
        }
        print "$_ - $ext{$_}\n" for keys %ext;
Re: File ext number
by ikegami (Patriarch) on Mar 19, 2010 at 23:20 UTC

    Your match doesn't match when readdir returns ., so $1's value isn't changed by the match, and it was previously undefined.

        use strict;
        use warnings;

        my %ext;
        opendir my $dh, './' or die $!;
        for (grep { -f } readdir $dh) {
            ++$ext{$1} if /\.([^.]+)$/;
        }
        print "$_ - $ext{$_}\n" for keys %ext;

    Alternate:

        use strict;
        use warnings;

        my %ext;
        opendir my $dh, './' or die $!;
        ++$ext{$1} for map { -f && /\.([^.]+)$/ ? $1 : () } readdir $dh;
        print "$_ - $ext{$_}\n" for keys %ext;
      Your match doesn't match when readdir returns .
      Are you saying that the dot (.) directory somehow makes it past the grep {-f} filter in the OP's code? I freely admit I have no idea what that whole line does in the OP's code, but it would be surprising to me if . made it past the filter.

        Are you saying that the dot (.) directory somehow makes it past the grep {-f} filter in the OP's code?

        I guess I was without realizing it. Obviously, that's not true and my explanation is wrong. Take two:

        $ext{$1} += (m/([^.]+)$/);
        means something close to
        $ext{$1} = $ext{$1} + (m/([^.]+)$/);

        Well, actually more like the following, but it doesn't matter for this discussion:

        alias $temp = $ext{$1}; $temp = $temp + (m/([^.]+)$/);

        Either way, you are relying on Perl evaluating the RHS of the "+=" operator before its LHS, and that's not how Perl operates. In fact, Perl doesn't document how it operates in this circumstance, and that's the reason it's generally a no-no to change and use the same variable in the same expression.
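A minimal sketch of one way to sidestep the ordering question: run the match as its own statement, then update the hash. The helper name count_extensions and the sample file names are hypothetical, not from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: count extensions in a list of file names.
sub count_extensions {
    my @names = @_;
    my %ext;
    for my $name (@names) {
        # Run the match first, as a statement of its own ...
        my ($e) = $name =~ /\.([^.]+)$/;
        # ... and only then touch the hash, skipping names without an extension.
        next unless defined $e;
        ++$ext{$e};
    }
    return %ext;
}

my %ext = count_extensions(qw(a.txt b.txt c.pl README));
print "$_ - $ext{$_}\n" for sort keys %ext;
```

Because the capture lands in a lexical before the hash is used, the result no longer depends on which side of an operator Perl happens to evaluate first.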

        Even though my earlier explanation was wrong, the solutions I posted still avoid the problem.

        Update: Improved phrasing by inlining footnotes.

Re: File ext number
by BrowserUk (Patriarch) on Mar 19, 2010 at 23:43 UTC
    perl -MData::Dump=pp -wle"opendir D,'.';m[\.([^.]+$)]&&++$exts{$1}while$_=readdir D;pp\%exts"
    {
      "0_01" => 1, 1 => 2, 2 => 1, 3 => 1, 4 => 1, au3 => 1, bak => 2,
      "bak'" => 1, bin => 2, bmp => 1, c => 15, csv => 2, dat => 14,
      dat1 => 1, dat2 => 1, dis => 1, dll => 3, emf => 1, evt => 1,
      exe => 9, "exp" => 1, file => 1, flv => 1, h => 2, htm => 1,
      jpg => 10, js => 1, lib => 1, obj => 8, out => 1, pl => 207,
      pm => 8, png => 13, sortex => 1, swf => 1, tar => 1, tws => 1,
      txt => 13, xls => 1, xml => 1, zip => 3,
    }

    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
Re: File ext number
by Lady_Aleena (Priest) on Mar 20, 2010 at 01:13 UTC

    Here is a more verbose solution that assumes your files are stored in an array. In the following code, the array is named @files. I have only tested this on files which all have extensions.

        my %extensions;
        for my $file (sort @files) {
            my @split = split(/\./,$file);
            my $key = $split[1];
            my $num = 1;
            if (exists $extensions{$key}) {
                $extensions{$key} += $num;
            }
            else {
                $extensions{$key} = $num;
            }
        }
        while (my ($key,$value) = each %extensions) {
            print $key." - ".$value."\n";
        }
    Have a nice day!
    Lady Aleena

      Here is a more verbose solution and assumes that your files are being stored in an array.

      Not really.

      for my $file (sort grep -f, readdir $dh)
      works just as well as
      for my $file (sort @files)
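As a rough sketch of that loop in context, the following reads the directory handle directly, with no intermediate @files array. The helper name count_extensions_in is hypothetical. One assumption differs from the thread's code: the -f test is given "$dir/$_", since readdir returns bare names and a plain -f only works when the directory is the current one.

```perl
use strict;
use warnings;

# Hypothetical helper: count file extensions in $dir straight from readdir.
sub count_extensions_in {
    my ($dir) = @_;
    opendir my $dh, $dir or die "Cannot open $dir: $!";
    my %ext;
    # readdir returns names relative to $dir, so prefix them for the -f test.
    for my $file (sort grep { -f "$dir/$_" } readdir $dh) {
        ++$ext{$1} if $file =~ /\.([^.]+)$/;
    }
    closedir $dh;
    return %ext;
}

my %ext = count_extensions_in('.');
print "$_ - $ext{$_}\n" for sort keys %ext;
```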

      my $num = 1; is useless and detrimental. It reminds me of

      use constant TWO => 2;

      when one should rather be doing

      use constant NUM_EYES => 2;

      There's a point where constants become a hindrance.

          my $num = 1;
          if (exists $extensions{$key}) {
              $extensions{$key} += $num;
          }
          else {
              $extensions{$key} = $num;
          }
      should be
          if (defined $extensions{$key}) {
              $extensions{$key} += 1;
          }
          else {
              $extensions{$key} = 1;
          }
      But why not just use ++? It even works great at incrementing previously undefined values.
      ++$extensions{$key};
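A small sketch comparing the two styles side by side, under `use warnings`. The sample keys and the hash names %verbose and %terse are made up for illustration.

```perl
use strict;
use warnings;

my @keys = qw(txt pl txt txt);

my (%verbose, %terse);
for my $key (@keys) {
    # Explicit branch: initialise on first sight, add afterwards.
    if (exists $verbose{$key}) {
        $verbose{$key} += 1;
    }
    else {
        $verbose{$key} = 1;
    }

    # Pre-increment: creates the entry on first sight and counts from there.
    ++$terse{$key};
}

print "$_ - $verbose{$_} / $terse{$_}\n" for sort keys %terse;
```

Both hashes end up with the same counts, and the `++` form emits no uninitialized-value warning for the first occurrence of a key.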

        I had made the assumption that the files were being stored in an array. I hadn't thought of any other way to get the file list from the directory outside of File::Find which creates an array.

        I really overdid counting the instances of each file extension. I had thought of incrementation, but I hadn't thought to use it to define a previously undefined variable. So, the code below is better without the constant my $num = 1;.

            my %extensions;
            for my $file (sort @files) {
                my @split = split(/\./,$file);
                my $key = $split[1];
                ++$extensions{$key};
            }
            while (my ($key,$value) = each %extensions) {
                print $key." - ".$value."\n";
            }
        Have a nice day!
        Lady Aleena
      my @split = split(/\./,$file); my $key = $split[1];

      You are assuming that there is only one period in the file name. That would probably be better as:

      my $key = ( split /\./, $file )[ -1 ];

      Or use File::Basename or File::Spec to get the extension.
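To illustrate both suggestions, here is a hedged sketch that takes the last dot-separated field with a list slice and, alternatively, asks File::Basename's fileparse for the suffix. The helper names ext_by_split and ext_by_fileparse and the sample names are hypothetical. The suffix pattern qr/\.[^.]*/ is one reasonable choice, not the only one.

```perl
use strict;
use warnings;
use File::Basename qw(fileparse);

# Hypothetical helper: last dot-separated field of a name.
# For a name with no dot this is the whole name.
sub ext_by_split {
    my ($file) = @_;
    return (split /\./, $file)[-1];
}

# Hypothetical helper: suffix as reported by fileparse, minus the leading dot.
# For a name with no dot this is the empty string.
sub ext_by_fileparse {
    my ($path) = @_;
    my (undef, undef, $suffix) = fileparse($path, qr/\.[^.]*/);
    $suffix =~ s/^\.//;
    return $suffix;
}

for my $name (qw(notes.txt archive.tar.gz README)) {
    printf "%-15s split: %-8s fileparse: %s\n",
        $name, ext_by_split($name), ext_by_fileparse($name);
}
```

The two differ on names without a period: the slice hands back the whole name, while fileparse reports an empty suffix, which is easier to skip when counting.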

        You are right, I made the assumption that there will only be one period in the file name. I have not seen all that many files with more than one period in their names. It may be because I am a Windows user and have to go to special lengths to see a file extension. The .htaccess file always looks odd on my file list.

        So to further refine the code, including previous refinements, it would be...

            my %extensions;
            for my $file (sort @files) {
                my $key = (split /\./, $file)[-1];
                ++$extensions{$key};
            }
            while (my ($key,$value) = each %extensions) {
                print $key." - ".$value."\n";
            }
        Have a nice day!
        Lady Aleena
Re: File ext number
by snopal (Pilgrim) on Mar 19, 2010 at 23:06 UTC

    You are not accounting for the files with no '.' character. Your regex using '+' requires that at least one period character will appear before the end of text. If one does not, it will return an undef value.

    One or more files ends in a '.', which returns an undef match.

    undef is your uninitialized value because it doesn't convert to zero.

      If one does not, it will return an undef value.

      No, m// never returns undef. (It could, it just doesn't.)

      because it doesn't convert to zero.

      No, undef DOES convert to zero when used as a number.

      In fact, you get that and similar warnings precisely when undef is, or would be, converted to a string ("") or a number (0).
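A short sketch that makes this visible by collecting warnings in an array instead of printing them. The variable names are made up for illustration. The last statement relies on the behaviour described in perlsyn, where operators such as ++ and += are exempt from the uninitialized-value warning.

```perl
use strict;
use warnings;

# Collect warnings rather than letting them go to STDERR.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

my $undef;
my $sum  = $undef + 1;            # numeric use: undef acts as 0, and warns
my $text = "[" . $undef . "]";    # string use: undef acts as "", and warns

my %count;
++$count{txt};                    # increment of a missing entry: no warning

print "sum=$sum text=$text count=$count{txt}\n";
print "warning: $_" for @warnings;
```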
