http://qs321.pair.com?node_id=829716

Kirche has asked for the wisdom of the Perl Monks concerning the following question:

I'm trying to count the number of files for each extension in a folder:
opendir H, './'; $ext{$1} += (m/([^.]+)$/) for grep { -f } readdir H; closedir H; print "$_ - $ext{$_}\n" for keys %ext;
I'm getting the following error: Use of uninitialized value in hash element at line 4. But if I replace += with =, it shows no error. What am I doing wrong?

Replies are listed 'Best First'.
Re: File ext number
by toolic (Bishop) on Mar 19, 2010 at 23:24 UTC
    Since I find your code very difficult to understand, I really do not know what the exact problem is. But, since you are having difficulties, try re-coding, like:
    use strict;
    use warnings;
    my %ext;
    opendir H, './';
    my @files = grep { -f } readdir H;
    closedir H;
    for (@files) {
        my ($e) = ($_ =~ m/([^.]+)$/);
        $ext{$e}++;
    }
    print "$_ - $ext{$_}\n" for keys %ext;
Re: File ext number
by ikegami (Patriarch) on Mar 19, 2010 at 23:20 UTC

    Your match doesn't match when readdir returns ., so $1's value isn't changed by the match, and it was previously undefined.

    use strict;
    use warnings;
    my %ext;
    opendir my $dh, './' or die $!;
    for (grep { -f } readdir $dh) {
        ++$ext{$1} if /\.([^.]+)$/;
    }
    print "$_ - $ext{$_}\n" for keys %ext;

    Alternate:

    use strict;
    use warnings;
    my %ext;
    opendir my $dh, './' or die $!;
    ++$ext{$1} for map { -f && /\.([^.]+)$/ ? $1 : () } readdir $dh;
    print "$_ - $ext{$_}\n" for keys %ext;
      Your match doesn't match when readdir returns .
      Are you saying that the dot (.) directory somehow makes it past the grep {-f} filter in the OP's code? I freely admit I have no idea what that whole line does in the OP's code, but it would be surprising to me if . made it past the filter.

        Are you saying that the dot (.) directory somehow makes it past the grep {-f} filter in the OP's code?

        I guess I was without realizing it. Obviously, that's not true and my explanation is wrong. Take two:

        $ext{$1} += (m/([^.]+)$/);
        means something close to
        $ext{$1} = $ext{$1} + (m/([^.]+)$/);

        Well, actually more like the following, but it doesn't matter for this discussion:

        alias $temp = $ext{$1}; $temp = $temp + (m/([^.]+)$/);

        Either way, you are relying on Perl evaluating the RHS of the "+=" operator before its LHS, and that's not how Perl operates. In fact, Perl doesn't document how it operates in this circumstance, and that's the reason it's generally a no-no to change and use the same variable in the same expression.

        Even though my earlier explanation was wrong, the solutions I posted still avoid the problem.

        Update: Improved phrasing by inlining footnotes.
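
        A minimal sketch of the point above, not taken from the original replies: do the match first and only then touch the hash, so nothing depends on the order in which Perl evaluates the two sides of +=. The sub name count_extensions and the sample file names are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: count extensions in a list of file names.
sub count_extensions {
    my @names = @_;
    my %ext;
    for my $name (@names) {
        # Match first, in list context, so $e is set only on success.
        my ($e) = $name =~ /\.([^.]+)$/;
        next unless defined $e;   # no extension: skip rather than reuse a stale $1
        ++$ext{$e};               # ++ on a missing key yields 1 without a warning
    }
    return \%ext;
}

# Made-up sample names, not taken from the thread.
my $counts = count_extensions(qw(a.pl b.pl notes.txt Makefile archive.tar.gz));
print "$_ - $counts->{$_}\n" for sort keys %$counts;
```

        Because the capture is copied into $e before the hash is accessed, a failed match can never leave an old or undefined $1 as the hash key.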

Re: File ext number
by BrowserUk (Patriarch) on Mar 19, 2010 at 23:43 UTC
    perl -MData::Dump=pp -wle"opendir D,'.';m[\.([^.]+$)]&&++$exts{$1}while$_=readdir D;pp\%exts"
    { "0_01" => 1, 1 => 2, 2 => 1, 3 => 1, 4 => 1, au3 => 1, bak => 2, "bak'" => 1, bin => 2, bmp => 1, c => 15, csv => 2, dat => 14, dat1 => 1, dat2 => 1, dis => 1, dll => 3, emf => 1, evt => 1, exe => 9, "exp" => 1, file => 1, flv => 1, h => 2, htm => 1, jpg => 10, js => 1, lib => 1, obj => 8, out => 1, pl => 207, pm => 8, png => 13, sortex => 1, swf => 1, tar => 1, tws => 1, txt => 13, xls => 1, xml => 1, zip => 3, }

    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
Re: File ext number
by Lady_Aleena (Priest) on Mar 20, 2010 at 01:13 UTC

    Here is a more verbose solution and assumes that your files are being stored in an array. In the following code, the array is named @files. I have only tested this on files which all have extensions.

    my %extensions;
    for my $file (sort @files) {
        my @split = split(/\./,$file);
        my $key = $split[1];
        my $num = 1;
        if (exists $extensions{$key}) {
            $extensions{$key} += $num;
        }
        else {
            $extensions{$key} = $num;
        }
    }
    while (my ($key,$value) = each %extensions) {
        print $key." - ".$value."\n";
    }
    Have a nice day!
    Lady Aleena

      Here is a more verbose solution and assumes that your files are being stored in an array.

      Not really.

      for my $file (sort grep -f, readdir $dh)
      works just as well as
      for my $file (sort @files)

      my $num = 1; is useless and detrimental. It reminds me of

      use constant TWO => 2;

      when one should rather be doing

      use constant NUM_EYES => 2;

      There's a point where constants become a hindrance.

      my $num = 1; if (exists $extensions{$key}) { $extensions{$key} += $num; } else { $extensions{$key} = $num; }
      should be
      if (defined $extensions{$key}) { $extensions{$key} += 1; } else { $extensions{$key} = 1; }
      But why not just use ++? It even works great at incrementing previously undefined values.
      ++$extensions{$key};

        I had made the assumption that the files were being stored in an array. I hadn't thought of any other way to get the file list from the directory outside of File::Find which creates an array.

        I really overdid counting the instances of each file extension. I had thought of incrementation, but I hadn't thought to use it to define a previously undefined variable. So, the code below is better without the constant my $num = 1;.

        my %extensions;
        for my $file (sort @files) {
            my @split = split(/\./,$file);
            my $key = $split[1];
            ++$extensions{$key};
        }
        while (my ($key,$value) = each %extensions) {
            print $key." - ".$value."\n";
        }
        Have a nice day!
        Lady Aleena
      my @split = split(/\./,$file); my $key = $split[1];

      You are assuming that there is only one period in the file name. That would probably be better as:

      my $key = ( split /\./, $file )[ -1 ];

      Or use File::Basename or File::Spec to get the extension.
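
      As a rough sketch of the File::Basename route (the helper name extension_of and the sample path are assumptions for illustration, not from the thread): fileparse accepts a suffix pattern and returns the matched suffix as its third value.

```perl
use strict;
use warnings;
use File::Basename qw(fileparse);

# Hypothetical helper: return the last extension without its dot,
# or nothing (undef in scalar context) when the name has no suffix.
sub extension_of {
    my ($path) = @_;
    # With a suffix pattern, fileparse returns (name, directory, suffix).
    my (undef, undef, $suffix) = fileparse($path, qr/\.[^.]*/);
    return if $suffix eq '';
    return substr $suffix, 1;   # drop the leading dot
}

print extension_of('archive.tar.gz'), "\n";   # prints gz
```

      Note that with this pattern a name such as .htaccess is treated as all suffix, so the helper reports htaccess as its extension, the same result the regex-based answers give.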

        You are right, I made the assumption that there will only be one period in the file name. I have not seen all that many files with more than one period in their names. It may be because I am a Windows user and have to go to special lengths to see a file extension. The .htaccess file always looks odd on my file list.

        So to further refine the code, including previous refinements, it would be...

        my %extensions;
        for my $file (sort @files) {
            my $key = (split(/\./,$file))[-1];
            ++$extensions{$key};
        }
        while (my ($key,$value) = each %extensions) {
            print $key." - ".$value."\n";
        }
        Have a nice day!
        Lady Aleena
Re: File ext number
by snopal (Pilgrim) on Mar 19, 2010 at 23:06 UTC

    You are not accounting for the files with no '.' character. Your regex using '+' requires that at least one period character appear before the end of the text. If one does not, it will return an undef value.

    One or more files ends in a '.', which returns an undef match.

    undef is your uninitialized value because it doesn't convert to zero.

      If one does not, it will return an undef value.

      No, m// never returns undef. (It could, it just doesn't.)

      because it doesn't convert to zero.

      No, undef DOES convert to zero when used as a number.

      In fact, you get that and similar warnings precisely when undef is (or would be) converted to a string ("") or a number (0).
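
      A small sketch of these three points, added for illustration (the variable names and the 'Makefile' sample are assumptions): undef used as a number gives 0 but warns, a failed match in scalar context is false yet defined, and ++ on a missing hash key is exempt from the warning.

```perl
use strict;
use warnings;

my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };   # collect instead of printing

my $undef;
my $sum = $undef + 0;          # undef is used as the number 0, and this warns

my $failed = ('Makefile' =~ /\.([^.]+)$/);   # scalar context: false, but defined

my %count;
++$count{pl};                  # autoincrement of a missing key does not warn

print "sum = $sum\n";
print "failed match is ", (defined $failed ? "defined" : "undef"), "\n";
print "warnings seen: ", scalar(@warnings), "\n";
```

      Only the addition triggers an "uninitialized value" warning here, which is why replacing an explicit `+= 1` on a possibly missing key with ++ is both shorter and quieter.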