PerlMonks  

Hash to count characters

by amittleider (Initiate)
on Aug 12, 2010 at 02:56 UTC ( [id://854562]=perlquestion )

amittleider has asked for the wisdom of the Perl Monks concerning the following question:

Hello Monks, I've been working diligently on an assignment to write a Perl function that counts how many times each letter (a-z and A-Z) occurs in a file given as a command-line argument. Try as I might, I have had no success. Here are two attempts; the first is commented out (but I think it is very close!):
    #sub countChar() {
    #    open (DAT, "@ARGV");
    #    print "Character count\n";
    #    while ($line = <DAT>){
    #        my @line_words = split (//, $line);
    #        foreach my $char (@line_char){
    #            if ($charCount{$char}){
    #                $charCount{$char}++;
    #            }else {
    #                $charCount{$char}=1;
    #            }
    #        }
    #    }
    #    foreach $char (keys %charCount) {
    #        print "$char => $charCount{$char}\n";
    #    }
    #    close(DAT, "@ARGV");
    #}
    #foreach $char (keys %charCount) {
    #    print "$char => $charCount{$char}\n";
    #}

    sub countChar() {
        open (DAT, "@ARGV");
        print "Character count\n";
        while ($line = <DAT>){
            do (@word = split (/\W/, $line));
            foreach $word (keys %charCount){
                do (@letter = split (/\w+/, $word);
                $letter = (keys %charcount)}
            if ($charCount){$char}){
                $charCount{$char}++;
            }else {
                $charCount{$char}=1;
            }
            foreach $char (keys %charCount) {
                print "$char => $charCount{$char}\n";
            }
            close(DAT, "@ARGV");
        }
    }
Thanks for any/all comments! AJ

Replies are listed 'Best First'.
Re: Hash to count characters
by jwkrahn (Abbot) on Aug 12, 2010 at 03:51 UTC
    open (DAT, "@ARGV");

    "@ARGV" is short for join( $", @ARGV ). If you just want the first argument from the command line, use $ARGV[0] instead. You should really be using the three-argument form of open, and you should always verify that the file opened correctly, so:

    open my $DAT, '<', $ARGV[0] or die "Cannot open '$ARGV[0]' $!";
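A small sketch of why "@ARGV" is risky as a filename, and of the three-argument open with an error check. The argument list and the helper name open_first_arg are made up for illustration; they are not from the original post.

```perl
use strict;
use warnings;

# Hypothetical argument list standing in for @ARGV.
my @args = ('notes.txt', 'extra.txt');

# Interpolating an array joins its elements with $" (a single space by default).
my $joined = "@args";
print "Interpolated: '$joined'\n";    # one string, not one filename
print "First only:   '$args[0]'\n";

# Hypothetical helper: three-argument open with an error check.
sub open_first_arg {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open '$path': $!";
    return $fh;
}
```

With a single argument the two forms happen to agree, which is probably why the two-argument version appears to work until a second argument is passed.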

    while ($line = <DAT>){
        do (@word = split (/\W/, $line));
        foreach $word (keys %charCount){
            do (@letter = split (/\w+/, $word);
            $letter = (keys %charcount)}
        if ($charCount){$char}){
            $charCount{$char}++;
        }else {
            $charCount{$char}=1;
        }

    Since you say that you only want letters you need something like this:

    my %charCount;
    while ( my $line = <$DAT> ) {
        my @letters = $line =~ /[a-zA-Z]/g;
        foreach my $char ( @letters ) {
            $charCount{ $char }++;
        }
    }
    foreach my $char ( keys %charCount ) {
        print "$char => $charCount{$char}\n";
    }

    close(DAT, "@ARGV");

    close only accepts one argument, the filehandle that was previously opened.

    close $DAT;

Re: Hash to count characters
by nvivek (Vicar) on Aug 12, 2010 at 03:27 UTC
    Your first attempt is nearly correct, but you need to change @line_char to @line_words in the foreach loop, because you split the line and store the characters in the @line_words array, not in @line_char. One more suggestion: whenever you write a program, put the following at the top of your code.
    use strict;
    use warnings;
    Both pragmas help you catch problems in your program. If you use a scalar, array, or hash without declaring it, Perl will report it.
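As a rough illustration of what strict does with the @line_char typo from the first attempt, the sketch below compiles a tiny snippet inside a string eval so that the error can be printed rather than stopping the script. The snippet text is an assumption made up for the demonstration.

```perl
use strict;
use warnings;

# The snippet is compiled with a string eval so the compile-time error
# from strict can be captured instead of stopping this script.
my $snippet = <<'END_SNIPPET';
use strict;
my @line_words = split //, "abc";
foreach my $char (@line_char) { }   # typo: this array was never declared
1;
END_SNIPPET

my $ok    = eval $snippet;
my $error = $@;

if ($ok) {
    print "compiled fine\n";
}
else {
    print "strict reported: $error";
}
```

Without strict, the loop would silently iterate over an empty @line_char and count nothing, which is consistent with the empty report from the first attempt.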
      Thanks a lot for your responses! nvivek's post worked; however, there is just one slight bug: the output includes spaces and newline characters, which are unwanted. I tried changing the split regex to /\w+/, because that matches only alphanumeric strings plus underscores, but this produces an empty output. I just don't understand why it would produce characters with a // regex but nothing with /\w+/.
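One likely explanation is that the pattern given to split describes the separators, not the pieces to keep. The sketch below uses a made-up sample line to compare the two patterns and then shows matching letters directly.

```perl
use strict;
use warnings;

my $line = "Hi there\n";   # made-up sample line

# An empty pattern matches between every character, so each character
# (including the space and the newline) becomes its own element.
my @chars = split //, $line;

# With /\w+/ the runs of word characters are the separators and are thrown
# away; only the pieces between them (here "", " " and "\n") are returned.
my @pieces = split /\w+/, $line;

# To keep only letters, match them instead of splitting on them.
my @letters = $line =~ /[a-zA-Z]/g;

print scalar(@chars), " characters, ", scalar(@letters), " letters\n";
```

So with /\w+/ the hash ends up holding only whitespace and empty strings, which print as what looks like an empty report.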

        amittleider:

        Regarding the unwanted items in your report: There are three general ways to approach it:

        1. Remove unwanted characters before counting,
        2. Delete them after counting but before reporting, or
        3. Delete or ignore them during the report.

        Each method has situations where it is better than the others, but frequently any of them are good enough. Examples:

        # Case 1: don't count unwanted characters
        for my $char (@letters) {
            ++$charCount{$char} if $char =~ /[a-zA-Z]/;
        }

        # Case 2: delete unwanted characters
        my %t;
        $t{$_} = $charCount{$_} for grep { /[a-zA-Z]/ } keys %charCount;
        %charCount = %t;

        # Case 3: ignore unwanted items during report
        for my $char (sort keys %charCount) {
            next unless $char =~ /[a-zA-Z]/;
            # print report entry
        }

        ...roboticus

Re: Hash to count characters
by dasgar (Priest) on Aug 12, 2010 at 05:48 UTC

    Both nvivek and jwkrahn gave you good tips on correcting your code while staying with your algorithm. However, I had a different route to get the character counts in a file. Instead of breaking the data down into words and then breaking it down further into characters, I say break down the data into the characters from the start.

    I'll give you a hint at what I'm thinking about. Consider the following lines of code:

    my $line = "This is sample data simulating a line from a file.";
    my (@chars) = ($line =~ m/([st])/gi);

    What you'll end up with is an array whose elements are [T s s s t s t], which are the s's and t's from the variable $line. If you combine that with a hash, you should be able to accomplish what you want to do.
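For readers who want to see the hint carried one step further, here is a minimal sketch that feeds the matched characters into a hash. It still only counts s and t, and the %count name is an assumption chosen for the example.

```perl
use strict;
use warnings;

my $line = "This is sample data simulating a line from a file.";

# A list-context match with /g returns every captured character.
my @chars = ($line =~ m/([st])/gi);

# Tally each character in a hash; a missing key starts as undef and becomes 1.
my %count;
$count{$_}++ for @chars;

print "$_ => $count{$_}\n" for sort keys %count;
```

Widening the character class and reading lines from a file instead of a fixed string is left as the remaining step.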

    Since you said that this was an assignment, this sounds like something you're doing for a class. That's why I'm just giving hints rather than saying "Here's the code to do your assignment.", which won't be much help for future assignments and tests.

    If you really, really want to see code, check out my scratchpad. Just keep in mind that if you copy my stuff verbatim, your teacher/instructor will probably realize that it's not your code, since it won't match your code style and might use stuff that hasn't been covered yet.

Re: Hash to count characters
by JavaFan (Canon) on Aug 12, 2010 at 09:05 UTC
    As a one-liner:
    perl -0777E '$s{$_}++ for split//,<>; say "$_ ", $s{$_}||0 for "a".."z", "A".."Z"' your-data-file
    I would count all characters, and at the end only display the characters you are interested in.
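The same idea spelled out as a script might look like the sketch below. The helper name count_all_chars and the sample string are assumptions; in real use the text would be the slurped file contents.

```perl
use strict;
use warnings;

# Hypothetical helper: count every character in a string.
sub count_all_chars {
    my ($text) = @_;
    my %seen;
    $seen{$_}++ for split //, $text;
    return \%seen;
}

# Sample text stands in for the slurped file contents.
my $seen = count_all_chars("Hello, World!\n");

# Report only the letters of interest; absent letters print as 0.
for my $letter ('a' .. 'z', 'A' .. 'Z') {
    printf "%s %d\n", $letter, $seen->{$letter} || 0;
}
```

Counting everything and filtering at report time keeps the counting loop trivial, at the cost of storing a few entries that are never printed.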
Re: Hash to count characters
by FunkyMonk (Chancellor) on Aug 12, 2010 at 09:22 UTC
    if ($charCount{$char}){
        $charCount{$char}++;
    }else {
        $charCount{$char}=1;
    }
    Perl will happily increment an undefined variable. In other words, the block above does exactly the same as just
    $charCount{$char}++;
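A quick way to convince yourself is to run both forms side by side. This sketch uses a made-up word and two hypothetical hashes, %long_form and %short_form, and runs cleanly under warnings.

```perl
use strict;
use warnings;

my (%long_form, %short_form);

for my $char (split //, "banana") {
    # Explicit test, as in the original code.
    if ($long_form{$char}) {
        $long_form{$char}++;
    }
    else {
        $long_form{$char} = 1;
    }

    # Plain increment: the missing entry is created and treated as 0.
    $short_form{$char}++;
}

for my $char (sort keys %long_form) {
    print "$char: $long_form{$char} vs $short_form{$char}\n";
}
```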

Re: Hash to count characters
by roboticus (Chancellor) on Aug 12, 2010 at 12:26 UTC

    amittleider:

    Just for grins, here's another way to do it:

    #!/usr/bin/perl
    use strict;
    use warnings;

    my %charCount;
    my $corpus = join('', <DATA>);
    $corpus =~ tr/A-Z/a-z/;    # Map uppercase to lowercase
    $corpus =~ tr/a-z//cd;     # Delete all but lowercase
    $charCount{$_}++ for split //, $corpus;
    for (sort keys %charCount) {
        print "$_ : $charCount{$_}\n";
    }

    __DATA__
    Now is the time for all good men to come to the aid of their party.
    The quick red fox jumped over the lazy brown dog.
    The warrior swings the +6 axe at the orcs standing in front of him.

    ...roboticus

      Whoa! So many great ideas so fast. You monks really are lifesavers! Here's the final working code! (I'll be sure to use strict and warnings in the future!)
      print "Counting from @ARGV \n";
      &countWords();
      &countChar();

      sub countWords() {
          open DAT, "< @ARGV[0]" or die "Can't open @ARGV : $!";
          print "Word Count\n";
          while($line = <DAT>){
              my @line_words = split(/\W/, $line);
              foreach my $word (@line_words){
                  if ($wordCount{$word}){
                      $wordCount{$word}++;
                  }else {
                      $wordCount{$word}=1;
                  }
              }
          }
          close(DAT);
          for $word (sort keys %wordCount) {
              print "$word => $wordCount{$word}\n";
          }
      }

      sub countChar() {
          open DAT, "< @ARGV[0]" or die "Can't open @ARGV : $!";
          print "Character count\n";
          while ($line = <DAT>){
              my @line_words = split (//, $line);
              foreach my $char (@line_words){
                  if ($charCount{$char}){
                      $charCount{$char}++;
                  }else {
                      $charCount{$char}=1;
                  }
              }
          }
          for $char (sort keys %charCount) {
              next unless $char =~ /[a-zA-Z]/;
              print "$char => $charCount{$char}\n";
          }
          close(DAT);
      }
      <3<3 AJ

Node Type: perlquestion [id://854562]
Approved by planetscape
Front-paged by SuicideJunkie