Counting incidents of names in a file

Bishma has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: Counting incidents of names in a file by Anonymous Monk on Feb 13, 2002 at 04:54 UTC
Ask yourself this: How can I keep track of unique "keys" using a native Perl data structure? %IthinkYouKnowTheAnswer;
Re: Re: Counting incidents of names in a file by Bishma (Beadle) on Feb 13, 2002 at 05:02 UTC
Yeah, but I really don't like hashes. I like to keep my data in the order I want it to be in. It's a completely irrational and unfounded prejudice, I know, but it's still there.
Re: Re: Re: Counting incidents of names in a file by dreadpiratepeter (Priest) on Feb 13, 2002 at 05:27 UTC
That's like saying, "I like carpentry, but I don't like drills." Then you spend your time trying to bore a hole with your screwdriver, the cabinet takes forever to build, and it's not all that sturdy. Or, "I'm going to write a novel, but I'm not going to use adjectives." Hashes are one of the basic tools of the language. You wouldn't code a large C project without pointers, would you? Problems that would be inefficient using arrays, like existence checks and counting occurrences, are quick and painless with hashes. And order is as simple as:

```perl
foreach (sort keys %hash) {
    my $item = $hash{$_};
    ...
}
```

Not much worse than:

```perl
foreach my $item (@array) {
    ...
}
```

Plus there is no effort involved in inserting, deleting, and maintaining order. I usually judge the progress of junior Perl programmers by their use of hashes. When they stop trying to use arrays to do the job of a hash, they've leveled up in Perl. (BTW, regexps are the second tier, then map/grep.)

Of course this is all just my opinion,

-pete
Entropy is not what it used to be.
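To make the point about existence checks and counting concrete, here is a minimal sketch. The name list and the `%seen` hash are made up for illustration and do not come from the thread. The array version has to scan the whole list for each check, while the hash version builds the counts once and then answers each check with a single key lookup.

```perl
use strict;
use warnings;

# Hypothetical sample data, for illustration only.
my @names = qw(name_x name_y name_z name_x);

# Array approach: grep walks the entire list on every check.
# In scalar context grep returns the number of matches.
my $in_array = grep { $_ eq 'name_y' } @names;

# Hash approach: build the counts once, then each check is one lookup.
my %seen;
$seen{$_}++ for @names;
my $in_hash = exists $seen{name_y};

print "array check: ", ($in_array ? "found" : "missing"), "\n";
print "hash check:  ", ($in_hash  ? "found" : "missing"), "\n";
print "name_x seen $seen{name_x} times\n";
```

The same `%seen` hash answers both questions (is the name there, and how often), which is why the counting problem in this thread maps onto a hash so directly.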
Re: Re: Re: Counting incidents of names in a file by rjray (Chaplain) on Feb 13, 2002 at 05:44 UTC
If your concern is just in keeping the names in the same order in which they're seen, there are two approaches. The easiest is to look into Tie::IxHash. This is a variant of the hash that preserves the order of keys as they are inserted. The second way, which doesn't require installing a new module, is to have your loop also push all newly discovered names onto an array, then use the array to iterate over the hash rather than the `keys` function.

Don't be so quick to dismiss the basic constructs that Perl provides. They are here for a reason, and when you ask a very basic question you have to expect that your initial answers are going to be pointers to these basic elements. At the very least, if you are going to ask such a basic question, then you should state up front why you don't want to use the basic solution.

--rjray
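The second approach could look roughly like the sketch below. The input lines and the variable names `%count` and `@order` are assumptions chosen for illustration; only the `name|score|date` layout comes from the thread.

```perl
use strict;
use warnings;

# Hypothetical input lines in the thread's name|score|date layout.
my @lines = (
    'name_x|score|date',
    'name_y|score|date',
    'name_x|score|date',
    'name_z|score|date',
);

my (%count, @order);
for my $line (@lines) {
    my ($name) = split /\|/, $line;
    # Post-increment returns the old value, which is false the first
    # time a name is seen, so each name is pushed exactly once.
    push @order, $name unless $count{$name}++;
}

# Iterate via @order instead of keys %count to keep first-seen order.
print "$_ = $count{$_}\n" for @order;
```

The hash still does the counting; the array only remembers the order in which names first appeared, so no extra module is needed.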
Re: Counting incidents of names in a file by dvergin (Monsignor) on Feb 13, 2002 at 05:17 UTC
The solution of the Unnamed One is correct but a little terse if you are still learning. Here's a more spelled-out version of the same general idea:

```perl
#!/usr/bin/perl -w
use strict;

my %hash;

# Do it
for (<DATA>) {
    my ($name, $score, $date) = split /\|/;
    $hash{$name}++;
}

# Show it
for (keys %hash) {
    print "$_ = $hash{$_}\n";
}

__DATA__
name_x|score|date
name_y|score|date
name_z|score|date
name_x|score|date
name_z|score|date
name_z|score|date
```

And just to explain what is happening with AM's solution: `$name{(split/\|/)[0]} += 1;`

`(split/\|/)` returns a list, which is then subscripted to get the zeroth element, which is then used as the key for the %name hash. The value associated with that key (which may be magically created if it didn't exist before) is increased by one using the += assignment operator.

Just a word of encouragement about hashes. They are a wonderful tool for many purposes. And it is commonly said that you are not really programming in Perl until you can think in terms of hashes. Your point about their lacking a fixed order is well taken, but once you begin using hashes, you may be pleasantly surprised to discover how often that doesn't matter.
Re: Re: Counting incidents of names in a file by Bishma (Beadle) on Feb 13, 2002 at 06:34 UTC
Ok, the hash makes sense. Thanks for your help. Now (since I know next to nothing about hashes and I don't have my books) I need to ask another question. I kept my question simple for ease of understanding, but now I need to get more in depth. My data set also contains a "class" element, like so:

```
name_x|score|date|class1
name_y|score|date|class2
name_y|score|date|class2
name_a|score|date|class2
name_b|score|date|class3
name_z|score|date|class1
name_b|score|date|class3
name_x|score|date|class1
name_b|score|date|class3
name_c|score|date|class2
name_c|score|date|class3
name_c|score|date|class3
...and so on
```

I need 3 separate lists (actually HTML tables) based on the class (3 possible classes), and I need the lists in descending order by number of incidents of the names. So we get:

```
_class1_
name_x = 2
name_z = 1
_class2_
name_y = 2
name_c = 1
_class3_
name_b = 3
name_c = 2
```

I know this is getting a little complex, but I'm lost. Thanks again.
Re: Re: Re: Counting incidents of names in a file by TippyTurtle (Novice) on Feb 13, 2002 at 06:55 UTC
This looks like feature creep to me, or should we say, wanting others to do all the work. Please post some code to show that you have attempted to solve this problem on your own. The other monks, it appears, were very generous with your first question, but coming back with no attempt to solve it on your own is not a good idea. That said, I am sure someone will post a solution because most of us just can't help ourselves. :)
Re: Re: Re: Counting incidents of names in a file by trs80 (Priest) on Feb 13, 2002 at 07:08 UTC
Simply modify my last code to include the classes.

```perl
use strict;
use Data::Dumper;

my @all;
my %occurances;

while (<DATA>) {
    chomp $_;
    push @all, [ split(/\|/, $_) ];
    # I should explain what is going on here:
    # the -1 is going to get the last element
    # in a list. In this case the first [-1]
    # is the last list of elements added
    # from the DATA section below with the push
    # function. The second -1
    # is the last element of that list, which
    # is the class. The next key in the
    # occurances hash is the name_? value,
    # which is extracted from the last array
    # (-1) pushed onto @all and the first
    # element (0) of the anonymous array
    # in that (-1) location of @all.
    $occurances{$all[-1][-1]}{$all[-1][0]}++;
    #           class         name_?
}

print Dumper(\@all);
print Dumper(\%occurances);

__DATA__
name_x|score|date|class1
name_y|score|date|class2
name_y|score|date|class2
name_a|score|date|class2
name_b|score|date|class3
name_z|score|date|class1
name_b|score|date|class3
name_x|score|date|class1
name_b|score|date|class3
name_c|score|date|class2
name_c|score|date|class3
name_c|score|date|class3
```
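The code above builds the per-class counts but stops at dumping them. One way to get from that nested hash to the descending lists asked for in the question is sketched below. It reuses the `%occurances` name and the sample rows from the thread; the `@rows` array, the alphabetical tie-break, and the plain-text output (rather than HTML tables) are assumptions made for the sake of a short example.

```perl
use strict;
use warnings;

# Sample rows copied from the question (name|score|date|class).
my @rows = (
    'name_x|score|date|class1', 'name_y|score|date|class2',
    'name_y|score|date|class2', 'name_a|score|date|class2',
    'name_b|score|date|class3', 'name_z|score|date|class1',
    'name_b|score|date|class3', 'name_x|score|date|class1',
    'name_b|score|date|class3', 'name_c|score|date|class2',
    'name_c|score|date|class3', 'name_c|score|date|class3',
);

# class => { name => count }, the same shape the code above builds.
my %occurances;
for my $row (@rows) {
    # List slice: first field is the name, last field is the class.
    my ($name, $class) = (split /\|/, $row)[0, -1];
    $occurances{$class}{$name}++;
}

for my $class (sort keys %occurances) {
    my $names = $occurances{$class};
    print "_${class}_\n";
    # Highest count first; ties are broken alphabetically, which is an
    # assumption since the question does not say how to order ties.
    for my $name (sort { $names->{$b} <=> $names->{$a} || $a cmp $b } keys %$names) {
        print "$name = $names->{$name}\n";
    }
}
```

Swapping `$a` and `$b` in the numeric comparison is what turns the default ascending sort into a descending one.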
Re: Counting incidents of names in a file by trs80 (Priest) on Feb 13, 2002 at 06:46 UTC
Here is a solution that doesn't create any temp values and keeps all your information in order in an array. A hash is, in my opinion, the easier way to count the occurrences.

```perl
use Data::Dumper;

my @all;
my %occurance;

while (<DATA>) {
    chomp;
    push @all, [ split(/\|/, $_) ];
    $occurance{$all[-1][0]}++;
}

print Dumper(\@all);
print Dumper(\%occurance);

__DATA__
name_x|score|date
name_y|score|date
name_y|score|date
name_z|score|date
name_z|score|date
```

I included the Data::Dumper part for monks who haven't used it yet, so they can see how it can be used to confirm the content of a hash or array without writing a foreach or similar loop.
Re: Counting incidents of names in a file by Anonymous Monk on Feb 13, 2002 at 05:03 UTC
```perl
while ( <> ) {
    $name{ (split /\|/)[0] } += 1;
}
foreach ( keys %name ) {
    print "$_ = $name{$_}\n";
}
```
Re: Counting incidents of names in a file by Bishma (Beadle) on Feb 13, 2002 at 22:46 UTC
Ok, with your help I think I managed to find a solution. It's a little sloppy, but I'll clean it up later.

```perl
@classes = qw/ class1 class2 class3 /;
foreach (@classes) {
    my %kboard;
    print "_ $_ _<BR>";
    for ($i = 0; $i <= $#scoredata; $i++) {
        @logdata = split /\|/, $scoredata[$i];
        if ($logdata[4] eq $_) {
            $kboard{$logdata[0]}++;
        }
    }
    for (keys %kboard) {
        print "$_ = $kboard{$_}<BR>\n";
    }
    print "<BR><BR>";
}
```

@scoredata is what I read my data file into. Thanks again everyone.

