Re: Counting incidents of names in a file
by Anonymous Monk on Feb 13, 2002 at 04:54 UTC
|
| [reply] |
|
Yeah, but I really don't like hashes. I like to keep my data in the order I want it to be in. It's a completely irrational and unfounded prejudice, I know, but it's still there.
| [reply] |
|
That's like saying "I like carpentry, but I don't like drills." Then you spend your time trying to bore a hole with your screwdriver, the cabinet takes forever to build, and it's not all that sturdy.
Or, I'm going to write a novel, but I'm not going to use adjectives.
Hashes are one of the basic tools of the language. You wouldn't code a large C project without pointers, would you?
Problems that would be inefficient with arrays, like existence checks and counting occurrences, are quick and painless with hashes. And order is as simple as:
foreach (sort keys %hash) {
    my $item = $hash{$_};
    ...
}
Not much worse than:
foreach my $item (@array) {
...
}
Plus there is no effort involved in inserting, deleting, and maintaining order.
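To make the comparison concrete, here is a minimal sketch of the three things mentioned above: counting, an existence check, and ordered output. The sample names are made up for illustration and are not from the original question.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample data, invented for this sketch.
my @names = qw(name_x name_y name_z name_x name_z name_z);

# Counting: one pass, one hash increment per item.
my %count;
$count{$_}++ for @names;

# Existence check: a single hash lookup instead of scanning an array.
my $seen_y = exists $count{name_y} ? 1 : 0;
my $seen_q = exists $count{name_q} ? 1 : 0;

# Ordered output: sort the keys at the point where order matters.
my @report = map { "$_ = $count{$_}" } sort keys %count;
print "$_\n" for @report;
print "name_y seen: $seen_y, name_q seen: $seen_q\n";
```

Doing the same with a plain array would mean a scan of the array for every check or count.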
I usually judge the progress of junior Perl programmers by their use of hashes. When they stop trying to use arrays to do the job of a hash, they've leveled up in Perl. (BTW, regexes are the second tier, then map/grep.)
Of course this is all just my opinion,
-pete
Entropy is not what it used to be.
|
If your concern is just in keeping the names in the same
order in which they're seen, there are two approaches. The
easiest is to look into Tie::IxHash. This is a
variant of the hash that preserves the order of keys as
they are inserted.
The second way, which doesn't require installing a new
module, is to have your loop also push each newly discovered
name onto an array, then iterate over that array to look up
the hash values rather than using the keys function.
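A minimal sketch of that second approach, using only core Perl. The sample lines and the variable names (@lines, @order, %count) are my own, chosen for illustration:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical input lines in the name|score|date layout.
my @lines = (
    "name_y|score|date",
    "name_x|score|date",
    "name_y|score|date",
    "name_z|score|date",
);

my (%count, @order);
for my $line (@lines) {
    my ($name) = split /\|/, $line;
    # Remember each name the first time it is seen.
    push @order, $name unless exists $count{$name};
    $count{$name}++;
}

# Iterate over @order, not keys %count, to get first-seen order.
my @report = map { "$_ = $count{$_}" } @order;
print "$_\n" for @report;
```

The hash still does the counting; the array only records the order in which names first appeared.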
Don't be so quick to dismiss the basic constructs that
Perl provides. They are here for a reason, and when you ask
a very basic question you have to expect that your initial
answers are going to be pointers to these basic
elements. At the very least, if you are going to ask such
a basic question then you should state up front why
you don't want to use the basic solution.
--rjray
Re: Counting incidents of names in a file
by dvergin (Monsignor) on Feb 13, 2002 at 05:17 UTC
The solution of the Unnamed One is correct but a little
terse if you are still learning. Here's a more spelled-out
version of the same general idea:
#!/usr/bin/perl -w
use strict;
my %hash;
# Do it
# Count each name, reading one line at a time
while (<DATA>) {
    my ($name, $score, $date) = split /\|/;
    $hash{$name}++;
}
# Show the totals
for (keys %hash) {
    print "$_ = $hash{$_}\n";
}
__DATA__
name_x|score|date
name_y|score|date
name_z|score|date
name_x|score|date
name_z|score|date
name_z|score|date
And just to explain what is happening with AM's solution:
$name{(split/\|/)[0]} += 1;
(split/\|/) returns a list which is then subscripted to
get the zeroth element which is then used as the key
for the %name hash. The value associated with that key
(which may be magically created if it didn't exist before)
is increased by one using the += assignment operator.
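The same steps can be pulled apart into a small sketch. The $line, $key and $existed_before variables are only there for illustration; AM's one-liner does all of this in a single expression:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my %name;
my $line = "name_x|score|date";

# split returns a list; (...)[0] takes its first element.
my $key = (split /\|/, $line)[0];

# The key does not exist yet, so this records 0.
my $existed_before = exists $name{$key} ? 1 : 0;

$name{$key} += 1;    # key is created on first use; undef counts as 0
$name{$key} += 1;    # second sighting of the same name

print "key = $key, existed before = $existed_before, count = $name{$key}\n";
```

Note that += on a value that does not exist yet raises no "uninitialized" warning, which is why the counting idiom works cleanly even under warnings.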
Just a word of encouragement about hashes. They are a
wonderful tool for many purposes. And it is commonly said
that you are not really programming in Perl until you
can think in terms of hashes. Your point about their lacking
a fixed order is well taken, but once you begin using
hashes, you may be pleasantly surprised to discover how
often that doesn't matter.
|
Ok, the hash makes sense. Thanks for your help. Now (since I know next to nothing about hashes and I don't have my books) I need to ask another question. I kept my question simple for ease of understanding, but now I need to go more in depth.
My data set also contains a "class" element, like so:
name_x|score|date|class1
name_y|score|date|class2
name_y|score|date|class2
name_a|score|date|class2
name_b|score|date|class3
name_z|score|date|class1
name_b|score|date|class3
name_x|score|date|class1
name_b|score|date|class3
name_c|score|date|class2
name_c|score|date|class3
name_c|score|date|class3
...and so on
I need 3 separate lists (actually HTML tables) based on the class (3 possible classes), and I need the lists in descending order by number of incidents of the names. So we get:
_class1_
name_x = 2
name_z = 1
_class2_
name_y = 2
name_a = 1
name_c = 1
_class3_
name_b = 3
name_c = 2
I know this is getting a little complex, but I'm lost.
Thanks again.
|
This looks like feature creep to me, or should we say,
wanting others to do all the work. Please post some
code to show that you have attempted to solve this problem
on your own. The other monks, it appears, were very
generous with your first question, but coming back with
no attempt to solve it on your own is not a good idea. That
said, I am sure someone will post a solution because most
of us just can't help ourselves. :)
| [reply] |
|
Simply modify my last code to include the classes.
use strict;
use Data::Dumper;
my @all;
my %occurances;
while (<DATA>) {
    chomp $_;
    push @all, [ split(/\|/,$_) ];
    # What is going on here: an index of -1 gets the last
    # element of a list. The first [-1] selects the row just
    # pushed onto @all from the DATA section below. The
    # second [-1] selects the last element of that row,
    # which is the class. The next key in the occurances
    # hash is the name_? value: element [0] of the same
    # anonymous array at position [-1] of @all.
    $occurances{$all[-1][-1]}{$all[-1][0]}++;
    #           class         name_?
}
print Dumper(\@all);
print Dumper(\%occurances);
__DATA__
name_x|score|date|class1
name_y|score|date|class2
name_y|score|date|class2
name_a|score|date|class2
name_b|score|date|class3
name_z|score|date|class1
name_b|score|date|class3
name_x|score|date|class1
name_b|score|date|class3
name_c|score|date|class2
name_c|score|date|class3
name_c|score|date|class3
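For the sorted per-class lists that were asked for, one possible way to walk the nested hash is sketched below. It is only an illustration, not necessarily how anyone in this thread would write it: the data is held in an array (@lines) instead of __DATA__, plain text is printed rather than HTML tables, and ties are broken alphabetically, which is my own assumption.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Same sample rows as above, held in an array for this sketch.
my @lines = qw(
    name_x|score|date|class1 name_y|score|date|class2
    name_y|score|date|class2 name_a|score|date|class2
    name_b|score|date|class3 name_z|score|date|class1
    name_b|score|date|class3 name_x|score|date|class1
    name_b|score|date|class3 name_c|score|date|class2
    name_c|score|date|class3 name_c|score|date|class3
);

# Build class => { name => count }, as in the code above.
my %occurances;
for my $line (@lines) {
    my ($name, $class) = (split /\|/, $line)[0, -1];
    $occurances{$class}{$name}++;
}

# One list per class, names in descending order of count.
my @report;
for my $class (sort keys %occurances) {
    my $names = $occurances{$class};
    push @report, "_${class}_";
    push @report, map { "$_ = $names->{$_}" }
        sort { $names->{$b} <=> $names->{$a} or $a cmp $b } keys %$names;
}
print "$_\n" for @report;
```

The sort block compares counts with <=> (note $b before $a for descending order) and falls back to cmp on the names when two counts are equal.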
Re: Counting incidents of names in a file
by trs80 (Priest) on Feb 13, 2002 at 06:46 UTC
Here is a solution that doesn't create any temp values and
keeps all your information in order in an array. A hash
is, in my opinion, the easier way to count the occurrences.
use Data::Dumper;
my @all;
my %occurance;
while (<DATA>) {
    chomp;
    push @all, [ split(/\|/,$_) ];
    $occurance{$all[-1][0]}++;
}
print Dumper(\@all);
print Dumper(\%occurance);
__DATA__
name_x|score|date
name_y|score|date
name_y|score|date
name_z|score|date
name_z|score|date
I included the Data::Dumper part for monks that haven't used
it yet so they can see how it can be used to confirm content
without doing a foreach or similar operation to see content
of a hash or array.
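A small sketch for anyone trying Data::Dumper for the first time: since hash order is not fixed, it can help to set $Data::Dumper::Sortkeys so the dump comes out in the same order on every run. The counts in the hash below are filled in by hand for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;

$Data::Dumper::Sortkeys = 1;    # dump hash keys in sorted order
$Data::Dumper::Indent   = 1;    # fixed, more compact indentation

# Hand-written counts, standing in for the %occurance hash above.
my %occurance = ( name_z => 2, name_x => 1, name_y => 2 );

my $dump = Dumper(\%occurance);
print $dump;
```

Both settings are package variables of Data::Dumper, so they affect every later call to Dumper in the same program.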
Re: Counting incidents of names in a file
by Anonymous Monk on Feb 13, 2002 at 05:03 UTC
while( <> ){
    $name{(split/\|/)[0]} += 1;
}
foreach( keys %name ){
    print "$_ = $name{$_}\n";
}
Re: Counting incidents of names in a file
by Bishma (Beadle) on Feb 13, 2002 at 22:46 UTC
Ok, with your help I think I managed to find a solution. It's a little sloppy, but I'll clean it up later.
my @classes = qw/ class1 class2 class3 /;
foreach (@classes) {
    my %kboard;    # per-class name counts
    print "_ $_ _<BR>";
    for (my $i = 0; $i <= $#scoredata; $i++) {
        my @logdata = split /\|/, $scoredata[$i];
        # class field; in the four-field sample above this would be index 3
        if ($logdata[4] eq $_) {
            $kboard{$logdata[0]}++;
        }
    }
    for (keys %kboard) {
        print "$_ = $kboard{$_}<BR>\n";
    }
    print "<BR><BR>";
}
@scoredata is what I read my data file into. Thanks again everyone.