Beefy Boxes and Bandwidth Generously Provided by pair Networks
Welcome to the Monastery
 
PerlMonks  

Increasing key/values held in a Hash from an Array

by Gavin (Archbishop)
on Mar 17, 2006 at 21:13 UTC ( [id://537579]=perlquestion: print w/replies, xml ) Need Help??

Gavin has asked for the wisdom of the Perl Monks concerning the following question:

I have a hash (%hash) containing the following Data.
Key (holding the score)
Value (holding a line of text)
Key    Value
12     First line of text
25     Second line of text
34     Third line of text
An array @scoreWords holds a list of special words. I would like to increase the score held in each Key by $y for each occurrence of any of the special words on each line.

Any help would be much appreciated
I have a second hash containing the same text in the Values and the line numbers in the Key
Key    Value
1     First line of text
2     Second line of text
3     Third line of text
I would like to use the sorted key scores from the first hash to find the highest scoring lines of text in the second hash by their line numbers.
  • Comment on Increasing key/values held in a Hash from an Array

Replies are listed 'Best First'.
Re: Increasing key/values held in a Hash from an Array
by ikegami (Patriarch) on Mar 17, 2006 at 21:21 UTC

    A hash is inappropriate here. Your key is not a key at all. A key is a *unique* way of *identifying* something. A score is neither unique, nor an identity.

    my @data = ( [ 12, 'First line of text' ], [ 25, 'Second line of text' ], [ 34, 'Third line of text' ], ); foreach (@scoreWords) { my $re = qr/\b\Q$_\E\b/i; foreach (@data) { $_[0] += $y while $_[1] =~ /$re/g; } }

    Update: I either missed the second part of the question, or the OP has been updated. Here's my updated answer:

    my @data = ( [ 1, 12, 'First line of text' ], [ 2, 25, 'Second line of text' ], [ 3, 34, 'Third line of text' ], ); foreach (@scoreWords) { my $re = qr/\b\Q$_\E\b/i; foreach (@data) { $_[1] += $y while $_[2] =~ /$re/g; } } print("Line ${$_}[0] has a score of ${$_}[1]\n") foreach sort { $b->[1] <=> $a->[1] } @data;
    or if you don't actually care about the line numbers:
    my @data = ( [ 12, 'First line of text' ], [ 25, 'Second line of text' ], [ 34, 'Third line of text' ], ); foreach (@scoreWords) { my $re = qr/\b\Q$_\E\b/i; foreach (@data) { $_[0] += $y while $_[1] =~ /$re/g; } } print("${$_}[0]: ${$_}[1]\n") foreach sort { $b->[0] <=> $a->[0] } @data;

    Update: Changed if // to while //g.

Re: Increasing key/values held in a Hash from an Array
by xdg (Monsignor) on Mar 17, 2006 at 21:33 UTC

    I'd reverse the nature of the hash and store lines as keys and values as the count.

    use strict; use warnings; my @words = qw( alpha beta gamma ); my @lines = ( 'alpha comes before beta and gamma, but not alphabet', 'beta comes after alpha; beta comes before gamma', 'gamma is the last word', ); my %count; for my $line ( @lines ) { for my $word ( @words ) { my $c =()= ($line =~ m/\b\Q$word\E\b/g); $count{$line} += $c; } } print map { "$count{$_}: $_\n" } @lines;

    Gives:

    3: alpha comes before beta and gamma, but not alphabet 4: beta comes after alpha; beta comes before gamma 1: gamma is the last word

    (I'm sure this can be golfed, but I think it's clearer this way.)

    Update: reading ikegami's answer, we're both on the right track, but I think solving slightly different problems. I'm not sure which one is the one you want: a count of occurances of keys across all lines or a count of occurances of keys within lines. And do you want to count multiple occurances of each special word? My answer gives the count by line including multiple occurances.

    Also, it reminded me to add \Q and \E for safety's sake, so I've made that change.

    Update 2: For those who don't recognize it, see Perl Idioms Explained - my $count = () = /.../g.

    -xdg

    Code written by xdg and posted on PerlMonks is public domain. It is provided as is with no warranties, express or implied, of any kind. Posted code may not have been tested. Use of posted code is at your own risk.

Re: Increasing key/values held in a Hash from an Array
by fizbin (Chaplain) on Mar 17, 2006 at 21:36 UTC

    Your data structure is malformed. You don't want the key to be the score, that is, something subject to change.

    What happens if I bump up the score of something that was at 13 by two, but there's another sentence at 15? Or what if there's a sentence at 12, and one at 13, and the 12 one gets bumped up by 3 and the 13 one gets bumped up by two? What's going to happen with that data structure is that one or the other will get wiped out.

    So first, we take your hash and replace it with a saner data structure:

    my @scorearray = map { {score=>$_, text=>$hash{$_}} } keys %hash;

    Now that that's done, what you ask is quite easy:

    for my $word (@scoreWords) { my $qmword = quotemeta($word); my $regexp = qr/\b$qmword\b/; for my $scorebit (@scorearray) { $scorebit->{'score'} += 1 if $scorebit->{'text'} =~ $regexp; } }

    You can even print it out in a pretty table:

    printf "%-7s%s\n", "Score", "Value"; for my $scorebit (@scorearray) { printf "%5d %s\n", $scorebit->{'score'}, $scorebit->{'text'}; }
    --
    @/=map{[/./g]}qw/.h_nJ Xapou cets krht ele_ r_ra/; map{y/X_/\n /;print}map{pop@$_}@/for@/
      I think the Map function may suit my needs.
      But I dont understand fully what's going on Fizbin's script. Could someone explain further how to get the values into the scorearray or suggest a tutorial. I am very new to perl as you might gather and am finding these expressions difficult to follow.
      Thanks to all for their help.
Re: Increasing key/values held in a Hash from an Array
by leocharre (Priest) on Mar 17, 2006 at 21:42 UTC

    This is buggy as far as scoring, but the idea is here.

    #!/usr/bin/perl -w use strict; my $text = qq|If you have a question on how to do something in Perl, o +r you need a Perl solution to an actual real-life problem, or you're unsure why something you've tri +ed just isn't working... then this is the place to ask. However, you might consider asking in the chatterbox first (if you're +a registered user). The response time tends to be quicker, and if it turns out that the pr +oblem/solutions are too much for the cb to handle, the kind monks will be sure to direct y +ou here.|; # build lines hash my %lines; my $k=1; for (split(/\n/,$text)){ $lines{$k}=$_; $k++; } # score hash # put your words here, and the value for each my %score_guide = ( 'for'=>1, 'the'=>1, 'if'=>2, 'you'=>3, 'moguai'=>40 ); # this will hold key= line num, val= score my %score_results =(); # score each line for (keys %lines){ $score_results{$_} = score_line($lines{$_}); } # feedback for (keys %score_results){ print "line $_ score $score_results{$_}\n"; } sub score_line { my ($string)=$_[0]; # my $score=0; for (sort keys %score_guide){ while ($string=~s/\b\Q$_\E\b//i){ # corrected by ikegami $score+=$score_guide{$_}; } } return $score; }

    output:

    [leo@mescaline ~]$ perl scorelines.pl line 1 score 8 line 2 score 6 line 3 score 1 line 4 score 9 line 5 score 4 line 6 score 6

      /\b$_\b/
      should be
      /\b\Q$_\E\b/
      since the OP said he had an array of words (as opposed to regexps).

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://537579]
Approved by friedo
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others drinking their drinks and smoking their pipes about the Monastery: (3)
As of 2024-04-25 07:54 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found