Counting the Times Two Unknown Values Appear Together

by Dru (Hermit)
on Jul 15, 2007 at 02:44 UTC

Dru has asked for the wisdom of the Perl Monks concerning the following question:

Monks,

I am trying to count the number of times an unknown value appears on the same line as another unknown value. When I say "unknown" I mean I do not know beforehand what the value will be, so I can't match it using a regex or something, but I do know the position of these values on the line.

I created a hash reference (I believe this is what this data type is called) that saves the data that I want, but haven't been able to figure out the counting piece.

Referring to my snippet of code below, I would like to count the number of times $sport appears on the same line as $src in a file.

The file is rather large, so I would prefer not to process it twice.

while (<FILE>){
    my ($src,$sport) = (split /;/)[9,12];
    $hash{$src}={sport => $sport};
}

Thanks,
Dru

Perl, the Leatherman of Programming languages. - qazwart

Replies are listed 'Best First'.
Re: Counting the Times Two Unknown Values Appear Together
by kyle (Abbot) on Jul 15, 2007 at 03:07 UTC

    I think maybe you want this:

    while (<FILE>){
        my ($src,$sport) = (split /;/)[9,12];
        $hash{$src}{$sport}++;
    }

    Then you can say something like...

    foreach my $src ( keys %hash ) {
        foreach my $sport ( keys %{$hash{$src}} ) {
            printf "%s appears with %s %d times\n",
                $src, $sport, $hash{$src}{$sport};
        }
    }

    Or just use something like Data::Dumper or YAML to spew out the whole structure.
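
A minimal, self-contained sketch of the nested-hash count with a Data::Dumper dump, for anyone who wants to try it without the real file. The `make_line` helper and the sample addresses and ports are hypothetical stand-ins for the asker's data; only the `(split /;/)[9,12]` slice and the `$hash{$src}{$sport}++` idiom come from the thread.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical helper: builds a 13-field record with $src in field 9
# and $sport in field 12 (zero-based), padded with 'x'.
sub make_line {
    my ($src, $sport) = @_;
    return join ';', ('x') x 9, $src, 'x', 'x', $sport;
}

# Hypothetical sample data standing in for the lines of the real file.
my @lines = map { make_line(@$_) } (
    [ '10.0.0.1', 80  ],
    [ '10.0.0.1', 80  ],
    [ '10.0.0.1', 443 ],
    [ '10.0.0.2', 80  ],
);

my %hash;
for (@lines) {
    my ($src, $sport) = (split /;/)[9,12];
    $hash{$src}{$sport}++;    # inner hashes are autovivified on first use
}

$Data::Dumper::Sortkeys = 1;  # stable key order in the dump
print Dumper(\%hash);
```

With a real file, a `chomp` before the split is worth adding if field 12 can be the last field on the line, so the trailing newline does not end up in the key.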

      I'd simplify it a bit further:

      my %count;
      while (<FILE>) {
          my( $src, $sport ) = (split /;/)[9,12];
          $count{$src,$sport}++;    # magic $;
      }
      for my $pair ( sort keys %count ) {
          print "($pair) occurred on a line together $count{$pair} times\n";
          # if you need the two parts:
          # my( $src, $sport ) = split $;, $pair;
      }

      In fact, I'd extend it to any number of fields:
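
      my @fields = ( 9, 12 );
      my %count;
      while (<FILE>) {
          chomp;
          # join explicitly: a list slice inside the subscript is not
          # joined with $; the way a literal comma list is
          $count{ join $;, (split /;/)[@fields] }++;
      }
      for ( sort keys %count ) {
          print "($_) occurred on a line together $count{$_} times\n";
          # if you need the parts:
          # my @vec = split $;, $_;
      }
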
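
A small sketch of the composite-key approach that can be run without the real file. As far as perlvar describes it, `$count{$a,$b}` is shorthand for `$count{join($;, $a, $b)}`, so building the key with an explicit `join` gives the same keys for any number of fields. The `make_line` helper and the sample values are hypothetical; `$SUBSEP` is the English-module name for `$;`.

```perl
use strict;
use warnings;
use English qw(-no_match_vars);    # $SUBSEP is a readable alias for $;

# Hypothetical helper: 13-field record with $src in field 9, $sport in field 12.
sub make_line {
    my ($src, $sport) = @_;
    return join ';', ('x') x 9, $src, 'x', 'x', $sport;
}

# Hypothetical sample data standing in for the lines of the real file.
my @lines = map { make_line(@$_) } (
    [ '10.0.0.1', 80  ],
    [ '10.0.0.1', 80  ],
    [ '10.0.0.1', 443 ],
    [ '10.0.0.2', 80  ],
);

my @fields = ( 9, 12 );
my %count;
for my $line (@lines) {
    # Explicit join on the subscript separator: one flat key per combination.
    my $key = join $SUBSEP, (split /;/, $line)[@fields];
    $count{$key}++;
}

for my $key ( sort keys %count ) {
    # Split the composite key back into its parts for display.
    my @parts = split /\Q$SUBSEP\E/, $key;
    printf "(%s) occurred on a line together %d times\n",
        join( ', ', @parts ), $count{$key};
}
```

The flat key keeps everything in a single hash, at the cost of assuming the separator character (`"\034"` by default) never occurs in the data.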
      Cool, this is what I went with and it does exactly what I want, thanks!

      P.S. I looked at your home node and saw "Other monks I've met in person:" line and when I first read it, I thought it said "Other monks I've met in prison!!"

      P.P.S. Thanks to the other monks for their responses as well.

      Thanks,
      Dru

      Perl, the Leatherman of Programming languages. - qazwart
Re: Counting the Times Two Unknown Values Appear Together
by naikonta (Curate) on Jul 15, 2007 at 03:08 UTC
    Well, assuming that you can spot the position and save it as $sport, then you already have something to compare with. If I'm not mistaken, do you mean something like (untested):

    my %count;
    my $total;
    while (<FILE>) {
        chomp;
        my($src, $sport) = (split /;/)[9,12];
        # \Q...\E so that regex metacharacters in the value match literally
        my $number_of_sport_shows_on_this_line = () = /\Q$sport\E/gi;
        $count{$.} = $number_of_sport_shows_on_this_line;
        $total += $number_of_sport_shows_on_this_line;
    }
    # iterate %count to see how many times the pattern shows up on each line
    # $total is the total number of times the pattern shows up in the file

    If you avoid regexes:

    my(%count, $total);
    while (<FILE>) {
        chomp;
        my @fields = split /;/;
        my %seen;
        $seen{$_}++ for @fields;
        my $target = $fields[12];
        $count{$.} = $seen{$target};
        $total += $seen{$target};
    }

    Update: (15-07-2007) I misread the title and missed the "appear together" part. I second kyle's reply.


    Open source softwares? Share and enjoy. Make profit from them if you can. Yet, share and enjoy!

Re: Counting the Times Two Unknown Values Appear Together
by wind (Priest) on Jul 15, 2007 at 03:52 UTC
    There's nothing stopping you from using a regular expression. As long as you escape the metacharacters, $variables will work just fine in a regex.

    my $count = 0;
    while (<FILE>){
        $count++ if /\Q$src/ && /\Q$sport/;
    }
    print "Both found a total of $count times\n";

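
A short illustration of why the `\Q` escape matters when the interpolated value contains regex metacharacters. The address and the sample line are hypothetical values chosen to show the difference.

```perl
use strict;
use warnings;

my $src  = '10.0.0.1';        # hypothetical value; each '.' is a regex metacharacter
my $line = 'x;10a0b0c1;y';    # does not contain the literal string 10.0.0.1

# Without \Q, each '.' matches any character, so this line matches by accident.
my $unquoted = $line =~ /$src/      ? 1 : 0;

# With \Q...\E the value is matched literally, so this line does not match.
my $quoted   = $line =~ /\Q$src\E/  ? 1 : 0;

print "unquoted match: $unquoted, quoted match: $quoted\n";
```

The built-in `quotemeta` function does the same escaping outside of a pattern.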
    - Miller

    Update: (03:58 UTC) Ahh, I misunderstood "beforehand". I agree that kyle's reply is the way to go.

Re: Counting the Times Two Unknown Values Appear Together
by johngg (Canon) on Jul 15, 2007 at 16:32 UTC
    If you know the position of the fields in the line there's no reason you can't use a regex to do the job. Capture the 10th field then use a back-reference to the 10th in the 13th position.

    use strict;
    use warnings;

    my @strings = qw{
        0;1;2;3;4;5;6;7;8;9;10;11;12;13
        0;1;2;3;4;5;6;7;8;9;10;11;9;8
        0;1;2;3;4;5;6;7;8;9;10;11;99
        0;1;2;3;4;5;6;7;8;9;10;11;9
        0;1;2;3;4;5;6;7;8;;10;11;;13
        0;1;2;3;4;5;6;7;8;;10;11;12;13
    };

    my $rxByPosn = qr
        {(?x)
            \A               # Start of string
            (?:[^;]*;){9}    # Nine fields/delimiters
            ([^;]*)          # Capture 10th field, which
                             # could be blank
            ;                # Delimiter
            (?:[^;]*;){2}    # Two more fields/delimiters
            \1               # 13th field must match 10th
            (?:;|\z)         # Delimiter or end of string
        };

    foreach my $string ( @strings )
    {
        print
            qq{String - $string\n},
            $string =~ $rxByPosn
                ? qq{   Col. 10 matches col. 13\n}
                : qq{   Cols 10 and 13 differ\n} ;
    }

    The output.

    String - 0;1;2;3;4;5;6;7;8;9;10;11;12;13
       Cols 10 and 13 differ
    String - 0;1;2;3;4;5;6;7;8;9;10;11;9;8
       Col. 10 matches col. 13
    String - 0;1;2;3;4;5;6;7;8;9;10;11;99
       Cols 10 and 13 differ
    String - 0;1;2;3;4;5;6;7;8;9;10;11;9
       Col. 10 matches col. 13
    String - 0;1;2;3;4;5;6;7;8;;10;11;;13
       Col. 10 matches col. 13
    String - 0;1;2;3;4;5;6;7;8;;10;11;12;13
       Cols 10 and 13 differ
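
For comparison, the same positional check can be sketched with `split` instead of a back-reference. This is an alternative, not the code from the reply above; the `cols_match` helper name is hypothetical, and the test strings are the six from the example.

```perl
use strict;
use warnings;

my @strings = qw{
    0;1;2;3;4;5;6;7;8;9;10;11;12;13
    0;1;2;3;4;5;6;7;8;9;10;11;9;8
    0;1;2;3;4;5;6;7;8;9;10;11;99
    0;1;2;3;4;5;6;7;8;9;10;11;9
    0;1;2;3;4;5;6;7;8;;10;11;;13
    0;1;2;3;4;5;6;7;8;;10;11;12;13
};

# Hypothetical helper: true if the 10th and 13th fields are both present
# and equal. The -1 limit keeps trailing empty fields instead of dropping them.
sub cols_match {
    my ($string) = @_;
    my ($tenth, $thirteenth) = (split /;/, $string, -1)[9,12];
    return ( defined $tenth && defined $thirteenth && $tenth eq $thirteenth ) ? 1 : 0;
}

my @results = map { cols_match($_) } @strings;

for my $i ( 0 .. $#strings ) {
    print "String - $strings[$i]\n",
        $results[$i] ? "   Col. 10 matches col. 13\n"
                     : "   Cols 10 and 13 differ\n";
}
```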

    I hope this is of use.

    Cheers,

    JohnGG
