Re^2: Word Count and Match

Re^3: Word Count and Match by davido (Cardinal) on Jan 07, 2021 at 22:50 UTC
Your sample code is not a short self-contained snippet that demonstrates the behavior you describe. Your data is not part of the code, and the code that is relevant is incorporated in a subroutine that your example never calls, and it has external dependencies that aren't needed, and that aren't loaded by the snippet. That means anyone wanting to respond to help you has to refactor the code so that it can run stand-alone, and in doing that, anyone trying to help you might inadvertently fix the thing that you're describing as working incorrectly.

Precision in these sorts of things is important. We shouldn't be chasing you down to demonstrate the actual bug to us in a way that we can repeat in our own tests.

I took a stab at doing this; I pulled the data into a __DATA__ segment, removed the need for external files, and caused your output to print entirely to STDOUT, but preserved in the output the names of the handles you were printing to. I also removed the colorization, since it is an external dependency that you didn't "use" in your snippet. So I assume it's not part of the problem. Additionally, I added the formatting back in that I needed to be able to understand your code.
Having done all that stuff that you should have done, this is what I came up with:

`#!/usr/bin/env perl
use strict;
use warnings;
my %count;
my $namecnt='David|Tom|Sam|Will|Dave|William|Thomas';
while(<DATA>){
    my @words = split(":");
    foreach my $word (@words){
        if($word=~/($namecnt)/io){
            $count{$1}++;
        }
    }
}
foreach my $word (sort keys %count) {
    printf("(STDOUT) %39s %-14s %-19s %6s", "There are", $count{$word}, $word, "Name(s)\n");
    print "(OUTPUT) There are $count{$word} $word Name(s)\n";
}
__DATA__
1:NAME:Bob:Phone
2:NAME:Dave:Phone
3:NAME:Will:Phone
4:NAME:Todd:Phone`

And when I run that I get:

`(STDOUT) There are 1 Dave Name(s)
(OUTPUT) There are 1 Dave Name(s)
(STDOUT) There are 1 Will Name(s)
(OUTPUT) There are 1 Will Name(s)`

Which, to me, is an indication that the code is behaving as designed, and that you are NOT getting some summary count at the end like you said you are getting. At least not from the code you provided.

So where are we now? You've asked a question, people said that the original question didn't demonstrate the problem being described. You took a stab at providing a better example of the code, but still failed to demonstrate that there is actually a problem with the code you posted. If I had to guess, I would say you have a print statement somewhere that you have forgotten about. Either way, people who are just trying to help had their willingness to help squandered.

Dave
Re^4: Word Count and Match by LanX (Saint) on Jan 07, 2021 at 22:55 UTC
> Either way, people who are just trying to help had their willingness to help squandered.

He has always been like this, and always got similar replies. And next time will be a déjà vu again.

Cheers Rolf
(addicted to the Perl Programming Language :)
Re^5: Word Count and Match by davido (Cardinal) on Jan 07, 2021 at 22:57 UTC
I know. I keep hoping, given the number of years, things will be better.

Dave
Re^6: Word Count and Match by LanX (Saint) on Jan 07, 2021 at 23:16 UTC
Re^7: Word Count and Match by Marshall (Canon) on Jan 09, 2021 at 17:58 UTC
Re^4: Word Count and Match by PilotinControl (Pilgrim) on Jan 08, 2021 at 13:30 UTC
@Dave: There actually is a problem with the code. I've modified the code below and the output. As you can see it lists the names twice instead of just once. That's why I wanted to match the data exactly using the specific field in the file.

`#!/usr/bin/env perl
use strict;
use warnings;
my %count;
my $namecnt='David|Tom|Sam|Will|Dave|William|Thomas';
while(<DATA>){
    my @words = split(":");
    foreach my $word (@words){
        if($word=~/($namecnt)/io){
            $count{$1}++;
        }
    }
}
foreach my $word (sort keys %count) {
    printf("(STDOUT) %39s %-14s %-19s %6s", "There are", $count{$word}, $word, "Name(s)\n");
    print "(OUTPUT) There are $count{$word} $word Name(s)\n";
}
__DATA__
1:NAME:Bob:Bobville:Phone
2:NAME:Dave:Davis:Phone
3:NAME:Will:Willard:Phone
4:NAME:Todd:Toadlane:Phone`

`(STDOUT) There are 1 Dave Name(s)
(OUTPUT) There are 1 Dave Name(s)
(STDOUT) There are 2 Will Name(s)
(OUTPUT) There are 2 Will Name(s)`
Re^5: Word Count and Match by choroba (Cardinal) on Jan 08, 2021 at 13:44 UTC
When matching, use anchors in the regex to only match the whole word:

`my $namecnt = qr/^(David|Tom|Sam|Will|Dave|William|Thomas)$/;`

`^` matches at the beginning, `$` matches at the end. The match will now look like this (I removed the `/o` as it's not recommended):

`if ($word =~ /$namecnt/i) {`

Note that the first argument to split is a regex. It's clearer to write

`my @words = split /:/;`

`map{substr$_->[0],$_->[1]||0,1}[\||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^ARGV,3]`
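A minimal sketch of what the anchors change, run against the four sample records posted earlier in the thread. The helper `count_names` and the variable names `$loose` and `$strict` are hypothetical, introduced only for this illustration; the patterns themselves are the ones discussed above.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: split each record on ":" and count the fields
# that match $re. $re must contain one capture group.
sub count_names {
    my ($re, @lines) = @_;
    my %count;
    for my $line (@lines) {
        chomp(my $copy = $line);
        for my $field (split /:/, $copy) {
            $count{$1}++ if $field =~ $re;
        }
    }
    return \%count;
}

my @lines = (
    "1:NAME:Bob:Bobville:Phone\n",
    "2:NAME:Dave:Davis:Phone\n",
    "3:NAME:Will:Willard:Phone\n",
    "4:NAME:Todd:Toadlane:Phone\n",
);

my $names  = 'David|Tom|Sam|Will|Dave|William|Thomas';
my $loose  = qr/($names)/i;     # may match anywhere inside a field
my $strict = qr/^($names)$/i;   # the whole field must be one of the names

my $loose_count  = count_names($loose,  @lines);
my $strict_count = count_names($strict, @lines);

# "Willard" contains "Will", so the unanchored pattern counts it too.
printf "loose:  Will=%d Dave=%d\n", $loose_count->{Will},  $loose_count->{Dave};
printf "strict: Will=%d Dave=%d\n", $strict_count->{Will}, $strict_count->{Dave};
```

With the unanchored pattern, `Will` is counted twice because `Willard` also matches; with the anchored pattern each name is counted once.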
Re^5: Word Count and Match by LanX (Saint) on Jan 08, 2021 at 13:41 UTC
> As you can see it lists the names twice instead of just once.

Do you understand the effects of `print` and `printf`?

Cheers Rolf
(addicted to the Perl Programming Language :)
Re^5: Word Count and Match by davido (Cardinal) on Jan 08, 2021 at 19:58 UTC
No, there is no problem with the code's output. Your original code was printing using printf to STDOUT, and it was printing via print to the OUTPUT file handle. Two distinct destinations; one the terminal, the other a file. For the purposes of this demonstration I switched it so that both print to STDOUT, because printing to a file adds complexity to the example that you don't need. But in the print output I identify which output stream your original code would have printed to.

So it is intentional that we are getting output twice. As you can see it is not identical output. And as you can see from the code, it is because we have a printf statement, and a print statement, just as your original code did.

So in other words, you still haven't shown us the code that is misbehaving, unless the misbehavior was that you were printing to STDOUT and to an output file because you have two print statements. Do you need for us to suggest removing one of the two print statements?

Dave
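To make the two destinations concrete, here is a rough sketch in which each name is reported once per handle. The sub `report` and the in-memory file standing in for the OUTPUT handle are assumptions made for this illustration; the original poster's file-opening code was not shown in this reply.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: write one line per name to each of two handles,
# printf to the "terminal" handle and print to the "file" handle.
sub report {
    my ($term_fh, $file_fh, $count) = @_;
    for my $name (sort keys %$count) {
        printf {$term_fh} "%s %-4s %-8s %s\n",
            'There are', $count->{$name}, $name, 'Name(s)';
        print {$file_fh} "There are $count->{$name} $name Name(s)\n";
    }
    return;
}

my %count = (Dave => 1, Will => 1);

# An in-memory file stands in for OUTPUT so nothing touches the disk.
open(my $file_fh, '>', \my $file_text)
    or die "Cannot open in-memory file: $!";

report(\*STDOUT, $file_fh, \%count);
close $file_fh;

# Show what the "file" received, separately from the terminal output.
print "--- contents of the stand-in OUTPUT file ---\n", $file_text;
```

Each handle receives exactly one line per name. The output only looks doubled when both handles end up pointing at the same place, as they do in the STDOUT-only demonstration above.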
Re^5: Word Count and Match by eyepopslikeamosquito (Archbishop) on Jan 08, 2021 at 15:51 UTC
I see you're still using the `/o` modifier in:

`if($word=~/($namecnt)/io){`

Did you bother to read and understand my earlier reply?
Re^5: Word Count and Match by Marshall (Canon) on Jan 09, 2021 at 20:16 UTC
The first problem that I had with your code was the confusing name, $namecnt. I expected a numeric scalar for that type of name! I changed that var name to "$names2cnt" to imply multiple names which will be counted - I assume in a case insensitive manner.

Added: William test case

`use strict;
use warnings;
my %name_count;
my $names2cnt='David|Tom|Sam|Will|Dave|William|Thomas';
while (my $line = <DATA>) {
    next unless ($line =~ /\S/);   #skip blank lines
    my $name = (split (":",$line))[2];
    $name_count{$1}++ if $names2cnt =~ /\b($name)\b/i;
}
foreach my $name (sort keys %name_count) {
    print "$name => $name_count{$name}\n";
}
=prints:
Dave => 1
Will => 3    #allows Will and WILL and WiLL spellings
William => 1
=cut
__DATA__
1:NAME:Bob:Bobville:Phone
2:NAME:Dave:Davis:Phone
3:NAME:Will:Willard:Phone
4:NAME:Todd:Toadlane:Phone
5:NAME:WILL:Street:Phone
6:NAME:WiLL:Street2:phone2
7:NAME:WilliaM:xyz:1234`

Update: I'm not sure that this \b stuff in the regex is necessary. I put some obvious test cases into the code, but not all possible test cases.
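One way to sidestep the `\b` question is to avoid the regex and look the field up in a hash instead. This is a different technique from the code posted above, sketched under the assumption that the name is always the third colon-separated field; `%wanted` and `count_by_field` are hypothetical names used only here.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Lower-cased name => canonical spelling, for case-insensitive lookup.
my %wanted = map { lc($_) => $_ } qw(David Tom Sam Will Dave William Thomas);

# Hypothetical helper: count exact, case-insensitive matches of the
# field at $field_index against the keys of %wanted.
sub count_by_field {
    my ($field_index, @lines) = @_;
    my %name_count;
    for my $line (@lines) {
        next unless $line =~ /\S/;            # skip blank lines
        chomp(my $copy = $line);
        my $field = (split /:/, $copy)[$field_index];
        next unless defined $field;           # record has too few fields
        my $canonical = $wanted{ lc $field } or next;
        $name_count{$canonical}++;
    }
    return \%name_count;
}

my @lines = (
    "1:NAME:Bob:Bobville:Phone\n",
    "2:NAME:Dave:Davis:Phone\n",
    "3:NAME:Will:Willard:Phone\n",
    "4:NAME:Todd:Toadlane:Phone\n",
    "5:NAME:WILL:Street:Phone\n",
    "6:NAME:WiLL:Street2:phone2\n",
    "7:NAME:WilliaM:xyz:1234\n",
);

my $count = count_by_field(2, @lines);
print "$_ => $count->{$_}\n" for sort keys %$count;
```

Because the field is never interpolated into a pattern, regex metacharacters in the data cannot change what matches, and a partial name such as `Dav` cannot match part of a longer entry.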
Re^3: Word Count and Match by eyepopslikeamosquito (Archbishop) on Jan 07, 2021 at 22:00 UTC
That is not a Short, Self-Contained, Correct Example. Your program should "just work" when we download it and run it. You should test that before posting (you should also use strict and warnings). You seem to be a pilot out of control today.

See also: I know what I mean. Why don't you?

