Re: Counting words..
by BrowserUk (Patriarch) on Oct 24, 2005 at 15:17 UTC
... what does the following mean?
$value =~ s/%([a-fA-F0-9][a-fA-F0-9])/pack("C", hex($1))/eg;
It decodes URL-encoded text. That is, when you see URLs containing characters that have been encoded, e.g. http://news.bbc.co.uk/1/hi/technology/default%20.stm, it looks for the %xx encoding and translates that back to an ASCII character.
s/
% ## find (and discard) the % char
( ## capture
[a-fA-F0-9][a-fA-F0-9] ## two hex characters, captured as group 1
)
/
pack("C", hex($1)) ## convert the hex characters to a number, then pack to a character
/egx; ## e: evaluate the replacement as code; g: replace all in the target string; x: allow the layout and comments above
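To see the substitution in action, here is a minimal sketch that applies it to the BBC URL mentioned above. The variable name $value comes from the question; the rest is only for illustration.

```perl
use strict;
use warnings;

# The example URL from above, containing one encoded space (%20).
my $value = 'http://news.bbc.co.uk/1/hi/technology/default%20.stm';

# Replace each %xx escape with the character whose code is hex xx.
$value =~ s/%([a-fA-F0-9][a-fA-F0-9])/pack("C", hex($1))/eg;

print "$value\n";   # the %20 is now a literal space
```

Note that this only handles %xx escapes; it does not, for instance, turn + back into a space the way form decoding usually does.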
Re: Counting words..
by ikegami (Patriarch) on Oct 24, 2005 at 15:08 UTC
The substitution decodes URL-encoded strings. It does the same thing as uri_unescape in URI::Escape. CGI already decodes parameters for you. Is there any reason you're not using that module?
my $count = () = /$keyword/gi; will count the number of occurrences. The () forces list context.
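A small self-contained sketch of that idiom, assuming a made-up keyword and input string (both hypothetical) and an explicit $text instead of $_:

```perl
use strict;
use warnings;

my $keyword = 'cat';                       # hypothetical search term
my $text    = 'cat Cat concatenate dog';   # hypothetical input

# In list context m//g returns every match. Assigning that list to ()
# and then to a scalar yields the number of elements that were matched.
my $count = () = $text =~ /$keyword/gi;

print "$count\n";   # prints 3 (cat, Cat, and the "cat" inside concatenate)
```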
Re: Counting words..
by japhy (Canon) on Oct 24, 2005 at 15:10 UTC
The form validation code you've shown takes a URL-encoded escape sequence (like %7E) and replaces it with the character it encoded (like "~"). But you should let CGI.pm take care of your form processing. It does it right.
As for the number of times a pattern appears in a string, I'd suggest: my $count = 0; ++$count while $string =~ /$pattern/g;
Re: Counting words..
by mulander (Monk) on Oct 24, 2005 at 15:08 UTC
Here is one way to do it:
perl -ne '$count++ while /seek/ig; print $count,"\n" if eof;' file.txt
The /g modifier tells the regex to search for more matches if possible, and for each match the while loop's body is executed ($count++ in this case). So this one-liner will tell you how many times it saw 'seek' (in any letter case) in a file.
Re: Counting words..
by lepetitalbert (Abbot) on Oct 24, 2005 at 15:06 UTC
if (/$keyword/i)
replace with
if (/$keyword/ig)
have a nice day
Re: Counting words..
by Not_a_Number (Prior) on Oct 24, 2005 at 16:43 UTC
Wrt the wordcount, you say that you want to match a specific word, but all the solutions so far provided match strings. To illustrate the difference:
my $string = 'Cathy placated her cat,
which was trying to catch a caterpillar.';
my $keyword = 'cat';
my $count = 0;
++$count while $string =~ /$keyword/gi;
print "Occurrences of '$keyword': $count";
Output: Occurrences of 'cat': 5
If that's not what you want, wrap your keyword in word boundary metacharacters:
$string =~ /\b$keyword\b/gi;
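For comparison, here is a sketch of the same count with word boundaries applied to the same sentence. The \Q...\E quoting is my own addition (an assumption that the keyword should be treated as literal text, not as a regex).

```perl
use strict;
use warnings;

my $string  = 'Cathy placated her cat, which was trying to catch a caterpillar.';
my $keyword = 'cat';

my $count = 0;
# \b on both sides means only the standalone word matches;
# \Q...\E escapes any regex metacharacters in the keyword.
++$count while $string =~ /\b\Q$keyword\E\b/gi;

print "Occurrences of the word '$keyword': $count\n";   # prints 1
```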
Many thanks for the replies,
I got it working using $count++ while /$keyword/ig;
I print $count, then reset the counter (as I've got it in a loop to count each page searched).
I'm wrestling with searching for words/strings now (which is semi-working!), and that will probably result in a new post!
Thanks again people,
NL
This may prove a bit more efficient:
$count = () = /$keyword/ig;
It puts the match into list context, which causes it to return all the matches; the list assignment is then evaluated in scalar context, which yields the number of elements matched. This is a somewhat common Perl idiom. And you don't have to reset the counter.
Of course, if you need to accumulate several counts, make it += instead of =, and do remember to reset the counter.
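A minimal sketch of both variants, assuming the pages are already held in an array of strings (the @pages data and the keyword are hypothetical):

```perl
use strict;
use warnings;

my $keyword = 'perl';   # hypothetical search term
my @pages   = ('Perl is fun. perl!', 'No match here', 'PERL perl Perl');   # hypothetical pages

my $total = 0;
my @per_page;
for my $page (@pages) {
    # Plain assignment: the count is fresh for every page, nothing to reset.
    my $count = () = $page =~ /$keyword/gi;
    push @per_page, $count;

    # += accumulates across pages, so $total has to be reset by hand
    # whenever a new series of counts starts.
    $total += () = $page =~ /$keyword/gi;
}

print "per page: @per_page\n";   # prints: per page: 2 0 3
print "total: $total\n";         # prints: total: 5
```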
Re: Counting words..
by kwaping (Priest) on Oct 24, 2005 at 17:10 UTC
This will also work for counting words (aka occurrences of a pattern in a string):
#!/usr/bin/perl
use strict;
use warnings;
read DATA, my $text, 40;
### this is the important line ###
my $wordcount = () = $text =~ /test/gi;
print $wordcount;
__DATA__
TEST
tester
testing
asdf lalala greatest
Update: The while++ solution previously posted takes half as long to run as this one. Can anyone explain why?
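One rough way to check the timing yourself is the core Benchmark module. The sketch below is only that: the input text and the sub names are made up, and the numbers will vary by machine and Perl version. A plausible (unverified here) factor is that the list-assignment form has to build the whole list of matches before counting it, while the loop only bumps a counter.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical input: the sample DATA lines repeated many times.
my $text = join "\n", ('TEST tester testing asdf lalala greatest') x 1000;

# Count via list assignment in scalar context.
sub count_list  { my $n = () = $text =~ /test/gi; return $n }

# Count via a scalar-context /g loop.
sub count_while { my $n = 0; ++$n while $text =~ /test/gi; return $n }

# Run each variant for at least one CPU second and print a comparison table.
cmpthese(-1, {
    list_assign => \&count_list,
    while_loop  => \&count_while,
});
```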