He said (and I think he meant) 5 to 10 FIVE digit zip codes. And I think he was just going to hard code those zip codes in the script. So perhaps something like: #!/usr/local/bin/perl -l -w
use strict;
use CGI;
my @zips = qw(
92714
92715
92716
);
my %zips;
@zips{@zips} = ();
my $q = CGI->new;
print $q->header;
print $q->start_html;
if (my $zip = $q->param('zip')) {
print "<br>You entered $zip";
my $valid = (exists $zips{$zip}) ? 'valid' : 'invalid';
print "<br>It is $valid";
} else {
print $q->start_form;
print $q->textfield(-name=>"zip");
print $q->submit("Go");
print $q->end_form;
}
print $q->end_html;
Update: Ok, so you can get rid of the hash and replace 'exists $zip{$zips}' with
'grep { $_ eq $zip } @zips' (see my benchmarks further down this thread), although if you were to port this to mod_perl or the like I'd stick with a hash (though it'd be defined differently) :-) | [reply] [d/l] |
| [reply] |
1 0 wallclock secs (-0.89 usr + 0.00 sys = -0.89 CPU)
2 0 wallclock secs (-1.00 usr + 0.00 sys = -1.00 CPU)
3 -1 wallclock secs (-0.67 usr + 0.00 sys = -0.67 CPU)
4 1 wallclock secs (-0.35 usr + 0.01 sys = -0.34 CPU)
5 1 wallclock secs (-0.05 usr + 0.01 sys = -0.04 CPU)
6 0 wallclock secs ( 0.32 usr + 0.00 sys = 0.32 CPU)
7 2 wallclock secs ( 0.68 usr + 0.00 sys = 0.68 CPU)
8 2 wallclock secs ( 0.95 usr + 0.00 sys = 0.95 CPU)
9 1 wallclock secs ( 1.33 usr + 0.00 sys = 1.33 CPU)
10 3 wallclock secs ( 1.70 usr + 0.00 sys = 1.70 CPU)
So yah, you're better off going with a linear scan unless you're going to be working with more than a handful of elements. | [reply] [d/l] |
FYI, here was my test (assuming worst case of 'not found'):
#!/usr/local/bin/perl -l -w
use strict;
use Benchmark;
my @zips = qw(
92713
92714
92715
92716
92717
92718
92719
);
my $zip = '11111';
timethese (50000,{
HASH => \&hash_it,
LINEAR => \&linear,
LINEAR_GREP => \&linear_grep,
});
sub hash_it {
my %zips;
@zips{@zips} = undef;
exists $zips{$zip};
}
sub linear {
for (@zips) {
return 1 if $_ eq $zip;
}
return 0;
}
sub linear_grep {
grep { $_ eq $zip } @zips;
}
/usr/home/dougw/tst>./timeit
Benchmark:
timing 50000 iterations of
HASH, LINEAR, LINEAR_GREP
...
HASH: 3 wallclock secs ( 2.65 usr + 0.01 sys = 2.66 CPU) @ 18
+796.99/s (n=50000)
LINEAR: 1 wallclock secs ( 1.84 usr + 0.00 sys = 1.84 CPU) @ 27
+173.91/s (n=50000)
LINEAR_GREP: 0 wallclock secs ( 1.16 usr + 0.00 sys = 1.16 CPU) @ 4
+3103.45/s (n=50000)
Update:I don't include positive searches because we don't know the probability of WebSmart's users of entering a 'found' zip code. So I assume that since there's so few (5 - 10) 'valid' zip codes, and so many possible zip codes to enter, it will probably be 'not found'. Positive searches would just make the results lean more toward the LINEAR search, I don't know how many positive searches it would take before LINEAR would be better than the LINEAR_GREP. Either is better than the HASH, though. | [reply] [d/l] |