Re: Efficient way to verify scalar contents (updated x3)
by haukex (Archbishop) on Jun 22, 2020 at 07:05 UTC
|
See also How to ask better questions using Test::More and sample data. You can do it all in a single regex, see perlretut (and perlre) in regards to alternations, character classes, and lookaheads. However, perhaps my test cases below will give a hint that these password rules aren't necessarily good indicators of password quality; perhaps use one of the other established methods like Data::Password::zxcvbn instead. If this happens to be for a website, note there's also a JavaScript version of the "zxcvbn" algorithm that allows you to give live feedback to the user when and why a password isn't good, so they can know while choosing one; of course it should still be verified on the server in case they have JS disabled.
Update: Note that given/when are unforunately still experimental, in this case I'd suggest normal ifs instead.
Update 2: The talk on the library is great: https://youtu.be/vf37jh3dV2I (even just the first five minutes on the issues)
Update 3: A slightly optimized version of the regex below, that turns out to be pretty fast:
my $pw_re = qr{ \A (?: .{20,} | (?=.*[a-z]) (?=.*[A-Z])
(?: .{16,19} | (?=.*[0-9]) (?: .{12,15}
| (?=.*[\x21-\x2F\x3A-\x40\x5B-\x60\x7B-\x7E]) .{8,11}
) ) ) \z }msx;
Original code:
use warnings;
use strict;
my $pw_re =
qr{ \A (?:
.{20,}
| (?=.*[a-z]) (?=.*[A-Z])
.{16,19}
| (?=.*[a-z]) (?=.*[A-Z]) (?=.*[0-9])
.{12,15}
| (?=.*[a-z]) (?=.*[A-Z]) (?=.*[0-9])
(?=.*[\x21-\x2F\x3A-\x40\x5B-\x60\x7B-\x7E])
.{8,11}
) \z }msx;
use Test::More;
unlike '', $pw_re;
unlike 'aA0-', $pw_re;
unlike 'aA0-aA0', $pw_re;
like 'aA0-aA0-', $pw_re;
like 'aaaaaA0-', $pw_re;
like 'aaaaaaaaA0-', $pw_re;
unlike 'aaaaaaaaa0-', $pw_re;
unlike 'aaaaaaaaAA-', $pw_re;
unlike 'AAAAAAAAA0-', $pw_re;
unlike 'aaaaaaaaA00', $pw_re;
like 'aaaaaaaaaaA0', $pw_re;
like 'aaaaaaaaa-A0', $pw_re;
like 'aaaaaaaaaaaaaA0', $pw_re;
unlike 'aaaaaaaaaaaaaaA', $pw_re;
unlike 'aaaaaaaaaaaaaa0', $pw_re;
unlike 'AAAAAAAAAAAAAA0', $pw_re;
unlike 'aaaaaaaaaaaaa-A', $pw_re;
unlike 'aaaaaaaaaaaaa-0', $pw_re;
like 'aaaaaaaaaaaaaaaA', $pw_re;
like 'aaaaaaaaaaaaaa0A', $pw_re;
like 'aaaaaaaaaaaaa-0A', $pw_re;
like 'aaaaaaaaaaaaaaaaaaA', $pw_re;
unlike 'aaaaaaaaaaaaaaaaaaa', $pw_re;
unlike 'AAAAAAAAAAAAAAAAAAA', $pw_re;
unlike 'aaaaaaaaaaaaaaaaaa0', $pw_re;
unlike 'aaaaaaaaaaaaaaaaaa-', $pw_re;
like 'aaaaaaaaaaaaaaaaaaaa', $pw_re;
like 'aaaaaaaaaaaaaaaaaA0-', $pw_re;
like 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa', $pw_re;
done_testing;
| [reply] [d/l] [select] |
|
| [reply] |
|
Correction, smart-matching is the one with the broken design. I think given/when's only problem is that they sometimes use smart-matching. But without smart-matching, there's not much point to given/when. For example, the OP would avoid given/when by making nothing but the following changes:
- given ⇒ for
- first when ⇒ if
- other when ⇒ elsif
- default ⇒ else
| [reply] |
|
Re: Efficient way to verify scalar contents
by choroba (Cardinal) on Jun 22, 2020 at 09:17 UTC
|
my @RE = (qr/[[:lower:]]/, qr/[[:upper:]]/, qr/[0-9]/, qr/[^[:alnum:]]
+/);
sub check {
local ($_) = @_;
my $length = length;
return if $length < 8;
return if $length < 12 && (! /$RE[0]/ || ! /$RE[1]/ || ! /$RE[2]/
+|| ! /$RE[3]/);
return if $length < 16 && (! /$RE[0]/ || ! /$RE[1]/ || ! /$RE[2]/)
+;
return if $length < 20 && (! /$RE[0]/ || ! /$RE[1]/);
return 1
}
Tested against the tests from Re: Efficient way to verify scalar contents (updated x3) (haukex++).
Update:
Or, if you want to abstract away the repeating $RE[...]
use List::Util qw{ any };
my @RE = (qr/[[:lower:]]/, qr/[[:upper:]]/, qr/[0-9]/, qr/[^[:alnum:]]
+/);
sub check {
my ($password) = @_;
my $length = length $password;
return if $length < 8;
return if $length < 12 && any { $password !~ $RE[$_] } 0 .. 3;
return if $length < 16 && any { $password !~ $RE[$_] } 0 .. 2;
return if $length < 20 && any { $password !~ $RE[$_] } 0 .. 1;
return 1
}
map{substr$_->[0],$_->[1]||0,1}[\*||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^*ARGV,3]
| [reply] [d/l] [select] |
|
| [reply] [d/l] |
|
I'm not sure. I remember [a-z] depending on locale, but it was probably in bash rather than Perl...
$ (LC_ALL=en_US.UTF-8; [[ é =~ [a-z] ]] || echo no)
$ (LC_ALL=C; [[ é =~ [a-z] ]] || echo no)
no
$ perl -Mutf8 -wE 'say "é" =~ /[[:lower:]]/'
1
$ perl -Mutf8 -wE 'say "é" =~ /[a-z]/'
$
Update: Switched from <c> to <pre> and back.
map{substr$_->[0],$_->[1]||0,1}[\*||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^*ARGV,3]
| [reply] [d/l] [select] |
Re: Efficient way to verify scalar contents
by haukex (Archbishop) on Jun 22, 2020 at 13:29 UTC
|
I feel the current implementation is kind of redundant or inefficient?
It turns out to be pretty good! Compared to the other solutions so far, it's only beaten by my second regex.
Rate LanX2 LanX1 choroba2 choroba1 haukex1 ori
+g haukex2
LanX2 35040/s -- -49% -59% -75% -84% -87
+% -89%
LanX1 68981/s 97% -- -19% -51% -69% -74
+% -79%
choroba2 84861/s 142% 23% -- -39% -62% -68
+% -74%
choroba1 139373/s 298% 102% 64% -- -37% -47
+% -57%
haukex1 222355/s 535% 222% 162% 60% -- -15
+% -31%
orig 261901/s 647% 280% 209% 88% 18% -
+- -19%
haukex2 323064/s 822% 368% 281% 132% 45% 23
+% --
Updated benchmarks here!
| [reply] [d/l] [select] |
|
| [reply] |
|
my goal was maintainability, if speed mattered, I'd move the declaration of the rules and rulesets out of the sub. redefining them each time is useless.
Good point, modified:
Rate choroba2 LanX2 choroba1 LanX1 haukex1 ori
+g haukex2
choroba2 84800/s -- -8% -37% -38% -59% -68
+% -73%
LanX2 91935/s 8% -- -32% -33% -56% -66
+% -71%
choroba1 134606/s 59% 46% -- -2% -35% -50
+% -58%
LanX1 136955/s 62% 49% 2% -- -34% -49
+% -57%
haukex1 208522/s 146% 127% 55% 52% -- -22
+% -35%
orig 267007/s 215% 190% 98% 95% 28% -
+- -16%
haukex2 318577/s 276% 247% 137% 133% 53% 19
+% --
| [reply] [d/l] [select] |
Re: Efficient way to verify scalar contents
by perlfan (Vicar) on Jun 22, 2020 at 05:45 UTC
|
I would consider using something that will force you to organize your checks like Validate::Tiny.
Besides that, I don't think your rules are sufficient. There are a few password checking modules out there, and at the very least they will give you an idea about what to check. Length makes no difference if it's all the same character, for example. See Data::Password and Data::Password::Entropy, just to name a couple. | [reply] |
Re: Efficient way to verify scalar contents
by BillKSmith (Monsignor) on Jun 22, 2020 at 14:51 UTC
|
Your "Do it yourself" approach can be simplified by testing the content before the length.
use strict;
use warnings;
sub is_pw_valid {
local $_ = $_[0];
return
!(/[a-z]/ && /[A-Z]/) ? length >= 20
: !/\d/ ? length >= 16
: !/[\x21-\x2F\x3A-\x40\x5B-\x60\x7B-\x7E]/ ? length >= 12
: length >= 8
;
}
UPDATE: Corrected error in regex for special characters and added (below) a less terse version of the same logic
sub is_pw_valid {
my $password = $_[0];
my $MixedCase = qr/(:? [a-z].*[A-Z]) | (:? [A-Z].*[a-z] )/x;
my $Digit = qr/\d/;
my $special = qr/[\x21-\x2F\x3A-\x40\x5B-\x60\x7B-\x7E]/;
my $min_length;
if ($password !~ $MixedCase) {$min_length = 20}
elsif ($password !~ $Digit) {$min_length = 16}
elsif ($password !~ $special) {$min_length = 12}
else {$min_length = 8}
+
my $is_OK = ( length($password) >= $min_length );
return $is_OK;
}
| [reply] [d/l] [select] |
Re: Efficient way to verify scalar contents
by LanX (Saint) on Jun 22, 2020 at 04:59 UTC
|
Hi, some untested pseudo code
I see it as a rules matrix to be checked.
I'd take
$index= min( int(length /4), 5 )
And an array of arrays @rules
for lookup.
The rules array would contain subs which all must return true for the password to pass.
Like $letter = sub { /[a-z]/i }
(This will test against $_ )
So $rules[3] = [ $letter, $number ]
$rules[1] = [ $forbidden ]
All you need to do now is too loop over all rules and fail if one entry returns false.
HTH! :)
PS instead of code refs one might also store regex refs with qr//.
But why blocking the way to more complicated rules?
| [reply] [d/l] [select] |
|
$index= min( int(length /4), 5 )
length being the passwordLength? | [reply] [d/l] |
|
It appears to be the index of the check based on length of the supplied password. You have 4 valid conditions that lead to additional checks; the 5th one is always invalid. Indexes 0, 1, 2, and 3 elements of the array of subroutine references he suggests will call the function appropriate for the length.
when ($_ >= 8 && $_ <= 11) # $index here is min( 2, 5), so 2
when ($_ >= 12 && $_ <= 15) { # $index here is min( 3, 5), so 3
when ($_ >= 16 && $_ <= 19) { # $index here is min( 4, 5), so 4
when ($_ >= 20) { # $index here is min( 5+,5), so alw
+ays bounded at 5
TBH it's not clear enough to make great sense to me, but he appears to be suggesting that you put your ranges into buckets based on the formula he gave; for a length of 20 or more, you're always going to fall into bucket 5 (or perhaps element 3) of your array of checks.
This is actually a pretty good way to determine a bucket, so then you can have an array of checks; or to make it more clear, a hash:
my %check_dispatch = (
2 => sub { ... }, # or better, define in named subs and, => \&check1
+, etc
3 => sub { ... },
4 => sub { ... },
5 => sub { ... },
);
my $pw = $_;
my $index = min( (length $pw)/4, 5);
if ($check_dispatch{$index} and 'CODE' eq ref $check_dispatch{$index})
+ {
die if not $check_dispatch{$index}->($pw);
}
| [reply] [d/l] [select] |
|
use strict;
use warnings;
use Data::Dump qw/pp dd/;
use List::Util qw/min/;
sub pw_not_ok {
my $pw = shift;
my $short =
sub { "is too short" };
my $lowercase =
sub { /[a-z]/ ? "" : "has no lowercase character" };
my $uppercase =
sub { /[A-Z]/ ? "" : "has no uppercase character" }; ### FIXED
my $number =
sub { /[0-9]/ ? "" : "has no number" };
my $special =
sub {
if (
/[\x21-\x2F]/ ||
/[\x3A-\x40]/ ||
/[\x5B-\x60]/ ||
/[\x7B-\x7E]/
) {
return "";
} else {
return "has no special characters";
}
};
my $pass = sub {""};
my @rules_for_class =
(
[ $short ],
[ $short ],
[ $lowercase, $uppercase, $number ,$special ],
[ $lowercase, $uppercase, $number ],
[ $lowercase, $uppercase ],
[ $pass ]
);
my $len = length $pw;
my $idx = min(int($len/4),5);
for my $rule ( @{ $rules_for_class[$idx] } ) {
$_ = $pw;
if ( my $err = $rule->() ) {
return $err;
}
}
return;
}
# --------- Tests
for my $pw (
"A" x 3 ,
"A" x 7 ,
"A" x 11,
"a" x 5 . "A" x 3,
"a" x 5 . "A" x 3 . "1",
"a" x 5 . "A" x 3 . "1" .":",
"a" x 24,
){
if ( my $err = pw_not_ok($pw) ){
warn "ERROR: $pw is $err\n";
} else {
print "OK: $pw\n"
}
}
ERROR: AAA is is too short
ERROR: AAAAAAA is is too short
ERROR: AAAAAAAAAAA is has no lowercase character
ERROR: aaaaaAAA is has no number
ERROR: aaaaaAAA1 is has no special characters
OK: aaaaaAAA1:
OK: aaaaaaaaaaaaaaaaaaaaaaaa
UPDATE
see Re^6: Efficient way to verify scalar contents | [reply] [d/l] [select] |
|
|
|
Re: Efficient way to verify scalar contents
by tybalt89 (Monsignor) on Jun 23, 2020 at 22:27 UTC
|
sub validate
{
local $_ = shift;
length >= 20 ? 1 :
!/[a-z]/ || !/[A-Z]/ ? 0 :
length >= 16 ? 1 :
!/\d/ ? 0 :
length >= 12 ? 1 :
!/[^_\w]/ ? 0 :
length >= 8 ? 1 :
0;
}
Besides, I think it's cute.
| [reply] [d/l] |
|
Rate BillKSmith2 choroba2 LanX2 choroba1 LanX1 haukex1
+ BillKSmith1 orig tybalt89 haukex2
BillKSmith2 38227/s -- -55% -60% -72% -72% -83%
+ -85% -85% -87% -88%
choroba2 85658/s 124% -- -10% -38% -38% -62%
+ -66% -67% -70% -74%
LanX2 95639/s 150% 12% -- -31% -31% -57%
+ -62% -63% -67% -71%
choroba1 138099/s 261% 61% 44% -- -1% -38%
+ -44% -47% -52% -58%
LanX1 138866/s 263% 62% 45% 1% -- -38%
+ -44% -47% -52% -58%
haukex1 223763/s 485% 161% 134% 62% 61% --
+ -10% -14% -22% -33%
BillKSmith1 248507/s 550% 190% 160% 80% 79% 11%
+ -- -5% -13% -25%
orig 260650/s 582% 204% 173% 89% 88% 16%
+ 5% -- -9% -21%
tybalt89 287084/s 651% 235% 200% 108% 107% 28%
+ 16% 10% -- -13%
haukex2 331543/s 767% 287% 247% 140% 139% 48%
+ 33% 27% 15% --
| [reply] [d/l] [select] |
|
sub validatetr {
local $_ = shift;
length >= 20 ? 1 :
!tr/a-z// || !tr/A-Z// ? 0 :
length >= 16 ? 1 :
!tr/0-9// ? 0 :
length >= 12 ? 1 :
!tr/a-zA-Z0-9//c ? 0 :
length >= 8 ? 1 :
0;
}
| [reply] [d/l] |
|