http://qs321.pair.com?node_id=11118334

x_stream3m has asked for the wisdom of the Perl Monks concerning the following question:

Hello Monks, learning Perl for the first time. I am trying to see if there is a more efficient way of going about verifying the contents of a scalar value.

I have a password input. If the password length is between:

- 8 and 11, the password should contain a-zA-Z0-9 and special characters.

- 12 and 15, the password should contain a-zA-Z0-9.

- 16 and 19, the password should contain a-zA-Z

- 20 and above, no criteria

I feel the current implementation is kind of redundant or inefficient? Maybe I am wrong?

my $password = <STDIN>; my $lengthOfPassword = length $password; given ($lengthOfPassword) { when ($_ >= 8 && $_ <= 11) { if ($password =~ /[a-z]/ && $password =~ /[A-Z]/ && $password =~ /[0-9]/ && ($password =~ /[\x21-\x2F]/ || $password =~ /[\x3A-\x40]/ || $password =~ /[\x5B-\x60]/ || $password =~ /[\x7B-\x7E]/)) { say "Successful password"; } else { invalidPasswordError(); } } when ($_ >= 12 && $_ <= 15) { if ($password =~ /[a-z]/ && $password =~ /[A-Z]/ && $password =~ /[0-9]/) { say "Successful password"; } else { invalidPasswordError(); } } when ($_ >= 16 && $_ <= 19) { if ($password =~ /[a-z]/ && $password =~ /[A-Z]/) { say "Successful password"; } else { invalidPasswordError(); } } when ($_ >= 20) { say "Successful password"; } default { lengthPasswordError(); } }

Replies are listed 'Best First'.
Re: Efficient way to verify scalar contents (updated x3)
by haukex (Archbishop) on Jun 22, 2020 at 07:05 UTC

    See also How to ask better questions using Test::More and sample data. You can do it all in a single regex, see perlretut (and perlre) in regards to alternations, character classes, and lookaheads. However, perhaps my test cases below will give a hint that these password rules aren't necessarily good indicators of password quality; perhaps use one of the other established methods like Data::Password::zxcvbn instead. If this happens to be for a website, note there's also a JavaScript version of the "zxcvbn" algorithm that allows you to give live feedback to the user when and why a password isn't good, so they can know while choosing one; of course it should still be verified on the server in case they have JS disabled.

    Update: Note that given/when are unforunately still experimental, in this case I'd suggest normal ifs instead.

    Update 2: The talk on the library is great: https://youtu.be/vf37jh3dV2I (even just the first five minutes on the issues)

    Update 3: A slightly optimized version of the regex below, that turns out to be pretty fast:

    my $pw_re = qr{ \A (?: .{20,} | (?=.*[a-z]) (?=.*[A-Z]) (?: .{16,19} | (?=.*[0-9]) (?: .{12,15} | (?=.*[\x21-\x2F\x3A-\x40\x5B-\x60\x7B-\x7E]) .{8,11} ) ) ) \z }msx;

    Original code:

    use warnings; use strict; my $pw_re = qr{ \A (?: .{20,} | (?=.*[a-z]) (?=.*[A-Z]) .{16,19} | (?=.*[a-z]) (?=.*[A-Z]) (?=.*[0-9]) .{12,15} | (?=.*[a-z]) (?=.*[A-Z]) (?=.*[0-9]) (?=.*[\x21-\x2F\x3A-\x40\x5B-\x60\x7B-\x7E]) .{8,11} ) \z }msx; use Test::More; unlike '', $pw_re; unlike 'aA0-', $pw_re; unlike 'aA0-aA0', $pw_re; like 'aA0-aA0-', $pw_re; like 'aaaaaA0-', $pw_re; like 'aaaaaaaaA0-', $pw_re; unlike 'aaaaaaaaa0-', $pw_re; unlike 'aaaaaaaaAA-', $pw_re; unlike 'AAAAAAAAA0-', $pw_re; unlike 'aaaaaaaaA00', $pw_re; like 'aaaaaaaaaaA0', $pw_re; like 'aaaaaaaaa-A0', $pw_re; like 'aaaaaaaaaaaaaA0', $pw_re; unlike 'aaaaaaaaaaaaaaA', $pw_re; unlike 'aaaaaaaaaaaaaa0', $pw_re; unlike 'AAAAAAAAAAAAAA0', $pw_re; unlike 'aaaaaaaaaaaaa-A', $pw_re; unlike 'aaaaaaaaaaaaa-0', $pw_re; like 'aaaaaaaaaaaaaaaA', $pw_re; like 'aaaaaaaaaaaaaa0A', $pw_re; like 'aaaaaaaaaaaaa-0A', $pw_re; like 'aaaaaaaaaaaaaaaaaaA', $pw_re; unlike 'aaaaaaaaaaaaaaaaaaa', $pw_re; unlike 'AAAAAAAAAAAAAAAAAAA', $pw_re; unlike 'aaaaaaaaaaaaaaaaaa0', $pw_re; unlike 'aaaaaaaaaaaaaaaaaa-', $pw_re; like 'aaaaaaaaaaaaaaaaaaaa', $pw_re; like 'aaaaaaaaaaaaaaaaaA0-', $pw_re; like 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa', $pw_re; done_testing;

      Note that given/when are unforunately still experimental

      That's a good thing. Their design is broken. The only reason they haven't been removed is that too many people are using them despite that problems with them X_X.

        Correction, smart-matching is the one with the broken design. I think given/when's only problem is that they sometimes use smart-matching. But without smart-matching, there's not much point to given/when. For example, the OP would avoid given/when by making nothing but the following changes:

        • given ⇒ for
        • first when ⇒ if
        • other when ⇒ elsif
        • default ⇒ else
Re: Efficient way to verify scalar contents
by choroba (Cardinal) on Jun 22, 2020 at 09:17 UTC
    my @RE = (qr/[[:lower:]]/, qr/[[:upper:]]/, qr/[0-9]/, qr/[^[:alnum:]] +/); sub check { local ($_) = @_; my $length = length; return if $length < 8; return if $length < 12 && (! /$RE[0]/ || ! /$RE[1]/ || ! /$RE[2]/ +|| ! /$RE[3]/); return if $length < 16 && (! /$RE[0]/ || ! /$RE[1]/ || ! /$RE[2]/) +; return if $length < 20 && (! /$RE[0]/ || ! /$RE[1]/); return 1 }

    Tested against the tests from Re: Efficient way to verify scalar contents (updated x3) (haukex++).

    Update: Or, if you want to abstract away the repeating $RE[...]

    use List::Util qw{ any }; my @RE = (qr/[[:lower:]]/, qr/[[:upper:]]/, qr/[0-9]/, qr/[^[:alnum:]] +/); sub check { my ($password) = @_; my $length = length $password; return if $length < 8; return if $length < 12 && any { $password !~ $RE[$_] } 0 .. 3; return if $length < 16 && any { $password !~ $RE[$_] } 0 .. 2; return if $length < 20 && any { $password !~ $RE[$_] } 0 .. 1; return 1 }

    map{substr$_->[0],$_->[1]||0,1}[\*||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^*ARGV,3]
      Please correct me, but do these character classes not depend on localisation?

      [[:lower:]] [[:upper:]]

      I think passwords should rather be defined in terms of ASCII

      Cheers Rolf
      (addicted to the Perl Programming Language :)
      Wikisyntax for the Monastery

        I'm not sure. I remember [a-z] depending on locale, but it was probably in bash rather than Perl...
        $ (LC_ALL=en_US.UTF-8; [[ é =~ [a-z] ]] || echo no) $ (LC_ALL=C; [[ é =~ [a-z] ]] || echo no) no $ perl -Mutf8 -wE 'say "é" =~ /[[:lower:]]/' 1 $ perl -Mutf8 -wE 'say "é" =~ /[a-z]/' $

        Update: Switched from <c> to <pre> and back.

        map{substr$_->[0],$_->[1]||0,1}[\*||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^*ARGV,3]
Re: Efficient way to verify scalar contents
by haukex (Archbishop) on Jun 22, 2020 at 13:29 UTC
    I feel the current implementation is kind of redundant or inefficient?

    It turns out to be pretty good! Compared to the other solutions so far, it's only beaten by my second regex.

    Rate LanX2 LanX1 choroba2 choroba1 haukex1 ori +g haukex2 LanX2 35040/s -- -49% -59% -75% -84% -87 +% -89% LanX1 68981/s 97% -- -19% -51% -69% -74 +% -79% choroba2 84861/s 142% 23% -- -39% -62% -68 +% -74% choroba1 139373/s 298% 102% 64% -- -37% -47 +% -57% haukex1 222355/s 535% 222% 162% 60% -- -15 +% -31% orig 261901/s 647% 280% 209% 88% 18% - +- -19% haukex2 323064/s 822% 368% 281% 132% 45% 23 +% --

    Updated benchmarks here!

      my goal was maintainability, if speed mattered, I'd move the declaration of the rules and rulesets out of the sub. redefining them each time is useless.

      Also using regex-refs via qr// is faster, but code-refs allow implementing more complex rules if requirements change.

      Cheers Rolf
      (addicted to the Perl Programming Language :)
      Wikisyntax for the Monastery

        my goal was maintainability, if speed mattered, I'd move the declaration of the rules and rulesets out of the sub. redefining them each time is useless.

        Good point, modified:

        Rate choroba2 LanX2 choroba1 LanX1 haukex1 ori +g haukex2 choroba2 84800/s -- -8% -37% -38% -59% -68 +% -73% LanX2 91935/s 8% -- -32% -33% -56% -66 +% -71% choroba1 134606/s 59% 46% -- -2% -35% -50 +% -58% LanX1 136955/s 62% 49% 2% -- -34% -49 +% -57% haukex1 208522/s 146% 127% 55% 52% -- -22 +% -35% orig 267007/s 215% 190% 98% 95% 28% - +- -16% haukex2 318577/s 276% 247% 137% 133% 53% 19 +% --
Re: Efficient way to verify scalar contents
by BillKSmith (Monsignor) on Jun 22, 2020 at 14:51 UTC
    Your "Do it yourself" approach can be simplified by testing the content before the length.
    use strict; use warnings; sub is_pw_valid { local $_ = $_[0]; return !(/[a-z]/ && /[A-Z]/) ? length >= 20 : !/\d/ ? length >= 16 : !/[\x21-\x2F\x3A-\x40\x5B-\x60\x7B-\x7E]/ ? length >= 12 : length >= 8 ; }

    UPDATE: Corrected error in regex for special characters and added (below) a less terse version of the same logic

    sub is_pw_valid { my $password = $_[0]; my $MixedCase = qr/(:? [a-z].*[A-Z]) | (:? [A-Z].*[a-z] )/x; my $Digit = qr/\d/; my $special = qr/[\x21-\x2F\x3A-\x40\x5B-\x60\x7B-\x7E]/; my $min_length; if ($password !~ $MixedCase) {$min_length = 20} elsif ($password !~ $Digit) {$min_length = 16} elsif ($password !~ $special) {$min_length = 12} else {$min_length = 8} + my $is_OK = ( length($password) >= $min_length ); return $is_OK; }
    Bill
Re: Efficient way to verify scalar contents
by perlfan (Vicar) on Jun 22, 2020 at 05:45 UTC
    I would consider using something that will force you to organize your checks like Validate::Tiny.

    Besides that, I don't think your rules are sufficient. There are a few password checking modules out there, and at the very least they will give you an idea about what to check. Length makes no difference if it's all the same character, for example. See Data::Password and Data::Password::Entropy, just to name a couple.

Re: Efficient way to verify scalar contents
by LanX (Saint) on Jun 22, 2020 at 04:59 UTC
    Hi, some untested pseudo code

    I see it as a rules matrix to be checked.

    I'd take

    $index= min( int(length /4), 5 )

    And an array of arrays @rules for lookup.

    The rules array would contain subs which all must return true for the password to pass.

    Like $letter = sub {  /[a-z]/i }

    (This will test against $_ )

    So $rules[3] = [ $letter, $number ]

    $rules[1] = [ $forbidden ]

    All you need to do now is too loop over all rules and fail if one entry returns false.

    HTH! :)

    PS instead of code refs one might also store regex refs with qr//.

    But why blocking the way to more complicated rules?

    Cheers Rolf
    (addicted to the Perl Programming Language :)
    Wikisyntax for the Monastery

      Thanks for replying!

      A little confused on the purpose of

      $index= min( int(length /4), 5 )

      length being the passwordLength?

        It appears to be the index of the check based on length of the supplied password. You have 4 valid conditions that lead to additional checks; the 5th one is always invalid. Indexes 0, 1, 2, and 3 elements of the array of subroutine references he suggests will call the function appropriate for the length.
        when ($_ >= 8 && $_ <= 11) # $index here is min( 2, 5), so 2 when ($_ >= 12 && $_ <= 15) { # $index here is min( 3, 5), so 3 when ($_ >= 16 && $_ <= 19) { # $index here is min( 4, 5), so 4 when ($_ >= 20) { # $index here is min( 5+,5), so alw +ays bounded at 5

        TBH it's not clear enough to make great sense to me, but he appears to be suggesting that you put your ranges into buckets based on the formula he gave; for a length of 20 or more, you're always going to fall into bucket 5 (or perhaps element 3) of your array of checks. This is actually a pretty good way to determine a bucket, so then you can have an array of checks; or to make it more clear, a hash:
        my %check_dispatch = ( 2 => sub { ... }, # or better, define in named subs and, => \&check1 +, etc 3 => sub { ... }, 4 => sub { ... }, 5 => sub { ... }, ); my $pw = $_; my $index = min( (length $pw)/4, 5); if ($check_dispatch{$index} and 'CODE' eq ref $check_dispatch{$index}) + { die if not $check_dispatch{$index}->($pw); }
        As I said it was pseudocode =)

        I meant something like this:

        use strict; use warnings; use Data::Dump qw/pp dd/; use List::Util qw/min/; sub pw_not_ok { my $pw = shift; my $short = sub { "is too short" }; my $lowercase = sub { /[a-z]/ ? "" : "has no lowercase character" }; my $uppercase = sub { /[A-Z]/ ? "" : "has no uppercase character" }; ### FIXED my $number = sub { /[0-9]/ ? "" : "has no number" }; my $special = sub { if ( /[\x21-\x2F]/ || /[\x3A-\x40]/ || /[\x5B-\x60]/ || /[\x7B-\x7E]/ ) { return ""; } else { return "has no special characters"; } }; my $pass = sub {""}; my @rules_for_class = ( [ $short ], [ $short ], [ $lowercase, $uppercase, $number ,$special ], [ $lowercase, $uppercase, $number ], [ $lowercase, $uppercase ], [ $pass ] ); my $len = length $pw; my $idx = min(int($len/4),5); for my $rule ( @{ $rules_for_class[$idx] } ) { $_ = $pw; if ( my $err = $rule->() ) { return $err; } } return; } # --------- Tests for my $pw ( "A" x 3 , "A" x 7 , "A" x 11, "a" x 5 . "A" x 3, "a" x 5 . "A" x 3 . "1", "a" x 5 . "A" x 3 . "1" .":", "a" x 24, ){ if ( my $err = pw_not_ok($pw) ){ warn "ERROR: $pw is $err\n"; } else { print "OK: $pw\n" } }

        ERROR: AAA is is too short ERROR: AAAAAAA is is too short ERROR: AAAAAAAAAAA is has no lowercase character ERROR: aaaaaAAA is has no number ERROR: aaaaaAAA1 is has no special characters OK: aaaaaAAA1: OK: aaaaaaaaaaaaaaaaaaaaaaaa

        Cheers Rolf
        (addicted to the Perl Programming Language :)
        Wikisyntax for the Monastery

        UPDATE

        see Re^6: Efficient way to verify scalar contents

Re: Efficient way to verify scalar contents
by tybalt89 (Monsignor) on Jun 23, 2020 at 22:27 UTC

    Just for some TMTOWTDI fun :)

    sub validate { local $_ = shift; length >= 20 ? 1 : !/[a-z]/ || !/[A-Z]/ ? 0 : length >= 16 ? 1 : !/\d/ ? 0 : length >= 12 ? 1 : !/[^_\w]/ ? 0 : length >= 8 ? 1 : 0; }

    Besides, I think it's cute.

      Nice, so far the only other solution that beats the original :-)

      Rate BillKSmith2 choroba2 LanX2 choroba1 LanX1 haukex1 + BillKSmith1 orig tybalt89 haukex2 BillKSmith2 38227/s -- -55% -60% -72% -72% -83% + -85% -85% -87% -88% choroba2 85658/s 124% -- -10% -38% -38% -62% + -66% -67% -70% -74% LanX2 95639/s 150% 12% -- -31% -31% -57% + -62% -63% -67% -71% choroba1 138099/s 261% 61% 44% -- -1% -38% + -44% -47% -52% -58% LanX1 138866/s 263% 62% 45% 1% -- -38% + -44% -47% -52% -58% haukex1 223763/s 485% 161% 134% 62% 61% -- + -10% -14% -22% -33% BillKSmith1 248507/s 550% 190% 160% 80% 79% 11% + -- -5% -13% -25% orig 260650/s 582% 204% 173% 89% 88% 16% + 5% -- -9% -21% tybalt89 287084/s 651% 235% 200% 108% 107% 28% + 16% 10% -- -13% haukex2 331543/s 767% 287% 247% 140% 139% 48% + 33% 27% 15% --

        Out of curiosity, I tried tr/// instead of regex, much faster :)

        sub validatetr { local $_ = shift; length >= 20 ? 1 : !tr/a-z// || !tr/A-Z// ? 0 : length >= 16 ? 1 : !tr/0-9// ? 0 : length >= 12 ? 1 : !tr/a-zA-Z0-9//c ? 0 : length >= 8 ? 1 : 0; }