http://qs321.pair.com?node_id=1078827

FierceMoose has asked for the wisdom of the Perl Monks concerning the following question:

Greetings All.

I'm seeking the greater wisdom of PerlMonks on a pattern matching issue that's been plaguing me. It's probably something obvious, but I appear to be blind to it. I'm running a simple script that queries LDAP via Net::LDAP and reports back Seat data that does not follow the expected pattern; this is more easily explained by the Test Cases below. Here's the snippet of code used for processing the data retrieved from LDAP:

for ($c = 0; $c < $totalFound; $c++) {
    my $SeatLoc = $entries[$c]->get_value('SeatLocation');
    if ($SeatLoc =~ /(\b[A-Z]{1}\d{1}\.\d+[A-Z]*)/) {
        print "Valid Seat Found: $SeatLoc.\n";
    }
    else {
        print "Found Invalid Seat: $SeatLoc\n";
    }
}

I believe I've got the correct RegEx for the pattern match (tested on 'My Regex Tester'), but I cannot seem to get it to work in Perl: it doesn't match on a single one.

Test Cases:

  • A1.23BC - Correct
  • U2.60L - Correct
  • R2.32 - Correct
  • C1.3L - Correct
  • AB3.45E - Invalid
  • D45.1A - Invalid
  • Some Bldg - Invalid

Please help me find my way back to the right path.

    Thanks!

    Replies are listed 'Best First'.
    Re: Perl will not match my RegEx pattern....
    by AnomalousMonk (Archbishop) on Mar 18, 2014 at 17:45 UTC

      A quick check shows all strings match as you seem to expect. Can you be more specific about what you mean by "doesn't match on a single one"? As Bloodnok has asked, are you really, really sure about your data?

      c:\@Work\Perl\monks>perl -wMstrict -le
      "my @seatlocs = (qw(A1.23BC U2.60L R2.32 C1.3L AB3.45E D45.1A), 'Some Bldg');
       ;;
       for my $loc (@seatlocs) {
         print qq{'$loc' }, $loc =~ m{ \b [A-Z]{1} \d{1} \. \d+ [A-Z]* }xms
           ? 'Correct' : 'Invalid';
         }
      "
      'A1.23BC' Correct
      'U2.60L' Correct
      'R2.32' Correct
      'C1.3L' Correct
      'AB3.45E' Invalid
      'D45.1A' Invalid
      'Some Bldg' Invalid

      BTW:  [A-Z]{1} is the same as  [A-Z] is the same as  [[:upper:]] and  \d{1} the same as  \d

        I suspect the data as well, but I'm not sure what can be done to clean it up without altering the actual data. chomp has made no change to the operation either. Perhaps I need to convert it to a quoted word for RegEx to work?

        my $acopy = qw($SeatLoc);

        I'm somewhere between novice & intermediate in my experience and not immune to the 'obvious' mistake. ;-)

          qw does not interpolate variables. This statement is equivalent to
          my $acopy = '$SeatLoc';

          Note the single quotes.
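
To illustrate the point about qw, here is a minimal sketch. The sample value 'A1.23BC' is only a stand-in taken from the test cases; the variable names follow the post above.

```perl
use strict;
use warnings;

my $SeatLoc = 'A1.23BC';    # stand-in value, not real LDAP data

# qw() splits its contents on whitespace and never interpolates,
# so the single "word" here is the literal text $SeatLoc.
my ($acopy) = qw($SeatLoc);

# A plain assignment is all that is needed to copy a string.
my $copy = $SeatLoc;

print "qw copy:    $acopy\n";
print "plain copy: $copy\n";
```

The first line printed shows the variable's name rather than its value, which is why the qw copy can never match the seat pattern.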

          لսႽ† ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ
          Perhaps I need to convert it to a quoted word for RegEx to work?

          Blindly converting something that's already a string to a string is unlikely to help. You already know that the regex you provided works for the sample strings you provided. Find out what your data really looks like!
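
One way to find out what the data really looks like is to print each value with every non-printable or non-ASCII character escaped. The sketch below is only an illustration: show_hidden is a hypothetical helper, and the sample strings (a trailing carriage return, a leading non-breaking space) are invented to show the output format, not a claim about what the LDAP server actually returns.

```perl
use strict;
use warnings;

# Hypothetical helper: render a string so that anything outside
# printable ASCII appears as an escape such as \x{0D} or \x{A0}.
sub show_hidden {
    my ($s) = @_;
    return '(undef)' unless defined $s;
    (my $out = $s) =~ s/([^\x20-\x7E])/sprintf('\\x{%02X}', ord $1)/ge;
    return "[$out] length=" . length($s);
}

# Invented samples; in the real script the value would come from
# $entries[$c]->get_value('SeatLocation').
my @samples = ("A1.23BC", "A1.23BC\r", "\x{A0}A1.23BC");

print show_hidden($_), "\n" for @samples;
```

Comparing the reported length with the number of visible characters is a quick check for stray bytes, and the brackets make leading or trailing whitespace obvious.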

    Re: Perl will not match my RegEx pattern....
    by Bloodnok (Vicar) on Mar 18, 2014 at 17:39 UTC
      Although the test cases would appear to be matched, can you be absolutely sure that the data returned from LDAP matches the test cases?

      A user level that continues to overstate my experience :-))
    Re: Perl will not match my RegEx pattern....
    by Jer2911 (Acolyte) on Mar 18, 2014 at 19:31 UTC
      Adding a "+" after [A-Z] and the first \d should get it to match AB3.45E and D45.1A:
      $SeatLoc =~ /(\b[A-Z]+\d+\.\d+[A-Z]*)/
        ... AB3.45E and D45.1A ...

        I think these were intended to be examples of strings that must not match.

    Re: Perl will not match my RegEx pattern....
    by FierceMoose (Initiate) on Mar 19, 2014 at 14:18 UTC

      Thanks to everyone for your help.

      Below are examples of the output requested. I'm refraining from giving the entire list: the search returns 4375 entries, which is the expected amount, but all are being seen as invalid due to some error in my code. I've tried chomp on the returned data, but there was no change in the results; chop just removed the last character and invalidated the results.

      Found Invalid Cube Number: A1.01A
      Found Invalid Cube Number: A1.01B
      Found Invalid Cube Number: A1.01B
      ...
      Found Invalid Cube Number: A2.06AA
      Found Invalid Cube Number: A2.06AABB
      Found Invalid Cube Number: A2.06B
      ...
      Found Invalid Cube Number: H2.25
      Found Invalid Cube Number: H3.10
      Found Invalid Cube Number: Imaging Center
      Found Invalid Cube Number: Italy
      Found Invalid Cube Number: K1.03A
      Found Invalid Cube Number: K1.03B
      ...
      Found Invalid Cube Number: R1.19G
      Found Invalid Cube Number: R2.25AA
      Found Invalid Cube Number: R2.25AABB
      Found Invalid Cube Number: R2.25BB
      Found Invalid Cube Number: R2.25D
      Found Invalid Cube Number: R2.25DD
      Found Invalid Cube Number: R2.25F
      Found Invalid Cube Number: R2.25FF
      ...
      Found Invalid Cube Number: V3.26P
      Found Invalid Cube Number: V3.26Q
      Found Invalid Cube Number: V3.26QR
      Found Invalid Cube Number: V3.26R
      Found Invalid Cube Number: V3.26R
      Found Invalid Cube Number: V3.26S
      Found Invalid Cube Number: V3.26ST
      Found Invalid Cube Number: V3.26T

      In the results above, all should be a valid Seat Location (pattern) except for 'Imaging Center' & 'Italy'.

      FWIW- below is the actual search performed using Net::LDAP. This format has been used in many other scripts besides this one, but thought it may give context to what's happening.

      my $userFilter = "(SeatLocation=*)";
      my $userAttributes = [ 'SeatLocation', 'uid' ];
      my $searchMesg = $ldap->search(base   => $userBase,
                                     filter => $userFilter,
                                     scope  => $scope,
                                     attrs  => $userAttributes);
      my @entries = $searchMesg->entries;
      my $totalFound = @entries;   # determine the number of entries found

      Thanks again!

        $ perl -nle 'print "$_ is ", /^[A-Z][1-9]\.\d\d?[A-Z]*$/ ? "valid" : "invalid"' 1078942.in
        A1.01A is valid
        A1.01B is valid
        A1.01B is valid
        A2.06AA is valid
        A2.06AABB is valid
        A2.06B is valid
        H2.25 is valid
        H3.10 is valid
        Imaging Center is invalid
        Italy is invalid
        K1.03A is valid
        K1.03B is valid
        R1.19G is valid
        R2.25AA is valid
        R2.25AABB is valid
        R2.25BB is valid
        R2.25D is valid
        R2.25DD is valid
        R2.25F is valid
        R2.25FF is valid
        V3.26P is valid
        V3.26Q is valid
        V3.26QR is valid
        V3.26R is valid
        V3.26R is valid
        V3.26S is valid
        V3.26ST is valid
        V3.26T is valid
        EDIT: to weed out .0 and .00, use this pattern instead: /^[A-Z][1-9]\.(?:[1-9]\d?|0[1-9])[A-Z]*$/
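
As a rough sketch, the stricter pattern from the EDIT can be wrapped in a small function and tried against a few strings. The name is_valid_seat is hypothetical, and 'A1.0' and 'A1.00' are invented inputs added only to exercise the .0 and .00 cases; the other samples come from the output posted above.

```perl
use strict;
use warnings;

# Pattern from the EDIT above, compiled once.
my $seat_re = qr/^[A-Z][1-9]\.(?:[1-9]\d?|0[1-9])[A-Z]*$/;

# Hypothetical helper: 1 if the whole string looks like a seat, else 0.
sub is_valid_seat {
    my ($s) = @_;
    return (defined($s) && $s =~ $seat_re) ? 1 : 0;
}

for my $loc ('A1.01A', 'H3.10', 'R2.25AABB', 'Imaging Center', 'Italy',
             'A1.0', 'A1.00') {
    printf "%-15s is %s\n", $loc, is_valid_seat($loc) ? 'valid' : 'invalid';
}
```

Note that this pattern is anchored and allows at most two digits after the dot, so it is stricter than the original \d+ version; whether that fits the real seat numbering scheme is an assumption to check against the data.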