http://qs321.pair.com?node_id=1156122

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I am trying to extract the first name from a person’s full name, which can come in two possible forms. In the first case the person’s name looks like: SMITH, A DOE
This one can be done like this, unless there is a better way:
$mystring = ($mystring =~ /(\w{3,})$/) ? $1 : '';   # prints "DOE"

In the second case the person’s name looks like: BULLOCK JOE A
It can be done like this, unless there is a better way:
$mystring = ($mystring =~ / (\w{1,2}\s)(\w+)$/) ? $2 : '';   # should print "JOE"

My goal is to get this one line of code to work for both cases, but I have had no luck. Any suggestions?
$mystring = ($mystring =~ /(\w{1}\s\w+)|(\w{3,})$/) ? $1 : ''; print "\n $mystring\n";

Test code:
#!/usr/bin/perl
use strict;
use warnings;

#my $mystring = "SMITH, A DOE";
my $mystring = "BULLOCK JOE A";

if ($mystring =~ /(\w{3,})$/) {
    print "\n\n $1\n\n";
} elsif ($mystring =~ /(\w{1,2}\s)(\w+)$/) {
    print "\n\n $2\n\n";
}

Thanks for looking!

Replies are listed 'Best First'.
Re: RegExp, grabbing first name
by hippo (Bishop) on Feb 25, 2016 at 18:55 UTC

    The match /([A-Z]{3,})/g works for me by extracting both the surname and the forename. Example:

    #!/usr/bin/env perl
    use strict;
    use warnings;
    use Test::More tests => 2;

    my %names = (
        'BULLOCK JOE A' => 'JOE',
        'SMITH, A DOE'  => 'DOE'
    );

    for my $fullname (keys %names) {
        my ($sname, $fname) = $fullname =~ /([A-Z]{3,})/g;
        is ($fname, $names{$fullname}, "Forename $fname extracted from $fullname");
    }

      But I don't have my data in a hash (key => value). I have a list of full names, and I have to extract the first name from each one. Thanks for replying, though!

        Same for list

        #!perl
        use strict;

        my @list = ('BULLOCK JOE A','SMITH, A DOE');
        for my $name (@list){
            my ($surname,$first) = $name =~ /(\w{3,})/g;
            print "$first : $name\n";
        }

        poj
        But I don't have my data in a hash

        The hash is used as a clear and concise way to list both the data which you have (the keys) with the results which you want (the values) and to test that an operation on the former yields the latter. There is no need for you to use a hash in your implementation. See How to ask better questions using Test::More and sample data for a recent discussion on why using a test in a question is a good idea and why (to my mind) using a test in an answer is equally beneficial.

        The fact that my example iterates over a list (keys %names) should help you to use a similar list in your own code as do the examples from pod and ExReg.

        You could just as easily do it with an array or list then. Modify the code hippo gave to use an array instead:

        #!/usr/bin/env perl
        use strict;
        use warnings;

        my @names = (
            'BULLOCK JOE A',
            'SMITH, A DOE',
        );

        for my $fullname (@names) {
            my ($sname, $fname) = $fullname =~ /([A-Z]{3,})/g;
            print "Forename $fname extracted from $fullname\n";
        }

        There are three assumptions that need to be met: both the first and last names must be at least three letters long, the last name must come first, and the middle name must be an initial (or at least fewer than three letters long). If you cannot meet those conditions, you will have a hard problem.
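
To make those three assumptions concrete, here is a minimal sketch that runs the same `/([A-Z]{3,})/g` match over a few inputs that break them. The helper name `second_long_word` and every test name other than the two from the original question are made up for illustration.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: returns the second run of three or more
# capital letters, exactly as the list-context match above does.
sub second_long_word {
    my ($fullname) = @_;
    my ($sname, $fname) = $fullname =~ /([A-Z]{3,})/g;
    return $fname;
}

my @cases = (
    'BULLOCK JOE A',     # meets all three assumptions
    'SMITH, A DOE',      # meets all three assumptions
    'NG JOE A',          # surname shorter than three letters
    'BULLOCK JO A',      # forename shorter than three letters
    'JOE BULLOCK',       # forename comes first
    'SMITH, ANN DOE',    # middle name of three letters
);

for my $case (@cases) {
    my $found = second_long_word($case);
    printf "%-16s => %s\n", $case, defined $found ? $found : '(no match)';
}
```

Only the first two inputs give the wanted forename. The two short-name cases return nothing, 'JOE BULLOCK' returns BULLOCK, and 'SMITH, ANN DOE' returns ANN.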

OT: Test framework (Re: RegExp, grabbing first name)
by soonix (Canon) on Feb 26, 2016 at 09:55 UTC
    I bloated up hippo's test frame with the solutions suggested so far and added some more test cases. They all do OK as far as specified and differ in the "unspecified by OP" cases:

    #!/usr/bin/env perl
    use 5.011;    # implies strict + feature 'say'
    use warnings;
    use Test::More;

    my %names = (    # Test data
        # input                expected output
        'BULLOCK JOE A'     => 'JOE',
        'SMITH, A DOE'      => 'DOE',
        'BULLOCK MICHAEL A' => 'MICHAEL',
        # not specified by OP:
        'SMITH ADAM'        => 'ADAM',
        'POCAHONTAS'        => 'POCAHONTAS',
        'TRAPPER JOHN M D'  => 'JOHN',
    );

    my %approaches = (
        'hippo' => sub {
            my $fullname = shift;
            my ( $sname, $fname ) = $fullname =~ /([A-Z]{3,})/g;
            return $fname || $sname;    # updated as per [id://1156306]
        },
        'Maresia' => sub {
            my $string = shift;
            if ( $string =~ /(\w{3,})$/ ) {
                return $1;
            }
            elsif ( $string =~ /(\w+)\s(\w{1,2})$/ ) {
                return $1;
            }
        },
        'kcott' => sub {
            shift =~ /^[^, ]+(?:,\s+\w+|)\s+(\w+)/;
            return $1;
        },
    );

    plan tests => scalar keys %approaches;

    for my $who ( keys %approaches ) {
        print "\nRunning tests for $who:\n\n";
        subtest(
            $who,
            sub {
                plan tests => scalar keys %names;
                my $get_forename = $approaches{$who};
                for my $fullname ( keys %names ) {
                    my $fname = $get_forename->($fullname);
                    no warnings 'uninitialized';
                    is( $fname, $names{$fullname},
                        "wanting forename '$names{$fullname}' from '$fullname'" );
                }
            }
        );
        print "\n";
    }

      Nice work comparing these approaches (++).

      If anyone wants the hippo version to work even in the OP-unspecified cases, all that is required is to change the return statement to

      return $fname || $sname;

      which completes the test set for that approach.

        Thanks, updated.

        The proper reason I wrote this (and why I marked it as OT) was to get acquainted with (and get used to) TDD, so I didn't look deeper into the solutions than was necessary to fit them in…

        BTW in my private copy, I have the test case from Re^2: RegExp, grabbing first name :-)
        'PEPPER ALICE BELINDA, CHARLOTTA, ..., YULIA, ZINNIA' => 'ALPHABET'
Re: RegExp, grabbing first name
by Maresia (Beadle) on Feb 25, 2016 at 16:53 UTC
    Hi, this line should work. I only ran a quick test, but it worked on both options for me:
    $mystring = ($mystring =~ /(\w{3,})$/) ? $1 : '';

      Re Re^3: RegExp, grabbing first name Listing Lname, MInitial, Fname is not a convention of which I'm aware.

      OP seems satisfied (below) with some of the suggestions, but IMHO, "A" is an abbreviation of the Fname for "SMITH" (as posted by OP).


      Come, let us reason together: Spirit of the Monastery
      Thanks, but it didn't work as I expected:
      I have to get the first name only no matter what the choices are:
      my $string_1 = "SMITH, A DOE"; my $mystring_2 = "BULLOCK JOE A";
      I am looking to get:

      String 1 = DOE
      String 2 = JOE

      My post was not clear, sorry!
      Thank you!
Re: RegExp, grabbing first name
by SuicideJunkie (Vicar) on Feb 25, 2016 at 20:29 UTC
    Are you OK with failing in the case of names like:
    Peep, Bo
    Chen, Yu
      Or

      Unavoidable, Chun the

      or

      'Alice Belinda, Charlotta, ..., Yulia, Zinnia Pepper', known locally as Alphabet Pepper.

      Parsing names is not a simple problem.

      ----
      I Go Back to Sleep, Now.

      OGB

Re: RegExp, grabbing first name
by kcott (Archbishop) on Feb 26, 2016 at 08:20 UTC

    This matches the two test strings you provided and captures the firstname:

    /^[^, ]+(?:,\s+\w+|)\s+(\w+)/

    Here's my test:

    $ perl -wnE '/^[^, ]+(?:,\s+\w+|)\s+(\w+)/; say $1;'
    SMITH, A DOE
    DOE
    BULLOCK JOE A
    JOE

    — Ken
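
For readers who want to see what each part of that pattern does, here is a sketch of the same regex spelled out with the `/x` modifier. The comments are one reading of the pattern, not kcott's own explanation, and the sub name `forename_kcott` is made up for illustration. Unlike the one-liner, it checks that the match succeeded before using `$1`.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Same pattern as above, spread out with /x. Under /x the space inside
# the bracketed class is still significant.
my $re = qr/
    ^ [^, ]+            # surname: everything up to the first comma or space
    (?: ,\s+\w+ | )     # either a comma, whitespace and one word, or nothing
    \s+                 # whitespace before the forename
    (\w+)               # capture group 1: the forename
/x;

# Hypothetical wrapper: returns the forename, or undef when nothing matches.
sub forename_kcott {
    my ($fullname) = @_;
    return $fullname =~ $re ? $1 : undef;
}

for my $name ('SMITH, A DOE', 'BULLOCK JOE A', 'POCAHONTAS') {
    my $found = forename_kcott($name);
    print "$name => ", (defined $found ? $found : '(no match)'), "\n";
}
```

A single-word name such as POCAHONTAS has no whitespace for `\s+` to match, so the pattern fails there and the wrapper returns undef.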

Re: RegExp, grabbing first name
by clueless newbie (Curate) on Feb 26, 2016 at 12:02 UTC
Re: RegExp, grabbing first name
by Anonymous Monk on Feb 25, 2016 at 20:37 UTC

    You might be looking for $+. Quoting the docs:

    The text matched by the last bracket of the last successful search pattern. This is useful if you don't know which one of a set of alternative patterns matched. For example:

    /Version: (.*)|Revision: (.*)/ && ($rev = $+);
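
Applied to the original question, `$+` allows a single alternation to serve both name layouts. This is only a sketch: the pattern below is one possible combination of the two regexes discussed in this thread, not something the docs or the original poster supplied, and the sub name `forename_plus` is made up.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper. The first alternative captures a final word of three
# or more characters; the second captures the word before a trailing one- or
# two-character word. Whichever group matched is available through $+.
sub forename_plus {
    my ($fullname) = @_;
    return $fullname =~ /(\w{3,})$|(\w+)\s\w{1,2}$/ ? $+ : '';
}

print forename_plus('SMITH, A DOE'),      "\n";   # DOE
print forename_plus('BULLOCK JOE A'),     "\n";   # JOE
print forename_plus('BULLOCK MICHAEL A'), "\n";   # MICHAEL
```

With `$1` in place of `$+`, the second and third calls would trigger an "uninitialized value" warning, because only group 2 is set when the second alternative matches.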

      Try this see if it will work.

      #!/usr/bin/perl
      use strict;
      use warnings;

      #my $mystring = "SMITH, A DOE";
      my $mystring = "BULLOCK JOE A";

      my @array = $mystring =~ m/^[A-Z, ]+([A-Z]{3,}).*?$/;
      print "$array[0]\n";

        Could this one work?

        if ( $string =~ /(\w{3,})$/ ) {
            print "\n$1\n";
        } elsif ( $string =~ /(\w+)\s(\w{1,2})$/ ) {
        #} elsif ( $string =~ /(\w{1})\s(\w+)\s(\w{1,2})$/ ) {   # or this one, using $2
            print "\n$1\n";
        }

        It doesn't work:
        my $string = "BULLOCK MICHAEL A";
        $string = ($string =~ m/^[A-Z, ]+([A-Z]{3,}).*?$/) ? $1 : '';
        print $string;   # prints "AEL"
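
The "AEL" result comes from greediness: `[A-Z, ]+` first swallows the whole string and then gives back only as many characters as the capture group needs, which is three letters. A possible fix, offered as an untested-in-production sketch rather than a complete name parser, is to force the capture to begin at a word boundary with `\b`. The variable names below are made up for illustration.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

my $string = 'BULLOCK MICHAEL A';

# Original pattern: the greedy prefix leaves only three letters for the group.
my ($greedy) = $string =~ /^[A-Z, ]+([A-Z]{3,}).*?$/;

# Assumed fix: \b makes the group start at the beginning of a word, so the
# prefix has to back off all the way to the space before MICHAEL.
my ($bounded) = $string =~ /^[A-Z, ]+\b([A-Z]{3,})/;

print "$greedy\n";     # AEL
print "$bounded\n";    # MICHAEL
```

Because the prefix is still greedy, this picks the last word of three or more capitals that follows the surname, so it relies on the same assumptions about name layout discussed earlier in the thread.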