PerlMonks  

Re: Handling caps for surnames with capitals in the middle (was: Irish Surnames)

by ariels (Curate)
on May 06, 2002 at 11:48 UTC


in reply to Handling caps for surnames with capitals in the middle (was: Irish Surnames)

Doesn't this depend on the person? I believe you can find Macdonalds and MacDonalds (not to mention McDonald's) coexisting in the phone books...

It looks like you're short of luck: you have a social problem, so a technical solution won't do.


Replies are listed 'Best First'.
Re: Re: Handling caps for surnames with capitals in the middle (was: Irish Surnames)
by Joost (Canon) on May 06, 2002 at 11:59 UTC
    But you might make an educated guess when you have no (reasonable) capitalisation. I think that's how humans treat this problem.

    i.e.

    "MacDonalds" eq handle_caps("irish","macdonalds");
    "Macdonalds" eq handle_caps("irish","Macdonalds");
    "MacDonalds" eq handle_caps("irish","MaCDOnaLDS");

    Assuming that "MacDonalds" is the 'preferred' spelling in Irish...

    - Joost.
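Joost's three cases can be turned into a rough sketch. The handle_caps name comes from the post above, but the body, the %PREFIXES table, and the test for what counts as a "reasonable" capitalisation (one or two capitalised runs, such as Macdonalds or MacDonalds) are all assumptions made for illustration, not anyone's actual implementation.

```perl
use strict;
use warnings;

# Hypothetical prefix table; only an "irish" entry is sketched here.
my %PREFIXES = ( irish => [qw(Mac Mc)] );

sub handle_caps {
    my ($style, $name) = @_;

    # Assumption: one or two capitalised runs ("Macdonalds", "MacDonalds")
    # count as a deliberate spelling and are returned untouched.
    return $name if $name =~ /^[A-Z][a-z]+(?:[A-Z][a-z]+)?$/;

    # Otherwise the capitalisation carries no information, so guess.
    my $guess = lc $name;
    for my $prefix (@{ $PREFIXES{$style} || [] }) {
        my $len = length $prefix;
        # Require at least three letters after the prefix before splitting.
        if (substr($guess, 0, $len) eq lc($prefix)
            and length($guess) >= $len + 3) {
            return $prefix . ucfirst(substr($guess, $len));
        }
    }
    return ucfirst($guess);
}

print handle_caps("irish", $_), "\n" for qw(macdonalds Macdonalds MaCDOnaLDS);
```

It only guesses when the input gives it nothing to go on, which is the point of the suggestion: a name that already arrives with sensible capitals is left as its owner wrote it.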

Re: Re: Handling caps for surnames with capitals in the middle (was: Irish Surnames)
by Baz (Friar) on May 06, 2002 at 11:56 UTC
    Hmmm...well, all the surnames I'm accessing are stored in a database as lower-case strings. When I print them out I'd like to display them in the form I've described above, i.e. mcgee to McGee and develera to DeVelera.

      If it were my surname you were mauling, I'd be more than a bit annoyed. I get enough of my surname's Anglicised (actually Brazilinated, but it's close enough) form being auto-"corrected" into the original Russian. I am actually capable of spelling it correctly, and I do just that.

      That said, you could

      $surname =~ s/^((?:Mc|Mac|De|Da|Du)?)(.*)$/\u$1\u$2/i;
      (assumes the surname starts off lower-cased).

      Other problems: it's not smart enough. E.g. 'mack' becomes 'MacK', which is wrong; you might want .{3,} instead of .* in the regexp.
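A minimal sketch of the .{3,} variant, wrapped in a hypothetical cap_surname helper (the name and the lc call are assumptions added here so the example runs on its own):

```perl
use strict;
use warnings;

# Hypothetical helper: the substitution above with .{3,} in place of .*
sub cap_surname {
    my ($surname) = @_;
    $surname = lc $surname;    # the regexp assumes lower-cased input
    # A prefix is split off only if at least three characters follow it;
    # otherwise the optional group matches nothing and the whole name is $2.
    $surname =~ s/^((?:Mc|Mac|De|Da|Du)?)(.{3,})$/\u$1\u$2/i;
    return $surname;
}

print cap_surname($_), "\n" for qw(mcgee macdonald mack);
```

With this change 'mack' comes out as 'Mack'. It is still only a heuristic: 'dempsey' turns into 'DeMpsey', and a name shorter than three letters does not match at all and stays lower-cased.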

      But I really don't think you should be doing this...

      Well, depending on whether you're validating (i.e. someone submits their name and you want to check if it's in the database or not) or doing something for each entry in a query (give me a list of every person in the database who likes cheeseburgers), you could just canonicalize in your query. Observe:
      # Compare case-insensitively by lower-casing both sides
      $sth = $dbh->prepare(qq(select * from table where lower(last_name)=?))
          or die $dbh->errstr;
      (my $new_last_name = $last_name) =~ tr/A-Z/a-z/;
      $sth->execute($new_last_name) or die $sth->errstr;
      Now you can store the names in the database however you want, do your query, and return what the user actually entered as their name (many have made the point that they know how to spell their own name). By translating both your string and what's in the database to lower case, you will find the match regardless of what case is stored. Be warned, though, that this basically defeats any index you may have on that field: the database doesn't know what lower() will return until it actually calls it, so it must call it on every record in the table.
