
Re: Split confusion

by AnomalousMonk (Archbishop)
on Jun 03, 2020 at 16:27 UTC


in reply to Split confusion

If you're processing a list of upper-cased names and want to turn them into "properly" cased names, why not just do that?

c:\@Work\Perl\monks>perl -wMstrict -le
"my @names = (
   'JAMES SMITH-JONES', 'BOB SMITH-SMYTHE-SMITH',
   'J. JONAH JAMESON', 'BILLY BOB THORNTON',
   );
 ;;
 for my $name (@names) {
    printf qq{'$name' -> };
    $name =~ s{ \b ([[:upper:]]+) \b }{\u\L$1}xmsg;
    print qq{'$name'};
    }
 "
'JAMES SMITH-JONES' -> 'James Smith-Jones'
'BOB SMITH-SMYTHE-SMITH' -> 'Bob Smith-Smythe-Smith'
'J. JONAH JAMESON' -> 'J. Jonah Jameson'
'BILLY BOB THORNTON' -> 'Billy Bob Thornton'
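For readers who would rather have a script than a Windows one-liner, the same substitution can be wrapped in a sub. This is only a minimal sketch: the name `proper_case` is made up for illustration and does not appear in the post, and the regex is copied unchanged from the one-liner above.

```perl
use strict;
use warnings;

# proper_case() is a hypothetical helper name, not part of the original post.
sub proper_case {
    my ($name) = @_;

    # \b ([[:upper:]]+) \b : capture each run of upper-case letters that
    #                        sits between word boundaries
    # \L$1                 : lower-case the whole captured run
    # \u                   : then upper-case its first character
    $name =~ s{ \b ([[:upper:]]+) \b }{\u\L$1}xmsg;

    return $name;
}

print proper_case($_), "\n"
    for 'JAMES SMITH-JONES', 'J. JONAH JAMESON';
```

Working on a copy inside the sub leaves the caller's string untouched, unlike the loop in the one-liner, which modifies the elements of @names in place.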

Update: See also Falsehoods Programmers Believe About Names.


Give a man a fish:  <%-{-{-{-<

Re^2: Split confusion
by afoken (Chancellor) on Jun 03, 2020 at 19:35 UTC

    If you're processing a list of upper-cased names and want to turn them into "properly" cased names, why not just do that?

    See also Falsehoods Programmers Believe About Names.

    It simply does not work correctly:

    #30 There exists an algorithm which transforms names and can be reversed losslessly.

    X:\>perl oops.pl
    'JAMES SMITH-JONES' -> 'James Smith-Jones''BOB SMITH-SMYTHE-SMITH' -> 'Bob Smith
    -Smythe-Smith''J. JONAH JAMESON' -> 'J. Jonah Jameson''BILLY BOB THORNTON' -> 'B
    illy Bob Thornton''LUDWIG VAN BEETHOVEN' -> 'Ludwig Van Beethoven'
    X:\>type oops.pl
    my @names = (
       'JAMES SMITH-JONES', 'BOB SMITH-SMYTHE-SMITH',
       'J. JONAH JAMESON', 'BILLY BOB THORNTON',
       'LUDWIG VAN BEETHOVEN',
       );
     ;;
     for my $name (@names) {
        printf qq{'$name' -> };
        $name =~ s{ \b ([[:upper:]]+) \b }{\u\L$1}xmsg;
        print qq{'$name'};
        }
    X:\>

    Ol' Ludwig needs a lower case 'v' in his name. Quoting Wikipedia:

    The prefix van to the surname "Beethoven" reflects the Flemish origins of the family; the surname suggests that "at some stage they lived at or near a beet-farm".
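One way to handle the lower-case 'v' is to keep a list of particles that stay lower-cased. The sketch below is an illustration only, not a fix anyone in the thread proposed: the `%PARTICLE` list and the sub name `proper_case_with_particles` are assumptions, and the list is deliberately incomplete.

```perl
use strict;
use warnings;

# Hypothetical, deliberately incomplete list of particles to keep lower-case.
my %PARTICLE = map { $_ => 1 } qw(van von de der la);

# proper_case_with_particles() is a made-up name for this sketch.
sub proper_case_with_particles {
    my ($name) = @_;

    # /e evaluates the replacement as code, so each word can be looked up
    # in the particle list before its case is decided.
    $name =~ s{ \b ([[:upper:]]+) \b }{
        my $word = lc $1;
        $PARTICLE{$word} ? $word : ucfirst $word;
    }xmsge;

    return $name;
}

print proper_case_with_particles('LUDWIG VAN BEETHOVEN'), "\n";
```

A fixed list trades one wrong answer for another: the same code turns 'VAN MORRISON' into 'van Morrison', so it only helps when the particles in the data are known in advance.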

    Alexander

    --
    Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)

      ... "properly" ...   Nobody ever mentioned lossless reversal. :)


      Give a man a fish:  <%-{-{-{-<

Re^2: Split confusion
by swampyankee (Parson) on Jun 03, 2020 at 18:37 UTC

    Thank you.

    I was considering using a regex, but those seem to be the first clues I've lost.

    While I'm processing names in a formulaic manner, I actually know how they write them, at least when using some extended version of the Roman alphabet (some of my students have names that are transliterated from Arabic, Serbian, and Macedonian). The names were upcased by the software from the system producing the reports.
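For names written in an extended Roman alphabet, a Unicode-aware variant of the substitution may be safer. This is a sketch under stated assumptions: the input has already been decoded into Perl character strings, the sample name is invented, and `proper_case_unicode` is a hypothetical sub name. It swaps `[[:upper:]]` and `\b` for the Unicode properties `\p{Lu}` and `\p{L}`, which apply Unicode rules regardless of how the string is stored internally.

```perl
use strict;
use warnings;
use utf8;                        # this source file contains UTF-8 literals
use feature 'unicode_strings';   # Unicode rules for case changes on all strings

# proper_case_unicode() is a hypothetical name for this sketch.
sub proper_case_unicode {
    my ($name) = @_;

    # Capture each run of upper-case letters that is not adjacent to
    # another letter, then lower-case it and upper-case its first character.
    $name =~ s{ (?<! \p{L} ) ( \p{Lu}+ ) (?! \p{L} ) }{\u\L$1}xmsg;

    return $name;
}

binmode STDOUT, ':encoding(UTF-8)';   # avoid "Wide character" warnings
print proper_case_unicode('MILICA ĐORĐEVIĆ'), "\n";
```

If the names come from a file, the handle would also need a decoding layer such as `:encoding(UTF-8)` on input; the report's actual encoding is not stated in the thread, so that part is left out.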


    Information about American English usage here and here. Floating point issues? Please read this before posting. — emc

Re^2: Split confusion
by soonix (Canon) on Jun 03, 2020 at 19:23 UTC
    31. I can safely assume that this dictionary of bad words contains no people’s names in it.
    My theorem is this:
    For any name of any person, there is a language in which it is a swearword.

      I have the same theory about car models

Re^2: Split confusion
by Fletch (Bishop) on Jun 03, 2020 at 18:06 UTC

    Heh, #28 . . . Qapla'

    The cake is a lie.
    The cake is a lie.
    The cake is a lie.
