PerlMonks  

(Ovid) RE: Pig Latin

by Ovid (Cardinal)
on Jul 23, 2000 at 20:43 UTC ( [id://23985]=note )


in reply to Pig Latin

Here's the shortest that I could come up with:
s/\b((qu|[bcdfghjklmnpqrstvwxyz]+)?([a-z]+))/$2?$3.$2."ay":$1."way"/eg;
Points to note:
  • It handles multiple consonants at the start of the word (e.g. "this" becomes "isthay")
  • It handles 'qu'.
  • It's terribly inefficient, but then, I guess that wasn't the point :)
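To make the points above concrete, here is a minimal sketch that wraps the one-liner in a small helper and runs it on three sample words. The helper name pig_latin and the sample words are assumptions for illustration; they are not part of the original post.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not in the original post): applies the one-liner
# to a copy of its argument and returns the converted text.
sub pig_latin {
    my ($text) = @_;
    $text =~ s/\b((qu|[bcdfghjklmnpqrstvwxyz]+)?([a-z]+))/$2?$3.$2."ay":$1."way"/eg;
    return $text;
}

# One word per case: consonant cluster, "qu", and leading vowel.
print pig_latin($_), "\n" for "this", "quick", "apple";
# prints isthay, ickquay and appleway, one per line
```

The "qu" alternative is tried before the consonant class, which is why "quick" moves both letters to the end rather than only the "q".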
Here's the full breakdown, if anyone's interested:
#!/usr/bin/perl -w
my $test = "that is the time for all good men to come to the aid of their country.";
$test =~ s/
    \b                            # start of word
    (                             # capture all to $1
      (                           # this is $2
        qu                        # word starts with qu
        |                         # or
        [bcdfghjklmnpqrstvwxyz]+  # one or more consonants
      )?                          # but it's optional
      (                           # this is $3
        [a-z]+                    # rest of word
      )
    )
  /$2 ? $3.$2."ay" : $1."way" /xeg;   # if $2 evaluates as true then
                                      # put it at end of word and add "ay"
                                      # otherwise, just add "way"
print $test;
Cheers,
Ovid

Replies are listed 'Best First'.
Perl Golf (was RE: (Ovid) RE: Pig Latin)
by japhy (Canon) on Jul 23, 2000 at 23:12 UTC
    I'm just playing through. I made your code case-insensitive and did a bit of regex tomfoolery.
    # updated (5 years later!)
    s/\b(qu|[^\W0-9aeiou_]+)?([a-z]+)/$1?"$2$1ay":"$2way"/ieg;
    I don't see the need to save 3 pieces of data. Using [^\W0-9_] is shorter than both [bcdf..xyz] and [b-df-hj-np-tv-z], and it forces the reader to think for a second ;). I also saved space with the quoting on the RHS.

    Score: 53.
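The negated class in the updated code is a double negative: [^\W...] means "a word character that is not one of the listed exceptions". As a rough check, the sketch below lists which lowercase ASCII letters the class [^\W0-9aeiou_] accepts and compares them with the spelled-out consonant list. The helper name letters_matching is an assumption for illustration, and the check deliberately ignores uppercase and non-ASCII input.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the lowercase ASCII letters a pattern matches.
sub letters_matching {
    my ($re) = @_;
    return join '', grep { /^$re$/ } 'a' .. 'z';
}

my $explicit = letters_matching(qr/[bcdfghjklmnpqrstvwxyz]/);
my $negated  = letters_matching(qr/[^\W0-9aeiou_]/);

print "explicit: $explicit\n";
print "negated:  $negated\n";
print "same set of lowercase letters\n" if $explicit eq $negated;
```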

    $_="goto+F.print+chop;\n=yhpaj";F1:eval
      japhy, originally, I was constructing a rather longer and optimized script to do the pig latin conversion. Then I went back and reread vroom's specs. First, I didn't use the /i modifier because he said we were to assume the data was lowercase and he wanted the shortest possible code.

      The reason I am using three backreferences is that the data saved to $2 is tricky. Your equivalent (ignoring the "qu" problem) is [^\W0-9_]. This matches all alphabetic characters but does not exclude vowels. However, you appeared to notice this when you mentioned [b-df-hj-np-tv-z]. Therefore, I suspect that you intended the following and (assuming you did intend this) I offer you kudos for a clever regex:

      s/\b(qu|[^\W0-9_aeiou]+)?([a-z]+)/$1?"$2$1ay":"$2way"/ieg;
      I also noticed that, in this case, using the /i modifier ignores vroom's "lowercase" spec, but it does result in a shorter regex.
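To see why the missing vowels matter, here is a small sketch that runs both versions of the substitution on the same words. The two helper names are assumptions for illustration. Without 'aeiou' in the class, the first group greedily swallows the whole word and gives back only the last letter.

```perl
use strict;
use warnings;

# Hypothetical helper: the class as first posted, [^\W0-9_] (vowels allowed).
sub without_vowels_excluded {
    my ($text) = @_;
    $text =~ s/\b(qu|[^\W0-9_]+)?([a-z]+)/$1?"$2$1ay":"$2way"/ieg;
    return $text;
}

# Hypothetical helper: the corrected class, [^\W0-9_aeiou].
sub with_vowels_excluded {
    my ($text) = @_;
    $text =~ s/\b(qu|[^\W0-9_aeiou]+)?([a-z]+)/$1?"$2$1ay":"$2way"/ieg;
    return $text;
}

for my $word (qw(this apple)) {
    printf "%-6s %-10s %s\n", $word,
        without_vowels_excluded($word), with_vowels_excluded($word);
}
# this   sthiay     isthay
# apple  eapplay    appleway
```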

      Cheers,
      Ovid

        Oh, d'oh, I'm silly. I meant to add 'aeiou' to the character class, I really did, since that was the whole reason I introduced it. :) And I'm sorry I hadn't checked vroom's specs.

        By the way, since Pig Latin does not produce a 1-to-1 mapping of normal strings to PL-strings, you can't reasonably reverse this process. Example: flea and leaf both go to eaflay.
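The collision is easy to reproduce. A minimal sketch, assuming the corrected substitution from earlier in the thread wrapped in a hypothetical pig_latin helper:

```perl
use strict;
use warnings;

# Hypothetical helper around the corrected substitution from this thread.
sub pig_latin {
    my ($text) = @_;
    $text =~ s/\b(qu|[^\W0-9_aeiou]+)?([a-z]+)/$1?"$2$1ay":"$2way"/ieg;
    return $text;
}

# Group the source words by their Pig Latin form.
my %sources;
push @{ $sources{ pig_latin($_) } }, $_ for qw(flea leaf);

for my $pl (sort keys %sources) {
    print "$pl <- @{ $sources{$pl} }\n";
}
# prints: eaflay <- flea leaf
```

Both words map to the same key, so a decoder given only "eaflay" cannot tell which one it started from.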

Re^2: Pig Latin
by Sartak (Hermit) on May 05, 2006 at 03:42 UTC
    [bcdfghjklmnpqrstvwxyz]+ can become [bcdfghj-np-tv-z]+ to save six bytes without loss of accuracy.
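A quick sanity check of that claim, as a sketch only: it compares the two classes over the lowercase letters a to z and counts the difference in length. The variable names are assumptions for illustration.

```perl
use strict;
use warnings;

my $long  = '[bcdfghjklmnpqrstvwxyz]';
my $short = '[bcdfghj-np-tv-z]';

# Collect the lowercase letters each class accepts.
my @long  = grep { /^$long$/ }  'a' .. 'z';
my @short = grep { /^$short$/ } 'a' .. 'z';

print "same letters: ", ("@long" eq "@short" ? "yes" : "no"), "\n";
print "bytes saved: ", length($long) - length($short), "\n";
# prints "same letters: yes" and "bytes saved: 6"
```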
