PerlMonks  

Code Optimization v5.8.1

by bluethundr (Pilgrim)
on May 20, 2004 at 16:29 UTC ( [id://355000]=perlquestion )

bluethundr has asked for the wisdom of the Perl Monks concerning the following question:

Hey Folks! I'm still new here, and new to Perl in general.

Still trying to craft, btw, an online presence here that doesn't annoy too many of us! :) I was thinking of "Perl-21 Lemay p.243 ex 4" (honestly!) but that seemed a little less...catchy? I suppose would be the word. At any rate, if you'd like to holler at me that this would have been a better title, by all means! Fire away! Formatting tips also welcome!

So here's the problem from -duh- "Sams Teach Yourself Perl In 21 Days by Laura Lemay". It's example 4 on page 243.

The problem states:

-------------------------------------------------------

'Write a program to expand acronyms in its input (for example, to replace the letters "HTML" with "HTML (Hypertext Markup Language)"). Use the following acronyms and meanings to replace:

HTML (Hypertext Markup Language)

ICBM (InterContinental Ballistic Missile)

EEPROM (Electrically-erasable programmable read-only memory)

SCUBA (Self-contained underwater breathing apparatus)

FAQ (Frequently Asked Question)'

---------------------------------------------------


I didn't phrase the output in quite the same way, but my answer was close enough in gross concept, IMO. So my question is, no, not for you to write the proggie for me! But rather, I came up with a solution that is a little less than...how shall we say? elegant! I realize that Mike Gancarz according to his "Unix Philosophy" (also new to that!) would disagree with this question, but I would love to know how a vet perl user would approach the problem.

btw, this *works* on my Linux laptop. But unfortunately I'm visiting with relatives and all they have is dialup! And they don't want me installing all this "hacker stuff" (????!!!!!...and not that they would even likely know it if'n I did!!!) onto their pc. Eh, I'm the guest. Guess I'll just "reinvent the wheel"! Also counter to UNIX ideology, I know. I'm not asking for that kind of help, btw. I plan on visiting an installfest soon. But please try to bear with me! :)

So what I'm trying to say (in my ugly and verbose sort of way...much like my code!) is that this may work, or I may typo in the translation. If so, I accept all criticisms.

Here goes!

---------------------------------------------

    #!/usr/bin/perl -w

    $in = ();       # usr input
    $exit = 'n';    # done yet?
    $final = ();    # final output
    %acro = ();     # hash for acronyms

    %acro = (
        'HTML' => "Hypertext Markup Language",
        'ICBM' => "Intercontinental Ballistic Missile",
        'EEPROM' => "Electronically-erasable programmable read only memory",
        'SCUBA' => "Self Contained Underwater Breathing Aparatus",
        'FAQ' => "Frequently Asked Questions",
        'LCARS' => "Library Computer And Retrieval System",    # my own goofy ass addition. Others welcome!
        'NASA' => "National Aeronautical and Space Administration"    # ditto
    );

    while () {
        print "\nPlease enter an acronym": ";
        chomp($in = <STDIN>);
        print "\n";

        foreach $ac (keys %acro) {
            if ($in =~ /$ac/ig) {
                $final = "\n$ac is the acronym for $acro{ac} \n\n";
                last;
            } else {
                $final = "\nSorry. But that is not an acronym that I recognize!\n";
            }

        print $final;
        $final = ();

        print "\nTry again? : ";
        chomp ($exit =~ /[y|yes]/i {
            next;
            print "\n";
            last;
        }

        print "\n";
        last;
    }
------------------------------------------------------

So there ya have it! Ugly crappy code from a total noob! :) But what the hey, it works!

THANKS!

Replies are listed 'Best First'.
Re: Code Optimization v5.8.1
by dragonchild (Archbishop) on May 20, 2004 at 17:33 UTC
    Here's how I would have written it. (This code has been lightly tested.)

        use strict;
        use warnings;

        $|++;

        my %acro = (
            HTML => "Hypertext Markup Language",
            ICBM => "Intercontinental Ballistic Missile",
            EEPROM => "Electronically-erasable programmable read only memory",
            SCUBA => "Self Contained Underwater Breathing Aparatus",
            FAQ => "Frequently Asked Questions",
            LCARS => "Library Computer And Retrieval System",
            NASA => "National Aeronautical and Space Administration",
        );

        INPUT: {
            print "Please enter an acronym: ";
            chomp( $_ = <STDIN> );
            $_ = uc or redo INPUT;

            last INPUT if /^Q/;

            unless (exists $acro{$_}) {
                print "I don't know what '$_' means.\n";
                redo INPUT;
            }

            print "$_ ($acro{$_})\n";
            redo INPUT;
        }

    Some explanations:

    1. Always begin programs with the top two lines. It's easier to have the compiler do the bookkeeping for you.
    2. The $|++ is to disable output buffering. (q.v. Suffering from Buffering for more info.)
    3. Then, I set up the lookup table for the acronyms.
    4. Now, here's the goofy thing. You'll notice I'm using a named block and using last and redo. (I can't use next cause I'm not in a loop, but redo does something similar.) Sometimes, a certain construct makes a given algorithm easier to express for a given person. I like using named blocks and redo to handle keyboard input. Others like while-loops, goto, and all sorts of other ideas. I like named blocks.
    5. Now, I assign to $_, which has the fancy name of "local topic". This is the default variable for many functions, including regexes.
    6. I want to compare case-insensitively, so I uc the string. (uc defaults to using $_.) Now, if there is nothing, uc will return the empty string. This allows me to use logic short-circuits to redo the loop if nothing was entered. (Try it!)
    7. I allow for a quit scenario by stopping if anything that begins with 'Q' or 'q' is entered. (You might want to change this to allow acronyms that begin with 'Q'.)
    8. I use the exists function to have the hash do the lookup for me (instead of me coding a for loop). This also has the benefit of keeping my hash clean. If it doesn't exist, print a message and redo the block.
    9. If it does exist, print the acronym and definition, then redo the block.

    If you have any questions, please don't hesitate to ask!

    ------
    We are the carpenters and bricklayers of the Information Age.

    Then there are Damian modules.... *sigh* ... that's not about being less-lazy -- that's about being on some really good drugs -- you know, there is no spoon. - flyingmoose

    I shouldn't have to say this, but any code, unless otherwise stated, is untested

      I like this style, but the one thing that would drive me crazy using a program like this is that I wouldn't have any idea of how to exit. While I don't like cumbersome prompts like "Enter an acronym, upper or lower case, or enter something that starts with Q (also upper or lower case) to quit:" perhaps some indication of how to get out would be beneficial?

      I know this is probably a moot point for a script like this, but I spend too much time in programs with really bad UIs to ignore it...
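
One way to answer the "how do I get out?" complaint, as a rough sketch rather than anyone's posted solution: put the exit word in the prompt and treat end-of-input as a quit too. The helper name `lookup_loop`, the `quit` keyword, and the shortened `%acro` table are assumptions made for illustration. The loop takes filehandles as arguments so it can be exercised without a keyboard.

```perl
use strict;
use warnings;

$| = 1;    # flush prompts immediately when used interactively

# Shortened table; the full list from the thread would work the same way.
my %acro = (
    HTML => "Hypertext Markup Language",
    FAQ  => "Frequently Asked Questions",
);

# Hypothetical helper: reads acronyms from $in, writes answers to $out.
sub lookup_loop {
    my ($in, $out) = @_;
    while (1) {
        print {$out} "Enter an acronym (or 'quit' to exit): ";
        my $line = <$in>;
        last unless defined $line;    # end of input also ends the loop
        chomp $line;
        my $key = uc $line;
        next if $key eq '';           # empty input: just prompt again
        last if $key eq 'QUIT';
        if (exists $acro{$key}) {
            print {$out} "$key ($acro{$key})\n";
        }
        else {
            print {$out} "I don't know what '$key' means.\n";
        }
    }
    return;
}

# Demo with canned input; interactive use would be
# lookup_loop(\*STDIN, \*STDOUT);
my $typed = "html\nxyz\nquit\n";
open my $demo_in, '<', \$typed or die "cannot open in-memory input: $!";
lookup_loop($demo_in, \*STDOUT);
print "\n";
```

Using a whole word such as `quit` instead of "anything starting with Q" also leaves room for acronyms that begin with Q.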

Re: Code Optimization v5.8.1
by davido (Cardinal) on May 20, 2004 at 17:12 UTC
    At the top of your script, right after the shebang line, put the following three lines:

        use strict;
        use warnings;
        use diagnostics;

    This will do several things. You mentioned that you're (paraphrasing) "so new you don't know what you don't know..."

    'use strict;' will force you to learn what it is you don't know; how to use lexical variables and scoping instead of package globals. It will also protect you from the pitfalls of unwittingly using soft references (symbolic refs -- something to leave alone for a good long while). And strictures will protect you from misspelling variable names.

    'use warnings;' will cause Perl to warn you when you do things that often turn out to be indicative of a common mistake. For example, using a variable only once, trying to rely on the value of a variable that never received a value, etc. You've already got warnings with the -w switch on the shebang line, but you get better control with the pragma instead of the commandline switch.

    'use diagnostics;' will force Perl to become more verbose by explaining what its warnings and error messages mean, in greater detail.

    If you do that, you will suddenly find yourself needing to revamp the script a little. Feel free to ask questions if you need an explanation of the outcome.
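
To make the "protects you from misspelling variable names" point concrete, here is a small sketch that compiles a snippet under strictures and captures the complaint. The snippet and the deliberately misspelled `$acronmy` are made up for illustration; they are not from the original script.

```perl
use strict;
use warnings;

# A tiny program with a deliberate typo: $acronmy instead of $acronym.
my $snippet = q{
    use strict;
    my $acronym = 'HTML';
    print $acronmy;
};

# String eval compiles the snippet; a strict violation is a compile-time
# error, so nothing in the snippet runs and the message lands in $@.
my $ok    = eval "$snippet; 1";
my $error = $ok ? '' : $@;

print "strict says: $error";
```

The message should start with `Global symbol "$acronmy" requires explicit package name`. Without `use strict`, the same typo would quietly print nothing.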


    Dave

Re: Code Optimization v5.8.1
by duff (Parson) on May 20, 2004 at 17:20 UTC
    There's no need to iterate over the keys of the hash to see if the acronym exists, just use the exists function!

        chomp($in = <STDIN>);
        if (exists $acro{$in}) {
            print "$in is the acronym for $acro{$in}\n";
        } else {
            print "Sorry, unknown acronym";
        }

    Also, the last part of your while loop where you're asking if the user wants to try again seems garbled.
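
The two approaches are not only different in speed; they can give different answers. The sketch below contrasts a loop-and-match test in the spirit of the original post with a direct `exists` lookup. The sub names and the two-entry table are assumptions for illustration.

```perl
use strict;
use warnings;

my %acro = (
    FAQ  => "Frequently Asked Questions",
    NASA => "National Aeronautical and Space Administration",
);

# Loop-and-match: succeeds if any key occurs anywhere inside the input.
sub found_by_regex {
    my ($in) = @_;
    for my $ac (keys %acro) {
        return $ac if $in =~ /\Q$ac\E/i;
    }
    return;
}

# Direct lookup: succeeds only when the whole (uppercased) input is a key.
sub found_by_exists {
    my ($in) = @_;
    my $key = uc $in;
    return exists $acro{$key} ? $key : undef;
}

for my $in (qw(nasa nasal)) {
    my $r = found_by_regex($in);
    my $e = found_by_exists($in);
    printf "%-6s regex: %-9s exists: %s\n", $in,
        defined $r ? $r : 'no match',
        defined $e ? $e : 'no match';
}
```

The regex loop reports "nasal" as NASA because the key appears inside the word; the hash lookup does not.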
Re: Code Optimization v5.8.1
by flyingmoose (Priest) on May 20, 2004 at 19:04 UTC
    I agree with simon and combatsquirrel in the way the program should be solved. In addition, you have received some good advice on picking up merlyn's book ("Learning Perl", published by O'Reilly), etc., to improve your style. Being on a boring telecom, I'll give your code a quick review though for some of the misconceptions you would like pointed out. Others can add more. Perl is weird in that "Perlishness" is kind of a zen, so writing baby-perl (as either Merlyn or Larry or Tom put it) is acceptable as you attempt to write more perl in the style of the perl idiom.

    In my opinion (and all things are always subject to debate):

    • You use last and other tools to exit loops when your loop structure could be made simpler to eliminate this hopping around. Code feels a bit goto-ish. Consider subroutines if you want extra clarity.
    • Need to use 'my' to declare variables with scope rather than globally. Definitely enable 'use strict' and 'use warnings'. There are numerous reasons for this, which the books can explain in great detail.
    • Program should probably process STDIN and output to STDOUT given the problem description, rather than being an interactive tool
    • Stylistic, but no need to single quote items on the left side of '=>'
    • If you want to clear a variable, use 'undef $var' instead of assigning it an empty list with '$var=()'. That does leave $var undefined, but it reads as if you were emptying an array. If you write good code, you'll rarely need to clear a variable; scoping means you don't need globals. Globals are usually a sign of a design flaw.
    • Check for exit should probably read /yes|y/i instead of using what looks to be character classes... again, pick up Learning Perl or Programming Perl -- preferably both, as they cover this in depth
    • No reason to assign a string to $final if you can just print a message...again, this is a sign your loop logic needs to be refactored
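
Two of the points above are easy to check directly. This is only a sketch: the sub names are made up, and the anchored pattern `/^(?:y|yes)$/i` is one possible stricter test, not the only reasonable one.

```perl
use strict;
use warnings;

# Assigning an empty list to a scalar leaves the scalar undefined.
my $final = "some text";
$final = ();
print "after \$final = (): ", (defined $final ? "defined" : "undef"), "\n";

# [y|yes] is a character class: it matches any single 'y', '|', 'e' or 's'
# anywhere in the string, so plenty of non-answers slip through.
sub loose_yes { return $_[0] =~ /[y|yes]/i ? 1 : 0 }

# Anchored alternation: only "y" or "yes", in any case.
sub strict_yes { return $_[0] =~ /^(?:y|yes)$/i ? 1 : 0 }

for my $answer (qw(yes Y no nope s)) {
    printf "%-5s loose=%d strict=%d\n",
        $answer, loose_yes($answer), strict_yes($answer);
}
```

"nope" passes the character-class test because it contains an "e", which is probably not what the prompt intended.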

    These are just a bunch of random comments, but you were asking for them, so I thought I would share. As you grow as a Perl programmer you will learn new tweaks and stuff, and your code will grow more elegant and easy to understand. Enjoy the process, as Perl is a really fun language and allows for some neat constructs. Don't be worried that people are pointing out ways you can improve, this monastery is an incredible resource. Most of all, realize Perl is fun!

Re: Code Optimization v5.8.1
by simonm (Vicar) on May 20, 2004 at 18:23 UTC
    Write a program to expand acronyms in its input...

    I think most people would read this as suggesting that it should accept arbitrary text with the acronyms embedded, and modify them in place, not just do a straight lookup.

    Under this interpretation, CombatSquirrel's solution is the only correct one in the thread.

    Here's a related solution using the -p flag:

        #!perl -p

        s/($regex)/$1 ($acronyms{$1})/g;

        BEGIN {
            %acronyms = (
                'HTML' => "Hypertext Markup Language",
                'ICBM' => "Intercontinental Ballistic Missile",
                'EEPROM' => "Electronically-erasable programmable read only memory",
                'SCUBA' => "Self Contained Underwater Breathing Aparatus",
                'FAQ' => "Frequently Asked Questions",
                'LCARS' => "Library Computer And Retrieval System",
                'NASA' => "National Aeronautical and Space Administration"
            );
            $regex = join '|', map quotemeta, sort { length($b) <=> length($a) } keys %acronyms;
        }
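
The sort by descending length in that solution matters whenever one key is a prefix of another, because an alternation takes the first alternative that matches at a given position. A minimal sketch, assuming a made-up table where `IC` (hypothetical entry) is a prefix of `ICBM`, and a helper `expand_with_order` invented for the demonstration:

```perl
use strict;
use warnings;

# Hypothetical table with one key that is a prefix of another.
my %acronyms = (
    IC   => "Integrated Circuit",
    ICBM => "Intercontinental Ballistic Missile",
);

# Build the alternation from an explicit key order so the effect is repeatable.
sub expand_with_order {
    my ($text, @keys) = @_;
    my $alt = join '|', map { quotemeta($_) } @keys;
    $text =~ s/($alt)/$1 ($acronyms{$1})/g;
    return $text;
}

my @short_first = sort { length($a) <=> length($b) } keys %acronyms;
my @long_first  = sort { length($b) <=> length($a) } keys %acronyms;

print expand_with_order("An ICBM.", @short_first), "\n";
print expand_with_order("An ICBM.", @long_first),  "\n";
```

With the short key first, the text comes out as "An IC (Integrated Circuit)BM."; with the long key first, the whole of "ICBM" is expanded.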

      A quick note to bluethundr to explain what's going on here.

      Perhaps the biggest difference between your program and CombatSquirrel's and simonm's is that they do not prompt for input, but instead read it from stdin. A program that prompts for input can only be used in one way. A program that reads text from stdin, does some work (expands acronyms) and writes out to stdout is useful on the unix command line. It can be used just like any other unix tool, and in combination with any of them.

      See The Art of Unix Programming, or any good unix book. It's not more work to learn the unix way along with the perl way - it's less!

      qq

      In this case I often use an alternate method, especially if there are lots of acronyms in the list: I scan the text for things that look like acronyms, and see if they exist in the hash:

          #!/usr/bin/perl -w
          use strict;

          my %acro = (
              HTML => "Hypertext Markup Language",
              ICBM => "Intercontinental Ballistic Missile",
              EEPROM => "Electronically-erasable programmable read only memory",
              SCUBA => "Self Contained Underwater Breathing Aparatus",
              FAQ => "Frequently Asked Questions",
              LCARS => "Library Computer And Retrieval System",
              NASA => "National Aeronautical and Space Administration",
          );

          my $text=<<TEXT;
          Here is a text that includes LOTS of accronyms like HTML, ICBM, SCUBA etc.
          Maybe this should be a FAQ.
          TEXT

          # substitute all acronyms in the text
          $text=~ s{([A-Z0-9]+)}            # find all upper-case words (stored in $1)
                   { $acro{$1}              # is it in %acro?
                       ? "$1 ($acro{$1})"   # yes, expand it
                       : $1                 # no, leave it as is
                   }gex;                    # g means to do it as many times as possible
                                            # e means to execute the code in the replacement part
                                            # x allows multi-line regexp and comments

          print $text;

      Note that if acronyms can include lower case letters, you will need to change the matching regexp to include them.
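
A sketch of what that change might look like, assuming a made-up mixed-case entry (`PhD`) and a hypothetical `expand_acronyms` sub. It widens the character class, adds `\b` word boundaries so only whole words are tested, and uses `exists` so the hash alone decides what counts as an acronym.

```perl
use strict;
use warnings;

# Hypothetical table including a mixed-case entry.
my %acro = (
    HTML => "Hypertext Markup Language",
    PhD  => "Doctor of Philosophy",
);

sub expand_acronyms {
    my ($text) = @_;
    # Every whole word is a candidate; only words present in %acro are expanded.
    $text =~ s{\b([A-Za-z0-9]+)\b}
              { exists $acro{$1} ? "$1 ($acro{$1})" : $1 }gex;
    return $text;
}

print expand_acronyms("Her PhD thesis was written in HTML, not html."), "\n";
```

Hash keys are case-sensitive, so lowercase "html" is left alone here. The cost of the wider pattern is one hash lookup per word in the text.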

Re: Code Optimization v5.8.1
by diotalevi (Canon) on May 20, 2004 at 16:36 UTC
    You have stylistic issues and misconceptions about the operation of perl but the overall idea of checking a hash for membership is the right idea. I'd write something in the same family as what you posted. Is this what you were looking for? A confirmation of whether you had the concept down?
      Hey man. Thanks for your input. Sure, that's (partly) what I'm looking for. But the thing is I'm so new to perl that I literally "don't know what it is I don't know". honestly! Could you possibly, or rather would you care to be more specific about my "misconceptions about the operation of perl"? But I appreciate the confirmation that I was headed in the right general direction. So, if you don't want to write a book about my "misconceptions" I can certainly understand.

      At any rate, THANKS!
        The short answer is that you'd do well to pick up one of the well known and well written books like merlyn's Learning Perl. I'll see if I can follow up with some specifics later.
Re: Code Optimization v5.8.1
by Nkuvu (Priest) on May 20, 2004 at 17:20 UTC
    I rewrote this fairly quickly, putting comments inline. It works, but the thing to consider is that a lot of the changes were stylistic changes. Other monks may have cleaner/better solutions.

        #!/usr/bin/perl -w
        use strict;

        my $in = ();       # usr input
        my $exit = 'n';    # done yet?

        my %acro = (
            'HTML' => "Hypertext Markup Language",
            'ICBM' => "Intercontinental Ballistic Missile",
            'EEPROM' => "Electronically-erasable programmable read only memory",
            'SCUBA' => "Self Contained Underwater Breathing Aparatus",
            'FAQ' => "Frequently Asked Questions",
            'LCARS' => "Library Computer And Retrieval System",
            'NASA' => "National Aeronautical and Space Administration"
        );

        # I personally don't like empty conditions on
        # while loops. This changed the logic a bit, so
        # I changed the exit prompt as well.
        while ($exit !~ /^y/i) {
            print "\nPlease enter an acronym: ";
            chomp($in = <STDIN>);
            print "\n";

            # I'm lazy, and don't want to worry about entering
            # uppercase acronyms. The uc() changes input to
            # uppercase automagically. Also, this approach
            # doesn't loop over every key every time. That's
            # what a hash is best for -- so you don't have to loop
            # over everything.
            if (exists $acro{uc($in)}) {
                # No need to assign to a temp variable, btw.
                print "$in is the acronym for ", $acro{uc($in)}, " \n\n";
            }
            else {
                # I also echo input so if the user mistyped the acronym they
                # may see the error.
                print "Sorry. But $in is not an acronym that I recognize!\n";
            }

            print "\nFinished? ";
            # Your original script didn't have any way to input $exit, so
            # it went into an infinite loop (I am guessing that was a
            # transcription error)
            chomp($exit = <STDIN>);
        }

    Ask if any of this doesn't make sense.

Re: Code Optimization v5.8.1
by CombatSquirrel (Hermit) on May 20, 2004 at 17:49 UTC
    In addition to what the others said, I would also like to point out that you can build RegExes "on the fly" and get rid of that foreach (keys ... loop in your program. I solved the original problem because it is easier ;-):

        #!perl
        use strict;
        use warnings;

        my %acronyms = (
            'HTML' => "Hypertext Markup Language",
            'ICBM' => "Intercontinental Ballistic Missile",
            'EEPROM' => "Electronically-erasable programmable read only memory",
            'SCUBA' => "Self Contained Underwater Breathing Aparatus",
            'FAQ' => "Frequently Asked Questions",
            'LCARS' => "Library Computer And Retrieval System",
            'NASA' => "National Aeronautical and Space Administration"
        );

        my $regex;
        {
            my $temp = join '|', map quotemeta, keys %acronyms;
            $regex = qr/($temp)/o;
        }

        while (<>) {
            s/$regex/$1 ($acronyms{$1})/g;
            print;
        }

    Anyways, you should always use strict; no matter what; it helps a lot with syntax problems.

    Hope this helped.
    CombatSquirrel.
    Entropy is the tendency of everything going to hell.
      That is a useless use of /o.
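
For readers wondering why /o is redundant there, a short sketch of the difference: `qr//` already fixes the pattern at the moment it is evaluated, whereas /o on an ordinary match tells Perl to interpolate and compile the pattern only the first time that match runs. The sub name `matches_once` and the sample strings are made up for illustration.

```perl
use strict;
use warnings;

# qr// captures the pattern when it is evaluated; changing $temp afterwards
# has no effect on the compiled object, with or without /o.
my $temp  = 'HTML|FAQ';
my $regex = qr/($temp)/;
$temp = 'SCUBA';

print "qr// still matches FAQ\n" if "a FAQ" =~ $regex;

# What /o does on a plain match: the pattern is built from $pattern on the
# first call only, and every later call reuses that first pattern.
sub matches_once {
    my ($pattern, $text) = @_;
    return $text =~ /$pattern/o ? 1 : 0;
}

print matches_once('HTML', 'HTML'), "\n";    # pattern is fixed as 'HTML' here
print matches_once('FAQ',  'FAQ'),  "\n";    # still tested against 'HTML'
```

So /o is a trap when the variable can change, and adds nothing when the pattern is already a `qr//` object.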
      Ah, that's what I was going to say!

      So here's something a little different:

          local $" = ';';
          system qq( sed '@{[ map "s/\\<$_\\>/$_ ($acronyms{$_})/g", keys %acronyms ]}' );

      Of course, if the acronym list is long, the above will smack into the command line length limit; in that case, it's easy enough to write a sed script file.
      and a useless use of $1 ;)
Re: Code Optimization v5.8.1
by poqui (Deacon) on May 20, 2004 at 22:22 UTC
    Hey bluethundr,
    Just a side note on the other problem you mentioned: being away from your home computer and Linux, and not wanting to "footprint" on your hosts' machine...

    I have had *great* success with Knoppix which runs completely off the CD leaving NO FOOTPRINT on the host machine; and it includes perl.

    The X windows settings may take some fiddling on older machines, but it runs like a dream on any of the newer (<2 years old) machines I have shoved it into.
      Guys...I just have to say this. I am silently amazed. This thread is more than I was hoping for and truly a wonderful thing. The monk's site, in my thus far brief experience, proves to exemplify the finest aspects of the net. Which in my mind is that of allowing people to form their own communities and then to take that potential and leverage it to one's own personal growth and benefit. Amazing.

      I have to say that a good deal of what was written was a bit beyond my ken. That is a good thing. It gives me something to strive for. But a good amount was also clearly discernible to my beginner's mind. I found an amazing amount of insight that I could only find amongst monks' wisdom that was, nevertheless, within my beginner's grasp.

      I wish that I could address each of you individually. But that of course is impossible, especially given the fact that I'm typing this while the boss isn't looking! ;D

      But in the meantime I just wanted to express my gratitude and say that I will STUDY what you all have written, weigh what I believe to be the pros and cons of this exhilarating discussion and take the vast majority of your suggestions into consideration; especially the reading suggestions!

      But right now for example, though I do have some experience with C && C++ - a thing which I find makes this journey a bit easier - I am still only, for example JUST learning about how to create functions in Perl and got a handle on basic regexing only a few weeks ago (among a great many other things that I don't yet know!). So this process may take some time. That's okay. I'm a patient fellow. One virtue to balance out a great multitude of vice! I'm going to take the next year or so to improve my beginner's status, but I'm quite sure I'll enjoy writing Perl forever. Or at least until the inevitable forward ho' march of progress forces us to adopt more advanced technology!

      But when I have more time I will try to address the individual questions I may have as a result of this (formidable!) thread. Until that can happen, I just wanted to say...

      DA MONKS ROCK!!!

Node Type: perlquestion [id://355000]
Approved by Happy-the-monk
Front-paged by QM