http://qs321.pair.com?node_id=1211325

thenextfart has asked for the wisdom of the Perl Monks concerning the following question:

I am currently learning Perl by trial and error, and I have a Python background. The purpose of the program below, however, is to find out what is idiomatic Perl and what is not. I don't want to end up writing Python in Perl. This is my program:
use strict;
use warnings;

print "RegEx Engine 1.0\n________________\n";
print "Gimme a string: ";
my $str = <STDIN>;
print "Gimme a RegEx: ";
my $pattern = <STDIN>;
my $answer = eval("\"$str\" =~ $pattern");
if ($answer) {
    print "Yes!";
} else {
    print "No.";
}
print "\nkthxbye\n";
Is this good/idiomatic/bad/ugly/encouraged/discouraged Perl? (Note: I am using Perl 5)

Replies are listed 'Best First'.
Re: Idiomatic Perl?
by haukex (Archbishop) on Mar 20, 2018 at 14:20 UTC

    I see two things in that code that I would suggest to improve:

    • Don't use eval - it allows execution of arbitrary Perl code, and getting the quoting of interpolated code right is tricky (try entering one double quote for $str), and so it should only be used sparingly. In this case it is not needed, just say: my $answer = $str =~ /$pattern/; - however note that then, your regexes should not be entered with the surrounding slashes.
      (If you do use eval someday, make sure to do proper error checking, with something like eval "$code; 1" or warn "eval failed: $@";)
    • Both $str and $pattern, as read from <STDIN>, will have a newline on the end, which you should chomp off, e.g. chomp( my $str = <STDIN> );. See also the Basic debugging checklist, which suggests using a module like Data::Dumper or Data::Dump to look at what variables actually contain. If you use the former, I recommend setting $Data::Dumper::Useqq=1;.
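
    The two points above can be sketched together as follows. This is only an illustration, not the poster's code: the in-memory filehandle $in and the sample input are assumptions standing in for STDIN so that the snippet runs without a terminal.

```perl
use strict;
use warnings;
use Data::Dumper;
$Data::Dumper::Useqq = 1;    # show control characters such as "\n" visibly

# In-memory handle standing in for STDIN (an assumption so this runs anywhere).
my $typed = "foobar\no+b\n";
open my $in, '<', \$typed or die "Cannot open in-memory handle: $!";

my $raw = <$in>;
print Dumper($raw);                  # the trailing newline is still there

chomp( my $str     = $raw );         # copy, then strip the newline
chomp( my $pattern = <$in> );        # read and strip in one statement

my $answer = $str =~ /$pattern/;     # plain match, no string eval needed
print $answer ? "Yes!\n" : "No.\n";
```

    With Useqq set, the dump shows the newline as an escape sequence, which makes it obvious why the unchomped input would not match as expected.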

    In Perl, as opposed to Python's "there should be one - and preferably only one - obvious way to do it", TIMTOWTDI - There Is More Than One Way To Do It. While there are certainly 20 different ways to write the code you showed, don't worry about that too much - just keep your eye out for best practices like the ones I mentioned above, and otherwise enjoy learning Perl :-) We'll be happy to help.

    Made a few minor edits.

      Is it okay to just do my $str = chomp(<STDIN>);?

        No, because chomp is a bit special: it modifies its argument(s) and returns the total number of characters removed from all its arguments. The code you showed would fail because chomp wants to modify its arguments, but can't modify <STDIN> itself.

        The reason chomp( my $str = <STDIN> ); works is because a scalar assignment in Perl like ( my $str = <STDIN> ) is modifiable (an "lvalue"), as described in Assignment Operators: "Modifying an assignment is equivalent to doing the assignment and then modifying the variable that was assigned to."

        If you want to be a little bit more verbose, what you can do is:

        my $str = <STDIN>;
        chomp($str);
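
        A minimal sketch of the difference between chomp's return value and the chomped variable. The in-memory handle $fh and the sample strings are assumptions used so that the snippet runs without a terminal.

```perl
use strict;
use warnings;

# Simulate user input with an in-memory filehandle (an assumption made
# so the sketch runs without a terminal).
my $input = "hello\n";
open my $fh, '<', \$input or die "Cannot open in-memory handle: $!";

chomp( my $str = <$fh> );        # assign first, then chomp the variable
print "str is '$str'\n";

my $line    = "world\n";
my $removed = chomp($line);      # return value is the count removed, not the string
print "removed $removed character(s), line is now '$line'\n";
```

        So assigning the result of chomp to $str would store a count rather than the text that was read.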

        Try it! What does it return?

        If it's not what you want, another approach might be to call the chomp on the result of the assignment:

        chomp( my $str = <STDIN> );

        Hope this helps!


        The way forward always starts with a minimal test.

        Depends on what you want to have in $str. Read chomp to learn about the return value of chomp.

Re: Idiomatic Perl?
by choroba (Cardinal) on Mar 20, 2018 at 16:05 UTC
    When using eval as a poor man's try/catch, don't use the eval EXPRESSION form, use the eval { CODE } one:
    my $answer;
    eval { $answer = ($str =~ $pattern); 1 }
        or warn "Error in the pattern: $@";
    ($q=q:Sq=~/;[c](.)(.)/;chr(-||-|5+lengthSq)`"S|oS2"`map{chr |+ord }map{substrSq`S_+|`|}3E|-|`7**2-3:)=~y+S|`+$1,++print+eval$q,q,a,
Re: Idiomatic Perl?
by mxb (Pilgrim) on Mar 20, 2018 at 17:20 UTC

    As haukex has already pointed out, it's probably best to avoid eval in this case, as it could be manipulated to run arbitrary Perl code in your example.

    hippo has also made a valid suggestion about using the ternary if conditional to make the code easier to read.

    Personally, I'd use both of these suggestions, add in the use of say rather than print when appropriate and move the display prompt and return input into a separate function, but that might be over-engineering this learning exercise.

    Putting it all together gives the following:

    use strict;
    use warnings;
    use 5.016;

    sub prompt_read {
        print shift;
        chomp( $_ = <STDIN> );
        return $_;
    }

    say "RegEx Engine 1.0\n________________";
    my $str     = prompt_read("Gimme a string: ");
    my $pattern = prompt_read("Gimme a RegEx: ");
    say $str =~ /$pattern/ ? "Yes!" : "No!";
    say "kthxbye";
      I don't want to start any "style wars". But my natural inclination would have been to code prompt_read() something like this:
      sub prompt_read
      {
          my $prompt = shift;
          print $prompt;
          my $response = <STDIN>;
          $response =~ s/^\s*|\s*$//g;
          return $response;
      }

      Note: the "standard idioms" below can also be used, but with recent Perls the more complex regex above is slightly faster.

      $response =~ s/^\s*//;  # these 2 statements do the same
      $response =~ s/\s*$//;  # but slightly slower
      Some Comments:
      • I prefer the bracketing style shown for Perl. For Java I like the more compact form because of all of the little "getter and setter" subs which can take up a lot of space.
      • In Perl, I look for the first line of the sub to understand the input args. Perl doesn't have prototypes like C but this appears to work just fine.
      • Giving a name to a variable is "cheap", "very cheap" in terms of execution speed. I assigned $prompt as the sub's input value. There is nothing "un-Perl" about that at all. Likewise making a new variable $response for the input is "very cheap". I don't assign values to $_ which I consider to "belong to Perl". In for() or foreach() loops I assign my own name for the loop variable rather than $_. This prevents problems in nested foreach() loops as well as providing some more documentation, provided of course that the loop variable name makes sense!
      • Idiomatic Perl shows up in the substitution line. In Python, you have to explicitly import stuff to use regex and then explicitly compile the regex, then use it in another statement. In Perl, regex is built into the language. Normal CLI input conventions would be to strip all leading and trailing spaces as well as the line ending. That one statement does both. In a loop, Perl will take care of compiling the regex and not doing that step more than necessary (in most cases).
      It would take a more complex example for the power of Perl vs Python to be demonstrable.

      One point that I have is that well-written idiomatic Perl does not have to be cryptic.

      I don't claim that my style above is better than other styles. This is just one example.

      From a coding standpoint, I did like the idea of splitting out the function of sending a prompt and getting a response. I wrote a more sophisticated version of this awhile back. My expanded routine took a prompt,regex,err_message as input. This handled blank lines, did retries and such things.
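
      A rough sketch of what such an expanded routine might look like. This is not the routine described above: the name prompt_validated, the argument order, the optional input handle and the retry limit of 3 are all assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: name, argument order and retry limit are assumptions.
sub prompt_validated {
    my ($prompt, $valid_re, $err_message, $in, $max_tries) = @_;
    $in        //= \*STDIN;      # default to the terminal
    $max_tries //= 3;

    for my $try (1 .. $max_tries) {
        print $prompt;
        my $response = <$in>;
        return unless defined $response;         # end of input
        $response =~ s/^\s+|\s+$//g;             # trim both ends, incl. newline
        next if $response eq '';                 # blank line: just ask again
        return $response if $response =~ $valid_re;
        print "$err_message\n";
    }
    return;                                      # gave up after $max_tries
}

# Demo with an in-memory handle instead of a terminal.
my $fake_input = "\nabc\n  42  \n";
open my $fh, '<', \$fake_input or die "Cannot open in-memory handle: $!";
my $number = prompt_validated("Enter a number: ", qr/^\d+$/,
                              "Not a number, try again.", $fh);
print "\nGot: ", (defined $number ? $number : 'nothing'), "\n";
```

      In this sketch a blank line also counts against the retry limit; whether it should is a design choice.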

      If you're gonna use globals you're supposed to localize
        Can you properly localize $_? I vaguely recall Rafael Garcia-Suarez mentioning something about "having 'local $_' not working as intended", but never investigated what that means.

        Huh, I didn’t know that! I’ve only ever assigned to $_ in throw away scripts and one liners. Thanks for that, I learned something new.
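
        Why assigning to the global $_ inside a sub can bite is easiest to see in a small sketch. The subs uses_global and uses_lexical are made-up names for illustration; the first mimics a sub that assigns to $_, the second uses a lexical as suggested earlier in this thread.

```perl
use strict;
use warnings;

my @names = ('alice', 'bob');

# Assigns to the global $_, like a sub that reads its input into $_.
sub uses_global  { $_ = 'clobbered'; return $_ }

# Same job with a lexical; the caller's $_ is never touched.
sub uses_lexical { my $value = 'harmless'; return $value }

uses_lexical() for @names;
print "after lexical: @names\n";    # the array is unchanged

uses_global() for @names;           # $_ is aliased to each element here
print "after global:  @names\n";    # both elements were overwritten
```

        Because the loop aliases $_ to each array element, the assignment inside uses_global writes straight into the caller's data.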

Re: Idiomatic Perl? -- eval qr/ /
by Discipulus (Canon) on Mar 20, 2018 at 20:37 UTC
    Hello thenextfrater and welcome to the monastery and to the wonderful world of Perl!

    haukex and choroba are both wise and right about possibly dangerous uses of eval, but since you are asking the user to enter a regex you must be sure that it is a valid one; in fact your code accepts an invalid regex:

    Gimme a string: a
    Gimme a RegEx: a(
    Unmatched ( in regex; marked by <-- HERE in m/a( <-- HERE / at .. line .. <STDIN> line 2.

    So sometimes eval is useful, and imho one of these cases is compiling a regex (see qr// in the Perl documentation):

    use strict;
    use warnings;

    print "Gimme a string: ";
    my $str = <STDIN>;
    chomp $str;
    print "Gimme a RegEx: ";
    my $pattern = <STDIN>;
    chomp $pattern;

    my $rex;
    { # a block to localize $@
        local $@;
        # eval the regex
        eval{ $rex = qr/$pattern/ };
        # die if errors
        die "error compiling regex!" if $@;
    }
    # end of the localizing block
    print $str =~ /$rex/ ? "Yes!" : "No.";

    So do not avoid eval because it can be dangerous: know it and profit from it! Obviously you do not use a bazooka to kill a mosquito.. do you?
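
    The same idea can be wrapped in a small sub that reports the error instead of dying, combining the block eval with the trailing "1" check mentioned elsewhere in this thread. This is only a sketch; compile_pattern is a hypothetical name, not an existing function.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (compiled regex, undef) on success
# or (undef, error text) when the pattern does not compile.
sub compile_pattern {
    my ($pattern) = @_;
    my $rex;
    local $@;                                    # do not clobber the caller's $@
    my $ok    = eval { $rex = qr/$pattern/; 1 }; # block eval: nothing is run as code
    my $error = $ok ? undef : $@;
    return ($rex, $error);
}

for my $pattern ('a+b', 'a(') {
    my ($rex, $error) = compile_pattern($pattern);
    if (defined $rex) {
        my $verdict = 'aab' =~ $rex ? "matches" : "does not match";
        print "'$pattern' compiled; 'aab' $verdict\n";
    }
    else {
        print "'$pattern' is not a valid regex: $error";
    }
}
```

    Returning the error text lets the caller decide whether to die, warn, or prompt again.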

    PS: about arbitrary code execution, I was tempted to add that a regex can also lead to code execution, but Perl is wise here and does not allow it by default in a pattern interpolated at runtime:

    perl -E "say 'match' if 'ec' =~ /$ARGV[0]/" "ec(?{print 'EVAL!';})"
    Eval-group not allowed at runtime, use re 'eval' in regex m/ec(?{print 'EVAL!';})/ at -e line 1.
    L*

    There are no rules, there are no thumbs..
    Reinvent the wheel, then learn The Wheel; may be one day you reinvent one of THE WHEELS.
Re: Idiomatic Perl?
by haukex (Archbishop) on Mar 21, 2018 at 09:07 UTC
    find out what is idiomatic Perl and what is not

    Here's my take on how to make that a bit more idiomatic/modern/whatever:

    use warnings;
    use strict;
    use IO::Prompt;

    print "RegEx Engine 1.0\n",
          "________________\n";
    my $str     = prompt("Gimme a string: ");
    my $pattern = prompt("Gimme a RegEx: ");
    print $str=~$pattern ? "Yes!" : "No.", "\n";
    print "kthxbye\n";

    (As an alternative to IO::Prompt, you could use ExtUtils::MakeMaker 'prompt'; - ExtUtils::MakeMaker is a core module, even though it has a completely different purpose. Or, there is Term::ReadLine.)

    I don't want to end up writing Python in Perl.

    But why not? ;-)

    use Acme::Pythonic;
    use warnings
    use strict
    use IO::Prompt

    print("RegEx Engine 1.0\n________________\n")
    while 1:
        my $str = prompt("Gimme a string: ")
        unless length($str):
            last
        my $pattern = prompt("Gimme a RegEx: ")
        if $str=~$pattern:
            print("Yes!\n")
        else:
            print("No.\n")
    print("kthxbye\n")
      But why not ;-)
      Because: most of the benefits learnt while learning a postmodern language such as Perl are lost when we try to implement modern languages like Python. Also, whitespace is starting to appear in my dreams.

        stevieb is right, I wasn't being (entirely) serious. Modules in the Acme:: namespace are generally meant to be tongue-in-cheek, silly, and/or not for production use. I also gave that example to show off some of the power of Perl, in this case its ability to rewrite source code on the fly.

        I do believe haukex may have been being facetious (although I could be wrong).

        Also, whitespace is starting to appear in my dreams

        haukex pointed out one Acme module, so here's another for your whitespace dreams: Acme::Bleach ;)

        By the way, I completely agree with your reply about coding in one language while trying to retain and use the particulars of another.

Re: Idiomatic Perl?
by karthiknix (Sexton) on Mar 20, 2018 at 15:45 UTC

    The code seems to work, but it is better done the Perl way, as below.

    use strict;
    use warnings;

    print "RegEx Engine 1.0\n________________\n";
    print "Gimme a string: ";
    my $str = <STDIN>;
    print "Gimme a RegEx: ";
    my $pattern = <STDIN>;
    my $answer = ($str =~ /$pattern/);
    if ($answer) {
        print "Yes!";
    } else {
        print "No.";
    }
    print "\nkthxbye\n";

    The change is in the line "my $answer = ($str =~ /$pattern/);", which looks for the pattern held in $pattern within the string $str and yields a true or false result.

      my $answer = ($str =~ /$pattern/);
      if ($answer) {
          print "Yes!";
      } else {
          print "No.";
      }

      I think the ternary here would be even more idiomatic. You can replace those 6 lines with just one:

      print $str =~ /$pattern/ ? "Yes!" : "No.";
Re: Idiomatic Perl?
by QM (Parson) on Mar 21, 2018 at 14:51 UTC
    If you're just learning, or have complete trust over what goes into the regex, skip the rest of this.

    However, someone should mention that taking regex input from users is fraught with security issues. If you don't have complete trust over what's read into the regex, you should consider something like re::engine::RE2, which limits some constructs and mitigates denial-of-service.

    I have only touched the tip of the iceberg; you should search for "perl regex security" and similar to be more bullet-proof-resistant.

    -QM
    --
    Quantum Mechanics: The dreams stuff is made of