
Never-to-use Perl features?

by Juerd (Abbot)
on May 12, 2003 at 22:37 UTC ( #257590=perlmeditation )

It happens all the time. Someone asks about a language feature and the general consensus is "don't use that!". Most recent example, and inspiration for this node: the regex "o" modifier.

Perl has its weaknesses and it's good we realize them. I can think of only a few myself:

  • V-strings
  • Pseudo-hashes
Including the /o-modifier, the list now has 3 language features that seemed a nice idea at first, but aren't really usable now.
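For concreteness, a minimal sketch of why /o earns a place on such a list (the sub and variable names are mine, not from the thread): with /o, the interpolated pattern is compiled exactly once, so later values of the variable are silently ignored.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# With /o, perl compiles the interpolated pattern the first time this
# match op runs and never looks at $pat again.
sub match_once {
    my ($string, $pat) = @_;
    return $string =~ /$pat/o;
}

print match_once("foo", "foo") ? "match\n" : "no match\n";  # compiles /foo/
print match_once("bar", "bar") ? "match\n" : "no match\n";  # still /foo/: no match
```

The second call looks like it should obviously match, which is exactly the sort of action-at-a-distance that gets a feature onto a "never use" list.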

This is your opportunity to curse in church :)

Juerd # { site => '', plp_site => '', do_not_use => 'spamtrap' }

Title edit by tye
Title edit by me

Replies are listed 'Best First'.
Re: Never-to-use Perl features?
by dws (Chancellor) on May 12, 2003 at 23:52 UTC
    The way Perl does objects is definitely a mixed bag. OO Perl is flexible once you've climbed the learning curve, which implies that you have the commitment to get there. The strange syntax, however, is enough to scare a lot of non-Perl people away (and into the waiting arms of Python).

    That, and $| side-effecting the currently selected file handle, are my two gripes.
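A short sketch of that $| gripe (the log filename is made up): setting $| affects whichever handle is currently select()ed, so autoflushing anything other than STDOUT requires the select dance.

```perl
#!/usr/bin/perl
use strict;
use warnings;

open my $log, '>', "demo_$$.log" or die "open: $!";

my $old = select($log);  # make $log the "currently selected" handle
$| = 1;                  # autoflush... for $log only, as a side effect
select($old);            # and remember to switch back, or later bare
                         # prints go to $log instead of STDOUT

print $log "flushed immediately\n";
close $log;
unlink "demo_$$.log";
```

(IO::Handle's $log->autoflush(1) hides the same dance behind a method call.)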

    Oh, and constants. The use constant kludge, which produces second-class constants that misbehave if you stare at them the wrong way, is a real pain sometimes.
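Two of the classic ways use constant constants "misbehave if you stare at them the wrong way", sketched (the names are mine):

```perl
#!/usr/bin/perl
use strict;
use warnings;

use constant DEBUG => 0;

# 1. Constants are really subs, so they don't interpolate into strings:
my $msg = "debug level is DEBUG";      # literal text "DEBUG", not 0

# 2. To the left of => (or as a hash key) the bareword is quoted, not called:
my %cfg = ( DEBUG   => 'oops'   );     # key is the string "DEBUG"
my %ok  = ( DEBUG() => 'better' );     # parens force the call: key is "0"
```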

Re: Never-to-use Perl features?
by Abigail-II (Bishop) on May 12, 2003 at 23:09 UTC
    • Objects.
    • Prototypes.
    • $*, $[, $#
    • dump
    • Having both 2-arg select and 4-arg select.
    • Global $/, $\, $,, $| instead of per handle settings.
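The usual defence against those globals being per-process rather than per-handle is to localize them; the classic slurp idiom, as a sketch (the sub name is mine):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Because $/ is process-global, changing it without local would silently
# affect every subsequent readline anywhere in the program.
sub slurp {
    my ($path) = @_;
    open my $fh, '<', $path or die "open $path: $!";
    local $/;            # undef $/ = slurp mode, restored when we return
    return <$fh>;
}
```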


      They are all problematic, but do they fall into the "never use this feature" category? (Granted, you're not someone I'd expect to put anything there at all.)

      Makeshifts last the longest.


      Are these blessed values or something else? And why are they wrong?

      Juerd # { site => '', plp_site => '', do_not_use => 'spamtrap' }

        They are wrong because you have only one instance variable. Everyone would think it odd if subroutines could have only one variable - it would still be workable, of course; you'd just use a hash and put all your variables in that. Perl would be mocked for having such silly subroutines, and it would have been fixed years ago.

        Why people accept such silliness with objects, I don't know. But then, people accept Microsoft Windows as well.


      You don't like prototypes? Why's that?

        sub foo ($$) { }
        my @bar = (1, 2);

        foo (1, 2);   # Fine.
        foo (@bar);   # Compile time error.


Re: Never-to-use Perl features?
by tilly (Archbishop) on May 12, 2003 at 23:23 UTC
    Add in reset, a dynamically scoped $_ (which is always in $main::), our, $Carp::CarpLevel, a poorly designed (the mistake being a misunderstanding of $Carp::CarpLevel), etc.
Re: Never-to-use Perl features?
by grantm (Parson) on May 13, 2003 at 00:17 UTC
    Including the /o-modifier, the list now has 3 language features that seemed a nice idea at first, but aren't really usable now.

    I've read the thread you quoted and I'm obviously missing something because in my experience /o works exactly as it should:

    use Benchmark;

    my @words = map { chomp; $_ } (<DATA>);

    my $alpha = '[a-zA-Z]';
    my $alnum = '[a-zA-Z0-9]';

    timethese(2000, {
        'Without /o' => \&testsub,
        'With /o'    => \&testsubo,
    });

    sub testsub {
        my $count = 0;
        foreach (@words) {
            $count++ if(/^$alpha$alnum+$/);
        }
        return $count;
    }

    sub testsubo {
        my $count = 0;
        foreach (@words) {
            $count++ if(/^$alpha$alnum+$/o);
        }
        return $count;
    }

    __DATA__
    1500 words one per line

    Which on my system shows that the /o version is about three times faster than the version without.

    Using variables to give meaningful names to chunks of a regex is very useful for improving the readability, maintainability and reusability of the code. Without /o it would be inefficient. What is it about /o that makes it "not really usable"?

    Update: I added this to the test script:

    my $qr = qr/^$alpha$alnum+$/;

    [snip]

    sub testsubqr {
        my $count = 0;
        foreach (@words) {
            $count++ if(/$qr/);
        }
        return $count;
    }

    sub testsubqro {
        my $count = 0;
        foreach (@words) {
            $count++ if(/$qr/o);
        }
        return $count;
    }

    The qr// approach seems to be about 20% slower than /o and qr// + /o seems to be about the same as /o alone.

      At last, someone else sees the benefits of /o.

      Then, the counter argument is: Use qr// which works and removes the need (most of the time) for /o...

      until you combine a couple of chunks pre-compiled with qr// into another chunk with qr//. Then the /o seems (sometimes at least) to show benefits again.
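The kind of composition being described, sketched for concreteness (the chunk names follow grantm's benchmark above):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Pre-compiled chunks...
my $alpha = qr/[a-zA-Z]/;
my $alnum = qr/[a-zA-Z0-9]/;

# ...combined into another compiled chunk. No /o needed here: the
# interpolation work happens once, at qr// time, not on every match.
my $word = qr/^$alpha$alnum+$/;

print "Perl5" =~ $word ? "match\n" : "no match\n";
```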

      I wish I could truly tie down when and why qr/.../o produces these benefits and when not.

      Or is it all just a figment of my imagination?

      The counter-argument that you shouldn't use /o because you might forget you'd used it sometime doesn't cut much ice with me.

      Examine what is said, not who speaks.
      "Efficiency is intelligent laziness." -David Dunham
      "When I'm working on a problem, I never think about beauty. I think only how to solve the problem. But when I have finished, if the solution is not beautiful, I know it is wrong." -Richard Buckminster Fuller

        Hmm.. yup. After much fiddling with the tests, I couldn't get qr to match /o. My conclusion is:
        if you've got a regex containing a variable that will not change, use /o.
        if you need to loop through multiple regexes, then you can't use /o, so compile them with qr.
        the alternative is to have multiple tests using /o, which I think would still be faster.. ok, off to test..

        again, hmmm..

        use Benchmark;

        my @words = map { chomp; $_ } (<DATA>);

        my $alpha = '[a-zA-Z]';
        my $alnum = '[a-zA-Z0-9]';
        my @qr    = ( qr/^$alpha/, qr/$alnum+$/ );

        timethese(500000, {
            'With /o' => \&testsub,
            'qr'      => \&testsubqr,
        });

        sub testsub {
            my $count = 0;
            foreach (@words) { $count += testsubb($_); }
        }

        sub testsubb {
            my $word = $_[0];
            return unless $word =~ /^$alpha/o;
            return unless $word =~ /$alnum+$/o;
            return 1;
        }

        sub testsubqr {
            my $count = 0;
            foreach (@words) { $count += testsubbqr($_); }
        }

        sub testsubbqr {
            my $word = $_[0];
            foreach (@qr) {
                return unless $word =~ $_;
            }
            return 1;
        }

        Benchmark: timing 500000 iterations of With /o, qr...
        With /o: 12 wallclock secs (13.13 usr + 0.01 sys = 13.14 CPU) @ 38051.75/s (n=500000)
             qr: 18 wallclock secs (18.90 usr + 0.01 sys = 18.91 CPU) @ 26441.04/s (n=500000)
        Benchmark: timing 100000 iterations of With /o, qr...
        With /o:  3 wallclock secs ( 2.64 usr + 0.00 sys =  2.64 CPU) @ 37878.79/s (n=100000)
             qr:  4 wallclock secs ( 3.78 usr + 0.00 sys =  3.78 CPU) @ 26455.03/s (n=100000)

        I think I'm done defending qr..

        ok, back to work.. nothing to see here...

        update: adding these tests shows that the loop is adding more time than qr saves, but /o is still quicker.. so where does that leave us?

        sub testsubqr2 {
            my $count = 0;
            foreach (@words) { $count += testsubbqr2($_); }
        }

        sub testsubbqr2 {
            my $word = $_[0];
            return unless $word =~ $qr[0];
            return unless $word =~ $qr[1];
            return 1;
        }

        Benchmark: timing 100000 iterations of With /o, qr, qr2...
        With /o: 3 wallclock secs ( 2.63 usr + 0.00 sys = 2.63 CPU) @ 38022.81/s (n=100000)
             qr: 4 wallclock secs ( 3.76 usr + 0.00 sys = 3.76 CPU) @ 26595.74/s (n=100000)
            qr2: 3 wallclock secs ( 2.98 usr + 0.00 sys = 2.98 CPU) @ 33557.05/s (n=100000)



      The speed differences are small enough to be irrelevant for almost all real world situations. The important difference is that qr// is a clear and fairly obvious modifier, while /o does something totally bizarre, i.e. the first time a regex with /o runs it will do one thing and every other time it will do something different. This frequently wreaks havoc in persistent environments like mod_perl, where people forget that /o regexes will not get reset after their script finishes.
        the first time a regex with /o runs it will do one thing and every other time it will do something different

        Actually every other time it will do the same thing despite many people expecting it to do something different :-)

        All flippancy aside though, your point re persistent environments is a good one that I hadn't considered. Mind you, persistent environments wreak all sorts of havoc with file scoped lexicals too but that doesn't mean they're inherently a bad idea - it just means you need to use them with caution.

      Um, I guess it helps if you know how to use qr// properly. You don't write /$qr/ if you want it fast, you write $qr!

      I get qr// faster than even /o. Though, your benchmark is testing such micro operations that the results can be rather unstable. The most "likely looking" result I got (early on) was:

                    Rate Without /o  With /o  With qr
      Without /o 42725/s         --     -26%     -39%
      With /o    57636/s        35%       --     -18%
      With qr    70185/s        64%      22%       --
      But a more typical result was:
             Rate  2//  2/o  1//  1/o  1qr  2qr
      2//  31.4/s   --  -0%  -0%  -1% -25% -26%
      2/o  31.5/s   0%   --  -0%  -0% -25% -25%
      1//  31.6/s   0%   0%   --  -0% -25% -25%
      1/o  31.6/s   1%   0%   0%   -- -25% -25%
      1qr  41.9/s  33%  33%  33%  33%   --  -1%
      2qr  42.2/s  34%  34%  34%  34%   1%   --
      Yes, that's right, /o was so close that it even ran slower than // on occasion.

      Note that I didn't change any of the code in the subroutines being benchmarked between these two runs (I did change the data used several times, but even other runs with the same data never gave me results very similar to that first result above). It is just that Benchmark has to do some interesting work to try to measure such micro operations and so can easily show differences of around 20% between successive runs of identical code. That is why I usually make sure I have the benchmarking code run each case twice (otherwise you are rather likely to give a 20% disadvantage to the case that gets run first, for example).

      Also, always verify that all of your benchmarked cases are doing the same thing:

      Without /o: 2200   With /o: 2200   With qr: 2200

      So I stand by my assertion that you should never use /o!

                      - tye

        And I recall from looking at the generated optree that =~ /$qr/ and =~ $qr are 100% identical. I don't think anyone here is actually measuring any real difference.

        You're right, publishing a benchmark without the test data is pretty meaningless. Here's a revised version that uses the individual words output from 'perldoc -t perlfunc' as the test data.

        #!/usr/local/bin/perl -w
        use Benchmark;

        my (@words, $count);
        open(TESTDATA, "perldoc -t perlfunc|") || die $!;
        while(<TESTDATA>) {
            chomp;
            push @words, /(\S+)/g;
        }
        print @words . " words\n";

        my $alpha = '[a-zA-Z]';
        my $alnum = '[a-zA-Z0-9]';
        my $qr    = qr/^$alpha$alnum+$/;

        timethese(100, {
            '/^$alpha$alnum+$/ ' => \&testsub,
            '/^$alpha$alnum+$/o' => \&testsubo,
            '/$qr/             ' => \&testsubqr1,
            '$qr               ' => \&testsubqr2,
            '/$qr/o            ' => \&testsubqro,
        });

        sub testsub    { foreach (@words) { $count++ if(/^$alpha$alnum+$/);  } }
        sub testsubo   { foreach (@words) { $count++ if(/^$alpha$alnum+$/o); } }
        sub testsubqr1 { foreach (@words) { $count++ if(/$qr/);              } }
        sub testsubqr2 { foreach (@words) { $count++ if($_ =~ $qr);          } }
        sub testsubqro { foreach (@words) { $count++ if(/$qr/o);             } }

        This is probably a fairer test than the original (fewer iterations over more data) and the output looks like this:

        /^$alpha$alnum+$/ : 20 wallclock secs (20.41 usr + 0.00 sys = 20.41 CPU) @  4.90/s (n=100)
        /^$alpha$alnum+$/o:  9 wallclock secs ( 8.34 usr + 0.00 sys =  8.34 CPU) @ 11.99/s (n=100)
        /$qr/             :  9 wallclock secs ( 9.59 usr + 0.00 sys =  9.59 CPU) @ 10.43/s (n=100)
        $qr               : 10 wallclock secs ( 9.94 usr + 0.00 sys =  9.94 CPU) @ 10.06/s (n=100)
        /$qr/o            :  9 wallclock secs ( 8.34 usr + 0.01 sys =  8.35 CPU) @ 11.98/s (n=100)

        The reason I used /$qr/ rather than =~ $qr was not because I didn't know how to use it, but because I was using it in an if statement and $qr being a reference would simply evaluate to true without even attempting a match. The results above appear to show that plain $qr is slightly slower than /$qr/ but that is almost certainly due to the fact that I had to spell it out as $_ =~ $qr and so the difference should be disregarded.

Re: Never-to-use Perl features?
by broquaint (Abbot) on May 12, 2003 at 23:51 UTC
    • getc - doesn't do what one would expect
    • inline globs - is prone to mixing up filehandle reads
    • first arg of open - isn't intuitive
    • nested subroutines - broken to a degree
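For the last item, the usual demonstration of how nested named subroutines are "broken to a degree" (a sketch; perl even warns "Variable \"$x\" will not stay shared"):

```perl
#!/usr/bin/perl
use strict;
use warnings;

sub outer {
    my $x = shift;
    sub inner { return $x }   # a named sub is compiled once and stays
                              # bound to the first $x, not the current one
    return inner();
}

print outer(1), "\n";   # 1
print outer(2), "\n";   # still 1 -- inner never sees the new $x
```

An anonymous sub (my $inner = sub { $x };) closes over the right variable on every call, which is the standard workaround.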


      The first argument of open()? I have some other problems with it:
      • Single argument open — silliness beyond belief
      • Combined argument for open mode and filepath (the 3-argument open() remedies that)
      Anyway, I feel it would make more sense if open() returned a filehandle instead of modifying one of its arguments and returning a flag. But then, Perl didn't work like that in the old days.
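The interface being wished for can at least be approximated with a trivial wrapper (my_open is hypothetical, not a real builtin):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# A wrapper that gives open() the interface the node wishes it had:
# return the handle, die on failure, instead of filling in an argument.
sub my_open {
    my ($mode, $path) = @_;
    open my $fh, $mode, $path or die "open $path: $!";
    return $fh;
}

my $fh    = my_open('<', $0);   # read this very script
my $first = <$fh>;
close $fh;
```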

      What's wrong with getc?

      From perldoc:

      getc FILEHANDLE

      Returns the next character from the input file attached to FILEHANDLE, or the undefined value at end of file, or if there was an error. If FILEHANDLE is omitted, reads from STDIN. This is not particularly efficient. However, it cannot be used by itself to fetch single characters without waiting for the user to hit enter. For that, try something more like:

      That sounds like how I expect getc to work. Or, am I missing something?

        That sounds like how I expect getc to work. Or, am I missing something?
        It does indeed work as the docs say (much like all of Perl's other foibles listed in this thread) but when processing STDIN it doesn't work as one might initially expect (grab a char, return it). This is not so much a fault on Perl's behalf but that of the C implementation of getc.


Re: Never-to-use Perl features?
by theorbtwo (Prior) on May 13, 2003 at 06:13 UTC

    (This list contains a lot of dupes from others' lists; I'm Huberific enough to think that somebody might care what's on /my/ list.)

    • v-strings
    • Globs as filehandles (way too shallow; I encompass all the $|-et-al-shouldn't-be-globals in this).
    • open FH, "path" syntax (separate because you could have had *FH = open "path" without making filehandles first-class or objects)
    • pseudo-hashes
    • not being able to iterate over lexicals through a pseudo-symbol-table.
    • $_ as a global.
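The last item bites like this (a sketch; the sub name is mine): because $_ is a true global, a careless callee writes straight through the caller's foreach alias.

```perl
#!/usr/bin/perl
use strict;
use warnings;

sub careless {
    $_ = 'X';            # writes to the global $_, whoever owns it right now
}

my @vals = (1, 2, 3);
for (@vals) {            # foreach aliases $_ to each element in turn
    careless();
}

print "@vals\n";         # X X X -- the caller's data was clobbered
```

A local $_; at the top of careless() (or the lexical $_ planned for Perl 6) prevents this.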

    Warning: Unless otherwise stated, code is untested. Do not use without understanding. Code is posted in the hopes it is useful, but without warranty. All copyrights are relinquished into the public domain unless otherwise stated. I am not an angel. I am capable of error, and err on a fairly regular basis. If I made a mistake, please let me know (such as by replying to this node).

      not being able to iterate over lexicals through a pseudo-symbol-table.

      See PadWalker. PodMaster even made a ppm of it for his ppm repository.

Re: Never-to-use Perl features?
by hardburn (Abbot) on May 13, 2003 at 14:12 UTC

    Inheritance with objects. I think tye has said something on this before.

    Inheritance is an incredibly powerful concept (even if it often isn't used right), and a primary reason to have an object system in the first place. However, Perl sucks at it.

    First, run-time lookup of methods tends to slow inherited objects down. Second, there are all sorts of hoops you must jump through to make sure both the parent and child classes support inheritance correctly (the parent must have a well-behaved constructor, and the child must have a well-behaved destructor (if it has one), for example).
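One of those hoops, sketched (the class names are mine): the parent's constructor must bless into the class it was invoked as, or inheritance breaks immediately.

```perl
#!/usr/bin/perl
use strict;
use warnings;

package Animal;
sub new {
    my ($class, %args) = @_;
    my $self = { name => $args{name} };
    return bless $self, $class;   # NOT bless $self, 'Animal' -- that
                                  # would break every subclass constructor
}
sub name { $_[0]{name} }

package Dog;
our @ISA = ('Animal');
sub speak { my $self = shift; return $self->name . " says woof" }

package main;
my $dog = Dog->new(name => 'Rex');
print ref($dog), ": ", $dog->speak, "\n";   # Dog: Rex says woof
```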

    I've gone over the problems with abstract classes/interfaces before.

    Lastly, there needs to be data which both parent and child agree to use, but which should not be visible to the rest of the world. Java handles this with protected member data, but there is no similar standard in Perl OO (some might say this follows Perl's philosophy, but it also tends to add extra work for those needing to implement a subclass).

    I wanted to explore how Perl's closures can be manipulated, and ended up creating an object system by accident.
    -- Schemer

    Note: All code is untested, unless otherwise stated

      Hmm, I disagree with pretty much all of that. I love the simplicity and flexibility of perl's object model, and I've written a lot of OO perl code using inheritance.

      Note that perl does cache the method-lookups. I tend to write code initially in the simplest possible way and worry about optimisation only when it becomes necessary to do so: while I regularly do come up against a need to optimise eventually, the method-lookup overhead has never been a significant factor at that point (0). Maybe I've just been lucky.

      I have not had problems with child/parent disagreements - it is usually pretty clear to me when the child needs to call $self->SUPER::method(@_). Maybe I've just been lucky, but I think this is also a question of class design, and of course of documentation - it is vital for parent classes to document what children may, must, or must not do. (1)

      Abstract classes I also use a lot of, but I've always considered it adequate to have the abstract class die at runtime if a method gets called that should have been overridden by a concrete child. This has never caused me any significant problems - and certainly not in production code - but maybe I've just been lucky.

      I don't tend to have any data that parent and child agree to use - my attributes are accessed through attribute methods, and that method is the _only_ thing that knows how to find the data. Even within other methods of the parent class I call the attribute method, which (among other things) makes it really easy to change the implementation. I've never found the need to distinguish between child classes and other callers in such attribute methods, but maybe I've just been lucky.
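The accessor-only style described above, as a minimal sketch (the class name is mine):

```perl
#!/usr/bin/perl
use strict;
use warnings;

package Counter;
sub new { my $class = shift; bless { count => 0 }, $class }

# The accessor is the only code that knows where the data lives...
sub count {
    my $self = shift;
    $self->{count} = shift if @_;
    return $self->{count};
}

# ...so even the class's own methods go through it, and a subclass can
# override count() to change the storage without touching bump().
sub bump { my $self = shift; $self->count( $self->count + 1 ) }

package main;
my $c = Counter->new;
$c->bump for 1 .. 3;
print $c->count, "\n";   # 3
```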

      It probably helped that before Perl I'd only ever tried to do OO design in C++, and hated the complex inflexible approach there - I think perhaps that allowed me to develop my approach to OO concentrating on Perl's strengths rather than trying to force Perl into the mould of some existing methodology it might be less suited for.


      (0) That's not entirely true, but the only examples I can think of were standalone scripts solving a mathematical problem using a deeply recursive solution, with an object representing the problem parameters. In those cases it wasn't particularly method lookups that were a problem - even the overhead of a straight subroutine call was significant enough to avoid if possible.

      (1) The only time this has caused me a problem is when one class is auto-generating methods (I do this a lot) that use SUPER into another class, which I haven't needed that often but becomes a real pain when I do. I've worked around that in my current work application using string-eval to generate the methods directly in the appropriate class, which is ugly, but works ok.

        I have not had problems with child/parent disagreements

        Yes, it's always possible to get around the problems. I only want these things to be automatic. I used to think bless should be automatic, but now I'm not so sure (it does provide a lot of flexibility). But calling a parent's destructor should definitely be automatic.

        considered it adequate to have the abstract class die at runtime if a method gets called that should have been overridden by a concrete child

        I think you genuinely have been lucky here, though lucky in a way that almost everyone else will also get lucky. Consider a child class that implements everything except one little-used method left unimplemented in the child, which nobody notices for years, long after the original programmers of the parent and child classes have left. Since the checking is done when the method is called, nobody notices it. One day, somebody notices this obscure method and decides to use it. Oops, runtime error. Cracks appear in the earth's surface. The sun explodes. All die.
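One hedge against that scenario (a sketch of my own, not a standard idiom): have the abstract class verify at load time that each concrete child can() do everything required, so the failure happens when the child is compiled, not years later when the obscure method is finally called.

```perl
#!/usr/bin/perl
use strict;
use warnings;

package Shape;
use Carp 'croak';

my @REQUIRED = qw(area);

# Children call this once, right after their methods are defined.
sub assert_implements {
    my ($base, $child) = @_;
    for my $method (@REQUIRED) {
        croak "$child does not implement $method"
            unless $child->can($method);
    }
}

package Square;
our @ISA = ('Shape');
sub new  { my ($class, $side) = @_; bless { side => $side }, $class }
sub area { my $self = shift; return $self->{side} ** 2 }

Shape->assert_implements(__PACKAGE__);   # dies here if area() is missing

package main;
print Square->new(3)->area, "\n";   # 9
```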

        Most of my OO background comes from Java, which may not have the best object system, but it is very clean compared to C++.

        Most of the things I consider broken in Perl's object system are being fixed in Perl6, along with a few other parts of the language that bug me (like parameter passing to subroutines).

        I wanted to explore how Perl's closures can be manipulated, and ended up creating an object system by accident.
        -- Schemer

        Note: All code is untested, unless otherwise stated

      Since you've invoked my name, some clarification on what my position on this is not: I would not say you should never use inheritance in Perl.

      And I vehemently disagree that "Inheritance is...a primary reason to have an object system in the first place". I think inheritance sucks in some important ways and these are made worse in Perl. And I also think people tend to way over-value inheritance.

      So I think one will be happier in the long run if one considers inheritance (of more than just interfaces) to be more of a "last resort" than the feature of OO. This is even more true in Perl since Perl doesn't really have interfaces (much less really supporting inheritance of them) and Perl's clunky OO system makes inheritance even more problematic in general.

                      - tye
Re: Never-to-use Perl features?
by dada (Chaplain) on May 14, 2003 at 08:32 UTC
    • the ... operator
    • rand*10 parsed as rand(*10) instead of rand()*10 (not much of a Perl feature, but it still bites me after more than 6 years of perl hacking :-)

    King of Laziness, Wizard of Impatience, Lord of Hubris

      Tip: disambiguate those cases without adding parens by reversing the order of operands: 10*rand

      Makeshifts last the longest.

Node Type: perlmeditation [id://257590]
Approved by broquaint
Front-paged by rinceWind