
Re: Re: Re: Re: Code review: validation regexes

by l2kashe (Deacon)
on Jul 09, 2003 at 05:07 UTC


in reply to Re: Re: Re: Code review: validation regexes
in thread Code review: validation regexes

While you are correct, I like to keep to what I think are best practices in my regular coding.

For me this means:
attempting to maintain consistency of style across a codebase
utilizing idioms as much as possible (unless using another mechanism provides better readability)
attempting to do as much I/O as possible at once to minimize waiting for a resource to become available again
scoping my variables as tightly as possible, and if they are needed elsewhere, returning a reference to the data as opposed to the data

So from my angle of best practices, it's a natural extension to only request the pieces of data I am actually going to do something with. If I only need the month from localtime I do:
my $mth = ( split(/\s+/, localtime) )[1];
I don't need the other values, so why waste the RAM and cycles building the lvalue list only to use a single element? Granted, here it's slightly contrived, but in more complex code, building only what I need and returning only what is essential tends to turn out a more maintainer-friendly codebase, in my personal experience.
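As a rough illustration of the slice idiom above, here is a sketch that uses a fixed timestamp via gmtime(0) instead of the current localtime (my assumption, purely so the result is reproducible). It also shows split's optional LIMIT argument, which stops splitting early; that goes a step beyond the line above, so treat it as an optional variation rather than part of the original suggestion.

```perl
use strict;
use warnings;

# Assumption: a fixed epoch (0) in UTC so the output is reproducible.
# In scalar context gmtime returns a string like "Thu Jan  1 00:00:00 1970".
my $stamp = gmtime(0);

# List slice: keep only field 1 (the month abbreviation).
my $mth = ( split(/\s+/, $stamp) )[1];

# Variation: a LIMIT of 3 yields "Thu", "Jan", and the unsplit remainder.
my $mth_limited = ( split(/\s+/, $stamp, 3) )[1];

print "$mth $mth_limited\n";
```

Both variables end up holding the three-letter month abbreviation; the LIMIT form just avoids splitting the rest of the string.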

Update: I don't know that I was clear in how I make the jump to only capturing the data I'm going to use, based on the list above. I try to keep the codebase trim, as the more keystrokes there are, the easier it is for a bug to creep in (yay strict, warnings, boo logic errors :P). So from there, it's an extension to keep the number of variables down. I guess I should also add judicious use of comments to my best practices list. I comment blocks and particularly interesting lines (a la, when splitting some value and pulling elements, I will generally do something like # 0: foo 1: bar 2: baz).

MMMMM... Chocolaty Perl Goodness.....

Re: Re: Re: Re: Re: Code review: validation regexes
by bigj (Monk) on Jul 09, 2003 at 07:00 UTC

    While you are correct, I like to keep to what I think are best practices in my regular coding.

    Of course, you should keep to your best practices. But there is more than one best practice. One is to program very efficiently; another is to ensure simple maintenance and readability. Both are there to avoid problems in the future, but there is no best best practice.

    I would also have used the /^(...|...|...)$/ way. My most important reason is that it is easier to read. The extra ?: has no influence on the algorithm carried out by the program; it only influences how Perl internally handles the regexp and its global variables. But the two extra characters confuse the eye, making it a bit (really only a bit, but why should I give that up) more difficult to understand the algorithm some years later.
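To make the comparison concrete, here is a minimal sketch of both forms. The alternatives (red, green, blue) are hypothetical placeholders, not the values from the original post.

```perl
use strict;
use warnings;

# Hypothetical alternatives, chosen only for illustration.
my $capturing     = qr/^(red|green|blue)$/;
my $non_capturing = qr/^(?:red|green|blue)$/;

# Both forms accept and reject exactly the same inputs.
for my $input (qw(green purple)) {
    my $cap_ok = $input =~ $capturing     ? 1 : 0;
    my $nc_ok  = $input =~ $non_capturing ? 1 : 0;
    print "$input: capturing=$cap_ok non-capturing=$nc_ok\n";
}

# The visible difference: in list context the capturing form returns $1,
# while a successful match without capture groups returns the list (1).
my ($captured)    = 'green' =~ $capturing;
my @no_capture    = 'green' =~ $non_capturing;
```

For a pure yes/no validation the two behave identically; the choice only matters if something later reads $1 or counts capture groups.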

    That doesn't mean you're on the wrong path; I only wanted to clarify that it is fine to give a beginner a simple but (still) productive way, unless the speed penalty really hurts.

    You gave another excellent example:
    my $mth = ( split(/\s+/, localtime) )[1];
    
    I would have written it either as:
    use POSIX qw/strftime/;
    my $month = strftime "%m", localtime;
    
    or
    use Date::Simple;
    my $today = Date::Simple->new;
    my $month = $today->month;
    
    although both ways are much slower than the one you suggested. Both have the benefit that you don't need to know what the magic 1 refers to in the time array (which can be a great benefit if someone else, or I myself a year from now, has to maintain the code). In addition, you avoid extra calculations on the month, as the values range in the natural 01-12 or 1-12 way and not 0..11.

    Greetings,
    Janek
      This comes back to the age-old question: do I code what I can, or do I code for whoever may have to maintain this after me? If I bang out a piece of code, I expect that the next person will have a comparable level of experience, be willing to go through the perldocs, or leave the code alone. I tend to shy away from sacrificing functionality or speed in a code base for the sake of future maintenance. If they do play with the code and don't understand what a particular line or block is doing, I expect them to copy the code to another place and examine it there, or deal with breaking the code and the other ramifications. I also keep a current/most recent copy of all the code I write in my home directory, just in case I get that phone call, "The program is broken.. no we didn't edit it", and diff v1.pl v2.pl produces output. (Where to go from there is a different story.)

      With that said, I'd like to comment on your counterpoints.

      If you don't use (?:..|..|..), do you also not use ?!, ?=, ?<, ?>, ?<=, ?>=, ?{}? It's a shame to give up the functionality provided by these idioms simply because they look funny. Worst case, a simple # perldoc perlre comment would suffice to point the next person in line to the appropriate documentation for those operators.

      In terms of my way of collecting a month versus your way: it appears that you didn't actually test the line to see what ends up in $mth, but rather assumed it was the numerical value, which it is not. It's actually the three-letter abbreviation for the current month. Going back to comparable Perl knowledge levels, it's interesting that you called the 1 magical. For me it is a simple, powerful feature of Perl. Any time parentheses come into play, the context within becomes a LIST (I'm pretty sure this is true, though if any monks can think of a situation where this is not true, please let me know). With that in mind, a LIST and an array are basically the same (*peers around waiting for the lightning strike*), and can be sliced the same. Here is an example.
      my @date = split(/\s+/, localtime);
      # note the @date, when using an array slice
      my($mth, $date, $year) = @date[1,2,4];
      # Versus
      my($mth, $date, $year) = ( split(/\s+/, localtime) )[1,2,4];
      Both of the lines which assign the values to the scalars, deal with LIST context. The ability to force LIST context on the left or right is an extremely simple and powerful concept, and should be in every perl coder's bag of tricks, along with slices.
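Since the question of when parentheses give LIST context came up, here is a small sketch of a few cases. It uses gmtime(0) in place of localtime (an assumption, so the values are fixed), and it is only meant to show observed behaviour, not to settle the general rule.

```perl
use strict;
use warnings;

# Assumption: gmtime(0) stands in for localtime so results are fixed.

# Parentheses on the left of an assignment: list context on the right.
my ($sec) = gmtime(0);             # first element of the list

# Parentheses only on the right: the scalar assignment still decides.
my $string = (gmtime(0));          # the date string, not a list element

# A list slice evaluates its contents in list context.
my $year_offset = (gmtime(0))[5];  # years since 1900

# Array slice and list slice pick the same elements.
my @date       = split(/\s+/, $string);
my @from_array = @date[1, 2, 4];
my @from_list  = ( split(/\s+/, $string) )[1, 2, 4];

print "$sec | $string | $year_offset | @from_array | @from_list\n";
```

The second case is the one to watch: grouping parentheses on the right-hand side alone do not turn a scalar assignment into a list assignment.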

      One last comment. Yes, it is good to provide beginners with a productive way to get something done; it is also good to toss out information that may cause them to have to look at the docs. Let's say I use the (?:blah|foo) construct with the comment perldoc perlre. If they have a newer version of Perl and an older version of the Llama book (as well as the Camel {I checked: some of the ? tokens are talked about briefly on page 68 of Camel 2nd Ed}), they will see stuff in that document they would never have known existed if they relied strictly on the books.



      MMMMM... Chocolaty Perl Goodness.....

        With that said, I'd like to comment on your counterpoints.

        I didn't want to offend you; all I wanted to say is that there is more than one way to do it, and they are all acceptable.

        If you don't use (?:..|..|..), do you also not use ?!, ?=, ?<, ?>, ?<=, ?>=, ?{}? It's a shame to give up the functionality provided by these idioms simply because they look funny.

        In fact, I like to use non-capturing groups, positive/negative lookaheads and lookbehinds, and closefisted regexp parts. I use them where the algorithm calls for them. (When in doubt, I choose the way with the fewest keystrokes and the least reading effort.) The ?: construct is needed when we want to capture some parts while other groups are needed only for simple grouping. In addition, the ?: construct is important if the regexp is reused in other regexps (e.g. built with the qr operator). In the OP's case, he only wanted to find out whether a string is exactly one of some alternatives. In that case you need a group consisting of the alternatives, enclosed by ^ and $. It doesn't matter whether it is captured or not, and when in doubt I would prefer the easier solution. The ?: solution is neither worse nor better for this task, as you wrote. (It is perhaps a bit quicker, but at the price of a longer and slightly more complicated script.)
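The point about reuse with qr can be sketched as follows. The fragment names and the '512mb' input are hypothetical, made up only to show how an embedded capturing group shifts the capture list of the outer regexp.

```perl
use strict;
use warnings;

# Hypothetical building blocks for illustration.
my $unit_capturing     = qr/(kb|mb|gb)/;
my $unit_non_capturing = qr/(?:kb|mb|gb)/;

my $input = '512mb';

# With a capturing fragment, the embedded group takes a capture slot.
my @with_capture = $input =~ /^(\d+)${unit_capturing}\z/;

# With a non-capturing fragment, only the outer regexp decides what is captured.
my @without_capture = $input =~ /^(\d+)${unit_non_capturing}\z/;

print scalar(@with_capture), " vs ", scalar(@without_capture), "\n";
```

In a larger pattern assembled from several such fragments, the non-capturing form keeps the numbering of $1, $2, ... under the control of the outer regexp.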

        In terms of my way of collecting a month versus your way. It appears that you didn't actually test the line to see what ends up in $mth, but rather assumed it was the numerical value, which it is not.

        To tell the truth, I confused list context with scalar context, and I also misread the min in the docs as mon(th) :-( But please note that it is easy to get confused, as list context is suggested by the (...)[1] part, although the list comes from a split. Of course, it was my error, but since I think making errors is typical for humans, at least for me, I prefer to program as directly as possible. That involves trying to express the algorithm without any indirection. A split call has nothing to do with time per se, and the 1 (which I called magical) has nothing to do with months per se; it is only technical. Of course, it is a common idiom, but the algorithm is hidden at first glance.
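For reference, the mix-up is easy to reproduce. The sketch below uses gmtime(0) rather than localtime (an assumption, to keep the values fixed) and shows what the list-context fields contain.

```perl
use strict;
use warnings;

# Assumption: gmtime(0) instead of localtime, so the values are fixed.
my ($sec, $min, $hour, $mday, $mon, $year) = gmtime(0);

# Index 1 of the list is minutes, not the month; the month is index 4.
my $minutes   = (gmtime(0))[1];
my $mon_index = (gmtime(0))[4];

# mon runs 0..11 and year counts from 1900, so both need adjusting.
my $iso = sprintf "%04d-%02d-%02d", $year + 1900, $mon + 1, $mday;

print "$iso\n";
```

So in list context the same [1] that picks the month abbreviation out of the split string would pick the minutes instead.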

        In contrast, my preferred solution my $month = strftime "%b", localtime; is at least no more cryptic, it's shorter, and for a lot of typical date formats there are handy, easy-to-understand shortcuts (like %d for day, %m for month (number), %y for year, %w for weekday (number) and so on). It is simple to change and often simple to understand, even without a manual.
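A short sketch of the strftime approach, again with gmtime(0) substituted for localtime (an assumption, so the numeric fields are predictable). Note that %b produces a locale-dependent name, so its exact value is not guaranteed here.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Assumption: gmtime(0) instead of localtime, so the numeric fields are fixed.
my @t = gmtime(0);

my $month_num  = strftime("%m", @t);        # two-digit month number
my $month_abbr = strftime("%b", @t);        # abbreviated name, locale-dependent
my $date       = strftime("%Y-%m-%d", @t);  # full date in one format string

print "$month_num $month_abbr $date\n";
```

The format string carries the intent (month, day, year) directly, which is the readability argument being made above.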

        Don't misunderstand me: I don't want to say that your style is wrong. Indeed, I like it too. But it is also good style to program in another way, and it is fully acceptable to advocate a readable, error-preventing style.

        Greetings,
        Janek
