Common Perl Pitfalls

This node in no way means that I claim to be an expert on Perl. I hardly consider myself at an intermediate level (I'm still making my way through the Alpaca!). These are just some of the most common ways I've managed to shoot myself in the foot. I thought I would share them here in the hope that they would benefit others and in the hope that I may receive enlightenment from other, more experienced monks on how to better handle these issues. Most of them have to do with regex (go figure!). Here they are in no particular order:

EDIT: Made some changes to the proposed solutions above according to some keen insights from JavaFan and Jenda. Thank you for your constructive criticism.

Replies are listed 'Best First'.
Re: Common Perl Pitfalls by JavaFan (Canon) on Apr 09, 2012 at 23:11 UTC
"The solution: redefine $/ right after your slurp:" No. That's just another potential problem. The solution is: `my $slurp; {local $/; $slurp = <INPUTFILE>};` Or: `my $slurp = do {local(@ARGV, $/) = "inputfile"; <>};` Or even: `` my $slurp = `cat inputfile`; ``
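As a rough sketch of the block-scoped `local $/` idiom from this reply: the temporary file and its two lines of content are assumptions made only so the example runs on its own, and a lexical handle stands in for the `INPUTFILE` bareword.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical input file, created here only to keep the sketch self-contained.
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
print {$tmp_fh} "line one\nline two\n";
close $tmp_fh or die "close $path: $!";

open my $in, '<', $path or die "open $path: $!";

my $slurp;
{
    local $/;          # undef inside this block only, so <$in> reads the whole file
    $slurp = <$in>;
}                      # $/ gets its old value back here, however the block is left

close $in;

print length($slurp), " characters slurped\n";
print "separator restored\n" if defined $/ && $/ eq "\n";
```

Because the restore is tied to scope exit rather than to a second assignment, an early `return` or an exception inside the block cannot leave `$/` undefined for the rest of the program.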
Re^2: Common Perl Pitfalls by Joe_ (Beadle) on Apr 09, 2012 at 23:14 UTC
Great stuff. Thanks for the feedback. I really like the second solution. The third solution isn't as portable, though. Why do you think redefining $/ as I did is a potential problem?
Re^3: Common Perl Pitfalls by JavaFan (Canon) on Apr 09, 2012 at 23:23 UTC
"The third solution isn't as portable, though." `cat` is probably available on more platforms than `perl` is. Of course, Windows rules the world, and both `cat` and `perl` are ported to Windows -- and, AFAIK, neither comes standard with the OS. Unlike `cat`, `perl` is not included in the POSIX standard for shell utilities. "Why do you think redefining $/ as I did is a potential problem?" Well, you consider someone modifying the code to be a potential problem. What if someone modifies your code, and adds a return after the first assignment to $/, but before the second? What if someone wraps the code in an eval, and the read triggers an exception?
Re^4: Common Perl Pitfalls by Joe_ (Beadle) on Apr 10, 2012 at 15:14 UTC
Re^2: Common Perl Pitfalls by chrestomanci (Priest) on Apr 11, 2012 at 12:59 UTC
Rather than mess around with $INPUT_RECORD_SEPARATOR (AKA: $/), and restore it afterwards, a better method would be to use the File::Slurp module. It is another common Perl pitfall to write new code for a common problem when you should have looked on CPAN. There is a very good chance that you will find a fully debugged implementation that considers all the edge cases. It never ceases to amaze me that people would prefer to spend half a day writing and debugging code, instead of 15 minutes finding and installing a module from CPAN.
Re^3: Common Perl Pitfalls by JavaFan (Canon) on Apr 11, 2012 at 14:23 UTC
It always amazes me that people prefer downloading a CPAN module, and using it, over writing a one-liner. I'm even more amazed that people think that just because there's a module on CPAN, it automatically is fully debugged and covers all the edge cases. I do wonder though, if it takes half a day to write: `my $slurp = do {local $/; <HANDLE>};` how long do you need to type in: `use Some::Module::From::CPAN; my $slurp = Some::Module::From::CPAN->some_API(some_argument);` Twice the number of lines, so, a full work day?
Re^4: Common Perl Pitfalls by jdporter (Paladin) on Apr 11, 2012 at 15:15 UTC
Re^5: Common Perl Pitfalls by JavaFan (Canon) on Apr 11, 2012 at 15:29 UTC
Re^2: Common Perl Pitfalls by dwalin (Monk) on Apr 10, 2012 at 23:20 UTC
The first solution is not equivalent to the last two, as it implies that INPUTFILE is already open. I would say that the correct idiom looks like this: `my $slurp = do { open my $fh, '<', "inputfile"; local $/; <$fh> };` P.S. I really like the second one, thanks. Not for production use, of course. :) Regards, Alex.
Re^3: Common Perl Pitfalls by JavaFan (Canon) on Apr 10, 2012 at 23:32 UTC
Considering that Joe_'s example uses the handle `INPUTFILE`, I don't have any problems with the implication, and I really don't see the need to come up with the snobby term "correct idiom". (You consider a piece of code without error handling to be correct idiom? You're fired.) "I really like the second one, thanks. Not for production use, of course. :)" Why not? It's not significantly different from your "correct idiom". It misses error handling (but then, so does your "correct idiom"), but that's easily handled: just add a `// die "slurp: $!";`.
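For what it's worth, a minimal sketch of a slurp with the error handling this exchange is about. The helper name `slurp_file` and the nonexistent path are hypothetical, chosen only for illustration; this is one way to do it, not necessarily what either poster had in mind.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the whole file as one string,
# or dies with the OS error if the file cannot be opened.
sub slurp_file {
    my ($path) = @_;
    open my $fh, '<', $path
        or die "slurp: cannot open '$path': $!\n";
    local $/;                      # slurp mode until this sub returns
    my $content = <$fh>;
    close $fh;
    return $content;
}

# A path assumed not to exist, to exercise the failure branch.
my $content = eval { slurp_file('/no/such/dir/inputfile') };
print "caught: $@" if $@;
```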
Re^4: Common Perl Pitfalls by dwalin (Monk) on Apr 10, 2012 at 23:50 UTC
Re^5: Common Perl Pitfalls by JavaFan (Canon) on Apr 10, 2012 at 23:57 UTC
Re: Common Perl Pitfalls by Jenda (Abbot) on Apr 09, 2012 at 23:55 UTC
Re: Regex in a loop. The first while statement is perfectly fine ... if you intend to modify the variable. The first loop reads "while the variable still matches the regular expression, do something with the variable", while the second reads "while there's still something more to match in the variable, do something with the match". The rule is that in the first case you SHOULD modify the variable within the loop, while in the second case you SHOULD NOT modify it.
Re: Deleting some array elements. You should use grep(): `@filtered = grep {whatever('test', $you, want_with($_, 'aliased to an array element'))} @all;`
Re: Slurping gone wrong. `my $data = do {local $/; <INPUTFILE>};` or use File::Slurp.
Re: True is 1, false is...? Don't print just the value; print some more info to make sure you are looking at the result of the print statement you think you are, and always put some kind of quotes around the variable: `print "Computed the number of angels: >$angel_count<\n";`
Jenda
Enoch was right! Enjoy the last years of Rome.
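The two loop readings and the grep() suggestion from this reply might look like the following sketch. The sample strings, the numeric threshold, and the variable names are all made up for illustration.

```perl
use strict;
use warnings;

# Form 1: no /g. The condition is re-tested from scratch each time,
# so the body has to modify the variable or the loop never ends.
my $padded = "a    b  c";
while ($padded =~ /  /) {          # while two adjacent spaces remain...
    $padded =~ s/  / /;            # ...collapse one pair
}
print ">$padded<\n";

# Form 2: /g in scalar context steps through successive matches,
# so the variable must be left alone inside the loop.
my $text = "x=1, y=22, z=333";
my @numbers;
while ($text =~ /(\d+)/g) {
    push @numbers, $1;             # use the match; do not touch $text
}
print ">@numbers<\n";

# grep keeps the elements for which the block returns true.
my @all  = (3, 8, 1, 12, 5);
my @kept = grep { $_ < 6 } @all;
print ">@kept<\n";
```

The `>...<` delimiters follow the advice in the same reply about quoting printed values, which makes leading, trailing, or empty output visible.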
Re^2: Common Perl Pitfalls by Joe_ (Beadle) on Apr 10, 2012 at 18:15 UTC
I agree on almost all counts. I remember having used the option you talked about (modifying the regex and matching in the loop without 'g') before. I just don't like it, though. I feel that it's quite unstable and will turn into an infinite loop the second you're not looking...
Re: Common Perl Pitfalls by JavaFan (Canon) on Apr 09, 2012 at 23:16 UTC
"Deleting some array elements" I'd write that as: `@array = @array[grep {!should_delete($_)} 0..$#array];` If only because your splice solution can be quadratic worst case, while the above is linear (assuming `should_delete` has a running time bounded by a constant).
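A small runnable sketch of the index-slice form. The sample data and the body of `should_delete` are assumptions; the one-liner passes an index, so the predicate here is written to take an index and look the element up itself.

```perl
use strict;
use warnings;

my @array = (10, 15, 20, 25, 30);

# Hypothetical predicate: receives an index into @array and
# returns true if that element should be removed.
sub should_delete {
    my ($i) = @_;
    return $array[$i] % 10 != 0;
}

# The right-hand side is built completely before the assignment happens,
# so each element is copied at most once: one pass, no repeated shifting.
@array = @array[ grep { !should_delete($_) } 0 .. $#array ];

print ">@array<\n";
```

If the test only needs the element and not its position, the plain `grep { ... } @array` form does the same job without the slice.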
Re^2: Common Perl Pitfalls by Joe_ (Beadle) on Apr 09, 2012 at 23:24 UTC
That's a really great one, too. I've only recently started coming to grips with grep and map. I've always felt that this problem can be tackled by a one-liner but I just couldn't put my hands on it. Thanks for finally providing it :)
Re^2: Common Perl Pitfalls by Joe_ (Beadle) on Apr 09, 2012 at 23:50 UTC
Care to elaborate on that "quadratic" comment? How do you figure? I'm not that good with complexity theory, I'm afraid...
Re^3: Common Perl Pitfalls by JavaFan (Canon) on Apr 10, 2012 at 13:22 UTC
"Care to elaborate on that 'quadratic' comment?" Say you want to delete all elements in the second half of the array. For the first $N/2$ iterations of your loop, no splicing happens. But on iteration $N/2 + 1$, the splicing takes at least $N/2 - 1$ steps, as that many array elements need to be moved. On iteration $N/2 + 2$, the splicing takes at least $N/2 - 2$ steps. In total, you will be moving $\sum_{i=1}^{N/2-1} i$ array elements. If I've done my math correctly, the above sum equals $(N^2 - 2N)/8$. Which means your algorithm runs in $\Omega(N^2)$ time.
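Working the sum out with the standard formula for the first $m$ integers, taking $m = N/2 - 1$ and assuming $N$ is even:

```latex
\[
\sum_{i=1}^{N/2-1} i
  \;=\; \frac{1}{2}\left(\frac{N}{2}-1\right)\frac{N}{2}
  \;=\; \frac{N^{2}}{8}-\frac{N}{4}
  \;=\; \frac{N^{2}-2N}{8}
\]
```

As a quick check, $N = 4$ gives a single term $i = 1$, and $(16 - 8)/8 = 1$. The leading term $N^2/8$ is what puts the number of element moves in $\Omega(N^2)$.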
Re^4: Common Perl Pitfalls by Joe_ (Beadle) on Apr 10, 2012 at 18:10 UTC
Re: Common Perl Pitfalls by JavaFan (Canon) on Apr 09, 2012 at 23:28 UTC
"The thing is, Perl treats the result of a false logical test as the empty string (in scalar context)" It's actually a dual (triple) var: `$ perl -MDevel::Peek -wE '$x = 1 < 0; Dump $x'` prints `SV = PVNV(0x97f45a8) at 0x9805ad0 REFCNT = 1 FLAGS = (IOK,NOK,POK,pIOK,pNOK,pPOK) IV = 0 NV = 0 PV = 0x9801438 ""\0 CUR = 0 LEN = 4`
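One way to see the practical effect of that dual value without Devel::Peek, as a sketch: the variable names are illustrative, and the warning handler is there only to count warnings instead of printing them.

```perl
use strict;
use warnings;

my $false = (1 < 0);

# As a string the false value is empty; as a number it is 0.
printf "string: >%s< (length %d)\n", $false, length $false;
printf "number: %d\n", $false + 0;

# Collect warnings so the two cases below can be compared.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, @_ };

my $sum_dual  = $false + 0;   # numeric slot already holds 0: no warning
my $plain     = '';
my $sum_plain = $plain + 0;   # ordinary empty string: "isn't numeric" warning

print scalar(@warnings), " warning(s) collected\n";
```

So the false value prints as nothing, which is the surprise described in the original post, yet it behaves as a clean 0 in arithmetic.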
Re^2: Common Perl Pitfalls by Joe_ (Beadle) on Apr 10, 2012 at 18:12 UTC
I will have to RTFM on that one :)
Re: Common Perl Pitfalls by Anonymous Monk on Apr 09, 2012 at 23:56 UTC
"I know that most of these will probably look too obvious to the veterans." But definitely not obvious to remember. None of the first three (I didn't read past these) are listed in perltrap, and I don't remember seeing them listed as traps in one place like that. See also: Common Beginner Mistakes, Common Beginner Errors in Perl, NO, THAT'S WRONG! Common Perl Pitfalls, Modern Perl
Re^2: Common Perl Pitfalls by Joe_ (Beadle) on Apr 10, 2012 at 18:22 UTC
Thanks for the links. They seem really interesting.
Re: Common Perl Pitfalls by JavaFan (Canon) on Apr 10, 2012 at 06:49 UTC
"Of course if you had meant the string $to_replace is an actual regex to match against, you're better off using the qr operator:" I don't get this point. You started off that section with: `$to_replace='some_string'; $my_string=~ s/$to_replace/$better_data/;` and deemed this catastrophically unsafe, because `$to_replace` may actually contain characters that have a special meaning. But if `$to_replace` is actually a regexp, the premise is gone -- any special characters are intentional. In fact, it's quite fine in that case to use the above.
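To make the distinction concrete, here is a sketch with a made-up search string containing a metacharacter. Only `$to_replace` and `$better_data` come from the thread; the other names and the sample text are illustrative.

```perl
use strict;
use warnings;

my $better_data = 'X';
my $to_replace  = 'a.c';      # meant as literal text, but '.' is a metacharacter

# Interpolated as-is: '.' matches any character, so "abc" is hit first.
(my $unquoted = 'abc a.c') =~ s/$to_replace/$better_data/;

# Wrapped in \Q...\E: the dot is escaped and only the literal "a.c" matches.
(my $quoted = 'abc a.c') =~ s/\Q$to_replace\E/$better_data/;

# When the variable is meant to hold a pattern, the metacharacter is
# intentional, and both a plain string and a qr// object interpolate fine.
my $pattern = qr/a.c/;
(my $by_pattern = 'abc a.c') =~ s/$pattern/$better_data/;

print ">$unquoted<\n>$quoted<\n>$by_pattern<\n";
```

Whether `qr//` is preferable to a plain string for an intentional pattern is exactly what the rest of this subthread argues about; the sketch only shows that both forms match the same text.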
Re^2: Common Perl Pitfalls by Joe_ (Beadle) on Apr 10, 2012 at 18:21 UTC
I actually meant to say that one shouldn't use a scalar as a regex anyway. I meant to say that, even if your correct semantics didn't require the use of \Q and \E (i.e. you actually needed the metacharacters) then you're better off using the qr// operator instead of building your regex as a literal string.
Re^3: Common Perl Pitfalls by JavaFan (Canon) on Apr 10, 2012 at 21:29 UTC
Well, that's nice you want to say that, but can you back up your statement with an argument?
Re^4: Common Perl Pitfalls by Joe_ (Beadle) on Apr 10, 2012 at 22:10 UTC
Re^5: Common Perl Pitfalls by JavaFan (Canon) on Apr 10, 2012 at 23:19 UTC
Re: Common Perl Pitfalls by sundialsvc4 (Abbot) on Apr 12, 2012 at 13:33 UTC
I wish that this thread had not promptly become “threaded” so much, thereby diluting its content such that now someone would have to wade through a lot of back-and-forth conversation to glean the “final” meaning out of it -- some of those conversations seeming to be fairly nit-picking anyhow. Threads, particularly in the Meditations section, are going to be referred-to for many years to come. I’d therefore suggest that they ought to be built-up in this way, when possible ... you are building a sort of reference article when you write in this particular section. If you want to debate fine-points, do it over in Seekers, then put a summarization or edit over here, with appropriate hyperlinks to the relevant discussions. If you want to do major edits, come back up to the first level of reply-nesting. I think that would make for better final-content, IMHO. The `<strike>` ~~strike tag~~ is great to explicitly show edits. Someone ought to be able to read just the top-article ~~and perhaps the first-level replies~~ and come away with an accurate reply (incorporating all of the various back-and-forths which they didn’t have to wade through) with a minimal amount of reading. I just think that would be better... it reflects how I use this resource (constantly), anyhow, and what I personally would prefer as a reader.
Re: Common Perl Pitfalls by girarde (Hermit) on Apr 11, 2012 at 18:06 UTC
This post would have been improved by the use of `<continue>` tags.
Re^2: Common Perl Pitfalls by Joe_ (Beadle) on Apr 18, 2012 at 15:35 UTC
Thanks for the advice. I do agree. The <continue> tags didn't work so I used spoilers instead.
Re: Common Perl Pitfalls by muba (Priest) on Apr 29, 2012 at 19:32 UTC
`while($string=~ m/reg(ex)/) { $string=~ s/$1/ister/; }` That frightens me a bit. Your substitution replaces the first occurrence of "ex" with "ister" in the string, no matter whether "ex" is part of "regex" or something else: `my $string = "Sometimes there are extra effects you didn't foresee when using a regex."; while($string=~ m/reg(ex)/) { $string=~ s/$1/ister/; } print $string;` Output: `Sometimes there are istertra effects you didn't foresee when using a register.` Istertra?
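One possible repair, offered as a sketch rather than as the edit Joe_ eventually made: do the match and the replacement in a single substitution, so that only an "ex" directly following "reg" can be touched.

```perl
use strict;
use warnings;

my $string = "Sometimes there are extra effects you didn't foresee when using a regex.";

# Capture the context ("reg") and put it back, replacing only the "ex"
# that follows it. /g handles every occurrence, so no while loop is needed.
$string =~ s/(reg)ex/${1}ister/g;

print "$string\n";
```

The braces in `${1}ister` are optional in Perl but make it obvious where the capture variable ends.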
Re^2: Common Perl Pitfalls by Joe_ (Beadle) on May 14, 2012 at 22:21 UTC
You are quite right. It's a terrible mistake on my part. I will edit it.
