PerlMonks
perlman:perlop2 by gods (Initiate) on Aug 25, 1999 at 06:09 UTC ( [id://378]=perlman )
perlop2

Current Perl documentation can be found at perldoc.perl.org. Here is our local, out-dated (pre-5.6) version:

Gory details of parsing quoted constructs

When presented with something which may have several different interpretations, Perl uses the principle DWIM (expanded to Do What I Mean, not what I wrote) to pick the most probable interpretation of the source. This strategy is so successful that Perl users usually do not suspect any ambiguity in what they write. However, from time to time Perl's ideas differ from what the author meant.

The aim of this section is to clarify Perl's way of interpreting quoted constructs. The most frequent reason to want the details discussed here is hairy regular expressions. However, the first steps of parsing are the same for all Perl quoting operators, so they are discussed together.

Some of the passes discussed below are performed concurrently, but since the results are the same, we consider them one by one. For different quoting constructs Perl performs a different number of passes, from one to five, but they are always performed in the same order.
I/O Operators
There are several I/O operators you should know about.
A string enclosed by backticks (grave accents) first undergoes variable substitution just like a double-quoted string. It is then interpreted as a command, and the output of that command is the value of the pseudo-literal, as in a shell. In scalar context, a single string consisting of all the output is returned. In list context, a list of values is returned, one for each line of output. (You can set $/ to use a different line terminator.)
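The two contexts are easy to compare side by side. The following is only a sketch: it assumes a Unix-like shell that provides printf, and the variable names are made up for the illustration.

```perl
use strict;
use warnings;

# Assumption: a Unix-like shell with a printf command is available.
# Backticks interpolate like double quotes, so \\n reaches the shell as \n.
my $all   = `printf 'one\\ntwo\\n'`;   # scalar context: all output in one string
my @lines = `printf 'one\\ntwo\\n'`;   # list context: one element per line

printf "scalar context gave %d characters\n", length $all;
printf "list context gave %d lines\n", scalar @lines;
```

Each element of @lines keeps its trailing newline, just as a line read from a filehandle would.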
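Evaluating a filehandle in angle brackets yields the next line from that file (newline, if any, included), or undef at end of file. Ordinarily you must assign that value to a variable, but there is one situation where an automatic assignment happens. If and ONLY if the input symbol is the only thing inside the conditional of a while or for(;;) loop, the value is automatically assigned to the variable $_. In these loop constructs, the assigned value (whether assignment is automatic or explicit) is then tested to see whether it is defined. The defined test avoids problems where a line has a string value that Perl would treat as false, such as "" or "0" with no trailing newline. The following lines are equivalent to each other: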

    while (defined($_ = <STDIN>)) { print; }
    while ($_ = <STDIN>) { print; }
    while (<STDIN>) { print; }
    for (;<STDIN>;) { print; }
    print while defined($_ = <STDIN>);
    print while ($_ = <STDIN>);
    print while <STDIN>;

and this also behaves similarly, but avoids the use of $_:

    while (my $line = <STDIN>) { print $line }

If you really mean such values to terminate the loop, they should be tested for explicitly:

    while (($_ = <STDIN>) ne '0') { ... }
    while (<STDIN>) { last unless $_; ... }

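To see why the distinction between a defined test and a truth test matters, here is a small sketch. It reads from an in-memory filehandle, which is an assumption that goes beyond this old document (it needs Perl 5.8 or later), and the sample data is invented.

```perl
use strict;
use warnings;

# Hypothetical input whose last line is "0" with no trailing newline.
my $data = "first\n0";

# Loop 1: a lone assignment from <$fh> in the while condition gets an
# implicit defined() test, so the final "0" is still processed.
open(my $fh, '<', \$data) or die "cannot open in-memory handle: $!";
my @seen;
while (my $line = <$fh>) {
    push @seen, $line;
}
close($fh);

# Loop 2: an explicit truth test treats "0" as false and stops early.
open($fh, '<', \$data) or die "cannot reopen in-memory handle: $!";
my @truthy;
while (1) {
    my $line = <$fh>;
    last unless $line;
    push @truthy, $line;
}
close($fh);

printf "defined test read %d lines, truth test read %d\n",
    scalar @seen, scalar @truthy;
```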
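In other boolean contexts, <filehandle> without an explicit defined test or comparison will solicit a warning if -w is in effect.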
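The filehandles STDIN, STDOUT, and STDERR are predefined. (The filehandles stdin, stdout, and stderr will also work except in packages, where they would be interpreted as local identifiers rather than global.)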
If a <FILEHANDLE> is used in a context that is looking for a list, a list consisting of all the input lines is returned, one line per list element. It's easy to make a LARGE data space this way, so use with care.
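A quick way to watch context change the result is to read the same handle twice. This sketch again leans on an in-memory filehandle (an assumption requiring Perl 5.8 or later) and on invented sample text.

```perl
use strict;
use warnings;

my $text = "alpha\nbeta\ngamma\n";     # hypothetical sample input
open(my $fh, '<', \$text) or die "cannot open in-memory handle: $!";

my $first = <$fh>;    # scalar context: just the next line
my @rest  = <$fh>;    # list context: every remaining line at once
close($fh);

printf "first line plus %d more\n", scalar @rest;
```

For a big file the list form holds every line in memory at the same time, which is the LARGE data space the paragraph above warns about.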
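The null filehandle <> is special and can be used to emulate the behavior of sed and awk. Input from <> comes either from standard input, or from each file listed on the command line. Here's how it works: the first time <> is evaluated, the @ARGV array is checked, and if it is empty, $ARGV[0] is set to "-", which when opened gives you standard input. The @ARGV array is then processed as a list of filenames. The loop

    while (<>) {
        ...                     # code for each line
    }

is equivalent to the following Perl-like pseudo code: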

    unshift(@ARGV, '-') unless @ARGV;
    while ($ARGV = shift) {
        open(ARGV, $ARGV);
        while (<ARGV>) {
            ...                 # code for each line
        }
    }

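except that it isn't so cumbersome to say, and will actually work. It really does shift array @ARGV and put the current filename into variable $ARGV. It also uses filehandle ARGV internally: <> is just a synonym for <ARGV>, which is magical. (The pseudo code above doesn't work because it treats <ARGV> as non-magical.)

You can modify @ARGV before the first <> as long as the array ends up containing the list of filenames you really want. Line numbers ($.) continue as if the input were one big happy file.

If you want to set @ARGV to your own list of files, go right ahead. This sets @ARGV to all plain text files if no @ARGV was given: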

    @ARGV = grep { -f && -T } glob('*') unless @ARGV;

You can even set them to pipe commands. For example, this automatically filters compressed arguments through gzip:

    @ARGV = map { /\.(gz|Z)$/ ? "gzip -dc < $_ |" : $_ } @ARGV;

If you want to pass switches into your script, you can use one of the Getopts modules or put a loop on the front like this:

    while ($_ = $ARGV[0], /^-/) {
        shift;
        last if /^--$/;
        if (/^-D(.*)/) { $debug = $1 }
        if (/^-v/)     { $verbose++  }
        # ...           # other switches
    }

    while (<>) {
        # ...           # code for each line
    }

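The <> loop can be tried without a real command line by filling @ARGV yourself. In this sketch the two temporary files and their contents are hypothetical stand-ins for command-line arguments, and File::Temp is assumed to be available (it ships with Perl from 5.6.1 on).

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Two throwaway files play the role of files named on the command line.
my @paths;
for my $body ("a1\na2\n", "b1\n") {
    my ($fh, $path) = tempfile(UNLINK => 1);
    print {$fh} $body;
    close($fh) or die "close failed: $!";
    push @paths, $path;
}

local @ARGV = @paths;       # what the shell would normally have supplied
my (%count, $last_line_no);
while (<>) {
    $count{$ARGV}++;        # $ARGV names the file currently being read
    $last_line_no = $.;     # $. keeps counting across file boundaries
}

print "$_: $count{$_} line(s)\n" for @paths;
print "last line number: $last_line_no\n";
```

Once the loop has finished, @ARGV is empty, because <> really does shift the names off as it opens them.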
The <> symbol will return undef for end-of-file only once. If you call it again after this, it will assume you are processing another @ARGV list, and if you haven't set @ARGV, will input from STDIN.

If the string inside the angle brackets names a simple scalar variable (e.g., <$foo>), then that variable contains the name of the filehandle to input from, or its typeglob, or a reference to the same. For example:

    $fh = \*STDIN;
    $line = <$fh>;

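Because only a simple scalar variable counts as an indirect filehandle inside angle brackets, a handle stored in a hash or array has to be copied into a scalar first, or read with the readline function. A minimal sketch, assuming Perl 5.8 or later for the in-memory handle and using a made-up %handles hash:

```perl
use strict;
use warnings;

my $text = "hello\nworld\n";           # hypothetical sample input
open(my $in, '<', \$text) or die "cannot open in-memory handle: $!";

my %handles = (in => $in);             # handle stored in a hash element

my $fh     = $handles{in};
my $first  = <$fh>;                    # simple scalar: reads one line
my $second = readline($handles{in});   # hash element: call readline directly
close($in);

print $first, $second;
```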
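If what's within the angle brackets is neither a filehandle nor a simple scalar variable containing a filehandle name, typeglob, or typeglob reference, it is interpreted as a filename pattern to be globbed, and either a list of filenames or the next filename in the list is returned, depending on context. This distinction is determined on syntactic grounds alone. That means <$x> is always a readline from an indirect handle, but <$hash{key}> is always a glob, because $x is a simple scalar variable and $hash{key} is not; it's a hash element.

One level of double-quote interpretation is done first, but you can't say <$foo> because that's an indirect filehandle as explained in the previous paragraph. Example:

    while (<*.c>) {
        chmod 0644, $_;
    }

is equivalent to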

    open(FOO, "echo *.c | tr -s ' \t\r\f' '\\012\\012\\012\\012'|");
    while (<FOO>) {
        chop;
        chmod 0644, $_;
    }

In fact, it's currently implemented that way. (Which means it will not work on filenames with spaces in them unless you have csh(1) on your machine.) Of course, the shortest way to do the above is:

    chmod 0644, <*.c>;

Because globbing invokes a shell, it's often faster to call readdir() yourself and do your own grep() on the filenames.
A glob evaluates its (embedded) argument only when it is starting a new list. All values must be read before it will start over. In a list context this isn't important, because you automatically get them all anyway. In scalar context, however, the operator returns the next value each time it is called, or an undef value if you've just run out. As for filehandles, an automatic defined is generated when the glob occurs in the test part of a while or for, because legal glob returns (e.g. a file called 0) would otherwise terminate the loop. Again, undef is returned only once. So if you're expecting a single value from a glob, it is much better to say

    ($file) = <blurch*>;

than

    $file = <blurch*>;

because the latter will alternate between returning a filename and returning FALSE.
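The alternation is easy to reproduce. This sketch builds a temporary directory holding a single hypothetical file named blurch1 and calls the same glob three times in scalar context; it assumes File::Temp is available and that the temporary path contains no spaces or wildcard characters.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Assumption: the temporary directory path has no spaces or glob metacharacters.
my $dir = tempdir(CLEANUP => 1);
open(my $fh, '>', "$dir/blurch1") or die "cannot create file: $!";
close($fh);

# Scalar context: the same glob operator walks its list, reports undef
# once when the list is exhausted, and then starts over.
my @scalar_results;
for my $call (1 .. 3) {
    my $file = glob("$dir/blurch*");
    push @scalar_results, $file;
}

# List context: the single match is available on every call.
my ($only) = glob("$dir/blurch*");

print join(", ", map { defined $_ ? "name" : "undef" } @scalar_results), "\n";
```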
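If you're trying to do variable interpolation, it's definitely better to use the glob() function, because the older notation can cause people to become confused with the indirect filehandle notation.

    @files = glob("$dir/*.[ch]");
    @files = glob($files[$i]);
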
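For comparison, here is the same selection done once with glob() and once with readdir() plus grep(). This is a sketch only: the file names are invented, the directory is a temporary one made with File::Temp, and its path is assumed to contain no spaces or wildcard characters.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Assumption: the temporary directory path has no spaces or glob metacharacters.
my $dir = tempdir(CLEANUP => 1);
for my $name (qw(a.c b.h notes.txt)) {          # hypothetical file names
    open(my $fh, '>', "$dir/$name") or die "cannot create $dir/$name: $!";
    close($fh);
}

# glob() with an interpolated directory name.
my @via_glob = sort glob("$dir/*.[ch]");

# The same selection with readdir() and a regular expression.
opendir(my $dh, $dir) or die "cannot open $dir: $!";
my @via_readdir = sort map { "$dir/$_" } grep { /\.[ch]\z/ } readdir($dh);
closedir($dh);

print scalar(@via_glob), " matches from glob, ",
      scalar(@via_readdir), " from readdir\n";
```

Note that readdir() returns bare names, so the directory has to be put back on the front before the two lists can be compared.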
Constant Folding

Like C, Perl does a certain amount of expression evaluation at compile time, whenever it determines that all arguments to an operator are static and have no side effects. In particular, string concatenation happens at compile time between literals that don't do variable substitution. Backslash interpretation also happens at compile time. You can say

    'Now is the time for all' . "\n" .
        'good men to come to.'

and this all reduces to one string internally. Likewise, if you say

    foreach $file (@filenames) {
        if (-s $file > 5 + 100 * 2**16) { }
    }

the compiler will precompute the number that expression represents so that the interpreter won't have to.
Bitwise String Operators
Bitstrings of any size may be manipulated by the bitwise operators (~ | & ^).

If the operands to a binary bitwise op are strings of different sizes, the | (or) and ^ (xor) ops will act as if the shorter operand had additional zero bits on the right, while the & (and) op will act as if the longer operand were truncated to the length of the shorter.

    # ASCII-based examples
    print "j p \n" ^ " a h";            # prints "JAPH\n"
    print "JA" | "  ph\n";              # prints "japh\n"
    print "japh\nJunk" & '_____';       # prints "JAPH\n";
    print 'p N$' ^ " E<H\n";            # prints "Perl\n";

If you are intending to manipulate bitstrings, you should be certain that you're supplying bitstrings: if an operand is a number, that will imply a numeric bitwise operation. You may explicitly show which type of operation you intend by using "" or 0+, as in the examples below.

    $foo =  150  |  105 ;   # yields 255  (0x96 | 0x69 is 0xFF)
    $foo = '150' |  105 ;   # yields 255
    $foo =  150  | '105';   # yields 255
    $foo = '150' | '105';   # yields string '155' (under ASCII)

    $baz = 0+$foo & 0+$bar; # both ops explicitly numeric
    $biz = "$foo" ^ "$bar"; # both ops explicitly stringy

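The sketch below puts the numeric and string forms next to each other and also checks the padding and truncation rules for operands of different lengths. It assumes an ASCII platform and a script that has not enabled the newer "bitwise" feature, under which these operators behave differently; the variable names are only for illustration.

```perl
use strict;
use warnings;

my ($foo, $bar) = ('150', '105');

my $numeric = (0 + $foo) | (0 + $bar);   # forced numeric: 150 | 105
my $stringy = "$foo" | "$bar";           # forced string: character by character

# Operands of different lengths (ASCII assumed):
my $xor_padded  = "ab" ^ "a";            # shorter side padded with zero bits
my $and_clipped = "abc" & "ab";          # longer side truncated

print "numeric: $numeric, stringy: $stringy\n";
printf "xor length %d, and length %d\n",
    length $xor_padded, length $and_clipped;
```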
Integer Arithmetic

By default Perl assumes that it must do most of its arithmetic in floating point. But by saying

    use integer;

you may tell the compiler that it's okay to use integer operations from here to the end of the enclosing BLOCK. An inner BLOCK may countermand this by saying

    no integer;

which lasts until the end of that BLOCK.
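The bitwise operators ("&", "|", "^", "~", "<<", and ">>") always produce integral results. (But see also Bitwise String Operators.) However, use integer still has meaning for them. By default, their results are interpreted as unsigned integers. However, if use integer is in effect, their results are interpreted as signed integers. For example, ~0 usually evaluates to a large integral value. However, use integer; ~0 is -1 on twos-complement machines.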
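A short sketch of the scoping and of the signed versus unsigned reading of ~0. It assumes a twos-complement machine, and the variable names are invented for the example.

```perl
use strict;
use warnings;

my $float_div = 10 / 3;          # ordinary floating-point division
my ($int_div, $signed);
{
    use integer;                 # in effect only inside this block
    $int_div = 10 / 3;           # integer division
    $signed  = ~0;               # signed reading of all one bits
}
my $unsigned = ~0;               # outside the block: unsigned reading

print "float: $float_div, integer: $int_div\n";
print "signed: $signed, unsigned: $unsigned\n";
```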
Floating-point Arithmetic
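While use integer provides integer-only arithmetic, there is no analogous mechanism to provide automatic rounding or truncation to a certain number of decimal places. For rounding to a certain number of digits, sprintf() or printf() is usually the easiest route.

Floating-point numbers are only approximations to what a mathematician would call real numbers. There are infinitely more reals than floats, so some corners must be cut. For example: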

    printf "%.20g\n", 123456789123456789;
    #        produces 123456789123456784

Testing for exact floating-point equality or inequality is not a good idea. Here's a (relatively expensive) work-around to compare whether two floating-point numbers are equal to a particular number of significant digits (the precision given to the %g format). See Knuth, volume II, for a more robust treatment of this topic.

    sub fp_equal {
        my ($X, $Y, $POINTS) = @_;
        my ($tX, $tY);
        $tX = sprintf("%.${POINTS}g", $X);
        $tY = sprintf("%.${POINTS}g", $Y);
        return $tX eq $tY;
    }

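Here is how fp_equal might be used. The sub is repeated so that the sketch runs on its own; the sample values are arbitrary, and the remark about 0.1 + 0.2 assumes a perl built with ordinary IEEE-754 double precision.

```perl
use strict;
use warnings;

# Same comparison as above: format both numbers to $POINTS significant
# digits with %g and compare the resulting strings.
sub fp_equal {
    my ($X, $Y, $POINTS) = @_;
    my $tX = sprintf("%.${POINTS}g", $X);
    my $tY = sprintf("%.${POINTS}g", $Y);
    return $tX eq $tY;
}

my $sum = 0.1 + 0.2;     # usually not exactly 0.3 in binary floating point

print "exact ==     : ", ($sum == 0.3 ? "equal" : "different"), "\n";
print "fp_equal(10) : ", (fp_equal($sum, 0.3, 10) ? "equal" : "different"), "\n";
print "fp_equal(5)  : ", (fp_equal(1.0, 1.1, 5)   ? "equal" : "different"), "\n";
```

Keep in mind that the precision of %g counts significant digits, so 10 here means ten digits in total, not ten digits after the decimal point.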
The POSIX module (part of the standard perl distribution) implements ceil(), floor(), and a number of other mathematical and trigonometric functions.
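A brief sketch of the POSIX functions alongside sprintf() rounding. The input values are arbitrary and were picked to stay away from exact halfway cases, where the result depends on the binary representation.

```perl
use strict;
use warnings;
use POSIX qw(ceil floor);

my $up       = ceil(3.2);                  # smallest integer >= 3.2
my $down     = floor(3.7);                 # largest integer <= 3.7
my $neg_up   = ceil(-3.7);                 # ceil moves toward +infinity
my $neg_down = floor(-3.2);                # floor moves toward -infinity
my $rounded  = sprintf("%.2f", 3.14159);   # rounds to two decimal places

print "ceil: $up, floor: $down, ceil(-3.7): $neg_up, floor(-3.2): $neg_down\n";
print "sprintf: $rounded\n";
```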
Rounding in financial applications can have serious implications, and the rounding method used should be specified precisely. In these cases, it probably pays not to trust whichever system rounding is being used by Perl, but to instead implement the rounding function you need yourself.
Bigger Numbers

The standard Math::BigInt and Math::BigFloat modules provide variable precision arithmetic and overloaded operators. At the cost of some space and considerable speed, they avoid the normal pitfalls associated with limited-precision representations.

    use Math::BigInt;
    $x = Math::BigInt->new('123456789123456789');
    print $x * $x;

    # prints +15241578780673678515622620750190521
