http://qs321.pair.com?node_id=21576

Item Description: Perl CGI

Review Synopsis: Less than worthless: dangerous

Note: I have since realized that the rude comments that I made in this post distracted from the content. I am embarrassed by how I handled this post and I hopefully have learned a valuable lesson: always be as kind as possible, even when one is being critical. I would love to see this node go away, but that would be unfair to the people who have rightly criticized the attitude used in this review. My sincerest apologies to Elizabeth Castro and all others who may be offended by this.

Introduction

I needed to learn Perl and CGI and I needed to learn it fast. Having purchased, and enjoyed, another Visual QuickStart Guide, I went ahead and bought "Perl and CGI for the World Wide Web" by Elizabeth Castro, first edition, Peachpit Press. I did learn some Perl and I thought the book was handy. Now, if I go to amazon.com and read all of those glowing reader reviews, I can understand why people like this book. In fact, the only significant problem that I have with this book is that it was published.

I'll pause now to let that sink in.

In Ms. Castro's introduction, she writes, "This book starts at the very beginning. Even if you've never, ever programmed before, you'll be able to understand this book." I was definitely a beginner when I bought the book and that's my only excuse for saying I liked it.

Let's face it, in a perfect world, we don't give knives to little children, we don't ask the police officer if we can finish our beer before we step out of the car and we don't let novice programmers cut their teeth on CGI!!!

The real world is different. CGI is sexy. Your boss was impressed when you pointed out that his mouse had buttons so he asks you to do the company Web site. You want a career change and make more money. Whatever the reason, absolute amateurs will want/need to learn Web programming. That being said, I can see a need for a "Perl/CGI for Newbies" book. Castro's book should not be that book.

The Premise

The idea behind the "Visual QuickStart" series should appeal to many programmers. From the cover: The "Visual QuickStart Guide uses pictures rather than lengthy explanations."

Rather than have a dense, technical tome that would scare off prospective buyers, Visual QuickStart guides have each page split in half. The outer half is an incredibly simple-minded explanation of how to type and the inner half has pictures of the result. For Castro's HTML book, this was a good combination. With "Perl and CGI," we have pictures of words. The outer-half also goes into explanations of what is going on and this is probably Castro's greatest strength: she can explain things in a clear, concise manner. If you pay attention, you can really come up to speed fairly quickly with Visual QuickStart guides.

The Problem

This book is not only aimed at novice programmers, it also assumes that the reader may know nothing about HTML. In fact, all of Chapter 3 is spent hand-holding the reader through creating a simple Web form. Given this, I would feel that the novice programmer needs a serious heart-to-heart talk about security. In fact, Peachpit Press also feels that there are some concerns there. In their "Notice of Liability", they state that "every precaution has been taken in the preparation of this book." No, it hasn't. Ms. Castro obviously feels that security is so important that she prominently displays a section on it buried in the back of the book in Appendix C (for Rush Limbaugh fans: the preceeding sentence is what we humans call "sarcasm").

Poor treatment of security is the number one problem with this book and is the main reason why I cannot recommend it. There are numerous other problems why this book and your fireplace should be intimately familiar with one another, but this is the worst. I'll detail this and mention the other problems in passing.

First, except for a couple of brief mentions in the text, security is not covered until Appendix C, page 235, near the end of the book. Many people never bother with appendices. Why should they bother now? Further, if they do read it, Castro hints that her scripts may not be secure (they aren't, but she doesn't come right out and say that), but tells the wide-eyed newbie that they "must be sure that [the scripts] pose no security risk to your system." While she does list several URLs that the reader can peruse to learn about these issues, she then proceeds to give terrible security advice. For example, she mentions that you can use the "HTTP_REFERER environment variable to make sure the page that accesses the script is on your server and, indeed, located in a particular directory." To her credit, she points out later that this can be spoofed, but she then demonstrates a small Perl script to check for this value anyway, but it has no -T, no -w, no strict, and prints out an the untainted REFERER to a Web page, if the script doesn't like it.1

Two pages later, she finally gets around to explaining the -T switch for untainting data. She still doesn't bother with -w or strict, but who am I to be picky? Here are Ms. Castro's rigorous instructions for untainting data:

To untaint data:

  1. Type $outside_data = ~ /regex/; where some part of the regex is enclosed in parentheses to capture the desired portion of the outside data.
  2. Type $clean_data = $1; The scalar variable $1 is automatically set to contain the matched expression from step 1 (see page 150).
Needless to say, that's going to get some programmer handed a pink sheet. The closest that Castro comes to actually explaining what is going on is when she says "it's your responsibility to ensure that your regular expression does a good job of ensuring that the data is what it should be." Unfortunately, she never clarifies what goes into ensuring that "data is what it should be" or what the programmer should be looking out for.

Here's two other fun comments that she makes:

Other Problems

There are too many other problems to go into detail, so I'll just give highlights.

No -w switch or use strict;. Nope, she doesn't use 'em once. She once coyly mentioned that she was going to be "strict" and declare a variable, but that's about it.

Bad code: On page 87, Castro explains how to modify all members of an array. The example she gives prints a list of numbers, calculates their square roots, and then displays the list of square roots.

print "The numbers you entered were: "; foreach $number (@numbers) { print "<LI>$number"; } foreach $number (@numbers) { $number = sqrt($number); } print "<P>The square roots of those numbers are: "; foreach $number (@numbers) { print "<LI>$number"; }
Yup, that's right. She has iterated over the same array three times in succession. She mentioned that she could reduce this to two loops (with clearer HTML, she could do this in one). I have to disagree with an author showing sloppy code and mentioning in a footnote what can be done to clean it up.

I also got a good laugh noticing that this "HTML expert" couldn't be bothered to close her <LI> tags in the above code, but I'm just a lowly Perl programmer, so what do I know?

No CGI.pm: I have no problem with an author showing how to decode form variables by hand. After all, it's good to know what's under the hood. However, if I pick up a Perl book that purports to teach CGI and doesn't mention Lincoln Stein's CGI.pm module, I just put it back on the shelf. With the exception of books written before Perl 5 was available, I haven't seen a CGI book that fails to mention CGI.pm and was worth anything.

Summary

I could go on at length about the problems in this book, but I think I've hit the high points. The security issue is, unfortunately, an all too common one for CGI books. Lazy authors will publish insecure code and then mention in passing that this is a bad idea. I cannot stress enough how dangerous such an attitude is. When we have a book that is specifically aimed at new programmers, what better time to teach them good habits?

Summary of the Summary

Don't waste your money on this piece of trash.

1. This is a problem because the script could easily be modified to save the Web page rather than printing it out. Someone can then include <!--#exec cmd="arbitrary command"--> in the REFERER and your system may very well attempt to execute that command when the page is accessed. This is a common problem with Web security. A script with no obvious security holes is created and someone does a cut-and-paste of the features they need, but then uses those features in a completely insecure manner.

Further, we have the common problem of people adding links to porn images or dangerous javascript or all sorts of other goodies that can pose problems for people if user data is blindly printed to a page.