Beefy Boxes and Bandwidth Generously Provided by pair Networks
Problems? Is your data what you think it is?

comment on

( #3333=superdoc: print w/replies, xml ) Need Help??
Okay. I need a place to vent on the topic of CGI and this seems like the perfect place.

I've been doing some research on HTML4, parsing HTML, and related topics. I've recently been trying to build a browser in Perl (and yes, I use the modules when I know about them).

As the very first parsing I did, I grabbed all the <h1> to <h6> tags. These tags, when used correctly, should give a good outline of what's on the page. But guess what? I checked a lot of major sites, like search engines, news sites, then some great discussion sites that are success stories for Perl, then a few random Monk home pages. Almost nobody uses these tags. I thought there was a problem with my program! Everybody is using the <font> tag instead of the classic header notations.

Then I got curious, so I headed to an HTML validator. I checked all the same pages again. I found two pages that were even close to "valid" compared to the standard. And one of those was w3's own home page. The other was missing a single alt tag on a gif.

So here's my sore spot. Using is obviously recommended. I can't think of a single reason not to use something that comes with every default install of Perl. That would be like writing foreach loops to perform an action on list elements instead of using map. But even those that I can't imagine are not using (slashdot or perlmonks, for instance) do not generate valid HTML. Not even close according to the error report.

While I've seen plenty of loud complaints when people roll their own form parsers, I am not seeing those same loud complaints when people (mis)use CGI to generate suboptional HTML or who use it to parse the forms, but then completely ignore it for generating HTML. It would appear that even those people who are using can't count on it to put in some default alt tag for their gifs when they forget-- it creates correctly formed HTML that may be gibberish according to the standards.

The module might as well be CGI::ParseForms and skip all the HTML building routines, for the ways it seems to be used in the wild. And frankly, given how much trouble I have with fonts that get too small, or pages that are completely unreadable in text-only mode (yes, I like to browse in Lynx sometimes just to get away from all the image rendering issues and time wasted waiting for them download over the modem), I'd like to see us make stronger, more frequent recommendations to use CGI for building HTML and then to remember that using it is no guarantee of perfect HTML either.

Some of the above is altered and Update: based on responses below, I'm not sure what I said that muddied my point. What I'm saying is simple. Feel free to keep harping away on the poor souls who roll their own parsing routines instead of using But please, consider applying the same critical eye to people who use only 5% of the functionality of the module and continue to hand code HTML (often hard coding large chunks of it into their scripts), or who use the module to create crummy HTML by subverting the fact that while it writes well-formed HTML it does not validate tags, attributes, or block/inline nesting.

In reply to (ichimunki) re: use CGI or die; by ichimunki
in thread use CGI or die; by Ovid

Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post; it's "PerlMonks-approved HTML":

  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.
  • Log In?

    What's my password?
    Create A New User
    and the web crawler heard nothing...

    How do I use this? | Other CB clients
    Other Users?
    Others contemplating the Monastery: (2)
    As of 2020-11-26 05:01 GMT
    Find Nodes?
      Voting Booth?

      No recent polls found