http://qs321.pair.com?node_id=173786


in reply to CGI fails to urlencode & chars in outbound url's

You also may use CGI.pm's own escape function:

use CGI;
print CGI::escape("Chalk & Cheese");   # prints: Chalk%20%26%20Cheese

alex pleiner <alex@zeitform.de>
zeitform Internet Dienste

Re: Re: CGI fails to urlencode & chars in outbound url's
by BrowserUk (Patriarch) on Jun 12, 2002 at 12:22 UTC

    I have now found a CGI::escapeHTML() function in the perldoc CGI, but not a CGI::escape()? Your snippet works (for me also), so it is obviously there; I just can't find any docs for it.

    The perldoc CGI suggests that escapeHTML() will (often automatically if autoescaping is on, which it is by default and I haven't changed it) handle the conversion of & to &amp;, but this contradicts the evidence I am seeing - which would cause me to re-evaluate the evidence except that:

    1) I can see that the spaces are being escaped to %20, but the & stays resolutely unchanged.

    2) Adding URI::Escape's uri_escape() around $path/$_ in the original line cures my problem.

    However, escapeHTML seems to be dependent (I haven't understood the docs fully yet) upon having or using the ISO-8859-1 character set?

    I'm passing this along in case this is something that isn't confined to just my system/OS.

      CGI::escape comes from CGI::Util and is not documented (the code is the documentation :-). It is used within CGI.pm and is usable outside, too.

      #### from CGI/Util.pm
      sub escape {
        shift() if ref($_[0]) || (defined $_[1] && $_[0] eq $CGI::DefaultClass);
        my $toencode = shift;
        return undef unless defined($toencode);
        $EBCDIC = "\t" ne "\011";
        if ($EBCDIC) {
          $toencode=~s/([^a-zA-Z0-9_.-])/uc sprintf("%%%02x",$E2A[ord($1)])/eg;
        } else {
          $toencode=~s/([^a-zA-Z0-9_.-])/uc sprintf("%%%02x",ord($1))/eg;
        }
        return $toencode;
      }

      For more info see Dump a directory as links from CGI.

      escapeHTML is fine for producing HTML, but not useful for URIs (your "&" is the best example). An unescaped "&" in a URI is read as a parameter delimiter.
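
To make the distinction concrete, here is a minimal sketch in plain Perl with no modules loaded. The names percent_encode and html_escape are made up for illustration. percent_encode reuses the character class from the non-EBCDIC branch of the CGI::Util excerpt above, and html_escape only handles four characters, so treat both as approximations rather than as what CGI.pm does internally.

```perl
use strict;
use warnings;

# Hypothetical helper: percent-encode everything except [a-zA-Z0-9_.-],
# the same character class as the ASCII branch of CGI::Util::escape.
sub percent_encode {
    my ($s) = @_;
    $s =~ s/([^a-zA-Z0-9_.-])/sprintf("%%%02X", ord($1))/eg;
    return $s;
}

# Hypothetical helper: escape the characters that are special in HTML
# text and attribute values.
sub html_escape {
    my ($s) = @_;
    $s =~ s/&/&amp;/g;
    $s =~ s/</&lt;/g;
    $s =~ s/>/&gt;/g;
    $s =~ s/"/&quot;/g;
    return $s;
}

my $name = "Chalk & Cheese";
print percent_encode($name), "\n";   # Chalk%20%26%20Cheese
print html_escape($name), "\n";      # Chalk &amp; Cheese
```

The first form is what belongs inside a URI component; the second is what belongs in HTML text.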

      alex pleiner <alex@zeitform.de>
      zeitform Internet Dienste

        Thank you! for (1..1000000);

        Thank you on two levels.

        1) For taking the time to follow this up for me, and for the Dump-a-dir link. I really wish I had read this a week ago!!

        2) For making me feel sane again. The very fact that someone of merlyn's stature says:

        Getting the HTML and URI escaping right for creating links and labels is fun, not!

        ...makes me feel better, cos it has taken me a week and the assistance of many good folks here to get this to work. I was really beginning to believe that I was thick (which I may still be, but this isn't proof! :).

        I guess that it was bad luck to pick this as my first try at Perl! Still, hopefully I've learned stuff that I might not otherwise have done.

        $EBCDIC = "\t" ne "\011";
        Could someone explain this snippet of a snippet? I've tried guessing using my limited knowledge of operator precedence.
        I think I'm missing something obvious!
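
For what it's worth, the line reads more easily once the precedence is spelled out: ne binds tighter than =, so the string comparison runs first and its result is what gets assigned. A minimal sketch follows. The variable names are made up, and the remark about EBCDIC assumes the usual mapping in which tab is code point 5 rather than 9.

```perl
use strict;
use warnings;

# "\t" is the platform's tab character; "\011" is whatever character has
# octal code 11 (decimal 9). On ASCII platforms these are the same character.
my $implicit = "\t" ne "\011";      # parsed as: $implicit = ("\t" ne "\011")
my $explicit = ("\t" ne "\011");    # same thing with the grouping written out

# True only where tab is not code point 9 (assumed here: EBCDIC platforms).
my $is_ebcdic = $explicit ? 1 : 0;

print "tab is code point ", ord("\t"), "\n";
print "same result either way: ", ((!!$implicit eq !!$explicit) ? "yes" : "no"), "\n";
print "looks like EBCDIC: $is_ebcdic\n";
```

So $EBCDIC ends up as a simple true/false flag that escape() uses to decide whether to translate characters through the @E2A table before formatting them.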
Re: Re: CGI fails to urlencode & chars in outbound url's
by Juerd (Abbot) on Jun 12, 2002 at 16:10 UTC

    You also may use CGI.pm's own escape function:

    This is probably irrelevant to this discussion. The a() function is being used, which already escapes its arguments. Or at least CGI.pm version 2.752 does.

    2;0 juerd@ouranos:~$ perl -MCGI=a -le'print a({ -href => "&&&" }, "asdf")'
    <a href="&amp;&amp;&amp;">asdf</a>

    - Yes, I reinvent wheels.
    - Spam: Visit eurotraQ.
    

      But that's exactly what nobody wants, as the & of &amp; is treated as a parameter delimiter.

      #!/usr/bin/perl -w
      use strict;
      use CGI;
      my $q = new CGI;
      print $q->header;
      print "<pre>\n";
      print $_, "=", $q->param($_), "\n" for $q->param;
      print "\n</pre>\n";
      prints
      foo=bar
      amp=
      baz=
      for
      http://electra.igd.fhg.de/cgi-bin/test3.pl?foo=bar&amp;baz
      but it should print
      foo=bar&baz
      You need to use CGI::escape or URI::Escape but not CGI::escapeHTML.
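
The effect can be reproduced without a web server. The sketch below is a simplified stand-in for query-string parsing, not CGI.pm's actual code: the hypothetical parse_query splits on both & and ; (which is consistent with the three parameters in the output above) and then percent-decodes each key and value.

```perl
use strict;
use warnings;

# Hypothetical, simplified query-string parser for illustration only.
sub parse_query {
    my ($query) = @_;
    my @pairs;
    for my $part (split /[&;]/, $query) {
        next unless length $part;                    # skip empty segments
        my ($key, $value) = split /=/, $part, 2;
        $value = '' unless defined $value;
        for ($key, $value) {
            tr/+/ /;                                 # '+' stands for a space
            s/%([0-9A-Fa-f]{2})/chr(hex($1))/eg;     # undo percent-encoding
        }
        push @pairs, [$key, $value];
    }
    return @pairs;
}

for my $query ('foo=bar&amp;baz', 'foo=bar%26baz') {
    print "$query\n";
    print "  $_->[0]=$_->[1]\n" for parse_query($query);
}
```

With the HTML-escaped form the value is cut at the & and two stray parameters appear; with the percent-encoded form the single parameter foo comes back as bar&baz.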

      Update: Just for completeness: a() calls CGI::Util::make_attributes on each attribute, the latter calls CGI::Util::simple_escape for escaping ("&"->"&amp;" and some others). CGI::Util::escape does something different (see my last post).

      alex pleiner <alex@zeitform.de>
      zeitform Internet Dienste