PerlMonks  

Re: CGI.pm and encoding HTML entities in param()

by pc88mxer (Vicar)
on Jun 18, 2008 at 01:29 UTC (#692606)


in reply to CGI.pm and encoding HTML entities in param()

This is probably not CGI's doing but your browser's. For instance, if the charset of your pages is iso-8859-1 and you enter a non-Latin-1 character (like ā) into a text field, Firefox will submit the character in entity form (&#257;). This is essentially the best it can do, since there is no way to represent a non-Latin-1 character in the Latin-1 encoding. This situation is explained well in the following article:

Character Conversions from Browser to Database
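
The same fallback can be reproduced in Perl, which may help when checking what a Latin-1 form would send. This is only a sketch of the idea using the core Encode module; the helper name latin1_with_entities is made up for illustration, and real browsers may differ in detail.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: encode a character string as Latin-1 bytes,
# replacing anything Latin-1 cannot hold with a decimal character
# reference, much as the browser does on form submission.
sub latin1_with_entities {
    my ($chars) = @_;
    return encode('iso-8859-1', $chars, Encode::FB_HTMLCREF());
}

# U+00E9 fits in Latin-1 and becomes one byte; U+0101 does not fit
# and comes out as a character reference.
my $bytes = latin1_with_entities("caf\x{e9} \x{101}");
print "$bytes\n";
```

The output is a byte string, so the é is written as the single byte 0xE9 while the ā arrives as the seven ASCII characters of its reference.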

As for CGI, the values returned by param() are byte strings, not code-point strings. Due to the way the web standards evolved there just isn't enough information in the request to convert the parameter values to code-points. So this is something your application has to do based on what it knows about the encoding of the forms and web pages that will be calling it.
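
A minimal sketch of doing that conversion by hand with the core Encode module follows. The helper name decode_param is hypothetical, and UTF-8 is an assumption about how the calling form was served; in a real script the byte string would come from param() rather than a literal.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: turn the bytes of one parameter value into a
# character string. The charset must be supplied by the application,
# since the request itself does not reliably carry it.
sub decode_param {
    my ($bytes, $charset) = @_;
    $charset ||= 'UTF-8';    # assumption: forms are served as UTF-8
    # FB_CROAK makes malformed input die instead of passing silently.
    return decode($charset, $bytes, Encode::FB_CROAK() | Encode::LEAVE_SRC());
}

my $raw   = "\xC4\x81";                  # the UTF-8 bytes for U+0101
my $chars = decode_param($raw, 'UTF-8');
printf "%d byte(s) in, %d character(s) out\n", length($raw), length($chars);
```

Wrapping the call in eval lets the application decide what to do with input that is not valid in the expected encoding.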

This thread sheds some additional light on the problem: CGI::Application - Which is the proper way of handling and outputting utf8. As Juerd notes, it would be helpful if CGI were (or could be made) encoding-aware, so that parameter values could automatically be passed through a decoding function.

A good way to help avoid character encoding problems is to 1) always explicitly specify the charset of your pages, and 2) settle on one encoding that can handle everything, e.g. UTF-8.
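
Both points can be sketched in a few lines. The example below builds the response by hand so that it depends only on the core Encode module; render_page and the form field name are invented for illustration, and with CGI.pm the header would normally come from header(-charset => 'utf-8') instead.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: build a complete response as UTF-8 bytes,
# with the charset stated in the header and on the form.
sub render_page {
    my ($name) = @_;    # a character string, already decoded

    # Escape the characters that are special in HTML.
    my %esc = ('&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;');
    (my $safe = $name) =~ s/([&<>"])/$esc{$1}/g;

    my $html = <<"HTML";
<html><head><title>Demo</title></head>
<body>
<form method="post" accept-charset="UTF-8">
<input type="text" name="name" value="$safe">
</form>
</body></html>
HTML

    # Encode once, at the output boundary.
    return "Content-Type: text/html; charset=utf-8\r\n\r\n"
         . encode('UTF-8', $html);
}

binmode STDOUT;    # write the bytes unchanged
print render_page("\x{101}");
```

With every page declared as UTF-8, the browser has no reason to fall back to entities, and the application knows which encoding to decode incoming values from.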

Replies are listed 'Best First'.
Re^2: CGI.pm and encoding HTML entities in param()
by skazat (Chaplain) on Jun 23, 2008 at 03:47 UTC
    My goodness, I think you're right. Thanks for such an eloquent reply! Charsets and encodings aren't my favorite gremlins to attempt to solve, especially in 8+ year old code! - s
