in reply to CGI.pm and encoding HTML entities in param()
Character Conversions from Browser to Database
As for CGI.pm, the values returned by param() are byte strings, not strings of code points. Because of the way the web standards evolved, the request usually doesn't carry enough information to convert the parameter values to code points reliably. So this is something your application has to do, based on what it knows about the encoding of the forms and web pages that will be calling it.
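To make the bytes-versus-characters distinction concrete, here is a minimal sketch of decoding a parameter value by hand with the core Encode module. It assumes the form was submitted as UTF-8; decode_param is a hypothetical helper name, and $raw is a stand-in for whatever param() would hand back in a real script.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: turn the raw bytes of a parameter into a character
# string, ASSUMING the submitting form used UTF-8.
sub decode_param {
    my ($bytes) = @_;
    return unless defined $bytes;
    my $copy = $bytes;    # decode() with a CHECK argument may modify its input
    # FB_CROAK makes malformed input die instead of being silently replaced.
    return decode('UTF-8', $copy, Encode::FB_CROAK);
}

# Stand-in for $q->param('name'): the UTF-8 bytes of "café".
my $raw  = "caf\xC3\xA9";
my $text = decode_param($raw);

printf "bytes: %d, characters: %d\n", length($raw), length($text);
# prints "bytes: 5, characters: 4"
```

If the calling pages use some other encoding, the first argument to decode() has to change accordingly; nothing in the request will tell you that reliably.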
This thread sheds some additional light on the problem: CGI::Application - Which is the proper way of handling and outputting utf8. As Juerd notes, it would be helpful if CGI.pm were (or could be made) encoding-aware, so that parameter values could automatically be passed through a decoding function.
A good way to help avoid character encoding problems is to 1) always explicitly specify the charset of your pages, and 2) settle on one encoding that can handle everything, e.g. UTF-8.
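As a rough illustration of both points, the sketch below declares UTF-8 in the HTTP header, in the page itself, and on the form, and encodes everything written to STDOUT as UTF-8. The markup and field name are made up for the example; only the charset handling matters.

```perl
use strict;
use warnings;

# The one encoding used everywhere (assumption for this sketch: UTF-8).
my $charset = 'UTF-8';

# Encode all character strings written to STDOUT as UTF-8.
binmode(STDOUT, ":encoding($charset)") or die "binmode failed: $!";

# 1) State the charset explicitly in the HTTP header ...
my $header = "Content-Type: text/html; charset=$charset\r\n\r\n";

# ... and again in the page and on the form, so browsers submit
# parameter values in the same encoding.
my $page = <<"HTML";
<!DOCTYPE html>
<html>
<head><meta charset="$charset"><title>Form</title></head>
<body>
<form method="post" accept-charset="$charset">
<input type="text" name="name">
<input type="submit">
</form>
</body>
</html>
HTML

print $header, $page;
```

With CGI.pm the same header can be produced with header(-charset => 'UTF-8') instead of building the string by hand.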
Replies are listed 'Best First'.
Re^2: CGI.pm and encoding HTML entities in param()
by skazat (Chaplain) on Jun 23, 2008 at 03:47 UTC