PerlMonks  

Encode/Decode Problem

by macPerl (Beadle)
on Mar 04, 2005 at 16:38 UTC ( [id://436637]=perlquestion )

macPerl has asked for the wisdom of the Perl Monks concerning the following question:

Monks,

I have been banging my head against a wall on this issue for the last few hours and cannot make sense of it ...

An HTML form gets submitted with a non-standard character (e.g. €) in its escaped format.

When the form is POSTed, it arrives on STDIN as %E2%82%AC
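
For context, `%E2%82%AC` is what the euro sign looks like once it has been encoded as UTF-8 (three bytes) and each byte has been percent-escaped. A minimal sketch of that browser-side step, assuming the core `Encode` module. The helper name `percent_escape_utf8` is made up for illustration, and it escapes every byte, which real browsers do not do for plain ASCII letters:

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: encode characters as UTF-8, then percent-escape every byte.
sub percent_escape_utf8 {
    my ($text) = @_;
    my $bytes = encode('UTF-8', $text);                 # characters -> bytes
    $bytes =~ s/(.)/sprintf('%%%02X', ord($1))/gse;     # each byte -> %XX
    return $bytes;
}

print percent_escape_utf8("\x{20AC}"), "\n";   # prints %E2%82%AC
```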

This translates to € when it goes through my (bog-standard, Perl-101) POSTParser:

# Parse the POSTed form body into a hash of field name => value
my (%searchField, $buffer, $pair, @pairs);
read(STDIN, $buffer, $ENV{'CONTENT_LENGTH'});
@pairs = split(/&/, $buffer);
foreach $pair (@pairs) {
    my ($name, $value) = split(/=/, $pair);
    # '+' stands for a space; each %XX escape becomes a single byte
    $value =~ tr/+/ /;
    $value =~ s/%([a-fA-F0-9][a-fA-F0-9])/pack("C", hex($1))/eg;
    $name  =~ tr/+/ /;
    $name  =~ s/%([a-fA-F0-9][a-fA-F0-9])/pack("C", hex($1))/eg;
    $searchField{$name} = $value;
}
return %searchField;

What can I do to deal effectively with this and other similar characters (á,é,í,ó,ú, etc)?
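
One possible approach, sketched under the assumption that the browser really did send UTF-8 (which `%E2%82%AC` is consistent with): undo the percent-escapes to get bytes, then decode those bytes into Perl characters with the core `Encode` module. The helper name `decode_form_value` is hypothetical and not part of the parser above:

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: undo URL-encoding, then interpret the bytes as UTF-8.
sub decode_form_value {
    my ($raw) = @_;
    $raw =~ tr/+/ /;                                # '+' stands for a space
    $raw =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/eg;    # one byte per %XX escape
    return decode('UTF-8', $raw);                   # bytes -> Perl characters
}

my $text = decode_form_value('%E2%82%AC');
printf "length=%d codepoint=U+%04X\n", length($text), ord($text);
# prints: length=1 codepoint=U+20AC
```

Without the final `decode` step the value stays a three-byte string, which is why it does not behave like a single character downstream.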

Replies are listed 'Best First'.
Re: Encode/Decode Problem
by JediWizard (Deacon) on Mar 04, 2005 at 16:44 UTC

    Is there a reason you are not using CGI.pm?

    Update: I have yet to hear a good reason to try to re-invent that particular wheel, and frankly doubt there is one.
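
    For comparison, a rough sketch of what the parsing could look like with CGI.pm. It assumes CGI.pm is installed (it is no longer bundled with recent Perl releases), and it passes a literal query string to the constructor so the example runs outside a web server; the field names `term` and `lang` are invented for illustration. Note that the values still come back as raw bytes here; CGI.pm documents a `-utf8` import flag for decoding them, which this sketch does not use.

```perl
use strict;
use warnings;
use CGI;

# A literal query string stands in for the real POST body in this sketch.
my $q = CGI->new('term=%E2%82%AC&lang=en+GB');

# Build the same name => value hash the hand-rolled parser returns.
my %searchField = map { ($_ => scalar $q->param($_)) } $q->param;

printf "%s is %d bytes long\n", 'term', length($searchField{term});
# prints: term is 3 bytes long
```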


    A truly compassionate attitude towards others does not change, even if they behave negatively or hurt you

    —His Holiness, The Dalai Lama

Re: Encode/Decode Problem
by Joost (Canon) on Mar 04, 2005 at 16:44 UTC

      Joost, Jedi, et al

      The reason for not using CGI is lack of experience when starting this project ... (kind of guessed I was inviting these comments).

      Right now it is late Friday pm and I would love a solution - if nothing else it will help fill in gaps in my understanding.

      The charset of the form is ISO-8859-1. I did try UTF-8, but was just stabbing in the dark.

      Don't understand character encoding ... most discussions (incl W3C HTML4 Spec) refer to docs going from server to browser and how User Agents handle docs.

      On form submission (i.e. going from browser to server), I did come across some discussions on how to "steer" the server in the right direction with a hidden field (nice one borisz), but I don't think it is applicable in this situation.

      A lot of other info indicates that what has been proposed for determining/specifying character sets when submitting a form is at best patchily implemented.

      Will have another test of Encode and see how it goes ... so far I have only tested encoding to UTF-8!
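
      Since the charset a browser actually submits in can differ from the one the form declares, one defensive option is to try a strict UTF-8 decode first and fall back to ISO-8859-1 when the bytes are not valid UTF-8. This is only a sketch under that assumption; the helper name `bytes_to_text` is hypothetical, and the fallback charset is a guess based on the form's declared charset:

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper: try strict UTF-8 first, fall back to ISO-8859-1.
sub bytes_to_text {
    my ($bytes) = @_;
    my $text = eval {
        # FB_CROAK dies on malformed input; LEAVE_SRC keeps $bytes intact.
        Encode::decode('UTF-8', $bytes,
                       Encode::FB_CROAK() | Encode::LEAVE_SRC());
    };
    return $text if defined $text;
    return Encode::decode('ISO-8859-1', $bytes);
}

printf "%vX\n", bytes_to_text("\xE2\x82\xAC");   # UTF-8 bytes: prints 20AC
printf "%vX\n", bytes_to_text("\xE9");           # lone Latin-1 byte: prints E9
```

      The fallback can misfire on short Latin-1 strings that happen to be valid UTF-8, so it is a heuristic rather than a guarantee.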
