
Re: utf8 problems

by jbrugger (Parson)
on Jun 02, 2006 at 12:44 UTC (#553276)

in reply to utf8 problems

Thanks for both answers. Yes, the problem is vague, and hard to describe...
We have a string and have switched its internal utf8 flag on:
    # My example may be any string from a database, user input, etc.
    my $example = "just a string that i nd to be utf8 encoded, but can't see it in the chars, so guess::encode won't work";
    Encode::_utf8_on($example);
    # ...
    use open ':utf8';
    use open ':std';
    # ...
    my $cgi = new CGI;
    print $cgi->header(
        -type    => 'text/html',
        -expires => '-1d',
        -cookie  => [$cookie],
        -charset => 'UTF-8',
    );
    print $example;
Now, the first time it's called under the mod_perl registry, it prints:
just a string that i nd to be utf8 encoded, but can't see it in the chars, so guess::encode won't work
but the second time:
just a string that i nééd to be utf8 encoded, but can't see it in the chars, so guess::encode won't work
This is weird, and I have no control over how mod_perl internally stores its values.

"We all agree on the necessity of compromise. We just can't agree on when it's necessary to compromise." - Larry Wall.

Replies are listed 'Best First'.
Re^2: utf8 problems
by bpphillips (Friar) on Jun 03, 2006 at 13:18 UTC
    Two things you should check to make this example work the way you intend:

    - Is your file UTF-8 encoded? (I usually use the *NIX file command or check vi's :set fileencoding to verify this, although there may be other ways to do it.)
    - Do you have a use utf8 at the beginning of your script?

    Whenever you have UTF-8 content in the body of your script (as you do in your example, at least), you need to tell perl to treat those literals as characters rather than bytes. You do that by placing a use utf8 in the lexical scope that contains the UTF-8 data. This also makes the Encode::_utf8_on() call unnecessary.
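
A minimal sketch of the difference, assuming the script file itself is saved as UTF-8. The variable names and the sample word are only for illustration; they are not taken from the original script.

```perl
use strict;
use warnings;

my ($with, $without);

{
    use utf8;                    # literals in this lexical scope are decoded as UTF-8
    $with = length("nééd");      # counts characters
}

{
    no utf8;                     # literals are left as raw bytes
    $without = length("nééd");   # counts the bytes stored in the file
}

# With a UTF-8 encoded source file this reports 4 characters and 6 bytes,
# because each "é" is stored as two bytes.
print "with use utf8:    $with\n";
print "without use utf8: $without\n";
```

Because the pragma is lexically scoped, it only affects literals written inside the block that contains it.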

    However, as noted in bold in the utf8 docs: "Do not use this pragma for anything else than telling Perl that your script is written in UTF-8". If you're retrieving data from a GET/POST parameter or from a database, it's a different story.
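
For data that arrives from outside the script, a commonly recommended alternative to Encode::_utf8_on() is to decode the bytes explicitly on the way in and encode them on the way out. The sketch below assumes the incoming bytes really are UTF-8; $raw_bytes is a hypothetical stand-in for a CGI parameter or database value, not something from the original post.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical stand-in for bytes read from a GET/POST parameter or a database.
# These six bytes are the UTF-8 encoding of a four-character word.
my $raw_bytes = "n\xC3\xA9\xC3\xA9d";

# decode() interprets the bytes as UTF-8 and returns a character string,
# instead of just flipping the internal flag as _utf8_on() does.
my $text = decode('UTF-8', $raw_bytes);

# Encode explicitly at the output boundary.
my $out = encode('UTF-8', $text);

printf "bytes in: %d, characters: %d, bytes out: %d\n",
    length($raw_bytes), length($text), length($out);
```

If an output layer such as :utf8 is already set on the filehandle, print $text directly instead of $out, otherwise the data gets encoded twice.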

