andal : " ... First of all, you have to worry about representation of characters in the octets that you receive from external applications. That depends on locale settings ... "

OP " ... I assume that the problem is my code and not the data coming in since one can usually depend on people to get their own names right ... "

It would appear that my initial assumption was incorrect. I challenged it, and as it turns out, what I am dealing with is a mixture of localized character sets: input taken from across Europe, then cut and pasted between spreadsheets in an HR department spanning multiple offices.

(ノωノ)

These are conscientious people, mind you, so concerned about getting the characters just right that they may well edit them with several programs along the way...

I'm glad that I asked and should have done so sooner.

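Now, in the proper mindset and having done my revision, a big thing I was missing was that I was using :utf8 instead of :encoding(utf8). The :utf8 layer only marks the incoming octets as UTF-8 without validating them, whereas :encoding(...) actually decodes and checks them. Making that switch allowed me to regain the trust factor in the data.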

I had all kinds of stupid ideas and bad assumptions that led me to chase phantoms. Now at least I can identify mangled input on the way in.
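
A minimal sketch of what "identifying mangled input on the way in" can look like, assuming the octets are already in a scalar. This is not the code from the original script: the helper name decode_strict and the sample names are made up for illustration, and it uses Encode's decode with FB_CROAK on raw octets rather than an I/O layer.

```perl
use strict;
use warnings;
use Encode qw(decode FB_CROAK);

# Hypothetical helper: returns the decoded character string, or undef
# when the octets are not well-formed UTF-8.
sub decode_strict {
    my ($octets) = @_;                 # a copy, so the caller's data is untouched
    my $text = eval { decode('UTF-8', $octets, FB_CROAK) };
    return $text;                      # undef if decode() croaked
}

my $good = "Ren\xC3\xA9e";   # "Renée" as UTF-8 octets
my $bad  = "Ren\xE9e";       # "Renée" as Latin-1 octets: not valid UTF-8

for my $octets ($good, $bad) {
    my $text = decode_strict($octets);
    print defined $text
        ? "ok: " . length($text) . " characters\n"
        : "mangled input rejected\n";
}
```

Run as written, this prints "ok: 5 characters" for the first name and "mangled input rejected" for the second. A record that fails the check still needs a human, or a guess at its original encoding, before it can be repaired. The sketch only flags it.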



Wait! This isn't a Parachute, this is a Backpack!

In reply to Re: How to concatenate utf8 safely? by gregor42
in thread How to concatenate utf8 safely? by gregor42
