PerlMonks  

And output should also be utf8-encoded Unicode, which it already is, so I modified the step to skip the wrong encode step (new step 3) - am I doing it right now?

It would be easier to answer that if you showed us a relevant code snippet. Trying the snippet yourself will probably answer the question, too. In case it helps to validate your data, check out this little Unicode tool (a shameless plug for a program I posted recently).

For the interested reader: in fact I use Storable to serialize my resulting data structure as a whole, then I gzip the freeze'd data and write it to disk with a simple binmode (and thus not :utf8) filehandle. Any problems here? utf8 data and utf8-flag should stay intact over the pipeline.

The utf8 flag is strictly a Perl-internal attribute of scalar values. Once data is written to any sort of file (including any pipe), it's just bytes, and what happens to it after that point depends on what sort of process is reading it, and how that process chooses to interpret what is being read.
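To make that concrete, here is a minimal sketch (not the original poster's code) that writes a character string to a raw filehandle and reads it back. The helper name roundtrip_via_file and the sample string are made up for illustration; Encode and File::Temp are core modules.

```perl
use strict;
use warnings;
use Encode qw(encode decode);
use File::Temp qw(tempfile);

# Hypothetical helper: push a character string through a raw (binmode)
# file and return both the bytes as read and the re-decoded characters.
sub roundtrip_via_file {
    my ($text) = @_;

    my ($fh, $path) = tempfile(UNLINK => 1);
    binmode $fh;                           # raw bytes, no :utf8 layer
    print {$fh} encode('UTF-8', $text);    # explicit characters -> bytes
    close $fh or die "close failed: $!";

    open my $in, '<:raw', $path or die "open failed: $!";
    my $bytes = do { local $/; <$in> };    # slurp the file as plain bytes
    close $in;

    # The reader decides how to interpret the bytes; here we decode as UTF-8.
    return ($bytes, decode('UTF-8', $bytes));
}

my ($bytes, $chars) = roundtrip_via_file("caf\x{e9} \x{263a}");
printf "length as bytes: %d\n", length $bytes;
printf "utf8 flag on raw data: %s\n", utf8::is_utf8($bytes) ? 'on' : 'off';
printf "length as characters: %d\n", length $chars;
```

What comes back from the file is a plain byte string with no flag attached; it only becomes a character string again because the reading side chooses to decode it.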

There is a section of the Storable man page about utf8 (under the heading "FORWARD COMPATIBILITY"), which you should consult. It looks like it will "do the right thing" for you by default (retain the utf8 flag as part of the "freeze"d data structure so that a downstream "thaw" gets it), but it'll be worth testing to be sure. (I haven't used it myself, so I can't say for certain.)
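A quick way to run that test might look like the sketch below. It assumes the pipeline is freeze, then gzip, then later gunzip and thaw, and it skips the write to disk for brevity, since a binmode filehandle passes the bytes through unchanged. The helper names pack_data and unpack_data are invented for this example; Storable, IO::Compress::Gzip and IO::Uncompress::Gunzip ship with Perl.

```perl
use strict;
use warnings;
use Storable qw(freeze thaw);
use IO::Compress::Gzip qw(gzip $GzipError);
use IO::Uncompress::Gunzip qw(gunzip $GunzipError);

# Hypothetical helper: serialize a data structure and gzip the result.
sub pack_data {
    my ($data) = @_;
    my $frozen = freeze($data);            # byte string from Storable
    gzip(\$frozen => \my $zipped)
        or die "gzip failed: $GzipError";
    return $zipped;
}

# Hypothetical helper: reverse of pack_data.
sub unpack_data {
    my ($zipped) = @_;
    gunzip(\$zipped => \my $frozen)
        or die "gunzip failed: $GunzipError";
    return thaw($frozen);
}

my $orig = { word => "caf\x{e9} \x{263a}" };   # contains a wide character
my $copy = unpack_data(pack_data($orig));

print "same characters: ",
    ($copy->{word} eq $orig->{word} ? "yes" : "no"), "\n";
print "utf8 flag after thaw: ",
    (utf8::is_utf8($copy->{word}) ? "on" : "off"), "\n";
```

If both lines report what you expect on your Perl and Storable versions, the same check can be repeated with the gzipped data written to and read from a binmode filehandle.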


In reply to Re^2: The unicode / utf8 struggle, part 2: regexes by graff
in thread The unicode / utf8 struggle, part 2: regexes by isync
